unofficial mirror of emacs-devel@gnu.org 
From: Philip Kaludercic <philipk@posteo.net>
To: Joost Kremers <joostkremers@fastmail.fm>
Cc: Emacs Devel <emacs-devel@gnu.org>
Subject: Re: [PATCH] csv-mode.el: Add function for reading a CSV line
Date: Wed, 22 May 2024 06:17:28 +0000
Message-ID: <87le42v093.fsf@posteo.net>
In-Reply-To: <86wmnmixmf.fsf@fastmail.fm> (Joost Kremers's message of "Wed, 22 May 2024 00:55:04 +0200")

Joost Kremers <joostkremers@fastmail.fm> writes:

> Hi,
>
> @Philip Kaludercic, as per your suggestion
> (https://lists.gnu.org/archive/html/emacs-devel/2024-05/msg00966.html), here's a
> patch for csv-mode.el to add a function for reading a CSV line and unquoting the
> resulting field values.
>
> I've put the unquoting in a separate (internal) function, csv--unquote-value,
> which also un-escapes escaped quote characters inside the field value. As per
> RFC 4180, the escape character is the quote character. The RFC only mentions the
> double quotation mark as quote character, but csv-mode.el makes it possible to
> use other quote characters, so the patch supports that as well.
>
> I've also added a test for csv--unquote-value.
>
>
>
> -- 
> Joost Kremers
> Life has its moments
>
> From e97b8b8cca3f987cbdf5e29ec184f37825755eba Mon Sep 17 00:00:00 2001
> From: Joost Kremers <joostkremers@fastmail.com>
> Date: Wed, 22 May 2024 00:07:34 +0200
> Subject: [PATCH] Add function for reading a CSV line and return its values as
>  a list.
>
> * (csv-parse-current-row): New function; unlike csv--collect-fields,
>   unquotes the field values.
> * (csv--unquote-value): New function.
> ---
>  csv-mode-tests.el | 12 ++++++++++++
>  csv-mode.el       | 26 +++++++++++++++++++++++++-
>  2 files changed, 37 insertions(+), 1 deletion(-)
>
> diff --git a/csv-mode-tests.el b/csv-mode-tests.el
> index 0caeab7..ea955a9 100644
> --- a/csv-mode-tests.el
> +++ b/csv-mode-tests.el
> @@ -144,5 +144,17 @@
>               (csv--separator-score ?\; csv-tests--data
>                                     (length csv-tests--data)))))
>  
> +(ert-deftest csv-tests-unquote-value ()
> +  (should (equal (csv--unquote-value "Hello, World")
> +                 "Hello, World"))
> +  (should (equal (csv--unquote-value "\"Hello, World\"")
> +                 "Hello, World"))
> +  (should (equal (csv--unquote-value "Hello, \"\"World")
> +                 "Hello, \"\"World"))
> +  (should (equal (csv--unquote-value "\"Hello, \"\"World\"\"\"")
> +                 "Hello, \"World\""))
> +  (should (equal (csv--unquote-value "\"Hello, World'")
> +                 "\"Hello, World'")))
> +
>  (provide 'csv-mode-tests)
>  ;;; csv-mode-tests.el ends here
> diff --git a/csv-mode.el b/csv-mode.el
> index f639dcf..09402c2 100644
> --- a/csv-mode.el
> +++ b/csv-mode.el
> @@ -4,7 +4,7 @@
>  
>  ;; Author: "Francis J. Wright" <F.J.Wright@qmul.ac.uk>
>  ;; Maintainer: emacs-devel@gnu.org
> -;; Version: 1.23
> +;; Version: 1.24
>  ;; Package-Requires: ((emacs "27.1") (cl-lib "0.5"))
>  ;; Keywords: convenience
>  
> @@ -107,6 +107,10 @@
>  
>  ;;; News:
>  
> +;; Since 1.24
> +;; - New function `csv--unquote-value'.
> +;; - New function `csv-parse-current-row'.
> +
>  ;; Since 1.21:
>  ;; - New command `csv-insert-column'.
>  ;; - New config var `csv-align-min-width' for `csv-align-mode'.
> @@ -1400,6 +1404,26 @@ point is assumed to be at the beginning of the line."
>  	      (forward-char)))
>  	(nreverse fields)))))
>  
> +(defun csv--unquote-value (value)
> +  "Remove quotes around VALUE.
> +If VALUE contains escaped quote characters, un-escape them.  If
> +VALUE is not quoted, return it unchanged."
> +  (save-match-data
> +    (let ((quote-regexp (apply #'concat `("[" ,@csv-field-quotes "]"))))
> +      (string-match (concat "^\\(" quote-regexp "\\)\\(.*\\)\\(" quote-regexp "\\)$") value)

Shouldn't this `string-match' be in the if-let?

Take this example:

(let ((str "1 2 3"))
  (list (string-match "2" str)
	(match-string 0 str)
	(string-match "4" str)
	(match-string 0 str)))
;;=> (2 "2" nil "2")

Even though the second `string-match' failed, the match data from the
first call remains, so `match-string' still returns a non-nil value.
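
One way to avoid relying on stale match data is to make the
`string-match' call itself the first condition of the `if-let', so the
`match-string' calls only run after a successful match.  The following
is only a rough sketch of that idea, not a tested replacement for the
patch: `my-csv-field-quotes' and `my-csv-unquote-value' are
hypothetical stand-ins for `csv-field-quotes' and `csv--unquote-value',
so that the snippet can be evaluated without loading csv-mode.

```elisp
(require 'subr-x)                       ; `if-let' lives here on Emacs 27

;; Hypothetical stand-in for `csv-field-quotes' (assumption: a list of
;; one-character strings, as in csv-mode.el).
(defvar my-csv-field-quotes '("\"")
  "Quote characters recognised by `my-csv-unquote-value'.")

(defun my-csv-unquote-value (value)
  "Remove quotes around VALUE and un-escape doubled quote characters.
If VALUE is not quoted, return it unchanged."
  (save-match-data
    (let ((quote-regexp (apply #'concat `("[" ,@my-csv-field-quotes "]"))))
      ;; The match is the first `if-let' condition: when it fails, none
      ;; of the `match-string' forms below are evaluated, so match data
      ;; left over from an earlier search is never consulted.
      (if-let (((string-match (concat "^\\(" quote-regexp "\\)\\(.*\\)\\("
                                      quote-regexp "\\)$")
                              value))
               (quote-char (match-string 1 value))
               ((equal quote-char (match-string 3 value)))
               (unquoted (match-string 2 value)))
          (replace-regexp-in-string (concat quote-char quote-char)
                                    quote-char unquoted)
        value))))

;; A successful match first, then an unquoted value: the second call
;; returns its argument unchanged instead of reusing the old match data.
(my-csv-unquote-value "\"Hello, World\"")   ;=> "Hello, World"
(my-csv-unquote-value "Hello, World")       ;=> "Hello, World"
```

Apart from moving the `string-match' form, the body is the same as in
the patch.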

> +      (if-let ((quote-char (match-string 1 value))
> +               ((equal quote-char (match-string 3 value)))
> +               (unquoted (match-string 2 value)))
> +          (replace-regexp-in-string (concat quote-char quote-char) quote-char unquoted)
> +        value))))
> +
> +(defun csv-parse-current-row ()
> +  "Parse the current CSV line.
> +Return the field values as a list."
> +  (save-mark-and-excursion
> +    (goto-char (line-beginning-position))
> +    (mapcar #'csv--unquote-value (csv--collect-fields (line-end-position)))))
> +
>  (defvar-local csv--header-line nil)
>  (defvar-local csv--header-hscroll nil)
>  (defvar-local csv--header-string nil)

-- 
	Philip Kaludercic on peregrine


