From: Philip Kaludercic <philipk@posteo.net>
To: Joost Kremers <joostkremers@fastmail.fm>
Cc: Emacs Devel <emacs-devel@gnu.org>
Subject: Re: [PATCH] csv-mode.el: Add function for reading a CSV line
Date: Wed, 22 May 2024 16:14:31 +0000
Message-ID: <87r0dthli0.fsf@posteo.net>
In-Reply-To: <86ttiqjppp.fsf@fastmail.fm> (Joost Kremers's message of "Wed, 22 May 2024 09:00:34 +0200")
Joost Kremers <joostkremers@fastmail.fm> writes:
> On Wed, May 22 2024, Philip Kaludercic wrote:
>> Joost Kremers <joostkremers@fastmail.fm> writes:
>>> +(defun csv--unquote-value (value)
>>> +  "Remove quotes around VALUE.
>>> +If VALUE contains escaped quote characters, un-escape them. If
>>> +VALUE is not quoted, return it unchanged."
>>> +  (save-match-data
>>> +    (let ((quote-regexp (apply #'concat `("[" ,@csv-field-quotes "]"))))
>>> +      (string-match (concat "^\\(" quote-regexp "\\)\\(.*\\)\\(" quote-regexp "\\)$") value)
>>
>> Shouldn't this `string-match' be in the if-let?
>
> I considered that, but in this particular case, `(match-string 1 value)` returns
> nil if the first character of `value` isn't in `csv-field-quotes`, so it seems
> to be OK.
>
> Emphasis on "seems" though... Plus, there's no need to call `match-string` at
> all if `string-match` failed, of course. So new patch attached.
>
>> Take this example,
>>
>> (let ((str "1 2 3"))
>>   (list (string-match "2" str)
>>         (match-string 0 str)
>>         (string-match "4" str)
>>         (match-string 0 str)))
>> ;;=> (2 "2" nil "2")
>>
>> even though string-match failed, the match data remains and match-string
>> returns non-nil values.
>
> Oh... I kinda assumed that `string-match` would always reset all of the match
> data, but apparently not. Good to know!
>
>
> Thanks,
>
> Joost
>
>
> --
> Joost Kremers
> Life has its moments
>
> From bb582c8e413451f59db1d26d4c0208348370283b Mon Sep 17 00:00:00 2001
> From: Joost Kremers <joostkremers@fastmail.com>
> Date: Wed, 22 May 2024 00:07:34 +0200
> Subject: [PATCH] Add function for reading a CSV line and return its values as
> a list.
>
> * (csv-parse-current-row): New function; unlike csv--collect-fields,
> unquotes the field values.
> * (csv--unquote-value): New function.
> ---
> csv-mode-tests.el | 23 +++++++++++++++++++++++
> csv-mode.el | 26 +++++++++++++++++++++++++-
> 2 files changed, 48 insertions(+), 1 deletion(-)
>
> diff --git a/csv-mode-tests.el b/csv-mode-tests.el
> index 0caeab7..12d0417 100644
> --- a/csv-mode-tests.el
> +++ b/csv-mode-tests.el
> @@ -144,5 +144,28 @@
> (csv--separator-score ?\; csv-tests--data
> (length csv-tests--data)))))
>
> +(ert-deftest csv-tests-unquote-value ()
> +  (should (equal (csv--unquote-value "Hello, World")
> +                 "Hello, World"))
> +  (should (equal (csv--unquote-value "\"Hello, World\"")
> +                 "Hello, World"))
> +  (should (equal (csv--unquote-value "Hello, \"\"World")
> +                 "Hello, \"\"World"))
> +  (should (equal (csv--unquote-value "\"Hello, \"\"World\"\"\"")
> +                 "Hello, \"World\""))
> +  (should (equal (csv--unquote-value "'Hello, World'")
> +                 "'Hello, World'"))
> +  (should (equal (let ((csv-field-quotes '("\"" "'")))
> +                   (csv--unquote-value "\"Hello, World'"))
> +                 "\"Hello, World'"))
> +  (should (equal (let ((csv-field-quotes '("\"" "'")))
> +                   (csv--unquote-value "'Hello, World'"))
> +                 "Hello, World"))
> +  (should (equal (let ((csv-field-quotes '("\"" "'")))
> +                   (csv--unquote-value "'Hello, ''World'''"))
> +                 "Hello, 'World'"))
> +  (should (equal (csv--unquote-value "|Hello, World|")
> +                 "|Hello, World|")))
> +
> (provide 'csv-mode-tests)
> ;;; csv-mode-tests.el ends here
> diff --git a/csv-mode.el b/csv-mode.el
> index f639dcf..ebcd9da 100644
> --- a/csv-mode.el
> +++ b/csv-mode.el
> @@ -4,7 +4,7 @@
>
> ;; Author: "Francis J. Wright" <F.J.Wright@qmul.ac.uk>
> ;; Maintainer: emacs-devel@gnu.org
> -;; Version: 1.23
> +;; Version: 1.24
> ;; Package-Requires: ((emacs "27.1") (cl-lib "0.5"))
> ;; Keywords: convenience
>
> @@ -107,6 +107,10 @@
>
> ;;; News:
>
> +;; Since 1.24
> +;; - New function `csv--unquote-value'.
> +;; - New function `csv-parse-current-row'.
> +
> ;; Since 1.21:
> ;; - New command `csv-insert-column'.
> ;; - New config var `csv-align-min-width' for `csv-align-mode'.
> @@ -1400,6 +1404,26 @@ point is assumed to be at the beginning of the line."
> (forward-char)))
> (nreverse fields)))))
>
> +(defun csv--unquote-value (value)
> +  "Remove quotes around VALUE.
> +If VALUE contains escaped quote characters, un-escape them. If
> +VALUE is not quoted, return it unchanged."
> +  (save-match-data
> +    (let ((quote-regexp (apply #'concat `("[" ,@csv-field-quotes "]"))))
> +      (if-let (((string-match (concat "^\\(" quote-regexp "\\)\\(.*\\)\\(" quote-regexp "\\)$") value))
> +               (quote-char (match-string 1 value))
> +               ((equal quote-char (match-string 3 value)))
> +               (unquoted (match-string 2 value)))
> +          (replace-regexp-in-string (concat quote-char quote-char) quote-char unquoted)
> +        value))))
> +
> +(defun csv-parse-current-row ()
> +  "Parse the current CSV line.
> +Return the field values as a list."
> +  (save-mark-and-excursion
> +    (goto-char (line-beginning-position))
> +    (mapcar #'csv--unquote-value (csv--collect-fields (line-end-position)))))
> +
> (defvar-local csv--header-line nil)
> (defvar-local csv--header-hscroll nil)
> (defvar-local csv--header-string nil)
Seems fine to me. I'd apply it if there are no objections. Until then,
you can prepare to modify your package to use this change.
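
For reference, the match-data point discussed above can be reduced to
a small guard pattern. This is only a sketch: `my-first-digit' is a
hypothetical helper for illustration and is not part of csv-mode. It
reads the match data only after `string-match' has succeeded, so a
leftover match from an earlier call cannot leak into the result.

```elisp
(defun my-first-digit (str)
  "Return the first digit in STR as a string, or nil if there is none.
Hypothetical helper, for illustration only."
  (save-match-data
    ;; Consult the match data only when `string-match' succeeded;
    ;; after a failed match it may still describe an earlier match.
    (when (string-match "[0-9]" str)
      (match-string 0 str))))
```

The revised `csv--unquote-value' gets the same effect by putting the
`string-match' call inside the `if-let' bindings.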
--
Philip Kaludercic on peregrine