all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
* [PATCH] csv-mode.el: Add function for reading a CSV line
@ 2024-05-21 22:55 Joost Kremers
  2024-05-22  6:17 ` Philip Kaludercic
  0 siblings, 1 reply; 8+ messages in thread
From: Joost Kremers @ 2024-05-21 22:55 UTC (permalink / raw)
  To: Emacs Devel; +Cc: Philip Kaludercic

[-- Attachment #1: Type: text/plain, Size: 706 bytes --]

Hi,

@Philip Kaludercic, as per your suggestion
(https://lists.gnu.org/archive/html/emacs-devel/2024-05/msg00966.html), here's a
patch for csv-mode.el to add a function for reading a CSV line and unquoting the
resulting field values.

I've put the unquoting in a separate (internal) function, csv--unquote-value,
which also un-escapes escaped quote characters inside the field value. As per
RFC 4180, the escape character is the quote character. The RFC only mentions the
double quotation mark as quote character, but csv-mode.el makes it possible to
use other quote characters, so the patch supports that as well.

I've also added a test for csv--unquote-value.



-- 
Joost Kremers
Life has its moments


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Add-function-for-reading-a-CSV-line-and-return-its-v.patch --]
[-- Type: text/x-patch, Size: 3118 bytes --]

From e97b8b8cca3f987cbdf5e29ec184f37825755eba Mon Sep 17 00:00:00 2001
From: Joost Kremers <joostkremers@fastmail.com>
Date: Wed, 22 May 2024 00:07:34 +0200
Subject: [PATCH] Add function for reading a CSV line and return its values as
 a list.

* (csv-parse-current-row): New function; unlike csv--collect-fields,
  unquotes the field values.
* (csv--unquote-value): New function.
---
 csv-mode-tests.el | 12 ++++++++++++
 csv-mode.el       | 26 +++++++++++++++++++++++++-
 2 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/csv-mode-tests.el b/csv-mode-tests.el
index 0caeab7..ea955a9 100644
--- a/csv-mode-tests.el
+++ b/csv-mode-tests.el
@@ -144,5 +144,17 @@
              (csv--separator-score ?\; csv-tests--data
                                    (length csv-tests--data)))))
 
+(ert-deftest csv-tests-unquote-value ()
+  (should (equal (csv--unquote-value "Hello, World")
+                 "Hello, World"))
+  (should (equal (csv--unquote-value "\"Hello, World\"")
+                 "Hello, World"))
+  (should (equal (csv--unquote-value "Hello, \"\"World")
+                 "Hello, \"\"World"))
+  (should (equal (csv--unquote-value "\"Hello, \"\"World\"\"\"")
+                 "Hello, \"World\""))
+  (should (equal (csv--unquote-value "\"Hello, World'")
+                 "\"Hello, World'")))
+
 (provide 'csv-mode-tests)
 ;;; csv-mode-tests.el ends here
diff --git a/csv-mode.el b/csv-mode.el
index f639dcf..09402c2 100644
--- a/csv-mode.el
+++ b/csv-mode.el
@@ -4,7 +4,7 @@
 
 ;; Author: "Francis J. Wright" <F.J.Wright@qmul.ac.uk>
 ;; Maintainer: emacs-devel@gnu.org
-;; Version: 1.23
+;; Version: 1.24
 ;; Package-Requires: ((emacs "27.1") (cl-lib "0.5"))
 ;; Keywords: convenience
 
@@ -107,6 +107,10 @@
 
 ;;; News:
 
+;; Since 1.24
+;; - New function `csv--unquote-value'.
+;; - New function `csv-parse-current-row'.
+
 ;; Since 1.21:
 ;; - New command `csv-insert-column'.
 ;; - New config var `csv-align-min-width' for `csv-align-mode'.
@@ -1400,6 +1404,26 @@ point is assumed to be at the beginning of the line."
 	      (forward-char)))
 	(nreverse fields)))))
 
+(defun csv--unquote-value (value)
+  "Remove quotes around VALUE.
+If VALUE contains escaped quote characters, un-escape them.  If
+VALUE is not quoted, return it unchanged."
+  (save-match-data
+    (let ((quote-regexp (apply #'concat `("[" ,@csv-field-quotes "]"))))
+      (string-match (concat "^\\(" quote-regexp "\\)\\(.*\\)\\(" quote-regexp "\\)$") value)
+      (if-let ((quote-char (match-string 1 value))
+               ((equal quote-char (match-string 3 value)))
+               (unquoted (match-string 2 value)))
+          (replace-regexp-in-string (concat quote-char quote-char) quote-char unquoted)
+        value))))
+
+(defun csv-parse-current-row ()
+  "Parse the current CSV line.
+Return the field values as a list."
+  (save-mark-and-excursion
+    (goto-char (line-beginning-position))
+    (mapcar #'csv--unquote-value (csv--collect-fields (line-end-position)))))
+
 (defvar-local csv--header-line nil)
 (defvar-local csv--header-hscroll nil)
 (defvar-local csv--header-string nil)
-- 
2.45.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] csv-mode.el: Add function for reading a CSV line
  2024-05-21 22:55 [PATCH] csv-mode.el: Add function for reading a CSV line Joost Kremers
@ 2024-05-22  6:17 ` Philip Kaludercic
  2024-05-22  7:00   ` Joost Kremers
  2024-05-26  4:07   ` Stefan Monnier via Emacs development discussions.
  0 siblings, 2 replies; 8+ messages in thread
From: Philip Kaludercic @ 2024-05-22  6:17 UTC (permalink / raw)
  To: Joost Kremers; +Cc: Emacs Devel

Joost Kremers <joostkremers@fastmail.fm> writes:

> Hi,
>
> @Philip Kaludercic, as per your suggestion
> (https://lists.gnu.org/archive/html/emacs-devel/2024-05/msg00966.html), here's a
> patch for csv-mode.el to add a function for reading a CSV line and unquoting the
> resulting field values.
>
> I've put the unquoting in a separate (internal) function, csv--unquote-value,
> which also un-escapes escaped quote characters inside the field value. As per
> RFC 4180, the escape character is the quote character. The RFC only mentions the
> double quotation mark as quote character, but csv-mode.el makes it possible to
> use other quote characters, so the patch supports that as well.
>
> I've also added a test for csv--unquote-value.
>
>
>
> -- 
> Joost Kremers
> Life has its moments
>
> From e97b8b8cca3f987cbdf5e29ec184f37825755eba Mon Sep 17 00:00:00 2001
> From: Joost Kremers <joostkremers@fastmail.com>
> Date: Wed, 22 May 2024 00:07:34 +0200
> Subject: [PATCH] Add function for reading a CSV line and return its values as
>  a list.
>
> * (csv-parse-current-row): New function; unlike csv--collect-fields,
>   unquotes the field values.
> * (csv--unquote-value): New function.
> ---
>  csv-mode-tests.el | 12 ++++++++++++
>  csv-mode.el       | 26 +++++++++++++++++++++++++-
>  2 files changed, 37 insertions(+), 1 deletion(-)
>
> diff --git a/csv-mode-tests.el b/csv-mode-tests.el
> index 0caeab7..ea955a9 100644
> --- a/csv-mode-tests.el
> +++ b/csv-mode-tests.el
> @@ -144,5 +144,17 @@
>               (csv--separator-score ?\; csv-tests--data
>                                     (length csv-tests--data)))))
>  
> +(ert-deftest csv-tests-unquote-value ()
> +  (should (equal (csv--unquote-value "Hello, World")
> +                 "Hello, World"))
> +  (should (equal (csv--unquote-value "\"Hello, World\"")
> +                 "Hello, World"))
> +  (should (equal (csv--unquote-value "Hello, \"\"World")
> +                 "Hello, \"\"World"))
> +  (should (equal (csv--unquote-value "\"Hello, \"\"World\"\"\"")
> +                 "Hello, \"World\""))
> +  (should (equal (csv--unquote-value "\"Hello, World'")
> +                 "\"Hello, World'")))
> +
>  (provide 'csv-mode-tests)
>  ;;; csv-mode-tests.el ends here
> diff --git a/csv-mode.el b/csv-mode.el
> index f639dcf..09402c2 100644
> --- a/csv-mode.el
> +++ b/csv-mode.el
> @@ -4,7 +4,7 @@
>  
>  ;; Author: "Francis J. Wright" <F.J.Wright@qmul.ac.uk>
>  ;; Maintainer: emacs-devel@gnu.org
> -;; Version: 1.23
> +;; Version: 1.24
>  ;; Package-Requires: ((emacs "27.1") (cl-lib "0.5"))
>  ;; Keywords: convenience
>  
> @@ -107,6 +107,10 @@
>  
>  ;;; News:
>  
> +;; Since 1.24
> +;; - New function `csv--unquote-value'.
> +;; - New function `csv-parse-current-row'.
> +
>  ;; Since 1.21:
>  ;; - New command `csv-insert-column'.
>  ;; - New config var `csv-align-min-width' for `csv-align-mode'.
> @@ -1400,6 +1404,26 @@ point is assumed to be at the beginning of the line."
>  	      (forward-char)))
>  	(nreverse fields)))))
>  
> +(defun csv--unquote-value (value)
> +  "Remove quotes around VALUE.
> +If VALUE contains escaped quote characters, un-escape them.  If
> +VALUE is not quoted, return it unchanged."
> +  (save-match-data
> +    (let ((quote-regexp (apply #'concat `("[" ,@csv-field-quotes "]"))))
> +      (string-match (concat "^\\(" quote-regexp "\\)\\(.*\\)\\(" quote-regexp "\\)$") value)

Shouldn't this `string-match' be in the if-let?

Take this example,

(let ((str "1 2 3"))
  (list (string-match "2" str)
	(match-string 0 str)
	(string-match "4" str)
	(match-string 0 str)))
;;=> (2 "2" nil "2")

even though string-match failed, the match data remains and matc-string
returns non-nil values.

> +      (if-let ((quote-char (match-string 1 value))
> +               ((equal quote-char (match-string 3 value)))
> +               (unquoted (match-string 2 value)))
> +          (replace-regexp-in-string (concat quote-char quote-char) quote-char unquoted)
> +        value))))
> +
> +(defun csv-parse-current-row ()
> +  "Parse the current CSV line.
> +Return the field values as a list."
> +  (save-mark-and-excursion
> +    (goto-char (line-beginning-position))
> +    (mapcar #'csv--unquote-value (csv--collect-fields (line-end-position)))))
> +
>  (defvar-local csv--header-line nil)
>  (defvar-local csv--header-hscroll nil)
>  (defvar-local csv--header-string nil)

-- 
	Philip Kaludercic on peregrine



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] csv-mode.el: Add function for reading a CSV line
  2024-05-22  6:17 ` Philip Kaludercic
@ 2024-05-22  7:00   ` Joost Kremers
  2024-05-22 16:14     ` Philip Kaludercic
  2024-05-26  4:07   ` Stefan Monnier via Emacs development discussions.
  1 sibling, 1 reply; 8+ messages in thread
From: Joost Kremers @ 2024-05-22  7:00 UTC (permalink / raw)
  To: Philip Kaludercic; +Cc: Emacs Devel

[-- Attachment #1: Type: text/plain, Size: 1307 bytes --]

On Wed, May 22 2024, Philip Kaludercic wrote:
> Joost Kremers <joostkremers@fastmail.fm> writes:
>> +(defun csv--unquote-value (value)
>> +  "Remove quotes around VALUE.
>> +If VALUE contains escaped quote characters, un-escape them.  If
>> +VALUE is not quoted, return it unchanged."
>> +  (save-match-data
>> +    (let ((quote-regexp (apply #'concat `("[" ,@csv-field-quotes "]"))))
>> +      (string-match (concat "^\\(" quote-regexp "\\)\\(.*\\)\\(" quote-regexp "\\)$") value)
>
> Shouldn't this `string-match' be in the if-let?

I considered that, but in this particular case, `(match-string 1 value)` returns
nil if the first character of `value` isn't in `csv-field-quotes`, so it seems
to be OK.

Emphasis on "seems" though... Plus, there's no need to call `match-string` at
all if `string-match` failed, of course. So new patch attached.

> Take this example,
>
> (let ((str "1 2 3"))
>   (list (string-match "2" str)
> 	(match-string 0 str)
> 	(string-match "4" str)
> 	(match-string 0 str)))
> ;;=> (2 "2" nil "2")
>
> even though string-match failed, the match data remains and matc-string
> returns non-nil values.

Oh... I kinda assumed that `string-match` would always reset all of the match
data, but apparently not. Good to know!


Thanks,

Joost


-- 
Joost Kremers
Life has its moments


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Add-function-for-reading-a-CSV-line-and-return-its-v.patch --]
[-- Type: text/x-patch, Size: 3691 bytes --]

From bb582c8e413451f59db1d26d4c0208348370283b Mon Sep 17 00:00:00 2001
From: Joost Kremers <joostkremers@fastmail.com>
Date: Wed, 22 May 2024 00:07:34 +0200
Subject: [PATCH] Add function for reading a CSV line and return its values as
 a list.

* (csv-parse-current-row): New function; unlike csv--collect-fields,
  unquotes the field values.
* (csv--unquote-value): New function.
---
 csv-mode-tests.el | 23 +++++++++++++++++++++++
 csv-mode.el       | 26 +++++++++++++++++++++++++-
 2 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/csv-mode-tests.el b/csv-mode-tests.el
index 0caeab7..12d0417 100644
--- a/csv-mode-tests.el
+++ b/csv-mode-tests.el
@@ -144,5 +144,28 @@
              (csv--separator-score ?\; csv-tests--data
                                    (length csv-tests--data)))))
 
+(ert-deftest csv-tests-unquote-value ()
+  (should (equal (csv--unquote-value "Hello, World")
+                 "Hello, World"))
+  (should (equal (csv--unquote-value "\"Hello, World\"")
+                 "Hello, World"))
+  (should (equal (csv--unquote-value "Hello, \"\"World")
+                 "Hello, \"\"World"))
+  (should (equal (csv--unquote-value "\"Hello, \"\"World\"\"\"")
+                 "Hello, \"World\""))
+  (should (equal (csv--unquote-value "'Hello, World'")
+                 "'Hello, World'"))
+  (should (equal (let ((csv-field-quotes '("\"" "'")))
+                   (csv--unquote-value "\"Hello, World'"))
+                 "\"Hello, World'"))
+  (should (equal (let ((csv-field-quotes '("\"" "'")))
+                   (csv--unquote-value "'Hello, World'"))
+                 "Hello, World"))
+  (should (equal (let ((csv-field-quotes '("\"" "'")))
+                   (csv--unquote-value "'Hello, ''World'''"))
+                 "Hello, 'World'"))
+  (should (equal (csv--unquote-value "|Hello, World|")
+                 "|Hello, World|")))
+
 (provide 'csv-mode-tests)
 ;;; csv-mode-tests.el ends here
diff --git a/csv-mode.el b/csv-mode.el
index f639dcf..ebcd9da 100644
--- a/csv-mode.el
+++ b/csv-mode.el
@@ -4,7 +4,7 @@
 
 ;; Author: "Francis J. Wright" <F.J.Wright@qmul.ac.uk>
 ;; Maintainer: emacs-devel@gnu.org
-;; Version: 1.23
+;; Version: 1.24
 ;; Package-Requires: ((emacs "27.1") (cl-lib "0.5"))
 ;; Keywords: convenience
 
@@ -107,6 +107,10 @@
 
 ;;; News:
 
+;; Since 1.24
+;; - New function `csv--unquote-value'.
+;; - New function `csv-parse-current-row'.
+
 ;; Since 1.21:
 ;; - New command `csv-insert-column'.
 ;; - New config var `csv-align-min-width' for `csv-align-mode'.
@@ -1400,6 +1404,26 @@ point is assumed to be at the beginning of the line."
 	      (forward-char)))
 	(nreverse fields)))))
 
+(defun csv--unquote-value (value)
+  "Remove quotes around VALUE.
+If VALUE contains escaped quote characters, un-escape them.  If
+VALUE is not quoted, return it unchanged."
+  (save-match-data
+    (let ((quote-regexp (apply #'concat `("[" ,@csv-field-quotes "]"))))
+      (if-let (((string-match (concat "^\\(" quote-regexp "\\)\\(.*\\)\\(" quote-regexp "\\)$") value))
+               (quote-char (match-string 1 value))
+               ((equal quote-char (match-string 3 value)))
+               (unquoted (match-string 2 value)))
+          (replace-regexp-in-string (concat quote-char quote-char) quote-char unquoted)
+        value))))
+
+(defun csv-parse-current-row ()
+  "Parse the current CSV line.
+Return the field values as a list."
+  (save-mark-and-excursion
+    (goto-char (line-beginning-position))
+    (mapcar #'csv--unquote-value (csv--collect-fields (line-end-position)))))
+
 (defvar-local csv--header-line nil)
 (defvar-local csv--header-hscroll nil)
 (defvar-local csv--header-string nil)
-- 
2.45.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] csv-mode.el: Add function for reading a CSV line
  2024-05-22  7:00   ` Joost Kremers
@ 2024-05-22 16:14     ` Philip Kaludercic
  2024-05-22 16:21       ` Joost Kremers
  0 siblings, 1 reply; 8+ messages in thread
From: Philip Kaludercic @ 2024-05-22 16:14 UTC (permalink / raw)
  To: Joost Kremers; +Cc: Emacs Devel

Joost Kremers <joostkremers@fastmail.fm> writes:

> On Wed, May 22 2024, Philip Kaludercic wrote:
>> Joost Kremers <joostkremers@fastmail.fm> writes:
>>> +(defun csv--unquote-value (value)
>>> +  "Remove quotes around VALUE.
>>> +If VALUE contains escaped quote characters, un-escape them.  If
>>> +VALUE is not quoted, return it unchanged."
>>> +  (save-match-data
>>> +    (let ((quote-regexp (apply #'concat `("[" ,@csv-field-quotes "]"))))
>>> +      (string-match (concat "^\\(" quote-regexp "\\)\\(.*\\)\\(" quote-regexp "\\)$") value)
>>
>> Shouldn't this `string-match' be in the if-let?
>
> I considered that, but in this particular case, `(match-string 1 value)` returns
> nil if the first character of `value` isn't in `csv-field-quotes`, so it seems
> to be OK.
>
> Emphasis on "seems" though... Plus, there's no need to call `match-string` at
> all if `string-match` failed, of course. So new patch attached.
>
>> Take this example,
>>
>> (let ((str "1 2 3"))
>>   (list (string-match "2" str)
>> 	(match-string 0 str)
>> 	(string-match "4" str)
>> 	(match-string 0 str)))
>> ;;=> (2 "2" nil "2")
>>
>> even though string-match failed, the match data remains and matc-string
>> returns non-nil values.
>
> Oh... I kinda assumed that `string-match` would always reset all of the match
> data, but apparently not. Good to know!
>
>
> Thanks,
>
> Joost
>
>
> -- 
> Joost Kremers
> Life has its moments
>
> From bb582c8e413451f59db1d26d4c0208348370283b Mon Sep 17 00:00:00 2001
> From: Joost Kremers <joostkremers@fastmail.com>
> Date: Wed, 22 May 2024 00:07:34 +0200
> Subject: [PATCH] Add function for reading a CSV line and return its values as
>  a list.
>
> * (csv-parse-current-row): New function; unlike csv--collect-fields,
>   unquotes the field values.
> * (csv--unquote-value): New function.
> ---
>  csv-mode-tests.el | 23 +++++++++++++++++++++++
>  csv-mode.el       | 26 +++++++++++++++++++++++++-
>  2 files changed, 48 insertions(+), 1 deletion(-)
>
> diff --git a/csv-mode-tests.el b/csv-mode-tests.el
> index 0caeab7..12d0417 100644
> --- a/csv-mode-tests.el
> +++ b/csv-mode-tests.el
> @@ -144,5 +144,28 @@
>               (csv--separator-score ?\; csv-tests--data
>                                     (length csv-tests--data)))))
>  
> +(ert-deftest csv-tests-unquote-value ()
> +  (should (equal (csv--unquote-value "Hello, World")
> +                 "Hello, World"))
> +  (should (equal (csv--unquote-value "\"Hello, World\"")
> +                 "Hello, World"))
> +  (should (equal (csv--unquote-value "Hello, \"\"World")
> +                 "Hello, \"\"World"))
> +  (should (equal (csv--unquote-value "\"Hello, \"\"World\"\"\"")
> +                 "Hello, \"World\""))
> +  (should (equal (csv--unquote-value "'Hello, World'")
> +                 "'Hello, World'"))
> +  (should (equal (let ((csv-field-quotes '("\"" "'")))
> +                   (csv--unquote-value "\"Hello, World'"))
> +                 "\"Hello, World'"))
> +  (should (equal (let ((csv-field-quotes '("\"" "'")))
> +                   (csv--unquote-value "'Hello, World'"))
> +                 "Hello, World"))
> +  (should (equal (let ((csv-field-quotes '("\"" "'")))
> +                   (csv--unquote-value "'Hello, ''World'''"))
> +                 "Hello, 'World'"))
> +  (should (equal (csv--unquote-value "|Hello, World|")
> +                 "|Hello, World|")))
> +
>  (provide 'csv-mode-tests)
>  ;;; csv-mode-tests.el ends here
> diff --git a/csv-mode.el b/csv-mode.el
> index f639dcf..ebcd9da 100644
> --- a/csv-mode.el
> +++ b/csv-mode.el
> @@ -4,7 +4,7 @@
>  
>  ;; Author: "Francis J. Wright" <F.J.Wright@qmul.ac.uk>
>  ;; Maintainer: emacs-devel@gnu.org
> -;; Version: 1.23
> +;; Version: 1.24
>  ;; Package-Requires: ((emacs "27.1") (cl-lib "0.5"))
>  ;; Keywords: convenience
>  
> @@ -107,6 +107,10 @@
>  
>  ;;; News:
>  
> +;; Since 1.24
> +;; - New function `csv--unquote-value'.
> +;; - New function `csv-parse-current-row'.
> +
>  ;; Since 1.21:
>  ;; - New command `csv-insert-column'.
>  ;; - New config var `csv-align-min-width' for `csv-align-mode'.
> @@ -1400,6 +1404,26 @@ point is assumed to be at the beginning of the line."
>  	      (forward-char)))
>  	(nreverse fields)))))
>  
> +(defun csv--unquote-value (value)
> +  "Remove quotes around VALUE.
> +If VALUE contains escaped quote characters, un-escape them.  If
> +VALUE is not quoted, return it unchanged."
> +  (save-match-data
> +    (let ((quote-regexp (apply #'concat `("[" ,@csv-field-quotes "]"))))
> +      (if-let (((string-match (concat "^\\(" quote-regexp "\\)\\(.*\\)\\(" quote-regexp "\\)$") value))
> +               (quote-char (match-string 1 value))
> +               ((equal quote-char (match-string 3 value)))
> +               (unquoted (match-string 2 value)))
> +          (replace-regexp-in-string (concat quote-char quote-char) quote-char unquoted)
> +        value))))
> +
> +(defun csv-parse-current-row ()
> +  "Parse the current CSV line.
> +Return the field values as a list."
> +  (save-mark-and-excursion
> +    (goto-char (line-beginning-position))
> +    (mapcar #'csv--unquote-value (csv--collect-fields (line-end-position)))))
> +
>  (defvar-local csv--header-line nil)
>  (defvar-local csv--header-hscroll nil)
>  (defvar-local csv--header-string nil)

Seems fine to me.  I'd apply it if there are no objections.  Until then,
you can prepare to modify your package to use this change.

-- 
	Philip Kaludercic on peregrine



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] csv-mode.el: Add function for reading a CSV line
  2024-05-22 16:14     ` Philip Kaludercic
@ 2024-05-22 16:21       ` Joost Kremers
  2024-05-25  8:26         ` Philip Kaludercic
  0 siblings, 1 reply; 8+ messages in thread
From: Joost Kremers @ 2024-05-22 16:21 UTC (permalink / raw)
  To: Philip Kaludercic; +Cc: Emacs Devel

[-- Attachment #1: Type: text/plain, Size: 719 bytes --]

On Wed, May 22 2024, Philip Kaludercic wrote:
[patch]
> Seems fine to me.  I'd apply it if there are no objections.  Until then,
> you can prepare to modify your package to use this change.

I was just about to suggest one more test. Specifically:

```
(should (equal (let ((csv-field-quotes '("\"" "'")))
                   (csv--unquote-value "'Hello, \"World\"'"))
                 "Hello, \"World\""))
```

It tests the case where the user defined more than one possible quote character,
one being used to quote the field and the other being used inside the field.

The attached patch adds this test but is otherwise identical to the previous
one.

Otherwise, thanks! :-)


-- 
Joost Kremers
Life has its moments



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Add-function-for-reading-a-CSV-line-and-return-its-v.patch --]
[-- Type: text/x-patch, Size: 3852 bytes --]

From 2a37f695ffe5757b0f22ffe63d312d1407122e3a Mon Sep 17 00:00:00 2001
From: Joost Kremers <joostkremers@fastmail.com>
Date: Wed, 22 May 2024 00:07:34 +0200
Subject: [PATCH] Add function for reading a CSV line and return its values as
 a list.

* (csv-parse-current-row): New function; unlike csv--collect-fields,
  unquotes the field values.
* (csv--unquote-value): New function.
---
 csv-mode-tests.el | 26 ++++++++++++++++++++++++++
 csv-mode.el       | 26 +++++++++++++++++++++++++-
 2 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/csv-mode-tests.el b/csv-mode-tests.el
index 0caeab7..12e009e 100644
--- a/csv-mode-tests.el
+++ b/csv-mode-tests.el
@@ -144,5 +144,31 @@
              (csv--separator-score ?\; csv-tests--data
                                    (length csv-tests--data)))))
 
+(ert-deftest csv-tests-unquote-value ()
+  (should (equal (csv--unquote-value "Hello, World")
+                 "Hello, World"))
+  (should (equal (csv--unquote-value "\"Hello, World\"")
+                 "Hello, World"))
+  (should (equal (csv--unquote-value "Hello, \"\"World")
+                 "Hello, \"\"World"))
+  (should (equal (csv--unquote-value "\"Hello, \"\"World\"\"\"")
+                 "Hello, \"World\""))
+  (should (equal (csv--unquote-value "'Hello, World'")
+                 "'Hello, World'"))
+  (should (equal (let ((csv-field-quotes '("\"" "'")))
+                   (csv--unquote-value "\"Hello, World'"))
+                 "\"Hello, World'"))
+  (should (equal (let ((csv-field-quotes '("\"" "'")))
+                   (csv--unquote-value "'Hello, World'"))
+                 "Hello, World"))
+  (should (equal (let ((csv-field-quotes '("\"" "'")))
+                   (csv--unquote-value "'Hello, ''World'''"))
+                 "Hello, 'World'"))
+  (should (equal (let ((csv-field-quotes '("\"" "'")))
+                   (csv--unquote-value "'Hello, \"World\"'"))
+                 "Hello, \"World\""))
+  (should (equal (csv--unquote-value "|Hello, World|")
+                 "|Hello, World|")))
+
 (provide 'csv-mode-tests)
 ;;; csv-mode-tests.el ends here
diff --git a/csv-mode.el b/csv-mode.el
index f639dcf..ebcd9da 100644
--- a/csv-mode.el
+++ b/csv-mode.el
@@ -4,7 +4,7 @@
 
 ;; Author: "Francis J. Wright" <F.J.Wright@qmul.ac.uk>
 ;; Maintainer: emacs-devel@gnu.org
-;; Version: 1.23
+;; Version: 1.24
 ;; Package-Requires: ((emacs "27.1") (cl-lib "0.5"))
 ;; Keywords: convenience
 
@@ -107,6 +107,10 @@
 
 ;;; News:
 
+;; Since 1.24
+;; - New function `csv--unquote-value'.
+;; - New function `csv-parse-current-row'.
+
 ;; Since 1.21:
 ;; - New command `csv-insert-column'.
 ;; - New config var `csv-align-min-width' for `csv-align-mode'.
@@ -1400,6 +1404,26 @@ point is assumed to be at the beginning of the line."
 	      (forward-char)))
 	(nreverse fields)))))
 
+(defun csv--unquote-value (value)
+  "Remove quotes around VALUE.
+If VALUE contains escaped quote characters, un-escape them.  If
+VALUE is not quoted, return it unchanged."
+  (save-match-data
+    (let ((quote-regexp (apply #'concat `("[" ,@csv-field-quotes "]"))))
+      (if-let (((string-match (concat "^\\(" quote-regexp "\\)\\(.*\\)\\(" quote-regexp "\\)$") value))
+               (quote-char (match-string 1 value))
+               ((equal quote-char (match-string 3 value)))
+               (unquoted (match-string 2 value)))
+          (replace-regexp-in-string (concat quote-char quote-char) quote-char unquoted)
+        value))))
+
+(defun csv-parse-current-row ()
+  "Parse the current CSV line.
+Return the field values as a list."
+  (save-mark-and-excursion
+    (goto-char (line-beginning-position))
+    (mapcar #'csv--unquote-value (csv--collect-fields (line-end-position)))))
+
 (defvar-local csv--header-line nil)
 (defvar-local csv--header-hscroll nil)
 (defvar-local csv--header-string nil)
-- 
2.45.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] csv-mode.el: Add function for reading a CSV line
  2024-05-22 16:21       ` Joost Kremers
@ 2024-05-25  8:26         ` Philip Kaludercic
  0 siblings, 0 replies; 8+ messages in thread
From: Philip Kaludercic @ 2024-05-25  8:26 UTC (permalink / raw)
  To: Joost Kremers; +Cc: Emacs Devel

Joost Kremers <joostkremers@fastmail.fm> writes:

> On Wed, May 22 2024, Philip Kaludercic wrote:
> [patch]
>> Seems fine to me.  I'd apply it if there are no objections.  Until then,
>> you can prepare to modify your package to use this change.
>
> I was just about to suggest one more test. Specifically:
>
> ```
> (should (equal (let ((csv-field-quotes '("\"" "'")))
>                    (csv--unquote-value "'Hello, \"World\"'"))
>                  "Hello, \"World\""))
> ```
>
> It tests the case where the user defined more than one possible quote character,
> one being used to quote the field and the other being used inside the field.
>
> The attached patch adds this test but is otherwise identical to the previous
> one.
>
> Otherwise, thanks! :-)

Looks like nobody has any objections, so I pushed the changes to
elpa.git.  Thanks.

-- 
	Philip Kaludercic on peregrine



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] csv-mode.el: Add function for reading a CSV line
  2024-05-22  6:17 ` Philip Kaludercic
  2024-05-22  7:00   ` Joost Kremers
@ 2024-05-26  4:07   ` Stefan Monnier via Emacs development discussions.
  2024-05-26  8:08     ` Joost Kremers
  1 sibling, 1 reply; 8+ messages in thread
From: Stefan Monnier via Emacs development discussions. @ 2024-05-26  4:07 UTC (permalink / raw)
  To: emacs-devel

>> +(defun csv--unquote-value (value)
>> +  "Remove quotes around VALUE.
>> +If VALUE contains escaped quote characters, un-escape them.  If
>> +VALUE is not quoted, return it unchanged."
>> +  (save-match-data
>> +    (let ((quote-regexp (apply #'concat `("[" ,@csv-field-quotes "]"))))
>> +      (string-match (concat "^\\(" quote-regexp "\\)\\(.*\\)\\(" quote-regexp "\\)$") value)

`save-match-data` around a function's body is an anti-pattern.
This is explained in the docstring as follows:

    NOTE: The convention in Elisp is that any function, except for a few
    exceptions like car/assoc/+/goto-char, can clobber the match data,
    so ‘save-match-data’ should normally be used to save *your* match data
    rather than your caller’s match data.


-- Stefan




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] csv-mode.el: Add function for reading a CSV line
  2024-05-26  4:07   ` Stefan Monnier via Emacs development discussions.
@ 2024-05-26  8:08     ` Joost Kremers
  0 siblings, 0 replies; 8+ messages in thread
From: Joost Kremers @ 2024-05-26  8:08 UTC (permalink / raw)
  To: Stefan Monnier via Emacs development discussions.; +Cc: Stefan Monnier

[-- Attachment #1: Type: text/plain, Size: 614 bytes --]

On Sun, May 26 2024, Stefan Monnier via "Emacs development discussions." wrote:
>>> +(defun csv--unquote-value (value)
>>> +  "Remove quotes around VALUE.
>>> +If VALUE contains escaped quote characters, un-escape them.  If
>>> +VALUE is not quoted, return it unchanged."
>>> +  (save-match-data
>>> +    (let ((quote-regexp (apply #'concat `("[" ,@csv-field-quotes "]"))))
>>> +      (string-match (concat "^\\(" quote-regexp "\\)\\(.*\\)\\(" quote-regexp "\\)$") value)
>
> `save-match-data` around a function's body is an anti-pattern.

Thanks for pointing that out. So IIUC it should just be removed? Patch:



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Remove-save-match-data-from-csv-unquote-value.patch --]
[-- Type: text/x-patch, Size: 1732 bytes --]

From 21016520838b7398559f004872f0964478f17b06 Mon Sep 17 00:00:00 2001
From: Joost Kremers <joostkremers@fastmail.com>
Date: Sun, 26 May 2024 10:05:19 +0200
Subject: [PATCH] Remove 'save-match-data' from 'csv--unquote-value'.

As per its doc string, 'save-match-data' should only be used to save
one's own match data, not the caller's.
---
 csv-mode.el | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/csv-mode.el b/csv-mode.el
index ebcd9da..37bff30 100644
--- a/csv-mode.el
+++ b/csv-mode.el
@@ -1408,14 +1408,13 @@ point is assumed to be at the beginning of the line."
   "Remove quotes around VALUE.
 If VALUE contains escaped quote characters, un-escape them.  If
 VALUE is not quoted, return it unchanged."
-  (save-match-data
-    (let ((quote-regexp (apply #'concat `("[" ,@csv-field-quotes "]"))))
-      (if-let (((string-match (concat "^\\(" quote-regexp "\\)\\(.*\\)\\(" quote-regexp "\\)$") value))
-               (quote-char (match-string 1 value))
-               ((equal quote-char (match-string 3 value)))
-               (unquoted (match-string 2 value)))
-          (replace-regexp-in-string (concat quote-char quote-char) quote-char unquoted)
-        value))))
+  (let ((quote-regexp (apply #'concat `("[" ,@csv-field-quotes "]"))))
+    (if-let (((string-match (concat "^\\(" quote-regexp "\\)\\(.*\\)\\(" quote-regexp "\\)$") value))
+             (quote-char (match-string 1 value))
+             ((equal quote-char (match-string 3 value)))
+             (unquoted (match-string 2 value)))
+        (replace-regexp-in-string (concat quote-char quote-char) quote-char unquoted)
+      value)))
 
 (defun csv-parse-current-row ()
   "Parse the current CSV line.
-- 
2.45.1


[-- Attachment #3: Type: text/plain, Size: 41 bytes --]



-- 
Joost Kremers
Life has its moments

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-05-26  8:08 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-21 22:55 [PATCH] csv-mode.el: Add function for reading a CSV line Joost Kremers
2024-05-22  6:17 ` Philip Kaludercic
2024-05-22  7:00   ` Joost Kremers
2024-05-22 16:14     ` Philip Kaludercic
2024-05-22 16:21       ` Joost Kremers
2024-05-25  8:26         ` Philip Kaludercic
2024-05-26  4:07   ` Stefan Monnier via Emacs development discussions.
2024-05-26  8:08     ` Joost Kremers

Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.