unofficial mirror of bug-gnu-emacs@gnu.org 
From: "Peder O. Klingenberg" <peder@klingenberg.no>
To: 46328@debbugs.gnu.org
Subject: bug#46328: 28.0.50; csv-transpose replaces field delimiters in quoted fields with newlines
Date: Tue, 23 Feb 2021 00:27:43 +0100
Message-ID: <86mtvv3cpc.fsf@klingenberg.no>
In-Reply-To: <m2wnvmegzg.fsf@fastmail.fm> (Filipp Gunbin's message of "Fri, 05 Feb 2021 17:17:39 +0300")

On Fri, 2021-02-05 17:17:39 +0300, Filipp Gunbin wrote:

> The commas inside a (quoted) field were replaced by newlines, this looks
> like a bug.

This is caused by split-string, which splits on every separator and
does not care about char-syntax ?\".  Here's a patch.  If a line has
quote chars, it uses csv-forward-field to fetch each field, ensuring
consistency in what the mode considers a field.
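
To illustrate the failure mode, here is a minimal sketch that runs
without csv-mode loaded.  It assumes a literal "," in place of
csv-separator-regexp and a made-up sample row, and the helper
my-csv-split-row is hypothetical: it only mimics the idea of walking
the row field by field and is not the code in the patch.

```elisp
;; Illustration only: "," stands in for `csv-separator-regexp',
;; and the sample row is made up.
(defvar my-csv-sample-row "a,\"b,c\",d")

;; Plain `split-string' cuts the quoted field at its embedded comma:
(split-string my-csv-sample-row ",")
;; => ("a" "\"b" "c\"" "d")

(defun my-csv-split-row (row)
  "Hypothetical helper: split ROW on commas, keeping quoted fields whole."
  (with-temp-buffer
    (insert row)
    (goto-char (point-min))
    (let (fields start)
      (while (< (setq start (point)) (point-max))
        ;; Advance to the next separator that is outside quotes.
        (while (and (not (eobp)) (/= (following-char) ?,))
          (if (= (following-char) ?\")
              (progn
                (forward-char)
                ;; Jump past the closing quote; if the quote is
                ;; unterminated, treat the rest of the row as one field.
                (unless (search-forward "\"" nil t)
                  (goto-char (point-max))))
            (forward-char)))
        (push (buffer-substring-no-properties start (point)) fields)
        ;; Step over the separator, if there is one.
        (unless (eobp) (forward-char)))
      (nreverse fields))))

(my-csv-split-row my-csv-sample-row)
;; => ("a" "\"b,c\"" "d")
```

The quote-aware walk returns three fields instead of four, so a
transpose built on it would keep "b,c" together on one line.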


[-- Attachment #2: 0001-Fix-transposing-csv-files-with-quoted-fields.patch --]
[-- Type: text/x-patch, Size: 2365 bytes --]

From d6b51e2f07d585106ce6ccfe484f12a9ed3fe9dc Mon Sep 17 00:00:00 2001
From: "Peder O. Klingenberg" <peder@klingenberg.no>
Date: Tue, 23 Feb 2021 00:14:35 +0100
Subject: [PATCH] Fix transposing csv files with quoted fields

* csv-mode.el
(csv--collect-fields): New function.
(csv-transpose): Use the new function instead of split-string.

(Fixes Bug#46328)
---
 csv-mode.el | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/csv-mode.el b/csv-mode.el
index eaea881801..ecc33a7bcc 100644
--- a/csv-mode.el
+++ b/csv-mode.el
@@ -4,7 +4,7 @@
 
 ;; Author: "Francis J. Wright" <F.J.Wright@qmul.ac.uk>
 ;; Maintainer: emacs-devel@gnu.org
-;; Version: 1.14
+;; Version: 1.15
 ;; Package-Requires: ((emacs "24.1") (cl-lib "0.5"))
 ;; Keywords: convenience
 
@@ -1264,9 +1264,7 @@ When called non-interactively, BEG and END specify region to process."
 	      (forward-line)
 	    (let ((lep (line-end-position)))
 	      (push
-	       (split-string
-		(buffer-substring-no-properties (point) lep)
-		csv-separator-regexp)
+	       (csv--collect-fields lep)
 	       rows)
 	      (delete-region (point) lep)
 	      (or (eobp) (delete-char 1)))))
@@ -1305,6 +1303,26 @@ When called non-interactively, BEG and END specify region to process."
 	;; Re-do soft alignment if necessary:
 	(if align (csv-align-fields nil (point-min) (point-max)))))))
 
+(defun csv--collect-fields (row-end-position)
+  "Collect the fields of a row.
+Splits a row into fields, honoring quoted fields, and returns
+the list of fields.  ROW-END-POSITION is the end-of-line position.
+Point is assumed to be at the beginning of the line."
+  (let ((csv-field-quotes-regexp (apply #'concat `("[" ,@csv-field-quotes "]")))
+	(row-text (buffer-substring-no-properties (point) row-end-position))
+	fields field-start)
+    (if (not (string-match csv-field-quotes-regexp row-text))
+	(split-string row-text csv-separator-regexp)
+      (save-excursion
+	(while (< (setq field-start (point)) row-end-position)
+	  (csv-forward-field 1)
+	  (push
+	   (buffer-substring-no-properties field-start (point))
+	   fields)
+	  (if (memq (following-char) csv-separator-chars)
+	      (forward-char)))
+	(nreverse fields)))))
+
 (defvar-local csv--header-line nil)
 (defvar-local csv--header-hscroll nil)
 (defvar-local csv--header-string nil)
-- 
2.30.1.windows.1


Thread overview: 4+ messages
2021-02-05 14:17 bug#46328: 28.0.50; csv-transpose replaces field delimiters in quoted fields with newlines Filipp Gunbin
2021-02-05 14:46 ` bug#46328: additional test case Filipp Gunbin
2021-02-22 23:27 ` Peder O. Klingenberg [this message]
2021-02-23 15:51   ` bug#46328: 28.0.50; csv-transpose replaces field delimiters in quoted fields with newlines Lars Ingebrigtsen