From mboxrd@z Thu Jan 1 00:00:00 1970
From: Joost Kremers
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] csv-mode.el: Add function for reading a CSV line
Date: Wed, 22 May 2024 09:00:34 +0200
Message-ID: <86ttiqjppp.fsf@fastmail.fm>
References: <86wmnmixmf.fsf@fastmail.fm> <87le42v093.fsf@posteo.net>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Cc: Emacs Devel
To: Philip Kaludercic
In-Reply-To: <87le42v093.fsf@posteo.net> (Philip Kaludercic's message of "Wed, 22 May 2024 06:17:28 +0000")

--=-=-=
Content-Type: text/plain

On Wed, May 22 2024, Philip Kaludercic wrote:
> Joost Kremers writes:
>> +(defun csv--unquote-value (value)
>> +  "Remove quotes around VALUE.
>> +If VALUE contains escaped quote characters, un-escape them. If
>> +VALUE is not quoted, return it unchanged."
>> +  (save-match-data
>> +    (let ((quote-regexp (apply #'concat `("[" ,@csv-field-quotes "]"))))
>> +      (string-match (concat "^\\(" quote-regexp "\\)\\(.*\\)\\(" quote-regexp "\\)$") value)
>
> Shouldn't this `string-match' be in the if-let?

I considered that, but in this particular case, `(match-string 1 value)`
returns nil if the first character of `value` isn't in `csv-field-quotes`,
so it seems to be OK. Emphasis on "seems" though...

Plus, there's no need to call `match-string` at all if `string-match`
failed, of course. So new patch attached.

> Take this example,
>
> (let ((str "1 2 3"))
>   (list (string-match "2" str)
>         (match-string 0 str)
>         (string-match "4" str)
>         (match-string 0 str)))
> ;;=> (2 "2" nil "2")
>
> even though string-match failed, the match data remains and match-string
> returns non-nil values.

Oh... I kinda assumed that `string-match` would always reset all of the
match data, but apparently not. Good to know!

Thanks,

Joost

-- 
Joost Kremers
Life has its moments

--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline;
 filename=0001-Add-function-for-reading-a-CSV-line-and-return-its-v.patch

>From bb582c8e413451f59db1d26d4c0208348370283b Mon Sep 17 00:00:00 2001
From: Joost Kremers
Date: Wed, 22 May 2024 00:07:34 +0200
Subject: [PATCH] Add function for reading a CSV line and return its values as a list.

* (csv-parse-current-row): New function; unlike csv--collect-fields,
unquotes the field values.
* (csv--unquote-value): New function.
---
 csv-mode-tests.el | 23 +++++++++++++++++++++++
 csv-mode.el       | 26 +++++++++++++++++++++++++-
 2 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/csv-mode-tests.el b/csv-mode-tests.el
index 0caeab7..12d0417 100644
--- a/csv-mode-tests.el
+++ b/csv-mode-tests.el
@@ -144,5 +144,28 @@
                (csv--separator-score ?\; csv-tests--data
                                      (length csv-tests--data)))))
 
+(ert-deftest csv-tests-unquote-value ()
+  (should (equal (csv--unquote-value "Hello, World")
+                 "Hello, World"))
+  (should (equal (csv--unquote-value "\"Hello, World\"")
+                 "Hello, World"))
+  (should (equal (csv--unquote-value "Hello, \"\"World")
+                 "Hello, \"\"World"))
+  (should (equal (csv--unquote-value "\"Hello, \"\"World\"\"\"")
+                 "Hello, \"World\""))
+  (should (equal (csv--unquote-value "'Hello, World'")
+                 "'Hello, World'"))
+  (should (equal (let ((csv-field-quotes '("\"" "'")))
+                   (csv--unquote-value "\"Hello, World'"))
+                 "\"Hello, World'"))
+  (should (equal (let ((csv-field-quotes '("\"" "'")))
+                   (csv--unquote-value "'Hello, World'"))
+                 "Hello, World"))
+  (should (equal (let ((csv-field-quotes '("\"" "'")))
+                   (csv--unquote-value "'Hello, ''World'''"))
+                 "Hello, 'World'"))
+  (should (equal (csv--unquote-value "|Hello, World|")
+                 "|Hello, World|")))
+
 (provide 'csv-mode-tests)
 ;;; csv-mode-tests.el ends here
diff --git a/csv-mode.el b/csv-mode.el
index f639dcf..ebcd9da 100644
--- a/csv-mode.el
+++ b/csv-mode.el
@@ -4,7 +4,7 @@
 
 ;; Author: "Francis J. Wright"
 ;; Maintainer: emacs-devel@gnu.org
-;; Version: 1.23
+;; Version: 1.24
 ;; Package-Requires: ((emacs "27.1") (cl-lib "0.5"))
 ;; Keywords: convenience
 
@@ -107,6 +107,10 @@
 
 ;;; News:
 
+;; Since 1.24
+;; - New function `csv--unquote-value'.
+;; - New function `csv-parse-current-row'.
+
 ;; Since 1.21:
 ;; - New command `csv-insert-column'.
 ;; - New config var `csv-align-min-width' for `csv-align-mode'.
@@ -1400,6 +1404,26 @@ point is assumed to be at the beginning of the line."
             (forward-char)))
         (nreverse fields)))))
 
+(defun csv--unquote-value (value)
+  "Remove quotes around VALUE.
+If VALUE contains escaped quote characters, un-escape them. If
+VALUE is not quoted, return it unchanged."
+  (save-match-data
+    (let ((quote-regexp (apply #'concat `("[" ,@csv-field-quotes "]"))))
+      (if-let (((string-match (concat "^\\(" quote-regexp "\\)\\(.*\\)\\(" quote-regexp "\\)$") value))
+               (quote-char (match-string 1 value))
+               ((equal quote-char (match-string 3 value)))
+               (unquoted (match-string 2 value)))
+          (replace-regexp-in-string (concat quote-char quote-char) quote-char unquoted)
+        value))))
+
+(defun csv-parse-current-row ()
+  "Parse the current CSV line.
+Return the field values as a list."
+  (save-mark-and-excursion
+    (goto-char (line-beginning-position))
+    (mapcar #'csv--unquote-value (csv--collect-fields (line-end-position)))))
+
 (defvar-local csv--header-line nil)
 (defvar-local csv--header-hscroll nil)
 (defvar-local csv--header-string nil)
-- 
2.45.1

--=-=-=--