unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#71042: 29.3; csv-mode: csv-guess-set-separator chokes on comment lines
@ 2024-05-18 10:14 Marco Emilio Poleggi via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-07-26 10:07 ` Peder O. Klingenberg
  0 siblings, 1 reply; 4+ messages in thread
From: Marco Emilio Poleggi via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-05-18 10:14 UTC (permalink / raw)
  To: 71042



Comment lines seem well supported by csv-mode. However, the function
`csv-guess-set-separator` has trouble with them.

Visit this input file:

-------------------------8<-----------------------
$ cat >csv-test.csv <<EOF
###
###
Foo;Bar;Quux
123;456;blah, blah
EOF
------------------------->8-----------------------

With no special settings, the default comma (',') separator is
detected.

Now run `M-x csv-guess-set-separator`. I'd expect the actual separator,
the semicolon (';'), to be detected. Instead the hash ('#') is chosen,
as confirmed by `C-h v csv-separators`.

It looks like csv-guess-set-separator is fooled by the first two lines,
which contain the same number of hash characters. Indeed, breaking that
symmetry makes the ';' detectable.

But, hey, csv-mode says:

;; CSV mode commands ignore blank lines and comment lines beginning
;; with the value of the buffer local variable `csv-comment-start',
;; which by default is #. ...

So there's probably a bug in `csv-guess-set-separator`.
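
The steps above can also be tried without visiting a file. The following is only a minimal sketch: it assumes csv-mode is installed from GNU ELPA and that `csv-guess-separator` accepts the text as a string. The names `my/bug71042-sample` and `my/bug71042-guess` are hypothetical and exist only for this example.

```elisp
;; Reproduction sketch for bug#71042.  Assumes csv-mode (GNU ELPA) is
;; installed; `my/bug71042-sample' and `my/bug71042-guess' are
;; hypothetical names used only for this example.
(defvar my/bug71042-sample
  "###\n###\nFoo;Bar;Quux\n123;456;blah, blah\n"
  "Contents of the csv-test.csv file from the report.")

(defun my/bug71042-guess ()
  "Return the separator csv-mode guesses for the sample, or nil.
Returns nil when csv-mode cannot be loaded."
  (when (require 'csv-mode nil t)
    (csv-guess-separator my/bug71042-sample)))

;; %S prints the raw return value (a character code, or nil).
(message "Guessed separator: %S" (my/bug71042-guess))
```

If the behaviour described in this report is present, the printed value should be the character code of '#' rather than that of ';'.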

Cheers,

   Marco



--------------------------------------------------------------------------------

In GNU Emacs 29.3 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.41,
  cairo version 1.18.0) of 2024-05-14 built on localhost
Windowing system distributor 'The X.Org Foundation', version 11.0.12101013
System Description: Gentoo Linux

Configured using:
  'configure --prefix=/usr --build=x86_64-pc-linux-gnu
  --host=x86_64-pc-linux-gnu --mandir=/usr/share/man
  --infodir=/usr/share/info --datadir=/usr/share --sysconfdir=/etc
  --localstatedir=/var/lib --datarootdir=/usr/share
  --disable-silent-rules --docdir=/usr/share/doc/emacs-29.3-r2
  --htmldir=/usr/share/doc/emacs-29.3-r2/html --libdir=/usr/lib64
  --program-suffix=-emacs-29 --includedir=/usr/include/emacs-29
  --infodir=/usr/share/info/emacs-29 --localstatedir=/var
  --enable-locallisppath=/etc/emacs:/usr/share/emacs/site-lisp
  --without-compress-install --without-hesiod --without-pop
  --with-file-notification=inotify --with-pdumper --disable-acl
  --without-dbus --with-modules --without-gameuser --with-libgmp
  --with-gpm --with-native-compilation=aot --without-json
  --without-kerberos --without-kerberos5 --with-lcms2 --without-xml2
  --without-mailutils --without-selinux --without-sqlite3 --with-gnutls
  --without-libsystemd --with-threads --without-tree-sitter
  --without-wide-int --with-sound=no --with-zlib --with-x --without-pgtk
  --without-ns --without-gconf --without-gsettings
  --with-toolkit-scroll-bars --with-xpm --with-xft --with-cairo
  --without-harfbuzz --without-libotf --without-m17n-flt
  --with-x-toolkit=gtk3 --with-xwidgets --without-gif --with-jpeg
  --with-png --with-rsvg --without-tiff --without-webp
  --without-imagemagick --with-dumping=pdumper 'CFLAGS=-march=haswell
  -pipe -O2 -fomit-frame-pointer' 'LDFLAGS=-Wl,-O1 -Wl,--as-needed
  -Wl,-z,pack-relative-relocs''

Configured features:
CAIRO FREETYPE GLIB GMP GNUTLS GPM JPEG LCMS2 MODULES NATIVE_COMP NOTIFY
INOTIFY PDUMPER PNG RSVG SECCOMP THREADS TOOLKIT_SCROLL_BARS X11 XDBE
XIM XINPUT2 XPM XWIDGETS GTK3 ZLIB

Important settings:
   value of $LC_TIME: en_DK.UTF-8
   value of $LANG: en_US.utf8
   locale-coding-system: utf-8-unix

Major mode: CSV

Minor modes in effect:
   windmove-mode: t
   shell-dirtrack-mode: t
   csv-field-index-mode: t
   flyspell-mode: t
   server-mode: t
   desktop-save-mode: t
   global-auto-complete-mode: t
   savehist-mode: t
   delete-selection-mode: t
   override-global-mode: t
   tooltip-mode: t
   global-eldoc-mode: t
   show-paren-mode: t
   electric-indent-mode: t
   mouse-wheel-mode: t
   menu-bar-mode: t
   file-name-shadow-mode: t
   global-font-lock-mode: t
   font-lock-mode: t
   blink-cursor-mode: t
   column-number-mode: t
   line-number-mode: t
   indent-tabs-mode: t
   transient-mark-mode: t
   auto-composition-mode: t
   auto-encryption-mode: t
   auto-compression-mode: t

Load-path shadows:
/usr/share/emacs/site-lisp/cmake-mode hides 
/usr/share/emacs/site-lisp/cmake/cmake-mode
/usr/share/emacs/site-lisp/desktop-entry-mode hides 
/usr/share/emacs/site-lisp/desktop-file-utils/desktop-entry-mode
/home/marcoep/.emacs.d/elpa/transient-20240509.1849/transient hides 
/usr/share/emacs/29.3/lisp/transient
/home/marcoep/.emacs.d/elpa/bind-key-20230203.2004/bind-key hides 
/usr/share/emacs/29.3/lisp/use-package/bind-key
/home/marcoep/.emacs.d/elpa/use-package-20230426.2324/use-package-bind-key 
hides /usr/share/emacs/29.3/lisp/use-package/use-package-bind-key
/home/marcoep/.emacs.d/elpa/use-package-20230426.2324/use-package-core hides 
/usr/share/emacs/29.3/lisp/use-package/use-package-core
/home/marcoep/.emacs.d/elpa/use-package-20230426.2324/use-package-delight 
hides /usr/share/emacs/29.3/lisp/use-package/use-package-delight
/home/marcoep/.emacs.d/elpa/use-package-20230426.2324/use-package-diminish 
hides /usr/share/emacs/29.3/lisp/use-package/use-package-diminish
/home/marcoep/.emacs.d/elpa/use-package-20230426.2324/use-package-ensure hides 
/usr/share/emacs/29.3/lisp/use-package/use-package-ensure
/home/marcoep/.emacs.d/elpa/use-package-20230426.2324/use-package-jump hides 
/usr/share/emacs/29.3/lisp/use-package/use-package-jump
/home/marcoep/.emacs.d/elpa/use-package-20230426.2324/use-package-lint hides 
/usr/share/emacs/29.3/lisp/use-package/use-package-lint
/home/marcoep/.emacs.d/elpa/use-package-20230426.2324/use-package hides 
/usr/share/emacs/29.3/lisp/use-package/use-package
/home/marcoep/.emacs.d/elpa/project-0.10.0/project hides 
/usr/share/emacs/29.3/lisp/progmodes/project

Features:
(shadow mail-extr emacsbug shortdoc cl-print ses unsafep align help-fns
radix-tree lua-mode novice sgml-mode facemenu two-column org-duration
rect conf-mode cus-edit cus-start mm-archive network-stream url-cache
tramp-cmds yasnippet misearch multi-isearch windmove vc-hg vc-bzr
tramp-sh jupyter-tramp tramp-cache time-stamp jupyter-server
jupyter-server-kernel jupyter-repl jupyter-widget-client simple-httpd pp
jupyter-client jupyter-kernel jupyter-kernelspec jupyter-env
jupyter-monads thunk jupyter-messages hmac-def jupyter-mime
jupyter-rest-api url-http url-auth url-gw nsm websocket bindat
jupyter-base tramp tramp-loaddefs trampver tramp-integration files-x
tramp-compat shell web-mode disp-table cursor-sensor csv-mode sort
yaml-mode oc-basic org-element org-persist org-id org-refile avl-tree
generator ol-eww eww xdg url-queue mm-url ol-rmail ol-mhe ol-irc ol-info
ol-gnus nnselect gnus-art mm-uu mml2015 mm-view mml-smime smime gnutls
dig gnus-sum shr pixel-fill kinsoku url-file svg dom gnus-group
gnus-undo gnus-start gnus-dbus dbus xml gnus-cloud nnimap nnmail
mail-source utf7 nnoo parse-time gnus-spec gnus-int gnus-range message
sendmail puny rfc822 mml mml-sec epa epg rfc6068 epg-config mm-decode
mm-bodies mm-encode mail-parse rfc2231 rfc2047 rfc2045 ietf-drums
mailabbrev gmm-utils mailheader gnus-win gnus nnheader gnus-util
mail-utils range mm-util mail-prsvr ol-docview doc-view filenotify
jka-compr image-mode exif ol-bibtex bibtex iso8601 ol-bbdb ol-w3m ol-doi
org-link-doi org-clock org ob ob-tangle ob-ref ob-lob ob-table ob-exp
org-macro org-src ob-comint org-pcomplete pcomplete org-list
org-footnote org-faces org-entities ob-emacs-lisp ob-core ob-eval
org-cycle org-table ol org-fold org-fold-core org-keys oc org-loaddefs
find-func cal-menu calendar cal-loaddefs org-version org-compat org-macs
vc-git diff-mode vc-dispatcher flymake-proc flymake project compile
text-property-search comint ansi-osc ansi-color display-line-numbers
smartparens loadhist time-date ring flyspell ispell yank-media
poly-markdown polymode poly-lock polymode-base polymode-export
polymode-weave polymode-compat polymode-methods pcase polymode-core comp
comp-cstr warnings format-spec polymode-classes eieio-custom wid-edit
eieio-base markdown-mode noutline outline icons unfill server desktop
frameset darktooth-dark-theme darktooth-theme darktooth autothemer color
lisp-mnt auto-complete-config auto-complete advice edmacro kmacro popup
cl-extra help-mode savehist delsel cus-load origami origami-parsers cl s
dash hexrgb fill-column-indicator ffap thingatpt printing ps-print
ps-print-loaddefs lpr nginx-mode ebuild-mode skeleton sh-script rx smie
treesit executable finder-inf use-package use-package-ensure
use-package-delight use-package-diminish use-package-bind-key bind-key
easy-mmode use-package-core derived site-gentoo ac-php-autoloads
ac-php-core-autoloads tex-site debian-el-autoloads debian-el dired
dired-loaddefs helm-autoloads helm-core-autoloads jupyter-autoloads
pdf-tools-autoloads poly-markdown-autoloads markdown-mode-autoloads
realgud-recursive-autoloads smartparens-autoloads dash-autoloads
transient-autoloads wfnames-autoloads with-editor-autoloads info package
browse-url url url-proxy url-privacy url-expand url-methods url-history
url-cookie generate-lisp-file url-domsuf url-util mailcap url-handlers
url-parse auth-source cl-seq eieio eieio-core cl-macs password-cache
json subr-x map byte-opt gv bytecomp byte-compile url-vars cl-loaddefs
cl-lib rmc iso-transl tooltip cconv eldoc paren electric uniquify
ediff-hook vc-hooks lisp-float-type elisp-mode mwheel term/x-win x-win
term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode lisp-mode prog-mode register
page tab-bar menu-bar rfn-eshadow isearch easymenu timer select
scroll-bar mouse jit-lock font-lock syntax font-core term/tty-colors
frame minibuffer nadvice seq simple cl-generic indonesian philippine
cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese composite emoji-zwj charscript
charprop case-table epa-hook jka-cmpr-hook help abbrev obarray oclosure
cl-preloaded button loaddefs theme-loaddefs faces cus-face macroexp
files window text-properties overlay sha1 md5 base64 format env
code-pages mule custom widget keymap hashtable-print-readable backquote
threads xwidget-internal inotify lcms2 dynamic-setting
font-render-setting cairo move-toolbar gtk x-toolkit xinput2 x multi-tty
make-network-process native-compile emacs)

Memory information:
((conses 16 962235 132224)
  (symbols 48 46289 0)
  (strings 32 223160 6829)
  (string-bytes 1 6180337)
  (vectors 16 80049)
  (vector-slots 8 2114195 174122)
  (floats 8 1042 591)
  (intervals 56 6097 1017)
  (buffers 976 27))



* bug#71042: 29.3; csv-mode: csv-guess-set-separator chokes on comment lines
  2024-05-18 10:14 bug#71042: 29.3; csv-mode: csv-guess-set-separator chokes on comment lines Marco Emilio Poleggi via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-07-26 10:07 ` Peder O. Klingenberg
  2024-08-04  8:25   ` Eli Zaretskii
  0 siblings, 1 reply; 4+ messages in thread
From: Peder O. Klingenberg @ 2024-07-26 10:07 UTC (permalink / raw)
  To: Marco Emilio Poleggi; +Cc: 71042


Tags: patch

On Sat, 2024-05-18 12:14:21 +0200, Marco Emilio Poleggi wrote:

> Comment lines seem well supported by csv-mode. However, function
> `csv-guess-set-separator` has troubles with them.

Easily fixed, but it needs someone with commit privileges to apply the patch.
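
To illustrate the idea behind the fix, here is a standalone sketch of a candidate filter that refuses the comment-start character. It is modelled on the condition touched by the patch but is not the csv-mode code itself: `my/separator-candidate-p` is a hypothetical name, and the guard for a nil or empty comment string is an added assumption that the patch does not include.

```elisp
;; Illustrative only: a standalone predicate modelled on the patched
;; condition.  `my/separator-candidate-p' is a hypothetical name and is
;; not part of csv-mode.
(defun my/separator-candidate-p (c comment-start)
  "Return non-nil if character C may be offered as a separator.
COMMENT-START is a string such as \"#\", or nil for no comment syntax."
  (or (= c ?\t)                         ; TAB is always a candidate
      (= c ?\C-_)                       ; ASCII 31, Unit Separator
      (and (not (member c '(?. ?/ ?\" ?\')))
           ;; The new check: never offer the comment-start character.
           ;; The nil/empty guard is an assumption made for this sketch.
           (not (and (stringp comment-start)
                     (> (length comment-start) 0)
                     (= c (string-to-char comment-start))))
           ;; Reject letters, digits, brackets and control characters.
           (not (member (get-char-code-property c 'general-category)
                        '(Lu Ll Lt Lm Lo Nd Nl No Ps Pe Cc Co))))))

;; With "#" as the comment start, ';' stays a candidate and '#' does not.
(list (my/separator-candidate-p ?\; "#")
      (my/separator-candidate-p ?# "#"))
```

Passing the comment string as an argument, instead of reading `csv-comment-start` directly, keeps the sketch testable outside a csv-mode buffer.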


[-- Attachment #2: 0001-Disregard-comments-when-guessing-separators.patch --]
[-- Type: text/x-diff, Size: 2361 bytes --]

From 6bcdef0ed429dfe88a607478c504eb52186e639e Mon Sep 17 00:00:00 2001
From: "Peder O. Klingenberg" <peder@klingenberg.no>
Date: Fri, 26 Jul 2024 10:13:24 +0200
Subject: [PATCH] Disregard comments when guessing separators

* csv-mode.el (csv--separator-candidates): Ignore
csv-comment-start when looking for potential separators.

* csv-mode-tests.el (csv-tests-guess-separator-avoid-comment):
Previously failing test case guessing the comment start as
separator char.

(Bug#71042 fixed)
---
 csv-mode-tests.el | 10 ++++++++++
 csv-mode.el       |  7 ++++++-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/csv-mode-tests.el b/csv-mode-tests.el
index 12e009ecf9..213fd033b2 100644
--- a/csv-mode-tests.el
+++ b/csv-mode-tests.el
@@ -170,5 +170,15 @@
   (should (equal (csv--unquote-value "|Hello, World|")
                  "|Hello, World|")))
 
+(ert-deftest csv-tests-guess-separator-avoid-comment ()
+  ;; bug#71042
+  (let ((testdata "###
+###
+Foo;Bar;Quux
+123;456;blah, blah
+"))
+    (message "Guessed separator: %c" (csv-guess-separator testdata))
+    (should-not (equal (csv-guess-separator testdata) ?#))))
+
 (provide 'csv-mode-tests)
 ;;; csv-mode-tests.el ends here
diff --git a/csv-mode.el b/csv-mode.el
index 1e0f99ef92..6bdfcafbbb 100644
--- a/csv-mode.el
+++ b/csv-mode.el
@@ -4,7 +4,7 @@
 
 ;; Author: "Francis J. Wright" <F.J.Wright@qmul.ac.uk>
 ;; Maintainer: emacs-devel@gnu.org
-;; Version: 1.25
+;; Version: 1.26
 ;; Package-Requires: ((emacs "27.1") (cl-lib "0.5"))
 ;; Keywords: convenience
 
@@ -107,6 +107,10 @@
 
 ;;; News:
 
+;; Since 1.26:
+;; - `csv-guess-separator' will no longer guess the comment-start
+;;    character as a potential separator character.
+
 ;; Since 1.25:
 ;; - The ASCII control character 31 Unit Separator can now be
 ;;   recognized as a CSV separator by `csv-guess-separator'.
@@ -1902,6 +1906,7 @@ When CUTOFF is passed, look only at the first CUTOFF number of characters."
                  (or (= c ?\t)
                      (= c ?\C-_)
                      (and (not (member c '(?. ?/ ?\" ?')))
+                          (not (= c (string-to-char csv-comment-start)))
                           (not (member (get-char-code-property c 'general-category)
                                        '(Lu Ll Lt Lm Lo Nd Nl No Ps Pe Cc Co))))))
         (puthash c t chars)))
-- 
2.25.1



* bug#71042: 29.3; csv-mode: csv-guess-set-separator chokes on comment lines
  2024-07-26 10:07 ` Peder O. Klingenberg
@ 2024-08-04  8:25   ` Eli Zaretskii
  2024-08-04 14:36     ` Philip Kaludercic
  0 siblings, 1 reply; 4+ messages in thread
From: Eli Zaretskii @ 2024-08-04  8:25 UTC (permalink / raw)
  To: Peder O. Klingenberg, Philip Kaludercic; +Cc: 71042, marcoep

> Cc: 71042@debbugs.gnu.org
> From: "Peder O. Klingenberg" <peder@klingenberg.no>
> Date: Fri, 26 Jul 2024 12:07:00 +0200
> 
> On Sat, 2024-05-18 12:14:21 +0200, Marco Emilio Poleggi wrote:
> 
> > Comment lines seem well supported by csv-mode. However, function
> > `csv-guess-set-separator` has troubles with them.
> 
> Easily fixed, but needs someone with commit privileges to apply the patch

Thanks.

Philip, could you please install this?






* bug#71042: 29.3; csv-mode: csv-guess-set-separator chokes on comment lines
  2024-08-04  8:25   ` Eli Zaretskii
@ 2024-08-04 14:36     ` Philip Kaludercic
  0 siblings, 0 replies; 4+ messages in thread
From: Philip Kaludercic @ 2024-08-04 14:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 71042-done, Peder O. Klingenberg, marcoep

Eli Zaretskii <eliz@gnu.org> writes:

>> Cc: 71042@debbugs.gnu.org
>> From: "Peder O. Klingenberg" <peder@klingenberg.no>
>> Date: Fri, 26 Jul 2024 12:07:00 +0200
>> 
>> On Sat, 2024-05-18 12:14:21 +0200, Marco Emilio Poleggi wrote:
>> 
>> > Comment lines seem well supported by csv-mode. However, function
>> > `csv-guess-set-separator` has troubles with them.
>> 
>> Easily fixed, but needs someone with commit privileges to apply the patch
>
> Thanks.
>
> Philip, could you please install this?

Done, and closing the report.  Note that for some reason the patch
would not apply, and I couldn't figure out why.  I had to reconstruct
the change by hand, so I hope nothing went wrong.

-- 
	Philip Kaludercic on peregrine




