* bug#71042: 29.3; csv-mode: csv-guess-set-separator chokes on comment lines
From: Marco Emilio Poleggi via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-05-18 10:14 UTC
To: 71042
Comment lines seem well supported by csv-mode. However, the function
`csv-guess-set-separator` has trouble with them.
Visit this input file:
-------------------------8<-----------------------
$ cat >csv-test.csv <<EOF
###
###
Foo;Bar;Quux
123;456;blah, blah
EOF
------------------------->8-----------------------
With no special settings, the default comma (',') separator is
used.
Now run `M-x csv-guess-set-separator`. I'd expect the actual separator,
the semicolon (';'), to be detected. Instead, the hash ('#') is
detected, as confirmed by `C-h v csv-separators`.
It looks like csv-guess-set-separator is fooled by the first two lines,
which contain the same number of hash characters. Indeed, breaking that
symmetry makes the ';' detectable.
But, hey, csv-mode says:
;; CSV mode commands ignore blank lines and comment lines beginning
;; with the value of the buffer local variable `csv-comment-start',
;; which by default is #. ...
So there's probably a bug in `csv-guess-set-separator`.
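The guess can also be checked from Lisp without visiting a file. This is a minimal sketch, assuming a csv-mode version before 1.26 is installed from GNU ELPA and that `csv-guess-separator` takes the text as a string, which is how the test case in the follow-up patch calls it:

```elisp
;; Sketch only: needs the csv-mode package; nothing here is part of Emacs core.
(require 'csv-mode)

(let ((text "###\n###\nFoo;Bar;Quux\n123;456;blah, blah\n"))
  ;; `csv-guess-separator' returns the guessed separator as a character.
  ;; On the affected versions the report says this is `#', not `;'.
  (message "Guessed separator: %c" (csv-guess-separator text)))
```

Evaluating this with `M-:` or in `*scratch*` should show the guessed character in the echo area.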
Cheers,
Marco
--------------------------------------------------------------------------------
In GNU Emacs 29.3 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.41,
cairo version 1.18.0) of 2024-05-14 built on localhost
Windowing system distributor 'The X.Org Foundation', version 11.0.12101013
System Description: Gentoo Linux
Configured using:
'configure --prefix=/usr --build=x86_64-pc-linux-gnu
--host=x86_64-pc-linux-gnu --mandir=/usr/share/man
--infodir=/usr/share/info --datadir=/usr/share --sysconfdir=/etc
--localstatedir=/var/lib --datarootdir=/usr/share
--disable-silent-rules --docdir=/usr/share/doc/emacs-29.3-r2
--htmldir=/usr/share/doc/emacs-29.3-r2/html --libdir=/usr/lib64
--program-suffix=-emacs-29 --includedir=/usr/include/emacs-29
--infodir=/usr/share/info/emacs-29 --localstatedir=/var
--enable-locallisppath=/etc/emacs:/usr/share/emacs/site-lisp
--without-compress-install --without-hesiod --without-pop
--with-file-notification=inotify --with-pdumper --disable-acl
--without-dbus --with-modules --without-gameuser --with-libgmp
--with-gpm --with-native-compilation=aot --without-json
--without-kerberos --without-kerberos5 --with-lcms2 --without-xml2
--without-mailutils --without-selinux --without-sqlite3 --with-gnutls
--without-libsystemd --with-threads --without-tree-sitter
--without-wide-int --with-sound=no --with-zlib --with-x --without-pgtk
--without-ns --without-gconf --without-gsettings
--with-toolkit-scroll-bars --with-xpm --with-xft --with-cairo
--without-harfbuzz --without-libotf --without-m17n-flt
--with-x-toolkit=gtk3 --with-xwidgets --without-gif --with-jpeg
--with-png --with-rsvg --without-tiff --without-webp
--without-imagemagick --with-dumping=pdumper 'CFLAGS=-march=haswell
-pipe -O2 -fomit-frame-pointer' 'LDFLAGS=-Wl,-O1 -Wl,--as-needed
-Wl,-z,pack-relative-relocs''
Configured features:
CAIRO FREETYPE GLIB GMP GNUTLS GPM JPEG LCMS2 MODULES NATIVE_COMP NOTIFY
INOTIFY PDUMPER PNG RSVG SECCOMP THREADS TOOLKIT_SCROLL_BARS X11 XDBE
XIM XINPUT2 XPM XWIDGETS GTK3 ZLIB
Important settings:
value of $LC_TIME: en_DK.UTF-8
value of $LANG: en_US.utf8
locale-coding-system: utf-8-unix
Major mode: CSV
Minor modes in effect:
windmove-mode: t
shell-dirtrack-mode: t
csv-field-index-mode: t
flyspell-mode: t
server-mode: t
desktop-save-mode: t
global-auto-complete-mode: t
savehist-mode: t
delete-selection-mode: t
override-global-mode: t
tooltip-mode: t
global-eldoc-mode: t
show-paren-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
column-number-mode: t
line-number-mode: t
indent-tabs-mode: t
transient-mark-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
Load-path shadows:
/usr/share/emacs/site-lisp/cmake-mode hides
/usr/share/emacs/site-lisp/cmake/cmake-mode
/usr/share/emacs/site-lisp/desktop-entry-mode hides
/usr/share/emacs/site-lisp/desktop-file-utils/desktop-entry-mode
/home/marcoep/.emacs.d/elpa/transient-20240509.1849/transient hides
/usr/share/emacs/29.3/lisp/transient
/home/marcoep/.emacs.d/elpa/bind-key-20230203.2004/bind-key hides
/usr/share/emacs/29.3/lisp/use-package/bind-key
/home/marcoep/.emacs.d/elpa/use-package-20230426.2324/use-package-bind-key
hides /usr/share/emacs/29.3/lisp/use-package/use-package-bind-key
/home/marcoep/.emacs.d/elpa/use-package-20230426.2324/use-package-core hides
/usr/share/emacs/29.3/lisp/use-package/use-package-core
/home/marcoep/.emacs.d/elpa/use-package-20230426.2324/use-package-delight
hides /usr/share/emacs/29.3/lisp/use-package/use-package-delight
/home/marcoep/.emacs.d/elpa/use-package-20230426.2324/use-package-diminish
hides /usr/share/emacs/29.3/lisp/use-package/use-package-diminish
/home/marcoep/.emacs.d/elpa/use-package-20230426.2324/use-package-ensure hides
/usr/share/emacs/29.3/lisp/use-package/use-package-ensure
/home/marcoep/.emacs.d/elpa/use-package-20230426.2324/use-package-jump hides
/usr/share/emacs/29.3/lisp/use-package/use-package-jump
/home/marcoep/.emacs.d/elpa/use-package-20230426.2324/use-package-lint hides
/usr/share/emacs/29.3/lisp/use-package/use-package-lint
/home/marcoep/.emacs.d/elpa/use-package-20230426.2324/use-package hides
/usr/share/emacs/29.3/lisp/use-package/use-package
/home/marcoep/.emacs.d/elpa/project-0.10.0/project hides
/usr/share/emacs/29.3/lisp/progmodes/project
Features:
(shadow mail-extr emacsbug shortdoc cl-print ses unsafep align help-fns
radix-tree lua-mode novice sgml-mode facemenu two-column org-duration
rect conf-mode cus-edit cus-start mm-archive network-stream url-cache
tramp-cmds yasnippet misearch multi-isearch windmove vc-hg vc-bzr
tramp-sh jupyter-tramp tramp-cache time-stamp jupyter-server
jupyter-server-kernel jupyter-repl jupyter-widget-client simple-httpd pp
jupyter-client jupyter-kernel jupyter-kernelspec jupyter-env
jupyter-monads thunk jupyter-messages hmac-def jupyter-mime
jupyter-rest-api url-http url-auth url-gw nsm websocket bindat
jupyter-base tramp tramp-loaddefs trampver tramp-integration files-x
tramp-compat shell web-mode disp-table cursor-sensor csv-mode sort
yaml-mode oc-basic org-element org-persist org-id org-refile avl-tree
generator ol-eww eww xdg url-queue mm-url ol-rmail ol-mhe ol-irc ol-info
ol-gnus nnselect gnus-art mm-uu mml2015 mm-view mml-smime smime gnutls
dig gnus-sum shr pixel-fill kinsoku url-file svg dom gnus-group
gnus-undo gnus-start gnus-dbus dbus xml gnus-cloud nnimap nnmail
mail-source utf7 nnoo parse-time gnus-spec gnus-int gnus-range message
sendmail puny rfc822 mml mml-sec epa epg rfc6068 epg-config mm-decode
mm-bodies mm-encode mail-parse rfc2231 rfc2047 rfc2045 ietf-drums
mailabbrev gmm-utils mailheader gnus-win gnus nnheader gnus-util
mail-utils range mm-util mail-prsvr ol-docview doc-view filenotify
jka-compr image-mode exif ol-bibtex bibtex iso8601 ol-bbdb ol-w3m ol-doi
org-link-doi org-clock org ob ob-tangle ob-ref ob-lob ob-table ob-exp
org-macro org-src ob-comint org-pcomplete pcomplete org-list
org-footnote org-faces org-entities ob-emacs-lisp ob-core ob-eval
org-cycle org-table ol org-fold org-fold-core org-keys oc org-loaddefs
find-func cal-menu calendar cal-loaddefs org-version org-compat org-macs
vc-git diff-mode vc-dispatcher flymake-proc flymake project compile
text-property-search comint ansi-osc ansi-color display-line-numbers
smartparens loadhist time-date ring flyspell ispell yank-media
poly-markdown polymode poly-lock polymode-base polymode-export
polymode-weave polymode-compat polymode-methods pcase polymode-core comp
comp-cstr warnings format-spec polymode-classes eieio-custom wid-edit
eieio-base markdown-mode noutline outline icons unfill server desktop
frameset darktooth-dark-theme darktooth-theme darktooth autothemer color
lisp-mnt auto-complete-config auto-complete advice edmacro kmacro popup
cl-extra help-mode savehist delsel cus-load origami origami-parsers cl s
dash hexrgb fill-column-indicator ffap thingatpt printing ps-print
ps-print-loaddefs lpr nginx-mode ebuild-mode skeleton sh-script rx smie
treesit executable finder-inf use-package use-package-ensure
use-package-delight use-package-diminish use-package-bind-key bind-key
easy-mmode use-package-core derived site-gentoo ac-php-autoloads
ac-php-core-autoloads tex-site debian-el-autoloads debian-el dired
dired-loaddefs helm-autoloads helm-core-autoloads jupyter-autoloads
pdf-tools-autoloads poly-markdown-autoloads markdown-mode-autoloads
realgud-recursive-autoloads smartparens-autoloads dash-autoloads
transient-autoloads wfnames-autoloads with-editor-autoloads info package
browse-url url url-proxy url-privacy url-expand url-methods url-history
url-cookie generate-lisp-file url-domsuf url-util mailcap url-handlers
url-parse auth-source cl-seq eieio eieio-core cl-macs password-cache
json subr-x map byte-opt gv bytecomp byte-compile url-vars cl-loaddefs
cl-lib rmc iso-transl tooltip cconv eldoc paren electric uniquify
ediff-hook vc-hooks lisp-float-type elisp-mode mwheel term/x-win x-win
term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode lisp-mode prog-mode register
page tab-bar menu-bar rfn-eshadow isearch easymenu timer select
scroll-bar mouse jit-lock font-lock syntax font-core term/tty-colors
frame minibuffer nadvice seq simple cl-generic indonesian philippine
cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese composite emoji-zwj charscript
charprop case-table epa-hook jka-cmpr-hook help abbrev obarray oclosure
cl-preloaded button loaddefs theme-loaddefs faces cus-face macroexp
files window text-properties overlay sha1 md5 base64 format env
code-pages mule custom widget keymap hashtable-print-readable backquote
threads xwidget-internal inotify lcms2 dynamic-setting
font-render-setting cairo move-toolbar gtk x-toolkit xinput2 x multi-tty
make-network-process native-compile emacs)
Memory information:
((conses 16 962235 132224)
(symbols 48 46289 0)
(strings 32 223160 6829)
(string-bytes 1 6180337)
(vectors 16 80049)
(vector-slots 8 2114195 174122)
(floats 8 1042 591)
(intervals 56 6097 1017)
(buffers 976 27))
From: Peder O. Klingenberg @ 2024-07-26 10:07 UTC
To: Marco Emilio Poleggi; +Cc: 71042
Tags: patch
On Sat, 2024-05-18 12:14:21 +0200, Marco Emilio Poleggi wrote:
> Comment lines seem well supported by csv-mode. However, the function
> `csv-guess-set-separator` has trouble with them.
Easily fixed, but it needs someone with commit privileges to apply the patch.
[-- Attachment #2: 0001-Disregard-comments-when-guessing-separators.patch --]
[-- Type: text/x-diff, Size: 2361 bytes --]
From 6bcdef0ed429dfe88a607478c504eb52186e639e Mon Sep 17 00:00:00 2001
From: "Peder O. Klingenberg" <peder@klingenberg.no>
Date: Fri, 26 Jul 2024 10:13:24 +0200
Subject: [PATCH] Disregard comments when guessing separators
* csv-mode.el (csv--separator-candidates): Ignore
csv-comment-start when looking for potential separators.
* csv-mode-tests.el (csv-tests-guess-separator-avoid-comment):
Previously failing test case guessing the comment start as
separator char.
(Bug#71042 fixed)
---
csv-mode-tests.el | 10 ++++++++++
csv-mode.el | 7 ++++++-
2 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/csv-mode-tests.el b/csv-mode-tests.el
index 12e009ecf9..213fd033b2 100644
--- a/csv-mode-tests.el
+++ b/csv-mode-tests.el
@@ -170,5 +170,15 @@
   (should (equal (csv--unquote-value "|Hello, World|")
                  "|Hello, World|")))
 
+(ert-deftest csv-tests-guess-separator-avoid-comment ()
+  ;; bug#71042
+  (let ((testdata "###
+###
+Foo;Bar;Quux
+123;456;blah, blah
+"))
+    (message "Guessed separator: %c" (csv-guess-separator testdata))
+    (should-not (equal (csv-guess-separator testdata) ?#))))
+
 (provide 'csv-mode-tests)
 ;;; csv-mode-tests.el ends here
diff --git a/csv-mode.el b/csv-mode.el
index 1e0f99ef92..6bdfcafbbb 100644
--- a/csv-mode.el
+++ b/csv-mode.el
@@ -4,7 +4,7 @@
;; Author: "Francis J. Wright" <F.J.Wright@qmul.ac.uk>
;; Maintainer: emacs-devel@gnu.org
-;; Version: 1.25
+;; Version: 1.26
;; Package-Requires: ((emacs "27.1") (cl-lib "0.5"))
;; Keywords: convenience
@@ -107,6 +107,10 @@
;;; News:
+;; Since 1.26:
+;; - `csv-guess-separator' will no longer guess the comment-start
+;; character as a potential separator character.
+
;; Since 1.25:
;; - The ASCII control character 31 Unit Separator can now be
;; recognized as a CSV separator by `csv-guess-separator'.
@@ -1902,6 +1906,7 @@ When CUTOFF is passed, look only at the first CUTOFF number of characters."
                  (or (= c ?\t)
                      (= c ?\C-_)
                      (and (not (member c '(?. ?/ ?\" ?')))
+                          (not (= c (string-to-char csv-comment-start)))
                           (not (member (get-char-code-property c 'general-category)
                                        '(Lu Ll Lt Lm Lo Nd Nl No Ps Pe Cc Co))))))
         (puthash c t chars)))
--
2.25.1
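The condition touched by the last hunk can be read in isolation as a predicate on a single character. The following is a minimal sketch and not the code in csv-mode.el: the function name `my-csv-separator-candidate-p` and its COMMENT-START argument are hypothetical, introduced only so the check can be tried without the package loaded.

```elisp
;; Hypothetical helper mirroring the condition in the hunk above.
;; The patch itself reads the variable `csv-comment-start' instead of
;; taking an argument.
(defun my-csv-separator-candidate-p (c comment-start)
  "Return non-nil if character C may be considered as a separator.
COMMENT-START is a string such as \"#\"; only its first character is used."
  (or (= c ?\t)                         ; TAB is always a candidate
      (= c ?\C-_)                       ; so is ASCII 31, Unit Separator
      (and (not (member c '(?. ?/ ?\" ?')))
           ;; The new check: never offer the comment-start character.
           (not (= c (string-to-char comment-start)))
           ;; Skip letters, digits, brackets and control characters.
           (not (member (get-char-code-property c 'general-category)
                        '(Lu Ll Lt Lm Lo Nd Nl No Ps Pe Cc Co))))))
```

With this sketch, `(my-csv-separator-candidate-p ?# "#")` is rejected while `(my-csv-separator-candidate-p ?\; "#")` is accepted. The sketch assumes COMMENT-START is a string, since `string-to-char` signals an error when given nil.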
From: Eli Zaretskii @ 2024-08-04 8:25 UTC
To: Peder O. Klingenberg, Philip Kaludercic; +Cc: 71042, marcoep
> Cc: 71042@debbugs.gnu.org
> From: "Peder O. Klingenberg" <peder@klingenberg.no>
> Date: Fri, 26 Jul 2024 12:07:00 +0200
>
> On Sat, 2024-05-18 12:14:21 +0200, Marco Emilio Poleggi wrote:
>
> > Comment lines seem well supported by csv-mode. However, the function
> > `csv-guess-set-separator` has trouble with them.
>
> Easily fixed, but it needs someone with commit privileges to apply the patch.
Thanks.
Philip, could you please install this?
From: Philip Kaludercic @ 2024-08-04 14:36 UTC
To: Eli Zaretskii; +Cc: 71042-done, Peder O. Klingenberg, marcoep
Eli Zaretskii <eliz@gnu.org> writes:
>> Cc: 71042@debbugs.gnu.org
>> From: "Peder O. Klingenberg" <peder@klingenberg.no>
>> Date: Fri, 26 Jul 2024 12:07:00 +0200
>>
>> On Sat, 2024-05-18 12:14:21 +0200, Marco Emilio Poleggi wrote:
>>
>> > Comment lines seem well supported by csv-mode. However, the function
>> > `csv-guess-set-separator` has trouble with them.
>>
>> Easily fixed, but it needs someone with commit privileges to apply the patch.
>
> Thanks.
>
> Philip, could you please install this?
Done, and closing the report. Note that for some reason the patch
would not apply, and I couldn't figure out why. I had to reconstruct
the change by hand, so I hope nothing went wrong.
--
Philip Kaludercic on peregrine