* bug#55787: 29.0.50; inconsistent sort order with ls-lisp-version-lessp
From: TAKAHASHI Yoshio @ 2022-06-03 23:21 UTC
To: 55787

Hi,

I encounter an inconsistent sort result. The position of "01.0" and/or
"01.2" seems wrong.

$ cat /tmp/test.el
(require 'ls-lisp)
(print (sort (vector "01.0" "10" "010" "01.2")
             (lambda (x y)
               (ls-lisp-version-lessp x y))))
$ emacs -Q --batch -l /tmp/test.el

["01.0" "10" "010" "01.2"]
$

In GNU Emacs 29.0.50 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.33,
 cairo version 1.16.0) of 2022-05-26 built on LAPTOP-89LTAUNV
Repository revision: 531688a19e2125b20c2efa032e02b9cebbedb397
Repository branch: master
Windowing system distributor 'Microsoft Corporation', version 11.0.12010000
System Description: Ubuntu 22.04 LTS

* bug#55787: 29.0.50; inconsistent sort order with ls-lisp-version-lessp
From: Eli Zaretskii @ 2022-06-04 7:44 UTC
To: TAKAHASHI Yoshio; +Cc: 55787

> From: TAKAHASHI Yoshio <yfb02119@nifty.com>
> Date: Sat, 04 Jun 2022 08:21:48 +0900
>
> I encounter an inconsistent sort result. The position of "01.0" and/or
> "01.2" seems wrong.
>
>
> $ cat /tmp/test.el
> (require 'ls-lisp)
> (print (sort (vector "01.0" "10" "010" "01.2")
>              (lambda (x y)
>                (ls-lisp-version-lessp x y))))
> $ emacs -Q --batch -l /tmp/test.el
>
> ["01.0" "10" "010" "01.2"]

Why do you think this is wrong? This function is not meant to compare
dotted versions with undotted ones, only dotted to dotted or undotted
to undotted. The strings are supposed to be file names, where a dot
begins an extension.

See the node "More details about version sort" in the GNU Coreutils
manual for more info.

If you want a general-purpose version-comparison function, use
version< instead.

* bug#55787: 29.0.50; inconsistent sort order with ls-lisp-version-lessp
From: TAKAHASHI Yoshio @ 2022-06-04 14:11 UTC
To: Eli Zaretskii; +Cc: 55787

Eli-san,

Thank you for your reply.

>> I encounter an inconsistent sort result. The position of "01.0" and/or
>> "01.2" seems wrong.
>>
>>
>> $ cat /tmp/test.el
>> (require 'ls-lisp)
>> (print (sort (vector "01.0" "10" "010" "01.2")
>>              (lambda (x y)
>>                (ls-lisp-version-lessp x y))))
>> $ emacs -Q --batch -l /tmp/test.el
>>
>> ["01.0" "10" "010" "01.2"]
>
> Why do you think this is wrong? This function is not meant to compare
> dotted versions with undotted ones, only dotted to dotted or undotted
> to undotted. The strings are supposed to be file names, where a dot
> begins an extension.
>
> See the node "More details about version sort" in the GNU Coreutils
> manual for more info.

I report this "inconsistency" because ls-lisp does not sort files as ls
program does when `dired-listing-switches' has 'v', such as "-alGv".

# "01.0", "10", ... is minimal reproducible pattern that I stripped down
# my real filenames pattern.

I'm not aware that `ls-lisp-version-lessp' does not support
dotted-undotted mixed cases. Doc string says it acts as `strverscmp', I
expect the same result (order) in dired buffer. And in below example,
the result seems to act like `strverscmp'.

(print (sort (vector "01.0" "10" "01.2") ; no "010" in arg.
             (lambda (x y)
               (ls-lisp-version-lessp x y))))

["01.0" "01.2" "10"]

> If you want a general-purpose version-comparison function, use
> version< instead.

Umm, do I need to use `version<' in `ls-lisp-handle-switches' with
extracting numerical part from filename argument?

--
tkh
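
Editorial sketch, not part of the original message: the message above shows that adding "010" to the input changes where "01.2" ends up. One way to investigate is to check whether the predicate behaves as a consistent strict order on those strings. The helper below is a minimal sketch under that assumption. Its name, `demo-order-violations`, is hypothetical and it is not part of ls-lisp. It only reports pairs and triples on which a predicate contradicts itself.

```elisp
(require 'ls-lisp)

;; Hypothetical helper for illustration; not part of ls-lisp.
(defun demo-order-violations (pred items)
  "Return a list of ways PRED fails to be a strict order on ITEMS.
Each element is (reflexive A), (symmetric A B) or (intransitive A B C)."
  (let (violations)
    (dolist (a items)
      ;; A strict "less than" must never hold between a value and itself.
      (when (funcall pred a a)
        (push (list 'reflexive a) violations))
      (dolist (b items)
        ;; A < B and B < A must not both hold for distinct values.
        (when (and (not (equal a b))
                   (funcall pred a b)
                   (funcall pred b a))
          (push (list 'symmetric a b) violations))
        (dolist (c items)
          ;; A < B and B < C must imply A < C.
          (when (and (funcall pred a b)
                     (funcall pred b c)
                     (not (funcall pred a c)))
            (push (list 'intransitive a b c) violations)))))
    (nreverse violations)))

;; Usage with the strings from the report.  The result depends on the
;; Emacs version in use, so no expected output is shown here.
(print (demo-order-violations #'ls-lisp-version-lessp
                              '("01.0" "10" "010" "01.2")))
```

An empty result means the predicate is self-consistent on that input; any reported triple is a candidate explanation for a sort result that depends on which other elements are present.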

* bug#55787: 29.0.50; inconsistent sort order with ls-lisp-version-lessp
From: Eli Zaretskii @ 2022-06-04 14:52 UTC
To: TAKAHASHI Yoshio; +Cc: 55787

> From: TAKAHASHI Yoshio <yfb02119@nifty.com>
> Cc: 55787@debbugs.gnu.org
> Date: Sat, 04 Jun 2022 23:11:17 +0900
>
> >> $ cat /tmp/test.el
> >> (require 'ls-lisp)
> >> (print (sort (vector "01.0" "10" "010" "01.2")
> >>              (lambda (x y)
> >>                (ls-lisp-version-lessp x y))))
> >> $ emacs -Q --batch -l /tmp/test.el
> >>
> >> ["01.0" "10" "010" "01.2"]
> >
> > Why do you think this is wrong? This function is not meant to compare
> > dotted versions with undotted ones, only dotted to dotted or undotted
> > to undotted. The strings are supposed to be file names, where a dot
> > begins an extension.
> >
> > See the node "More details about version sort" in the GNU Coreutils
> > manual for more info.
>
> I report this "inconsistency" because ls-lisp does not sort files as ls
> program does when `dired-listing-switches' has 'v', such as "-alGv".

What do you see with 'ls' and what do you see with ls-lisp? Also, in
which locale are you trying this with 'ls'?

> # "01.0", "10", ... is minimal reproducible pattern that I stripped down
> # my real filenames pattern.

I'd prefer to see the real file names instead, since that's what
ls-lisp-version-lessp was written to handle.

> I'm not aware that `ls-lisp-version-lessp' does not support
> dotted-undotted mixed cases. Doc string says it acts as `strverscmp', I
> expect the same result (order) in dired buffer. And in below example,
> the result seems to act like `strverscmp'.

The exact spec of strverscmp is not known, AFAIK, and the
implementation is a state machine, which is somewhat hard to
reverse-engineer. I'm only aware of the documentation in the glibc
manual; did you read it?

Comparing with 'ls' is also somewhat problematic, because in UTF-8
locales its collation rules ignore some punctuation characters --
again, because that's how glibc implements that. Emacs on MS-Windows
can emulate this behavior if you set w32-collate-ignore-punctuation to
a non-nil value.

> (print (sort (vector "01.0" "10" "01.2") ; no "010" in arg.
>              (lambda (x y)
>                (ls-lisp-version-lessp x y))))
> ["01.0" "01.2" "10"]

If I create files by the names in your original example, I see this in
a Dired buffer created by "C-u C-x d" after I set the switches to
"-alv":

  drwxrwxrwx 1 xxxxx yyy 0 06-04 10:19 .
  drwxrwxrwx 1 xxxxx yyy 0 06-04 11:02 ..
  -rw-rw-rw- 1 xxxxx yyy 0 06-04 10:19 10
  -rw-rw-rw- 1 xxxxx yyy 0 06-04 10:19 010
  -rw-rw-rw- 1 xxxxx yyy 0 06-04 10:19 01.0
  -rw-rw-rw- 1 xxxxx yyy 0 06-04 10:19 01.2

which seems reasonable.

> > If you want a general-purpose version-comparison function, use
> > version< instead.
>
> Umm, do I need to use `version<' in `ls-lisp-handle-switches' with
> extracting numerical part from filename argument?

No, I wrote that before I understood what you were trying to do.
Please ignore that part.
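
Editorial sketch, not part of the original message: the settings discussed above could be collected in an init-file fragment like the one below. It assumes you want to exercise the Lisp emulation of ls rather than an external program. It also sets the default listing switches, whereas the message entered them interactively via "C-u C-x d". The `boundp` guard is an added precaution so that the fragment does nothing surprising on systems other than MS-Windows.

```elisp
;; Use the Lisp emulation of ls even where an external ls program
;; exists (on MS-Windows this is already the default).
(setq ls-lisp-use-insert-directory-program nil)
(require 'ls-lisp)

;; The switches mentioned in the message; "v" requests version sort.
(setq dired-listing-switches "-alv")

;; MS-Windows only: emulate the glibc habit of ignoring some
;; punctuation when collating.  Guarded so it is harmless elsewhere.
(when (boundp 'w32-collate-ignore-punctuation)
  (setq w32-collate-ignore-punctuation t))
```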

* bug#55787: 29.0.50; inconsistent sort order with ls-lisp-version-lessp
From: TAKAHASHI Yoshio @ 2022-06-05 2:37 UTC
To: Eli Zaretskii; +Cc: 55787

Eli-san,

With further tests, this ls-lisp behavior occurs only on my Mingw64
Windows Emacs environment. I can not reproduce it on my WSL2 Ubuntu
environment.

> What do you see with 'ls' and what do you see with ls-lisp? Also, in
> which locale are you trying this with 'ls'?

I include my trial to hope it can be reproduced on your environment. In
this scenario, I use a little more real filenames instead of just number.

================================================================
On my Windows machine, output from "M-! (shell-command) env"

OS=Windows_NT
LANG=ja_JP.UTF-8
LC_MESSAGES=en_US.UTF-8
LC_TIME=C

================================================================
tkh$ cat ../createfiles.sh
touch "34 アルバム-300dpi.jpg"
touch "34 アルバム-300dpi.png"
touch "054_交換機.jpg"
touch "054_交換機.png"
touch "91 部分カット.jpg"
touch "91 部分カット.png"
touch "0717-パソコン.jpg"
touch "0717-パソコン.png"
touch "1935 社屋.jpg"
touch "1935 社屋.png"
touch "FFFF_縁カット.jpg"
touch "FFFF_縁カット.png"
touch "hhhh.jpg"
touch "hhhh.png"
tkh$ sh ../createfiles.sh
tkh$ ls -l
total 0
-rw-r--r-- 1 tkh 0 Jun 5 10:45 054_交換機.jpg
-rw-r--r-- 1 tkh 0 Jun 5 10:45 054_交換機.png
-rw-r--r-- 1 tkh 0 Jun 5 10:45 0717-パソコン.jpg
-rw-r--r-- 1 tkh 0 Jun 5 10:45 0717-パソコン.png
-rw-r--r-- 1 tkh 0 Jun 5 10:45 1935 社屋.jpg
-rw-r--r-- 1 tkh 0 Jun 5 10:45 1935 社屋.png
-rw-r--r-- 1 tkh 0 Jun 5 10:45 34 アルバム-300dpi.jpg
-rw-r--r-- 1 tkh 0 Jun 5 10:45 34 アルバム-300dpi.png
-rw-r--r-- 1 tkh 0 Jun 5 10:45 91 部分カット.jpg
-rw-r--r-- 1 tkh 0 Jun 5 10:45 91 部分カット.png
-rw-r--r-- 1 tkh 0 Jun 5 10:45 FFFF_縁カット.jpg
-rw-r--r-- 1 tkh 0 Jun 5 10:45 FFFF_縁カット.png
-rw-r--r-- 1 tkh 0 Jun 5 10:45 hhhh.jpg
-rw-r--r-- 1 tkh 0 Jun 5 10:45 hhhh.png
tkh$ ls -lv
total 0
-rw-r--r-- 1 tkh 0 Jun 5 10:45 34 アルバム-300dpi.jpg
-rw-r--r-- 1 tkh 0 Jun 5 10:45 34 アルバム-300dpi.png
-rw-r--r-- 1 tkh 0 Jun 5 10:45 054_交換機.jpg
-rw-r--r-- 1 tkh 0 Jun 5 10:45 054_交換機.png
-rw-r--r-- 1 tkh 0 Jun 5 10:45 91 部分カット.jpg
-rw-r--r-- 1 tkh 0 Jun 5 10:45 91 部分カット.png
-rw-r--r-- 1 tkh 0 Jun 5 10:45 0717-パソコン.jpg
-rw-r--r-- 1 tkh 0 Jun 5 10:45 0717-パソコン.png
-rw-r--r-- 1 tkh 0 Jun 5 10:45 1935 社屋.jpg
-rw-r--r-- 1 tkh 0 Jun 5 10:45 1935 社屋.png
-rw-r--r-- 1 tkh 0 Jun 5 10:45 FFFF_縁カット.jpg
-rw-r--r-- 1 tkh 0 Jun 5 10:45 FFFF_縁カット.png
-rw-r--r-- 1 tkh 0 Jun 5 10:45 hhhh.jpg
-rw-r--r-- 1 tkh 0 Jun 5 10:45 hhhh.png
tkh$

================================================================
On my Windows machine, "054_交換機.{jpg,png}" are wrongly listed in
dired buffer.

  drwxrwxrwx 1 0 Jun 5 10:45 .
  drwxrwxrwx 1 0 Jun 5 10:45 ..
  -rw-rw-rw- 1 0 Jun 5 10:45 34 アルバム-300dpi.jpg
  -rw-rw-rw- 1 0 Jun 5 10:45 34 アルバム-300dpi.png
  -rw-rw-rw- 1 0 Jun 5 10:45 054_交換機.png
  -rw-rw-rw- 1 0 Jun 5 10:45 91 部分カット.jpg
  -rw-rw-rw- 1 0 Jun 5 10:45 91 部分カット.png
  -rw-rw-rw- 1 0 Jun 5 10:45 0717-パソコン.jpg
  -rw-rw-rw- 1 0 Jun 5 10:45 0717-パソコン.png
  -rw-rw-rw- 1 0 Jun 5 10:45 054_交換機.jpg
  -rw-rw-rw- 1 0 Jun 5 10:45 1935 社屋.jpg
  -rw-rw-rw- 1 0 Jun 5 10:45 1935 社屋.png
  -rw-rw-rw- 1 0 Jun 5 10:45 FFFF_縁カット.jpg
  -rw-rw-rw- 1 0 Jun 5 10:45 FFFF_縁カット.png
  -rw-rw-rw- 1 0 Jun 5 10:45 hhhh.jpg
  -rw-rw-rw- 1 0 Jun 5 10:45 hhhh.png

================================================================

When I drilled down to understand this listing, I encountered sort order
inconsistency, from my point of view, reported in my original mail.

>> # "01.0", "10", ... is minimal reproducible pattern that I stripped down
>> # my real filenames pattern.
>
> I'd prefer to see the real file names instead, since that's what
> ls-lisp-version-lessp was written to handle.

I simplified too much in my original mail. That was not good for a
report, sorry.

> The exact spec of strverscmp is not known, AFAIK, and the
> implementation is a state machine, which is somewhat hard to
> reverse-engineer. I'm only aware of the documentation in the glibc
> manual; did you read it?

I read the strverscmp man page, then the source. I made no attempt to
understand the state machine implementation.

> Comparing with 'ls' is also somewhat problematic, because in UTF-8
> locales its collation rules ignore some punctuation characters --
> again, because that's how glibc implements that. Emacs on MS-Windows
> can emulate this behavior if you set w32-collate-ignore-punctuation to
> a non-nil value.

I think `w32-collate-ignore-punctuation' seems not to affect my test
case. In my trial, the dired buffer listing is the same with t / nil of
`w32-collate-ignore-punctuation'.

--
tkh
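
Editorial sketch, not part of the original message: the Dired ordering above can probably be examined in batch mode without creating any files, by sorting the same names with `ls-lisp-version-lessp` and then asking the predicate whether any two neighbours in the result are out of order. The names `demo-file-names` and `demo-adjacent-inversions` are hypothetical, introduced only for this illustration. The output depends on the Emacs version, so none is shown.

```elisp
(require 'ls-lisp)

;; Hypothetical variable; the names are copied from createfiles.sh above.
(defvar demo-file-names
  '("34 アルバム-300dpi.jpg" "34 アルバム-300dpi.png"
    "054_交換機.jpg" "054_交換機.png"
    "91 部分カット.jpg" "91 部分カット.png"
    "0717-パソコン.jpg" "0717-パソコン.png"
    "1935 社屋.jpg" "1935 社屋.png"
    "FFFF_縁カット.jpg" "FFFF_縁カット.png"
    "hhhh.jpg" "hhhh.png")
  "File names used to reproduce the listing order.")

;; Hypothetical helper; not part of ls-lisp.
(defun demo-adjacent-inversions (pred sorted)
  "Return pairs (X Y) adjacent in SORTED for which PRED says Y < X."
  (let (bad)
    (while (cdr sorted)
      ;; In a correctly sorted list the later neighbour is never
      ;; "less than" the earlier one.
      (when (funcall pred (cadr sorted) (car sorted))
        (push (list (car sorted) (cadr sorted)) bad))
      (setq sorted (cdr sorted)))
    (nreverse bad)))

;; `sort' is destructive on lists, so work on a copy.
(let ((sorted (sort (copy-sequence demo-file-names)
                    #'ls-lisp-version-lessp)))
  (dolist (name sorted)
    (princ (format "%s\n" name)))
  (print (demo-adjacent-inversions #'ls-lisp-version-lessp sorted)))
```

Saved to a file, this can be run the same way as the test in the original report, with "emacs -Q --batch -l" followed by the file name.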

* bug#55787: 29.0.50; inconsistent sort order with ls-lisp-version-lessp
From: Eli Zaretskii @ 2022-06-05 7:01 UTC
To: TAKAHASHI Yoshio; +Cc: 55787

> From: TAKAHASHI Yoshio <yfb02119@nifty.com>
> Cc: 55787@debbugs.gnu.org
> Date: Sun, 05 Jun 2022 11:37:11 +0900
>
> > What do you see with 'ls' and what do you see with ls-lisp? Also, in
> > which locale are you trying this with 'ls'?
>
> I include my trial to hope it can be reproduced on your environment. In
> this scenario, I use a little more real filenames instead of just number.

Thanks, I found two issues with the current implementation of
ls-lisp-version-lessp, and I hope I fixed them now on the master
branch. Please see if you get a more reasonable behavior. (I'm not
sure you will see exactly the same order as in "ls -lv", though; not
sure why.)

* bug#55787: 29.0.50; inconsistent sort order with ls-lisp-version-lessp
From: TAKAHASHI Yoshio @ 2022-06-05 9:38 UTC
To: Eli Zaretskii; +Cc: 55787

Eli-san,

> Please see if you get a more reasonable behavior. (I'm not
> sure you will see exactly the same order as in "ls -lv", though; not
> sure why.)

As you mentioned in an earlier mail, the specification of strverscmp is
not documented clearly. I believe your fix generates reasonable listing
order. I appreciate your fix. Thank you!

--
tkh

* bug#55787: 29.0.50; inconsistent sort order with ls-lisp-version-lessp
From: Eli Zaretskii @ 2022-06-05 9:48 UTC
To: TAKAHASHI Yoshio; +Cc: 55787-done

> From: TAKAHASHI Yoshio <yfb02119@nifty.com>
> Cc: 55787@debbugs.gnu.org
> Date: Sun, 05 Jun 2022 18:38:10 +0900
>
> Eli-san,
>
> > Please see if you get a more reasonable behavior. (I'm not
> > sure you will see exactly the same order as in "ls -lv", though; not
> > sure why.)
>
> As you mentioned in an earlier mail, the specification of strverscmp is
> not documented clearly. I believe your fix generates reasonable listing
> order. I appreciate your fix. Thank you!

Thanks, I'm therefore closing this bug.