* bug#59275: Unexpected return value of `string-collate-lessp' on Mac
From: Ihor Radchenko @ 2022-11-15 4:08 UTC
To: 59275

Hi,

I am forwarding an issue originally reported on the Org mailing list.
https://orgmode.org/list/m2ilkwso8r.fsf@me.com

On Emacs 29 (adaa2fc90e) MacOS build:

(string-collate-lessp "a" "B" "C" t) ; => nil

On Linux:

(string-collate-lessp "a" "B" "C" t) ; => t

The return value on MacOS is unexpected.
See more information, including locale data, in the Org ML thread.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac
From: Robert Pluim @ 2022-11-15 9:51 UTC
To: Ihor Radchenko; +Cc: 59275

>>>>> On Tue, 15 Nov 2022 04:08:13 +0000, Ihor Radchenko <yantar92@posteo.net> said:

    Ihor> Hi,
    Ihor> I am forwarding an issue originally reported on Org mailing list.
    Ihor> https://orgmode.org/list/m2ilkwso8r.fsf@me.com

    Ihor> On Emacs 29 (adaa2fc90e) MacOS build:

    Ihor> (string-collate-lessp "a" "B" "C" t) ; => nil

    Ihor> On Linux:

    Ihor> (string-collate-lessp "a" "B" "C" t) ; => t

    Ihor> The return value on MacOS is unexpected.
    Ihor> See more information, including locale data, in the Org ML thread.

I think this is expected. See the long thread on emacs-devel back in
July, e.g.
https://lists.gnu.org/archive/html/emacs-devel/2022-07/msg00940.html

(it resulted in the addition of `string-equal-ignore-case')

Robert
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac
From: Ihor Radchenko @ 2022-11-16 3:47 UTC
To: Robert Pluim; +Cc: 59275

Robert Pluim <rpluim@gmail.com> writes:

> I think this is expected. See the long thread on emacs-devel back in
> July, eg
> https://lists.gnu.org/archive/html/emacs-devel/2022-07/msg00940.html
>
> (it resulted in the addition of `string-equal-ignore-case')

Ok. So, it looks like `compare-strings' is the way to go for
system-independent string comparison.
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac
From: Eli Zaretskii @ 2022-11-15 13:46 UTC
To: Ihor Radchenko; +Cc: 59275

> From: Ihor Radchenko <yantar92@posteo.net>
> Date: Tue, 15 Nov 2022 04:08:13 +0000
>
> I am forwarding an issue originally reported on Org mailing list.
> https://orgmode.org/list/m2ilkwso8r.fsf@me.com
>
> On Emacs 29 (adaa2fc90e) MacOS build:
>
> (string-collate-lessp "a" "B" "C" t) ; => nil
>
> On Linux:
>
> (string-collate-lessp "a" "B" "C" t) ; => t
>
> The return value on MacOS is unexpected.

string-collate-lessp is inherently platform- (and locale-) dependent.
Don't use it if you want consistent results across platforms and
locales.
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac
From: Ihor Radchenko @ 2022-11-15 15:05 UTC
To: Eli Zaretskii; +Cc: 59275

Eli Zaretskii <eliz@gnu.org> writes:

>> On Emacs 29 (adaa2fc90e) MacOS build:
>>
>> (string-collate-lessp "a" "B" "C" t) ; => nil
>>
>> On Linux:
>>
>> (string-collate-lessp "a" "B" "C" t) ; => t
>>
>> The return value on MacOS is unexpected.
>
> string-collate-lessp is inherently platform- (and locale-) dependent.
> Don't use it if you want consistent results across platforms and
> locales.

Is there a better alternative?

Also, do I miss something, or is this pitfall not documented in the
docstring of `string-collate-lessp'?
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac
From: Eli Zaretskii @ 2022-11-15 15:16 UTC
To: Ihor Radchenko; +Cc: 59275

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: 59275@debbugs.gnu.org
> Date: Tue, 15 Nov 2022 15:05:48 +0000
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > string-collate-lessp is inherently platform- (and locale-) dependent.
> > Don't use it if you want consistent results across platforms and
> > locales.
>
> Is there a better alternative?

Alternative to do what job?

> Also, do I miss something, or is this pitfall not documented in the
> docstring of `string-collate-lessp'?

It isn't? Then what is this about:

  This function obeys the conventions for collation order in your
  locale settings.  For example, punctuation and whitespace characters
  might be considered less significant for sorting:

  (sort '("11" "12" "1 1" "1 2" "1.1" "1.2") 'string-collate-lessp)
    => ("11" "1 1" "1.1" "12" "1 2" "1.2")
  [...]
  To emulate Unicode-compliant collation on MS-Windows systems,
  bind ‘w32-collate-ignore-punctuation’ to a non-nil value, since
  the codeset part of the locale cannot be "UTF-8" on MS-Windows.

The ELisp manual says in addition:

  This behavior is system-dependent; e.g., punctuation and whitespace
  are never ignored on Cygwin, regardless of locale.

If this doesn't have a big WARNING sign near it, then what would?
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac
From: Ihor Radchenko @ 2022-11-16 1:34 UTC
To: Eli Zaretskii; +Cc: 59275

Eli Zaretskii <eliz@gnu.org> writes:

>> > string-collate-lessp is inherently platform- (and locale-) dependent.
>> > Don't use it if you want consistent results across platforms and
>> > locales.
>>
>> Is there a better alternative?
>
> Alternative to do what job?

Reliable sorting.
In particular, I am looking for a better PREDICATE argument for
`sort-subr' for case-sensitive and case-insensitive sorting of strings.

>> Also, do I miss something, or is this pitfall not documented in the
>> docstring of `string-collate-lessp'?
>
> It isn't? then what is this about:
>
>   This function obeys the conventions for collation order in your
>   locale settings.  For example, punctuation and whitespace characters
>   might be considered less significant for sorting:
>
>   (sort '("11" "12" "1 1" "1 2" "1.1" "1.2") 'string-collate-lessp)
>     => ("11" "1 1" "1.1" "12" "1 2" "1.2")
>   [...]
>   To emulate Unicode-compliant collation on MS-Windows systems,
>   bind ‘w32-collate-ignore-punctuation’ to a non-nil value, since
>   the codeset part of the locale cannot be "UTF-8" on MS-Windows.

The above sounds like we just need to worry about some edge cases where
different approaches may exist to sorting. Like with punctuation,
numbers, and spaces.

Having

(string-collate-lessp "a" "B" "C" t) ; => nil

is totally unexpected because case-insensitive "a"<"B"<"C" sounds like
the only reasonable outcome.

I'd like the warning to be even more prominent.
Feel free to disagree.
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac
From: Eli Zaretskii @ 2022-11-16 13:00 UTC
To: Ihor Radchenko; +Cc: 59275

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: 59275@debbugs.gnu.org
> Date: Wed, 16 Nov 2022 01:34:09 +0000
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> >> > string-collate-lessp is inherently platform- (and locale-) dependent.
> >> > Don't use it if you want consistent results across platforms and
> >> > locales.
> >>
> >> Is there a better alternative?
> >
> > Alternative to do what job?
>
> Reliable sorting.
> In particular, I am looking for a better PREDICATE argument for
> `sort-subr' for case-sensitive and case-insensitive sorting of strings.

In the strict order of Unicode codepoints?  Use compare-strings.

> >> Also, do I miss something, or is this pitfall not documented in the
> >> docstring of `string-collate-lessp'?
> >
> > It isn't? then what is this about:
> >
> >   This function obeys the conventions for collation order in your
> >   locale settings.  For example, punctuation and whitespace characters
> >   might be considered less significant for sorting:
> >
> >   (sort '("11" "12" "1 1" "1 2" "1.1" "1.2") 'string-collate-lessp)
> >     => ("11" "1 1" "1.1" "12" "1 2" "1.2")
> >   [...]
> >   To emulate Unicode-compliant collation on MS-Windows systems,
> >   bind ‘w32-collate-ignore-punctuation’ to a non-nil value, since
> >   the codeset part of the locale cannot be "UTF-8" on MS-Windows.
>
> The above sounds like we just need to worry about some edge cases where
> different approaches may exist to sorting. Like with punctuation,
> numbers, and spaces.
>
> Having
>
> (string-collate-lessp "a" "B" "C" t) ; => nil
>
> is totally unexpected because case-insensitive "a"<"B"<"C" sounds like
> the only reasonable outcome.

It is hard to guess what will be unexpected for people.  When the doc
string was written, the example used there was deemed to be the most
striking surprise from using locale-dependent collation, so it was what
we used.

> I'd like the warning to be even more prominent.

You want to make it explicit that for systems where we use
string-lessp the IGNORE-CASE argument is ignored?  Or do you want some
other change?

Anyway, feel free to suggest some text to that effect.
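[Editorial sketch] The `compare-strings' suggestion above can be turned into a sort predicate roughly as follows. This is a minimal sketch, not code from the thread: the name `example-string-lessp' is hypothetical, and only `compare-strings' and `sort' are real Emacs functions. Note that with a non-nil IGNORE-CASE, `compare-strings' still folds case through the case table in effect, so the second result assumes the default case table.

```elisp
;; Hypothetical helper; only `compare-strings' is an actual Emacs function.
(defun example-string-lessp (s1 s2 &optional ignore-case)
  "Return non-nil if S1 sorts before S2 by character code.
With non-nil IGNORE-CASE, fold case before comparing."
  ;; `compare-strings' returns t when the strings are equal, a negative
  ;; number when S1 is less, and a positive number when S1 is greater.
  (let ((result (compare-strings s1 nil nil s2 nil nil ignore-case)))
    (and (integerp result) (< result 0))))

;; Case-sensitive: upper-case ASCII letters sort before lower-case ones.
(sort (list "b" "C" "a") #'example-string-lessp)
;; => ("C" "a" "b")

;; Case-insensitive, assuming the default case table is in effect.
(sort (list "b" "C" "a")
      (lambda (x y) (example-string-lessp x y t)))
;; => ("a" "b" "C")
```

Unlike `string-collate-lessp', this ordering does not consult the locale, which is the point of the suggestion; the price is that accented letters sort by code point rather than by language convention.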
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac
From: Ihor Radchenko @ 2022-11-21 7:28 UTC
To: Eli Zaretskii; +Cc: 59275

Eli Zaretskii <eliz@gnu.org> writes:

>> Reliable sorting.
>> In particular, I am looking for a better PREDICATE argument for
>> `sort-subr' for case-sensitive and case-insensitive sorting of strings.
>
> In the strict order of Unicode codepoints?  Use compare-strings.

Thanks for the clarification.
After further considerations, it looks like we should still use
`string-collate-lessp' on Org side as it yields expected results if libc
properly implements the collation.

>> I'd like the warning to be even more prominent.
>
> You want to make it explicit that for systems where we use
> string-lessp the IGNORE-CASE argument is ignored?  Or do you want some
> other change?

Yes, I think.

> Anyway, feel free to suggest some text to that effect.

Maybe change

  If your system does not support a locale environment, this function
  behaves like `string-lessp'.

to

  Some operating systems do not implement correct collation (in specific
  locale environments or at all).  Then, this functions falls back to
  case-sensitive `string-lessp' and IGNORE-CASE argument is ignored.
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac
From: Eli Zaretskii @ 2022-11-21 13:31 UTC
To: Ihor Radchenko; +Cc: 59275

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: 59275@debbugs.gnu.org
> Date: Mon, 21 Nov 2022 07:28:55 +0000
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> >> Reliable sorting.
> >> In particular, I am looking for a better PREDICATE argument for
> >> `sort-subr' for case-sensitive and case-insensitive sorting of strings.
> >
> > In the strict order of Unicode codepoints?  Use compare-strings.
>
> Thanks for the clarification.
> After further considerations, it looks like we should still use
> `string-collate-lessp' on Org side as it yields expected results if libc
> properly implements the collation.

Is the feature that uses it intended to be used only on glibc platforms
(which basically means GNU/Linux)?  If not, I'm surprised that you arrived
at this conclusion.  It is the 180 deg opposite of what I think you should
have decided.

Once again: locale-specific collation order is inherently unpredictable in
its results, and should only be used when the locale-specific order is a
_must_, like when sorting people's names for a telephone directory.

> Maybe change
>
>   If your system does not support a locale environment, this function
>   behaves like `string-lessp'.
>
> to
>
>   Some operating systems do not implement correct collation (in specific
>   locale environments or at all).  Then, this functions falls back to
>   case-sensitive `string-lessp' and IGNORE-CASE argument is ignored.

Fine with me.
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac
From: Ihor Radchenko @ 2022-11-22 1:24 UTC
To: Eli Zaretskii; +Cc: 59275

Eli Zaretskii <eliz@gnu.org> writes:

>> > In the strict order of Unicode codepoints?  Use compare-strings.
>>
>> Thanks for the clarification.
>> After further considerations, it looks like we should still use
>> `string-collate-lessp' on Org side as it yields expected results if libc
>> properly implements the collation.
>
> Is the feature that uses it intended to be used only on glibc platforms
> (which basically means GNU/Linux)?  If not, I'm surprised that you arrived
> at this conclusion.  It is the 180 deg opposite of what I think you should
> have decided.
>
> Once again: locale-specific collation order is inherently unpredictable in
> its results, and should only be used when the locale-specific order is a
> _must_, like when sorting people's names for a telephone directory.

We use string collation for

1. Sorting bibliographies
2. Sorting lists
3. Sorting table lines
4. Sorting tags
5. Sorting headings
6. Sorting entries in agendas
7. As a criterion for agenda/tag filtering when comparison operator is
   used on string property values (11.3.3 Matching tags and properties)

1-6 should follow the locale. I think we had a bug report in the past
where a user was confused by list sorting that did not follow the
conventions of the user's language.

7 is more debatable.

>> Maybe change
>>
>>   If your system does not support a locale environment, this function
>>   behaves like `string-lessp'.
>>
>> to
>>
>>   Some operating systems do not implement correct collation (in specific
>>   locale environments or at all).  Then, this functions falls back to
>>   case-sensitive `string-lessp' and IGNORE-CASE argument is ignored.
>
> Fine with me.

See the attached patch.

[-- Attachment #2: 0001-src-fns.c-Fstring_collate_lessp-Clarify-docstring.patch --]

From d9a67e94547ffeb6d8ac8a1202434fff1117af3f Mon Sep 17 00:00:00 2001
Message-Id: <d9a67e94547ffeb6d8ac8a1202434fff1117af3f.1669080246.git.yantar92@posteo.net>
From: Ihor Radchenko <yantar92@posteo.net>
Date: Tue, 22 Nov 2022 09:21:17 +0800
Subject: [PATCH] * src/fns.c (Fstring_collate_lessp): Clarify docstring

Clarify that IGNORE-CASE argument might be ignored when the operation
system does not implement string collation for the specified locale.
See bug#59275.
---
 src/fns.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/fns.c b/src/fns.c
index 035fa12935..e337c0958d 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -596,8 +596,9 @@ DEFUN ("string-collate-lessp", Fstring_collate_lessp, Sstring_collate_lessp, 2,
 bind `w32-collate-ignore-punctuation' to a non-nil value, since
 the codeset part of the locale cannot be \"UTF-8\" on MS-Windows.
 
-If your system does not support a locale environment, this function
-behaves like `string-lessp'.  */)
+Some operating systems do not implement correct collation (in specific
+locale environments or at all).  Then, this functions falls back to
+case-sensitive `string-lessp' and IGNORE-CASE argument is ignored.  */)
   (Lisp_Object s1, Lisp_Object s2, Lisp_Object locale, Lisp_Object ignore_case)
 {
 #if defined __STDC_ISO_10646__ || defined WINDOWSNT
-- 
2.35.1
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac
From: Eli Zaretskii @ 2022-11-22 12:56 UTC
To: Ihor Radchenko; +Cc: 59275-done

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: 59275@debbugs.gnu.org
> Date: Tue, 22 Nov 2022 01:24:43 +0000
>
> > Once again: locale-specific collation order is inherently unpredictable in
> > its results, and should only be used when the locale-specific order is a
> > _must_, like when sorting people's names for a telephone directory.
>
> We use string collation for
>
> 1. Sorting bibliographies
> 2. Sorting lists
> 3. Sorting table lines
> 4. Sorting tags
> 5. Sorting headings
> 6. Sorting entries in agendas
> 7. As a criterion for agenda/tag filtering when comparison operator is
>    used on string property values (11.3.3 Matching tags and properties)
>
> 1-6 should follow the locale.

I think only 1 and 6 are firmly in that category.  For the others it depends
on whether the results of the sorting are immediately displayed, or used for
further processing.  In the former case, using string-collate-lessp is
semi-okay ("semi" because producing different results in different locales
can still confuse users); in the latter case it is wrong, IMO, because you
will cause unexpected results.

> See the attached patch.

Thanks, installed.
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac
From: Ihor Radchenko @ 2022-11-23 10:39 UTC
To: Eli Zaretskii; +Cc: 59275-done

Eli Zaretskii <eliz@gnu.org> writes:

>> See the attached patch.
>
> Thanks, installed.

Should we update the manual as well?
4.5 Comparison of Characters and Strings section contains the old
docstring verbatim.

P.S. I am wondering if there is some automated way to deal with verbatim
docstrings in the manuals. They are so easy to slip through when the
Elisp docstrings get updated.
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac
From: Eli Zaretskii @ 2022-11-23 14:58 UTC
To: Ihor Radchenko; +Cc: 59275

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: 59275-done@debbugs.gnu.org
> Date: Wed, 23 Nov 2022 10:39:22 +0000
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> >> See the attached patch.
> >
> > Thanks, installed.
>
> Should we update the manual as well?
> 4.5 Comparison of Characters and Strings section contains the old
> docstring verbatim.

I see in the manual text that is not a verbatim copy of the doc string, but
an expanded version of it with more detailed explanations.  Which is how it
should be: it is IMNSHO bad documentation-fu to have the manual just copycat
the doc strings.  (We sometimes do it for lack of time, but it is not a Good
Thing.)

The note about case-sensitivity of the fallback was missing from the manual,
so I added it.

> P.S. I am wondering if there is some automated way to deal with verbatim
> docstrings in the manuals. They are so easy to slip through when the
> Elisp docstrings get updated.

There should be no verbatim copies of doc strings in the manual.  So I'm not
interested in making that bad practice easier ;-)

Thanks.
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac
From: Ihor Radchenko @ 2022-11-24 2:22 UTC
To: Eli Zaretskii; +Cc: 59275

Eli Zaretskii <eliz@gnu.org> writes:

>> Should we update the manual as well?
>> 4.5 Comparison of Characters and Strings section contains the old
>> docstring verbatim.
>
> I see in the manual text that is not a verbatim copy of the doc string, but
> an expanded version of it with more detailed explanations.  Which is how it
> should be: it is IMNSHO bad documentation-fu to have the manual just copycat
> the doc strings.  (We sometimes do it for lack of time, but it is not a Good
> Thing.)

Fair point.

> The note about case-sensitivity of the fallback was missing from the manual,
> so I added it.

Thanks!

>> P.S. I am wondering if there is some automated way to deal with verbatim
>> docstrings in the manuals. They are so easy to slip through when the
>> Elisp docstrings get updated.
>
> There should be no verbatim copies of doc strings in the manual.  So I'm not
> interested in making that bad practice easier ;-)

What about forgetting to update the manual when important changes are
made to the docstring? I know for certain that it happened many times
with Org manual. Maybe something can be done to auto-check if updates
were done to the docstring but not the manual?
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac
From: Eli Zaretskii @ 2022-11-24 7:23 UTC
To: Ihor Radchenko; +Cc: 59275

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: 59275@debbugs.gnu.org
> Date: Thu, 24 Nov 2022 02:22:41 +0000
>
> > There should be no verbatim copies of doc strings in the manual.  So I'm not
> > interested in making that bad practice easier ;-)
>
> What about forgetting to update the manual when important changes are
> made to the docstring? I know for certain that it happened many times
> with Org manual. Maybe something can be done to auto-check if updates
> were done to the docstring but not the manual?

That could be a useful feature, suitable for checkdoc.el, perhaps.  But
there are 2 issues here that I'm not sure how such a feature would handle:

  . not every symbol that has a doc string is mentioned in the manuals
  . the doc string and the text in the manual are generally different, and
    so it could be that the update to a doc string doesn't require any
    update to the manual text

So a naïve implementation would probably have too many false positives.  Not
sure if this could render the feature useless.

Bottom line: I'm not sure we can have a good automated way of detecting
updates that were missed, except at patch review time, and that is a
judgment call by the person who does the review, and relies on his/her
vigilance.  But if someone could come up with a good way of doing that, it
will be appreciated.
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac
From: Ihor Radchenko @ 2022-11-26 2:03 UTC
To: Eli Zaretskii; +Cc: 59275-done

Eli Zaretskii <eliz@gnu.org> writes:

>> We use string collation for
>>
>> 1. Sorting bibliographies
>> 2. Sorting lists
>> 3. Sorting table lines
>> 4. Sorting tags
>> 5. Sorting headings
>> 6. Sorting entries in agendas
>> 7. As a criterion for agenda/tag filtering when comparison operator is
>>    used on string property values (11.3.3 Matching tags and properties)
>>
>> 1-6 should follow the locale.
>
> I think only 1 and 6 are firmly in that category.  For the others it depends
> on whether the results of the sorting are immediately displayed, or used for
> further processing.  In the former case, using string-collate-lessp is
> semi-okay ("semi" because producing different results in different locales
> can still confuse users); in the latter case it is wrong, IMO, because you
> will cause unexpected results.

1-6 are for interactive use.

As Maxim pointed out in
https://orgmode.org/list/tlle59$pl3$1@ciao.gmane.io,
`string-collate-lessp' generally yields better results for human
consumption:

"
(setq lst '("semana" "señor" "sepia"))
(sort lst #'string-lessp) ; => ("semana" "sepia" "señor")
(sort lst #'string-collate-lessp) ; => ("semana" "señor" "sepia")
"

In the same thread, we also discussed what Org can do about MacOS and
other systems that do not implement string collation.

We concluded that a better fallback when collation is not available
would be using downcase+string-lessp when `string-collate-lessp' is
called with non-nil IGNORE-CASE argument.

Would it be acceptable for Emacs to change the fallback behavior of
`string-collate-lessp' to:

1. If string collation is not available and IGNORE-CASE is nil, fallback
   to `string-lessp';
2. If string collation is not available and IGNORE-CASE is non-nil,
   use `downcase' + `string-lessp'.

This will not compromise consistency and will yield slightly better
fallback results.

I also do not think that it will be backwards-incompatible. If the call
to `string-collate-lessp' explicitly requests ignoring case, `downcase'
is more expected than bare `string-lessp' that _does not_ ignore case.

WDYT?
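[Editorial sketch] The two-branch fallback proposed in this message could look roughly like the following in Lisp. This is only an illustration under stated assumptions: `example-collate-lessp-fallback' is a hypothetical name, it models only the branch taken when collation is unavailable, and it is not how `string-collate-lessp' is implemented in src/fns.c.

```elisp
;; Hypothetical wrapper illustrating the proposal; not Emacs's own code.
(defun example-collate-lessp-fallback (s1 s2 &optional ignore-case)
  "Compare S1 and S2 as the proposed no-collation fallback would.
With nil IGNORE-CASE use plain `string-lessp'; otherwise downcase
both strings first."
  (if ignore-case
      ;; `downcase' consults the case table of the current buffer.
      (string-lessp (downcase s1) (downcase s2))
    (string-lessp s1 s2)))

;; Case-sensitive: ?B is 66 and ?a is 97, so "a" is not less than "B".
(example-collate-lessp-fallback "a" "B")     ; => nil
;; Case-insensitive: "a" is compared with "b".
(example-collate-lessp-fallback "a" "B" t)   ; => t
```

The second result is the one the original report expected from `(string-collate-lessp "a" "B" "C" t)' on MacOS.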
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac
From: Eli Zaretskii @ 2022-11-26 8:06 UTC
To: Ihor Radchenko; +Cc: 59275

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: 59275-done@debbugs.gnu.org
> Date: Sat, 26 Nov 2022 02:03:43 +0000
>
> We concluded that a better fallback when collation is not available
> would be using downcase+string-lessp when `string-collate-lessp' is
> called with non-nil IGNORE-CASE argument.

This has caveats, see below.  I won't argue about your Org-local decision,
since I don't know enough about the intended uses of what you did, but I do
have something to say about this decision in general.  I suggest at least a
FIXME comment where you do this stuff, based on what I tell below.

> Would it be acceptable for Emacs to change the fallback behavior of
> `string-collate-lessp' to:
>
> 1. If string collation is not available and IGNORE-CASE is nil, fallback
>    to `string-lessp';
> 2. If string collation is not available and IGNORE-CASE is non-nil,
>    use `downcase' + `string-lessp'.

'downcase' uses the buffer-local case table if such is defined for the
buffer that happens to be the current when you invoke 'downcase', and that's
another cause of inconsistency and user surprises, especially when the
strings you compare don't really "belong" to the current buffer.  Also, in
some (rarely-used) locales, downcasing has unexpected results, even with the
default case-table.  For example, downcasing "I" produces "ı", not "i" as
expected.  Did you think about these cases when making the above decision?

> I also do not think that it will be backwards-incompatible. If the call
> to `string-collate-lessp' explicitly requests ignoring case, `downcase'
> is more expected than bare `string-lessp' that _does not_ ignore case.
>
> WDYT?

See above.  What you suggest is perhaps fine for plain-ASCII text, but not
in general, IMNSHO.

The reason for what Emacs currently does on systems that lack collation
functions is that for such systems collation rules are indeterminate, and so
inventing them by following naïve rules of plain ASCII, in particular the
case-conversion rules, is potentially very wrong.  These are general-purpose
APIs, not something concrete in specific Org contexts, and as such, these
APIs cannot "mostly work", they should work always and for every possible
use case.

And we are talking about a single system where these problems happen, which
is macOS, right?  Wouldn't it be better for "Someone" who uses macOS to just
bite the bullet and write a proper collation function, or find a free
software implementation of one, and include it in Emacs?  This is what I did
for MS-Windows at the time string-collate-lessp was added to Emacs.  Why
cannot macOS users do the same?
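[Editorial sketch] One way to keep `downcase' from depending on whichever buffer happens to be current is to pin the case table explicitly. The sketch below assumes the `with-case-table' macro and the `ascii-case-table' variable described in the Case Tables section of the ELisp manual; `example-ascii-downcase' is a hypothetical name. It only covers ASCII letters, which matches the remark above that such rules are fine for plain-ASCII text but not in general.

```elisp
;; Hypothetical helper; `with-case-table' and `ascii-case-table' are
;; assumed to behave as documented in the ELisp manual.
(defun example-ascii-downcase (string)
  "Downcase STRING using `ascii-case-table'.
The case table of the current buffer is ignored, so the result does
not depend on buffer-local or language-environment settings."
  (with-case-table ascii-case-table
    (downcase string)))

;; ASCII letters are folded the same way in every buffer.
(example-ascii-downcase "Org-Mode")   ; => "org-mode"
```

This trades correctness for non-ASCII text against predictability, so it does not answer the objection that a general-purpose API must work for every use case.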
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac 2022-11-26 8:06 ` Eli Zaretskii @ 2022-11-26 8:47 ` Ihor Radchenko 2022-11-26 9:22 ` Eli Zaretskii 0 siblings, 1 reply; 24+ messages in thread From: Ihor Radchenko @ 2022-11-26 8:47 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 59275 Eli Zaretskii <eliz@gnu.org> writes: >> We concluded that a better fallback when collation is not available >> would be using downcase+string-lessp when `string-collate-lessp' is >> called with non-nil IGNORE-CASE argument. > > This has caveats, see below. I won't argue about your Org-local decision, > since I don't know enough about the intended uses of what you did, but I do > have something to say about this decision in general. I suggest at least a > FIXME comment where you do this stuff, based on what I tell below. Thanks for the information! >> Would it be acceptable for Emacs to change the fallback behavior of >> `string-collate-lessp' to: >> >> 1. If string collation is not available and IGNORE-CASE is nil, fallback >> to`string-lessp'; >> 2. If string collation is not available and IGNORE-CASE is non-nil, >> use `downcase' + `string-lessp'. > > 'downcase' uses the buffer-local case table if such is defined for the > buffer that happens to be the current when you invoke 'downcase', and that's > another cause of inconsistency and user surprises, especially when the > strings you compare don't really "belong" to the current buffer. Interesting. Is there any reason why this is not mentioned in the docstring for `downcase'? I now see 4.10 The Case Table section of the manual, and it looks like case tables should be set mostly automatically (by Emacs?) according to the language environment. Are details about this process documented anywhere? Are these case conversion tables independent of glibc? > Also, in > some (rarely-used) locales, downcasing has unexpected results, even with the > default case-table. 
For example, downcasing "I" produces "ı", not "i" as > expected. Did you think about these cases when making the above decision? I did not. However, I recall reading somewhere that it is possible to work around this kind of issue by calling case conversion several times: upcase -> downcase -> upcase -> downcase. And now, after you reminded me about this caveat, I also recall https://nullprogram.com/blog/2014/06/13/ that mentioned something similar about caveats with composition. Just mentioning it for your reference. (I am not sure if the caveats discussed have been raised on Emacs devel). >> I also do not think that it will be backwards-incompatible. If the call >> to `string-collate-lessp' explicitly requests ignoring case, `downcase' >> is more expected than bare `string-lessp' that _does not_ ignore case. >> >> WDYT? > > See above. What you suggest is perhaps fine for plain-ASCII text, but not > in general, IMNSHO. > > The reason for what Emacs currently does on systems that lack collation > functions is that for such systems collation rules are indeterminate, and so > inventing them by following naïve rules of plain ASCII, in particular the > case-conversion rules, is potentially very wrong. These are general-purpose > APIs, not something concrete in specific Org contexts, and as such, these > APIs cannot "mostly work", they should work always and for every possible > use case. I feel that I am missing something. Doesn't Emacs provide Unicode case conversion tables? Why plain ASCII rules? > And we are talking about a single system where these problems happen, which > is macOS, right? Wouldn't it be better for "Someone" who uses macOS to just > bite the bullet and write a proper collation function, or find a free > software implementation of one, and include it in Emacs? This is what I did > for MS-Windows at the time string-collate-lessp was added to Emacs. Why > cannot macOS users do the same? It would be. But how can we ask for this? etc/TODO?
Or maybe re-open this bug report? -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 24+ messages in thread
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac 2022-11-26 8:47 ` Ihor Radchenko @ 2022-11-26 9:22 ` Eli Zaretskii 2022-11-27 14:00 ` Maxim Nikulin 0 siblings, 1 reply; 24+ messages in thread From: Eli Zaretskii @ 2022-11-26 9:22 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 59275 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: 59275@debbugs.gnu.org > Date: Sat, 26 Nov 2022 08:47:13 +0000 > > > 'downcase' uses the buffer-local case table if such is defined for the > > buffer that happens to be the current when you invoke 'downcase', and that's > > another cause of inconsistency and user surprises, especially when the > > strings you compare don't really "belong" to the current buffer. > > Interesting. Is there any reason why this is not mentioned in the > docstring for `downcase'? Yes: because we are ashamed of that and hope to change it at some point, if we ever figure out how to do that. The way to avoid this caveat is simple: let-bind case-table when you call 'downcase'. > I now see 4.10 The Case Table section of the manual, and it looks like > case tables should be set mostly automatically (by Emacs?) according to > the language environment. Yes. But a buffer can have its local case-table. > Are details about this process documented anywhere? No. But see characters.el and the function I mention below. > Are these case conversion tables independent of glibc? Yes. We build them completely separately and from scratch, as you will see in characters.el. > https://nullprogram.com/blog/2014/06/13/ that mentioned something > similar about caveats with composition. I don't see there anything about sorting or collation. What did I miss? > Just mentioning it for your reference. (I am not sure if the caveats > discussed have been raised on Emacs devel). What did you think ought to be discussed? Btw, that blog fails to distinguish between display-time features and processing of text without displaying it. 
On display, Emacs combines characters that are combining, so equivalent character sequences should look the same. But Emacs doesn't by default consider equivalent character sequences as equal in all situations, leaving this to the Lisp program. Considering them always as equal looks sexy in a blog post, because it raises some brows and has the "whoah!" effect, but isn't a good policy in general, since some applications definitely need to know about the original decomposed sequence. We cannot conceal this from Lisp programs by hiding the original sequence on some low level that is not exposed to Lisp. Yes, this makes Lisp programs more complicated, but that comes with the territory: you cannot have power without complexity. > I feel that I miss something. Don't Emacs provide unicode case > conversion tables? The case tables we provide are based on Unicode, but are tweaked by the language-environment. See, for example, turkish-case-conversion-enable, which is run when the Turkish language-environment is turned on. > Why plain ASCII rules? Your logic is. What you suggest breaks down if you consider various complications in some locales. > > And we are talking about a single system where these problems happen, which > > is macOS, right? Wouldn't it be better for "Someone" who uses macOS to just > > bite the bullet and write a proper collation function, or find a free > > software implementation of one, and include it in Emacs? This is what I did > > for MS-Windows at the time string-collate-lessp was added to Emacs. Why > > cannot macOS users do the same? > > It would be. But how can we ask for this? etc/TODO? Or maybe re-open > this bug report? Anything will be fine with me, but unless the people who are asking you to do these workarounds are motivated enough to sit down and do the job, we will never get there. And guess what effect these workarounds have on their motivation. ^ permalink raw reply [flat|nested] 24+ messages in thread
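For reference, the advice above about not depending on the current buffer's case table can be sketched with the existing `with-case-table' macro and `standard-case-table' function. The helper name `my/downcase-standard' is hypothetical and is not part of Emacs or Org; this is a minimal sketch, not the approach Emacs itself takes.

```elisp
;; Hypothetical helper (an assumption for illustration, not Emacs or Org code):
;; downcase STRING using the standard case table instead of whatever
;; buffer-local case table the current buffer happens to have.
(defun my/downcase-standard (string)
  "Return STRING downcased according to the standard case table."
  (with-case-table (standard-case-table)
    (downcase string)))

;; Caveat raised in the thread: the language environment (for example
;; Turkish) may tweak the case tables Emacs provides, so this only removes
;; the buffer-local source of inconsistency, not the locale-specific one.
(my/downcase-standard "Org")
```

The macro restores the previous case table when the body exits, so the current buffer is left unchanged.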
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac 2022-11-26 9:22 ` Eli Zaretskii @ 2022-11-27 14:00 ` Maxim Nikulin 2022-11-27 14:23 ` Eli Zaretskii 0 siblings, 1 reply; 24+ messages in thread From: Maxim Nikulin @ 2022-11-27 14:00 UTC (permalink / raw) To: Eli Zaretskii, Ihor Radchenko; +Cc: 59275 On 26/11/2022 16:22, Eli Zaretskii wrote: >> From: Ihor Radchenko Date: Sat, 26 Nov 2022 08:47:13 +0000 >> >>> 'downcase' uses the buffer-local case table if such is defined for the >>> buffer that happens to be the current when you invoke 'downcase', and that's >>> another cause of inconsistency and user surprises, especially when the >>> strings you compare don't really "belong" to the current buffer. `downcase' is already used in Org for case-insensitive sorting. I am unsure if it appeared earlier than `string-collate-lessp' was introduced. Buffer-local conversion table is not a problem when table rows, list items (text formatting object, not elisp structure), or tags local to the current file are sorted. However when agenda is built from several files current buffer should not affect entries order. Concerning Org, my point is that caseless sorting should be uniform. Currently different functions use distinct approaches and it is more severe inconsistency. >> https://nullprogram.com/blog/2014/06/13/ that mentioned something >> similar about caveats with composition. > > I don't see there anything about sorting or collation. What did I miss? Does not composed/decomposed representation affect comparison result? Emacs-devel thread mentioned earlier in this bug contains a link describing enough issues with string comparison: https://stackoverflow.com/questions/319426/how-do-i-do-a-case-insensitive-string-comparison >>> And we are talking about a single system where these problems happen, which >>> is macOS, right? 
Wouldn't it be better for "Someone" who uses macOS to just >>> bite the bullet and write a proper collation function, or find a free >>> software implementation of one, and include it in Emacs? My impression was that clang should eventually get better locale support. If so, I have doubts about a macOS-specific implementation. I do not have a macOS machine, so I may be wrong in my assumption about the locale implementation there. However, Emacs may benefit from its own implementation of collation (based on the built-in Unicode character database) used on (almost) all OSes. It would allow using several locales in parallel without switching the libc locale, which is not thread-safe. I consider `downcase' as a kind of workaround (a poor man's ignore-case) that allows graceful degradation in comparison to `string-lessp'. From my point of view e.g. case transformation rule for Turkish I is a minor issue in comparison to complete disregarding of IGNORE-CASE argument at least when results are presented to users. My argument against `downcase' in `string-collate-lessp' is that it may add noticeable performance penalty. Interestingly `compare-strings' uses upcase conversion when the IGNORE-CASE argument is true. I believed that some implementations (unrelated to Emacs) may have problems with e.g. ß and considered downcase as a safer option. ^ permalink raw reply [flat|nested] 24+ messages in thread
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac 2022-11-27 14:00 ` Maxim Nikulin @ 2022-11-27 14:23 ` Eli Zaretskii 2022-11-27 15:19 ` Maxim Nikulin 0 siblings, 1 reply; 24+ messages in thread From: Eli Zaretskii @ 2022-11-27 14:23 UTC (permalink / raw) To: Maxim Nikulin; +Cc: yantar92, 59275 > From: Maxim Nikulin <m.a.nikulin@gmail.com> > Date: Sun, 27 Nov 2022 21:00:50 +0700 > Cc: 59275@debbugs.gnu.org > > Concerning Org, my point is that caseless sorting should be uniform. You need to work hard to get that. Just using 'downcase' is not enough, and neither is using 'string-collate-equalp'. > >> https://nullprogram.com/blog/2014/06/13/ that mentioned something > >> similar about caveats with composition. > > > > I don't see there anything about sorting or collation. What did I miss? > > Does not composed/decomposed representation affect comparison result? They are different texts, so yes, they do, and they should. If you want to treat such strings as equivalent, you need to work even harder, since Emacs currently doesn't have enough infrastructure to do it right in all cases. > > Emacs-devel thread mentioned earlier in this bug contains a link > describing enough issues with string comparison: > > https://stackoverflow.com/questions/319426/how-do-i-do-a-case-insensitive-string-comparison This is about Python, no? > From my point of view e.g. case transformation rule for Turkish I is a > minor issue Why, Org doesn't want to support Turkish users? > My argument against `downcase' in `string-collate-lessp' is that it may > add noticeable performance penalty. I'd worry about correctness before performance. > Interestingly `compare-strings' uses upcase conversion when the > IGNORE-CASE argument is true. I believed that some implementations > (unrelated to Emacs) may have problems with e.g. ß and considered > downcase as a safer option. Case conversions always have problems. ^ permalink raw reply [flat|nested] 24+ messages in thread
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac 2022-11-27 14:23 ` Eli Zaretskii @ 2022-11-27 15:19 ` Maxim Nikulin 2022-11-27 15:42 ` Eli Zaretskii 0 siblings, 1 reply; 24+ messages in thread From: Maxim Nikulin @ 2022-11-27 15:19 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Ihor Radchenko, 59275 On 27/11/2022 21:23, Eli Zaretskii wrote: >> From: Maxim Nikulin Date: Sun, 27 Nov 2022 21:00:50 +0700 >> >> Concerning Org, my point is that caseless sorting should be uniform. > > You need to work hard to get that. Just using 'downcase' is not enough, and > neither is using 'string-collate-equalp'. I do not like that in some functions `string-collate-lessp' with IGNORE-CASE argument is used while strings are passed through `downcase' in other places. When proper locales implementation is available, I believe, it is better to consistently use IGNORE-CASE. I assume that text is presented to users, not serialized to be saved or sent as data. When `string-collate-lessp' disregards IGNORE-CASE, I consider it acceptable to use `downcase' (`upcase' may be worse since Org currently uses `downcase'). It provides reasonable balance of invested efforts and obtained result. >> Does not composed/decomposed representation affect comparison result? > > They are different texts, so yes, they do, and they should. > If you want to treat such strings as equivalent, you need to work even > harder, since Emacs currently doesn't have enough infrastructure to do it > right in all cases. `("semana" "señor" ,(ucs-normalize-NFD-string "señor") "sepia") (sort lst #'string-lessp) => ("semana" "señor" "sepia" "señor") (sort lst #'string-collate-lessp) => ("semana" "señor" "señor" "sepia") `string-collate-lessp' is able to handle at least some cases, it is another argument to use it. >> https://stackoverflow.com/questions/319426/how-do-i-do-a-case-insensitive-string-comparison > > This is about Python, no? 
The value of this link is a collection of examples that are not obvious for everybody. They are applicable to the behavior of `string-lessp' vs. `string-collate-lessp' as well. >> From my point of view e.g. case transformation rule for Turkish I is a >> minor issue > > Why, Org doesn't want to support Turkish users? From my point of view it is a minor issue in comparison to (string-collate-lessp "a" "B" "C" t) ; => nil that breaks comparison not only for accented letters. You almost managed to convince Ihor to use `string-lessp' instead of `string-collate-lessp'. I do not think it would improve the quality of support for the Turkish language. My suggestion is to fall back to `downcase' and `string-lessp' only if `string-collate-lessp' is unable to provide case-insensitive comparison. >> My argument against `downcase' in `string-collate-lessp' is that it may >> add noticeable performance penalty. > > I'd worry about correctness before performance. `downcase' with `string-lessp' handles more cases than just `string-lessp' (leaving aside buffer-local conversion tables), so from my point of view the former is more correct. Even `downcase' with fixed "C" locale may give a result more consistent with user expectations. My impression is that users may be familiar with widespread problems with sorting. ^ permalink raw reply [flat|nested] 24+ messages in thread
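The composed/decomposed example from the message above can be written out as a self-contained snippet. The names `my/senor-nfc', `my/senor-nfd' and `my/demo-list' are hypothetical and exist only for this illustration; the composed string is built from code points so that the file encoding cannot silently change which form is used.

```elisp
(require 'ucs-normalize)

;; Hypothetical names, used only for this illustration.
;; Composed form: "ñ" is the single character U+00F1.
(defconst my/senor-nfc (string ?s ?e #xF1 ?o ?r))
;; Decomposed form: "n" followed by COMBINING TILDE (U+0303).
(defconst my/senor-nfd (ucs-normalize-NFD-string my/senor-nfc))
(defconst my/demo-list (list "semana" my/senor-nfc my/senor-nfd "sepia"))

;; `string-lessp' compares character codes, so the decomposed form
;; (starting "sen") sorts before "sepia", while the composed form
;; (U+00F1 is greater than ?p) sorts after it.
(sort (copy-sequence my/demo-list) #'string-lessp)
;; => ("semana" NFD-señor "sepia" NFC-señor)

;; The result here depends on the OS, libc and locale, so no expected
;; value is given.
(sort (copy-sequence my/demo-list) #'string-collate-lessp)
```

Only the `string-lessp' ordering is deterministic; the collation result should be checked on each system of interest.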
* bug#59275: Unexpected return value of `string-collate-lessp' on Mac 2022-11-27 15:19 ` Maxim Nikulin @ 2022-11-27 15:42 ` Eli Zaretskii 0 siblings, 0 replies; 24+ messages in thread From: Eli Zaretskii @ 2022-11-27 15:42 UTC (permalink / raw) To: Maxim Nikulin; +Cc: yantar92, 59275 > From: Maxim Nikulin <m.a.nikulin@gmail.com> > Date: Sun, 27 Nov 2022 22:19:24 +0700 > Cc: Ihor Radchenko <yantar92@posteo.net>, 59275@debbugs.gnu.org > > I do not like that in some functions `string-collate-lessp' with > IGNORE-CASE argument is used while strings are passed through `downcase' > in other places. When proper locales implementation is available, I > believe, it is better to consistently use IGNORE-CASE. I already explained up-thread why we ignore IGNORE-CASE when collation order is not known. I stand by that reasoning. I believe your opinion is based on considering only simple locales, and on the a-priori knowledge what is the locale's collation to begin with, something that Emacs cannot know in that case. > When `string-collate-lessp' disregards IGNORE-CASE, I consider it > acceptable to use `downcase' (`upcase' may be worse since Org currently > uses `downcase'). It provides reasonable balance of invested efforts and > obtained result. We disagree, sorry. > `("semana" "señor" ,(ucs-normalize-NFD-string "señor") "sepia") > (sort lst #'string-lessp) > => ("semana" "señor" "sepia" "señor") > (sort lst #'string-collate-lessp) > => ("semana" "señor" "señor" "sepia") > > `string-collate-lessp' is able to handle at least some cases On what OS and with which libc? And I don't think this is evidence of collation knowing about equivalent sequences. It is most probable the side effect of collation ignoring Latin accents altogether. > >> https://stackoverflow.com/questions/319426/how-do-i-do-a-case-insensitive-string-comparison > > > > This is about Python, no? > > The value of this link is a collection of examples that are not obvious > for everybody. 
They are applicable to behavior `string-lessp' vs. > `string-collate-lessp' as well. Which parts are applicable, in your opinion, and in what way? > >> From my point of view e.g. case transformation rule for Turkish I is a > >> minor issue > > > > Why, Org doesn't want to support Turkish users? > > From my point of view it is a minor issue in comparison to > > (string-collate-lessp "a" "B" "C" t) ; => nil > > that breaks comparison not only for accented letters. Org is free to make such misguided decisions, but Emacs won't. We cannot decide that some locale is "minor" and others are "major". My suggestion is to look for a solution that works in any locale. > You almost manged to convince Ihor to use `string-lessp' instead of > `string-collate-lessp'. I do not think it would improve quality of > support of Turkish language. I didn't try to convince Ihor of anything, just point out the pitfalls of using locale-specific collation order in portable programs. I said back then that I don't know enough to evaluate your decisions. Once you understand the subtle issues with these APIs, it is your call to decide how to solve your particular problems. > My suggestion is to fall back to `downcase' and `string-lessp' only if > `string-collate-lessp' is unable to provide case insensitive comparison. You can do that in Org if that's the decision of the Org developers. Emacs cannot do that automatically for the reasons I explained up-thread. > >> My argument against `downcase' in `string-collate-lessp' is that it may > >> add noticeable performance penalty. > > > > I'd worry about correctness before performance. > > `downcase' with `string-lessp' handles more cases than just > `string-lessp' (leaving aside buffer-local conversion tables), so form > my point of view the former is more correct. I'm quite sure this is only true for the cases that you considered, not in general. > Even `downcase' with fixed "C" locale may give result more consistent with > user expectations. 
How does it help on systems where locale-specific collation is not accessible to Emacs? > My impression that users may be familiar with wide spread problems with > sorting. Not IME. But that's a separate issue, and I don't pretend to know Org users better than you do, so I will defer to you on this one. ^ permalink raw reply [flat|nested] 24+ messages in thread
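The Org-side fallback debated in this thread (use collation when it honors IGNORE-CASE, otherwise `downcase' plus `string-lessp') could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the probe is the expression from the original report, and, as explained above, it is a decision for Org rather than something Emacs does automatically.

```elisp
;; Hypothetical names; a sketch of the fallback discussed in the thread,
;; not the actual Org or Emacs implementation.
(defun my/collate-ignores-case-p ()
  "Return non-nil if `string-collate-lessp' honors IGNORE-CASE on this system.
The probe is the expression from the original bug report."
  (string-collate-lessp "a" "B" "C" t))

(defun my/string-lessp-ignore-case (s1 s2 &optional locale)
  "Return non-nil if S1 sorts before S2, ignoring case.
Use collation in LOCALE when it supports ignoring case; otherwise fall back
to `downcase' and `string-lessp'."
  (if (my/collate-ignores-case-p)
      (string-collate-lessp s1 s2 locale t)
    ;; FIXME: this fallback follows simple case-conversion rules and can be
    ;; wrong in some locales (e.g. Turkish "I"), as pointed out in the thread.
    (with-case-table (standard-case-table)
      (string-lessp (downcase s1) (downcase s2)))))

(my/string-lessp-ignore-case "a" "B")
```

Binding the standard case table in the fallback branch avoids depending on whichever buffer happens to be current, which matters when entries from several files are sorted together.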
end of thread, other threads:[~2022-11-27 15:42 UTC | newest]

Thread overview: 24+ messages
2022-11-15  4:08 bug#59275: Unexpected return value of `string-collate-lessp' on Mac Ihor Radchenko
2022-11-15  9:51 ` Robert Pluim
2022-11-16  3:47 ` Ihor Radchenko
2022-11-15 13:46 ` Eli Zaretskii
2022-11-15 15:05 ` Ihor Radchenko
2022-11-15 15:16 ` Eli Zaretskii
2022-11-16  1:34 ` Ihor Radchenko
2022-11-16 13:00 ` Eli Zaretskii
2022-11-21  7:28 ` Ihor Radchenko
2022-11-21 13:31 ` Eli Zaretskii
2022-11-22  1:24 ` Ihor Radchenko
2022-11-22 12:56 ` Eli Zaretskii
2022-11-23 10:39 ` Ihor Radchenko
2022-11-23 14:58 ` Eli Zaretskii
2022-11-24  2:22 ` Ihor Radchenko
2022-11-24  7:23 ` Eli Zaretskii
2022-11-26  2:03 ` Ihor Radchenko
2022-11-26  8:06 ` Eli Zaretskii
2022-11-26  8:47 ` Ihor Radchenko
2022-11-26  9:22 ` Eli Zaretskii
2022-11-27 14:00 ` Maxim Nikulin
2022-11-27 14:23 ` Eli Zaretskii
2022-11-27 15:19 ` Maxim Nikulin
2022-11-27 15:42 ` Eli Zaretskii