unofficial mirror of emacs-devel@gnu.org 
* case-insensitive string comparison
@ 2022-07-19 17:27 Sam Steingold
  2022-07-19 18:06 ` Mattias Engdegård
                   ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Sam Steingold @ 2022-07-19 17:27 UTC (permalink / raw)
  To: emacs-devel

Hi,

Emacs Lisp has 3 ways to describe comparison that ignores case:

1. "ignore-case", as in, e.g., `member-ignore-case'
2. "case-fold", as in, e.g., `case-fold-search'
3. "case-insensitive", as in, e.g., `minibuffer-history-case-insensitive-variables'

Is there a general rule for when to use which name?

Specifically, I would like to add

--8<---------------cut here---------------start------------->8---
(defun string-equal-ignore-case (s1 s2)
  "Like `string-equal', but case-insensitive.
Upper-case and lower-case letters are treated as equal.
Unibyte strings are converted to multibyte for comparison."
  (eq t (compare-strings s1 0 nil s2 0 nil t)))
--8<---------------cut here---------------end--------------->8---

to subr.el next to `string-prefix-p' - is this okay?

Thanks.

-- 
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.2113
http://childpsy.net http://calmchildstories.com http://steingoldpsychology.com
https://www.peaceandtolerance.org/ https://ij.org/ https://www.memritv.org
Sex is like air.  It's only a big deal if you can't get any.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: case-insensitive string comparison
  2022-07-19 17:27 case-insensitive string comparison Sam Steingold
@ 2022-07-19 18:06 ` Mattias Engdegård
  2022-07-19 18:56   ` Sam Steingold
  2022-07-19 18:16 ` Stefan Kangas
  2022-07-19 19:39 ` Roland Winkler
  2 siblings, 1 reply; 45+ messages in thread
From: Mattias Engdegård @ 2022-07-19 18:06 UTC (permalink / raw)
  To: sds; +Cc: emacs-devel

On 19 July 2022 at 19:27, Sam Steingold <sds@gnu.org> wrote:

> (defun string-equal-ignore-case (s1 s2)

What would you tell someone complaining that

  (let ((rue "Straße"))
    (string-equal-ignore-case rue (upcase rue)))

returns nil? Asking for a friend.
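
For context, a minimal sketch of why such a comparison can fail, assuming `upcase' expands "ß" to "SS" (whether it does depends on the Emacs version).  `compare-strings' compares character by character, so strings of different lengths can never come out equal:

```elisp
;; The comparison used by the proposed function, written out inline.
;; "Straße" has 6 characters, "STRASSE" has 7, so the result is never t.
(eq t (compare-strings "Straße" 0 nil "STRASSE" 0 nil t)) ;; => nil

;; When "ß" is left in place the lengths match and only ordinary
;; letter case differs, so the strings compare equal.
(eq t (compare-strings "Straße" 0 nil "STRAßE" 0 nil t))  ;; => t
```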




^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: case-insensitive string comparison
  2022-07-19 17:27 case-insensitive string comparison Sam Steingold
  2022-07-19 18:06 ` Mattias Engdegård
@ 2022-07-19 18:16 ` Stefan Kangas
  2022-07-19 19:39 ` Roland Winkler
  2 siblings, 0 replies; 45+ messages in thread
From: Stefan Kangas @ 2022-07-19 18:16 UTC (permalink / raw)
  To: Sam Steingold, Emacs developers

Sam Steingold <sds@gnu.org> writes:

> Emacs Lisp has 3 ways to describe comparison that ignores case:
>
> 1. "ignore-case", as in, e.g., `member-ignore-case'
> 2. "case-fold", as in, e.g., `case-fold-search'
> 3. "case-insensitive", as in, e.g., `minibuffer-history-case-insensitive-variables'

See also Bug#56401.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: case-insensitive string comparison
  2022-07-19 18:06 ` Mattias Engdegård
@ 2022-07-19 18:56   ` Sam Steingold
  2022-07-20  4:39     ` tomas
  0 siblings, 1 reply; 45+ messages in thread
From: Sam Steingold @ 2022-07-19 18:56 UTC (permalink / raw)
  To: emacs-devel, Mattias Engdegård

> * Mattias Engdegård <mattiase@acm.org> [2022-07-19 20:06:50 +0200]:
>
> On 19 July 2022 at 19:27, Sam Steingold <sds@gnu.org> wrote:
>
>> (defun string-equal-ignore-case (s1 s2)
>
> What would you tell someone complaining that
>
>   (let ((rue "Straße"))
>     (string-equal-ignore-case rue (upcase rue)))
>
> returns nil? Asking for a friend.

This is a well-known bug in user code.
https://stackoverflow.com/q/319426/850781


* Re: case-insensitive string comparison
  2022-07-19 17:27 case-insensitive string comparison Sam Steingold
  2022-07-19 18:06 ` Mattias Engdegård
  2022-07-19 18:16 ` Stefan Kangas
@ 2022-07-19 19:39 ` Roland Winkler
  2022-07-19 22:47   ` Sam Steingold
  2 siblings, 1 reply; 45+ messages in thread
From: Roland Winkler @ 2022-07-19 19:39 UTC (permalink / raw)
  To: emacs-devel

On Tue, Jul 19 2022, Sam Steingold wrote:
> Specifically, I would like to add
>
> (defun string-equal-ignore-case (s1 s2)
>   "Like `string-equal', but case-insensitive.
> Upper-case and lower-case letters are treated as equal.
> Unibyte strings are converted to multibyte for comparison."
>   (eq t (compare-strings s1 0 nil s2 0 nil t)))
>
> to subr.el next to `string-prefix-p' - is this okay?

I have fairly often run into the problem of needing case-insensitive
string comparison, and I believe various elisp packages include a
"private" version of the above.  I always felt that
`(eq t (compare-strings s1 0 nil s2 0 nil t))' was a crutch for this
common problem.  Would it make sense to give the built-in function
string-equal an optional arg ignore-case?





* Re: case-insensitive string comparison
  2022-07-19 19:39 ` Roland Winkler
@ 2022-07-19 22:47   ` Sam Steingold
  2022-07-20  2:21     ` Roland Winkler
  2022-07-20  3:01     ` Stefan Monnier
  0 siblings, 2 replies; 45+ messages in thread
From: Sam Steingold @ 2022-07-19 22:47 UTC (permalink / raw)
  To: emacs-devel, Roland Winkler

> * Roland Winkler <winkler@gnu.org> [2022-07-19 14:39:32 -0500]:
>
> On Tue, Jul 19 2022, Sam Steingold wrote:
>> Specifically, I would like to add
>>
>> (defun string-equal-ignore-case (s1 s2)
>>   "Like `string-equal', but case-insensitive.
>> Upper-case and lower-case letters are treated as equal.
>> Unibyte strings are converted to multibyte for comparison."
>>   (eq t (compare-strings s1 0 nil s2 0 nil t)))
>>
>> to subr.el next to `string-prefix-p' - is this okay?
>
> I have fairly often run into the problem of needing case-insensitive
> string comparison, and I believe various elisp packages include a
> "private" version of the above.  I always felt that
> `(eq t (compare-strings s1 0 nil s2 0 nil t))' was a crutch for this
> common problem.  Would it make sense to give the built-in function
> string-equal an optional arg ignore-case?

No, because I need to be able to pass `string-equal-ignore-case' to
things like `cl-find' as `:test' &c.
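
A minimal sketch of that use, assuming cl-lib is available.  The name `my-string-equal-ignore-case' is a hypothetical stand-in for the proposed function, so the example does not depend on it already being in subr.el:

```elisp
(require 'cl-lib)

;; Hypothetical stand-in restating the definition proposed in this thread.
(defun my-string-equal-ignore-case (s1 s2)
  "Return t if S1 and S2 are equal when letter case is ignored."
  (eq t (compare-strings s1 0 nil s2 0 nil t)))

;; A two-argument predicate can be passed directly as `:test'.
(cl-find "BAR" '("foo" "bar" "baz")
         :test #'my-string-equal-ignore-case) ;; => "bar"
```

A predicate with an optional third argument would need a lambda wrapper here whenever case is to be ignored.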

Also, if you look at fns.c, `string-equal' is basically `memcmp', while
`compare-strings' is way more complex.

PS. Actually, compare-strings/ignore_case is broken because it does,
essentially, upcase both arguments, see https://stackoverflow.com/q/319426/850781



* Re: case-insensitive string comparison
  2022-07-19 22:47   ` Sam Steingold
@ 2022-07-20  2:21     ` Roland Winkler
  2022-07-20  3:01     ` Stefan Monnier
  1 sibling, 0 replies; 45+ messages in thread
From: Roland Winkler @ 2022-07-20  2:21 UTC (permalink / raw)
  To: emacs-devel

On Tue, Jul 19 2022, Sam Steingold wrote:
> No, because I need to be able to pass `string-equal-ignore-case' to
> things like `cl-find' as `:test' &c.

That sounds like a rather particular use case that, I believe, should
not motivate the design of what goes into subr.el.

> Also, if you look at fns.c, `string-equal' is basically `memcmp', while
> `compare-strings' is way more complex.

I don't think that's an obstacle for anything.

- The string-delimiting args and underlying machinery of compare-strings
  are something that can be skipped with string-equal.

- On the other hand, string comparison with case-folding is more complex
  than string comparison without case-folding, by its very definition.

> PS. Actually, compare-strings/ignore_case is broken because it does,
> essentially, upcase both arguments, see
> https://stackoverflow.com/q/319426/850781

That's a very different issue.




* Re: case-insensitive string comparison
  2022-07-19 22:47   ` Sam Steingold
  2022-07-20  2:21     ` Roland Winkler
@ 2022-07-20  3:01     ` Stefan Monnier
  2022-07-20 16:22       ` Sam Steingold
  2022-07-20 16:24       ` Roland Winkler
  1 sibling, 2 replies; 45+ messages in thread
From: Stefan Monnier @ 2022-07-20  3:01 UTC (permalink / raw)
  To: emacs-devel; +Cc: Roland Winkler

> PS. Actually, compare-strings/ignore_case is broken because it does,
> essentially, upcase both arguments, see https://stackoverflow.com/q/319426/850781

Hmm... `string-collate-equalp`?


        Stefan





* Re: case-insensitive string comparison
  2022-07-19 18:56   ` Sam Steingold
@ 2022-07-20  4:39     ` tomas
  2022-07-20 11:35       ` Eli Zaretskii
  0 siblings, 1 reply; 45+ messages in thread
From: tomas @ 2022-07-20  4:39 UTC (permalink / raw)
  To: emacs-devel; +Cc: Mattias Engdegård

On Tue, Jul 19, 2022 at 02:56:45PM -0400, Sam Steingold wrote:
> > * Mattias Engdegård <mattiase@acm.org> [2022-07-19 20:06:50 +0200]:
> >
> > On 19 July 2022 at 19:27, Sam Steingold <sds@gnu.org> wrote:
> >
> >> (defun string-equal-ignore-case (s1 s2)
> >
> > What would you tell someone complaining that
> >
> >   (let ((rue "Straße"))
> >     (string-equal-ignore-case rue (upcase rue)))
> >
> > returns nil? Asking for a friend.
> 
> This is a well-known bug in user code.
> https://stackoverflow.com/q/319426/850781

One case (heh) which gets too little attention in that
(good) ref is "i" "ı" vs. "İ" vs. "I". You have to decide
on a language environment to get a chance of doing it
right (in Latin languages there are only 1 and 4, and
they map to each other, in Turkic languages 1 and 3
correspond, as 2 and 4 do).

The ref to the Unicode FAQ [1] from your ref shows that
even the Unicode folks have given up on that. To me, it
looks like an especially sleazy way to admit "well, folks,
we've messed up on this one".

Human languages are a messy mix, in which politics figures
prominently. Unicode reflects that.

Cheers
[1] http://unicode.org/faq/casemap_charprop.html#9
-- 
t


* Re: case-insensitive string comparison
  2022-07-20  4:39     ` tomas
@ 2022-07-20 11:35       ` Eli Zaretskii
  2022-07-20 13:30         ` tomas
  0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2022-07-20 11:35 UTC (permalink / raw)
  To: tomas; +Cc: emacs-devel, mattiase

> Date: Wed, 20 Jul 2022 06:39:46 +0200
> Cc: Mattias Engdegård <mattiase@acm.org>
> From: <tomas@tuxteam.de>
> 
> One case (heh) which gets too little attention in that
> (good) ref is "i" "ı" vs. "İ" vs. "I". You have to decide
> on a language environment to get a chance of doing it
> right (in Latin languages there are only 1 and 4, and
> they map to each other, in Turkic languages 1 and 3
> correspond, as 2 and 4 do).
> 
> The ref to the Unicode FAQ [1] from your ref shows that
> even the Unicode folks have given up on that. To me, it
> looks like an especially sleazy way to admit "well, folks,
> we've messed up on this one".
> 
> Human languages are a messy mix, in which politics figures
> prominently. Unicode reflects that.

This could be hard on the Unicode Consortium, but relatively easy in
Emacs: just bind the case table of the current buffer to something
reasonable around code which performs case-insensitive comparison.




* Re: case-insensitive string comparison
  2022-07-20 11:35       ` Eli Zaretskii
@ 2022-07-20 13:30         ` tomas
  0 siblings, 0 replies; 45+ messages in thread
From: tomas @ 2022-07-20 13:30 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, mattiase

On Wed, Jul 20, 2022 at 02:35:21PM +0300, Eli Zaretskii wrote:
> > Date: Wed, 20 Jul 2022 06:39:46 +0200
> > Cc: Mattias Engdegård <mattiase@acm.org>
> > From: <tomas@tuxteam.de>

[...]

> > Human languages are a messy mix, in which politics figures
> > prominently. Unicode reflects that.
> 
> This could be hard on the Unicode Consortium, but relatively easy in
> Emacs: just bind the case table of the current buffer to something
> reasonable around code which performs case-insensitive comparison.

...still: "something reasonable" is (human) language-dependent; if
you're writing a German-Turkish dictionary...

Cheers
-- 
t


* Re: case-insensitive string comparison
  2022-07-20  3:01     ` Stefan Monnier
@ 2022-07-20 16:22       ` Sam Steingold
  2022-07-25 14:23         ` Sam Steingold
  2022-07-20 16:24       ` Roland Winkler
  1 sibling, 1 reply; 45+ messages in thread
From: Sam Steingold @ 2022-07-20 16:22 UTC (permalink / raw)
  To: emacs-devel, Stefan Monnier

> * Stefan Monnier <monnier@iro.umontreal.ca> [2022-07-19 23:01:31 -0400]:
>
>> PS. Actually, compare-strings/ignore_case is broken because it does,
>> essentially, upcase both arguments, see https://stackoverflow.com/q/319426/850781
>
> Hmm... `string-collate-equalp`?

--8<---------------cut here---------------start------------->8---
(string-collate-equalp "a" "A" current-locale-environment t)
==> nil
current-locale-environment
==> "en_US.UTF-8"
--8<---------------cut here---------------end--------------->8---


* Re: case-insensitive string comparison
  2022-07-20  3:01     ` Stefan Monnier
  2022-07-20 16:22       ` Sam Steingold
@ 2022-07-20 16:24       ` Roland Winkler
  2022-07-20 17:06         ` Sam Steingold
  2022-07-20 17:12         ` Eli Zaretskii
  1 sibling, 2 replies; 45+ messages in thread
From: Roland Winkler @ 2022-07-20 16:24 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

On Tue, Jul 19 2022, Stefan Monnier wrote:
>> PS. Actually, compare-strings/ignore_case is broken because it does,
>> essentially, upcase both arguments, see
>> https://stackoverflow.com/q/319426/850781
>
> Hmm... `string-collate-equalp`?

It would be nice if the node in the elisp manual on "comparison of
characters and strings" included some discussion on which use cases
with case-folding can / should preferentially be covered by the
locale-dependent function string-collate-equalp versus something like
compare-strings.

In my narrow world, I can think of two extremes:

- bibtex-mode needs to compare BibTeX keywords that are ascii strings
  for which case is insignificant.  So bibtex-string= is exactly what
  Sam suggests to put into subr.el, and I believe that's good enough
  (just as almost any other approach I can think of for this particular
  problem).

- BBDB needs to know whether a name is already present in the database
  or not, ignoring case.  The function bbdb-string= is again what Sam
  suggests to put into subr.el.  The function string-collate-equalp
  might be better suited for this.  But which locale should it use?  The
  records in my BBDB cover larger parts of the world and I do not even
  know which locale(s) might work best for each of them, not to mention
  that BBDB needs to loop over all records.  Is there a "universal
  default locale"?




* Re: case-insensitive string comparison
  2022-07-20 16:24       ` Roland Winkler
@ 2022-07-20 17:06         ` Sam Steingold
  2022-07-20 17:16           ` Eli Zaretskii
  2022-07-20 17:12         ` Eli Zaretskii
  1 sibling, 1 reply; 45+ messages in thread
From: Sam Steingold @ 2022-07-20 17:06 UTC (permalink / raw)
  To: emacs-devel, Roland Winkler

> * Roland Winkler <winkler@gnu.org> [2022-07-20 11:24:38 -0500]:
>
> On Tue, Jul 19 2022, Stefan Monnier wrote:
>>> PS. Actually, compare-strings/ignore_case is broken because it does,
>>> essentially, upcase both arguments, see
>>> https://stackoverflow.com/q/319426/850781
>>
>> Hmm... `string-collate-equalp`?
>
> - BBDB needs to know whether a name is already present in the database
>   or not, ignoring case.  The function bbdb-string= is again what Sam
>   suggests to put into subr.el.  The function string-collate-equalp
>   might be better suited for this.  But which locale should it use?

`bbdb-file-coding-system' ?

> Is there a "universal default locale"?

UTF-8 is, I think, the generally accepted universal default today.



* Re: case-insensitive string comparison
  2022-07-20 16:24       ` Roland Winkler
  2022-07-20 17:06         ` Sam Steingold
@ 2022-07-20 17:12         ` Eli Zaretskii
  2022-07-20 17:37           ` Roland Winkler
  1 sibling, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2022-07-20 17:12 UTC (permalink / raw)
  To: Roland Winkler; +Cc: monnier, emacs-devel

> From: Roland Winkler <winkler@gnu.org>
> Cc: emacs-devel@gnu.org
> Date: Wed, 20 Jul 2022 11:24:38 -0500
> 
> On Tue, Jul 19 2022, Stefan Monnier wrote:
> >> PS. Actually, compare-strings/ignore_case is broken because it does,
> >> essentially, upcase both arguments, see
> >> https://stackoverflow.com/q/319426/850781
> >
> > Hmm... `string-collate-equalp`?
> 
> It would be nice if the node in the elisp manual on "comparison of
> characters and strings" included some discussion on which use cases
> with case-folding can / should preferentially be covered by the
> locale-dependent function string-collate-equalp versus something like
> compare-strings.

I hear you, but your request is impossible to fulfill in practice.
That's because the collation rules used by this function are
implemented in the C library, and even if we know the locale,
different implementations of libc use different collation rules (in
addition, collation rules for some locales change with time).

The answer to the question "what comparison function should I use in a
specific use case" depends on the details of the use case, on the
locale, and on the libc against which Emacs was linked.

That is why the ELisp manual and the doc strings are intentionally
vague about exactly what result you should expect: we simply cannot
say anything there that is both accurate enough and general enough.

compare-strings, by contrast, doesn't use any collation rules, only
the current buffer's value of the case table.  So its results are more
predictable.

> - bibtex-mode needs to compare BibTeX keywords that are ascii strings
>   for which case is insignificant.  So bibtex-string= is exactly what
>   Sam suggests to put into subr.el, and I believe that's good enough
>   (just as almost any other approach I can think of for this particular
>   problem).
> 
> - BBDB needs to know whether a name is already present in the database
>   or not, ignoring case.  The function bbdb-string= is again what Sam
>   suggests to put into subr.el.  The function string-collate-equalp
>   might be better suited for this.  But which locale should it use?  The
>   records in my BBDB cover larger parts of the world and I do not even
>   know which locale(s) might work best for each of them, not to mention
>   that BBDB needs to loop over all records.  Is there a "universal
>   default locale"?

That "universal default locale" is what Emacs uses, modulo the few
problematic characters like the dotless I etc.  For 100% predictable
results, build your own case table, bind the buffer's case table to
it, and then call case-insensitive comparison.
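
A minimal sketch of that approach for the ASCII-only case, using `with-case-table' and `ascii-case-table'.  The function name `my-ascii-string-equal-ignore-case' is hypothetical:

```elisp
;; Hypothetical helper: case folding is limited to ASCII letters because
;; the comparison runs with `ascii-case-table' as the current case table.
(defun my-ascii-string-equal-ignore-case (s1 s2)
  "Return t if S1 and S2 are equal, folding case for ASCII letters only."
  (with-case-table ascii-case-table
    (eq t (compare-strings s1 0 nil s2 0 nil t))))

(my-ascii-string-equal-ignore-case "BibTeX" "bibtex") ;; => t
;; Non-ASCII letters have no case mapping in this table, so they are
;; only equal to themselves.
(my-ascii-string-equal-ignore-case "é" "É")           ;; => nil
```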




* Re: case-insensitive string comparison
  2022-07-20 17:06         ` Sam Steingold
@ 2022-07-20 17:16           ` Eli Zaretskii
  0 siblings, 0 replies; 45+ messages in thread
From: Eli Zaretskii @ 2022-07-20 17:16 UTC (permalink / raw)
  To: sds; +Cc: emacs-devel, winkler

> From: Sam Steingold <sds@gnu.org>
> Date: Wed, 20 Jul 2022 13:06:41 -0400
> 
> > - BBDB needs to know whether a name is already present in the database
> >   or not, ignoring case.  The function bbdb-string= is again what Sam
> >   suggests to put into subr.el.  The function string-collate-equalp
> >   might be better suited for this.  But which locale should it use?
> 
> `bbdb-file-coding-system' ?

That's not the locale, that's a locale's _codeset_.

> > Is there a "universal default locale"?
> 
> UTF-8 is, I think, the generally accepted universal default today.

There's no such locale, AFAIK.  UTF-8 is again just a codeset.





* Re: case-insensitive string comparison
  2022-07-20 17:12         ` Eli Zaretskii
@ 2022-07-20 17:37           ` Roland Winkler
  2022-07-20 17:50             ` Eli Zaretskii
  0 siblings, 1 reply; 45+ messages in thread
From: Roland Winkler @ 2022-07-20 17:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

On Wed, Jul 20 2022, Eli Zaretskii wrote:
>> It would be nice if the node in the elisp manual on "comparison of
>> characters and strings" included some discussion on which use cases
>> with case-folding can / should preferentially be covered by the
>> locale-dependent function string-collate-equalp versus something like
>> compare-strings.
>
> I hear you, but your request is impossible to fulfill in practice.
> That's because the collation rules used by this function are
> implemented in the C library, and even if we know the locale,
> different implementations of libc use different collation rules (in
> addition, collation rules for some locales change with time).

Even mentioning the difficulties could be useful here.  The elisp manual
is used by people who want to develop code that works for a wide range
of users.  So even if string comparison is a slippery terrain these
elisp hackers need to make design choices that work best for most users.

What usage scenarios in elisp packages might benefit from
string-collate-equalp even if this function depends on details that can
be quite different for different users?

>> - BBDB needs to know whether a name is already present in the database
>>   or not, ignoring case.  The function bbdb-string= is again what Sam
>>   suggests to put into subr.el.  The function string-collate-equalp
>>   might be better suited for this.  But which locale should it use?  The
>>   records in my BBDB cover larger parts of the world and I do not even
>>   know which locale(s) might work best for each of them, not to mention
>>   that BBDB needs to loop over all records.  Is there a "universal
>>   default locale"?
>
> That "universal default locale" is what Emacs uses, modulo the few
> problematic characters like the dotless I etc.  For 100% predictable
> results, build your own case table, bind the buffer's case table to
> it, and then call case-insensitive comparison.

I am not sure I can follow your argument.  Do you suggest that, likely,
BBDB will work best if it compares names using compare-strings?
(I'd be glad to hear that.)  This code should work for users who do not
want to build their own case table and stuff like that.

Thanks!




* Re: case-insensitive string comparison
  2022-07-20 17:37           ` Roland Winkler
@ 2022-07-20 17:50             ` Eli Zaretskii
  2022-07-20 18:10               ` Roland Winkler
  0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2022-07-20 17:50 UTC (permalink / raw)
  To: Roland Winkler; +Cc: monnier, emacs-devel

> From: Roland Winkler <winkler@gnu.org>
> Cc: monnier@iro.umontreal.ca,  emacs-devel@gnu.org
> Date: Wed, 20 Jul 2022 12:37:29 -0500
> 
> > I hear you, but your request is impossible to fulfill in practice.
> > That's because the collation rules used by this function are
> > implemented in the C library, and even if we know the locale,
> > different implementations of libc use different collation rules (in
> > addition, collation rules for some locales change with time).
> 
> Even mentioning the difficulties could be useful here.

I'm not sure I agree.  To describe all the important aspects of this
would take too long, and it isn't the job of our manual to document
this stuff.  Read this if you want to know:

  https://unicode.org/reports/tr10/

> The elisp manual is used by people who want to develop code that
> works for a wide range of users.  So even if string comparison is a
> slippery terrain these elisp hackers need to make design choices
> that work best for most users.

Luckily, Emacs Lisp programs rarely need this.

> What usage scenarios in elisp packages might benefit from
> string-collate-equalp even if this function depends on details that can
> be quite different for different users?

For example, sorting file names.  If you want to get anything similar
to what GNU 'ls' does on GNU/Linux (in particular, with punctuation
characters in file names), you need to use the locale's collation
rules as implemented by glibc.  Which is what string-collate-lessp
does.
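
A minimal sketch of such a sort.  The function name `my-sort-file-names' is hypothetical, and the resulting order is deliberately not shown, since it depends on the locale and on the C library:

```elisp
;; Hypothetical helper: sort NAMES using the current locale's collation
;; rules.  `sort' is destructive, so the caller's list is copied first.
(defun my-sort-file-names (names)
  "Return a copy of NAMES sorted with `string-collate-lessp'."
  (sort (copy-sequence names) #'string-collate-lessp))

(my-sort-file-names '("b.txt" "_a.txt" "A.txt" "a.txt"))
```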

> >> - BBDB needs to know whether a name is already present in the database
> >>   or not, ignoring case.  The function bbdb-string= is again what Sam
> >>   suggests to put into subr.el.  The function string-collate-equalp
> >>   might be better suited for this.  But which locale should it use?  The
> >>   records in my BBDB cover larger parts of the world and I do not even
> >>   know which locale(s) might work best for each of them, not to mention
> >>   that BBDB needs to loop over all records.  Is there a "universal
> >>   default locale"?
> >
> > That "universal default locale" is what Emacs uses, modulo the few
> > problematic characters like the dotless I etc.  For 100% predictable
> > results, build your own case table, bind the buffer's case table to
> > it, and then call case-insensitive comparison.
> 
> I am not sure I can follow your argument.  Do you suggest that, likely,
> BBDB will work best if it compares names using compare-strings?

Yes.  But in addition, you should set up the case table of the current
buffer when you do so, because otherwise special cases with the likes
of the Turkish language's dotless I could in rare cases screw you.

> (I'd be glad to hear that.)  This code should work for users who do not
> want to build their own case table and stuff like that.

It is not the users who should build the case table; BBDB (or whatever
Lisp program needs the comparison) should.  It's not that hard,
really: if you only need ASCII, use ascii-case-table, otherwise copy
the standard case-table and modify it to make sure I downcases to i
and similarly with a few other exceptional letters.
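
A minimal sketch of the second option, assuming that only the I/i pair needs to be pinned.  The names `my-name-case-table' and `my-name-equal-p' are hypothetical, and a real program would probably pin further exceptional letters too:

```elisp
;; Hypothetical program-private case table: a copy of the standard one
;; in which I and i are forced to be a case pair.
(defvar my-name-case-table
  (let ((table (copy-case-table (standard-case-table))))
    ;; Side effect: `set-case-syntax-pair' also gives both characters
    ;; word syntax in the standard syntax table.
    (set-case-syntax-pair ?I ?i table)
    table)
  "Case table used only when comparing names.")

(defun my-name-equal-p (n1 n2)
  "Return t if names N1 and N2 are equal, ignoring case."
  (with-case-table my-name-case-table
    (eq t (compare-strings n1 0 nil n2 0 nil t))))

(my-name-equal-p "IRIS" "iris") ;; => t
```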




* Re: case-insensitive string comparison
  2022-07-20 17:50             ` Eli Zaretskii
@ 2022-07-20 18:10               ` Roland Winkler
  2022-07-20 18:16                 ` Eli Zaretskii
  0 siblings, 1 reply; 45+ messages in thread
From: Roland Winkler @ 2022-07-20 18:10 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

On Wed, Jul 20 2022, Eli Zaretskii wrote:
>> Even mentioning the difficulties could be useful here.
>
> I'm not sure I agree.  To describe all the important aspects of this
> would take too long, and it isn't the job of our manual to document
> this stuff.  Read this if you want to know:
>
>   https://unicode.org/reports/tr10/

A footnote pointing the interested reader to this report could already
be useful.  I am not suggesting to try to provide a more exhaustive
discussion of this topic.  I am suggesting to mention briefly that the
topic is subtle and depends on details "beyond emacs itself".

>> I am not sure I can follow your argument.  Do you suggest that, likely,
>> BBDB will work best if it compares names using compare-strings?
>
> Yes.

Thanks, that's already good to know!

> But in addition, you should set up the case table of the current
> buffer when you do so, because otherwise special cases with the likes
> of the Turkish language's dotless I could in rare cases screw you.
>
>> (I'd be glad to hear that.)  This code should work for users who do not
>> want to build their own case table and stuff like that.
>
> It is not the users who should build the case table; BBDB (or whatever
> Lisp program needs the comparison) should.  It's not that hard,
> really: if you only need ASCII, use ascii-case-table, otherwise copy
> the standard case-table and modify it to make sure I downcases to i
> and similarly with a few other exceptional letters.

I am not sure it would be possible to predict how a default case table
for BBDB should differ from the standard case table.  BBDB might be the
only package of a user that accumulates strings that go beyond what
otherwise a user is dealing with regularly.  If there is a sensible
"BBDB default case table" I'd hope that this is the standard case table.

Or if not: can you suggest an emacs package that I can look into as a
source of inspiration?




* Re: case-insensitive string comparison
  2022-07-20 18:10               ` Roland Winkler
@ 2022-07-20 18:16                 ` Eli Zaretskii
  2022-07-20 18:18                   ` [External] : " Drew Adams
  2022-07-21  6:56                   ` Eli Zaretskii
  0 siblings, 2 replies; 45+ messages in thread
From: Eli Zaretskii @ 2022-07-20 18:16 UTC (permalink / raw)
  To: Roland Winkler; +Cc: monnier, emacs-devel

> From: Roland Winkler <winkler@gnu.org>
> Cc: monnier@iro.umontreal.ca,  emacs-devel@gnu.org
> Date: Wed, 20 Jul 2022 13:10:35 -0500
> 
> On Wed, Jul 20 2022, Eli Zaretskii wrote:
> >> Even mentioning the difficulties could be useful here.
> >
> > I'm not sure I agree.  To describe all the important aspects of this
> > would take too long, and it isn't the job of our manual to document
> > this stuff.  Read this if you want to know:
> >
> >   https://unicode.org/reports/tr10/
> 
> A footnote pointing the interested reader to this report could already
> be useful.

I'll see if we have a good place for that.

> > It is not the users who should build the case table; BBDB (or whatever
> > Lisp program needs the comparison) should.  It's not that hard,
> > really: if you only need ASCII, use ascii-case-table, otherwise copy
> > the standard case-table and modify it to make sure I downcases to i
> > and similarly with a few other exceptional letters.
> 
> I am not sure it would be possible to predict how a default case table
> for BBDB should differ from the standard case table.  BBDB might be the
> only package of a user that accumulates strings that go beyond what
> otherwise a user is dealing with regularly.  If there is a sensible
> "BBDB default case table" I'd hope that this is the standard case table.

Maybe BBDB can just use the standard case table, I don't know.  You
should be the judge of that: if your users don't care about I not being
equal to i case-insensitively, when the language-environment happens
to be Turkish, then you shouldn't worry about that.
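
For illustration, here is a minimal sketch of the two options described
above.  The `my-' names are hypothetical and are not part of Emacs or
BBDB.  The sketch assumes that `compare-strings' consults the current
buffer's case table when IGNORE-CASE is non-nil, which is why each
comparison is wrapped in `with-case-table'.

```elisp
;; Hypothetical helper names.  Only `with-case-table', `ascii-case-table',
;; `copy-case-table', `standard-case-table', `set-case-syntax-pair' and
;; `compare-strings' come from Emacs itself.

(defun my-string-equal-ascii-nocase (s1 s2)
  "Return t if S1 and S2 are equal, folding case with the ASCII case table."
  (with-case-table ascii-case-table
    (eq t (compare-strings s1 0 nil s2 0 nil t))))

(defvar my-tailored-case-table
  (let ((table (copy-case-table (standard-case-table))))
    ;; Make sure I and i form a case pair, whatever the language
    ;; environment did to the standard case table.
    (set-case-syntax-pair ?I ?i table)
    table)
  "Copy of the standard case table in which I downcases to i.")

(defun my-string-equal-tailored-nocase (s1 s2)
  "Return t if S1 and S2 are equal, ignoring case per `my-tailored-case-table'."
  (with-case-table my-tailored-case-table
    (eq t (compare-strings s1 0 nil s2 0 nil t))))
```

Other exceptional letters would need their own `set-case-syntax-pair'
calls; which ones matter depends on the data the program handles.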

> Or if not: can you suggest an emacs package that I can look into as a
> source of inspiration?

I'm not aware of any (which is not to say there isn't any, just that I
don't know).



^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [External] : Re: case-insensitive string comparison
  2022-07-20 18:16                 ` Eli Zaretskii
@ 2022-07-20 18:18                   ` Drew Adams
  2022-07-21  6:56                   ` Eli Zaretskii
  1 sibling, 0 replies; 45+ messages in thread
From: Drew Adams @ 2022-07-20 18:18 UTC (permalink / raw)
  To: Eli Zaretskii, Roland Winkler
  Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org

> > A footnote pointing the interested reader
> > to this report could already be useful.
> 
> I'll see if we have a good place for that.

+1. Thx.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: case-insensitive string comparison
  2022-07-20 18:16                 ` Eli Zaretskii
  2022-07-20 18:18                   ` [External] : " Drew Adams
@ 2022-07-21  6:56                   ` Eli Zaretskii
  2022-07-21 14:19                     ` Roland Winkler
  1 sibling, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2022-07-21  6:56 UTC (permalink / raw)
  To: winkler; +Cc: monnier, emacs-devel

> Date: Wed, 20 Jul 2022 21:16:12 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> 
> > >   https://unicode.org/reports/tr10/
> > 
> > A footnote pointing the interested reader to this report could already
> > be useful.
> 
> I'll see if we have a good place for that.

Done.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: case-insensitive string comparison
  2022-07-21  6:56                   ` Eli Zaretskii
@ 2022-07-21 14:19                     ` Roland Winkler
  2022-07-21 15:53                       ` Eli Zaretskii
  0 siblings, 1 reply; 45+ messages in thread
From: Roland Winkler @ 2022-07-21 14:19 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

On Thu, Jul 21 2022, Eli Zaretskii wrote:
>> Date: Wed, 20 Jul 2022 21:16:12 +0300
>> From: Eli Zaretskii <eliz@gnu.org>
>> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
>> 
>> > >   https://unicode.org/reports/tr10/
>> > 
>> > A footnote pointing the interested reader to this report could already
>> > be useful.
>> 
>> I'll see if we have a good place for that.
>
> Done.

Thank you! - I always thought that such technical reports were beyond my
comprehension.  But this unicode report is quite readable.  So there is
no need for emacs to reinvent the wheel.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: case-insensitive string comparison
  2022-07-21 14:19                     ` Roland Winkler
@ 2022-07-21 15:53                       ` Eli Zaretskii
  2022-07-21 16:35                         ` Roland Winkler
  0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2022-07-21 15:53 UTC (permalink / raw)
  To: Roland Winkler; +Cc: monnier, emacs-devel

> From: Roland Winkler <winkler@gnu.org>
> Cc: monnier@iro.umontreal.ca,  emacs-devel@gnu.org
> Date: Thu, 21 Jul 2022 09:19:37 -0500
> 
> On Thu, Jul 21 2022, Eli Zaretskii wrote:
> >> Date: Wed, 20 Jul 2022 21:16:12 +0300
> >> From: Eli Zaretskii <eliz@gnu.org>
> >> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> >> 
> >> > >   https://unicode.org/reports/tr10/
> >> > 
> >> > A footnote pointing the interested reader to this report could already
> >> > be useful.
> >> 
> >> I'll see if we have a good place for that.
> >
> > Done.
> 
> Thank you! - I always thought that such technical reports were beyond my
> comprehension.  But this unicode report is quite readable.  So there is
> no need for emacs to reinvent the wheel.

The implementation of strcoll is in the underlying libc, so Emacs
_cannot_ possibly reinvent this wheel, even if we wanted to.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: case-insensitive string comparison
  2022-07-21 15:53                       ` Eli Zaretskii
@ 2022-07-21 16:35                         ` Roland Winkler
  0 siblings, 0 replies; 45+ messages in thread
From: Roland Winkler @ 2022-07-21 16:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

On Thu, Jul 21 2022, Eli Zaretskii wrote:
>> Thank you! - I always thought that such technical reports were beyond my
>> comprehension.  But this unicode report is quite readable.  So there is
>> no need for emacs to reinvent the wheel.
>
> The implementation of strcoll is in the underlying libc, so Emacs
> _cannot_ possibly reinvent this wheel, even if we wanted to.

I had in mind the elisp manual reinventing (rewriting) the readable
unicode report.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: case-insensitive string comparison
  2022-07-20 16:22       ` Sam Steingold
@ 2022-07-25 14:23         ` Sam Steingold
  2022-07-25 15:58           ` Eli Zaretskii
                             ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Sam Steingold @ 2022-07-25 14:23 UTC (permalink / raw)
  To: emacs-devel

> * Sam Steingold <fqf@tah.bet> [2022-07-20 12:22:33 -0400]:
>
>> * Stefan Monnier <zbaavre@veb.hzbagerny.pn> [2022-07-19 23:01:31 -0400]:
>>
>>> PS. Actually, compare-strings/ignore_case is broken because it does,
>>> essentially, upcase both arguments, see https://stackoverflow.com/q/319426/850781
>>
>> Hmm... `string-collate-equalp`?
>
> (string-collate-equalp "a" "A" current-locale-environment t)
> ==> nil
> current-locale-environment
> ==> "en_US.UTF-8"

So, how do we do case-insensitive string comparison in Emacs?

Is it okay to add a `string-equal-ignore-case' based on `compare-strings'?
(even though it does not recognize "SS" and "ß" as equal)

Or should we first implement something like casefold in Python?
https://docs.python.org/3/library/stdtypes.html#str.casefold

-- 
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.2113
http://childpsy.net http://calmchildstories.com http://steingoldpsychology.com
https://camera.org https://honestreporting.com https://www.memritv.org
Warning! Dates in calendar are closer than they appear!



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: case-insensitive string comparison
  2022-07-25 14:23         ` Sam Steingold
@ 2022-07-25 15:58           ` Eli Zaretskii
  2022-07-25 19:39             ` Sam Steingold
  2022-07-25 19:37           ` Bruno Haible
  2022-07-26  3:24           ` Richard Stallman
  2 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2022-07-25 15:58 UTC (permalink / raw)
  To: sds; +Cc: emacs-devel

> From: Sam Steingold <sds@gnu.org>
> Date: Mon, 25 Jul 2022 10:23:30 -0400
> 
> >> Hmm... `string-collate-equalp`?
> >
> > (string-collate-equalp "a" "A" current-locale-environment t)
> > ==> nil
> > current-locale-environment
> > ==> "en_US.UTF-8"

I cannot reproduce this:

  (string-collate-equalp "a" "A" current-locale-environment t)
    => t
  current-locale-environment
    => "en_US.UTF-8"

What OS is this, and which Emacs version?

> So, how do we do case-insensitive string comparison in Emacs?

If you want locale-specific collation, as Stefan said, above.

> Is it okay to add a `string-equal-ignore-case' based on `compare-strings'?
> (even though it does not recognize "SS" and "ß" as equal)

What's wrong with calling compare-strings directly?

> Or should we first implement something like casefold in Python?
> https://docs.python.org/3/library/stdtypes.html#str.casefold

Ha! we already have that:

  (get-char-code-property ?ß 'special-uppercase)
    => "SS"

Give us some credit, yes?



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: case-insensitive string comparison
  2022-07-25 14:23         ` Sam Steingold
  2022-07-25 15:58           ` Eli Zaretskii
@ 2022-07-25 19:37           ` Bruno Haible
  2022-07-26  3:24           ` Richard Stallman
  2 siblings, 0 replies; 45+ messages in thread
From: Bruno Haible @ 2022-07-25 19:37 UTC (permalink / raw)
  To: emacs-devel; +Cc: Sam Steingold

Sam Steingold asked:
> > (string-collate-equalp "a" "A" current-locale-environment t)
> > ==> nil
> > current-locale-environment
> > ==> "en_US.UTF-8"
> 
> So, how do we do case-insensitive string comparison in Emacs?
> 
> Is it okay to add a `string-equal-ignore-case' based on `compare-strings'?
> (even though it does not recognize "SS" and "ß" as equal)
> 
> Or should we first implement something like casefold in Python?
> https://docs.python.org/3/library/stdtypes.html#str.casefold

The Unicode Standard's algorithm for case-insensitive string comparison
is indeed much better thought-out than anything that you could come
up with within a month.

You are pointing to the Python implementation. But there's also an
implementation in GNU libunistring [1] and one in ICU4C <unicode/ustring.h>
[2]. Emacs could surely use one of these.

The implementation from GNU libunistring is also available through Gnulib,
as a set of modules [3]. The most relevant modules are
  unicase/u8-casecmp
  unicase/u8-casecoll
  unicase/u8-casefold
  unicase/u8-casemap
  unicase/u8-casexfrm
  unicase/u8-ct-casefold
  unicase/u8-ct-tolower
  unicase/u8-ct-totitle
  unicase/u8-ct-toupper

Bruno

[1] https://www.gnu.org/software/libunistring/manual/html_node/Case-insensitive-comparison.html
[2] https://unicode-org.github.io/icu/userguide/transforms/casemappings.html
[3] https://www.gnu.org/software/gnulib/MODULES.html






^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: case-insensitive string comparison
  2022-07-25 15:58           ` Eli Zaretskii
@ 2022-07-25 19:39             ` Sam Steingold
  2022-07-26 13:05               ` Eli Zaretskii
  0 siblings, 1 reply; 45+ messages in thread
From: Sam Steingold @ 2022-07-25 19:39 UTC (permalink / raw)
  To: emacs-devel, Eli Zaretskii

> * Eli Zaretskii <ryvm@tah.bet> [2022-07-25 18:58:19 +0300]:
>
>> From: Sam Steingold <sds@gnu.org>
>> Date: Mon, 25 Jul 2022 10:23:30 -0400
>> 
>> >> Hmm... `string-collate-equalp`?
>> >
>> > (string-collate-equalp "a" "A" current-locale-environment t)
>> > ==> nil
>> > current-locale-environment
>> > ==> "en_US.UTF-8"
>
> I cannot reproduce this:
>
>   (string-collate-equalp "a" "A" current-locale-environment t)
>     => t
>   current-locale-environment
>     => "en_US.UTF-8"
>
> What OS is this, and which Emacs version?

GNU Emacs 29.0.50 (build 5, x86_64-apple-darwin21.5.0, NS appkit-2113.50 Version 12.4 (Build 21F79))
 of 2022-07-25
Repository revision: ffe12ff2503917e47c0356195b31430996c148f9
Repository branch: master
Windowing system distributor 'Apple', version 10.3.2113
System Description:  macOS 12.4

>> So, how do we do case-insensitive string comparison in Emacs?
>
> If you want locale-specific collation, as Stefan said, above.

Do I?
Is it really true that "UTF-8" without "en_US" does _not_ define case conversion?
but https://docs.python.org/3/library/stdtypes.html#str.casefold says

>>>>> The casefolding algorithm is described in section 3.13 of the Unicode Standard.

this seems to imply that user locale setting is not relevant.
(locale _is_ mentioned in
https://www.unicode.org/versions/Unicode14.0.0/ch03.pdf but it looks
like a _specification_ of the algorithm, not its _modification_).

>> Is it okay to add a `string-equal-ignore-case' based on `compare-strings'?
>> (even though it does not recognize "SS" and "ß" as equal)
>
> What's wrong with calling compare-strings directly?

I want to be able to use `string-equal-ignore-case' as a :test argument
to things like `cl-find'.
And I don't want to have to think about encodings and locales.
So I want the core Emacs maintainers who know about these things to
provide me with something that works. Thanks in advance! ;-)

The fact that there are ***TWO*** core functions that compare strings -
`string-collate-equalp' and `compare-strings' - does not look right to me.
_I_ should not have to decide which function to use.

>> Or should we first implement something like casefold in Python?
>
> Ha! we already have that:
>
>   (get-char-code-property ?ß 'special-uppercase)
>     => "SS"

Nice, but how does it help me if
--8<---------------cut here---------------start------------->8---
(compare-strings "SS" 0 nil "ß" 0 nil t)
==> -1
(string-collate-equalp "SS" "ß" "en_US.UTF-8" t)
==> nil
--8<---------------cut here---------------end--------------->8---
instead of `t'?
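
One way the `special-uppercase' property could be put to use is
sketched below.  This is only an illustration under stated assumptions:
the `my-' functions are hypothetical, the approach upcases rather than
applying the Unicode case-folding algorithm, and it relies on
`get-char-code-property' returning nil for characters that have no
special uppercase mapping.

```elisp
;; Hypothetical helpers, not part of Emacs.

(defun my-upcase-full (string)
  "Return STRING upcased, expanding characters that upcase to several."
  (mapconcat (lambda (ch)
               ;; Prefer the multi-character mapping (e.g. ß -> "SS");
               ;; otherwise fall back to the ordinary per-character upcase.
               (or (get-char-code-property ch 'special-uppercase)
                   (string (upcase ch))))
             string ""))

(defun my-string-equal-fold (s1 s2)
  "Return t if S1 and S2 are equal after `my-upcase-full'."
  (string= (my-upcase-full s1) (my-upcase-full s2)))

;; Usage: compare the pair from the example above.
(my-string-equal-fold "SS" "ß")
```

Unlike `compare-strings', this allocates two new strings per call, and
it does not address locale-specific rules such as the Turkish ones.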

> Give us some credit, yes?

Sure, and I am very grateful!

-- 
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.2113
http://childpsy.net http://calmchildstories.com http://steingoldpsychology.com
https://fairforall.org https://camera.org https://thereligionofpeace.com
He who laughs last did not get the joke.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: case-insensitive string comparison
  2022-07-25 14:23         ` Sam Steingold
  2022-07-25 15:58           ` Eli Zaretskii
  2022-07-25 19:37           ` Bruno Haible
@ 2022-07-26  3:24           ` Richard Stallman
  2022-07-26  8:00             ` Helmut Eller
  2022-07-26 14:28             ` Sam Steingold
  2 siblings, 2 replies; 45+ messages in thread
From: Richard Stallman @ 2022-07-26  3:24 UTC (permalink / raw)
  To: sds; +Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > So, how do we do case-insensitive string comparison in Emacs?

Users could do it by calling `compare-strings' directly.

  > Is it okay to add a `string-equal-ignore-case' based on `compare-strings'?
  > (even though it does not recognize "SS" and "ß" as equal)

A function `string-equal-ignore-case' would make sense.  My question is,
is it worth the cost in complexity, or is it better to urge users to call
`compare-strings' directly?

That depends on how often programs will do case-insensitive string comparison.
If frequently, that gives a bigger upside to `string-equal-ignore-case'.

  > Or should we first implement something like casefold in Python?
  > https://docs.python.org/3/library/stdtypes.html#str.casefold

That casefold operation is not the same thing as ignoring case in
Emacs.  How to integrate something like that into Emacs, and in
general how to handle `ß' properly in case conversion, calls for more
thought.

It's possible that Python's handling is good and that we should implement
something similar.  It would be useful for people to study that option
including designing how to put it into Emacs, and whether the results
would be problem-free.

Part of the issue is how this should affect the existing case features
including searches in the buffer, case conversion commands and functions,
and `compare-strings'.  Also how it interacts with Turkish.

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)





^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: case-insensitive string comparison
  2022-07-26  3:24           ` Richard Stallman
@ 2022-07-26  8:00             ` Helmut Eller
  2022-07-26 12:21               ` Eli Zaretskii
  2022-07-27  2:58               ` Richard Stallman
  2022-07-26 14:28             ` Sam Steingold
  1 sibling, 2 replies; 45+ messages in thread
From: Helmut Eller @ 2022-07-26  8:00 UTC (permalink / raw)
  To: Richard Stallman; +Cc: sds, emacs-devel

On Mon, Jul 25 2022, Richard Stallman wrote:

> How to integrate something like that into Emacs, and in
> general how to handle `ß' properly in case conversion, calls for more
> thought.

Unicode defines a LATIN CAPITAL LETTER SHARP S `ẞ' U+1E9E.  So maybe
that's an easy problem now.  Not sure what typographers think about it.

Helmut



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: case-insensitive string comparison
  2022-07-26  8:00             ` Helmut Eller
@ 2022-07-26 12:21               ` Eli Zaretskii
  2022-07-27  2:58               ` Richard Stallman
  1 sibling, 0 replies; 45+ messages in thread
From: Eli Zaretskii @ 2022-07-26 12:21 UTC (permalink / raw)
  To: Helmut Eller; +Cc: rms, sds, emacs-devel

> From: Helmut Eller <eller.helmut@gmail.com>
> Cc: sds@gnu.org,  emacs-devel@gnu.org
> Date: Tue, 26 Jul 2022 10:00:43 +0200
> 
> On Mon, Jul 25 2022, Richard Stallman wrote:
> 
> > How to integrate something like that into Emacs, and in
> > general how to handle `ß' properly in case conversion, calls for more
> > thought.
> 
> Unicode defines a LATIN CAPITAL LETTER SHARP S `ẞ' U+1E9E.  So maybe
> that's an easy problem now.  Not sure what typographers think about it.

They are in disagreement, AFAIU.  The Unicode Character Database (UCD)
doesn't define ß and ẞ as a case-pair: the latter down-cases to the
former, but the former doesn't up-case to the latter.  So it is still
a "special-casing" situation (and AFAIU different languages have
different views on its usage).



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: case-insensitive string comparison
  2022-07-25 19:39             ` Sam Steingold
@ 2022-07-26 13:05               ` Eli Zaretskii
  2022-07-26 14:16                 ` Sam Steingold
  2022-07-26 14:43                 ` Robert Pluim
  0 siblings, 2 replies; 45+ messages in thread
From: Eli Zaretskii @ 2022-07-26 13:05 UTC (permalink / raw)
  To: sds; +Cc: emacs-devel

> From: Sam Steingold <sds@gnu.org>
> Date: Mon, 25 Jul 2022 15:39:34 -0400
> 
> > * Eli Zaretskii <ryvm@tah.bet> [2022-07-25 18:58:19 +0300]:
> >
> >> > (string-collate-equalp "a" "A" current-locale-environment t)
> >> > ==> nil
> >> > current-locale-environment
> >> > ==> "en_US.UTF-8"
> >
> > I cannot reproduce this:
> >
> >   (string-collate-equalp "a" "A" current-locale-environment t)
> >     => t
> >   current-locale-environment
> >     => "en_US.UTF-8"
> >
> > What OS is this, and which Emacs version?
> 
> GNU Emacs 29.0.50 (build 5, x86_64-apple-darwin21.5.0, NS appkit-2113.50 Version 12.4 (Build 21F79))
>  of 2022-07-25
> Repository revision: ffe12ff2503917e47c0356195b31430996c148f9
> Repository branch: master
> Windowing system distributor 'Apple', version 10.3.2113
> System Description:  macOS 12.4

Could be something macOS-specific.  Maybe your system doesn't define
the __STDC_ISO_10646__ feature?  In that case, string-collate-equalp
(see the doc string) behaves like string-equal, and that one doesn't
have a case-insensitive variant.

> >> So, how do we do case-insensitive string comparison in Emacs?
> >
> > If you want locale-specific collation, as Stefan said, above.
> 
> Do I?
> Is it really true that "UTF-8" without "en_US" does _not_ define case conversion?

string-collate-equalp relies on the implementation in your libc, so
that's something I cannot answer (although I'd expect any reasonable
libc to work as expected here).

In general, locale-specific comparison is a bad idea in Emacs, unless
you are writing a Lisp program that absolutely _must_ meet the
locale's definitions of collation order and equivalence.  That's
because some locales have unexpected requirements, and because
different libc's implement this stuff very differently.  So using
string-collate-equalp and string-collate-lessp makes your program
unpredictable on any machine but your own.

For that reason, I suggest always using compare-strings instead.  That
function uses the Unicode locale-independent case-conversion rules,
and you can predictably control/tailor that if you need by using a
buffer-local case-table.

> but https://docs.python.org/3/library/stdtypes.html#str.casefold says
> 
> >>>>> The casefolding algorithm is described in section 3.13 of the Unicode Standard.
> 
> this seems to imply that user locale setting is not relevant.

That conclusion is incorrect.  The collation database is usually
tailored for each locale, and at least glibc indeed loads the tailored
collation tables for each locale you request.

> >> Is it okay to add a `string-equal-ignore-case' based on `compare-strings'?
> >> (even though it does not recognize "SS" and "ß" as equal)
> >
> > What's wrong with calling compare-strings directly?
> 
> I want to be able to use `string-equal-ignore-case' as a :test argument
> to things like `cl-find'.

Then write a thin wrapper around compare-strings, and be done.
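
Such a wrapper could look like the sketch below.  The name
`my-string-equal-ignore-case' is hypothetical, chosen so that it cannot
clash with anything Emacs itself defines; the body is the same
`compare-strings' call discussed in this thread.

```elisp
(require 'cl-lib)

;; Hypothetical wrapper name, not part of Emacs.
(defun my-string-equal-ignore-case (s1 s2)
  "Return t if S1 and S2 are equal, ignoring case.
Case is folded by `compare-strings', independently of the locale."
  (eq t (compare-strings s1 0 nil s2 0 nil t)))

;; Usage as a :test argument: finds the element "FOO".
(cl-find "foo" '("bar" "FOO" "baz") :test #'my-string-equal-ignore-case)
```

The `(eq t ...)' is needed because `compare-strings' returns a number,
not nil, when the strings differ.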

> And I don't want to have to think about encodings and locales.
> So I want the core Emacs maintainers who know about these things to
> provide me with something that works. Thanks in advance! ;-)

There's nothing to think about: see above.  In the Emacs context, you
get the best results by writing code that doesn't depend on the locale,
and that's what you get with compare-strings.  No need to know
anything about encodings or locales.

> The fact that there are ***TWO*** core functions that compare strings -
> `string-collate-equalp' and `compare-strings' - does not look right to me.
> _I_ should not have to decide which function to use.

You can always ask.  But the documentation at least hints that the
locale-specific comparison has many hidden aspects:

  This function obeys the conventions for collation order in your locale
  settings.  For example, characters with different coding points but
  the same meaning might be considered as equal, like different grave
  accent Unicode characters:

  (string-collate-equalp (string ?\uFF40) (string ?\u1FEF))
    => t

> >> Or should we first implement something like casefold in Python?
> >
> > Ha! we already have that:
> >
> >   (get-char-code-property ?ß 'special-uppercase)
> >     => "SS"
> 
> Nice, but how does it help me if
> --8<---------------cut here---------------start------------->8---
> (compare-strings "SS" 0 nil "ß" 0 nil t)
> ==> -1
> (string-collate-equalp "SS" "ß" "en_US.UTF-8" t)
> ==> nil
> --8<---------------cut here---------------end--------------->8---
> instead of `t'?

It depends on what you want to do, and why you care about the ß case
in the first place.  AFAIR, you never explained that, nor described
your goal.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: case-insensitive string comparison
  2022-07-26 13:05               ` Eli Zaretskii
@ 2022-07-26 14:16                 ` Sam Steingold
  2022-07-26 15:53                   ` Eli Zaretskii
  2022-07-26 14:43                 ` Robert Pluim
  1 sibling, 1 reply; 45+ messages in thread
From: Sam Steingold @ 2022-07-26 14:16 UTC (permalink / raw)
  To: emacs-devel, Eli Zaretskii

> * Eli Zaretskii <ryvm@tah.bet> [2022-07-26 16:05:50 +0300]:
>
>> From: Sam Steingold <sds@gnu.org>
>> Date: Mon, 25 Jul 2022 15:39:34 -0400
>> 
>> > * Eli Zaretskii <ryvm@tah.bet> [2022-07-25 18:58:19 +0300]:
>> >
>> >> > (string-collate-equalp "a" "A" current-locale-environment t)
>> >> > ==> nil
>> >> > current-locale-environment
>> >> > ==> "en_US.UTF-8"
>> >
>> > I cannot reproduce this:
>> >
>> >   (string-collate-equalp "a" "A" current-locale-environment t)
>> >     => t
>> >   current-locale-environment
>> >     => "en_US.UTF-8"
>> >
>> > What OS is this, and which Emacs version?
>> 
>> GNU Emacs 29.0.50 (build 5, x86_64-apple-darwin21.5.0, NS appkit-2113.50 Version 12.4 (Build 21F79))
>>  of 2022-07-25
>> Repository revision: ffe12ff2503917e47c0356195b31430996c148f9
>> Repository branch: master
>> Windowing system distributor 'Apple', version 10.3.2113
>> System Description:  macOS 12.4
>
> Could be something macOS-specific.  Maybe your system doesn't define
> the __STDC_ISO_10646__ feature?  In that case, string-collate-equalp
> (see the doc string) behaves like string-equal, and that one doesn't
> have a case-insensitive variant.

How do I find out?
--8<---------------cut here---------------start------------->8---
echo  > .zzz.c;
gcc -E -dM .zzz.c | grep __STDC_ISO_10646__
--8<---------------cut here---------------end--------------->8---
does not print anything, but maybe I need to `#include' something?

> In general, locale-specific comparison is a bad idea in Emacs...
> For that reason, I suggest always using compare-strings instead.

Thank you very much for the clear and detailed explanation!

>> >> Is it okay to add a `string-equal-ignore-case' based on `compare-strings'?
>> >> (even though it does not recognize "SS" and "ß" as equal)
>> >
>> > What's wrong with calling compare-strings directly?
>>
>> I want to be able to use `string-equal-ignore-case' as a :test argument
>> to things like `cl-find'.
>
> Then write a thin wrapper around compare-strings, and be done.

I think the need is sufficiently generic, e.g., BBDB provides such a
wrapper, as, I am sure, do many other packages.
Many core files can be simplified by using `string-equal-ignore-case'
(just like with the `string-prefix-p').

Thanks again!

-- 
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.2113
http://childpsy.net http://calmchildstories.com http://steingoldpsychology.com
https://www.dhimmitude.org https://thereligionofpeace.com
When you are arguing with an idiot, your opponent is doing the same.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: case-insensitive string comparison
  2022-07-26  3:24           ` Richard Stallman
  2022-07-26  8:00             ` Helmut Eller
@ 2022-07-26 14:28             ` Sam Steingold
  2022-07-26 15:42               ` Sam Steingold
  2022-07-26 16:10               ` Eli Zaretskii
  1 sibling, 2 replies; 45+ messages in thread
From: Sam Steingold @ 2022-07-26 14:28 UTC (permalink / raw)
  To: emacs-devel, rms; +Cc: Bruno Haible

> * Richard Stallman <ezf@tah.bet> [2022-07-25 23:24:43 -0400]:
>
>   > Is it okay to add a `string-equal-ignore-case' based on `compare-strings'?
>   > (even though it does not recognize "SS" and "ß" as equal)
>
> A function `string-equal-ignore-case' would make sense.  My question is,
> is it worth the cost in complexity, or is it better to urge users to call
> `compare-strings' directly?

1. we already have `string-prefix-p' and `string-suffix-p' which are
thin wrappers around `compare-strings'

> That depends on how often programs will do case-insensitive string comparison.
> If frequently, that gives a bigger upside to `string-equal-ignore-case'.

2. there are dozens of places in Emacs core with code like

--8<---------------cut here---------------start------------->8---
          (eq t (compare-strings (sgml-tag-name tag-info) nil nil
				 (car stack) nil nil t))
--8<---------------cut here---------------end--------------->8---

3. some emacs packages already have to define their own versions of
`string-equal-ignore-case', e.g., `bbdb-string='.

>   > Or should we first implement something like casefold in Python?
>   > https://docs.python.org/3/library/stdtypes.html#str.casefold
>
> That casefold operation is not the same thing as ignoring case in
> Emacs.

Normally, case-insensitive comparison means something like

--8<---------------cut here---------------start------------->8---
(string= (casefold A) (casefold B))
--8<---------------cut here---------------end--------------->8---

`compare-strings' does

--8<---------------cut here---------------start------------->8---
(string= (upcase A) (upcase B))
--8<---------------cut here---------------end--------------->8---

(except that it does so character by character, without allocating new
strings for `upcase').
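
A small sketch of what `compare-strings' returns may make this
concrete.  The variable names are hypothetical; the calls use only the
documented argument list of `compare-strings'.

```elisp
;; Hypothetical variable names, for illustration only.

;; Equal when case is ignored: the result is exactly t.
(defvar my-result-equal
  (compare-strings "abc" 0 nil "ABC" 0 nil t))

;; Not equal: the result is a number, not nil.  Its absolute value is
;; one more than the count of matching leading characters, and it is
;; negative because the first string sorts before the second.
(defvar my-result-differ
  (compare-strings "abc" 0 nil "ABD" 0 nil t))

;; Because a mismatch yields a number (which is non-nil), a predicate
;; has to test for t explicitly.
(defvar my-equal-p
  (eq t my-result-differ))
```

Here `my-result-equal' is t, `my-result-differ' is -3, and `my-equal-p'
is nil.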

> How to integrate something like that into Emacs, and in
> general how to handle `ß' properly in case conversion, calls for more
> thought.

Bruno Haible replied in this thread, suggesting libunistring via gnulib.
I think this is the easiest way to handle the issue.

-- 
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.2113
http://childpsy.net http://calmchildstories.com http://steingoldpsychology.com
https://memri.org https://honestreporting.com https://ffii.org
The program isn't debugged until the last user is dead.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: case-insensitive string comparison
  2022-07-26 13:05               ` Eli Zaretskii
  2022-07-26 14:16                 ` Sam Steingold
@ 2022-07-26 14:43                 ` Robert Pluim
  1 sibling, 0 replies; 45+ messages in thread
From: Robert Pluim @ 2022-07-26 14:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: sds, emacs-devel

>>>>> On Tue, 26 Jul 2022 16:05:50 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Sam Steingold <sds@gnu.org>
    >> Date: Mon, 25 Jul 2022 15:39:34 -0400
    >> 
    >> > * Eli Zaretskii <ryvm@tah.bet> [2022-07-25 18:58:19 +0300]:
    >> >
    >> >> > (string-collate-equalp "a" "A" current-locale-environment t)
    >> >> > ==> nil
    >> >> > current-locale-environment
    >> >> > ==> "en_US.UTF-8"
    >> >
    >> > I cannot reproduce this:
    >> >
    >> >   (string-collate-equalp "a" "A" current-locale-environment t)
    >> >     => t
    >> >   current-locale-environment
    >> >     => "en_US.UTF-8"
    >> >
    >> > What OS is this, and which Emacs version?
    >> 
    >> GNU Emacs 29.0.50 (build 5, x86_64-apple-darwin21.5.0, NS appkit-2113.50 Version 12.4 (Build 21F79))
    >> of 2022-07-25
    >> Repository revision: ffe12ff2503917e47c0356195b31430996c148f9
    >> Repository branch: master
    >> Windowing system distributor 'Apple', version 10.3.2113
    >> System Description:  macOS 12.4

    Eli> Could be something macOS-specific.  Maybe your system doesn't define
    Eli> the __STDC_ISO_10646__ feature?  In that case, string-collate-equalp
    Eli> (see the doc string) behaves like string-equal, and that one doesn't
    Eli> have a case-insensitive variant.

Neither Appleʼs clang nor LLVM 13 clang defines it.  Looks like thereʼs
some plan to add it, but it hasnʼt happened yet.  See
<https://reviews.llvm.org/D106577>

Robert
-- 



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: case-insensitive string comparison
  2022-07-26 14:28             ` Sam Steingold
@ 2022-07-26 15:42               ` Sam Steingold
  2022-07-26 16:10               ` Eli Zaretskii
  1 sibling, 0 replies; 45+ messages in thread
From: Sam Steingold @ 2022-07-26 15:42 UTC (permalink / raw)
  To: emacs-devel

> * Sam Steingold <fqf@tah.bet> [2022-07-26 10:28:01 -0400]:
>
>> * Richard Stallman <ezf@tah.bet> [2022-07-25 23:24:43 -0400]:
>>
>> That depends on how often programs will do case-insensitive string comparison.
>> If frequently, that gives a bigger upside to `string-equal-ignore-case'.
>
> 3. some emacs packages already have to define their own versions of
> `string-equal-ignore-case', e.g., `bbdb-string='.

and also `bibtex-string=' and `completion--string-equal-p' in the core.

-- 
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.2113
http://childpsy.net http://calmchildstories.com http://steingoldpsychology.com
https://fairforall.org http://think-israel.org https://www.memritv.org
Growing Old is Inevitable; Growing Up is Optional.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: case-insensitive string comparison
  2022-07-26 14:16                 ` Sam Steingold
@ 2022-07-26 15:53                   ` Eli Zaretskii
  2022-07-26 16:00                     ` Sam Steingold
  2022-07-26 16:16                     ` Lars Ingebrigtsen
  0 siblings, 2 replies; 45+ messages in thread
From: Eli Zaretskii @ 2022-07-26 15:53 UTC (permalink / raw)
  To: sds; +Cc: emacs-devel

> From: Sam Steingold <sds@gnu.org>
> Date: Tue, 26 Jul 2022 10:16:08 -0400
> 
> > Could be something macOS-specific.  Maybe your system doesn't define
> > the __STDC_ISO_10646__ feature?  In that case, string-collate-equalp
> > (see the doc string) behaves like string-equal, and that one doesn't
> > have a case-insensitive variant.
> 
> How do I find out?
> --8<---------------cut here---------------start------------->8---
> echo  > .zzz.c;
> gcc -E -dM .zzz.c | grep __STDC_ISO_10646__
> --8<---------------cut here---------------end--------------->8---
> does not print anything, but maybe I need to `#include' something?

No, the empty output means exactly that the macro is not defined, so
you are getting the string-equal fallback instead.  Here on GNU/Linux
I get

  $ gcc -E -dM foo.c | fgrep 10646
  #define __STDC_ISO_10646__ 201706L
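
The same question can also be probed from inside Emacs.  This is only a
sketch: the helper name is hypothetical, and a nil result may equally
come from the locale in use rather than from the missing macro, so it
is a hint and not a proof.

```elisp
;; Hypothetical helper name, for illustration only.
(defun my-collate-ignore-case-works-p ()
  "Return t if `string-collate-equalp' honors IGNORE-CASE on this system.
Where the function falls back to `string-equal', the IGNORE-CASE
argument has no effect and the result is nil."
  ;; Arguments are S1, S2, LOCALE (nil means the current locale) and
  ;; IGNORE-CASE.
  (string-collate-equalp "emacs" "EMACS" nil t))

(my-collate-ignore-case-works-p)
```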

> >> I want to be able to use `string-equal-ignore-case' as a :test argument
> >> to things like `cl-find'.
> >
> > Then write a thin wrapper around compare-strings, and be done.
> 
> I think the need is sufficiently generic, e.g., BBDB provides such a
> wrapper, as, I am sure, do many other packages.
> Many core files can be simplified by using `string-equal-ignore-case'
> (just like with the `string-prefix-p').

I'm not convinced, but I won't mount the barricades if Lars and/or
others think we need this.
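
For reference, the thin wrapper under discussion could look roughly
like the sketch below.  The function name is hypothetical; the body
mirrors the definition proposed at the start of this thread.

```elisp
(require 'cl-lib)

;; Hypothetical name, for illustration only.
(defun my-string-equal-ignore-case (s1 s2)
  "Return t if S1 and S2 are equal when letter case is ignored."
  ;; `compare-strings' returns t on a match and an integer describing
  ;; the first mismatch otherwise, hence the `eq t' check.
  (eq t (compare-strings s1 0 nil s2 0 nil t)))

;; Being an ordinary two-argument predicate, it can be passed as :test.
(cl-find "FOO" '("bar" "foo" "baz") :test #'my-string-equal-ignore-case)
```

The `cl-find' call returns the list element "foo".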




* Re: case-insensitive string comparison
  2022-07-26 15:53                   ` Eli Zaretskii
@ 2022-07-26 16:00                     ` Sam Steingold
  2022-07-26 16:16                     ` Lars Ingebrigtsen
  1 sibling, 0 replies; 45+ messages in thread
From: Sam Steingold @ 2022-07-26 16:00 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On Tue, 26 Jul 2022 at 11:53, Eli Zaretskii <eliz@gnu.org> wrote:
>
> > From: Sam Steingold <sds@gnu.org>
> >
> > I think the need is sufficiently generic, e.g., BBDB provides such a
> > wrapper, as, I am sure, do many other packages.
> > Many core files can be simplified by using `string-equal-ignore-case'
> > (just like with the `string-prefix-p').
>
> I'm not convinced, but I won't mount the barricades if Lars and/or
> others think we need this.

Even though we already have completion--string-equal-p and
bibtex-string= in core?
(and also gnus-string-equal which is _almost_ identical)

-- 
Sam Steingold <http://sds.podval.org> <http://www.childpsy.net>
<http://steingoldpsychology.com>




* Re: case-insensitive string comparison
  2022-07-26 14:28             ` Sam Steingold
  2022-07-26 15:42               ` Sam Steingold
@ 2022-07-26 16:10               ` Eli Zaretskii
  2022-07-26 18:56                 ` Bruno Haible
  1 sibling, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2022-07-26 16:10 UTC (permalink / raw)
  To: sds; +Cc: emacs-devel, rms, bruno

> From: Sam Steingold <sds@gnu.org>
> Cc: Bruno Haible <bruno@clisp.org>
> Date: Tue, 26 Jul 2022 10:28:01 -0400
> 
> Bruno Haible replied in this thread, suggesting libunistring via gnulib.
> I think this is the easiest way to handle the issue.

Using an external library whose notion of string comparison and
letter-case cannot be controlled by Emacs is a non-starter.  With the
current machinery, a Lisp program or a user can control up/down-casing
by specifying a buffer-local case-table, and we won't give up this
important functionality.  Other than that, I'm not aware of anything
that libunistring can do that Emacs cannot: we import the same Unicode
tables as libunistring does, so we have the same data to do these
jobs.




* Re: case-insensitive string comparison
  2022-07-26 15:53                   ` Eli Zaretskii
  2022-07-26 16:00                     ` Sam Steingold
@ 2022-07-26 16:16                     ` Lars Ingebrigtsen
  1 sibling, 0 replies; 45+ messages in thread
From: Lars Ingebrigtsen @ 2022-07-26 16:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: sds, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> I think the need is sufficiently generic, e.g., BBDB provides such a
>> wrapper, as, I am sure, do many other packages.
>> Many core files can be simplified by using `string-equal-ignore-case'
>> (just like with the `string-prefix-p').
>
> I'm not convinced, but I won't mount the barricades if Lars and/or
> others think we need this.

Since there are already three of these variations in core, I think that
shows that this would be a handy function to have.

I've used `cl-equalp' for this in the past, but having a `string-'
prefixed function might make sense.  And in that case, what about
calling it `string-equalp'?  But perhaps too obscure for people coming
from a non-CL background, so `string-equal-ignore-case' is fine by me.
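
A short sketch of why `cl-equalp' is broader than a string-only
predicate: besides ignoring case in strings, it also compares numbers
by value and descends into lists.  All three forms below return t.

```elisp
(require 'cl-lib)

(cl-equalp "Foo" "FOO")           ; strings: letter case is ignored
(cl-equalp 1 1.0)                 ; numbers: compared with `='
(cl-equalp '("A" 2) '("a" 2.0))   ; lists: compared element by element
```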





* Re: case-insensitive string comparison
  2022-07-26 16:10               ` Eli Zaretskii
@ 2022-07-26 18:56                 ` Bruno Haible
  2022-07-26 19:30                   ` Eli Zaretskii
  0 siblings, 1 reply; 45+ messages in thread
From: Bruno Haible @ 2022-07-26 18:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: sds, emacs-devel

Eli Zaretskii wrote:
> With the
> current machinery, a Lisp program or a user can control up/down-casing
> by specifying a buffer-local case-table, and we won't give up this
> important functionality.

For which types of users, and for which use-cases, do you consider this an
"important functionality"?

Recall that the Unicode casing tables already cover the special cases for
'ß', Turkish i, and so on.

I'd like to understand whether per-user customization of casing rules is
so important that libunistring should offer it in the API (as opposed to
requiring code modifications).

LibreOffice, for example, allows per-user customizations of the spell-
checking dictionary, but not of the casing tables. Is that a flaw, and why?

Bruno







* Re: case-insensitive string comparison
  2022-07-26 18:56                 ` Bruno Haible
@ 2022-07-26 19:30                   ` Eli Zaretskii
  0 siblings, 0 replies; 45+ messages in thread
From: Eli Zaretskii @ 2022-07-26 19:30 UTC (permalink / raw)
  To: Bruno Haible; +Cc: sds, emacs-devel

> From: Bruno Haible <bruno@clisp.org>
> Cc: sds@gnu.org, emacs-devel@gnu.org
> Date: Tue, 26 Jul 2022 20:56:10 +0200
> 
> Eli Zaretskii wrote:
> > With the
> > current machinery, a Lisp program or a user can control up/down-casing
> > by specifying a buffer-local case-table, and we won't give up this
> > important functionality.
> 
> For which types of users, and for which use-cases, do you consider this an
> "important functionality"?

One example that immediately comes to mind is when you need to downcase
strings without being hit by the case of 'I' in the Turkish locale.
We use this, for example, when parsing various Internet protocols.

More importantly, Emacs had this feature for many years, so suddenly
losing it is not really a possibility.
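
The protocol-parsing case can be sketched as follows, assuming the goal
is to downcase with the ASCII mappings only, whatever case table the
current buffer or language environment has installed.  The helper name
is hypothetical; `with-case-table' and `ascii-case-table' are the
standard facilities.

```elisp
;; Hypothetical helper name, for illustration only.
(defun my-downcase-ascii (string)
  "Downcase STRING using only the ASCII case mappings."
  ;; `with-case-table' installs the given table for the duration of the
  ;; body and restores the previous one afterwards.
  (with-case-table ascii-case-table
    (downcase string)))

(my-downcase-ascii "CONTENT-TYPE")
```

The call returns "content-type".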

> Recall that the Unicode casing tables already cover the special cases for
> 'ß', Turkish i, and so on.

We have the infrastructure for supporting that, and do so in locales
where that is required.  In particular, Emacs imports the data from
the Unicode SpecialCasing.txt file.

> I'd like to understand whether per-user customization of casing rules is
> so important that libunistring should offer it in the API (as opposed to
> requiring code modifications).
> 
> LibreOffice, for example, allows per-user customizations of the spell-
> checking dictionary, but not of the casing tables. Is that a flaw, and why?

Emacs is not only a text-editing program, it is primarily a
text-processing environment.  When you write programs that process
text, control of case conversions is sometimes important.

Whether this means libunistring needs to grow such an API, I don't
know.




* Re: case-insensitive string comparison
  2022-07-26  8:00             ` Helmut Eller
  2022-07-26 12:21               ` Eli Zaretskii
@ 2022-07-27  2:58               ` Richard Stallman
  2022-07-31  8:24                 ` Eli Zaretskii
  1 sibling, 1 reply; 45+ messages in thread
From: Richard Stallman @ 2022-07-27  2:58 UTC (permalink / raw)
  To: Helmut Eller; +Cc: sds, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > Unicode defines a LATIN CAPITAL LETTER SHARP S `ẞ' U+1E9E.

On my terminal, that character displays as \u1E9E, which is not
as helpful as if it were S.

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)






* Re: case-insensitive string comparison
  2022-07-27  2:58               ` Richard Stallman
@ 2022-07-31  8:24                 ` Eli Zaretskii
  0 siblings, 0 replies; 45+ messages in thread
From: Eli Zaretskii @ 2022-07-31  8:24 UTC (permalink / raw)
  To: rms; +Cc: eller.helmut, sds, emacs-devel

> From: Richard Stallman <rms@gnu.org>
> Cc: sds@gnu.org, emacs-devel@gnu.org
> Date: Tue, 26 Jul 2022 22:58:40 -0400
> 
>   > Unicode defines a LATIN CAPITAL LETTER SHARP S `ẞ' U+1E9E.
> 
> On my terminal, that character displays as \u1E9E, which is not
> as helpful as if it were S.

I've now added support for U+1E9E to both "C-x 8" keyboard input and
to latin1-disp on text terminals.




Thread overview: 45+ messages
2022-07-19 17:27 case-insensitive string comparison Sam Steingold
2022-07-19 18:06 ` Mattias Engdegård
2022-07-19 18:56   ` Sam Steingold
2022-07-20  4:39     ` tomas
2022-07-20 11:35       ` Eli Zaretskii
2022-07-20 13:30         ` tomas
2022-07-19 18:16 ` Stefan Kangas
2022-07-19 19:39 ` Roland Winkler
2022-07-19 22:47   ` Sam Steingold
2022-07-20  2:21     ` Roland Winkler
2022-07-20  3:01     ` Stefan Monnier
2022-07-20 16:22       ` Sam Steingold
2022-07-25 14:23         ` Sam Steingold
2022-07-25 15:58           ` Eli Zaretskii
2022-07-25 19:39             ` Sam Steingold
2022-07-26 13:05               ` Eli Zaretskii
2022-07-26 14:16                 ` Sam Steingold
2022-07-26 15:53                   ` Eli Zaretskii
2022-07-26 16:00                     ` Sam Steingold
2022-07-26 16:16                     ` Lars Ingebrigtsen
2022-07-26 14:43                 ` Robert Pluim
2022-07-25 19:37           ` Bruno Haible
2022-07-26  3:24           ` Richard Stallman
2022-07-26  8:00             ` Helmut Eller
2022-07-26 12:21               ` Eli Zaretskii
2022-07-27  2:58               ` Richard Stallman
2022-07-31  8:24                 ` Eli Zaretskii
2022-07-26 14:28             ` Sam Steingold
2022-07-26 15:42               ` Sam Steingold
2022-07-26 16:10               ` Eli Zaretskii
2022-07-26 18:56                 ` Bruno Haible
2022-07-26 19:30                   ` Eli Zaretskii
2022-07-20 16:24       ` Roland Winkler
2022-07-20 17:06         ` Sam Steingold
2022-07-20 17:16           ` Eli Zaretskii
2022-07-20 17:12         ` Eli Zaretskii
2022-07-20 17:37           ` Roland Winkler
2022-07-20 17:50             ` Eli Zaretskii
2022-07-20 18:10               ` Roland Winkler
2022-07-20 18:16                 ` Eli Zaretskii
2022-07-20 18:18                   ` [External] : " Drew Adams
2022-07-21  6:56                   ` Eli Zaretskii
2022-07-21 14:19                     ` Roland Winkler
2022-07-21 15:53                       ` Eli Zaretskii
2022-07-21 16:35                         ` Roland Winkler
