From: Sam Steingold <sds@gnu.org>
To: emacs-devel@gnu.org, Eli Zaretskii <eliz@gnu.org>
Subject: Re: case-insensitive string comparison
Date: Mon, 25 Jul 2022 15:39:34 -0400
Message-ID: <lztu750yo9.fsf@3c22fb11fdab.ant.amazon.com>
In-Reply-To: <83o7xddw10.fsf@gnu.org> (Eli Zaretskii's message of "Mon, 25 Jul 2022 18:58:19 +0300")

> * Eli Zaretskii <ryvm@tah.bet> [2022-07-25 18:58:19 +0300]:
>
>> From: Sam Steingold <sds@gnu.org>
>> Date: Mon, 25 Jul 2022 10:23:30 -0400
>> 
>> >> Hmm... `string-collate-equalp`?
>> >
>> > (string-collate-equalp "a" "A" current-locale-environment t)
>> > ==> nil
>> > current-locale-environment
>> > ==> "en_US.UTF-8"
>
> I cannot reproduce this:
>
>   (string-collate-equalp "a" "A" current-locale-environment t)
>     => t
>   current-locale-environment
>     => "en_US.UTF-8"
>
> What OS is this, and which Emacs version?

GNU Emacs 29.0.50 (build 5, x86_64-apple-darwin21.5.0, NS appkit-2113.50 Version 12.4 (Build 21F79))
 of 2022-07-25
Repository revision: ffe12ff2503917e47c0356195b31430996c148f9
Repository branch: master
Windowing system distributor 'Apple', version 10.3.2113
System Description:  macOS 12.4

>> So, how do we do case-insensitive string comparison in Emacs?
>
> If you want locale-specific collation, as Stefan said, above.

Do I?
Is it really true that "UTF-8" without "en_US" does _not_ define case conversion?
But https://docs.python.org/3/library/stdtypes.html#str.casefold says:

>>>>> The casefolding algorithm is described in section 3.13 of the Unicode Standard.

This seems to imply that the user's locale setting is not relevant.
(Locale _is_ mentioned in
https://www.unicode.org/versions/Unicode14.0.0/ch03.pdf but it looks
like part of the _specification_ of the algorithm, not a _modification_ of it.)

>> Is it okay to add a `string-equal-ignore-case' based on `compare-strings'?
>> (even though it does not recognize "SS" and "ß" as equal)
>
> What's wrong with calling compare-strings directly?

I want to be able to use `string-equal-ignore-case' as a :test argument
to things like `cl-find'.
And I don't want to have to think about encodings and locales.
So I want the core Emacs maintainers who know about these things to
provide me with something that works. Thanks in advance! ;-)
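
For illustration, a minimal sketch of the kind of wrapper meant here,
assuming it is built on `compare-strings' as proposed above. The name
`my-string-equal-ignore-case' is hypothetical and chosen only for this
example; the sketch inherits the limitation already noted (it does not
treat "SS" and "ß" as equal).

```elisp
(require 'cl-lib)

;; Hypothetical name; a thin wrapper over `compare-strings'.
(defun my-string-equal-ignore-case (s1 s2)
  "Return t if S1 and S2 are equal, ignoring case.
No locale or collation rules are consulted."
  ;; `compare-strings' returns t on a match and a number on a
  ;; mismatch, so normalize the result to a boolean.
  (eq t (compare-strings s1 0 nil s2 0 nil t)))

;; Usable directly as a :test argument; this call finds "foo".
(cl-find "FOO" '("bar" "foo" "baz")
         :test #'my-string-equal-ignore-case)
```

The point of the wrapper is only that it takes exactly two arguments,
which is the shape `cl-find' and friends expect from a :test function.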

The fact that there are ***TWO*** core functions that compare strings -
`string-collate-equalp' and `compare-strings' - does not look right to me.
_I_ should not have to decide which function to use.

>> Or should we first implement something like casefold in Python?
>
> Ha! we already have that:
>
>   (get-char-code-property ?ß 'special-uppercase)
>     => "SS"

Nice, but how does it help me if
--8<---------------cut here---------------start------------->8---
(compare-strings "SS" 0 nil "ß" 0 nil t)
==> -1
(string-collate-equalp "SS" "ß" "en_US.UTF-8" t)
==> nil
--8<---------------cut here---------------end--------------->8---
instead of `t'?
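
To make the question concrete, here is a rough sketch of how the
`special-uppercase' property quoted above could be put to use by hand.
Both function names are hypothetical, and this is an upcase-based
approximation, not the casefolding algorithm of the Unicode Standard:
it assumes the property is a string when a special mapping exists and
nil otherwise.

```elisp
;; Hypothetical helpers, not part of Emacs.
(defun my-full-upcase (string)
  "Return STRING upcased, using `special-uppercase' where defined."
  (mapconcat (lambda (ch)
               ;; `special-uppercase' is a string such as "SS" for ?ß;
               ;; it is assumed to be nil when no special mapping exists.
               (or (get-char-code-property ch 'special-uppercase)
                   (string (upcase ch))))
             string ""))

(defun my-string-equal-full-upcase (s1 s2)
  "Return t if S1 and S2 are equal after full upcasing."
  (string= (my-full-upcase s1) (my-full-upcase s2)))

;; Compares "SS" with the full upcasing of "ß".
(my-string-equal-full-upcase "SS" "ß")
```

With the property value shown above, both arguments upcase to "SS", so
the comparison should succeed. That still leaves every caller to write
this by hand, which is the complaint.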

> Give us some credit, yes?

Sure, and I am very grateful!

-- 
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.2113
http://childpsy.net http://calmchildstories.com http://steingoldpsychology.com
https://fairforall.org https://camera.org https://thereligionofpeace.com
He who laughs last did not get the joke.


