From: Sam Steingold <sds@gnu.org>
To: emacs-devel@gnu.org, Eli Zaretskii <eliz@gnu.org>
Subject: Re: case-insensitive string comparison
Date: Mon, 25 Jul 2022 15:39:34 -0400
Message-ID: <lztu750yo9.fsf@3c22fb11fdab.ant.amazon.com>
In-Reply-To: <83o7xddw10.fsf@gnu.org> (Eli Zaretskii's message of "Mon, 25 Jul 2022 18:58:19 +0300")
> * Eli Zaretskii <eliz@gnu.org> [2022-07-25 18:58:19 +0300]:
>
>> From: Sam Steingold <sds@gnu.org>
>> Date: Mon, 25 Jul 2022 10:23:30 -0400
>>
>> >> Hmm... `string-collate-equalp`?
>> >
>> > (string-collate-equalp "a" "A" current-locale-environment t)
>> > ==> nil
>> > current-locale-environment
>> > ==> "en_US.UTF-8"
>
> I cannot reproduce this:
>
> (string-collate-equalp "a" "A" current-locale-environment t)
> => t
> current-locale-environment
> => "en_US.UTF-8"
>
> What OS is this, and which Emacs version?
GNU Emacs 29.0.50 (build 5, x86_64-apple-darwin21.5.0, NS appkit-2113.50 Version 12.4 (Build 21F79))
of 2022-07-25
Repository revision: ffe12ff2503917e47c0356195b31430996c148f9
Repository branch: master
Windowing system distributor 'Apple', version 10.3.2113
System Description: macOS 12.4
>> So, how do we do case-insensitive string comparison in Emacs?
>
> If you want locale-specific collation, as Stefan said, above.
Do I?
Is it really true that "UTF-8" without "en_US" does _not_ define case conversion?
But https://docs.python.org/3/library/stdtypes.html#str.casefold says
>>>>> The casefolding algorithm is described in section 3.13 of the Unicode Standard.
This seems to imply that the user's locale setting is not relevant.
(Locale _is_ mentioned in
https://www.unicode.org/versions/Unicode14.0.0/ch03.pdf but it looks
like a _specification_ of the algorithm, not its _modification_.)
>> Is it okay to add a `string-equal-ignore-case' based on `compare-strings'?
>> (even though it does not recognize "SS" and "ß" as equal)
>
> What's wrong with calling compare-strings directly?
I want to be able to use `string-equal-ignore-case' as a :test argument
to things like `cl-find'.
And I don't want to have to think about encodings and locales.
So I want the core Emacs maintainers who know about these things to
provide me with something that works. Thanks in advance! ;-)
The fact that there are ***TWO*** core functions that compare strings -
`string-collate-equalp' and `compare-strings' - does not look right to me.
_I_ should not have to decide which function to use.
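To make the request concrete, here is a minimal sketch of the kind of
wrapper I have in mind, assuming it is built on `compare-strings' as
proposed above. The name `my-string-equal-ignore-case' is hypothetical
and only for illustration. Like `compare-strings' itself, it does not
treat "SS" and "ß" as equal.

```elisp
(require 'cl-lib)

;; Hypothetical helper: a two-argument predicate over `compare-strings',
;; so that it can be passed directly as a :test function.
(defun my-string-equal-ignore-case (string1 string2)
  "Return t if STRING1 and STRING2 are equal when case is ignored.
This only wraps `compare-strings' with IGNORE-CASE set; it does no
locale-specific collation."
  ;; `compare-strings' returns t on equality and a number otherwise,
  ;; so normalize the result to a boolean.
  (eq t (compare-strings string1 0 nil string2 0 nil t)))

;; Usage as a :test argument:
(cl-find "FOO" '("bar" "foo" "baz") :test #'my-string-equal-ignore-case)
;; => "foo"
```

The point of a plain two-argument predicate is that callers of
`cl-find' and friends need not know about the extra start, end and
ignore-case arguments.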
>> Or should we first implement something like casefold in Python?
>
> Ha! we already have that:
>
> (get-char-code-property ?ß 'special-uppercase)
> => "SS"
Nice, but how does it help me if
--8<---------------cut here---------------start------------->8---
(compare-strings "SS" 0 nil "ß" 0 nil t)
==> -1
(string-collate-equalp "SS" "ß" "en_US.UTF-8" t)
==> nil
--8<---------------cut here---------------end--------------->8---
instead of `t'?
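For illustration only, one way the `special-uppercase' property could
be put to use is sketched below. Both function names are hypothetical,
and this is not Unicode case folding as described in section 3.13 (it
maps towards uppercase using the property quoted above, rather than
using the case folding tables); it is just an approximation under
that assumption.

```elisp
;; Hypothetical helper: upcase a string one character at a time,
;; preferring the multi-character `special-uppercase' mapping when a
;; character has one and falling back to plain `upcase' otherwise.
(defun my-full-upcase (string)
  "Return STRING with every character replaced by its full uppercase form."
  (mapconcat (lambda (ch)
               (or (get-char-code-property ch 'special-uppercase)
                   (char-to-string (upcase ch))))
             string ""))

;; Hypothetical predicate built on the helper above.
(defun my-string-equal-full-case (string1 string2)
  "Return t if STRING1 and STRING2 have the same full uppercase form."
  (string= (my-full-upcase string1) (my-full-upcase string2)))

;; With `special-uppercase' of ?ß being "SS", as reported above, both
;; arguments map to "SS", so this is expected to return t.
(my-string-equal-full-case "SS" "ß")
```

Whether such a comparison belongs in core, and whether it should use
proper case folding instead, is exactly the kind of decision I would
rather leave to the maintainers.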
> Give us some credit, yes?
Sure, and I am very grateful!
--
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.2113
http://childpsy.net http://calmchildstories.com http://steingoldpsychology.com
https://fairforall.org https://camera.org https://thereligionofpeace.com
He who laughs last did not get the joke.