unofficial mirror of bug-gnu-emacs@gnu.org 
From: Eli Zaretskii <eliz@gnu.org>
To: Ihor Radchenko <yantar92@posteo.net>
Cc: 59275@debbugs.gnu.org
Subject: bug#59275: Unexpected return value of `string-collate-lessp' on Mac
Date: Sat, 26 Nov 2022 11:22:29 +0200
Message-ID: <83k03it6i2.fsf@gnu.org>
In-Reply-To: <87v8n2je5q.fsf@localhost> (message from Ihor Radchenko on Sat, 26 Nov 2022 08:47:13 +0000)

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: 59275@debbugs.gnu.org
> Date: Sat, 26 Nov 2022 08:47:13 +0000
> 
> > 'downcase' uses the buffer-local case table if such is defined for the
> > buffer that happens to be the current when you invoke 'downcase', and that's
> > another cause of inconsistency and user surprises, especially when the
> > strings you compare don't really "belong" to the current buffer.
> 
> Interesting. Is there any reason why this is not mentioned in the
> docstring for `downcase'?

Yes: because we are ashamed of that and hope to change it at some point, if
we ever figure out how to do that.  The way to avoid this caveat is simple:
let-bind case-table when you call 'downcase'.
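
A minimal sketch of that workaround, assuming that "let-bind case-table" can be done with the `with-case-table' macro and that the standard case table is the one wanted. The function name `my-downcase-standard' is hypothetical. Note that the standard case table can itself be tweaked by the language environment, so this only removes the dependency on the current buffer.

```elisp
;; Hypothetical helper: downcase STRING without consulting any
;; buffer-local case table of the current buffer.
(defun my-downcase-standard (string)
  "Return STRING downcased according to the standard case table."
  ;; `with-case-table' makes the given table current for the
  ;; duration of its body and restores the previous one afterwards.
  (with-case-table (standard-case-table)
    (downcase string)))

(my-downcase-standard "Hello World")
```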

> I now see 4.10 The Case Table section of the manual, and it looks like
> case tables should be set mostly automatically (by Emacs?) according to
> the language environment.

Yes.  But a buffer can have its local case-table.
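
To illustrate the effect of a buffer-local case table, here is a small sketch. The function name `my-demo-buffer-local-downcase' is hypothetical, and the Turkish-style mapping of `I' to dotless `ı' is only an example of a local tweak. It assumes the session's standard case table has not already been modified that way.

```elisp
;; Hypothetical demo: the same call to `downcase' gives different
;; results depending on the case table of the current buffer.
(defun my-demo-buffer-local-downcase ()
  "Return a list of (downcase \"I\") with and without a local case table."
  (list
   ;; Buffer with a modified, buffer-local case table.
   (with-temp-buffer
     (let ((table (copy-case-table (standard-case-table))))
       ;; In this copy only, pair uppercase I with dotless i (U+0131).
       (set-case-syntax-pair ?I ?\u0131 table)
       (set-case-table table)
       (downcase "I")))
   ;; Fresh buffer that still uses the standard case table.
   (with-temp-buffer
     (downcase "I"))))

(my-demo-buffer-local-downcase)
```

The string being downcased is the same in both calls; only the buffer that happens to be current differs.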

> Are details about this process documented anywhere?

No.  But see characters.el and the function I mention below.

> Are these case conversion tables independent of glibc?

Yes.  We build them completely separately and from scratch, as you will see
in characters.el.

> https://nullprogram.com/blog/2014/06/13/ that mentioned something
> similar about caveats with composition.

I don't see anything there about sorting or collation.  What did I miss?

> Just mentioning it for your reference. (I am not sure if the caveats
> discussed have been raised on Emacs devel).

What did you think ought to be discussed?

Btw, that blog fails to distinguish between display-time features and
processing of text without displaying it.  On display, Emacs combines
characters that are combining, so equivalent character sequences should look
the same.  But Emacs doesn't by default consider equivalent character
sequences as equal in all situations, leaving this to the Lisp program.
Always considering them equal looks sexy in a blog post, because it
raises some eyebrows and has the "whoa!" effect, but isn't a good policy in
general, since some applications definitely need to know about the original
decomposed sequence.  We cannot conceal this from Lisp programs by hiding
the original sequence on some low level that is not exposed to Lisp.  Yes,
this makes Lisp programs more complicated, but that comes with the
territory: you cannot have power without complexity.
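
As a sketch of what "leaving this to the Lisp program" can look like, the example below compares a precomposed and a decomposed spelling of the same letter. It uses `ucs-normalize-NFC-string' from the ucs-normalize library as one possible way to normalize before comparing. The function name `my-demo-equivalent-sequences' is hypothetical.

```elisp
(require 'ucs-normalize)

;; Hypothetical demo: canonically equivalent sequences are not
;; `string=' unless the Lisp program normalizes them first.
(defun my-demo-equivalent-sequences ()
  "Compare precomposed and decomposed e-acute, raw and NFC-normalized."
  (let ((precomposed "\u00e9")    ; LATIN SMALL LETTER E WITH ACUTE
        (decomposed "e\u0301"))   ; e + COMBINING ACUTE ACCENT
    (list
     ;; Raw comparison: the character sequences differ.
     (string= precomposed decomposed)
     ;; Comparison after normalizing both strings to NFC.
     (string= (ucs-normalize-NFC-string precomposed)
              (ucs-normalize-NFC-string decomposed)))))

(my-demo-equivalent-sequences)
```

The original decomposed string stays available to the program, which can decide per use case whether to normalize.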

> I feel that I am missing something. Doesn't Emacs provide Unicode case
> conversion tables?

The case tables we provide are based on Unicode, but are tweaked by the
language-environment.  See, for example, turkish-case-conversion-enable,
which is run when the Turkish language-environment is turned on.

> Why plain ASCII rules?

Because that is what your logic amounts to.  What you suggest breaks down
once you consider the various complications in some locales.

> > And we are talking about a single system where these problems happen, which
> > is macOS, right?  Wouldn't it be better for "Someone" who uses macOS to just
> > bite the bullet and write a proper collation function, or find a free
> > software implementation of one, and include it in Emacs?  This is what I did
> > for MS-Windows at the time string-collate-lessp was added to Emacs.  Why
> > cannot macOS users do the same?
> 
> It would be. But how can we ask for this? etc/TODO? Or maybe re-open
> this bug report?

Anything will be fine with me, but unless the people who are asking you to
do these workarounds are motivated enough to sit down and do the job, we
will never get there.  And guess what effect these workarounds have on their
motivation.




