From: "Mattias Engdegård" <mattiase@acm.org>
To: Eli Zaretskii <eliz@gnu.org>
Cc: Aidan Kehoe <kehoea@parhasard.net>,
	Lars Ingebrigtsen <larsi@gnus.org>,
	11309-done@debbugs.gnu.org
Subject: bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic,  Greek
Date: Wed, 9 Dec 2020 15:37:19 +0100
Message-ID: <28B85957-B8DB-431D-A120-F17D8AE4693F@acm.org>
In-Reply-To: <83zh2o5itq.fsf@gnu.org>

Eli, thanks for looking at the patch, now pushed to master (with Basil's suggested tweak).

> Why is it wrong, and what practical problems does this cause?

ß is a lower case letter, so lowercasep(ß)=false is wrong. As a consequence, matching ß with [:lower:] and [:upper:] doesn't work correctly: ß should be matched by [:lower:] when case-fold-search is nil, and by both [:lower:] and [:upper:] when case-fold-search is non-nil.

The problem stems from the fact that uppercasep and lowercasep don't use the Unicode case information directly (which perhaps they should) but derive the case indirectly from the upcase and downcase tables, and there is no way to state that a char is lower case but cannot be upcased or downcased. (Below I'm going to use the notation T[C] for the table T indexed by character C.)

Currently, characters missing from or self-mapping in the upcase and downcase tables are considered to be caseless. For instance, upcase[*]=downcase[*]=* and upcase[中]=downcase[中]=nil. However, we also have upcase[ß]=downcase[ß]=ß, causing the incorrect lowercasep result.
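To make the failure concrete, here is a toy Python model of case predicates derived purely from the two tables. It is a sketch under assumptions, not the Emacs source: the dicts, the `lookup` helper and the exact form of the predicates are illustrative, and a missing dict key stands in for a nil table entry.

```python
# Toy model of table-derived case predicates (not the Emacs source).
# Tables are dicts; a missing key stands in for a nil entry.

def lookup(table, c):
    """Return table[c], treating a missing (nil) entry as self-mapping."""
    return table.get(c, c)

def uppercasep(c, upcase, downcase):
    # Assumed rule: a character is upper case if downcasing changes it.
    return lookup(downcase, c) != c

def lowercasep(c, upcase, downcase):
    # Assumed rule: a character is lower case if it is not upper case
    # and upcasing changes it.
    return not uppercasep(c, upcase, downcase) and lookup(upcase, c) != c

# State before the fix, restricted to a few sample characters.
UPCASE = {"a": "A", "A": "A", "*": "*", "ß": "ß"}
DOWNCASE = {"a": "a", "A": "a", "*": "*", "ß": "ß"}

for ch in ["a", "A", "*", "中", "ß"]:
    print(ch, uppercasep(ch, UPCASE, DOWNCASE), lowercasep(ch, UPCASE, DOWNCASE))
```

In this model ß is reported as neither upper nor lower case, exactly like * and 中, because a self-mapping entry cannot be told apart from a caseless character.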

The solution that I ended up applying was the simplest possible: set upcase[ß]=ẞ (U+1E9E). The special-uppercase properties ensure that (upcase "ß") => "SS", and now all tests pass.
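The interplay between the per-character table entry and the string-level special-uppercase override can be sketched in Python as follows. This is a hypothetical model: `UPCASE`, `SPECIAL_UPPERCASE`, `lowercasep` and `upcase_string` are illustrative names, not Emacs APIs, and the lower-case test is reduced to "the upcase table maps the character elsewhere".

```python
# Hypothetical model of the fix; names are illustrative, not Emacs APIs.
CAPITAL_SHARP_S = "\u1e9e"  # ẞ, LATIN CAPITAL LETTER SHARP S (0x1E9E, 7838 in decimal)

UPCASE = {"a": "A", "ß": CAPITAL_SHARP_S}  # per-character upcase table after the fix
SPECIAL_UPPERCASE = {"ß": "SS"}            # string-level override

def lowercasep(c):
    # Simplified rule: lower case means the upcase table maps c elsewhere.
    return UPCASE.get(c, c) != c

def upcase_string(s):
    # The special-uppercase mapping takes precedence over the table entry.
    return "".join(SPECIAL_UPPERCASE.get(c, UPCASE.get(c, c)) for c in s)

print(lowercasep("ß"))      # ß is now classified as lower case
print(upcase_string("aß"))  # string upcasing still expands ß to SS
```

The table entry only has to differ from ß for the predicate to work; the user-visible result of upcasing a string is still decided by the override.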

(An acceptable alternative would have been to set upcase[ß]=nil and adapt lowercasep accordingly. I tried that and it works flawlessly, but involves slightly more changes.)

And that concludes the resolution of this bug.







