From: "Elias Mårtenson" <lokedhs@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: Lars Ingebrigtsen <larsi@gnus.org>, emacs-devel <emacs-devel@gnu.org>
Subject: Re: On language-dependent defaults for character-folding
Date: Sat, 20 Feb 2016 13:22:57 +0800
Message-ID: <CADtN0WL-rX5xzw75P=qLEYFYzLWkuCuntE+gf2BAhn981_jWBg@mail.gmail.com>
In-Reply-To: <83egc8qzjh.fsf@gnu.org>
On 20 February 2016 at 03:18, Eli Zaretskii <eliz@gnu.org> wrote:
> > Date: Fri, 19 Feb 2016 21:37:26 +0800
> > From: Elias Mårtenson <lokedhs@gmail.com>
> > Cc: Lars Ingebrigtsen <larsi@gnus.org>, emacs-devel <emacs-devel@gnu.org>
> >
> >
> > For example, if the buffer includes ñ (2 characters), should "C-s n"
> > find the n in it?
> >
> > That depends on the locale of the user.
>
> There are use cases that are independent of the locale. For example,
> imagine that you need to find all the literal n characters in a buffer
> because you are investigating a bug in the program that produced that
> buffer. As an Emacs user, I need to do such jobs almost every day. I
> don't want the results affected by the locale.
>
Of course I'm not saying that you should not be able to do this. All I'm
advocating here is sensible defaults.
> > However, from the point of a user, there should not be a visible
> > difference between the precomposed and the composed variants are the
> > exact same character.
>
> What if the user wants to find all those places where what looks like
> ñ is actually ñ? Wouldn't that be a valid use case?
>
It would, but certainly a very rare one. For all intents and purposes the
two forms are (should be) equivalent.
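
To make the two forms concrete, here is a minimal sketch in Python (chosen
only for illustration; it is not how Emacs implements this). It assumes the
precomposed ñ is U+00F1 and the composed variant is "n" followed by U+0303
COMBINING TILDE, and uses the standard unicodedata module. The helper name
canonically_equal is hypothetical.

```python
import unicodedata

precomposed = "\u00f1"   # LATIN SMALL LETTER N WITH TILDE, one code point
decomposed = "n\u0303"   # 'n' followed by COMBINING TILDE, two code points


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after normalising both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A raw comparison sees two different code point sequences.
print(precomposed == decomposed)                   # False
print(len(precomposed), len(decomposed))           # 1 2

# After normalisation the two spellings compare equal.
print(canonically_equal(precomposed, decomposed))  # True
```

A search that normalises both the pattern and the buffer text in this way
would treat the two spellings of ñ as the same character, which is the
equivalence I mean.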
> The reference you are looking for is the Unicode Standard itself. It
> says to use the normalization forms, see for example section 5.16
> there.
>
I have read that section before, and I have now read it again. The section
certainly talks about searches that ignore diacritics, but does not describe
a method for doing so. There is also a reference to TR29, but that refers to
grapheme clusters, which would be a very strange way to do character folding
(Koreans would be very confused).
> Every character-folding search implementation decomposes characters
> before matching them. So does Emacs. We didn't invent this, and we
> certainly didn't use the decompositions where they weren't supposed to
> be used. It's not a trick, it's what everyone else does to do the
> job. See the ICU library, for example.
>
Every example you have given so far discusses decomposition
equivalence, i.e. the fact that the two variants of ñ are the same. Section
5.16 discusses the _concept_ of allowing n and ñ to match, but the
mechanism for doing so is locale-dependent. This is what Unicode says, and that
is what I say. My position is simply that the default (if absolutely
nothing else overrides it) should be chosen to take the locale of the user
into account.
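
As a rough sketch of what a locale-aware default could look like, consider
the following Python. Everything in it is an assumption made for
illustration: the DISTINCT_LETTERS table, the language codes, and the
function names fold and matches are hypothetical, and this is not a proposal
for the actual Emacs implementation. The table only records letters that the
given language treats as letters in their own right (ñ in Spanish; å, ä and
ö in Swedish).

```python
import unicodedata

# Hypothetical per-language table of letters that should NOT be folded
# to their base letter, because the language treats them as distinct.
DISTINCT_LETTERS = {
    "es": {"\u00f1"},                      # ñ
    "sv": {"\u00e5", "\u00e4", "\u00f6"},  # å, ä, ö
}


def fold(text, lang=None):
    """Fold case and diacritics, except for letters distinct in `lang`."""
    keep = DISTINCT_LETTERS.get(lang, set())
    out = []
    # NFC first, so both spellings of a letter become one code point.
    for ch in unicodedata.normalize("NFC", text.lower()):
        if ch in keep:
            out.append(ch)
        else:
            # Decompose and drop nonspacing marks (category Mn).
            nfd = unicodedata.normalize("NFD", ch)
            out.append("".join(c for c in nfd
                               if unicodedata.category(c) != "Mn"))
    return "".join(out)


def matches(needle, haystack, lang=None):
    """Substring search on the folded forms of both strings."""
    return fold(needle, lang) in fold(haystack, lang)


print(matches("n", "a\u00f1o"))               # True: no locale, ñ folds to n
print(matches("n", "a\u00f1o", "es"))         # False: Spanish keeps ñ distinct
print(matches("\u00f1", "an\u0303o", "es"))   # True: both spellings of ñ match
```

With no language given, the behaviour is the locale-independent folding; the
table only changes the default when a language is known.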
> > The decompositions are used in the normalisation forms to ensure that
> the two variants are treated equally
> > (such as the two alternative representations of ñ that we have been
> discussing).
>
> Yes, and any character-folding search uses normalization forms as
> well.
>
Yes, but that's not what normalisation forms were designed to do.
Again (I really apologise for repeating myself, I'm starting to sound like
a troll and that is truly not my intention), the purpose of normalisation
forms is to ensure that the two variants of ñ compare the same. They are not
designed to provide a mechanism to allow n to compare equal to ñ.
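
The distinction can be sketched in a few lines of Python (again only an
illustration using the standard unicodedata module; strip_marks is a
hypothetical helper, not an existing API). Normalisation alone never makes n
equal to ñ. That only happens once an extra step, outside the normalisation
forms, throws the combining marks away.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Decompose to NFD, then drop nonspacing marks (category Mn).

    The dropping step is the folding; it is not part of any
    Unicode normalisation form.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed
                   if unicodedata.category(ch) != "Mn")


n_tilde = "\u00f1"  # precomposed ñ

# Under every normalisation form, n and ñ remain different strings.
for form in ("NFC", "NFD", "NFKC", "NFKD"):
    same = unicodedata.normalize(form, "n") == unicodedata.normalize(form, n_tilde)
    print(form, same)                # False for all four forms

# Only the additional mark-stripping step makes them compare equal.
print(strip_marks(n_tilde) == "n")   # True
```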
> > Yes. I am fully aware of this. But so be it. Having applications work
> differently depending on the locale of the
> > environment the application was started in is nothing new.
>
> It's not new. It's old. We should move on to more general
> environments that support multiple languages. Emacs is such an
> environment. The old l10n paradigms are fundamentally incompatible
> with that.
>
Sure, but doesn't it make sense to fall back to the user's default if the
buffer does not have an overriding locale?
> > Being a multi-lingual environment, Emacs has no real notion of the
> > locale.
> >
> > Perhaps it should?
>
> That'd be a step backward, IMO.
>
As opposed to having no concept of locale at all? I just have to disagree
with you on that.
> Strange, I always thought the data was there. Perhaps you should ask
> a question on the Unicode mailing list, then.
>
That's a good idea actually. Thank you for the suggestion. I'm reading that
mailing list, and I will post a question there.
Regards,
Elias