From: "Elias Mårtenson" <lokedhs@gmail.com>
To: Werner LEMBERG <wl@gnu.org>
Cc: "Óscar Fuentes" <ofv@wanadoo.es>, emacs-devel <emacs-devel@gnu.org>
Subject: Re: Character folding in the pretest
Date: Fri, 5 Feb 2016 14:36:13 +0800
Message-ID: <CADtN0WLnTYHioJ1p16JP-pt=rNqbyBmGfxh3SQFwfswEZnCz0A@mail.gmail.com>
In-Reply-To: <20160205.070103.162978216111829522.wl@gnu.org>

On 5 February 2016 at 14:01, Werner LEMBERG <wl@gnu.org> wrote:

>
> >> This naturally leads to a possible user option: Having `optical'
> >> matches or not, where `optical' means `base character plus
> >> diacritic and/or slight modifications', e.g., o → ø → ö etc., etc.
> >
> > How do you even define "optical similarities"?
>
> Basically the same as Eli has described: Base character plus
> diacritics, probably plus some basic shapes with `diacritics' that
> Unicode doesn't represent as composable: o → ø, l → ł, d → đ, etc.
>

Composability is somewhat arbitrary. The character composition has very
little to do with "visual similarities". Just have a look at character
compositions in Devanagari for example.


> > Should l and I compare the same under this definition?  They
> > certainly look similar.
>
> No, since the similarity is a font issue only.  For this reason I
> *never* use Arial-like fonts.
>

And that argument works equally well for a and å. They really have
_nothing_ in common. The fact that there exists a Unicode decomposition for
them is completely irrelevant to a Swedish speaker.
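
To make concrete what a decomposition-based fold does, here is a minimal sketch in Python using the standard `unicodedata` module. It is only an illustration, not how Emacs implements character folding, and the helper name `fold_by_decomposition` is made up for this example.

```python
import unicodedata

def fold_by_decomposition(text: str) -> str:
    """Strip combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# å and ä have canonical decompositions, so both fold to a plain a.
print(fold_by_decomposition("Mårtenson"))  # Martenson
print(fold_by_decomposition("Hänsel"))     # Hansel

# ø, ł and đ have no canonical decomposition, so they are left untouched.
print(fold_by_decomposition("øłđ"))        # øłđ
```

The fold treats å as "a plus a ring" purely because Unicode records that decomposition. Nothing in it knows whether a given language regards the result as the same letter.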

Also note that to a Swedish speaker (well, at least up until recently), W
and V were variations of the same character. Yet I'm not advocating that
Emacs should consider them similar unless the locale says they should be.

In fact, the Unicode TR on collation that Eli linked to mentions this as a
specific example.


> > What about p and q?  They look like mirror images of each other.
> > What about z and s?  They even sound similar.
>
> Nonsense.  I've clearly mentioned `base character plus diacritic'.
> Why do you intentionally skip that?  Doing so reminds me of
> Schopenhauer's first stratagem in `The Art of Being Right'...
>

I did not intentionally skip that. I would appreciate it if you didn't
assume that I was out to simply prove you wrong, or that I am here to troll.

I was using that as an example in trying to highlight that to some people
(like myself) ä just simply is not a character with a diacritic. It is in
German, but not in Swedish.

I think this is hard to explain because in many European languages (such as
English, German and French) you have characters which are variations or
alternatives. For example, in French you have the letter Œ, which is a
variation of "OE". Likewise in German, ß is a variation of SS and Ü is a
variation of UE. As far as I know, I could write "Müller" as "Mueller".

However, this is not true for Swedish. I'll say it again (and I apologise
for repeating myself, this kind of repetition makes me sound like the troll
that you accused me of being) but in Swedish the difference between Å and A
is just as great as the difference in English between the letters E and O.
Writing my last name as "Martenson" looks just as bizarre as me writing
your last name as "Merner". And yes, I picked M because it kinda looks like
an upside-down W and I'm doing that not because I'm really suggesting that
that equivalence should be implemented, but because I want to illustrate
just how silly it looks.



> > To a Swedish speaker there are zero similarities between a, ä and å.
>
> I'm a native German speaker, and there is *zero* similarity in the
> sound between `a' and `ä', say.


I know. I speak a little German. In fact, Ä is pronounced exactly the same in
German and Swedish. That said, as far as I can recall from my German
lessons 25 years ago, German grammar does see Ä as a variation of A. At
least they are sorted together in the dictionary.

The Swedish distinction is much greater. This discussion would have been much
easier if the letters looked completely different. :-)


> But it is quite common in English
> texts, say, to omit the diaeresis dots, thus having a searching mode
> that finds both `Hänsel und Gretel' and `Hansel and Gretel' at the
> same time would be very valuable.
>

I never said it's not valuable. I never even suggested that this kind of
comparison should not be possible.

In fact, I'm not even suggesting that this kind of comparison should not
be the default. Especially given the fact that locale-dependent
comparators are not very well supported in Emacs at the moment.

What I did want to do was try to explain that even though there is a
visual similarity between A, Ä and Å, to a Swedish speaker those
similarities are no greater than those of q and k. And they are definitely
more different than W and V (where W was, up until recently, sorted under V
in dictionaries and seen as simply a visual variation).

>
> What you describe naturally leads to another user option: Don't handle
> characters as `equal' (with a proper definition of `equal') that
> aren't `equal' in the user's locale.


This is exactly my point. And you have managed to compress hundreds of my
words into a single, distinct sentence. Thank you.
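
As a rough sketch of what such an option could look like (in Python, for illustration only): fold diacritics as usual, but leave alone the letters that the user's locale treats as distinct. The table `LOCALE_DISTINCT`, its Swedish entry, and the function name `fold_for_locale` are all assumptions made up for this example. They are not an existing Emacs or Unicode API.

```python
import unicodedata

# Hypothetical table: letters a locale treats as distinct from their base.
# Only a Swedish entry is shown; its contents are an assumption for this sketch.
LOCALE_DISTINCT = {
    "sv": set("åäöÅÄÖ"),
}

def fold_for_locale(text: str, locale: str) -> str:
    """Fold diacritics, except on letters the locale treats as distinct."""
    keep = LOCALE_DISTINCT.get(locale, set())
    out = []
    # Normalize to NFC first so letters such as å are single code points.
    for ch in unicodedata.normalize("NFC", text):
        if ch in keep:
            out.append(ch)
        else:
            decomposed = unicodedata.normalize("NFD", ch)
            out.append("".join(c for c in decomposed
                               if not unicodedata.combining(c)))
    return "".join(out)

print(fold_for_locale("Mårtenson", "sv"))  # Mårtenson
print(fold_for_locale("Mårtenson", "en"))  # Martenson
print(fold_for_locale("café", "sv"))       # cafe
```

With the "sv" entry, a search for "Martenson" would no longer match "Mårtenson", while é still folds to e.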
