From: "Per Starbäck" <per.starback@gmail.com>
To: Artur Malabarba <bruce.connor.am@gmail.com>
Cc: "Dirk-Jan C. Binnema" <djcb@djcbsoftware.nl>,
	Drew Adams <drew.adams@oracle.com>,
	emacs-devel <emacs-devel@gnu.org>
Subject: Re: Character folding in the pretest
Date: Sat, 6 Feb 2016 10:37:06 +0100
Message-ID: <CADkQgvv-ZwfCqkhUeapEENLAxi984ndLqEporZ1J8bTxUMwRMg@mail.gmail.com>
In-Reply-To: <CAAdUY-LqpOpn8CQeP0Wi77mxdQyXm_DOgF_aGhMw9QsfBsac6w@mail.gmail.com>

Oscar Fuentes wrote:

> If a Spaniard inputs "sana" on a search box and "saña" is found, he
> will regard the software as either buggy, dumb or completely
> oblivious to Spanish culture.

Similar to my example of how a Swede would see a search for "varpa"
finding "värpa" or "varpå" (all three being existing, totally
different words).

When met with the "argument" that not many people speak Swedish anyway
I replied that it was only an example of what I knew best, and that
there probably were similar examples in several other languages. I'm
glad to hear there is one in Spanish, one of the largest languages of
the world. Now let's count the number of affected people again! :)

That character folding is dependent on locale is of course well-known
by those who work on this. Artur Malabarba wrote:

> FTR, like I've said a couple of times already, I will invest more time
> into making this customizable once I've seen how it's received.
> Also (and this I haven't said yet) I do plan on providing a better
> default depending on locale. When the time comes to actually implement
> it I'll explain why I prefer locale (over some notion of buffer-local
> language).

When Artur again confirmed that he is fine with having the new feature
turned off in Emacs 25 with the intention of having it turned on later,
after it has had enough testing, I thought this would finally be settled.

But evidently not yet... Those opposed have argued as if this were
something mandated by Unicode, so that we can do nothing but follow.
It doesn't matter if the result is seen as buggy or dumb by users.
"This feature is simply folding as specified by the Unicode standard".

That is not so. Of course the Unicode Consortium is well aware of the
issues that I, Oscar and others are pointing out, and that I'm sure
Artur is well aware of.

Eli Zaretskii:
> Perhaps you aren't familiar with Unicode equivalence, in which case I
> suggest these sources:
>
>   http://unicode.org/reports/tr10/#Searching
>   http://www.unicode.org/notes/tn5/
>   http://www.unicode.org/reports/tr30/tr30-4.html

But of course these take up issues like the ones we have mentioned
here. The first one mentions the aa/å equivalence in Danish, for
example. And to quote the last one:

#  In the general case, different search term foldings are applied for
#  different languages. For example, accent distinctions are ignorable
#  for some languages, but not for others. In English the accent in
#  words like naïve is optional, while to a Swedish user 'o' and 'ö'
#  are distinct letters.

That is, by the way, the last draft of a withdrawn technical report.

  Draft UTR #30: Unicode Character Foldings has been withdrawn. It was
  never formally approved; the last public version was a draft
  UTR, which can be found at
  http://www.unicode.org/reports/tr30/tr30-4.html.

That shows not only that the issues I, Oscar and others are raising
are nothing new, not something we just thought of that Unicode would
somehow have us ignore. It also shows that there *is* no technical
report on Unicode Character Foldings.
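
To make the locale dependence concrete, here is a minimal sketch in
Python. It is not Emacs code and not anything specified by Unicode; the
table DISTINCT_LETTERS and the functions fold and matches are made-up
names for illustration, and the per-language sets are only the examples
from this thread, not complete alphabets.

```python
import unicodedata

# Hypothetical per-locale exceptions: letters that must NOT be folded to
# their base letter, because the language treats them as distinct letters.
# These sets cover only the examples discussed in this thread.
DISTINCT_LETTERS = {
    "sv": set("åäöÅÄÖ"),  # Swedish
    "es": set("ñÑ"),      # Spanish
    "en": set(),          # English: accents are treated as optional
}


def fold(text, locale):
    """Strip combining marks, except from letters distinct in `locale`."""
    keep = DISTINCT_LETTERS.get(locale, set())
    out = []
    # Compose first so that "a" + COMBINING DIAERESIS is seen as "ä".
    for ch in unicodedata.normalize("NFC", text):
        if ch in keep:
            out.append(ch)
            continue
        # Decompose the character and drop its combining marks.
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed
                           if not unicodedata.combining(c)))
    return "".join(out)


def matches(query, candidate, locale):
    """True if query and candidate are equal after locale-aware folding."""
    return fold(query, locale) == fold(candidate, locale)
```

With these assumptions, matches("varpa", "värpa", "en") is true while
matches("varpa", "värpa", "sv") is false, and likewise "sana" finds
"saña" under "en" but not under "es". A Swedish user searching for
"naive" would still find "naïve", since only å, ä and ö are exempted.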

We have to break out of the circles this is going in. John Wiegley wrote:

> A locale-based quotient for natural language text seems like a reasonable
> default, unless pretesting/polling shows us otherwise. However, there will
> always be times when you don't want it, or you want a different quotient
> altogether, or even various combinations of them.

Yes, that would be a good default, but it's not a default that we
can have in the next Emacs, though there are great prospects that we
can have it in the one after that. Please John, put your foot down and
don't let this continue ad infinitum.

The options we have are instead:

(1) Let the default be as searching has worked before. Nothing gets
worse for anyone.

We'll have the start of an exciting new feature available, one that
will be just right for many users and that will be tried by a lot of
others as well, giving feedback for the continued development that
Artur has written that he is already planning.

(2) Make the fundamental feature of searching work fundamentally
differently out of the box, in a way that many users will see as
neat, and many others will see as "buggy, dumb or completely
oblivious to" the user's culture.


