From: "Clément Pit--Claudel" <clement.pit@gmail.com>
To: emacs-devel@gnu.org
Subject: Re: Character folding in the pretest
Date: Thu, 4 Feb 2016 12:27:49 -0500
Message-ID: <56B38A15.90703@gmail.com>
In-Reply-To: <87mvrg2zid.fsf@wanadoo.es>
On 02/04/2016 11:47 AM, Óscar Fuentes wrote:
> Clément Pit--Claudel <clement.pit@gmail.com> writes:
>
>> On 02/04/2016 10:59 AM, Óscar Fuentes wrote:
>>> After seeing the case I mentioned (`n' matching `ñ' in Spanish
>>> text) it is obvious that the feature is not ready for prime
>>> time.
>>
>> This is interesting. I guess it boils down to whether you're trying
>> to avoid false positives or false negatives. For me the strength of
>> this feature is that it lets me find virtually anything using a
>> dumb keyboard (one without easy access to accents); I don't care
>> too much about false positives (that is, I don't mind if ‘n’ finds
>> ‘ñ’). In that sense, it doesn't matter if letters "are different";
>> all that matters is whether they look different. I imagine that's
>> why the Unicode standard defined things that way. It seems this
>> behavior is consistent with that of most online search engines (I
>> tried Google, Bing, and DuckDuckGo; all return accented matches for
>> unaccented keywords).
>
> I see your point, but you are talking about accents all the time. In
> Spanish `n' and `ñ' are different letters. `n' matching `ñ' is no
> different than `p' matching `q'. I think that you will agree that
> some of us will see that behavior as a glaring bug.
I should have said diacritics instead of accents; sorry. The difference between n matching ñ and p matching q is that graphically, ñ is n + ~ (it can also be encoded that way: ñ).
Here's another issue that character folding solves; I'd like your thoughts on it. Try to search the text of my message for 'n' and 'ñ', without any sort of character folding.
This will match n but not ñ: ñ.
This will match ñ but not n: ñ.
Note that the behaviour has nothing to do with Emacs; most applications will behave the same. The first ñ uses n + combining tilde, while the second is the single character ñ. Both are legal representations of the Spanish letter ñ. With character folding, both match 'n'. This is a much more logical default, I think. The same can be said for virtually every diacritic.
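To make the two encodings concrete, here is a minimal sketch in Python. It is not how Emacs implements character folding; the `fold` helper is a hypothetical name, and it only approximates folding by decomposing the text and dropping combining marks.

```python
import unicodedata


def fold(text: str) -> str:
    """Rough stand-in for character folding (hypothetical helper):
    decompose to NFD, then drop every combining mark."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


precomposed = "\u00f1"   # ñ as a single code point
combining = "n\u0303"    # n followed by COMBINING TILDE

# Without folding, a plain substring search treats the two differently.
print("n" in combining)           # True
print("n" in precomposed)         # False
print(precomposed in combining)   # False

# After folding, both spellings reduce to the same base letter.
print(fold(precomposed) == fold(combining) == "n")   # True
```

The same helper maps ç to c and é to e, which is the trade-off under discussion: fewer false negatives at the cost of more false positives.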
On a more personal note, I wouldn't see the character folding behaviour as a bug for French, even though ç is quite different from c, and é is quite different from e.
>> I'm wary of smart solutions based on locale or buffer language.
>> It's not uncommon to be writing a single document in multiple
>> languages; especially if names are involved. Plus, it's not obvious
>> that a single set of settings is enough for each locale. For
>> example, one could argue that folding accents makes no sense in
>> French: ‘supprimé’ means ‘removed’, but ‘supprime’ means ‘removes’.
>> Yet it is not uncommon for people to write the latter for the
>> former, especially when using a dumb keyboard.
>
> I'm not sure how to fix this, but seeing similar reservations from
> other users, some language-dependent behavior is unavoidable.
I don't think so. An on-off switch seems enough to begin with. Language-dependent folding could be a separate feature; Unicode folding (the current implementation) would be a fine feature to start with, I think.