On 02/04/2016 11:47 AM, Óscar Fuentes wrote:
> Clément Pit--Claudel writes:
>
>> On 02/04/2016 10:59 AM, Óscar Fuentes wrote:
>>> After seeing the case I mentioned (`n' matching `ñ' in Spanish
>>> text) it is obvious that the feature is not ready for prime
>>> time.
>>
>> This is interesting. I guess it boils down to whether you're trying
>> to avoid false positives or false negatives. For me the strength of
>> this feature is that it lets me find virtually anything using a
>> dumb keyboard (one without easy access to accents); I don't care
>> too much about false positives (that is, I don't mind if ‘n’ finds
>> ‘ñ’). In that sense, it doesn't matter if letters "are different";
>> all that matters is whether they look different. I imagine that's
>> why the Unicode standard defined things that way. It seems this
>> behavior is consistent with that of most online search engines (I
>> tried Google, Bing, and DuckDuckGo; all return accented matches for
>> unaccented keywords).
>
> I see your point, but you are talking about accents all the time. In
> Spanish `n' and `ñ' are different letters. `n' matching `ñ' is no
> different than `p' matching `q'. I think that you will agree that
> some of us will see that behavior as a glaring bug.

I should have said diacritics instead of accents; sorry. The difference between n matching ñ and p matching q is that graphically, ñ is n + ~ (it can also be encoded that way: ñ).

Here's another issue that character folding solves; I'd like your thoughts on it. Try to search the text of my message for 'n' and 'ñ', without any sort of character folding.

This will match n but not ñ: ñ.
This will match ñ but not n: ñ.

Note that the behaviour has nothing to do with Emacs; most applications will behave the same. The first ñ uses n + combining tilde, while the second is the single character ñ. Both are legal representations of the Spanish letter ñ. With character folding, both match 'n'. This is a much more logical default, I think.
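
To make the two encodings concrete, here is a small Python sketch using only the standard `unicodedata` module. The `fold` helper is a name made up for this illustration, and it only strips combining marks after canonical decomposition; it is a simplified stand-in for the idea, not how Emacs implements character folding.

```python
import unicodedata

def fold(text: str) -> str:
    """Hypothetical helper: decompose (NFD), then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

precomposed = "\u00f1"   # ñ as a single code point
decomposed = "n\u0303"   # n followed by COMBINING TILDE

# Without folding, the two encodings behave differently in a plain search.
print(precomposed == decomposed)   # False
print("n" in precomposed)          # False
print("n" in decomposed)           # True

# Canonical composition (NFC) shows they are the same letter.
print(unicodedata.normalize("NFC", decomposed) == precomposed)   # True

# With the simplified folding above, both reduce to a plain 'n'.
print(fold(precomposed) == fold(decomposed) == "n")              # True
```

A plain substring search sees two unrelated strings here, which is why an unfolded search for 'n' finds one ñ and not the other.
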
The same thing can be said for virtually every diacritic. On a more personal note, I wouldn't see the character folding behaviour as a bug for French, where ç is quite different from c, and é is quite different from e.

>> I'm wary of smart solutions based on locale or buffer language.
>> It's not uncommon to be writing a single document in multiple
>> languages; especially if names are involved. Plus, it's not obvious
>> that a single set of settings is enough for each locale. For
>> example, one could argue that folding accents makes no sense in
>> French: ‘supprimé’ means ‘removed’, but ‘supprime’ means ‘removes’.
>> Yet it is not uncommon for people to write the latter for the
>> former, especially when using a dumb keyboard.
>
> I'm not sure how to fix this, but seeing similar reservations from
> other users, some language-dependent behavior is unavoidable.

I don't think so. An on-off switch seems enough to begin with. Language-dependent folding could be a separate feature; Unicode folding (the current implementation) would be a fine feature to start with, I think.