* Character folding in the pretest
@ 2016-02-03  0:31 Per Starbäck
  2016-02-03  6:34 ` Adrian.B.Robert
                   ` (3 more replies)
  0 siblings, 4 replies; 102+ messages in thread
From: Per Starbäck @ 2016-02-03 0:31 UTC
To: emacs-devel@gnu.org

I brought up earlier that the new character fold feature, which still
hasn't been in any released version of Emacs, shouldn't be turned on by
default when it debuts. Now I've tested the first prerelease of Emacs
25 and seen that it is still turned on by default, so I'll revisit this
and argue why this is important. Probably what I say here is all I have
to say.

=== There was a lot of agreement ===

RMS wrote that there ought to be a poll about the default. Eli wrote
that

> Such a poll could only work if the behavior intended to become the
> default is already available in released versions of Emacs, so users
> could turn it on and try it. This is not the case with character
> folding, which is only available in development snapshots, and
> actually is still in flux: it changes in non-trivial ways almost every
> day.
>
> If we are afraid users will hate this default, we can turn it off in
> v25.1 and consider making it the default later.

RMS commented:

> That seems like the right approach.

Artur Malabarba wrote:

> I don't mind leaving this OFF by default in Emacs 25. So long as the
> eventual goal is to have it ON by default (preferably in 26).

Drew liked the feature and thought it should be turned off initially:

> My expectation, if we turn it off by default, is that users will
> try it, like it, and possibly ask for it to become the default
> behavior. There is no reason to jump the gun on this.

Eli thought that it should remain turned on in the pretest to get more
testing:

> The entire time interval between Nov 15 this year and until we release
> Emacs 25.1 (which will take a few months, probably more than 6,
> judging by past experience) is supposed to provide that feedback. All
> it takes to turn this off by default is changing the default value of
> a single variable (and change a couple of places in the User Manual to
> reflect that). Once we decide to do that, it can be done very quickly
> and easily. We can do that a day before the release, if we want to.
>
> OTOH, turning it off today means that it will get much less testing,
> and therefore bugs related to it (like the one reported just today in
> http://debbugs.gnu.org/cgi/bugreport.cgi?bug=22090) will most probably
> remain hidden for who knows how long.

It's time to make that decision now.

=== Why ===

Because this is a big change that has repercussions, and it hasn't had
all the major wrinkles ironed out yet.

Some software throws big changes like that in the face of the users,
and more or less forces them to get the kinks out or find out how to
turn it off, but that is usually not the Emacs way. Here the big kinks
are usually already taken care of when something is introduced to users
who haven't specifically asked for it. That's a good thing.

Eli argued against me that Emacs sometimes does that, for example with
bidi, which he argued was a much bigger change. In some ways it was, but
still, for people who don't use RTL languages all of that has been more
or less invisible, and for those who do it was obviously better than
without it.

I know how the current character fold version is *just wrong* for
Swedes and other Scandinavians when handling their native languages.
There was a flurry of messages then which I couldn't keep up with, and
where I thought most of it took up issues I had already answered anyway,
but I'm getting back to this now.

One answer was that problems for Scandinavians weren't relevant, because
I had to show that it was "_wrong_ in _most_ situations" to be relevant.
I don't agree with that, but even if you do, I think my Scandinavian
example is only an example, and that there probably are several similar
ones in different locales.

=== What was that Swedish example now again? ===

A and Ä.

In classical Latin U and V were the same letter. Not until the Late
Middle Ages were there these two forms, and they weren't differentiated
as one consonant and one vowel until the 16th century. In spite of
their historical equivalence they are clearly different letters in, for
example, English. Having a character fold feature where a search for U
found V would be *just wrong*. Since everyone on this list knows English
everyone knows that.

What we get now for Swedish is very similar to that. Everyone who knows
Swedish knows that. Here ÅÄÖ are separate letters in all ways from A
and O, in spite of their historic origin tying them together. That is
just history. "Ä" has its own key on a keyboard, its own name and its
own position in the alphabet.

For a Swede to have a search for "varpa" in a Swedish text find "värpa"
or "varpå" would be *just wrong*. It would give a strong impression of
this being an American program not meant to be used for Swedish.

Note that this is not me saying that we Swedes don't like character
folding. It's a perfectly good feature to have a search for "entre"
find "entré" or a search for "crepe" find "crêpe" because "é" and "ê"
are accented variants of "e". But "ä" in Swedish is in no way an
accented letter.

At this point several people usually reply "then just turn it off". But
the point is that by having it work like this out of the box it sends a
message to some new users that Emacs is not usable at all. If they
instead have some problems with a feature they have explicitly turned
on, that's something else. Those who have turned it on know how to turn
it off. Others don't necessarily know that.

=== Are there other examples? ===

I won't say anything certain about a language I'm not a native speaker
of, but I think there are similar situations. I suspect for example
that Russian и and й are a similar pair, where it is *just wrong* that
a search for "и" (CYRILLIC SMALL LETTER I) also finds "й" (CYRILLIC
SMALL LETTER SHORT I).

All in all I see the need for a feature to adjust individual entries of
the character folding before it ought to be turned on by default.

=== Are users expecting this? ===

Has Emacs been late implementing character folding? Is everyone
expecting that now, and is it important to turn it on to not seem to be
out of the loop? It doesn't seem so.

Eli wrote first that character folding was introduced in Emacs to give
users "what the other text-editing and word-processing environments
provide, what they therefore are expected to expect". I answered that
for example Gedit and Firefox didn't have this feature, and then Eli
wrote that I should "try more serious editing environments" like MS
Word.

Since then I have had opportunity to try MS Word 2013 and I couldn't
find such a feature. Maybe there was such a feature I couldn't find.
Maybe it had been turned off by the system administrators at my
university. I don't know, but on a random web source,
http://wordribbon.tips.net/T010627_Ignoring_Accented_Characters_in_Searches.html
I find it stated that MS Word (2007, 2010, and 2013) doesn't have such
a feature.

I don't think this is something users expect, but something that will
be an example of how Emacs does things better for those who can turn it
off for good results already and for the rest of us when it has become
slightly more featureful.

=== Option menu ===

Also, please please add a checkbox for character folding just above or
below the one for case folding in the Options menu!!

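The request above, folding accents in general while letting a locale
keep specific letters distinct, can be illustrated with a small Python
sketch. This is not how Emacs implements character folding. It assumes
folding is done by Unicode canonical decomposition on precomposed text,
and the names `fold`, `matches`, `protected` and `SWEDISH_LETTERS` are
hypothetical, introduced only for this illustration.

```python
import unicodedata

# Hypothetical per-locale exception set: letters that must never be folded.
SWEDISH_LETTERS = frozenset("åäöÅÄÖ")


def fold(text, protected=frozenset()):
    """Strip combining marks from each character unless it is protected.

    Assumes the input is precomposed (NFC), so "ä" is a single character.
    """
    out = []
    for ch in text:
        if ch in protected:
            out.append(ch)
            continue
        decomposed = unicodedata.normalize("NFD", ch)
        base = "".join(c for c in decomposed if not unicodedata.combining(c))
        out.append(base or ch)  # a lone combining mark is kept, not dropped
    return "".join(out)


def matches(needle, word, protected=frozenset()):
    """True if needle and word are equal after folding both."""
    return fold(needle, protected) == fold(word, protected)


print(matches("varpa", "värpa"))                   # True: plain folding conflates them
print(matches("varpa", "värpa", SWEDISH_LETTERS))  # False: ä is kept distinct
print(matches("entre", "entré", SWEDISH_LETTERS))  # True: é still folds to e
```

With an exception set like this, "crepe" still finds "crêpe" while
"varpa" no longer finds "värpa" or "varpå", which is the behavior the
message asks for.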
* Re: Character folding in the pretest
From: Adrian.B.Robert @ 2016-02-03 6:34 UTC
To: emacs-devel

As a developer in a Scandinavian country I find the new character
folding very useful for searching in text when I have a US keyboard
layout enabled. That said, I agree that it should not be the default,
but be easily discoverable in the Options menu.

* Re: Character folding in the pretest
From: Paul Eggert @ 2016-02-03 8:00 UTC
To: emacs-devel

Per Starbäck wrote:
> I suspect for
> example that Russian и and й are a similar pair, where it is *just
> wrong* that a search for "и" (CYRILLIC SMALL LETTER I) also finds "й"
> (CYRILLIC SMALL LETTER SHORT I).

For what it's worth, here is an amusing bug report involving two
Russians who disagree about whether to accent-fold и and й:

http://tracker.firebirdsql.org/browse/CORE-4803

* Re: Character folding in the pretest
From: Yuri Khan @ 2016-02-03 10:54 UTC
To: Paul Eggert; +Cc: Emacs developers

On Wed, Feb 3, 2016 at 2:00 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:

> For what it's worth, here is an amusing bug report involving two Russians
> who disagree about whether to accent-fold и and й:
>
> http://tracker.firebirdsql.org/browse/CORE-4803

Very funny. In Russian, И and Й are only treated as equivalent within
crossword puzzles; otherwise everybody agrees they are different
letters. Е and Ё, on the other hand, are a holywar-inducing contention
point.

* Re: Character folding in the pretest
From: Filipp Gunbin @ 2016-02-03 15:57 UTC
To: Yuri Khan; +Cc: Paul Eggert, Emacs developers

On 03/02/2016 16:54 +0600, Yuri Khan wrote:

> Е and Ё, on the other hand, are a holywar-inducing contention
> point.

They have their own places in the Russian alphabet. I think
char-folding should fold only "modified" letter variants into
"canonical" form (without any modifications).

Е and Ё are just separate letters, although we don't use Ё much...

Once I "fixed" all our text resource files at work and a colleague of
mine commented in review that Ё is used only in children's books. I had
to revert the change.

Filipp

* RE: Character folding in the pretest
From: Drew Adams @ 2016-02-03 16:24 UTC
To: Filipp Gunbin, Yuri Khan; +Cc: Paul Eggert, Emacs developers

> > Е and Ё, on the other hand, are a holywar-inducing contention
> > point.
>
> They have their own places in the Russian alphabet. I think
> char-folding should fold only "modified" letter variants into
> "canonical" form (without any modifications).
>
> Е and Ё are just separate letters, although we don't use Ё much...
>
> Once I "fixed" all our text resource files at work and a colleague of
> mine commented in review that Ё is used only in children's books. I had
> to revert the change.

The point, IMO, is that there are multiple use cases, depending on the
user and the context (including, but not limited to, language).

What we really need are ways for _users_ to _easily_ express their
preferences, including perhaps preferences for different contexts that
they use, and including ways to express what they want on the fly - not
just ahead of time via Customize (e.g. default preferences).

That should be the _first_ order of business. If we do a good job of
providing for that then anything additional we do concerning DWIM or
default behaviors is icing on the cake.

If we do not take care of the need to give users flexible control then
anything we do (DWIM or defaults) will be misguided for at least some
users and use cases. It typically hurts more than helps, IMO.

This is a general point, not limited to char folding or search. Our
priority should be to (1) yes, raise possible use cases for discussion,
such as is being done now in this thread, and (2) come up with
brilliant, easy-to-use ways to _give users control_.

Users are different, and even the same user has multiple use cases -
s?he does not want the same behavior all the time. It is not enough to
look at the user's language setting etc. Only the user knows, at any
given time, what s?he wants.

It is fine to be smart about the defaults we set, but that's not the
most important thing. Likewise wrt coming up with clever DWIM behavior.
But the smartest DWIM is brain dead when compared with a live user. And
even the best default behavior is no good for many use cases. Users
need to be able to (easily) control the behavior.

Thinking first about defaults or DWIM is wrong, IMO. We should think
first about how users can change the behavior, including on the fly.

* Re: Character folding in the pretest
From: Clément Pit--Claudel @ 2016-02-03 16:46 UTC
To: emacs-devel

On 02/03/2016 11:24 AM, Drew Adams wrote:
> Thinking first about defaults or DWIM is wrong, IMO. We
> should think first about how users can change the behavior,
> including on the fly.

I don't agree. This leads to Emacs being painful to use without large
amounts of customization. Do many Emacs devs use an empty or almost
empty .emacs?

Customizability is a strength, but the popularity of pre-packaged Emacs
configurations (prelude, Emacs starter kit, Graphene, and countless
.emacs.d repositories) says something about good defaults.

Clément.

* RE: Character folding in the pretest
From: Drew Adams @ 2016-02-03 17:28 UTC
To: Clément Pit--Claudel, emacs-devel

> > Thinking first about defaults or DWIM is wrong, IMO. We
> > should think first about how users can change the behavior,
> > including on the fly.
>
> I don't agree. This leads to Emacs being painful to use without large
> amounts of customization. Do many Emacs devs use an empty or almost empty
> .emacs?
> Customizability is a strength, but the popularity of pre-packaged Emacs
> configurations (prelude, Emacs starter kit, Graphene, and countless .emacs.d
> repositories) says something about good defaults.

Please read what I wrote. I do not argue that defaults are unimportant,
or that we should not choose good default behavior, and choose it
carefully. Quite the contrary.

My point is that concentrating _first_ on the default behavior, without
considering various use cases, is a mistake. (One reason it is a mistake
is precisely because without considering possible use cases the default
choice made is likely to not be the best one.)

I welcome the recent posts that point to different use cases. The mere
_possibility_ of char folding (treating different chars equivalently,
for some meanings of equivalence) means that there can be, and so there
will be, some very different needs and preferences wrt which chars are
to be handled as equivalent in which contexts. Better for us to start
hearing about this at the outset, so we have a wider vision of what
this new feature represents.

As to the popularity of starter kits: Sure. But the popularity of
_Emacs_ itself has a lot to do with its bendability - the fact that
different people can use it in different ways, and extend it or
customize it or change it on the fly to fit their needs. Without that,
Emacs is not Emacs.

And in the case at hand, I feel that char folding does not yet provide
enough flexibility for users. It provides a useful set of foldings
(equivalences) out of the box, and that's great, as a start. But we
should make it more user-customizable. Just one opinion.

It's not a case of one or the other: picking good defaults and clever
DWIM or providing ways for users to control the behavior.

* Re: Character folding in the pretest
From: Clément Pit--Claudel @ 2016-02-03 18:10 UTC
To: Drew Adams, emacs-devel

On 02/03/2016 12:28 PM, Drew Adams wrote:
> Please read what I wrote.

Please don't assume that I didn't.

* Re: Character folding in the pretest
From: Clément Pit--Claudel @ 2016-02-03 18:24 UTC
To: emacs-devel

Amusingly, the tagline for Emacs Starter Kit is “Because the Emacs
defaults are not so great sometimes.”

On 02/03/2016 11:46 AM, Clément Pit--Claudel wrote:
> On 02/03/2016 11:24 AM, Drew Adams wrote:
>> Thinking first about defaults or DWIM is wrong, IMO. We
>> should think first about how users can change the behavior,
>> including on the fly.
>
> I don't agree. This leads to Emacs being painful to use without large
> amounts of customization. Do many Emacs devs use an empty or almost
> empty .emacs?
> Customizability is a strength, but the popularity of pre-packaged Emacs
> configurations (prelude, Emacs starter kit, Graphene, and countless
> .emacs.d repositories) says something about good defaults.
>
> Clément.
>

* RE: Character folding in the pretest
From: Drew Adams @ 2016-02-03 18:31 UTC
To: Clément Pit--Claudel, emacs-devel

> Amusingly, the tagline for Emacs Starter Kit is “Because the Emacs defaults
> are not so great sometimes.”

Yes, a starter kit is a customization, albeit one that its creator
expects will be useful to multiple users.

* Re: Character folding in the pretest
From: Yuri Khan @ 2016-02-03 16:52 UTC
To: Filipp Gunbin; +Cc: Paul Eggert, Emacs developers

On Wed, Feb 3, 2016 at 9:57 PM, Filipp Gunbin <fgunbin@fastmail.fm> wrote:

>> Е and Ё, on the other hand, are a holywar-inducing contention
>> point.
>
> They have their own places in the Russian alphabet. I think
> char-folding should fold only "modified" letter variants into
> "canonical" form (without any modifications).
>
> Е and Ё are just separate letters, although we don't use Ё much...

Oh, we use it all the time. It’s just that many people habitually write
Е in place of Ё.

And this is exactly the reason why char folding becomes relevant for
this particular pair. When searching in a text by someone else, I will
want to fold so that I find occurrences where I would write Ё but
others would replace it with Е. Likewise, those other people, when
reading my text, will want to fold in order to find occurrences where
they would write Е but I would write Ё.

> Once I "fixed" all our text resource files at work and a colleague of
> mine commented in review that Ё is used only in children's books. I had
> to revert the change.

In this situation, you will want to not fold, so that you can search
for all instances of Ё and decide which to replace with Е. (Even when
the policy is to avoid Ё, it is still mandatory in cases of ambiguity.)

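Yuri's two scenarios, finding both spellings when reading someone
else's text and finding only Ё when auditing one's own, amount to the
same search with folding switched on or off. A minimal Python sketch,
assuming folding by Unicode canonical decomposition applied
symmetrically to both the text and the search string (Emacs may behave
differently); `strip_marks`, `count_matches` and the `fold` flag are
hypothetical names, not Emacs functions.

```python
import unicodedata


def strip_marks(text):
    """Decompose text (NFD) and drop all combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def count_matches(text, needle, fold=True):
    """Count non-overlapping occurrences of needle in text, optionally folded."""
    if fold:
        text, needle = strip_marks(text), strip_marks(needle)
    return text.count(needle)


text = "ёлка, елка, ещё, еще"
print(count_matches(text, "елка", fold=True))   # 2: both spellings are found
print(count_matches(text, "ёлка", fold=False))  # 1: only the literal Ё spelling
```

The first call is the "reading someone else's text" case; the second is
the "find every Ё so I can review it" case, where folding has to be off.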
* Re: Character folding in the pretest
From: Artur Malabarba @ 2016-02-03 11:08 UTC
To: Per Starbäck; +Cc: emacs-devel@gnu.org

Per Starbäck <per@starback.se> writes:

> I brought up earlier that the new character fold feature that still
> hasn't been in any released version of Emacs shouldn't be turned on by
> default when it debuts.

FTR, my opinion on this is still as you quoted:

>> I don't mind leaving this OFF by default in Emacs 25. So long as the
>> eventual goal is to have it ON by default (preferably in 26).

I do also share Eli's opinion, that it would be nice to get as much
(pre)testing as possible before the release. However, it's likely I'll
grow a little absent from this list in the next few months, so it's
entirely possible I'll miss out on the chance to turn this OFF before
release.

Does anyone volunteer to switch OFF this default shortly before release?
If not, I'll just do it now.

> I know how the current character fold version is *just wrong* for
> Swedes and other Scandinavians when handling their native languages.

The current version just follows the Unicode standard (plus a few
ad-hoc rules related to quotation marks), whose authors have certainly
spent a lot more time on this than us. This is just a polite way of
saying “we're not catering to any languages, take any complaints up
with that other team”.

Of course, that doesn't mean we can't improve support for specific
languages/locales in the future. But I've mentioned before I don't want
to start designing APIs or sophisticated features on top of the current
implementation before seeing how it fares “in the wild” for at least
one release.

> === Option menu ===
>
> Also, please please add a checkbox for character folding just above or
> below the one for case folding in the Options menu!!

Yes, please!

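The remark that the current version follows the Unicode standard can be
made concrete by looking at canonical decompositions. The sketch below
is Python, not Emacs code, and the helper name `decomposition_names` is
made up for this illustration. It only shows what the Unicode data
says, on the assumption that the folding tables are derived from such
decomposition data.

```python
import unicodedata


def decomposition_names(ch):
    """Return the Unicode names of the code points in the NFD form of ch."""
    return [unicodedata.name(c) for c in unicodedata.normalize("NFD", ch)]


for ch in "äéй":
    print(ch, decomposition_names(ch))
# ä ['LATIN SMALL LETTER A', 'COMBINING DIAERESIS']
# é ['LATIN SMALL LETTER E', 'COMBINING ACUTE ACCENT']
# й ['CYRILLIC SMALL LETTER I', 'COMBINING BREVE']
```

Unicode gives the Swedish "ä", the French "é" and the Russian "й" the
same shape: a base letter plus a combining mark. The data alone cannot
tell a separate letter of the alphabet from an accented variant; that
distinction is locale knowledge, which is where the two positions in
this thread diverge.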
* Re: Character folding in the pretest
From: Stefan Monnier @ 2016-02-03 13:24 UTC
To: emacs-devel

> Does anyone volunteer to switch OFF this default shortly before release?

Am I the only one worried about making changes "shortly before
release"?

Stefan

* Re: Character folding in the pretest
From: Nicolas Petton @ 2016-02-03 13:35 UTC
To: Stefan Monnier, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Does anyone volunteer to switch OFF this default shortly before release?
>
> Am I the only one worried about making changes "shortly before
> release"?

I agree. If we want to turn it off by default for the release, I'd do
it now, so it gets in the next pretest.

Nico

* RE: Character folding in the pretest
From: Drew Adams @ 2016-02-03 15:06 UTC
To: Nicolas Petton, Stefan Monnier, emacs-devel

> >> Does anyone volunteer to switch OFF this default shortly before release?
> >
> > Am I the only one worried about making changes "shortly before
> > release"?
>
> I agree. If we want to turn it off by default for the release, I'd do
> it now, so it gets in the next pretest.

+1

WYTestIWYG

* Re: Character folding in the pretest
From: Eli Zaretskii @ 2016-02-03 15:41 UTC
To: Nicolas Petton; +Cc: monnier, emacs-devel

> From: Nicolas Petton <nicolas@petton.fr>
> Date: Wed, 03 Feb 2016 14:35:46 +0100
>
> If we want to turn it off by default for the release

We don't, at least not yet. We want to collect feedback.

* Re: Character folding in the pretest
From: Teemu Likonen @ 2016-02-03 15:55 UTC
To: Eli Zaretskii; +Cc: Nicolas Petton, monnier, emacs-devel

Eli Zaretskii [2016-02-03 17:41:06+02] wrote:

>> From: Nicolas Petton <nicolas@petton.fr>
>> Date: Wed, 03 Feb 2016 14:35:46 +0100
>> If we want to turn it off by default for the release
>
> We don't, at least not yet. We want to collect feedback.

Here's mine: I don't want "a" and "ä" to be the same in searches, by
default. In my language (Finnish) they are different letters and
phonemes, for example: "tai" (= or) and "täi" (= a louse); "sakki" (=
gang, crowd) and "säkki" (= a sack).

This is a great feature, though.

-- 
/// Teemu Likonen - .-.. <https://github.com/tlikonen> //
// PGP: 4E10 55DC 84E9 DFF6 13D7 8557 719D 69D3 2453 9450 ///

* Re: Character folding in the pretest
From: Eli Zaretskii @ 2016-02-03 16:16 UTC
To: Teemu Likonen; +Cc: nicolas, monnier, emacs-devel

> From: Teemu Likonen <tlikonen@iki.fi>
> Cc: Nicolas Petton <nicolas@petton.fr>, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> Date: Wed, 03 Feb 2016 17:55:50 +0200
>
> > We want to collect feedback.
>
> Here's mine: I don't want "a" and "ä" to be the same in searches, by
> default. In my language (Finnish) they are different letters and
> phonemes, for example: "tai" (= or) and "täi" (= a louse); "sakki" (=
> gang, crowd) and "säkki" (= a sack).

Thank you.

* Re: Character folding in the pretest
From: Teemu Likonen @ 2016-02-06 13:41 UTC
To: Eli Zaretskii; +Cc: nicolas, monnier, emacs-devel

Eli Zaretskii [2016-02-03 18:16:25+02] wrote:

> From: Teemu Likonen <tlikonen@iki.fi>
>> Here's mine: I don't want "a" and "ä" to be the same in searches, by
>> default. In my language (Finnish) they are different letters and
>> phonemes, for example: "tai" (= or) and "täi" (= a louse); "sakki" (=
>> gang, crowd) and "säkki" (= a sack).
>
> Thank you.

Actually I take that back. I've been testing (and thinking about) the
character folding feature more and it's very unlikely that users will
face problems with it in the Finnish language. It doesn't bother me if
the feature is on by default.

A global switch (a dynamic variable) would be a good thing and I think
it should override any locale or language based magic, if such magic is
even necessary.

-- 
/// Teemu Likonen - .-.. <https://github.com/tlikonen> //
// PGP: 4E10 55DC 84E9 DFF6 13D7 8557 719D 69D3 2453 9450 ///

* Re: Character folding in the pretest
From: Eli Zaretskii @ 2016-02-06 14:33 UTC
To: Teemu Likonen; +Cc: nicolas, monnier, emacs-devel

> From: Teemu Likonen <tlikonen@iki.fi>
> Cc: nicolas@petton.fr, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> Date: Sat, 06 Feb 2016 15:41:44 +0200
>
> Eli Zaretskii [2016-02-03 18:16:25+02] wrote:
>
> > From: Teemu Likonen <tlikonen@iki.fi>
> >> Here's mine: I don't want "a" and "ä" to be the same in searches, by
> >> default. In my language (Finnish) they are different letters and
> >> phonemes, for example: "tai" (= or) and "täi" (= a louse); "sakki" (=
> >> gang, crowd) and "säkki" (= a sack).
> >
> > Thank you.
>
> Actually I take that back. I've been testing (and thinking) the
> character folding feature more and it's very unlikely that users will
> face problems with it in the Finnish language. It doesn't bother me if
> the feature is on by default.

Thanks again for sharing your views.

> A global switch (a dynamic variable) would be a good thing and I think
> it should override any locale or language based magic, if such magic is
> even necessary.

Not sure I understand: a global switch to do what? If to turn
character folding on and off, then such a possibility already exists.

* Re: Character folding in the pretest
From: Teemu Likonen @ 2016-02-06 15:09 UTC
To: Eli Zaretskii; +Cc: nicolas, monnier, emacs-devel

Eli Zaretskii [2016-02-06 16:33:36+02] wrote:

> From: Teemu Likonen <tlikonen@iki.fi>
>> A global switch (a dynamic variable) would be a good thing and I
>> think it should override any locale or language based magic, if such
>> magic is even necessary.
>
> Not sure I understand: a global switch to do what? If to turn
> character folding on and off, then such a possibility already exists.

By global switch I meant a variable like case-fold-search but for
character folding. But after looking a bit more closely I found
search-default-regexp-mode. The "regexp" part in the variable name is
confusing but I guess I must read the whole "(emacs) Search" info node
and its subnodes. It's probably too long since the last time.

-- 
/// Teemu Likonen - .-.. <https://github.com/tlikonen> //
// PGP: 4E10 55DC 84E9 DFF6 13D7 8557 719D 69D3 2453 9450 ///

* Re: Character folding in the pretest
From: Artur Malabarba @ 2016-02-06 18:38 UTC
To: Teemu Likonen; +Cc: Nicolas Petton, Eli Zaretskii, Stefan Monnier, emacs-devel

On 6 Feb 2016 1:09 pm, "Teemu Likonen" <tlikonen@iki.fi> wrote:
> By global switch I meant a variable like case-fold-search but for
> character folding. But after looking a bit more closely I found
> search-default-regexp-mode. The "regexp" part in the variable name is
> confusing

I see how that's confusing. We should probably call it
search-default-mode. There's still time to make the change.

The regexp part is related to the implementation, so it's really of
little interest to the user and shouldn't be in the name.

* Re: Character folding in the pretest 2016-02-06 18:38 ` Artur Malabarba @ 2016-02-06 19:08 ` Eli Zaretskii 2016-02-07 1:06 ` Artur Malabarba 0 siblings, 1 reply; 102+ messages in thread From: Eli Zaretskii @ 2016-02-06 19:08 UTC (permalink / raw) To: Artur Malabarba; +Cc: nicolas, tlikonen, monnier, emacs-devel > Date: Sat, 6 Feb 2016 18:38:44 +0000 > From: Artur Malabarba <arturmalabarba@gmail.com> > Cc: emacs-devel <emacs-devel@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca>, > Nicolas Petton <nicolas@petton.fr>, Eli Zaretskii <eliz@gnu.org> > > On 6 Feb 2016 1:09 pm, "Teemu Likonen" <tlikonen@iki.fi> wrote: > > By global switch I meant a variable like case-fold-search but for > > character folding. But after looking a bit more closely I found > > search-default-regexp-mode. The "regexp" part in the variable name is > > confusing > > I see how that's confusing. We should probably call it search-default-mode. Yes, please. Thanks. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-06 19:08 ` Eli Zaretskii @ 2016-02-07 1:06 ` Artur Malabarba 0 siblings, 0 replies; 102+ messages in thread From: Artur Malabarba @ 2016-02-07 1:06 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Nicolas Petton, tlikonen, Stefan Monnier, emacs-devel On 6 February 2016 at 19:08, Eli Zaretskii <eliz@gnu.org> wrote: >> Date: Sat, 6 Feb 2016 18:38:44 +0000 >> From: Artur Malabarba <arturmalabarba@gmail.com> >> Cc: emacs-devel <emacs-devel@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca>, >> Nicolas Petton <nicolas@petton.fr>, Eli Zaretskii <eliz@gnu.org> >> >> On 6 Feb 2016 1:09 pm, "Teemu Likonen" <tlikonen@iki.fi> wrote: >> > By global switch I meant a variable like case-fold-search but for >> > character folding. But after looking a bit more closely I found >> > search-default-regexp-mode. The "regexp" part in the variable name is >> > confusing >> >> I see how that's confusing. We should probably call it search-default-mode. > > Yes, please. > > Thanks. Done ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-03 15:41 ` Eli Zaretskii 2016-02-03 15:55 ` Teemu Likonen @ 2016-02-03 16:54 ` Clément Pit--Claudel 2016-02-03 17:01 ` John Wiegley 2016-02-03 17:02 ` Eli Zaretskii 1 sibling, 2 replies; 102+ messages in thread From: Clément Pit--Claudel @ 2016-02-03 16:54 UTC (permalink / raw) To: emacs-devel On 02/03/2016 10:41 AM, Eli Zaretskii wrote: > We don't, at least not yet. We want to collect feedback. I love the new behaviour: * It makes it much nicer to search through documents written in French when I'm not using a French keyboard. * It also makes it easier to search through emails in which some accents have been omitted (probably for the same reason as above). * It even makes it nicer to search for my own name: it's definitely wrong to spell it “Clement”, but many websites reject “Clément” due to the accent, so I end up with emails addressed to “Clement”. I don't read Emacs' change logs carefully enough to hear about every new feature. Disabling features that I don't like doesn't bother me; on the other hand, discovering features to enable is much harder. So I'd vote for this being on by default. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-03 16:54 ` Clément Pit--Claudel @ 2016-02-03 17:01 ` John Wiegley 2016-02-03 21:08 ` Óscar Fuentes 2016-02-03 17:02 ` Eli Zaretskii 1 sibling, 1 reply; 102+ messages in thread From: John Wiegley @ 2016-02-03 17:01 UTC (permalink / raw) To: Clément Pit--Claudel; +Cc: emacs-devel >>>>> Clément Pit--Claudel <clement.pit@gmail.com> writes: > It makes it much nicer to search through documents written in French when > I'm not using a French keyboard. It's also nice when searching a Spanish document, where someone says "como" and you want to search for it, but aren't sure if it was meant as a question word (¿Cómo?) or a preposition (como). -- John Wiegley http://newartisans.com GPG fingerprint = 4710 CF98 AF9B 327B B80F 60E1 46C4 BD1A 7AC1 4BA2 ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-03 17:01 ` John Wiegley @ 2016-02-03 21:08 ` Óscar Fuentes 2016-02-03 22:32 ` John Wiegley ` (2 more replies) 0 siblings, 3 replies; 102+ messages in thread From: Óscar Fuentes @ 2016-02-03 21:08 UTC (permalink / raw) To: emacs-devel John Wiegley <jwiegley@gmail.com> writes: > It's also nice when searching a Spanish document, where someone says "como" > and you want to search for it, but aren't sure if it was meant as a question > word (¿Cómo?) or a preposition (como). Furthermore, in Spanish nowadays you can't expect correct orthography, even in supposedly educated environments. Also, involuntary typos involving accents are common. I like the feature very much, but I'm neutral wrt its default value. If you ask me, as a programmer, I would say no, but as a Spaniard who occasionally uses Emacs to write Spanish text, I'll say yes. BTW, searching for `n' also matches `ñ', which is definitely wrong. Those are not equivalent characters by any stretch. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-03 21:08 ` Óscar Fuentes @ 2016-02-03 22:32 ` John Wiegley 2016-02-03 22:52 ` Clément Pit--Claudel 2016-02-03 23:50 ` Sacha Chua 2016-02-04 5:49 ` Ivan Andrus 2016-02-04 8:40 ` Elias Mårtenson 2 siblings, 2 replies; 102+ messages in thread From: John Wiegley @ 2016-02-03 22:32 UTC (permalink / raw) To: Óscar Fuentes; +Cc: Sacha Chua, emacs-devel >>>>> Óscar Fuentes <ofv@wanadoo.es> writes: > BTW, searching for `n' also matches `ñ', which is definitely wrong. Those > are not equivalent characters by any stretch. I think a poll about this would be a good idea. There is enough contention about having it as a default that we may prefer to wait, especially since it does change the searching behavior that 24.x users are used to. What's the best method these days for conducting such a poll? I wonder if these types of polls are something our community ambassador, Sacha, would be willing to take ownership of... -- John Wiegley http://newartisans.com GPG fingerprint = 4710 CF98 AF9B 327B B80F 60E1 46C4 BD1A 7AC1 4BA2 ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-03 22:32 ` John Wiegley @ 2016-02-03 22:52 ` Clément Pit--Claudel 2016-02-03 23:50 ` Sacha Chua 1 sibling, 0 replies; 102+ messages in thread From: Clément Pit--Claudel @ 2016-02-03 22:52 UTC (permalink / raw) To: emacs-devel On 02/03/2016 05:32 PM, John Wiegley wrote: >>>>>> Óscar Fuentes <ofv@wanadoo.es> writes: > >> BTW, searching for `n' also matches `ñ', which is definitely wrong. Those >> are not equivalent characters by any stretch. > > I think a poll about this would be a good idea. There is enough contention > about having it as a default that we may prefer to wait, especially since it > does change the searching behavior that 24.x users are used to. > > What's the best method these days for conducting such a poll? I wonder if > these types of polls are something our community ambassador, Sacha, would be > willing to take ownership of... I wonder whether the meta Emacs Stack Exchange would work. Clément. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-03 22:32 ` John Wiegley 2016-02-03 22:52 ` Clément Pit--Claudel @ 2016-02-03 23:50 ` Sacha Chua 1 sibling, 0 replies; 102+ messages in thread From: Sacha Chua @ 2016-02-03 23:50 UTC (permalink / raw) To: emacs-devel; +Cc: jwiegley John Wiegley <jwiegley@gmail.com> writes: > I think a poll about this would be a good idea. There is enough > contention about having it as a default that we may prefer to wait, > especially since it does change the searching behavior that 24.x are > used to. What's the best method these days for conducting such a poll? > I wonder if these types of polls is something our community > ambassador, Sacha, would be willing to take ownership of... This approach from 2002 ( http://lists.gnu.org/archive/html/emacs-devel/2002-06/msg00170.html ) of posting a lightly-structured e-mail-based poll so that people could either share a quick answer or a more nuanced opinion seems to still be a better way than, say, using a web-based multiple-choice poll. On-list discussion seems to be slightly more useful than quick off-list voting, although I think I can handle tallying quick votes sent to an address off-list if needed. Polling is a weird thing, anyway. You'll probably mostly hear from people who feel strongly about it, so I'm not sure how representative that will be for our user base. There are pretty good arguments on all sides in the current thread, so I'm not sure if you'll get that much additional information from a poll. Still, if someone wants to draft a poll, I can help with the grunt-work of distributing it (maybe emacs-devel, help-gnu-emacs, emacs-tangents, Reddit, and Planet Emacsen), tallying the votes, and maybe updating a proposal page with additional notes (maybe on EmacsWiki). Sacha ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-03 21:08 ` Óscar Fuentes 2016-02-03 22:32 ` John Wiegley @ 2016-02-04 5:49 ` Ivan Andrus 2016-02-04 21:30 ` Richard Stallman 2016-02-04 8:40 ` Elias Mårtenson 2 siblings, 1 reply; 102+ messages in thread From: Ivan Andrus @ 2016-02-04 5:49 UTC (permalink / raw) To: Óscar Fuentes; +Cc: emacs-devel On Feb 3, 2016, at 2:08 PM, Óscar Fuentes <ofv@wanadoo.es> wrote: > > John Wiegley <jwiegley@gmail.com> writes: > >> It's also nice when searching a Spanish document, where someone says "como" >> and you want to search for it, but aren't sure if it was meant as a question >> word (¿Cómo?) or a preposition (como). > > Furthermore, in Spanish nowadays you can't expect correct orthography, > even in supposedly educated environments. Also, involuntary typos > involving accents are common. > > I like the feature very much, but I'm neutral wrt its default value. If > you ask me, as a programmer, I would say no, but as a Spaniard who > occasionally uses Emacs to write Spanish text, I'll say yes. > > BTW, searching for `n' also matches `ñ', which is definitely wrong. > Those are not equivalent characters by any stretch. Though folding b and v would be very helpful for some of the Spanish I read. :-) -Ivan ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 5:49 ` Ivan Andrus @ 2016-02-04 21:30 ` Richard Stallman 0 siblings, 0 replies; 102+ messages in thread From: Richard Stallman @ 2016-02-04 21:30 UTC (permalink / raw) To: Ivan Andrus; +Cc: ofv, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > Though folding b and v would be very helpful for some of the Spanish I read. :-) This suggests a possible feature, phonetic search. It would be too hard to support English, I fear, but some other languages might be easier. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-03 21:08 ` Óscar Fuentes 2016-02-03 22:32 ` John Wiegley 2016-02-04 5:49 ` Ivan Andrus @ 2016-02-04 8:40 ` Elias Mårtenson 2016-02-04 11:57 ` Dirk-Jan C. Binnema 2016-02-04 21:32 ` Richard Stallman 2 siblings, 2 replies; 102+ messages in thread From: Elias Mårtenson @ 2016-02-04 8:40 UTC (permalink / raw) To: Óscar Fuentes; +Cc: emacs-devel On 4 February 2016 at 05:08, Óscar Fuentes <ofv@wanadoo.es> wrote: > BTW, searching for `n' also matches `ñ', which is definitely wrong. > Those are not equivalent characters by any stretch. What type of character equivalence should be used is locale-dependent. Everybody here agrees with that. Thus, the solution must also be locale-dependent. It would make sense to have the default based on the session's locale, meaning that in a Swedish locale a, ä and å would be different and n and ñ be different, but under a Spanish locale, the opposite would be true. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 8:40 ` Elias Mårtenson @ 2016-02-04 11:57 ` Dirk-Jan C. Binnema 2016-02-04 15:18 ` Drew Adams ` (2 more replies) 2016-02-04 21:32 ` Richard Stallman 1 sibling, 3 replies; 102+ messages in thread From: Dirk-Jan C. Binnema @ 2016-02-04 11:57 UTC (permalink / raw) To: emacs-devel On Thursday Feb 04 2016, Elias Mårtenson wrote: > On 4 February 2016 at 05:08, Óscar Fuentes <ofv@wanadoo.es> wrote: > > BTW, searching for `n' also matches `ñ', which is definitely wrong. >> Those are not equivalent characters by any stretch. > What type of character equivalence should be used is locale-dependent. > Everybody here agrees with that. Thus, the solution must also be > locale-dependent. > It would make sense to have the default based on the session's locale, > meaning that in a Swedish locale a, ä and å would be different and n and ñ > be different, but under a Spanish locale, the opposite would be true. Character equivalence is based on the language(s) of whatever is in your buffer, which might be correlated with your locale, but not more than that. Regardless, for the purpose of searching, my personal preference would be to make folding rather inclusive; I don't really care about the exact rules languages have come up with for what letters are considered "the same", I just care for what I, as a user, would find the easiest to match. So for instance, I'd like "angstrom" to match "Ångström" even though in Swedish, a/Å and o/ö are not the same. Somewhat similar to how languages' capitalization rules are ignored when searching case-insensitively. A few false positives are not much of a problem. That would also get my vote as a reasonable default for case-folding in searches. But I'll happily take any default, as long as there's a way to get the above behavior, preferably without having to change my locale. Kind regards, Dirk. -- Dirk-Jan C. 
Binnema Helsinki, Finland e:djcb@djcbsoftware.nl w:www.djcbsoftware.nl pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C ^ permalink raw reply [flat|nested] 102+ messages in thread
* RE: Character folding in the pretest 2016-02-04 11:57 ` Dirk-Jan C. Binnema @ 2016-02-04 15:18 ` Drew Adams 2016-02-04 15:59 ` Óscar Fuentes 2016-02-04 23:05 ` Artur Malabarba 2016-02-04 16:54 ` Eli Zaretskii 2016-02-04 17:26 ` Teemu Likonen 2 siblings, 2 replies; 102+ messages in thread From: Drew Adams @ 2016-02-04 15:18 UTC (permalink / raw) To: Dirk-Jan C. Binnema, emacs-devel > > It would make sense to have the default based on the session's locale, > > meaning that in a Swedish locale a, ä and å would be different and n and ñ > > be different, but under a Spanish locale, the opposite would be true. > > Character equivalence is based on the language(s) of whatever is in your > buffer, which might be correlated with your locale, but not more than > that. > > Regardless, for the purpose of searching, my personal preference would > be to make folding rather inclusive; I don't really care about the exact > rules languages have come up for what letters are considered "the same", > I just care for what I, as a user, would find the easiest to match. > > So for instance, I'd like "angstrom" to match "Ångström" even though in > Swedish, a/Å and o/ö are not the same. Somewhat similar to how > languages' capitalization rules are ignored when searching > case-insensitively. A few false positives are not much of problem. > > That would also get my vote as a reasonable default for case-folding in > searches. But I'll happily take any default, as long as there's a way to > get the above behavior, preferably without having to change my locale. Both of these posts (one saying that it should be possible to take locale into account, perhaps even for default behavior; the other adding that someone might have a personal preference) point to the existence of multiple use cases and users needing to be able to (easily) control the behavior. 
We can fine-tune defaulting at design time, to try to provide a reasonable behavior for most use cases/contexts, but users still need to be able to easily customize the sets of equivalence classes, and they should be able to have multiple sets of such sets, which they can activate in different contexts (e.g. modes). That is really where the design effort should be, at this point. We have a basic char-folding mechanism, but we do not yet provide an easy way for a user to customize the behavior, let alone to define/get the various behaviors that s?he might want in different contexts. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 15:18 ` Drew Adams @ 2016-02-04 15:59 ` Óscar Fuentes 2016-02-04 16:36 ` Clément Pit--Claudel 2016-02-04 17:07 ` Eli Zaretskii 2016-02-04 23:05 ` Artur Malabarba 1 sibling, 2 replies; 102+ messages in thread From: Óscar Fuentes @ 2016-02-04 15:59 UTC (permalink / raw) To: emacs-devel Drew Adams <drew.adams@oracle.com> writes: [snip] > That is really where the design effort should be, at this point. > We have a basic char-folding mechanism, but we do not yet provide > an easy way for a user to customize the behavior, let alone to > define/get the various behaviors that s?he might want in different > contexts. Allowing the user to configure the feature is good, but the defaults should be usable. After seeing the case I mentioned (`n' matching `ñ' in Spanish text) it is obvious that the feature is not ready for prime time. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 15:59 ` Óscar Fuentes @ 2016-02-04 16:36 ` Clément Pit--Claudel 2016-02-04 16:47 ` Óscar Fuentes 2016-02-04 20:23 ` John Wiegley 2016-02-04 17:07 ` Eli Zaretskii 1 sibling, 2 replies; 102+ messages in thread From: Clément Pit--Claudel @ 2016-02-04 16:36 UTC (permalink / raw) To: emacs-devel On 02/04/2016 10:59 AM, Óscar Fuentes wrote: > After seeing the case I mentioned (`n' matching `ñ' in > Spanish text) it is obvious that the feature is not ready for prime > time. This is interesting. I guess it boils down to whether you're trying to avoid false positives or false negatives. For me the strength of this feature is that it lets me find virtually anything using a dumb keyboard (one without easy access to accents); I don't care too much about false positives (that is, I don't mind if ‘n’ finds ‘ñ’). In that sense, it doesn't matter if letters "are different"; all that matters is whether they look different. I imagine that's why the Unicode standard defined things that way. It seems this behavior is consistent with that of most online search engines (I tried Google, Bing, and DuckDuckGo; all return accented matches for unaccented keywords). I'm wary of smart solutions based on locale or buffer language. It's not uncommon to be writing a single document in multiple languages; especially if names are involved. Plus, it's not obvious that a single set of settings is enough for each locale. For example, one could argue that folding accents makes no sense in French: ‘supprimé’ means ‘removed’, but ‘supprime’ means ‘removes’. Yet it is not uncommon for people to write the latter for the former, especially when using a dumb keyboard. Clément. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 16:36 ` Clément Pit--Claudel @ 2016-02-04 16:47 ` Óscar Fuentes 2016-02-04 17:05 ` Werner LEMBERG ` (2 more replies) 2016-02-04 20:23 ` John Wiegley 1 sibling, 3 replies; 102+ messages in thread From: Óscar Fuentes @ 2016-02-04 16:47 UTC (permalink / raw) To: emacs-devel Clément Pit--Claudel <clement.pit@gmail.com> writes: > On 02/04/2016 10:59 AM, Óscar Fuentes wrote: >> After seeing the case I mentioned (`n' matching `ñ' in >> Spanish text) it is obvious that the feature is not ready for prime >> time. > > This is interesting. I guess it boils down to whether you're trying to > avoid false positives or false negatives. For me the strength of this > feature is that it lets me find virtually anything using an dumb > keyboard (one without easy access to accents); I don't care too much > about false positives (that is, I don't mind if ‘n’ finds ‘ñ’). In > that sense, it doesn't matter if letters "are different"; all that > matters is whether they look different. I imagine that's why the > Unicode standard defined things that way. It seems this behavior is > consistent with that of most online search engines (I tried Google, > Bing, and DuckDuckGo; all return accented matches for unaccented > keywords). I see your point, but you are talking about accents all the time. In Spanish `n' and `ñ' are different letters. `n' matching `ñ' is no different than `p' matching `q'. I think that you will agree that some of us will see that behavior as a glaring bug. > I'm wary of smart solutions based on locale or buffer language. It's > not uncommon to be writing a single document in multiple languages; > especially if names are involved. Plus, it's not obvious that a single > set of settings is enough for each locale. For example, one could > argue that folding accents makes no sense in French: ‘supprimé’ means > ‘removed’, but ‘supprime’ means ‘removes’. 
Yet it is not uncommon for > people to write the latter for the former, especially when using a > dumb keyboard. I'm not sure how to fix this, but seeing similar reservations from other users, some language-dependent behavior is unavoidable. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 16:47 ` Óscar Fuentes @ 2016-02-04 17:05 ` Werner LEMBERG 2016-02-05 5:09 ` Elias Mårtenson 2016-02-04 17:12 ` Eli Zaretskii 2016-02-04 17:27 ` Clément Pit--Claudel 2 siblings, 1 reply; 102+ messages in thread From: Werner LEMBERG @ 2016-02-04 17:05 UTC (permalink / raw) To: ofv; +Cc: emacs-devel >> This is interesting. I guess it boils down to whether you're trying >> to avoid false positives or false negatives. For me the strength >> of this feature is that it lets me find virtually anything using an >> dumb keyboard (one without easy access to accents); I don't care >> too much about false positives (that is, I don't mind if ‘n’ finds >> ‘ñ’). In that sense, it doesn't matter if letters "are different"; >> all that matters is whether they look different. I imagine that's >> why the Unicode standard defined things that way. It seems this >> behavior is consistent with that of most online search engines (I >> tried Google, Bing, and DuckDuckGo; all return accented matches for >> unaccented keywords). > > I see your point, but you are talking about accents all the time. > In Spanish `n' and `ñ' are different letters. `n' matching `ñ' is > no different than `p' matching `q'. I think that you will agree > that some of us will see that behavior as a glaring bug. This naturally leads to a possible user option: Having `optical' matches or not, where `optical' means `base character plus diacritic and/or slight modifications', e.g., o → ø → ö etc., etc. Werner ^ permalink raw reply [flat|nested] 102+ messages in thread
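Editorial note: the `optical' matching Werner proposes (base character plus diacritic, plus a few slightly modified shapes) can be sketched outside Emacs in a few lines of Python. This is only an illustration under stated assumptions, not how Emacs implements character folding: the names `fold_optical`, `optical_match` and the `EXTRA_FOLDS` table are hypothetical. The extra table is an assumption added here because letters such as ø have no canonical decomposition in Unicode, so stripping combining marks alone would leave them unchanged.

```python
import unicodedata

# Hypothetical table (illustrative, incomplete): letters that Unicode does
# not decompose into "base character + combining mark".
EXTRA_FOLDS = {"ø": "o", "Ø": "O", "ł": "l", "Ł": "L", "đ": "d", "Đ": "D"}


def fold_optical(text):
    """Reduce each character to its base letter.

    Step 1: canonical decomposition (NFD) splits e.g. ö into o + U+0308.
    Step 2: combining marks are dropped.
    Step 3: the hand-written table handles shapes NFD cannot split.
    """
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return "".join(EXTRA_FOLDS.get(ch, ch) for ch in stripped)


def optical_match(query, text):
    """Return True if query occurs in text once both are folded."""
    return fold_optical(query).casefold() in fold_optical(text).casefold()
```

With this sketch a search for `o` would match both ø and ö, and `n` would match ñ, which is exactly the behaviour some posters want and others regard as a bug. Unrelated letters such as p and q are never folded together.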
* Re: Character folding in the pretest 2016-02-04 17:05 ` Werner LEMBERG @ 2016-02-05 5:09 ` Elias Mårtenson 2016-02-05 6:01 ` Werner LEMBERG 2016-02-06 12:58 ` Rasmus 0 siblings, 2 replies; 102+ messages in thread From: Elias Mårtenson @ 2016-02-05 5:09 UTC (permalink / raw) To: Werner LEMBERG; +Cc: Óscar Fuentes, emacs-devel On 5 Feb 2016 1:06 a.m., "Werner LEMBERG" <wl@gnu.org> wrote: > > This naturally leads to a possible user option: Having `optical' > matches or not, where `optical' means `base character plus diacritic > and/or slight modifications', e.g., o → ø → ö etc., etc. I think this statement shows how easy it is to introduce cultural bias, although the fact that your name sounds German suggests that personal preference is involved. How do you even define "optical similarities"? Should l and I compare the same under this definition? They certainly look similar. What about p and q? They look like mirror images of each other. What about z and s? They even sound similar. To a Swedish speaker there are zero similarities between a, ä and å. They are, in fact, just as different as a and z are to an English speaker. I really cannot emphasise this enough, and reading this thread tells me that it needs to be emphasised even more. As someone who lives in an English speaking country and uses English keyboards, while still working with documents in various languages, I see first-hand the need to have ways of searching for characters that I can't easily type on my keyboard, but this issue is orthogonal to that of character equivalence. The conflation of these two issues is, in my opinion, the root cause of many of the disagreements in this thread. My personal preference is that the expected behaviour of searches is more related to the locale of the user, rather than that of the document being searched. In other words, as a non-Spanish speaker, I'd expect to be able to find ñ when searching for n, even if the document I'm searching in is in Spanish. 
There are definitely an infinite number of counter-examples to this (enough to keep this thread going for another 100 messages, I'm sure), but at least there is reason to consider making the default based on the locale of the user. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-05 5:09 ` Elias Mårtenson @ 2016-02-05 6:01 ` Werner LEMBERG 2016-02-05 6:36 ` Elias Mårtenson 2016-02-08 14:05 ` Marcin Borkowski 2016-02-06 12:58 ` Rasmus 1 sibling, 2 replies; 102+ messages in thread From: Werner LEMBERG @ 2016-02-05 6:01 UTC (permalink / raw) To: lokedhs; +Cc: ofv, emacs-devel >> This naturally leads to a possible user option: Having `optical' >> matches or not, where `optical' means `base character plus >> diacritic and/or slight modifications', e.g., o → ø → ö etc., etc. > > How do you even define "optical similarities"? Basically the same as Eli has described: Base character plus diacritics, probably plus some basic shapes with `diacritics' that Unicode doesn't represent as composable: o → ø, l → ł, d → đ, etc. > Should l and I compare the same under this definition? They > certainly looks similar. No, since the similarity is a font issue only. For this reason I *never* use Arial-like fonts. > What about p and q? They look like mirror images of each other. > What about z and s? They even sound similar. Nonsense. I've clearly mentioned `base character plus diacritic'. Why do you intentionally skip that? Doing so reminds me of Schopenhauer's first stratagem in `The Art of Being Right'... > To a Swedish speaker there are zero similarities between a, ä and å. I'm a native German speaker, and there is *zero* similarity in the sound between `a' and `ä', say. But it is quite common in English texts, say, to omit the diaeresis dots, thus having a searching mode that finds both `Hänsel und Gretel' and `Hansel and Gretel' at the same time would be very valuable. > My personal preference is that the expected behaviour of searches is > more related to the locale of the user, rather than that of the > document being searched. In other words, as a non-Spanish speaker, > I'd expect to be able to find ñ when searching for n, even if the > document I'm searching in is in Spanish. 
There are definitely an > infinite number of counter-examples to this (enough to keep this > thread going for another 100 messages, I'm sure), but at least there > is reason to consider making the default based on the locale of the > user. What you describe naturally leads to another user option: Don't handle characters as `equal' (with a proper definition of `equal') that aren't `equal' in the user's locale. Werner ^ permalink raw reply [flat|nested] 102+ messages in thread
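Editorial note: the option Werner describes here (do not treat characters as `equal' when they are distinct letters in the user's locale) can be sketched as a per-locale exception list applied before diacritics are stripped. The Python below is a minimal sketch under stated assumptions: `DISTINCT_LETTERS` and `fold_for_locale` are hypothetical names, and the two locale entries are hand-written from the examples in this thread rather than taken from CLDR or from Emacs.

```python
import unicodedata

# Hypothetical per-locale sets of letters that must NOT be folded, written
# with escapes so the code points are unambiguous:
#   "sv": å ä ö Å Ä Ö      "es": ñ Ñ
DISTINCT_LETTERS = {
    "sv": set("\u00e5\u00e4\u00f6\u00c5\u00c4\u00d6"),
    "es": set("\u00f1\u00d1"),
}


def fold_for_locale(text, locale):
    """Strip diacritics, except on letters the locale treats as distinct."""
    keep = DISTINCT_LETTERS.get(locale, set())
    out = []
    # NFC first, so that "n" + combining tilde is seen as the single letter ñ.
    for ch in unicodedata.normalize("NFC", text):
        if ch in keep:
            out.append(ch)
            continue
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)
```

Under these assumptions a Spanish user searching a text containing "señor cómo" would have it folded to "señor como" (ñ kept, the accent on ó dropped), a Swedish user would see ñ folded to n but ä kept, and a locale with no entry folds everything. Whether the table should follow the user's locale or the buffer's language is the question the thread leaves open; the sketch takes the locale as an explicit argument so either choice can be passed in.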
* Re: Character folding in the pretest 2016-02-05 6:01 ` Werner LEMBERG @ 2016-02-05 6:36 ` Elias Mårtenson 2016-02-05 7:15 ` Werner LEMBERG 2016-02-05 7:52 ` Eli Zaretskii 2016-02-08 14:05 ` Marcin Borkowski 1 sibling, 2 replies; 102+ messages in thread From: Elias Mårtenson @ 2016-02-05 6:36 UTC (permalink / raw) To: Werner LEMBERG; +Cc: Óscar Fuentes, emacs-devel [-- Attachment #1: Type: text/plain, Size: 4980 bytes --] On 5 February 2016 at 14:01, Werner LEMBERG <wl@gnu.org> wrote: > > >> This naturally leads to a possible user option: Having `optical' > >> matches or not, where `optical' means `base character plus > >> diacritic and/or slight modifications', e.g., o → ø → ö etc., etc. > > > > How do you even define "optical similarities"? > > Basically the same as Eli has described: Base character plus > diacritics, probably plus some basic shapes with `diacritics' that > Unicode doesn't represent as composable: o → ø, l → ł, d → đ, etc. > Composability is somewhat arbitrary. The character composition has very little to do with "visual similarities". Just have a look at character compositions in Devanagari for example. > > Should l and I compare the same under this definition? They > > certainly looks similar. > > No, since the similarity is a font issue only. For this reason I > *never* use Arial-like fonts. > And that argument works equally well for a and å. They really have _nothing_ in common. The fact that there exists a Unicode decomposition for them is completely irrelevant to a Swedish speaker. Also note that to a Swedish speaker (well, at least up until recently), W and V were variations of the same character. Yet I'm not advocating that Emacs should consider them similar unless the locale says they should be. In fact, the links to the Unicode TR on collations that Eli posted mentions that as a specific example. > > What about p and q? They look like mirror images of each other. > > What about z and s? They even sound similar. > > Nonsense. 
I've clearly mentioned `base character plus diacritic'. > Why do you intentionally skip that? Doing so reminds me of > Schopenhauer's first stratagem in `The Art of Being Right'... > I did not intentionally skip that. I would appreciate it if you didn't assume that I was out to simply prove you wrong, or that I am here to troll. I was using that as an example in trying to highlight that to some people (like myself) ä just simply is not a character with a diacritic. It is in German, but not in Swedish. I think this is hard to explain because in many European language (such as English, German and French) you have characters which are variations or alternatives. For example, in French you have the letter Œ, which is a variation of "OE". Likewise in German, ß is a variation of SS and Ü is a variation of UE. As far as I know, I could write "Müller" as "Mueller". However, this is not true for Swedish. I'll say it again (and I apologise for repeating myself, this kind of repetition makes me sound like the troll that you accused me of being) but in Swedish the difference between Å and A are just as great as the difference in English between the letters E and O. Writing my last name as "Martenson" looks just as bizarre as me writing your last name as "Merner". And yes, I picked M because it kinda looks like an upside-down W and I'm doing that not because I'm really suggesting that that equivalence should be implemented, but because I want to illustrate just how silly it looks. > > To a Swedish speaker there are zero similarities between a, ä and å. > > I'm a native German speaker, and there is *zero* similarity in the > sound between `a' and `ä', say. I know. Speak a little German. In fact, Ä is pronounced exactly the same in German and Swedish. That said, as far as I can recall from my German lessons 25 years ago, German grammar does see Ä as a variation of A. At least they are sorted together in the dictionary. Swedish distinction is much greater. 
This discussion would have been much easier if the letters looked completely different. :-) > But it is quite common in English > texts, say, to omit the diaeresis dots, thus having a searching mode > that finds both `Hänsel und Gretel' and `Hansel and Gretel' at the > same time would be very valuable. > I never said it's not valuable. I never even suggested that this kind of comparison should not be possible. In fact, I'm not even suggesting that this kind of comparison should not be the default. Especially given the fact that locale-dependent comparators are not very well supported in Emacs at the moment. What I did want to do was try to explain that even though there is a visual similarity between A, Ä and Å, to a Swedish speaker those similarities are no greater than those of q and k. And definitely much more different than W and V (which were, up until recently, sorted under V in dictionaries and seen as simply a visual variation). > > What you describe naturally leads to another user option: Don't handle > characters as `equal' (with a proper definition of `equal') that > aren't `equal' in the user's locale. This is exactly my point. And you have managed to compress hundreds of my words into a single, distinct sentence. Thank you. [-- Attachment #2: Type: text/html, Size: 6918 bytes --] ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-05 6:36 ` Elias Mårtenson @ 2016-02-05 7:15 ` Werner LEMBERG 2016-02-05 7:22 ` Elias Mårtenson 2016-02-05 7:52 ` Eli Zaretskii 1 sibling, 1 reply; 102+ messages in thread From: Werner LEMBERG @ 2016-02-05 7:15 UTC (permalink / raw) To: lokedhs; +Cc: ofv, emacs-devel >> Basically the same as Eli has described: Base character plus >> diacritics, probably plus some basic shapes with `diacritics' that >> Unicode doesn't represent as composable: o → ø, l → ł, d → đ, etc. > > Composability is somewhat arbitrary. The character composition has > very little to do with "visual similarities". Just have a look at > character compositions in Devanagari for example. Character compositions in Devanagari form ligatures. This is a completely different concept. It is possible that a given character sequence yields different renderings, depending on the availability of a ligature in a font. The same issue is present in Arabic, BTW. What we are discussing here is inherently bound to alphabetic scripts, in particular Latin, Greek, and Cyrillic. Abugida and Abjad scripts need a separate solution, as do CJKV scripts. > Likewise in German, ß is a variation of SS and Ü is a variation of > UE. As far as I know, I could write "Müller" as "Mueller". In German, `Mueller' is an emergency representation if `ü' is not available; it is highly discouraged otherwise. But yes, it would be beneficial if there were an option to make a search for `Mueller' match `Müller' also (and vice versa). > However, this is not true for Swedish. I'll say it again (and I > apologise for repeating myself, this kind of repetition makes me > sound like the troll that you accused me of being) but in Swedish > the difference between Å and A are just as great as the difference > in English between the letters E and O. [...] Funnily, in your neighbouring country Denmark `A' and `Å' are much nearer, cf. `Århus' vs. `Aarhus'. 
>> What you describe naturally leads to another user option: Don't >> handle characters as `equal' (with a proper definition of `equal') >> that aren't `equal' in the user's locale. > > This is exactly my point. [...] :) Werner ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-05 7:15 ` Werner LEMBERG @ 2016-02-05 7:22 ` Elias Mårtenson 2016-02-06 15:43 ` Rasmus 0 siblings, 1 reply; 102+ messages in thread From: Elias Mårtenson @ 2016-02-05 7:22 UTC (permalink / raw) To: Werner LEMBERG; +Cc: Óscar Fuentes, emacs-devel [-- Attachment #1: Type: text/plain, Size: 760 bytes --] On 5 February 2016 at 15:15, Werner LEMBERG <wl@gnu.org> wrote: > > > However, this is not true for Swedish. I'll say it again (and I > > apologise for repeating myself, this kind of repetition makes me > > sound like the troll that you accused me of being) but in Swedish > > the difference between Å and A are just as great as the difference > > in English between the letters E and O. [...] > > Funnily, in your neighbouring country Denmark `A' and `Å' are much > nearer, cf. `Århus' vs. `Aarhus'. > Yes, that is funny. And I wish my Danish was better so that I could explain that. But yes, your observation is correct. If I remember correctly, I wasn't even aware that Aarhus and Århus were the same place until it was pointed out. [-- Attachment #2: Type: text/html, Size: 1178 bytes --] ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-05 7:22 ` Elias Mårtenson @ 2016-02-06 15:43 ` Rasmus 2016-02-06 15:51 ` Eli Zaretskii 0 siblings, 1 reply; 102+ messages in thread From: Rasmus @ 2016-02-06 15:43 UTC (permalink / raw) To: emacs-devel Elias Mårtenson <lokedhs@gmail.com> writes: > On 5 February 2016 at 15:15, Werner LEMBERG <wl@gnu.org> wrote: > >> >> > However, this is not true for Swedish. I'll say it again (and >> > I apologise for repeating myself, this kind of repetition >> > makes me sound like the troll that you accused me of being) >> > but in Swedish the difference between Å and A are just as >> > great as the difference in English between the letters E and >> > O. [...] >> >> Funnily, in your neighbouring country Denmark `A' and `Å' are >> much nearer, cf. `Århus' vs. `Aarhus'. >> > > Yes, that is funny. And I wish my Danish was better so that I > could explain that. But yes, you observation is correct. Å and aa are the same, though Å apparently sorts before aa in the dictionary. Å has been the recommended symbol for the aa sound since 1948, but in some cases, like place names, one is free to choose (e.g. Århus and Aarhus and Aalborg and Ålborg; note "Ålborg" is uncommon and is never used by citizens of the city). Since 2011 Aarhus has been used in official documents, but both representations are generally correct. For the purpose of the discussion you could argue that "arhus" should match Århus since an equivalent representation is aarhus... Rasmus -- The second rule of Fight Club is: You do not talk about Fight Club ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-06 15:43 ` Rasmus @ 2016-02-06 15:51 ` Eli Zaretskii 0 siblings, 0 replies; 102+ messages in thread From: Eli Zaretskii @ 2016-02-06 15:51 UTC (permalink / raw) To: Rasmus; +Cc: emacs-devel > From: Rasmus <rasmus@gmx.us> > Date: Sat, 06 Feb 2016 16:43:14 +0100 > > For the purpose of the discussion you could argue that "arhus" > should match Århus And it does, indeed, when both character-folding and case-folding are turned on. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-05 6:36 ` Elias Mårtenson 2016-02-05 7:15 ` Werner LEMBERG @ 2016-02-05 7:52 ` Eli Zaretskii 2016-02-05 15:09 ` Filipp Gunbin 1 sibling, 1 reply; 102+ messages in thread From: Eli Zaretskii @ 2016-02-05 7:52 UTC (permalink / raw) To: Elias Mårtenson; +Cc: ofv, emacs-devel > Date: Fri, 5 Feb 2016 14:36:13 +0800 > From: Elias Mårtenson <lokedhs@gmail.com> > Cc: Óscar Fuentes <ofv@wanadoo.es>, > emacs-devel <emacs-devel@gnu.org> > > What I did want to do was try try to explain that even though there is a visual similarity between A, Ä and Å, to > a Swedish speaker those similarities are no greater than those of q and k. And definitely much more different > than W and V (which were, up until recently sorted under V in dictionaries and seen as simply a visual > variation). > > > What you describe naturally leads to another user option: Don't handle > characters as `equal' (with a proper definition of `equal') that > aren't `equal' in the user's locale. > > This is exactly my point. And you have managed to compress hundreds of my words into a single, district > sentence. Thank you. We are not going by visual similarity, or any other arbitrary criteria. We are using established rules specified by the UCD, the Unicode Character Database, and the explanations that accompany it in the standard itself. The main rule is equivalent character strings should match (when character folding is enabled). That character equivalence is language-dependent is a truism that doesn't need to be argued. The plan is to have language-dependent variations as soon as Emacs acquires good infrastructure for doing that in a useful manner. The idea behind the current implementation was that this feature will be useful even when it is language-agnostic, which is the lowest level of compatibility cited in the Unicode Standard (so the Unicode Consortium guys didn't think it to be a stupid idea). ^ permalink raw reply [flat|nested] 102+ messages in thread
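[Editorial aside] The language-agnostic rule described above (equivalent character strings should match) can be illustrated with a small sketch. This is not how Emacs implements character folding; it is only an approximation in Python, assuming that "folding" means decomposing to NFD, dropping combining marks, and ignoring case. The function names `fold` and `folded_search` are made up for the illustration, and the sketch folds both the search string and the text symmetrically, which may differ from what Emacs does.

```python
import unicodedata


def fold(text):
    """Fold text for matching: decompose, drop combining marks, ignore case."""
    # NFD splits precomposed characters such as "Å" into "A" + U+030A.
    decomposed = unicodedata.normalize("NFD", text)
    # Combining marks have a non-zero canonical combining class.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def folded_search(needle, haystack):
    """Return True if needle occurs in haystack after folding both."""
    return fold(needle) in fold(haystack)


if __name__ == "__main__":
    print(folded_search("arhus", "Velkommen til Århus"))      # True
    print(folded_search("Hansel", "Hänsel und Gretel"))       # True
    # "Ł" has no Unicode decomposition, so it does not fold to "l".
    print(folded_search("lukasiewicz", "Jan Łukasiewicz"))    # False
```

The last case shows the same asymmetry that comes up later in the thread: characters without a decomposition in the Unicode Character Database are left alone by this kind of folding.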
* Re: Character folding in the pretest 2016-02-05 7:52 ` Eli Zaretskii @ 2016-02-05 15:09 ` Filipp Gunbin 2016-02-05 19:21 ` Eli Zaretskii 0 siblings, 1 reply; 102+ messages in thread From: Filipp Gunbin @ 2016-02-05 15:09 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel > The idea behind the current implementation was that this feature will > be useful even when it is language-agnostic, which is the lowest level > of compatibility cited in the Unicode Standard (so the Unicode > Consortium guys didn't think it to be a stupid idea). While we have strict rules for some languages, it's very helpful to account for errors that natives and non-natives may make and fold as much as possible - if a folded search gives too many false positives, that may just be an indication that a more specific (not folded) search should be used. I now realize that I'd like to see folded even distinct letters like Russian Е and Ё - I cannot tell in advance when the author did it correctly. However, having folding on by default will certainly tell me that Emacs is not respecting the Russian alphabet, which some people here wrote about too. Filipp ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-05 15:09 ` Filipp Gunbin @ 2016-02-05 19:21 ` Eli Zaretskii 2016-02-05 21:12 ` Óscar Fuentes 2016-02-06 19:49 ` Richard Stallman 0 siblings, 2 replies; 102+ messages in thread From: Eli Zaretskii @ 2016-02-05 19:21 UTC (permalink / raw) To: Filipp Gunbin; +Cc: emacs-devel > From: Filipp Gunbin <fgunbin@fastmail.fm> > Cc: emacs-devel@gnu.org > Date: Fri, 05 Feb 2016 18:09:23 +0300 > > I now realize that I'd like to see folded even distinct letters like > Russian Е and Ё - I cannot tell in advance when the author did it > correct. > > However, having folding on by default will certainly tell me that Emacs > is not respecting Russian alphabet, which some people here wrote about > too. Folding has nothing to do with respecting the alphabet. A and a are not the same letters, either, and have distinct positions within the English alphabet, and yet it is customary to have case folded during searching, and Emacs does that by default. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-05 19:21 ` Eli Zaretskii @ 2016-02-05 21:12 ` Óscar Fuentes 2016-02-05 22:20 ` Eli Zaretskii 2016-02-06 19:49 ` Richard Stallman 2016-02-06 19:49 ` Richard Stallman 1 sibling, 2 replies; 102+ messages in thread From: Óscar Fuentes @ 2016-02-05 21:12 UTC (permalink / raw) To: emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > Folding has nothing to do with respecting the alphabet. A and a are > not the same letters, either, and have distinct positions within the > English alphabet, This is big news to me. AFAIK `A' and `a' are the same letter, one in uppercase form and the other in lowercase form. The English alphabet consists of 26 letters. This is what I learned many years ago, but it seems that it is all wrong. In Spanish, `A' and `a' are the same letter. `á' and `a' are also the same letter. `n' and `ñ' are not the same letter. > and yet it is customary to have case folded during searching, and > Emacs does that by default. Maybe you are confusing C with English :-) Seriously, if you want a feature for the people who think in terms of encodings, that's fine, but please keep in mind that most people see text as text, the same thing they can write with a pencil, not a series of bytes in Unicode, ASCII or whatever. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-05 21:12 ` Óscar Fuentes @ 2016-02-05 22:20 ` Eli Zaretskii 2016-02-06 19:49 ` Richard Stallman 1 sibling, 0 replies; 102+ messages in thread From: Eli Zaretskii @ 2016-02-05 22:20 UTC (permalink / raw) To: Óscar Fuentes; +Cc: emacs-devel > From: Óscar Fuentes <ofv@wanadoo.es> > Date: Fri, 05 Feb 2016 22:12:34 +0100 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Folding has nothing to do with respecting the alphabet. A and a are > > not the same letters, either, and have distinct positions within the > > English alphabet, > > This is big news to me. AFAIK `A' and `a' are the same letter, one in > uppercase form and the other in lowercase form. The English alphabet > consists on 26 letters. This is what I learned many years ago, but it > seems that it is all wrong. You are missing the point. The point is that "folding", by its very definition, means mapping distinct things to the same value. So no one argues that the letters are different before they are folded. > Seriously, if you want a feature for the people who think on terms of > encodings, that's fine, but please keep in mind that most people see > text as text, the same thing they can write with a pencil, not series of > bytes on Unicode, ASCII or whatever. The notion of "text" moved a long way since we were in kindergarten. The Unicode Standard is about plain text, not anything else. We slowly adapt to that, and character folding is one milestone on that long journey. It has nothing to do with encoding. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-05 21:12 ` Óscar Fuentes 2016-02-05 22:20 ` Eli Zaretskii @ 2016-02-06 19:49 ` Richard Stallman 1 sibling, 0 replies; 102+ messages in thread From: Richard Stallman @ 2016-02-06 19:49 UTC (permalink / raw) To: Óscar Fuentes; +Cc: emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > In Spanish, `A' and `a' are the same letter. `á' and `a' are also the > same letter. `n' and `ñ' are not the same letter. Ok, but let's not make dire criticial remarks about it ;-). -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-05 19:21 ` Eli Zaretskii 2016-02-05 21:12 ` Óscar Fuentes @ 2016-02-06 19:49 ` Richard Stallman 1 sibling, 0 replies; 102+ messages in thread From: Richard Stallman @ 2016-02-06 19:49 UTC (permalink / raw) To: Eli Zaretskii; +Cc: fgunbin, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > Folding has nothing to do with respecting the alphabet. I agree. This is not a matter of principle, just convenience. I am sure this feature will be convenient for many users if configured right -- but what configuration is right appears not to be obvious. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-05 6:01 ` Werner LEMBERG 2016-02-05 6:36 ` Elias Mårtenson @ 2016-02-08 14:05 ` Marcin Borkowski 2016-02-08 17:48 ` Eli Zaretskii 1 sibling, 1 reply; 102+ messages in thread From: Marcin Borkowski @ 2016-02-08 14:05 UTC (permalink / raw) To: Werner LEMBERG; +Cc: ofv, lokedhs, emacs-devel On 2016-02-05, at 07:01, Werner LEMBERG <wl@gnu.org> wrote: >> How do you even define "optical similarities"? > > Basically the same as Eli has described: Base character plus > diacritics, probably plus some basic shapes with `diacritics' that > Unicode doesn't represent as composable: o → ø, l → ł, d → đ, etc. Just as another datapoint in discussion: for me, searching for "l" and finding "ł" seems a bit weird. (The opposite even more so.) I admit this might be nice for people without access to Polish keyboard, and in fact the most popular layout for Polish keyboard is one where "AltGr + l" stands for "ł", but they are really different letters, and similarly with other such cases: "łata" = "patch" "lata" = "flies" (verb, as in "something flies") "kąt" = "angle" "kat" = "hangman" Etc., etc. BTW, strangely enough, here isearching for "l" does /not/ find "ł", but isearching for "a" (with character folding on) finds "ą". Whatever one thinks about char folding, this is clearly a bug. For Polish texts, I would rather turn char folding off. Best, -- Marcin Borkowski http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski Faculty of Mathematics and Computer Science Adam Mickiewicz University ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-08 14:05 ` Marcin Borkowski @ 2016-02-08 17:48 ` Eli Zaretskii 2016-02-08 17:57 ` Werner LEMBERG 2016-02-08 19:18 ` Marcin Borkowski 0 siblings, 2 replies; 102+ messages in thread From: Eli Zaretskii @ 2016-02-08 17:48 UTC (permalink / raw) To: Marcin Borkowski; +Cc: ofv, lokedhs, emacs-devel > From: Marcin Borkowski <mbork@mbork.pl> > Date: Mon, 08 Feb 2016 15:05:05 +0100 > Cc: ofv@wanadoo.es, lokedhs@gmail.com, emacs-devel@gnu.org > > Just as another datapoint in discussion: for me, searching for "l" and > finding "ł" seems a bit weird. (The opposite even more so.) Which is why neither one happens under character folding. > BTW, strangely enough, here isearching for "l" does /not/ find "ł", but > isearching for "a" (with character folding on) finds "ą". Whatever one > thinks about char folding, this is clearly a bug. It's not a bug, it's the feature working as designed: we only fold characters that have suitable decompositions in the Unicode Character Database. So: (get-char-code-property ?ą 'decomposition) => (97 808) but (get-char-code-property ?ł 'decomposition) => (322) IOW, ą is canonically equivalent to the 2-character sequence a ̨ (which is why searching for a finds that character), while ł has no canonical decomposition (nor any other decomposition). This means that the Unicode guys decided that ł should not be equivalent to any other sequence of characters, and therefore Emacs doesn't find it unless you search for it literally. If you want to know why ł doesn't have any decompositions, I suggest to ask on the Unicode mailing list, I'm sure they had good reasons, most probably reasons that came from people who are experts in the Polish language and its intricacies. We just trust the results. ^ permalink raw reply [flat|nested] 102+ messages in thread
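[Editorial aside] The two `get-char-code-property` results quoted above come from the Unicode Character Database, so the same data can be inspected from other environments. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `decomposition_codepoints` is made up for the illustration, and returning the character's own code point when there is no decomposition is a choice made here to mirror the Emacs output shown in the message.

```python
import unicodedata


def decomposition_codepoints(ch):
    """Return the UCD decomposition of ch as a list of code points.

    If the character has no decomposition, return its own code point,
    mirroring the (322) result shown for "ł" above.
    """
    mapping = unicodedata.decomposition(ch)  # e.g. "0061 0328", or "" if none
    if not mapping:
        return [ord(ch)]
    # Compatibility mappings carry a tag such as "<compat>"; skip it here.
    return [int(field, 16) for field in mapping.split()
            if not field.startswith("<")]


if __name__ == "__main__":
    print(decomposition_codepoints("ą"))  # [97, 808]
    print(decomposition_codepoints("ł"))  # [322]
    print(decomposition_codepoints("ñ"))  # [110, 771]
```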
* Re: Character folding in the pretest 2016-02-08 17:48 ` Eli Zaretskii @ 2016-02-08 17:57 ` Werner LEMBERG 2016-02-08 19:18 ` Marcin Borkowski 1 sibling, 0 replies; 102+ messages in thread From: Werner LEMBERG @ 2016-02-08 17:57 UTC (permalink / raw) To: eliz; +Cc: ofv, lokedhs, emacs-devel > This means that the Unicode guys decided that ł should not be > equivalent to any other sequence of characters, and therefore Emacs > doesn't find it unless you search for it literally. Well, I'm suggesting to extend Unicode rules here for the sake of (non-Polish) users. > If you want to know why ł doesn't have any decompositions, I suggest > to ask on the Unicode mailing list, [...] It's quite easy: A decomposition happens only if the modifier at most touches the glyph. A glyph with a strike-through feature (ł, đ, ø, etc.) is thus not decomposable. Werner ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-08 17:48 ` Eli Zaretskii 2016-02-08 17:57 ` Werner LEMBERG @ 2016-02-08 19:18 ` Marcin Borkowski 2016-02-08 19:37 ` Eli Zaretskii ` (3 more replies) 1 sibling, 4 replies; 102+ messages in thread From: Marcin Borkowski @ 2016-02-08 19:18 UTC (permalink / raw) To: Eli Zaretskii; +Cc: ofv, lokedhs, emacs-devel On 2016-02-08, at 18:48, Eli Zaretskii <eliz@gnu.org> wrote: >> From: Marcin Borkowski <mbork@mbork.pl> >> Date: Mon, 08 Feb 2016 15:05:05 +0100 >> Cc: ofv@wanadoo.es, lokedhs@gmail.com, emacs-devel@gnu.org >> >> Just as another datapoint in discussion: for me, searching for "l" and >> finding "ł" seems a bit weird. (The opposite even more so.) > > Which is why neither one happens under character folding. > >> BTW, strangely enough, here isearching for "l" does /not/ find "ł", but >> isearching for "a" (with character folding on) finds "ą". Whatever one >> thinks about char folding, this is clearly a bug. > > It's not a bug, it's the feature working as designed: we only fold > characters that have suitable decompositions in the Unicode Character > Database. So: > > (get-char-code-property ?ą 'decomposition) => (97 808) > > but > > (get-char-code-property ?ł 'decomposition) => (322) > > IOW, ą is canonically equivalent to the 2-character sequence a ̨ (which > is why searching for a finds that character), while ł has no canonical > decomposition (nor any other decomposition). > > This means that the Unicode guys decided that ł should not be > equivalent to any other sequence of characters, and therefore Emacs > doesn't find it unless you search for it literally. > > If you want to know why ł doesn't have any decompositions, I suggest > to ask on the Unicode mailing list, I'm sure they had good reasons, > most probably reasons that came from people who are experts in the > Polish language and its intricacies. We just trust the results. Thanks for the explanation, Eli! 
However, given the number of bugs/quirks in Unicode, I'd personally prefer not to trust them too much. (Though I understand that the Emacs devs /have/ to trust someone, and choosing the Unicode people is probably not a bad idea generally.) Funnily, one of the more annoying bugs in Unicode is connected with quotes, AFAIR. (Why not beat a dead horse? ;-)) And folding "ą" to "a" while not "ł" to "l" is something which most Poles (I guess) would treat as a serious, WTF-level bug. And good luck to all non-Polish people with isearching for the name of Jan Łukasiewicz (just to choose a Lisp-related name;-)). Yet another datapoint suggesting that the issue is really complicated, and that Drew is right: if this is not configurable by users, it might end up more annoying than helping. (Not to say it won't - I trust Artur here.) Best, -- Marcin Borkowski http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski Faculty of Mathematics and Computer Science Adam Mickiewicz University ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-08 19:18 ` Marcin Borkowski @ 2016-02-08 19:37 ` Eli Zaretskii [not found] ` <<83oabrouwj.fsf@gnu.org> ` (2 subsequent siblings) 3 siblings, 0 replies; 102+ messages in thread From: Eli Zaretskii @ 2016-02-08 19:37 UTC (permalink / raw) To: Marcin Borkowski; +Cc: ofv, lokedhs, emacs-devel > From: Marcin Borkowski <mbork@mbork.pl> > Cc: wl@gnu.org, ofv@wanadoo.es, lokedhs@gmail.com, emacs-devel@gnu.org > Date: Mon, 08 Feb 2016 20:18:48 +0100 > > Drew is right: if this is not configurable by users, it might end up > more annoying than helping. It's already configurable, always has been. This is Emacs, right? ^ permalink raw reply [flat|nested] 102+ messages in thread
[parent not found: <<83oabrouwj.fsf@gnu.org>]
* RE: Character folding in the pretest [not found] ` <<83oabrouwj.fsf@gnu.org> @ 2016-02-09 0:04 ` Drew Adams 0 siblings, 0 replies; 102+ messages in thread From: Drew Adams @ 2016-02-09 0:04 UTC (permalink / raw) To: Eli Zaretskii, Marcin Borkowski; +Cc: ofv, lokedhs, emacs-devel > > Drew is right: if this is not configurable by users, it might end up > > more annoying than helping. > > It's already configurable, always have been. This is Emacs, right? What I suggested was introducing easy, flexible, powerful ways to customize/configure. I gave more specifics, including ability to (easily) define multiple equivalence classes, switch among them, combine them in various ways, associate them with given modes, etc. "This is Emacs" and "this is Lisp", therefore you can do nearly anything is not what I had in mind. FWIW, I suggested these things not because otherwise "it might end up more annoying than helping". It's already a useful feature. But it can and should become more useful still. There's no hurry, but there's also no harm in thinking about what ways a user might interact with such possible additional features. ^ permalink raw reply [flat|nested] 102+ messages in thread
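[Editorial aside] The idea of several user-selectable equivalence classes can be sketched as follows. Nothing here corresponds to an existing Emacs option: the `PROTECTED` and `EXTRA` tables, the language keys, and the `fold` signature are all hypothetical, and the letter sets are only illustrative. The sketch assumes two kinds of customization raised in this thread: letters that a given language treats as distinct are exempted from folding, and extra equivalences (such as ł to l) can be added on top of what Unicode decompositions provide.

```python
import unicodedata

# Hypothetical per-language sets of letters that must NOT be folded.
PROTECTED = {
    "sv": set("åäöÅÄÖ"),   # Swedish: å, ä, ö are letters in their own right
    "es": set("ñÑ"),       # Spanish: ñ is distinct, but á folds to a
    "": set(),             # language-agnostic: fold everything decomposable
}

# Hypothetical extra equivalences for letters without a Unicode decomposition.
EXTRA = {"ł": "l", "Ł": "L", "ø": "o", "Ø": "O", "đ": "d", "Đ": "D"}


def fold(text, language="", extra=None):
    """Fold text, honouring per-language exemptions and extra equivalences."""
    keep = PROTECTED.get(language, set())
    extra = extra or {}
    out = []
    # Compose first so that "n" + U+0303 is seen as the single letter "ñ".
    for ch in unicodedata.normalize("NFC", text):
        if ch in keep:
            out.append(ch)
        elif ch in extra:
            out.append(extra[ch])
        else:
            decomposed = unicodedata.normalize("NFD", ch)
            out.append("".join(c for c in decomposed
                               if not unicodedata.combining(c)))
    return "".join(out).casefold()


if __name__ == "__main__":
    print(fold("Mårtenson"))                      # martenson
    print(fold("Mårtenson", language="sv"))       # mårtenson
    print(fold("señor está", language="es"))      # señor esta
    print(fold("Łukasiewicz", extra=EXTRA))       # lukasiewicz
```

With the Swedish table selected, a search for "Martenson" no longer matches "Mårtenson", while the language-agnostic default still does; which table should be the default is exactly the open question in the thread.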
* Re: Character folding in the pretest 2016-02-08 19:18 ` Marcin Borkowski 2016-02-08 19:37 ` Eli Zaretskii [not found] ` <<83oabrouwj.fsf@gnu.org> @ 2016-02-09 12:15 ` Richard Stallman [not found] ` <<E1aT7CM-0005LM-9f@fencepost.gnu.org> 3 siblings, 0 replies; 102+ messages in thread From: Richard Stallman @ 2016-02-09 12:15 UTC (permalink / raw) To: Marcin Borkowski; +Cc: ofv, eliz, lokedhs, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] I think it is clear that people want various different character folding rules. The differences depend partly on what language the text is in, partly on whether the user actually speaks that language, and partly on personal preference. Rather than arguing for an a-priori rule, we should let users show us what they actually like, and then try to find general patterns in those preferences so that we can make general defaults that users tend to like. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 102+ messages in thread
[parent not found: <<E1aT7CM-0005LM-9f@fencepost.gnu.org>]
* RE: Character folding in the pretest [not found] ` <<E1aT7CM-0005LM-9f@fencepost.gnu.org> @ 2016-02-09 15:26 ` Drew Adams 0 siblings, 0 replies; 102+ messages in thread From: Drew Adams @ 2016-02-09 15:26 UTC (permalink / raw) To: rms, Marcin Borkowski; +Cc: ofv, eliz, lokedhs, emacs-devel > I think it is clear that people want various different character > folding rules. The differences depend partly on what language the > text is in, partly on whether the user actually speaks that language, > and partly on personal preference. > > Rather than arguing for an a-priori rule, we should let users show us > what they actually like, and then try to find general patterns in > those preferences so that we can make general defaults that users > tend to like. +1 ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-05 5:09 ` Elias Mårtenson 2016-02-05 6:01 ` Werner LEMBERG @ 2016-02-06 12:58 ` Rasmus 1 sibling, 0 replies; 102+ messages in thread From: Rasmus @ 2016-02-06 12:58 UTC (permalink / raw) To: emacs-devel Elias Mårtenson <lokedhs@gmail.com> writes: > My personal preference is that the expected behaviour of > searches is more related to the locale of the user, rather than > that of the document being searched. In other words, as a > non-Spanish speaker, I'd expect to be able to find ñ when > searching for n, even if the document I'm searching in is in > Spanish. There are definitely an infinite number of > counter-examples to this (enough to keep this thread going for > another 100 messages, I'm sure), but at least there is reason to > consider making the default based on the locale of the user. But what locale? The keyboard makes the most sense, I guess, but plenty of people switch between layouts (native and English, say) and it might be confusing to have different search results based on that. The "main" locale surely will not work IMO. I use a Scando keyboard, my Gnome is set to Spanish, and I mostly compose documents in English, German or Danish.... Rasmus -- Send from my Emacs ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 16:47 ` Óscar Fuentes 2016-02-04 17:05 ` Werner LEMBERG @ 2016-02-04 17:12 ` Eli Zaretskii 2016-02-04 19:35 ` Óscar Fuentes 2016-02-04 17:27 ` Clément Pit--Claudel 2 siblings, 1 reply; 102+ messages in thread From: Eli Zaretskii @ 2016-02-04 17:12 UTC (permalink / raw) To: Óscar Fuentes; +Cc: emacs-devel > From: Óscar Fuentes <ofv@wanadoo.es> > Date: Thu, 04 Feb 2016 17:47:54 +0100 > > I see your point, but you are talking about accents all the time. In > Spanish `n' and `ñ' are different letters. `n' matching `ñ' is no > different than `p' matching `q'. Unicode disagrees: M-: (get-char-code-property ?ñ 'decomposition) RET => (110 771) 110 is 'n' and 771 is U+0303 NON-SPACING TILDE, a combining accent. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 17:12 ` Eli Zaretskii @ 2016-02-04 19:35 ` Óscar Fuentes 2016-02-04 19:52 ` Clément Pit--Claudel 2016-02-04 20:05 ` Eli Zaretskii 0 siblings, 2 replies; 102+ messages in thread From: Óscar Fuentes @ 2016-02-04 19:35 UTC (permalink / raw) To: emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> I see your point, but you are talking about accents all the time. In >> Spanish `n' and `ñ' are different letters. `n' matching `ñ' is no >> different than `p' matching `q'. > > Unicode disagrees: > > M-: (get-char-code-property ?ñ 'decomposition) RET > > => (110 771) > > 110 is 'n' and 771 is U+0303 NON-SPACING TILDE, a combining accent. AFAIK Unicode doesn't mandate what the Spanish alphabet is. I thought that the point of the feature was to provide searching with support for character equivalence classes, which is very useful for the case of Spanish (and other languages, I'm sure). But you are saying that the feature is about how the characters are encoded by the computer and not about how they are used by people. If that is true, it should be disabled by default. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 19:35 ` Óscar Fuentes @ 2016-02-04 19:52 ` Clément Pit--Claudel 2016-02-04 20:05 ` Eli Zaretskii 1 sibling, 0 replies; 102+ messages in thread From: Clément Pit--Claudel @ 2016-02-04 19:52 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 2034 bytes --] On 02/04/2016 02:35 PM, Óscar Fuentes wrote: > Eli Zaretskii <eliz@gnu.org> writes: > >>> I see your point, but you are talking about accents all the time. In >>> Spanish `n' and `ñ' are different letters. `n' matching `ñ' is no >>> different than `p' matching `q'. >> >> Unicode disagrees: >> >> M-: (get-char-code-property ?ñ 'decomposition) RET >> >> => (110 771) >> >> 110 is 'n' and 771 is U+0303 NON-SPACING TILDE, a combining accent. > > AFAIK Unicode doesn't mandate what the Spanish alphabet is. > > I thought that the point of the feature was to provide searching with > support for character equivalence classes, which is very useful for the > case of Spanish (and other languages, I'm sure). But you are saying that > the feature is about how the characters are encoded by the computer and > not about how they are used by people. If that is true, it should be > disabled by default. Why? This feature is simply folding as specified by the Unicode standard. Hopefully the way it is implemented will indeed lend itself to future extensions; using it for user-defined classes of substitutions would be nice. But I don't understand why the possibility of fancier (though less clearly defined) folding should disqualify this feature from becoming the default. Also, it's not easy (I'd guess not possible) to give any sort of precise meaning to ‘how characters are used by people’. I still find this simple character folding quite useful; I just accept that it's visual folding, not semantic folding (and this list is well aware of the difficulties that arise when one tries to assign semantic meaning to characters; cf. the ‘’ vs `' debate). 
The semantics of this simple folding are as uncontroversial as can be; we're following an established standard. Maybe there's a better behaved notion of folding out there, but I'm not sure why its existence is relevant to the choice of a default, since we don't have an implementation (nor a spec) for that alternative. Clément. [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 19:35 ` Óscar Fuentes 2016-02-04 19:52 ` Clément Pit--Claudel @ 2016-02-04 20:05 ` Eli Zaretskii 1 sibling, 0 replies; 102+ messages in thread From: Eli Zaretskii @ 2016-02-04 20:05 UTC (permalink / raw) To: Óscar Fuentes; +Cc: emacs-devel > From: Óscar Fuentes <ofv@wanadoo.es> > Date: Thu, 04 Feb 2016 20:35:53 +0100 > > > M-: (get-char-code-property ?ñ 'decomposition) RET > > > > => (110 771) > > > > 110 is 'n' and 771 is U+0303 NON-SPACING TILDE, a combining accent. > > AFAIK Unicode doesn't mandate what the Spanish alphabet is. I didn't say it did. > I thought that the point of the feature was to provide searching with > support for character equivalence classes It is. > But you are saying that the feature is about how the characters are > encoded by the computer and not about how they are used by > people. If that is true, it should be disabled by default. But it isn't true. This has (almost) nothing to do with encoding, get-char-code-property accesses properties, not encodings. Perhaps you aren't familiar with Unicode equivalence, in which case I suggest these sources: http://unicode.org/reports/tr10/#Searching http://www.unicode.org/notes/tn5/ http://www.unicode.org/reports/tr30/tr30-4.html ^ permalink raw reply [flat|nested] 102+ messages in thread
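The decomposition property quoted above is not specific to Emacs; it comes from the Unicode Character Database. A minimal sketch of the same lookup, assuming Python's standard unicodedata module as a stand-in for get-char-code-property (the helper name decomposition_codepoints is hypothetical):

```python
import unicodedata


def decomposition_codepoints(ch):
    """Return the canonical decomposition of ch as a list of integers.

    Returns an empty list when the character has no decomposition or
    only a compatibility decomposition (those start with a "<tag>").
    """
    raw = unicodedata.decomposition(ch)
    if not raw or raw.startswith("<"):
        return []
    return [int(part, 16) for part in raw.split()]


# U+00F1 is the precomposed n-with-tilde.
print(decomposition_codepoints("\u00f1"))  # [110, 771]
print(unicodedata.name(chr(771)))          # COMBINING TILDE
```

The result is a character property, independent of how the buffer or file happens to be encoded.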
* Re: Character folding in the pretest 2016-02-04 16:47 ` Óscar Fuentes 2016-02-04 17:05 ` Werner LEMBERG 2016-02-04 17:12 ` Eli Zaretskii @ 2016-02-04 17:27 ` Clément Pit--Claudel 2016-02-04 17:34 ` Eli Zaretskii ` (2 more replies) 2 siblings, 3 replies; 102+ messages in thread From: Clément Pit--Claudel @ 2016-02-04 17:27 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 3215 bytes --] On 02/04/2016 11:47 AM, Óscar Fuentes wrote: > Clément Pit--Claudel <clement.pit@gmail.com> writes: > >> On 02/04/2016 10:59 AM, Óscar Fuentes wrote: >>> After seeing the case I mentioned (`n' matching `ñ' in Spanish >>> text) it is obvious that the feature is not ready for prime >>> time. >> >> This is interesting. I guess it boils down to whether you're trying >> to avoid false positives or false negatives. For me the strength of >> this feature is that it lets me find virtually anything using a >> dumb keyboard (one without easy access to accents); I don't care >> too much about false positives (that is, I don't mind if ‘n’ finds >> ‘ñ’). In that sense, it doesn't matter if letters "are different"; >> all that matters is whether they look different. I imagine that's >> why the Unicode standard defined things that way. It seems this >> behavior is consistent with that of most online search engines (I >> tried Google, Bing, and DuckDuckGo; all return accented matches for >> unaccented keywords). > > I see your point, but you are talking about accents all the time. In > Spanish `n' and `ñ' are different letters. `n' matching `ñ' is no > different than `p' matching `q'. I think that you will agree that > some of us will see that behavior as a glaring bug. I should have said diacritics instead of accents; sorry. The difference between n matching ñ and p matching q is that graphically, ñ is n + ~ (it can also be encoded that way: ̃n). Here's another issue that character folding solves; I'd like your thoughts on it.
Try to search the text of my message for 'n' and 'ñ', without any sort of character folding. This will match n but not ñ: ̃n. This will match ñ but not n: ñ. Note that the behaviour has nothing to do with Emacs; most applications will behave the same. The first ñ is using n + combining tilde, while the second is a single character ñ. Both are legal representations of the Spanish letter ñ. With character folding, both match 'n'. This is a much more logical default, I think. The same thing can be said for virtually every diacritic. On a more personal note, I wouldn't see the character folding behaviour as a bug for French, where ç is quite different from c, and é is quite different from e. >> I'm wary of smart solutions based on locale or buffer language. >> It's not uncommon to be writing a single document in multiple >> languages; especially if names are involved. Plus, it's not obvious >> that a single set of settings is enough for each locale. For >> example, one could argue that folding accents makes no sense in >> French: ‘supprimé’ means ‘removed’, but ‘supprime’ means ‘removes’. >> Yet it is not uncommon for people to write the latter for the >> former, especially when using a dumb keyboard. > > I'm not sure how to fix this, but seeing similar reservations from > other users, some language-dependent behavior is unavoidable. I don't think so. An on-off switch seems enough to begin with. Language-dependent folding could be a separate feature; Unicode folding (the current implementation) would be a fine feature to start with, I think. [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 102+ messages in thread
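The two representations described in this message can be reproduced outside Emacs. The following is a rough sketch, not the Emacs implementation: it assumes Python's unicodedata module, and the fold helper (a hypothetical name) simply decomposes the text and drops combining marks. Escapes are used so that the two forms cannot be silently normalized in transit.

```python
import unicodedata


def fold(text):
    """Decompose text and drop combining marks (a crude stand-in for folding)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


precomposed = "a\u00f1o"   # single code point U+00F1
decomposed = "an\u0303o"   # 'n' followed by U+0303 COMBINING TILDE

# Plain substring search tells the two representations apart.
print("\u00f1" in precomposed, "\u00f1" in decomposed)  # True False
print("n" in precomposed, "n" in decomposed)            # False True

# After folding, a search for plain 'n' finds both.
print("n" in fold(precomposed), "n" in fold(decomposed))  # True True
```

Note that the combining mark follows its base character in the decomposed form.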
* Re: Character folding in the pretest 2016-02-04 17:27 ` Clément Pit--Claudel @ 2016-02-04 17:34 ` Eli Zaretskii 2016-02-04 18:18 ` Yuri Khan 2016-02-04 19:46 ` Óscar Fuentes 2 siblings, 0 replies; 102+ messages in thread From: Eli Zaretskii @ 2016-02-04 17:34 UTC (permalink / raw) To: Clément Pit--Claudel; +Cc: emacs-devel > From: Clément Pit--Claudel <clement.pit@gmail.com> > Date: Thu, 4 Feb 2016 12:27:49 -0500 > > > I'm not sure how to fix this, but seeing similar reservations from > > other users, some language-dependent behavior is unavoidable. > > I don't think so. An on-off switch seems enough to begin with. Language-dependent folding could be a separate feature; Unicode folding (the current implementation) would be a fine feature to start with, I think. That's the idea, indeed: the feature currently provides language-independent lax matching; language-dependent variations should follow, once we (a) figure out how to know _the_ language at any given place, and (b) acquire a database of those variations. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 17:27 ` Clément Pit--Claudel 2016-02-04 17:34 ` Eli Zaretskii @ 2016-02-04 18:18 ` Yuri Khan 2016-02-04 19:46 ` Óscar Fuentes 2 siblings, 0 replies; 102+ messages in thread From: Yuri Khan @ 2016-02-04 18:18 UTC (permalink / raw) To: Clément Pit--Claudel; +Cc: Emacs developers On Thu, Feb 4, 2016 at 11:27 PM, Clément Pit--Claudel <clement.pit@gmail.com> wrote: > I should have said diacritics instead of accents; sorry. The difference between n matching ñ and p matching q is that graphically, ñ is n + ~ (it can also be encoded that way: ̃n). This last example is wrong. Combining diacritics always affect the preceding character, not the following. In your example, the tilde is rendered over the space preceding n. If you see the tilde over n, this indicates a bug in the font you are using. (It is fairly common.) ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 17:27 ` Clément Pit--Claudel 2016-02-04 17:34 ` Eli Zaretskii 2016-02-04 18:18 ` Yuri Khan @ 2016-02-04 19:46 ` Óscar Fuentes 2016-02-04 20:06 ` Clément Pit--Claudel 2016-02-04 20:07 ` Eli Zaretskii 2 siblings, 2 replies; 102+ messages in thread From: Óscar Fuentes @ 2016-02-04 19:46 UTC (permalink / raw) To: emacs-devel Clément Pit--Claudel <clement.pit@gmail.com> writes: [snip] It seems that the feature is not geared towards natural language, but for the cases where the user cares about how the character is composed. As mentioned in my answer to Eli, this feature should default to off. Your use case is not typical and is based on usage circumstances (writing French with a US keyboard), personal opinions about what is admissible, or factors depending on your language (maybe French has no case similar to Spanish n/ñ), so I think that it is not convincing enough to change my POV about the default status of the feature. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 19:46 ` Óscar Fuentes @ 2016-02-04 20:06 ` Clément Pit--Claudel 2016-02-04 20:40 ` Óscar Fuentes 2016-02-04 20:07 ` Eli Zaretskii 1 sibling, 1 reply; 102+ messages in thread From: Clément Pit--Claudel @ 2016-02-04 20:06 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 1139 bytes --] On 02/04/2016 02:46 PM, Óscar Fuentes wrote: > Your use case is not typical and is based on usage circunstances > (writing French with a US keyboard), personal opinions about what is > admisible or factors depending on your language (maybe French has no a > similar case of Spanish n/ñ) My name is a good example in French. Clément and Clement are not pronounced the same at all. I gave other examples in other messages. My writing French with an american keyboard has nothing to do with this feature; we're talking about searching, not input methods. > so I think that it is not convincing > enough to change my POV about the default status of the feature. I was not trying to change your POV; mostly to understand it. I think you've described a use case that is not covered by the current implementation (you want character folding to be smart, and to recognize whether the user knows that ñ and n are more different than á and a before folding deciding whether to fold ñ into n). But why should your use case not being covered by the current implementation prevent that implementation from becoming the default? [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 20:06 ` Clément Pit--Claudel @ 2016-02-04 20:40 ` Óscar Fuentes 2016-02-04 20:56 ` Clément Pit--Claudel 0 siblings, 1 reply; 102+ messages in thread From: Óscar Fuentes @ 2016-02-04 20:40 UTC (permalink / raw) To: emacs-devel Clément Pit--Claudel <clement.pit@gmail.com> writes: > My name is a good example in French. Clément and Clement are not > pronounced the same at all. I gave other examples in other messages. Sure, there are plenty of similar cases in Spanish. Every Spaniard knows that "canto" and "cantó" are different words and, most likely, will be not too upset or even happy while seeing isearch locating "cantó" when searching for "canto". But the same doesn't apply to n/ñ. If a Spaniard inputs "sana" on a search box and "saña" is found, he will regard the software as either buggy, dumb or completely oblivious to Spanish culture. I'm unable to make isearch-query-replace work (it gives me "isearch-query-replace: Wrong type argument: stringp, nil") but if the replaced elements are the same that gets found with Isearch, the n/ñ thing can produce lots of hilarious (or embarrassing) anecdotes :-) > I was not trying to change your POV; mostly to understand it. I think > you've described a use case that is not covered by the current > implementation (you want character folding to be smart, and to > recognize whether the user knows that ñ and n are more different than > á and a before folding deciding whether to fold ñ into n). But why > should your use case not being covered by the current implementation > prevent that implementation from becoming the default? We are talking about isearch here, the most basic and accessible way of text searching on Emacs. Introducing a change on how it works with the consequence of creating an "it is not a bug, it is a feature" experience for a fair chunk of the world's population seems like something that should give us pause. 
Personally, I'm fine with disabling the feature on my setup, but I'd advise against setting defaults that appeal to users who see foreign characters as glyphs, instead of thinking of the users who actually see meaning in those characters. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 20:40 ` Óscar Fuentes @ 2016-02-04 20:56 ` Clément Pit--Claudel 2016-02-04 21:16 ` Óscar Fuentes 0 siblings, 1 reply; 102+ messages in thread From: Clément Pit--Claudel @ 2016-02-04 20:56 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 660 bytes --] On 02/04/2016 03:40 PM, Óscar Fuentes wrote: > If a Spaniard inputs "sana" on a search box and "saña" is found, he > will regard the software as either buggy, dumb or completely > oblivious to Spanish culture. Is that true? Here are Google.es results for "sana"; Google seems to be happy to return saña too: > La Agencia Árabe Siria de Noticias > sana.sy/es/ > > saña - Definición - WordReference.com > www.wordreference.com/definicion/saña > > Saná - Wikipedia, la enciclopedia libre > https://es.wikipedia.org/wiki/Saná I'm seeing this both from France and from the US, on Google.es; is it different from Spain? Clément. [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 20:56 ` Clément Pit--Claudel @ 2016-02-04 21:16 ` Óscar Fuentes 0 siblings, 0 replies; 102+ messages in thread From: Óscar Fuentes @ 2016-02-04 21:16 UTC (permalink / raw) To: emacs-devel Clément Pit--Claudel <clement.pit@gmail.com> writes: > On 02/04/2016 03:40 PM, Óscar Fuentes wrote: >> If a Spaniard inputs "sana" on a search box and "saña" is found, he >> will regard the software as either buggy, dumb or completely >> oblivious to Spanish culture. > > Is that true? Here are Google.es results for "sana"; Google seems to be happy to return saña too: > >> La Agencia Árabe Siria de Noticias >> sana.sy/es/ >> >> saña - Definición - WordReference.com >> www.wordreference.com/definicion/saña >> >> Saná - Wikipedia, la enciclopedia libre >> https://es.wikipedia.org/wiki/Saná > > I'm seeing this both from France and from the US, on Google.es; is it different from Spain? It is the same from Spain. Apparently Google is optimized for non-native people who possibly don't see a real difference between `n' and `ñ', or have no method for typing an `ñ' (by law, all keyboards sold in Spain must have a dedicated key for `ñ'). Google is being dumb here (from a Spanish-speaking POV, maybe not so from others' POV). ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 19:46 ` Óscar Fuentes 2016-02-04 20:06 ` Clément Pit--Claudel @ 2016-02-04 20:07 ` Eli Zaretskii 2016-02-04 20:52 ` Óscar Fuentes 1 sibling, 1 reply; 102+ messages in thread From: Eli Zaretskii @ 2016-02-04 20:07 UTC (permalink / raw) To: Óscar Fuentes; +Cc: emacs-devel > From: Óscar Fuentes <ofv@wanadoo.es> > Date: Thu, 04 Feb 2016 20:46:20 +0100 > > It seems that the feature is not geared towards natural language, but > for the cases where the user cares about how the character is composed. You misunderstood. Decomposition is just a tool that is used to search for equivalent character sequences. > As mentioned on my answer to Eli, this feature should default to off. AFAIU, that opinion is based on misunderstanding of what the feature is supposed to do and support. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 20:07 ` Eli Zaretskii @ 2016-02-04 20:52 ` Óscar Fuentes 2016-02-04 20:59 ` Clément Pit--Claudel 2016-02-04 21:08 ` Eli Zaretskii 0 siblings, 2 replies; 102+ messages in thread From: Óscar Fuentes @ 2016-02-04 20:52 UTC (permalink / raw) To: emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> From: Óscar Fuentes <ofv@wanadoo.es> >> Date: Thu, 04 Feb 2016 20:46:20 +0100 >> >> It seems that the feature is not geared towards natural language, but >> for the cases where the user cares about how the character is composed. > > You misunderstood. Decomposition is just a tool that is used to > search for equivalent character sequences. Equivalent in the Unicode sense, right? >> As mentioned in my answer to Eli, this feature should default to off. > > AFAIU, that opinion is based on misunderstanding of what the feature > is supposed to do and support. If my understanding is correct now (the feature is some Unicode thing and not about how characters are used by people) I insist on defaulting to off, unless we give up on making Emacs amenable to those who use a text editor for natural languages. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 20:52 ` Óscar Fuentes @ 2016-02-04 20:59 ` Clément Pit--Claudel 2016-02-04 21:08 ` Eli Zaretskii 1 sibling, 0 replies; 102+ messages in thread From: Clément Pit--Claudel @ 2016-02-04 20:59 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 378 bytes --] On 02/04/2016 03:52 PM, Óscar Fuentes wrote: > I insist on defaulting to off, unless we renounce to make Emacs > amenable to those who use a text editor for natural languages. I think this is a false dichotomy. I use Emacs for natural languages too, and I'm OK with that behavior being the default. It could also be made default in prog-mode but not in text-mode. etc. [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 20:52 ` Óscar Fuentes 2016-02-04 20:59 ` Clément Pit--Claudel @ 2016-02-04 21:08 ` Eli Zaretskii 1 sibling, 0 replies; 102+ messages in thread From: Eli Zaretskii @ 2016-02-04 21:08 UTC (permalink / raw) To: Óscar Fuentes; +Cc: emacs-devel > From: Óscar Fuentes <ofv@wanadoo.es> > Date: Thu, 04 Feb 2016 21:52:08 +0100 > > Eli Zaretskii <eliz@gnu.org> writes: > > > You misunderstood. Decomposition is just a tool that is used to > > search for equivalent character sequences. > > Equivalent in the Unicode sense, right? Equivalent in the following sense: if the text includes ñ (these are 2 separate characters, they are just combined for display), then searching for either n or ñ (a single character in both cases) should find that 2-character sequence. This follows the "canonical equivalence", described in more detail here: http://unicode.org/reports/tr10/#Canonical_Equivalence > If my understanding is correct now (the feature is some Unicode thing > and not about how characters are used by people) I insist on defaulting > to off, unless we renounce to make Emacs amenable to those who use a > text editor for natural languages. It _is_ about how characters are used, see above. And you don't need to insist, you can just turn it off in your sessions. You have heard at least 2 people whose opinions are to the contrary, for various valid reasons. ^ permalink raw reply [flat|nested] 102+ messages in thread
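The canonical equivalence described above can be sketched by normalizing both the search string and the text to the same form before comparing. This is only an illustration assuming Python's unicodedata module, not how Emacs implements its search; canonical_find is a hypothetical helper, and the index it returns refers to the normalized text rather than the original.

```python
import unicodedata


def canonical_find(needle, haystack):
    """Find needle in haystack after normalizing both to NFD.

    Returns the index within the NFD-normalized haystack, or -1.
    """
    n = unicodedata.normalize("NFD", needle)
    h = unicodedata.normalize("NFD", haystack)
    return h.find(n)


two_chars = "man\u0303ana"   # 'n' + U+0303, displayed as one letter
one_char = "ma\u00f1ana"     # precomposed U+00F1

# The precomposed query finds the two-character sequence, and vice versa.
print(canonical_find("\u00f1", two_chars))  # 2
print(canonical_find("n\u0303", one_char))  # 2

# A bare 'n' also matches the base letter of the decomposed sequence,
# which is the lax behaviour under discussion.
print(canonical_find("n", one_char))        # 2
```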
* Re: Character folding in the pretest 2016-02-04 16:36 ` Clément Pit--Claudel 2016-02-04 16:47 ` Óscar Fuentes @ 2016-02-04 20:23 ` John Wiegley 1 sibling, 0 replies; 102+ messages in thread From: John Wiegley @ 2016-02-04 20:23 UTC (permalink / raw) To: Clément Pit--Claudel; +Cc: emacs-devel >>>>> Clément Pit--Claudel <clement.pit@gmail.com> writes: > For me the strength of this feature is that it lets me find virtually > anything using a dumb keyboard (one without easy access to accents); I > don't care too much about false positives (that is, I don't mind if ‘n’ > finds ‘ñ’). Going beyond natural languages, there have been a few times when I've wanted to search for equivalence expressions in an Agda file, for example, but really I want it to match against anything similar, so typing "x = y", I'd like it to find occurrences using ≈ ≅ ≃ ≡ =, etc. This sort of lax searching is like taking a "quotient" of your buffer based on the equivalence classes you're interested in, and then searching against that version of the buffer. And there are many quotients to be taken, for many reasons. A locale-based quotient for natural language text seems like a reasonable default, unless pretesting/polling shows us otherwise. However, there will always be times when you don't want it, or you want a different quotient altogether, or even various combinations of them. -- John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2 ^ permalink raw reply [flat|nested] 102+ messages in thread
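The "quotient" idea above can be made concrete with a user-defined equivalence table. The sketch below is purely illustrative: the table contents and the names EQUIVALENCE_CLASSES, quotient and lax_find_all are assumptions, not an existing Emacs or Agda facility. Each member of a class is mapped to one representative, and the search then runs against the mapped text.

```python
# Hypothetical equivalence classes: representative -> members.
EQUIVALENCE_CLASSES = {
    "=": "=\u2248\u2245\u2243\u2261",  # = and the signs U+2248, U+2245, U+2243, U+2261
}


def build_table(classes):
    """Build a str.translate table sending each member to its representative."""
    table = {}
    for representative, members in classes.items():
        for member in members:
            table[ord(member)] = representative
    return table


TABLE = build_table(EQUIVALENCE_CLASSES)


def quotient(text):
    """Collapse every character to the representative of its class."""
    return text.translate(TABLE)


def lax_find_all(needle, haystack):
    """Return start offsets of needle in haystack, modulo the classes.

    The mapping is one character to one character, so offsets in the
    quotient text are also valid offsets in the original text.
    """
    q_needle, q_hay = quotient(needle), quotient(haystack)
    positions = []
    start = q_hay.find(q_needle)
    while start != -1:
        positions.append(start)
        start = q_hay.find(q_needle, start + 1)
    return positions


text = "x \u2261 y and x = y and x \u2248 y"
print(lax_find_all("x = y", text))  # [0, 10, 20]
```

Swapping in a different table gives a different quotient, which is one way to picture the "various combinations" mentioned above.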
* Re: Character folding in the pretest 2016-02-04 15:59 ` Óscar Fuentes 2016-02-04 16:36 ` Clément Pit--Claudel @ 2016-02-04 17:07 ` Eli Zaretskii 2016-02-04 17:31 ` Clément Pit--Claudel 1 sibling, 1 reply; 102+ messages in thread From: Eli Zaretskii @ 2016-02-04 17:07 UTC (permalink / raw) To: Óscar Fuentes; +Cc: emacs-devel > From: Óscar Fuentes <ofv@wanadoo.es> > Date: Thu, 04 Feb 2016 16:59:18 +0100 > > After seeing the case I mentioned (`n' matching `ñ' in Spanish text) > it is obvious that the feature is not ready for prime time. The feature was _designed_ to do this, so it simply works as designed. It can be turned off if you don't like the results, but saying it isn't ready based on that is IMO inaccurate, if not incorrect. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 17:07 ` Eli Zaretskii @ 2016-02-04 17:31 ` Clément Pit--Claudel 0 siblings, 0 replies; 102+ messages in thread From: Clément Pit--Claudel @ 2016-02-04 17:31 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 1117 bytes --] On 02/04/2016 12:07 PM, Eli Zaretskii wrote: >> From: Óscar Fuentes <ofv@wanadoo.es> >> Date: Thu, 04 Feb 2016 16:59:18 +0100 >> >> After seeing the case I mentioned (`n' matching `ñ' in Spanish text) >> it is obvious that the feature is not ready for prime time. > > The feature was _designed_ to do this, so it simply works as designed. > It can be turned off if you don't like the results, but saying it > isn't ready based on that is IMO inaccurate, if not incorrect. I agree. Maybe we're just discussing two different features? * One is unicode standard character folding; it's implemented, it works as designed, it has very clear semantics based on a recognized standard, but we're not sure if it should be enabled by default (I'd vote yes). The other is language-dependent character folding; it isn't implemented (though some people think it could reuse some of the architecture used for unicode folding), it doesn't have clear semantics (it's a matter of user-preference, though we might be able to come up with good defaults), and many people would love such a feature. Clément. [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 102+ messages in thread
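The difference between the two features distinguished above can be sketched as follows. This is a hypothetical illustration only: the PROTECTED table, its contents and the lang parameter are assumptions made for the example, not data or an interface from Emacs or from Unicode. Language-independent folding strips every diacritic; the language-dependent variant exempts letters that a given language treats as distinct.

```python
import unicodedata

# Hypothetical per-language lists of letters that must not be folded.
PROTECTED = {
    "es": "\u00f1\u00d1",                          # n-tilde
    "sv": "\u00e5\u00e4\u00f6\u00c5\u00c4\u00d6",  # a-ring, a-diaeresis, o-diaeresis
}


def fold(text, lang=None):
    """Strip diacritics, except from letters protected for lang."""
    keep = PROTECTED.get(lang, "")
    out = []
    # Compose first so protected letters appear as single code points.
    for ch in unicodedata.normalize("NFC", text):
        if ch in keep:
            out.append(ch)
        else:
            decomposed = unicodedata.normalize("NFD", ch)
            out.append("".join(c for c in decomposed
                               if not unicodedata.combining(c)))
    return "".join(out)


def matches(needle, haystack, lang=None):
    """True if needle occurs in haystack after folding both."""
    return fold(needle, lang) in fold(haystack, lang)


print(matches("sana", "sa\u00f1a"))             # True  (language-independent)
print(matches("sana", "sa\u00f1a", "es"))       # False (n-tilde protected)
print(matches("canto", "cant\u00f3", "es"))     # True  (accent still folded)
print(matches("varpa", "v\u00e4rpa", "sv"))     # False (a-diaeresis protected)
```

The open questions raised in the thread remain outside this sketch: how to decide which language applies at a given place, and where the per-language data would come from.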
* Re: Character folding in the pretest 2016-02-04 15:18 ` Drew Adams 2016-02-04 15:59 ` Óscar Fuentes @ 2016-02-04 23:05 ` Artur Malabarba 2016-02-06 9:37 ` Per Starbäck 1 sibling, 1 reply; 102+ messages in thread From: Artur Malabarba @ 2016-02-04 23:05 UTC (permalink / raw) To: Drew Adams; +Cc: Dirk-Jan C. Binnema, emacs-devel >> > It would make sense to have the default based on the session's locale, >> Character equivalence is based on the language(s) of whatever is in your >> buffer, > That is really where the design effort should be, at this point. > We have a basic char-folding mechanism, but we do not yet provide > an easy way for a user to customize the behavior, let alone to > define/get the various behaviors that s?he might want in different > contexts. FTR, like I've said a couple of times already, I will invest more time into making this customizable once I've seen how it's received. Also (and this I haven't said yet) I do plan on providing a better default depending on locale. When the time comes to actually implement it I'll explain why I prefer locale (over some notion of buffer-local language). ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 23:05 ` Artur Malabarba @ 2016-02-06 9:37 ` Per Starbäck 2016-02-06 10:41 ` Eli Zaretskii 0 siblings, 1 reply; 102+ messages in thread From: Per Starbäck @ 2016-02-06 9:37 UTC (permalink / raw) To: Artur Malabarba; +Cc: Dirk-Jan C. Binnema, Drew Adams, emacs-devel Oscar Fuentes wrote: > If a Spaniard inputs "sana" on a search box and "saña" is found, he > will regard the software as either buggy, dumb or completely > oblivious to Spanish culture. Similar to my example of how a Swede would see a search for "varpa" finding "värpa" or "varpå" (all three being existing, totally different words). When met with the "argument" that not many people speak Swedish anyway I replied that it was only an example of what I knew best, and that there probably were similar examples in several other languages. I'm glad to hear there is one in Spanish, one of the largest languages of the world. Now let's count the number of affected people again! :) That character folding is dependent on locale is of course well-known by those who work on this. Artur Malabarba wrote: > FTR, like I've said a couple of times already, I will invest more time > into making this customizable once I've seen how it's received. > Also (and this I haven't said yet) I do plan on providing a better > default depending on locale. When the time comes to actually implement > it I'll explain why I prefer locale (over some notion of buffer-local > language). When Artur again confirmed that he is fine with having the new feature turned off in Emacs 25 with the intention of having it turned on later, after it has had enough testing, I thought this would finally be settled. But evidently not yet... The opponents have argued as if this is something mandated by Unicode, so we can do nothing about it but to follow. It doesn't matter if the result is seen as buggy or dumb by users. "This feature is simply folding as specified by the Unicode standard".
That is not so. Of course the Unicode Consortium is well aware of the issues that I, Oscar and others are pointing out, and that I'm sure Artur is well aware of. Eli Zaretskii: > Perhaps you aren't familiar with Unicode equivalence, in which case I > suggest these sources: > > http://unicode.org/reports/tr10/#Searching > http://www.unicode.org/notes/tn5/ > http://www.unicode.org/reports/tr30/tr30-4.html But of course these take up issues like we have mentioned here. The first one mentions the aa/å equivalence in Danish for example. And to quote the last one: # In the general case, different search term foldings are applied for # different languages. For example, accent distinctions are ignorable # for some languages, but not for others. In English the accent in # words like naïve is optional, while to a Swedish user 'o' and 'ö' # are distinct letters. That is by the way the last draft of a withdrawn technical report. Draft UTR #30: Unicode Character Foldings has been withdrawn. It was never formally approved; the last public version was a draft UTR, which can be found at http://www.unicode.org/reports/tr30/tr30-4.html. That shows not only that the issues I, Oscar and others are mentioning are not something new that we just thought of that Unicode somehow should have us ignore. It also shows that there *is* no technical report on Unicode Character Foldings. We have to break out of the circles this is going in. John Wiegley wrote: > A locale-based quotient for natural language text seems like a reasonable > default, unless pretesting/polling shows us otherwise. However, there will > always be times when you don't want it, or you want a different quotient > altogether, or even various combinations of them. Yes, that would be a good default, but that's not a default that we can have in the next Emacs, though there are great prospects that we can have it in the one after that. Please John, put your foot down and don't let this continue ad infinitum.
The options we have are instead: (1) Let the default be as searching has worked before. Nothing gets worse for anyone. We'll have the start of a new exciting feature available, that will be just right for many users, and that will be tried by a lot of others as well, giving feedback for the continued development that Artur has written that he already is planning. (2) Make the fundamental feature searching work fundamentally different out of the box in a way that for many users will be seen as neat, and for many users will be seen as "buggy, dumb or completely oblivious to" the user's culture. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-06 9:37 ` Per Starbäck @ 2016-02-06 10:41 ` Eli Zaretskii 2016-02-06 12:52 ` Rasmus 2016-02-06 14:24 ` Ken Brown 0 siblings, 2 replies; 102+ messages in thread From: Eli Zaretskii @ 2016-02-06 10:41 UTC (permalink / raw) To: Per Starbäck; +Cc: djcb, drew.adams, bruce.connor.am, emacs-devel > Date: Sat, 6 Feb 2016 10:37:06 +0100 > From: Per Starbäck <per.starback@gmail.com> > Cc: "Dirk-Jan C. Binnema" <djcb@djcbsoftware.nl>, > Drew Adams <drew.adams@oracle.com>, emacs-devel <emacs-devel@gnu.org> > > From the opposers it has been argued as if this is something > mandated by Unicode, so we can do nothing about it but to follow. No one said anything like that. The references to the Unicode Standard and its various data and TRs are to make the point that the feature as implemented is based on sound principles and not on some arbitrary criteria. No one said the feature is "mandated" in any way, shape or form. Whether the feature should be turned on by default is a matter only we the Emacs community will decide. > It doesn't matter if the result is seen as buggy or dumb by > users. "This feature is simply folding as specified by the Unicode > standard". The Unicode Standard specifies _how_ to fold during search. It also includes recommendations _when_ to fold. It doesn't mandate anything, and even if it did, we don't need to heed that. Your arguments in this part are a red herring. > That is not so. Of course the Unicode Consortium is well aware of the > issues that I, Oscar and others are pointing out, and that I'm sure > Artur is well aware of. We are all aware of that, please give us credit that we know something about the issues involved. It is you who seem to misunderstand important aspects of this, see below.
> Eli Zaretskii: > > Perhaps you aren't familiar with Unicode equivalence, in which case I > > suggest these sources: > > > > http://unicode.org/reports/tr10/#Searching > > http://www.unicode.org/notes/tn5/ > > http://www.unicode.org/reports/tr30/tr30-4.html > > But of course these take up issues like we have mentioned here. The > first one mentions the aa/å equivalence in Danish for example. And to > quote the last one: > > # In the general case, different search term foldings are applied for > # different languages. For example, accent distinctions are ignorable > # for some languages, but not for others. In English the accent in > # words like naïve is optional, while to a Swedish user 'o' and 'ö' > # are distinct letters. It seems that you have read only the parts that confirm your views in your eyes, and skipped or dismissed the rest. And now you are spreading your misunderstanding among others. The facts are different. Unicode indeed recognizes that different languages change the rules to some degree. However, it defines several distinct degrees of conformance, and what we have now is the lowest possible level of conformance, the one that is not tailored to any particular language. See Section 3.8 of TR#10, referenced above, and Table 13 there. What we in fact implemented is the default collation weights, which are independent of language tailoring. This is similar to the data we use for case-folding: it doesn't include any language-specific tailoring, and so in some cases, like Turkish dotless i issue, produces results that are incorrect in the context of some specific languages. Still we use it, and it generally works very well. In the long run, we should add language-specific tailoring to this and other similar features. Currently, we lack the infrastructure for doing that in a useful way, so this further development must wait. 
But it doesn't mean the feature isn't useful as it is now, and several participants in this thread explicitly said they like what the feature gives them. Which doesn't surprise me, because it matches the advice in the Unicode Standard, so I know we are on the right path. > That is by the way the last draft of a withdrawn technical report. (So why are you quoting from it and claiming that it supports your POV? If it's indeed a useless, withdrawn draft, then it has no relevance at all, right? Please decide whether you want to treat that report seriously or not, and please be consistent with your decision. Trying to have the cake and also eat it doesn't add credibility to your opinions.) > Draft UTR #30: Unicode Character Foldings has been withdrawn. It was > never formally approved; the last public version was a draft > UTR, which can be found at > http://www.unicode.org/reports/tr30/tr30-4.html. Actually, that draft was mentioned because it includes interesting and important stuff not mentioned in one place in any other publication I know of. I referred to it under the assumption that the reader will be keenly interested in learning as much relevant background information about the subject as possible, even if the report itself never made it to the official status. > We have to break out of the circles this is going in. There are no circles. We wanted to collect feedback, and we are collecting it. The pretest has been going on for merely one week, and the feedback we have already is useful, and it keeps coming in. Stopping that and making the decision now makes no sense to me. The release is still quite far away, and we have nothing to lose by hearing from more people. Assuming we want to make an informed decision, there's no rush. > Please John, put your foot down and don't let this continue ad > infinitum. No one intends to continue "ad infinitum". That's another red herring. We should continue collecting feedback for a couple more of pretest releases, that's all.
Then we can make the decision based on that feedback. I counted 10 people (excluding myself and Artur) who expressed their clear opinions in this thread; that is way too few for an intelligent decision, IMO. > The options we have are instead: > > (1) Let the default be as searching has worked before. Nothing gets > worse for anyone. > > We'll have the start of a new exciting feature available, that will be just > right for many users, and that will be tried by a lot of others as well, > giving feedback for the continued development that Artur has written > that he already is planning. > > (2) Make the fundamental feature searching work fundamentally > different out of the box in a way that for many users will be seen as > neat, and for many users will be seen as "buggy, dumb or completely > oblivious to" the user's culture. With all due respect, I don't think this is an objective description of the alternatives. ^ permalink raw reply [flat|nested] 102+ messages in thread
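As a small illustration of the untailored, language-independent data discussed in the message above, the following Python sketch uses `str.casefold()`, which applies Unicode's default case folding with no language tailoring. It is only an analogy for the case-folding data Emacs uses, not its implementation; the helper name `caseless_equal` is made up for this example.

```python
def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings using Unicode default (untailored) case folding."""
    return a.casefold() == b.casefold()


# The default folding works as expected for most text.
print(caseless_equal("EMACS", "emacs"))

# Turkish pairs dotless I with dotless ı (U+0131), but the untailored
# folding pairs I with i. That suits English and is wrong for Turkish.
print(caseless_equal("I", "i"))
print(caseless_equal("I", "\u0131"))
```

The first two comparisons succeed and the third does not, which is the kind of language-specific miss that tailoring would address while the default data stays useful for most text.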
* Re: Character folding in the pretest 2016-02-06 10:41 ` Eli Zaretskii @ 2016-02-06 12:52 ` Rasmus 2016-02-06 14:31 ` Eli Zaretskii 2016-02-06 14:24 ` Ken Brown 1 sibling, 1 reply; 102+ messages in thread From: Rasmus @ 2016-02-06 12:52 UTC (permalink / raw) To: emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > No one intends to continue "ad infinitum". That's another red > herring. We should continue collecting feedback for a couple > more of pretest releases, that's all. Then we can make the > decision based on that feedback. I counted 10 people (excluding > myself and Artur) who expressed their clear opinions in this > thread; that is way too few for an intelligent decision, IMO. My language probably does not fit the agnostic approach. Nonetheless, this is an awesome feature and I think it should be on by default. Thanks for working on this to all those who have done so! Rasmus -- C is for Cookie ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-06 12:52 ` Rasmus @ 2016-02-06 14:31 ` Eli Zaretskii 0 siblings, 0 replies; 102+ messages in thread From: Eli Zaretskii @ 2016-02-06 14:31 UTC (permalink / raw) To: Rasmus; +Cc: emacs-devel > From: Rasmus <rasmus@gmx.us> > Date: Sat, 06 Feb 2016 13:52:23 +0100 > > My language probably does not fit the agnostic approach. > Nonetheless, this is an awesome feature and I think it should be > on by default. > > Thanks for working on this to all those who have done so! Thank you for your feedback. (Most of the credit for the actual work goes to Artur, of course.) ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-06 10:41 ` Eli Zaretskii 2016-02-06 12:52 ` Rasmus @ 2016-02-06 14:24 ` Ken Brown 2016-02-06 15:07 ` Eli Zaretskii 1 sibling, 1 reply; 102+ messages in thread From: Ken Brown @ 2016-02-06 14:24 UTC (permalink / raw) To: Eli Zaretskii, Per Starbäck Cc: djcb, bruce.connor.am, drew.adams, emacs-devel On 2/6/2016 5:41 AM, Eli Zaretskii wrote: > No one intends to continue "ad infinitum". That's another red > herring. We should continue collecting feedback for a couple more of > pretest releases, that's all. Then we can make the decision based on > that feedback. I counted 10 people (excluding myself and Artur) who > expressed their clear opinions in this thread; that is way too few for > an intelligent decision, IMO. I'll add one more. I like character folding in its present form, and I will use it whether it's on by default or not. As to whether it should be on by default, I agree with those who say it's too early to make that decision. Ken ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-06 14:24 ` Ken Brown @ 2016-02-06 15:07 ` Eli Zaretskii 0 siblings, 0 replies; 102+ messages in thread From: Eli Zaretskii @ 2016-02-06 15:07 UTC (permalink / raw) To: Ken Brown; +Cc: per.starback, djcb, bruce.connor.am, drew.adams, emacs-devel > Cc: djcb@djcbsoftware.nl, drew.adams@oracle.com, bruce.connor.am@gmail.com, > emacs-devel@gnu.org > From: Ken Brown <kbrown@cornell.edu> > Date: Sat, 6 Feb 2016 09:24:24 -0500 > > I'll add one more. I like character folding in its present form, and I > will use it whether it's on by default or not. > > As to whether it should be on by default, I agree with those who say > it's too early to make that decision. Thanks for the feedback. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 11:57 ` Dirk-Jan C. Binnema 2016-02-04 15:18 ` Drew Adams @ 2016-02-04 16:54 ` Eli Zaretskii 2016-02-04 17:36 ` Paul Eggert 2016-02-04 17:26 ` Teemu Likonen 2 siblings, 1 reply; 102+ messages in thread From: Eli Zaretskii @ 2016-02-04 16:54 UTC (permalink / raw) To: Dirk-Jan C. Binnema; +Cc: emacs-devel > From: "Dirk-Jan C. Binnema" <djcb@djcbsoftware.nl> > Date: Thu, 04 Feb 2016 13:57:36 +0200 > > > What type of character equivalence should be used is locale-dependent. > > Everybody here agrees with that. Thus, the solution must also be > > locale-dependent. > > > It would make sense to have the default based on the session's locale, > > meaning that in a Swedish locale a, ä and å would be different and n and ñ > > be different, but under a Spanish locale, the opposite would be true. > > Character equivalence is based on the language(s) of whatever is in your > buffer, which might be correlated with your locale, but not more than > that. Indeed. Emacs is a multilingual environment, so any assumption that the main language in every buffer, or even in most buffers, is likely to be the locale's language will misfire. Also, Emacs has features that need to match characters that didn't come from human-readable text at all, like file names. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 16:54 ` Eli Zaretskii @ 2016-02-04 17:36 ` Paul Eggert 2016-02-04 17:45 ` Eli Zaretskii 0 siblings, 1 reply; 102+ messages in thread From: Paul Eggert @ 2016-02-04 17:36 UTC (permalink / raw) To: Eli Zaretskii, Dirk-Jan C. Binnema; +Cc: emacs-devel On 02/04/2016 08:54 AM, Eli Zaretskii wrote: > Emacs is a multilingual environment, so any assumption that > the main language in every buffer, or even in most buffers, is likely > to be the locale's language will misfire. True, but although Emacs is designed to be language-agnostic when handling buffer text, that doesn't mean it should be designed to be language-agnostic when handling user input. If Emacs starts up in a language-X locale, its user probably will be more comfortable using language-X rules for searching, even if the main language in a buffer is language Y. As an English-speaker when I search Swedish texts by hand, I normally want to use English-like rules because English is what I know and I can't really read the Swedish anyway. In English we tend to consider accents unimportant when searching, and because we treat “naïve” like “naive” we also treat “Ångström” like “Angstrom” even though the latter is not correct in Swedish. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 17:36 ` Paul Eggert @ 2016-02-04 17:45 ` Eli Zaretskii 2016-02-04 19:25 ` Paul Eggert 0 siblings, 1 reply; 102+ messages in thread From: Eli Zaretskii @ 2016-02-04 17:45 UTC (permalink / raw) To: Paul Eggert; +Cc: djcb, emacs-devel > Cc: emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Thu, 4 Feb 2016 09:36:47 -0800 > > On 02/04/2016 08:54 AM, Eli Zaretskii wrote: > > Emacs is a multilingual environment, so any assumption that > > the main language in every buffer, or even in most buffers, is likely > > to be the locale's language will misfire. > > True, but although Emacs is designed to be language-agnostic when > handling buffer text, that doesn't mean it should be designed to be > language-agnostic when handling user input. The user input in this case is a search string. A search string is likely to use the language of the text being searched, not the language of the user's locale. E.g., when I search Cyrillic text, I will hardly ever use Hebrew, my locale language. > As an English-speaker when I search Swedish texts by hand, I > normally want to use English-like rules because English is what I > know and I can't really read the Swedish anyway. I'm not sure this is the use case we should cater to. We should instead cater to users who search text they _can_ read. > In English we tend to consider accents unimportant when searching Amazingly enough, Unicode advises the same. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 17:45 ` Eli Zaretskii @ 2016-02-04 19:25 ` Paul Eggert 2016-02-04 19:36 ` Eli Zaretskii 0 siblings, 1 reply; 102+ messages in thread From: Paul Eggert @ 2016-02-04 19:25 UTC (permalink / raw) To: Eli Zaretskii; +Cc: djcb, emacs-devel On 02/04/2016 09:45 AM, Eli Zaretskii wrote: > We should instead cater to users who search text they _can_ read. This depends on what one means by "read". I can "read" Swedish in the sense that I know where the word boundaries are and have some idea of how they're pronounced. I can also "read" Belarusian in the sense that I know Cyrillic and a bit of Russian and can follow Belarusian better than Swedish, though I easily get lost. In both cases, I'd prefer Unicode-type case folding even though it's "wrong" to ignore diacritics in the native languages. Conversely, I can't "read" Hebrew or Chinese or Arabic in the same sense and so don't much care how folding works for those languages. Perhaps some Hebrew-speaking experts want פּ and פ and ף to be treated the same while searching, while other experts do not; it doesn't matter to me. To help provide context here, most of my reading of non-English text is to support other free projects such as the tz database. That database is mostly English but contains short passages from other languages. I use Emacs for primary database maintenance, but often use other programs to browse the Internet as they're more convenient. I'll cut and paste out of a Firefox browser between a page of interest and Google Translate, for example. Examples of text under Emacs control include "Bahía", "Lịch hai thế kỷ", "中国科技史料", and "Новый счет времени". Most of the searching for this sort of thing in Emacs will involve typing strings like "bahia" and "lich" where I almost always prefer diacritic- and case-folded search. ^ permalink raw reply [flat|nested] 102+ messages in thread
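The kind of search described in the message above, where typing "bahia" or "lich" should find "Bahía" or "Lịch", can be sketched in Python as follows. This is a rough approximation under stated assumptions, not how Emacs implements character folding: it decomposes text with the standard `unicodedata` module, drops combining marks, and case-folds. The helper names `fold` and `folded_find` are hypothetical.

```python
import unicodedata


def fold(text: str) -> str:
    """Decompose canonically, drop combining marks, then case-fold."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def folded_find(needle: str, haystack: str) -> bool:
    """Return True if needle occurs in haystack after folding both."""
    return fold(needle) in fold(haystack)


print(folded_find("bahia", "Bahía"))
print(folded_find("lich", "Lịch hai thế kỷ"))
print(folded_find("angstrom", "Ångström"))
```

Folding both the search string and the text means a plain-ASCII query matches accented text; a reader who treats å or ö as separate letters would see the third match as wrong, which is the disagreement in this thread.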
* Re: Character folding in the pretest 2016-02-04 19:25 ` Paul Eggert @ 2016-02-04 19:36 ` Eli Zaretskii 0 siblings, 0 replies; 102+ messages in thread From: Eli Zaretskii @ 2016-02-04 19:36 UTC (permalink / raw) To: Paul Eggert; +Cc: djcb, emacs-devel > Cc: djcb@djcbsoftware.nl, emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Thu, 4 Feb 2016 11:25:41 -0800 > > On 02/04/2016 09:45 AM, Eli Zaretskii wrote: > > We should instead cater to users who search text they _can_ read. > > This depends on what one means by "read". I can "read" Swedish in the > sense that I know where the word boundaries are and have some idea of > how they're pronounced. I can also "read" Belarusian in the sense that I > know Cyrillic and a bit of Russian and can follow Belarusian better than > Swedish, though I easily get lost. In both cases, I'd prefer > Unicode-type case folding even though it's "wrong" to ignore diacritics > in the native languages. Then the current defaults are definitely for you, I think. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 11:57 ` Dirk-Jan C. Binnema 2016-02-04 15:18 ` Drew Adams 2016-02-04 16:54 ` Eli Zaretskii @ 2016-02-04 17:26 ` Teemu Likonen 2016-02-05 8:08 ` Adrian.B.Robert 2 siblings, 1 reply; 102+ messages in thread From: Teemu Likonen @ 2016-02-04 17:26 UTC (permalink / raw) To: Dirk-Jan C. Binnema; +Cc: emacs-devel Dirk-Jan C. Binnema [2016-02-04 13:57:36+02] wrote: > Regardless, for the purpose of searching, my personal preference would > be to make folding rather inclusive; I don't really care about the > exact rules languages have come up with for what letters are considered > "the same", I just care for what I, as a user, would find the easiest > to match. > That would also get my vote as a reasonable default for case-folding > in searches. But I'll happily take any default, as long as there's a > way to get the above behavior, preferably without having to change my > locale. I think that just a global setting and an easy switch like M-s <something> in isearch prompt is enough. I fear that any locale or language based magic or intelligence is over-engineering and may cause annoying surprises. Unexpected intelligence can be harmful too. -- /// Teemu Likonen - .-.. <https://github.com/tlikonen> // // PGP: 4E10 55DC 84E9 DFF6 13D7 8557 719D 69D3 2453 9450 /// ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 17:26 ` Teemu Likonen @ 2016-02-05 8:08 ` Adrian.B.Robert 0 siblings, 0 replies; 102+ messages in thread From: Adrian.B.Robert @ 2016-02-05 8:08 UTC (permalink / raw) To: emacs-devel Teemu Likonen <tlikonen@iki.fi> writes: > Dirk-Jan C. Binnema [2016-02-04 13:57:36+02] wrote: > >> Regardless, for the purpose of searching, my personal preference would >> be to make folding rather inclusive; I don't really care about the >> exact rules languages have come up with for what letters are considered >> "the same", I just care for what I, as a user, would find the easiest >> to match. > >> ... > I think that just a global setting and easy switch like M-s <something> > in isearch prompt is enough. I fear that any locale or language based > magic or intelligence is over-engineering and may cause annoying > surprises. Unexpected intelligence can be harmful too. +1 I sense a strong enmity between the perfect and the good here. "Dumb" (unicode-equivalence-based) character folding is a godsend for searching through texts when using the "wrong" keyboard layout, for whatever reason. It also matches expectations from using search engines, etc. And exact matching can handle the need for precision. Using default=exact with an easy global option for switching to unicode-folding will be a great step forward. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-04 8:40 ` Elias Mårtenson 2016-02-04 11:57 ` Dirk-Jan C. Binnema @ 2016-02-04 21:32 ` Richard Stallman 2016-02-08 14:12 ` Marcin Borkowski 1 sibling, 1 reply; 102+ messages in thread From: Richard Stallman @ 2016-02-04 21:32 UTC (permalink / raw) To: Elias Mårtenson; +Cc: ofv, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > It would make sense to have the default based on the session's locale, Maybe the locale should be the ultimate default, but I think we should try to tie this to something else people specify in Emacs. We have something called the language environment that we could connect this to. Perhaps we need another temporary and buffer-specific language setting. It could control this, select the spelling dictionary, select a default input method, and more. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 102+ messages in thread
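One way to picture a language setting that controls folding, as suggested in the message above, is a per-language exception table layered on top of the generic folding. The Python sketch below is purely illustrative: the `PROTECTED` table, the language code "sv", and the function `fold_for` are assumptions made up for this example and do not correspond to any Emacs variable or to Unicode's actual tailoring data.

```python
import unicodedata

# Hypothetical tailoring table: letters that a language treats as distinct
# base letters and that should therefore survive folding unchanged.
PROTECTED = {
    "sv": set("åäö"),  # Swedish: å, ä and ö are letters in their own right
}


def fold_for(text, lang=None):
    """Fold case and diacritics, except letters protected for lang."""
    keep = PROTECTED.get(lang, set())
    out = []
    # Case-fold first, then recompose so each letter is a single character.
    for ch in unicodedata.normalize("NFC", text.casefold()):
        if ch in keep:
            out.append(ch)
        else:
            decomposed = unicodedata.normalize("NFD", ch)
            out.append("".join(c for c in decomposed
                               if not unicodedata.combining(c)))
    return "".join(out)


print(fold_for("Ångström"))        # generic folding
print(fold_for("Ångström", "sv"))  # Swedish setting keeps å and ö
```

With no language given, every diacritic is folded away; with the hypothetical Swedish setting, "o" no longer matches "ö" while an unprotected letter such as "ï" in "naïve" is still folded.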
* Re: Character folding in the pretest 2016-02-04 21:32 ` Richard Stallman @ 2016-02-08 14:12 ` Marcin Borkowski 0 siblings, 0 replies; 102+ messages in thread From: Marcin Borkowski @ 2016-02-08 14:12 UTC (permalink / raw) To: rms; +Cc: ofv, Elias Mårtenson, emacs-devel On 2016-02-04, at 22:32, Richard Stallman <rms@gnu.org> wrote: > > It would make sense to have the default based on the session's locale, > > Maybe the locale should be the ultimate default, but I think we should > try to tie this to something else people specify in Emacs. > > We have something called the language environment that we could > connect this to. > > Perhaps we need another temporary and buffer-specific language setting. > It could control this, select the spelling dictionary, select a default > input method, and more. Yes, we need it. Wasn't that discussed some time ago? -- Marcin Borkowski http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski Faculty of Mathematics and Computer Science Adam Mickiewicz University ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-03 16:54 ` Clément Pit--Claudel 2016-02-03 17:01 ` John Wiegley @ 2016-02-03 17:02 ` Eli Zaretskii 1 sibling, 0 replies; 102+ messages in thread From: Eli Zaretskii @ 2016-02-03 17:02 UTC (permalink / raw) To: Clément Pit--Claudel; +Cc: emacs-devel > From: Clément Pit--Claudel <clement.pit@gmail.com> > Date: Wed, 3 Feb 2016 11:54:41 -0500 > > On 02/03/2016 10:41 AM, Eli Zaretskii wrote: > > We don't, at least not yet. We want to collect feedback. > > I love the new behaviour: Thanks for your feedback. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-03 11:08 ` Artur Malabarba 2016-02-03 13:24 ` Stefan Monnier @ 2016-02-03 15:38 ` Eli Zaretskii 2016-02-03 22:53 ` Richard Stallman 2 siblings, 0 replies; 102+ messages in thread From: Eli Zaretskii @ 2016-02-03 15:38 UTC (permalink / raw) To: Artur Malabarba; +Cc: per, emacs-devel > From: Artur Malabarba <bruce.connor.am@gmail.com> > Date: Wed, 03 Feb 2016 11:08:57 +0000 > Cc: "emacs-devel@gnu.org" <emacs-devel@gnu.org> > > Does anyone volunteer to switch OFF this default shortly before release? > If not, I'll just do it now. Thanks, but there's no need to do this yet. Doing that is easy, so if you need a volunteer, here I am. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-03 11:08 ` Artur Malabarba 2016-02-03 13:24 ` Stefan Monnier 2016-02-03 15:38 ` Eli Zaretskii @ 2016-02-03 22:53 ` Richard Stallman 2 siblings, 0 replies; 102+ messages in thread From: Richard Stallman @ 2016-02-03 22:53 UTC (permalink / raw) To: Artur Malabarba; +Cc: per, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] To get useful feedback from pretests, we need to ask the community to respond. Otherwise we will only hear from those who absolutely hate it. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: Character folding in the pretest 2016-02-03 0:31 Character folding in the pretest Per Starbäck ` (2 preceding siblings ...) 2016-02-03 11:08 ` Artur Malabarba @ 2016-02-03 15:39 ` Eli Zaretskii 3 siblings, 0 replies; 102+ messages in thread From: Eli Zaretskii @ 2016-02-03 15:39 UTC (permalink / raw) To: Per Starbäck; +Cc: emacs-devel > Date: Wed, 3 Feb 2016 01:31:11 +0100 > From: Per Starbäck <per@starback.se> > > Eli thought that it should remain turned on in the pretest to get more > testing: > > The entire time interval between Nov 15 this year and until we release > > Emacs 25.1 (which will take a few months, probably more than 6, > > judging by past experience) is supposed to provide that feedback. All > > it takes to turn this off by default is changing the default value of > > a single variable (and change a couple of places in the User Manual to > > reflect that). Once we decide to do that, it can be done very quickly > > and easily. We can do that a day before the release, if we want to. > > > > OTOH, turning it off today means that it will get much less testing, > > and therefore bugs related to it (like the one reported just today in > > http://debbugs.gnu.org/cgi/bugreport.cgi?bug=22090) will most probably > > remain hidden for who knows how long. > > It's time to make that decision now. IMO, it's too early for that. As I said in the quote above, the time interval for the feedback can go on until very close to the release. That time is still far away. The pretest just started less than a week ago, and no new opinions were heard yet. Let us collect the feedback for a bit more than just a couple of days. If someone wants to start a poll somewhere, please do, it will allow us to collect more data and make better decisions. If not, we will have to go with what will be written here and on other relevant forums. > Also, please please add a checkbox for character folding just above or > below the one for case folding in the Options menu!! 
Indeed, patches are welcome for such an addition. (Lax whitespace option probably needs a similar option.) Thanks. ^ permalink raw reply [flat|nested] 102+ messages in thread