From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: On language-dependent defaults for character-folding
Date: Sat, 27 Feb 2016 10:38:53 +0200
Message-ID: <83fuwecztu.fsf@gnu.org>
To: "John Wiegley"
Cc: joostkremers@fastmail.fm, larsi@gnus.org, lokedhs@gmail.com, rms@gnu.org, emacs-devel@gnu.org

> From: John Wiegley
> Cc: joostkremers@fastmail.fm, rms@gnu.org, lokedhs@gmail.com, larsi@gnus.org, emacs-devel@gnu.org
> Date: Fri, 26 Feb 2016 16:48:21 -0800
>
> >>>>> Eli Zaretskii writes:
>
> > The discussion (with a few exceptions) is about how to augment the current
> > implementation to make it more acceptable to various needs and cultures. So
> > I think it's directly related to the pretest, and so moving it to
> > emacs-tangents would be wrong.
>
> In that case, can you please propose a plan for reaching such acceptability?
> If I can clearly see what we're aiming toward, it will give me a context for
> reading these messages, and help focus the discussion.
>
> For example: what exactly makes it not acceptable today? what are the desirable
> features of an "ideal implementation"? what are the variables we're trying to
> hammer down? etc. Then I think we can meaningfully tackle this issue by
> breaking it into the smaller pieces that make it up.

The simplest change would be to have character-folding disabled by default in those European locales whose users objected to having it on by default, because it folds some characters that shouldn't be folded in the languages of those locales.

Another possibility, more complex but still simple enough, would be to have character-folding on by default, but filter the problematic foldings out of the regexp it uses. We could either always filter out all of them, or filter out only some of them, as determined by the user's locale. For example, in the Spanish locales, ñ would not be folded.

The next alternative is to come up with a fine-grained classification of character foldings, and provide user options to control each class independently, with the defaults determined by the user's locale. For example, one class of folding is the one required for matching pre-composed characters such as á with their decomposed variants such as á; another class is for finding "similar" characters, such as finding ⒜ when looking for a. There will probably also be classes that users of certain languages dislike, such as folding ñ for Spanish. Etc. etc. (I think this alternative needs more research and user feedback, and so is probably not for the release branch.)

Maybe there are more alternatives, I don't know. It's not like these were explicitly proposed by someone; the above are just my personal conclusions from reading the discussion.
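
To make the second and third alternatives a bit more concrete, here is a rough sketch in Python (not Emacs Lisp, and not how Emacs implements character-folding, which builds a regexp). It folds both the search string and the text to a comparison key instead. The class names ("canonical", "similar", "diacritics"), the PROTECTED table, and the functions fold_key and matches are all made up for illustration; only the ñ-for-Spanish entry comes from the discussion.

```python
import unicodedata

# Hypothetical folding classes, named after the examples in the message.
ALL_CLASSES = frozenset({"canonical", "similar", "diacritics"})

# Hypothetical per-language table of characters that must match only
# themselves.  Only the Spanish entry is taken from the discussion.
PROTECTED = {"es": frozenset("\u00f1\u00d1")}  # ñ and Ñ


def fold_key(text, classes=ALL_CLASSES, language=None):
    """Return a comparison key for TEXT with the given folding classes applied."""
    protected = PROTECTED.get(language, frozenset())
    if "canonical" in classes:
        # Pre-composed and decomposed spellings become the same string.
        text = unicodedata.normalize("NFC", text)
    out = []
    for ch in text:
        if ch in protected:
            # Protection is per code point, so it relies on the
            # "canonical" step above to see the pre-composed form.
            out.append(ch)
            continue
        if "similar" in classes:
            # Compatibility mapping, e.g. U+249C becomes "(a)".
            ch = unicodedata.normalize("NFKC", ch)
        if "diacritics" in classes:
            # Decompose, then drop the combining marks.
            ch = "".join(c for c in unicodedata.normalize("NFD", ch)
                         if not unicodedata.combining(c))
        out.append(ch)
    return "".join(out)


def matches(needle, haystack, classes=ALL_CLASSES, language=None):
    """Substring search on the folded keys of NEEDLE and HAYSTACK."""
    return fold_key(needle, classes, language) in fold_key(haystack, classes, language)
```

With all classes on and no language given, matches("ano", "a\u00f1o") is true; with language="es" it is false, while á in the same text is still folded to a. Passing classes={"canonical"} keeps only the pre-composed/decomposed equivalence, which is roughly what independent per-class options would look like.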