From: Richard Stallman
Newsgroups: gmane.emacs.devel
Subject: Re: On language-dependent defaults for character-folding
Date: Tue, 23 Feb 2016 12:43:56 -0500
References: <834mdc5w6o.fsf@gnu.org> <838u2hu6aq.fsf@gnu.org>
 <871t899tde.fsf@gnus.org> <83y4ahru04.fsf@gnu.org>
 <83fuwproyf.fsf@gnu.org> <837fi0sz29.fsf@gnu.org>
 <83egc8qzjh.fsf@gnu.org> <87egc7evu3.fsf@gnus.org>
 <83io1jpt4u.fsf@gnu.org> <87povqhj25.fsf@gnus.org>
 <87povqe5tr.fsf@gnus.org> <87ziuta4l4.fsf@gnus.org>
 <87y4adzcia.fsf@gnus.org> <83twl0k1k5.fsf@gnu.org>
Reply-To: rms@gnu.org
To: Eli Zaretskii
Cc: larsi@gnus.org, lokedhs@gmail.com, emacs-devel@gnu.org
In-reply-to: <83twl0k1k5.fsf@gnu.org> (message from Eli Zaretskii on
 Mon, 22 Feb 2016 21:06:02 +0200)

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > Some minimal amount of folding will nevertheless be necessary even in
  > asymmetric mode, in order to find character sequences produced by
  > decomposing characters like ö into o and the combining mark ̈.  That's
  > because these two characters when juxtaposed (ö) look identical to the
  > precomposed character on most displays, so we should by default find
  > such decomposed sequences even when the search string includes the
  > precomposed character.

That is interesting.  It means we need several levels of folding:

* Different appearances of the same letter+decorations:
  as a single code point, or as a composition.

* Identical-looking distinct code points (Latin a and Cyrillic a).

* The same letter with different decorations (o and ö in English).

* Equivalent letters (ö and ø in Swedish).

* Non-equivalent letters modified from a common base (o and ö in Swedish).

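To make the first and third levels concrete, here is a minimal sketch.
It is in Python rather than Emacs Lisp, it assumes that Unicode
canonical normalization is an acceptable stand-in for whatever the
search code would actually do, and the function names are hypothetical.

```python
import unicodedata


def fold_composition(text: str) -> str:
    """Level 1 (assumed approach): canonical normalization, so a
    precomposed character and its decomposed sequence compare equal."""
    return unicodedata.normalize("NFC", text)


def fold_decorations(text: str) -> str:
    """Level 3 (assumed approach): decompose, then drop combining
    marks, so a letter matches its decorated variants."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


precomposed = "\u00f6"   # ö as a single code point
decomposed = "o\u0308"   # o followed by COMBINING DIAERESIS

print(precomposed == decomposed)                                      # False
print(fold_composition(precomposed) == fold_composition(decomposed))  # True
print(fold_decorations(precomposed) == "o")                           # True
print(fold_decorations("\u00f8") == "o")                              # False
```

The last line comes out false because ø has no canonical decomposition,
so treating ö and ø as equivalent would presumably need a separate
language-specific table rather than normalization alone.
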
The first level is language-independent and should be handled
symmetrically, with each folding group as an equivalence class.
Is there any need, ever, to disable the first level?  Perhaps it would
be good to enable that all the time.

The second level is also language-independent.  Does anyone ever
want to turn it off?

The other levels are language-specific, and the user might want to
enable or disable them.  When enabled, the user might want them
handled symmetrically or asymmetrically.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.