From: Achim Gratz
Newsgroups: gmane.emacs.devel
Subject: Re: On language-dependent defaults for character-folding
Date: Sun, 21 Feb 2016 09:14:18 +0100
Message-ID: <87egc68opx.fsf@Rainer.invalid>
To: emacs-devel@gnu.org

Elias Mårtenson writes:

> Because under the Unicode decomposition rules, ø is not decomposable. I
> can't explain why that is the case (probably because there is no reason to
> have a combining /. After all, the only languages that use ø are languages
> that use it as a character of its own).

AFAIK, for combining characters to be composable/decomposable the glyphs
must not overlap. To the best of my knowledge, this is the same issue as
with the Polish »ł«.
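
As a rough illustration of the point above, the decomposition data can be inspected directly. The sketch below assumes Python's standard `unicodedata` module rather than anything in Emacs, and `base_char` is a hypothetical helper introduced only for this example, not a function from the thread or from Emacs.

```python
import unicodedata


def base_char(ch: str) -> str:
    """Return the first code point of ch's canonical (NFD) decomposition.

    Hypothetical helper for illustration: a character without a canonical
    decomposition is returned unchanged.
    """
    return unicodedata.normalize("NFD", ch)[0]


# é and å have canonical decompositions (base letter + combining mark);
# ø and ł have no decomposition mapping in the Unicode character database.
for ch in "éåøł":
    decomp = unicodedata.decomposition(ch) or "(none)"
    print(f"{ch}  U+{ord(ch):04X}  decomposition: {decomp}  base: {base_char(ch)}")
```

A folding scheme built only on canonical decomposition would therefore let a search for "o" match "ó" but not "ø", and a search for "l" would not match "ł".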

In other words, Unicode composition/decomposition rules tell you more
about glyph construction than they do about useful strategies for
searching for multiple characters. Using the base character of the
canonical decomposition in the search might still yield a useful shortcut
in most cases, but I'm not sure it is correct in all languages even when
that decomposition exists. And, as the examples show, there are cases
where the non-decomposed character has to be treated specially.

Regards,
Achim.