From: Cesar Crusius
Newsgroups: gmane.emacs.devel
Subject: Re: ucs-normalize and diacritics
Date: Wed, 25 Jul 2018 15:59:30 -0700
References: <8736w88pnn.fsf@gmail.com> <83lga0v4ff.fsf@gnu.org> <87tvoo73s9.fsf@gmail.com> <83fu08ujln.fsf@gnu.org> <837eljv0v0.fsf@gnu.org> <87k1pj5b3w.fsf@gmail.com> <87fu0759kt.fsf@gmail.com>
In-Reply-To: <87fu0759kt.fsf@gmail.com> (Robert Pluim's message of "Wed, 25 Jul 2018 22:44:50 +0200")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)
To: emacs-devel@gnu.org
List-Id: "Emacs development discussions."
Xref: news.gmane.org gmane.emacs.devel:227821
Content-Type: text/plain; charset=utf-8

Robert Pluim writes:

> Cesar Crusius writes:
>
>>> How common is 3-character composition likely to be? (for that matter,
>>> how common is 2-character composition? I explicitly use input methods
>>> for this kind of stuff). I can envisage an algorithm that takes a
>>> combining character, then scans backwards to see if the font used for
>>> it will cover all previous characters, recursively. It does seem like
>>> a lot of effort for a small return.
>>
>> Recalling a recent discussion, they are unavoidable in polytonic
>> Greek, because Unicode does not provide the pre-combined
>> character. There's no other way to get an "rough breathing long alpha
>> with acute accent," ᾱ̔́. (Which by the way Emacs handles nicely with the
>> font I use, Iosevka.)
>
> Even that is only a two character composition (unless Iʼve
> misunderstood the what-cursor-position output), and itʼs rather
> specialized, and you know what youʼre doing :-)

There are three characters in total: ᾱ (U+1FB1) plus ◌̔ (U+0314) plus
◌́ (U+0301).

>> Granted, not many people will use this, but for those who do, they
>> will be all over the place.
>
> I'm not sure I understand the comment. Current behaviour is what it
> is, Iʼm not proposing anything that would make it worse.

I'm just answering the "how common" question above, nothing else implied
regarding the current discussion. For people writing polytonic Greek,
compositions like the one above are not too uncommon, and two-character
compositions of the form "macron + combining accent" such as ᾱ́ are very
common. FWIW.

Best,

-- 
Cesar Crusius
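
The character counts discussed in this message can be checked outside Emacs. The following is a minimal sketch, assuming Python's standard `unicodedata` module as a stand-in for `ucs-normalize`; the constant and function names are hypothetical and chosen only for illustration. It lists the code points in the two sequences mentioned above and counts how many remain after NFC and NFD normalization.

```python
import unicodedata

# The two sequences mentioned in the message, written as escapes so the
# code points are unambiguous regardless of font or terminal.
ROUGH_LONG_ALPHA_ACUTE = "\u1fb1\u0314\u0301"  # alpha with macron + rough breathing + acute
LONG_ALPHA_ACUTE = "\u1fb1\u0301"              # alpha with macron + acute


def describe(text):
    """Return (code point, Unicode name, combining class) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))
        for ch in text
    ]


def lengths(text):
    """Return the number of code points in the NFC and NFD forms of text."""
    nfc = unicodedata.normalize("NFC", text)
    nfd = unicodedata.normalize("NFD", text)
    return len(nfc), len(nfd)


if __name__ == "__main__":
    samples = [
        ("rough breathing + acute", ROUGH_LONG_ALPHA_ACUTE),
        ("acute only", LONG_ALPHA_ACUTE),
    ]
    for label, seq in samples:
        print(label, lengths(seq))
        for row in describe(unicodedata.normalize("NFC", seq)):
            print("  ", row)
```

With the Unicode data shipped in Python, NFC should leave the first sequence at three code points and the second at two, since no precomposed character exists for either combination. NFD adds one more in each case because U+1FB1 itself decomposes into alpha plus a combining macron.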