From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Entering emojis
Date: Thu, 28 Oct 2021 20:37:57 +0300
Message-ID: <83lf2drqx6.fsf@gnu.org>
References: <87cznths5j.fsf@gnus.org> <87tuh4f1ie.fsf@gnus.org>
 <0353A9DA-0041-4D71-8E1B-09FB07A5FD0F@acm.org> <87ilxialzw.fsf@igel.home>
 <831r46wj6r.fsf@gnu.org> <83fssmuxui.fsf@gnu.org> <83bl3aux6y.fsf@gnu.org>
 <835ytiuvm9.fsf@gnu.org> <834k91vgie.fsf@gnu.org>
 <8ff3b131c5fa370d9eaf@heytings.org> <83mtmttsxz.fsf@gnu.org>
 <8ff3b131c56b7b2d1d6f@heytings.org> <83bl39tqnl.fsf@gnu.org>
 <8ff3b131c531f5254799@heytings.org> <83a6ittp5r.fsf@gnu.org>
 <8ff3b131c53b9df49236@heytings.org> <834k91th5c.fsf@gnu.org>
 <8ff3b131c5fe09753ca0@heytings.org> <83mtmtru6l.fsf@gnu.org>
 <8ff3b131c57f741d04e5@heytings.org>
Cc: mattiase@acm.org, emacs-devel@gnu.org, schwab@linux-m68k.org,
 stefankangas@gmail.com, raman@google.com
To: Gregory Heytings
In-Reply-To: <8ff3b131c57f741d04e5@heytings.org> (message from Gregory
 Heytings on Thu, 28 Oct 2021 17:06:56 +0000)
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:278131

> Date: Thu, 28 Oct 2021 17:06:56 +0000
> From: Gregory Heytings
> cc: mattiase@acm.org, raman@google.com, schwab@linux-m68k.org,
>     stefankangas@gmail.com, emacs-devel@gnu.org
>
> >> Given the limited available manpower in that narrow subfield, I'm not
> >> quite sure it was the best thing to do. Using a font with predefined
> >> ligatures is much easier to enter text.
> >
> > I think the issue is not whether the font delivers ligatures or not, the
> > issue is whether the font recognizes the sequences which should produce
> > either ligatures or series of glyphs with offsets, when the formatting
> > controls are in the sequence. It sounds like the existing fonts don't
> > recognize such sequences for what they are supposed to produce.
>
> This is not how ligatures work. Ligatures automatically translate a
> sequence of characters into an appropriate glyph, which may or may not be
> a combination of other glyphs. For example, the Computer Modern font
> translates the two-character sequence "fi" into a character which looks
> better than "f" followed by "i". If for some reason you don't want that
> ligature to take place, you write "f{}i" (in TeX), and you get "f"
> followed by "i".

Yes, I know. But ligatures are not the only way of handling this.
When a font produces a ligature, i.e. a precomposed glyph that should
be displayed instead of several characters, it produces a single font
glyph. The other way is to produce several font glyphs, each one with
offsets relative to the baseline. Emacs supports both ways.
However, for either of the two to work, both the shaping engine and
the font should recognize the sequence, and the font should produce
one or more glyphs with the offsets for that sequence.

> >> But it's not a joiner, it's a non-joiner. The logic is the opposite of
> >> what Unicode decided to do: known quadrats are automatically recognized
> >> and combined appropriately when their individual characters appear one
> >> after the other in a string. It's only when you want to avoid this
> >> that you have to add a non-joiner.
> >
> > That's not what the Unicode Standard says.
>
> I don't know what you mean by this.

I mean what the Unicode Standard says: it says that two hieroglyphs
should be displayed "normally", i.e.
as separate characters at the same vertical position, unless there's
the vertical joiner between them, in which case one should be above
the other.

> With the ligature logic, to enter "em-hotep", which is composed of four
> characters, you just enter these four characters: G17, R4, X1, Q3.

Which then means that you cannot write anything where these 4
characters are displayed in an arrangement different from that
particular one that's encoded in the font, unless you use non-joiners.

> I don't know. The problem is that the sequence of egyptian characters in
> etc/HELLO that are displayed correctly by hb-view and LibreOffice (and
> that are included in my patch) is for some reason not displayed correctly
> by Emacs, even with the recent Aegyptus font installed. Are ligatures
> disabled for some reason in Emacs?

No, they aren't. In Emacs, ligatures are just part of character
composition, they aren't a separate feature. However, the composition
rules I defined went with Unicode, and need to be fixed to support
what the Aegyptus font does.

Does the patch below help?

diff --git a/lisp/language/misc-lang.el b/lisp/language/misc-lang.el
index a2ca678..141349a 100644
--- a/lisp/language/misc-lang.el
+++ b/lisp/language/misc-lang.el
@@ -192,7 +192,12 @@ egyptian-shape-grouping
    composition-function-table
    #x13437
    (list (vector "\U00013437[\U00013000-\U0001343F]+"
-                 0 #'egyptian-shape-grouping))))
+                 0 #'egyptian-shape-grouping)))
+  (set-char-table-range
+   composition-function-table
+   '(#x13000 . #x1342E)
+   (list (vector "[\U00013000-\U0001342E]+"
+                 0 #'font-shape-gstring))))
 
 (provide 'misc-lang)
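
For trying the rule without applying the patch and rebuilding, the added form can probably be evaluated directly in a running session. The following is only a sketch under a few assumptions: that the Emacs build uses a text-shaping engine such as HarfBuzz, that a font with the relevant ligatures (e.g. Aegyptus) is installed, and that the four signs of "em-hotep" carry the Unicode names spelled out below (the names are an assumption on my part; `char-from-name` returns nil for any it does not know, and those are skipped).

```elisp
;; Sketch only: install the composition rule from the patch at run time.
;; The rule hands any run of hieroglyphs to the shaping engine.
(set-char-table-range
 composition-function-table
 '(#x13000 . #x1342E)
 (list (vector "[\U00013000-\U0001342E]+"
               0 #'font-shape-gstring)))

;; Inspect the entry now attached to the first hieroglyph (U+13000).
(aref composition-function-table #x13000)

;; Insert G17, R4, X1, Q3 at point to see how they are displayed.
;; The character names are assumed, not taken from the thread.
(insert (apply #'string
               (delq nil
                     (mapcar #'char-from-name
                             '("EGYPTIAN HIEROGLYPH G017"
                               "EGYPTIAN HIEROGLYPH R004"
                               "EGYPTIAN HIEROGLYPH X001"
                               "EGYPTIAN HIEROGLYPH Q003")))))
```

If the font recognizes the sequence, the four characters should be shaped as a single group; if they still appear side by side, the font or the shaping engine is likely not recognizing it.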