From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#41645: 27.0.91; Combining Grapheme Joiner (#x34f) gui artifacts
Date: Tue, 02 Jun 2020 19:07:35 +0300
Message-ID: <83zh9lcuyg.fsf@gnu.org>
References: <83zh9merd4.fsf@gnu.org> <83wo4qepab.fsf@gnu.org> <83lfl6eiod.fsf@gnu.org>
In-Reply-To: (message from Pip Cet on Mon, 1 Jun 2020 19:48:15 +0000)
To: Pip Cet
Cc: dfussner@googlemail.com, 41645@debbugs.gnu.org

> From: Pip Cet
> Date: Mon, 1 Jun 2020 19:48:15 +0000
> Cc: dfussner@googlemail.com, 41645@debbugs.gnu.org
>
> > (aref composition-function-table #x34f)
> > => (["\\c.\\c^+" 1 compose-gstring-for-graphic]
> >     [nil 0 compose-gstring-for-graphic])
>
> > So CGJ is supposed to be
composed with the previous character,
> > similarly to diacritics.
>
> (BTW, isn't that regexp wrong? a base character can be followed by two
> diacritics, then a CGJ, then another diacritic...)

No, I don't think the regexp is wrong.  Every character whose Unicode
general-category property is Mn is given the '^' ("combining")
category, see characters.el.  The CGJ is one of them, but all the
accents and diacritics are also of that class.  So the above matches
a base character followed by any sequence of Mn characters, in any
order and permutation -- which is of course more than is needed, but
we rely on the shaper to combine only those that should be.  For
example, if I type the following 4 characters:

  LATIN SMALL LETTER U (U+0075)
  + COMBINING GRAVE ACCENT (U+0300)
  + CGJ (U+034F)
  + COMBINING DIAERESIS (U+0308)

I see the composition working correctly, provided that I use a font
that supports all the 4 codepoints.

> > And another question: in the cases where you see artifacts, does the
> > call to font-shape-gstring inside compose-gstring-for-graphic return
> > nil or non-nil?
>
> Neither, it's never reached. The first rule fails because font_range
> restricts the composition range to a single character, so the second
> rule applies.

OK, that's the crucial fact: it means that the font used for CGJ is
not the same one as used for the surrounding text.  This is a
situation I never saw on my systems.  In addition, the font used for
the CGJ has a zero-width glyph for it, which is another thing I never
saw (I meanwhile tried almost 20 different fonts supporting the CGJ,
and all of them either produce a 1-pixel thin space or the funny box
with a circle inside).

So I think now it's clear what is going on when this particular font
is present, and we are left...

> I think we're just failing to deal with a zero-width autocomposition
> glyph, because we're dealing fine with the same glyph when
> autocomposition is off.
>
> xdisp.c (lines 30008-30014):
>           if (get_char_glyph_code (it->char_to_display, font, &char2b))
>             {
>               pcm = get_per_char_metric (font, &char2b);
>               if (pcm->width == 0
>                   && pcm->rbearing == 0 && pcm->lbearing == 0)
>                 pcm = NULL;
>             }

...with this.  I think you are right, and we should do the same with
zero-width LGLYPH_STRING, forcing it->glyph_not_available_p to
non-zero, and then doing something like this in
fill_gstring_glyph_string:

  if (s->font == NULL || glyph_not_available_p)
    {
      s->font_not_found_p = true;
      s->font = FRAME_FONT (s->f);
    }

similar to what fill_glyph_string does.  WDYT?
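
To make the proposal a bit more concrete, here is a minimal standalone
sketch of the two pieces: the "all metrics are zero" test from the
quoted xdisp.c snippet, and the fallback to the frame font when the
glyph is treated as unavailable.  Everything in it is a hypothetical
stand-in: struct metrics, struct font, struct glyph_string,
metrics_empty_p and fill_string_font are simplified models made up for
illustration, not the real Emacs types or functions, and frame_font
stands in for what FRAME_FONT (s->f) would return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are NOT the
   real Emacs structures.  */
struct metrics { int width, lbearing, rbearing; };
struct font { const char *name; };
struct glyph_string
{
  const struct font *font;
  bool font_not_found_p;
};

/* Return true when the glyph neither advances nor draws anything,
   i.e. the same width/rbearing/lbearing test as in the quoted
   xdisp.c code.  */
static bool
metrics_empty_p (const struct metrics *m)
{
  return m->width == 0 && m->rbearing == 0 && m->lbearing == 0;
}

/* Model of the proposed fallback: if there is no font, or the glyph
   was flagged as not available, mark the string as "font not found"
   and use the frame's default font instead.  */
static void
fill_string_font (struct glyph_string *s, const struct font *font,
                  bool glyph_not_available_p,
                  const struct font *frame_font)
{
  assert (frame_font != NULL);
  s->font = font;
  s->font_not_found_p = false;
  if (s->font == NULL || glyph_not_available_p)
    {
      s->font_not_found_p = true;
      s->font = frame_font;
    }
}
```

In this model a caller would pass the result of metrics_empty_p as
glyph_not_available_p, so a zero-width composed glyph takes the same
path as a missing font.  Whether the real code should derive the flag
that way is exactly the open question above.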