From: Robert Pluim
Newsgroups: gmane.emacs.bugs
Subject: bug#39799: 28.0.50; Most emoji sequences don’t render correctly
Date: Fri, 28 Feb 2020 17:39:56 +0100
To: Eli Zaretskii
Cc: 39799@debbugs.gnu.org, mfabian@redhat.com
In-Reply-To: <83h7zafzwh.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 28 Feb 2020 18:19:10 +0200")
References: <83lfongp4p.fsf@gnu.org> <835zfrglu5.fsf@gnu.org> <83wo86g8pg.fsf@gnu.org> <83h7zafzwh.fsf@gnu.org>

>>>>> On Fri, 28 Feb 2020 18:19:10 +0200, Eli Zaretskii said:

>> From: Robert Pluim
>> Cc: Glenn Morris, mfabian@redhat.com, 39799@debbugs.gnu.org
>> Date: Fri, 28 Feb 2020 15:14:01 +0100
>>
>> >> It matches forward off the first char, so the
>> >> composition-function-table entries all have '0' as the number of chars
>> >> to match.  Would it be better to match backwards?
>>
Eli> I don't think matching backwards is better in general.  Did you have a
Eli> reason for thinking it was?
>>
>> I thought I saw a comment in composite.c that says matching is done
>> backward, but I see that itʼs done forwards as well.

Eli> Btw, it sometimes _can_ be beneficial to use backward matching: if it
Eli> makes the size of composition-function-table smaller.  Since
Eli> composition-function-table is a char-table, and char-tables allocate
Eli> sub-tables only if needed, you can conserve memory (and thus make
Eli> Emacs's memory footprint smaller) and make Emacs faster (because 'aref'
Eli> will look up values in a char-table faster) by setting a smaller number
Eli> of slots.  For example, if the 2nd character of an Emoji sequence was
Eli> always one specific character, or a small set of characters, you could
Eli> set only the slots of those few characters, which would make the
Eli> char-table smaller.  OTOH, if that would yield many different
Eli> composition rules in the list of rules for those few characters,
Eli> redisplay could become slower, because it generally examines the rules
Eli> one by one until it finds an appropriate one.  So the winning setup of
Eli> composition-function-table is the one that sets the smallest number of
Eli> slots, but still keeps the lists of rules for those slots short.  And
Eli> note that setting the same rule for a range of codepoints generally
Eli> uses up only one slot in the char-table, so rules that can be
Eli> generalized to cover many characters are preferable.

I donʼt think that applies in this case.  The sequences are all easily
categorised based on the first char in the sequence.  It could be done
based on the 2nd, or 3rd or whatever, but I donʼt think that reduces
the number of entries.  Plus thereʼs always one rule per character,
since multiple patterns starting with the same character are combined
using regexp-opt.
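
To make the "one rule per character" point concrete, here is a minimal sketch of the idea, not the actual patch.  The helper name my-emoji-rule-for, the two sample sequences, and the use of compose-gstring-for-graphic as the composition function are all assumptions made for illustration, and the rule is stored in a scratch char-table rather than in composition-function-table itself.

```elisp
;; Hypothetical helper: build ONE rule of the form [PATTERN PREV-CHARS FUNC]
;; that matches any of SEQUENCES.  PREV-CHARS is 0 because matching starts
;; at the first character of the sequence.
(defun my-emoji-rule-for (sequences)
  "Return a single composition rule matching any string in SEQUENCES."
  ;; `compose-gstring-for-graphic' is an assumed choice of FUNC here.
  (vector (regexp-opt sequences) 0 'compose-gstring-for-graphic))

;; Two sample sequences that both start with U+1F468 (MAN), stored as a
;; single rule in the slot for that first character of a scratch table.
(let ((table (make-char-table nil))
      (zwj "\u200D"))
  (set-char-table-range
   table ?\U0001F468
   (list (my-emoji-rule-for
          (list (concat "\U0001F468" zwj "\U0001F469" zwj "\U0001F467")
                (concat "\U0001F468" zwj "\U0001F33E")))))
  ;; The slot now holds a one-element list of rules.
  (aref table ?\U0001F468))
```

However many sequences share a first character, the list of rules for that slot stays one element long, so the rule-by-rule scan described above has only one pattern to try.
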

One thing though: the code currently does set-char-table-range to a new
value.  Is there a chance that an entry already exists in
composition-function-table for a particular character?  If so Iʼd have
to change it to add the new rule after the existing one (before?).

Robert
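
For reference, a minimal sketch of adding a rule without discarding whatever is already in the slot.  The function name my-add-composition-rule and its AFTER argument are hypothetical, and the usage example works on a scratch char-table with placeholder rules, on the assumption that the same approach would carry over to composition-function-table.

```elisp
;; Hypothetical helper: keep the existing rules for CHAR and add RULE
;; either before them (the default) or after them.
(defun my-add-composition-rule (table char rule &optional after)
  "Add RULE to the list of rules for CHAR in char-table TABLE.
With AFTER non-nil, put RULE after the existing rules; otherwise
put it before them.  Existing rules are kept either way."
  (let ((existing (aref table char)))   ; nil if the slot is unset
    (set-char-table-range
     table char
     (if after
         (append existing (list rule))
       (cons rule existing)))))

;; Usage on a scratch table; the symbols stand in for real rule vectors.
(let ((table (make-char-table nil)))
  (my-add-composition-rule table ?a 'existing-rule)
  (my-add-composition-rule table ?a 'new-rule t)
  (aref table ?a))
```

Since the rules for a character are examined one by one until an appropriate one is found, the before/after choice decides which rule gets the first chance to match.
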