From: Khaled Hosny
Newsgroups: gmane.emacs.bugs
Subject: bug#33729: 27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
Date: Sat, 5 Jan 2019 23:15:14 +0200
Message-ID: <20190105211514.GB28761@macbook.localdomain>
In-Reply-To: <83imzi94tz.fsf@gnu.org>
To: Eli Zaretskii
Cc: behdad@behdad.org, 33729@debbugs.gnu.org, far.nasiri.m@gmail.com, kaushal.modi@gmail.com

On Mon, Dec 24, 2018 at 08:07:04PM +0200, Eli Zaretskii wrote:
> > Date: Mon, 24 Dec 2018 19:37:23 +0200
> > From: Khaled Hosny
> > Cc: rgm@gnu.org, far.nasiri.m@gmail.com, behdad@behdad.org,
> > 33729@debbugs.gnu.org, kaushal.modi@gmail.com
> >
> > > Per previous discussions, we decided to use the Harfbuzz built-in
> > > methods for determining the script, since Emacs doesn't have this
> > > information, and adding it will just do the same as Harfbuzz does,
> > > i.e. find the first character whose script is not Common etc., using
> > > the UCD database. I think it was you who suggested to use the
> > > Harfbuzz built-ins in this case.
> >
> > The built-in HarfBuzz code is for getting the script for a given
> > character, but resolving characters with Common script is left to the
> > client. Suppose you have this string (upper case for RTL) ABC 123 DEF,
> > what HarfBuzz sees during shaping is three separate chunks of text ABC,
> > 123, DEF. The 123 part is all Common script characters and thus
> > hb_buffer_guess_segment_properties won’t be able to guess anything (and
> > based on the font and the script, this can cause rendering differences).
> > Emacs will have to resolve the script of Common characters before
> > applying bidi algorithm and pass that down to HarfBuzz.
>
> I'm not sure I understand: why does HarfBuzz care that 123 was in the
> middle of RTL text?

It doesn’t. What it cares about here is the correct script. Because 123
is in the middle of RTL text, it will be shaped separately, and thus
hb_buffer_guess_segment_properties() will only see 123 and won’t be able
to guess the correct script for it (Arabic, Hebrew, etc., whatever the
script of the surrounding RTL text is).
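
To make the difference concrete, here is a rough sketch in Python. It is
not Emacs or HarfBuzz code, and every name in it is hypothetical:
char_script() is a toy stand-in for a real UCD script lookup,
guess_run_script() only imitates the "first character whose script is
not Common" rule described above, and resolve_scripts() is one simple
assumed heuristic (Common characters inherit the script of the preceding
text) rather than a complete algorithm.

```python
COMMON = "Zyyy"  # ISO 15924 code for the Common script


def char_script(ch):
    """Toy script lookup; real code would consult the UCD Scripts data."""
    cp = ord(ch)
    if 0x05D0 <= cp <= 0x05EA:  # Hebrew letters
        return "Hebr"
    if 0x0621 <= cp <= 0x063A or 0x0641 <= cp <= 0x064A:  # Arabic letters
        return "Arab"
    if ch.isascii() and ch.isalpha():
        return "Latn"
    return COMMON  # digits, spaces, punctuation, everything else


def guess_run_script(run):
    """Imitate per-run guessing: first non-Common character in this run only."""
    for ch in run:
        script = char_script(ch)
        if script != COMMON:
            return script
    return COMMON  # nothing to go on


def resolve_scripts(text):
    """Resolve Common characters using the whole text, not a single run."""
    scripts = [char_script(ch) for ch in text]
    # Forward pass: a Common character inherits the preceding script.
    last = None
    for i, script in enumerate(scripts):
        if script == COMMON:
            if last is not None:
                scripts[i] = last
        else:
            last = script
    # Backward pass: leading Common characters inherit the following script.
    following = None
    for i in range(len(scripts) - 1, -1, -1):
        if scripts[i] == COMMON:
            if following is not None:
                scripts[i] = following
        else:
            following = scripts[i]
    return scripts


# Hebrew letters, then digits, then Hebrew letters again.
text = "\u05d0\u05d1\u05d2 123 \u05d3\u05d4\u05d5"

print(guess_run_script("123"))     # Zyyy
print(resolve_scripts(text)[4:7])  # ['Hebr', 'Hebr', 'Hebr']
```

The digits shaped on their own come out as Common, while resolving over
the whole string assigns them the script of the surrounding text, which
is the value that would then be set on the buffer before shaping.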

The point I’m trying to make is that script detection, even in its
simplest form, needs to be done on the text as a whole, not just the
portion being shaped. That makes hb_buffer_guess_segment_properties()
ill-equipped for the job, as it only sees a small portion of the text at
a time.

> Does it need to shape 123 specially in this case?

Depending on the font, the digits might be shaped differently if the
script is, say, Arabic, e.g. by applying script-specific substitutions
that yield forms more suitable for that script.

> (In general, AFAIK simple characters like 123 will not even go through
> HarfBuzz, as Emacs doesn't call the shaper for characters whose entry
> in composition-function-table is nil. So I guess 123 here should
> stand for some other characters, not for literal digits? IOW, I don't
> think I understand the example very well.)

This is a bug then and needs to be fixed. All text should go through
HarfBuzz, since even so-called “simple” characters often require shaping,
depending on the text and the font. If this is done as an optimization,
it should be revisited to check whether shaping with HarfBuzz is actually
significantly slower and, if it is, to find better ways to optimize it.

Regards,
Khaled