From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#27525: 25.1; Line wrapping of bidi paragraphs
Date: Fri, 21 Jul 2017 11:37:30 +0300
Message-ID: <83tw269odx.fsf@gnu.org>
To: Itai Berli
Cc: 27525@debbugs.gnu.org

> From: Itai Berli
> Date: Fri, 21 Jul 2017 09:19:25 +0300
>
> Now that I have downloaded the source code, I'd like to take a look
> at this problem first hand. I'm not a programmer, not even an amateur
> one, but I can sometimes make sense of the general gist of code when
> I read it, and I'd like to take a look at the part of code that's
> responsible for the present bug, maybe put a breakpoint here and
> there and give it a test run to get a feel of how it works, and why
> it misses the mark when it comes to line wrapping bidi paragraphs.
>
> Could you please give me some pointers: what files should I look
> into, what functions should I read, possibly even suggestions for
> where to put breakpoints and which variables to watch. I'm not asking
> for a comprehensive and detailed run down of this feature; just a
> starting point(s). Every tip and suggestion will be welcome.

The relevant files are bidi.c and xdisp.c.  There's a long comment at
the beginning of xdisp.c, whose last parts deal with how the bidi
reordering is incorporated into the display engine, and a long comment
at the beginning of bidi.c that has more details about the reordering
itself.

Note that this is not an implementation bug, it's a consequence of how
the bidi reordering engine's integration with the rest of the display
code was designed: we reorder text for display _before_ making the
layout decisions.  IOW, the layout layer of the display engine is fed
characters in _visual_ order, already reordered by bidi.c functions
which the layout layer calls when it needs another character.
The advantage of this design is that the display engine knows almost
nothing about the reordering stuff, it doesn't care about resolved
levels etc., because all that was already taken care of.

To make line-wrapping do what the UBA describes, we would need to feed
the display engine with characters in logical order, but record with
each character its resolved bidi level, resulting from partial
processing by bidi.c.  Then, when a line is completely laid out, we'd
need to reorder the glyphs prepared for that line according to UBA
rules L1, L2, and L4, using the resolved levels recorded by bidi.c
code.  (L3 is tricky, because combining marks are applied when
producing glyphs, so it has to be solved by "some other method".)

The above means we need to redesign the interface between xdisp.c and
bidi.c, and then rewrite the current reordering function into
something that will work on the glyphs of a laid-out line.  That in
itself is more or less straightforward refactoring of the existing
code, but unfortunately it isn't the scary part of the job.

The scary part is all the subtleties of the Emacs display engine and
the features it provides, when bidirectional text is involved.  For
example, many places need to calculate layout metrics without
displaying anything.  A typical example is vertical-motion when
line-move-visual is in effect -- it needs to determine what buffer
position is displayed one screen line up or down from a given
character.  Another example is how we process a mouse click, which
starts by determining which buffer position (more accurately, which
offset of what object) is displayed at given pixel coordinates.  These
places use functions that "simulate" display -- they perform all the
layout calculations, but don't create glyphs (because nothing needs to
be displayed).  Since glyphs are not created, the "line" to be
displayed doesn't exist, and thus the reordering step will have
nothing to work on.
Whoever will work on fixing line-wrapping will have to figure out how
to solve this problem in a way that is compatible with the 2nd
sentence of the UBA's section 3.4.

There are many complications in this part of the display code, because
oftentimes Emacs ends the display "simulation" before reaching the end
of the line, and sometimes even starts it in the middle of a line.
All this needs to be figured out and implemented when reordering needs
to see a full screen line, and implemented in a way that doesn't hurt
performance in any significant way.

Then there are complications with invisible text: the 'invisible' text
property can start and/or end in the middle of a non-base embedding
level, and the question is how to produce the result that the user
expects, when some of the characters that affect reordering are
effectively hidden from the reordering code, because the invisible
text is simply skipped and never fed to the layout layer.  (With the
current design, reordering is done before the text invisibility is
considered, so the result is quite naturally the expected one.)

Similar problems arise with display properties and overlays which hide
portions of buffer text, optionally replacing them with some other
text or image -- the reordering step will somehow need to avoid
reordering the text of a display string as if it were part of the
surrounding buffer text, because that's not what the user expects.

Another complication is where glyph production and layout decisions
are mixed with bidi level resolution.  One such situation is how we
implement the display property of the form '(space :align-to HPOS)'
which is treated as a paragraph separator for the purposes of bidi
reordering (thus supporting display of tables with bidirectional
text).  If we separate reordering from level resolution, this will
have to be rethought if not reimplemented.  And I'm quite sure there
are other complications that I forget.

This is what took the lion's share of the work on making the display
engine bidi-aware (because the basic reordering engine which is now
bidi.c was written and debugged, as a stand-alone program, 15 years
ago).  Whoever will work on fixing the line-wrapping issue will have
to do at least part of that anew.  I surely hope a motivated
individual will step forward for the job at some point, but they need
to know what they will face.
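
To make the difference between the two orderings discussed above more
concrete, here is a minimal Python sketch.  It is an illustration only
and has nothing to do with the actual code in bidi.c or xdisp.c.  All
the names in it (reorder_line, wrap, reorder_then_wrap,
wrap_then_reorder) are hypothetical.  It assumes the resolved levels
are already known (they are hard-coded, not computed), uses uppercase
letters to stand in for right-to-left characters, wraps by a fixed
number of characters instead of doing real layout, and models only
rule L2 plus the trailing-whitespace part of rule L1.

```python
def reorder_line(chars, levels, base_level=0):
    """Return one line's characters in visual order (partial L1, then L2)."""
    levels = list(levels)
    # Partial rule L1: trailing whitespace reverts to the paragraph level.
    end = len(chars)
    while end > 0 and chars[end - 1].isspace():
        end -= 1
        levels[end] = base_level
    order = list(range(len(chars)))
    odd = [lv for lv in levels if lv % 2 == 1]
    if odd:
        # Rule L2: from the highest level down to the lowest odd level,
        # reverse every maximal run of characters at that level or higher.
        for level in range(max(levels), min(odd) - 1, -1):
            i = 0
            while i < len(order):
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                if j > i:
                    order[i:j] = order[i:j][::-1]
                    i = j
                else:
                    i += 1
    return "".join(chars[k] for k in order)


def wrap(seq, width):
    """Toy line breaking: split a string or list into fixed-width chunks."""
    return [seq[i:i + width] for i in range(0, len(seq), width)]


def reorder_then_wrap(text, levels, width):
    """Reorder the whole paragraph first, then break the visual text."""
    return wrap(reorder_line(text, levels), width)


def wrap_then_reorder(text, levels, width):
    """Break the logical text first, then reorder each line on its own."""
    lines = zip(wrap(text, width), wrap(levels, width))
    return [reorder_line(t, lv) for t, lv in lines]


# Left-to-right paragraph with an embedded right-to-left run (uppercase).
text = "abc DEF GHI JKL xyz"
levels = [0] * 4 + [1] * 11 + [0] * 4  # assumed resolved levels

print(reorder_then_wrap(text, levels, 12))  # ['abc LKJ IHG ', 'FED xyz']
print(wrap_then_reorder(text, levels, 12))  # ['abc IHG FED ', 'LKJ xyz']
```

In the first result the logically first RTL word (DEF) ends up on the
second screen line, while the logically later words occupy the first
one.  In the second result each screen line holds the text that
logically belongs to it, reordered within the line, which is the
behavior section 3.4 of the UBA asks for.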