From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: Itai Berli Newsgroups: gmane.emacs.bugs Subject: bug#27525: 25.1; Line wrapping of bidi paragraphs Date: Fri, 21 Jul 2017 13:58:57 +0300 Message-ID: References: <8337abobuz.fsf@gnu.org> <87eftpa30a.fsf@blei.turtle-trading.net> <83a84djweb.fsf@gnu.org> <83shhsbakk.fsf@gnu.org> <83lgnjbsqw.fsf@gnu.org> <83bmofbc0f.fsf@gnu.org> <83tw269odx.fsf@gnu.org> NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: multipart/alternative; boundary="94eb2c1cc68ae3a6590554d1c2e3" X-Trace: blaine.gmane.org 1500634833 6127 195.159.176.226 (21 Jul 2017 11:00:33 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Fri, 21 Jul 2017 11:00:33 +0000 (UTC) To: 27525@debbugs.gnu.org Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Fri Jul 21 13:00:17 2017 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dYVfM-0000kt-Nt for geb-bug-gnu-emacs@m.gmane.org; Fri, 21 Jul 2017 13:00:17 +0200 Original-Received: from localhost ([::1]:42291 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dYVfS-0000aJ-F4 for geb-bug-gnu-emacs@m.gmane.org; Fri, 21 Jul 2017 07:00:22 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:37396) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dYVfG-0000TB-9f for bug-gnu-emacs@gnu.org; Fri, 21 Jul 2017 07:00:16 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1dYVf9-0001IA-N8 for bug-gnu-emacs@gnu.org; Fri, 21 Jul 2017 07:00:09 -0400 Original-Received: from debbugs.gnu.org ([208.118.235.43]:47718) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1dYVf9-0001Hw-Is for bug-gnu-emacs@gnu.org; Fri, 21 Jul 2017 07:00:03 -0400 Original-Received: from 
Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1dYVf9-0007tG-93 for bug-gnu-emacs@gnu.org; Fri, 21 Jul 2017 07:00:03 -0400 X-Loop: help-debbugs@gnu.org Resent-From: Itai Berli Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 21 Jul 2017 11:00:03 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 27525 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: Original-Received: via spool by 27525-submit@debbugs.gnu.org id=B27525.150063478630272 (code B ref 27525); Fri, 21 Jul 2017 11:00:03 +0000 Original-Received: (at 27525) by debbugs.gnu.org; 21 Jul 2017 10:59:46 +0000 Original-Received: from localhost ([127.0.0.1]:50395 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dYVes-0007sC-6m for submit@debbugs.gnu.org; Fri, 21 Jul 2017 06:59:46 -0400 Original-Received: from mail-wr0-f177.google.com ([209.85.128.177]:33589) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dYVeq-0007rz-H1 for 27525@debbugs.gnu.org; Fri, 21 Jul 2017 06:59:45 -0400 Original-Received: by mail-wr0-f177.google.com with SMTP id v105so50821052wrb.0 for <27525@debbugs.gnu.org>; Fri, 21 Jul 2017 03:59:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=QHMCFnBwFIR49hDMsM4x0/jPBIEhIcecMDrBExF8X44=; b=eQ91MTitlI3c7Rja/euy9pFi62iKQxE9tmZa/AvHqCK9GQ22y5yppZKAn+Rs6IJx1o ip1OVhUx2Nr9fBj6rmU0oHLvxjzJ+3nDiQrO9mBAMYVvcIT8JDM1ghMh3FqI4oDf5PLn brss0MM6/RS/QzWbJrX22rYhA1G+BOwX3Tn6P46mJVpK/leGKLB92SSw0yvXZJBIHoBh 1gTDNivpp7XWusjtrFx4gCJXrYQtuatcP1MdHiRxv2uN/ZDAyLlUSJM/SLHxwFIAKrqH s3QcLY/16agqbuJ+8Ko9ixl/rZLFjOVh8qv06w3JYHvhcW0QAI9a3TIpBsHG0to/A3sf lLnA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; 
bh=QHMCFnBwFIR49hDMsM4x0/jPBIEhIcecMDrBExF8X44=; b=mVl1CbxAHQoacEICZaqRe4psWkqRRkoEQezoTsfsA9EoPmGaEzYdm9sxEum5ZRtH6d hhPFUd38OCH5oMLSbKrnHl3mqPlaXbQRCuMWfMcqaihW3TCjep6ypmacRtQJW76oT7UW SoAyEYn+Ao7VZq8QN9FunaCUQilOng0yufiGMvc74C7lTpZF/BPH+PHBk+eY0vPG2WUA cRoVq4PYK50uYunCUAhaM8hjFBwEO7mL/C9KE0z3jyE/9nTz0lCfyWHSeD+4JJVFrHYc RCapozU1R9nquU7VHYpFGnK0UW5H3s6MQnwHs0b13i9EXLyJDjrzamqPSFACwKTrJ9NB bIcw== X-Gm-Message-State: AIVw110dzYJ4QzJqBrwxwrqxIOVzXTwIiBoe3aST7cvLxowOsRz2y9lp DlEWb1aEuRNZ8zx8NlvkMQ2CMFtm+vc3 X-Received: by 10.223.167.69 with SMTP id e5mr619339wrd.79.1500634778148; Fri, 21 Jul 2017 03:59:38 -0700 (PDT) Original-Received: by 10.28.197.196 with HTTP; Fri, 21 Jul 2017 03:58:57 -0700 (PDT) In-Reply-To: X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 208.118.235.43 X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.org gmane.emacs.bugs:134821 Archived-At:

--94eb2c1cc68ae3a6590554d1c2e3
Content-Type: text/plain; charset="UTF-8"

I have a suggestion for another approach to tackling this bug. It doesn't fix the problem, but it offers a better workaround, in my opinion, than the present one of requiring the user to break lines manually. Is this approach easier to implement?

Instead of requiring Emacs to wrap lines correctly as they are being typed, only require it to display correctly wrapped lines once a file is opened, as well as once the user explicitly invokes a certain function while editing the file.

On Fri, Jul 21, 2017 at 12:44 PM, Itai Berli wrote:

> Thank you.
>
> I just want to make sure I understand. Please correct me if I'm wrong.
>
> 1. The bidi logic is entirely contained in the file bidi.c.
>
> 2. The display logic is entirely contained in the file xdisp.c.
>
> 3. The interface between the two modules is minimal. If I wish to cancel
> Emacs' bidi features, all I need to do is comment out a couple lines in
> xdisp.c and a user who doesn't use bidi documents will never know the
> difference.
>
> 4. All the complications you mentioned are limited to code in xdisp.c
>
>
> On Fri, Jul 21, 2017 at 11:37 AM, Eli Zaretskii wrote:
>
>> > From: Itai Berli
>> > Date: Fri, 21 Jul 2017 09:19:25 +0300
>> >
>> > Now that I have downloaded the source code, I'd like to take a look at this problem first hand. I'm not a
>> > programmer, not even an amateur one, but I can sometimes make sense of the general gist of code when I
>> > read it, and I'd like to take a look at the part of code that's responsible for the present bug, maybe put a
>> > breakpoint here and there and give it a test run to get a feel of how it works, and why it misses the mark when
>> > it comes to line wrapping bidi paragraphs.
>> >
>> > Could you please give me some pointers: what files should I look into, what functions should I read, possibly
>> > even suggestions for where to put breakpoints and which variables to watch. I'm not asking for a
>> > comprehensive and detailed run down of this feature; just a starting point(s). Every tip and suggestion will be
>> > welcome.
>>
>> The relevant files are bidi.c and xdisp.c. There's a long comment at
>> the beginning of xdisp.c, whose last parts deal with how the bidi
>> reordering is incorporated into the display engine, and a long comment
>> at the beginning of bidi.c that has more details about the reordering
>> itself.
>>
>> Note that this is not an implementation bug, it's a consequence of how
>> the bidi reordering engine's integration with the rest of the display
>> code was designed: we reorder text for display _before_ making the
>> layout decisions. IOW, the layout layer of the display engine is fed
>> characters in _visual_ order, already reordered by bidi.c functions
>> which the layout layer calls when it needs another character. The
>> advantage of this design is that the display engine knows almost
>> nothing about the reordering stuff, it doesn't care about resolved
>> levels etc., because all that was already taken care of.
>>
>> To make line-wrapping do what the UBA describes, we would need to feed
>> the display engine with characters in logical order, but record with
>> each character its resolved bidi level, resulting from partial
>> processing by bidi.c. Then, when a line is completely laid out, we'd
>> need to reorder the glyphs prepared for that line according to UBA
>> rules L1, L2, and L4, using the resolved levels recorded by bidi.c
>> code. (L3 is tricky, because combining marks are applied when
>> producing glyphs, so it has to be solved by "some other method".)
>>
>> The above means we need to redesign the interface between xdisp.c and
>> bidi.c, and then rewrite the current reordering function into
>> something that will work on the glyphs of a laid-out line.
>>
>> That in itself is more or less straightforward refactoring of the
>> existing code, but unfortunately it isn't the scary part of the job.
>> The scary part is all the subtleties of the Emacs display engine and
>> the features it provides, when bidirectional text is involved. For
>> example, many places need to calculate layout metrics without
>> displaying anything. A typical example is vertical-motion when
>> line-move-visual is in effect -- it needs to determine what buffer
>> position is displayed one screen line up or down from a given
>> character. Another example is how we process a mouse click, which
>> starts by determining which buffer position (more accurately, which
>> offset of what object) is displayed at given pixel coordinates.
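
To make the difference between the two orderings described above concrete, here is a minimal Python sketch. It is a toy model and not code from bidi.c or xdisp.c. The assumptions: uppercase letters stand in for right-to-left characters at resolved level 1, everything else is level 0, and a "screen line" is simply every `width` characters. The names `toy_levels`, `reorder_l2`, `reorder_then_wrap` and `wrap_then_reorder` are made up for this illustration. Only UBA rule L2 is modelled; L1, L3 and L4 are left out.

```python
def toy_levels(text):
    """Hypothetical stand-in for level resolution: uppercase = level 1 (RTL)."""
    return [1 if ch.isupper() else 0 for ch in text]


def reorder_l2(items, levels):
    """UBA rule L2: from the highest level down to the lowest odd level,
    reverse every maximal run of items whose level is at least that level."""
    items = list(items)
    levels = list(levels)
    odd_levels = [lv for lv in levels if lv % 2 == 1]
    if not odd_levels:
        return items  # nothing right-to-left on this line
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(items):
            if levels[i] >= level:
                j = i
                while j < len(items) and levels[j] >= level:
                    j += 1
                # Reverse the run; keep each level attached to its item.
                items[i:j] = items[i:j][::-1]
                levels[i:j] = levels[i:j][::-1]
                i = j
            else:
                i += 1
    return items


def reorder_then_wrap(text, width):
    """Order described in the message: reorder first, lay out afterwards."""
    visual = "".join(reorder_l2(text, toy_levels(text)))
    return [visual[i:i + width] for i in range(0, len(visual), width)]


def wrap_then_reorder(text, width):
    """Order the UBA asks for: break lines in logical order, then reorder
    each line using the levels recorded for its characters."""
    levels = toy_levels(text)
    lines = []
    for start in range(0, len(text), width):
        line = text[start:start + width]
        line_levels = levels[start:start + width]
        lines.append("".join(reorder_l2(line, line_levels)))
    return lines


if __name__ == "__main__":
    text = "abCDEFgh"  # logical order; CDEF is one right-to-left run
    print(reorder_then_wrap(text, 4))  # ['abFE', 'DCgh']
    print(wrap_then_reorder(text, 4))  # ['abDC', 'FEgh']
```

In this toy, reordering before wrapping puts the logically later characters E and F on the first line, while wrapping first keeps C and D there. Real line layout works on glyphs with varying widths, so treat this only as a picture of the ordering problem.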
>>
>> These places use functions that "simulate" display -- they perform all
>> the layout calculations, but don't create glyphs (because nothing
>> needs to be displayed). Since glyphs are not created, the "line" to
>> be displayed doesn't exist, and thus the reordering step will have
>> nothing to work on. Whoever will work on fixing line-wrapping will
>> have to figure out how to solve this problem in a way that is
>> compatible with the 2nd sentence of the UBA's section 3.4. There are
>> many complications in this part of the display code, because
>> oftentimes Emacs ends the display "simulation" before reaching the end
>> of the line, and sometimes even starts it in the middle of a line.
>> All this needs to be figured out and implemented when reordering needs
>> to see a full screen line, and implemented in a way that doesn't hurt
>> performance in any significant way.
>>
>> Then there are complications with invisible text: the 'invisible' text
>> property can start and/or end in the middle of a non-base embedding
>> level, and the question is how to produce the result that the user
>> expects, when some of the characters that affect reordering are
>> effectively hidden from the reordering code, because the invisible
>> text is simply skipped and never fed to the layout layer. (With the
>> current design, reordering is done before the text invisibility is
>> considered, so the result is quite naturally the expected one.)
>> Similar problems arise with display properties and overlays which hide
>> portions of buffer text, optionally replacing them with some other
>> text or image -- the reordering step will somehow need to avoid
>> reordering the text of a display string as if it were part of the
>> surrounding buffer text, because that's not what the user expects.
>>
>> Another complication is where glyph production and layout decisions
>> are mixed with bidi level resolution. One such situation is how we
>> implement the display property of the form '(space :align-to HPOS)'
>> which is treated as a paragraph separator for the purposes of bidi
>> reordering (thus supporting display of tables with bidirectional
>> text). If we separate reordering from level resolution, this will
>> have to be rethought if not reimplemented.
>>
>> And I'm quite sure there are other complications that I forget. This
>> is what took the lion's share of the work on making the display engine
>> bidi-aware (because the basic reordering engine which is now bidi.c
>> was written and debugged, as a stand-alone program, 15 years ago).
>> Whoever will work on fixing the line-wrapping issue will have to do at
>> least part of that anew. I surely hope a motivated individual will
>> step forward for the job at some point, but they need to know what
>> they will face.
>>
>
>

--94eb2c1cc68ae3a6590554d1c2e3
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--94eb2c1cc68ae3a6590554d1c2e3--