unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#27525: 25.1; Line wrapping of bidi paragraphs
@ 2017-06-29  7:23 Itai Berli
  2017-06-29 14:55 ` Eli Zaretskii
                   ` (2 more replies)
  0 siblings, 3 replies; 34+ messages in thread
From: Itai Berli @ 2017-06-29  7:23 UTC (permalink / raw)
  To: 27525

The line-wrapping algorithm for formatting multi-lingual paragraphs
containing text in languages of opposite directionality (e.g. English
and Hebrew) is inconsistent with other text editing applications
(including Gmail, Google Docs, Libre Writer, MS-Word, Pages, and
TextEdit), as well as with Emacs itself!

Consider as an example the following paragraph that starts with two
Hebrew words followed by the opening of Lincoln's Gettysburg Address.

נאום גטיסבורג Four score and seven years ago our fathers brought forth
on this continent, a new nation, conceived in Liberty, and dedicated to
the proposition that all men are created equal.

When I type this inside the `M-x report-emacs-bug` buffer, and press
`RET`, the paragraph lines wrap as follows, similar to how the other
applications mentioned above handle it.

ILLUSTRATION: A correct way to line-wrap a bidi paragraph
http://imgur.com/9VDZFz0

If I now copy this paragraph and paste it in a new buffer, the line
wrapping is preserved.

However, when I *type* the same paragraph inside a new buffer (as well as
when I finish typing the paragraph inside the `M-x report-emacs-bug`
buffer, just before pressing `RET`), the lines wrap as follows.

ILLUSTRATION: An incorrect way to line-wrap a bidi paragraph
http://imgur.com/Bckn7zP

Observe that the English text flows from the bottom of the paragraph to
the top, which makes no sense.  The words of a paragraph have a natural,
logical order that is independent of their directionality, but the way
the lines are wrapped in the last screenshot disrupts this order: the
last word ('equal') is placed on the same line as the first two words
(the Hebrew words), whereas the third word ('Four') is positioned two
lines apart.



In GNU Emacs 25.1.1 (x86_64-apple-darwin13.4.0, NS appkit-1265.21
Version 10.9.5 (Build 13F1911))
 of 2016-09-21 built on builder10-9.porkrind.org
Windowing system distributor 'Apple', version 10.3.1504
Configured using:
 'configure --with-ns '--enable-locallisppath=/Library/Application
 Support/Emacs/${version}/site-lisp:/Library/Application
 Support/Emacs/site-lisp' --with-modules'

Configured features:
NOTIFY ACL GNUTLS LIBXML2 ZLIB TOOLKIT_SCROLL_BARS NS MODULES

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Fundamental

Minor modes in effect:
  ivy-mode: t
  shell-dirtrack-mode: t
  projectile-mode: t
  helm-descbinds-mode: t
  async-bytecomp-package-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  buffer-read-only: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent messages:
ad-handle-definition: ‘ibuffer’ got redefined
Turn on helm-projectile key bindings
For information about GNU Emacs and the GNU system, type C-h C-a.

Load-path shadows:
/Users/itaiberli/.emacs.d/elpa/seq-2.20/seq hides
/Applications/Emacs.app/Contents/Resources/lisp/emacs-lisp/seq

Features:
(shadow sort mail-extr emacsbug message rfc822 mml mml-sec epg mm-decode
mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader
sendmail rfc2047 rfc2045 ietf-drums mail-utils colir color counsel
jka-compr esh-util etags xref project swiper reftex reftex-vars
two-column ivy delsel ivy-overlay helm-projectile helm-files rx
image-dired tramp tramp-compat tramp-loaddefs trampver shell pcomplete
format-spec dired-x dired-aux ffap helm-tags helm-bookmark helm-adaptive
helm-info bookmark pp helm-external helm-net browse-url xml url
url-proxy url-privacy url-expand url-methods url-history url-cookie
url-domsuf url-util url-parse auth-source gnus-util mm-util help-fns
mail-prsvr password-cache url-vars mailcap helm-buffers helm-grep
helm-regexp helm-utils helm-locate helm-help helm-types projectile grep
compile comint ansi-color ring ibuf-ext ibuffer thingatpt helm-descbinds
helm easy-mmode helm-source cl-seq eieio-compat eieio eieio-core
helm-multi-match helm-lib dired helm-config helm-easymenu cl-macs
async-bytecomp async advice edmacro kmacro finder-inf tex-site info
package epg-config seq byte-opt gv bytecomp byte-compile cl-extra
help-mode easymenu cconv cl-loaddefs pcase cl-lib time-date mule-util
tooltip eldoc electric uniquify ediff-hook vc-hooks lisp-float-type
mwheel ns-win ucs-normalize term/common-win tool-bar dnd fontset image
regexp-opt fringe tabulated-list newcomment elisp-mode lisp-mode
prog-mode register page menu-bar rfn-eshadow timer select scroll-bar
mouse jit-lock font-lock syntax facemenu font-core frame cl-generic cham
georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese charscript case-table epa-hook
jka-cmpr-hook help simple abbrev minibuffer cl-preloaded nadvice
loaddefs button faces cus-face macroexp files text-properties overlay
sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote kqueue cocoa ns multi-tty
make-network-process emacs)

Memory information:
((conses 16 312113 14836)
 (symbols 48 30403 0)
 (miscs 40 88 163)
 (strings 32 51779 9508)
 (string-bytes 1 1669823)
 (vectors 16 50217)
 (vector-slots 8 844607 6040)
 (floats 8 564 139)
 (intervals 56 243 0)
 (buffers 976 18))





^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-06-29  7:23 bug#27525: 25.1; Line wrapping of bidi paragraphs Itai Berli
@ 2017-06-29 14:55 ` Eli Zaretskii
  2017-06-29 18:35 ` Itai Berli
  2017-07-04  9:10 ` Itai Berli
  2 siblings, 0 replies; 34+ messages in thread
From: Eli Zaretskii @ 2017-06-29 14:55 UTC (permalink / raw)
  To: Itai Berli; +Cc: 27525

> From: Itai Berli <itai.berli@gmail.com>
> Date: Thu, 29 Jun 2017 10:23:15 +0300
> 
> The line-wrapping algorithm for formatting multi-lingual paragraphs
> containing text in languages of opposite directionality (e.g. English
> and Hebrew) is inconsistent with other text editing applications
> (including Gmail, Google Docs, Libre Writer, MS-Word, Pages, and
> TextEdit), as well as with Emacs itself!

The inconsistency with Emacs is because in the first case the text is
broken into separate lines with newlines, whereas in the second it's a
single long line.

> נאום גטיסבורג Four score and seven years ago our fathers brought forth
> on this continent, a new nation, conceived in Liberty, and dedicated to
> the proposition that all men are created equal.
> 
> When I type this inside the `M-x report-emacs-bug` buffer, and press
> `RET`, the paragraph lines wrap as follows, similar to how the other
> applications mentioned above handle it.
> 
> ILLUSTRATION: A correct way to line-wrap a bidi paragraph
> http://imgur.com/9VDZFz0
> 
> If I now copy this paragraph and paste it in a new buffer, the line
> wrapping is preserved.
> 
> However, when I *type* the same paragraph inside a new buffer (as well as
> when I finish typing the paragraph inside the `M-x report-emacs-bug`
> buffer, just before pressing `RET`), the lines wrap as follows.
> 
> ILLUSTRATION: An incorrect way to line-wrap a bidi paragraph
> http://imgur.com/Bckn7zP
> 
> Observe that the English text flows from the bottom of the paragraph to
> the top, which makes no sense.  The words of a paragraph have a natural,
> logical order that is independent of their directionality, but the way
> the lines are wrapped in the last screenshot disrupts this order: the
> last word ('equal') is placed on the same line as the first two words
> (the Hebrew words), whereas the third word ('Four') is positioned two
> lines apart.

Yes, Emacs's line-wrapping doesn't work well when the text
directionality is opposite to the paragraph direction.  The reasons
are technical (I can give the details if someone is interested), but
in a nutshell, meeting the requirements of the UBA in that area would
need a thorough change of how the basic Emacs display layout is designed.

The remedy is usually simple: break the long lines into shorter ones
by inserting newlines.
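
The "paragraph direction" mentioned above is, by default, derived from
the text itself: rules P2 and P3 of UAX#9 take it from the first strong
directional character in the paragraph.  A minimal Python sketch of that
rule follows.  The function name `base_direction` is made up for
illustration, the sketch ignores the isolate-skipping part of P2, and it
is not how Emacs implements the rule.

```python
import unicodedata


def base_direction(paragraph: str) -> str:
    """Return "RTL" or "LTR" from the first strong character (UAX#9 P2/P3).

    Simplification (assumption): text inside directional isolates is
    not skipped, unlike in the full rule P2.
    """
    for ch in paragraph:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew, Arabic, ...
            return "RTL"
        if bidi_class == "L":  # Latin, Greek, Cyrillic, ...
            return "LTR"
    return "LTR"  # no strong character: fall back to left-to-right


# The paragraph from the report starts with two Hebrew words, written
# here with escapes so that the source line itself is not reordered.
hebrew_first = (
    "\u05e0\u05d0\u05d5\u05dd "
    "\u05d2\u05d8\u05d9\u05e1\u05d1\u05d5\u05e8\u05d2 "
    "Four score and seven years ago"
)
print(base_direction(hebrew_first))
print(base_direction("Four score \u05e0\u05d0\u05d5\u05dd"))
```

The first call reports a right-to-left paragraph even though nearly all
of its text is left-to-right, which is the situation in which the
wrapping goes wrong.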






* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-06-29  7:23 bug#27525: 25.1; Line wrapping of bidi paragraphs Itai Berli
  2017-06-29 14:55 ` Eli Zaretskii
@ 2017-06-29 18:35 ` Itai Berli
  2017-07-04  9:10 ` Itai Berli
  2 siblings, 0 replies; 34+ messages in thread
From: Itai Berli @ 2017-06-29 18:35 UTC (permalink / raw)
  To: 27525

> The remedy is usually simple: break the long lines into shorter ones
> by inserting newlines.

I'm sorry but this is no remedy at all. This is like going back to the
typewriter era.






* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-06-29  7:23 bug#27525: 25.1; Line wrapping of bidi paragraphs Itai Berli
  2017-06-29 14:55 ` Eli Zaretskii
  2017-06-29 18:35 ` Itai Berli
@ 2017-07-04  9:10 ` Itai Berli
  2017-07-04  9:11   ` Itai Berli
  2017-07-04 14:40   ` Eli Zaretskii
  2 siblings, 2 replies; 34+ messages in thread
From: Itai Berli @ 2017-07-04  9:10 UTC (permalink / raw)
  To: 27525

I'd like to add that this behavior breaks the Unicode bidirectional
algorithm (UBA), and hence invalidates Emacs' claim of full conformance, or
indeed of weak conformance, for that matter (so-called 'implicit
bidirectionality' -- see section 4.2 of the UBA specifications).

The reason is that section 3.4 'Reordering Resolved Levels' of the
algorithm states (I replaced the bullet points in the original by numbers):

> * The characters are shaped into glyphs [...]
> * The accumulated widths of those glyphs *(in logical order)* are used
>   to determine line breaks.

The Emacs line-wrapping algorithm does not use the logical order of the
glyphs to determine line breaks, as evidenced by the example given in my
original post, which I shall link to again: http://imgur.com/Bckn7zP
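
To make the difference in ordering concrete, here is a toy word-level
model in Python.  It is only a sketch under stated assumptions: whole
words stand in for glyphs, UPPERCASE words stand for Hebrew
(right-to-left) words as in the UAX#9 examples, width is counted in
characters, and all function names are invented for illustration.
`uba_wrap` breaks lines in logical order and then reorders each line;
`reorder_then_wrap` reorders the whole paragraph first and then cuts
screen lines from its visual order, which is a rough model of the
behavior in the screenshot, not of the actual Emacs code.

```python
def word_levels(words):
    """Levels in a right-to-left paragraph: RTL words 1, LTR words 2."""
    return [1 if w.isupper() else 2 for w in words]


def reorder(words):
    """Rule L2 at word granularity: words in left-to-right visual order.

    Letters inside a right-to-left word would be reversed too; this
    word-level toy ignores that.
    """
    if not words:
        return []
    levels = word_levels(words)
    order = list(range(len(words)))
    highest = max(levels)
    lowest_odd = min((l for l in levels if l % 2), default=highest + 1)
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(order):
            j = i
            while j < len(order) and levels[order[j]] >= level:
                j += 1
            if j > i:
                # Reverse one contiguous run at this level or higher.
                order[i:j] = order[i:j][::-1]
                i = j
            else:
                i += 1
    return [words[k] for k in order]


def wrap(words, width):
    """Greedy wrap: fill each line with words up to `width` characters."""
    lines, current, used = [], [], 0
    for w in words:
        need = len(w) + (1 if current else 0)
        if current and used + need > width:
            lines.append(current)
            current, used = [w], len(w)
        else:
            current.append(w)
            used += need
    if current:
        lines.append(current)
    return lines


def uba_wrap(words, width):
    """Break lines in logical order, then reorder each line separately."""
    return [reorder(line) for line in wrap(words, width)]


def reorder_then_wrap(words, width):
    """Reorder the whole paragraph, then cut screen lines from the result.

    A right-to-left paragraph is laid out starting at the right edge, so
    the screen lines are filled from the right end of the visual order.
    """
    visual = reorder(words)
    chunks = wrap(visual[::-1], width)
    return [chunk[::-1] for chunk in chunks]


words = "NEUM GETTYSBURG four score and seven years ago".split()
for name, layout in (("break, then reorder", uba_wrap),
                     ("reorder, then break", reorder_then_wrap)):
    print(name)
    for line in layout(words, 22):
        print(" ".join(line).rjust(22))
```

With the sample words and a width of 22, `uba_wrap` puts `four` on the
first line next to the two uppercase words and `ago` on the last line,
whereas `reorder_then_wrap` puts `ago` on the first line and `four` on
the last: the left-to-right text runs from the bottom up, as in the
second screenshot.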


* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-04  9:10 ` Itai Berli
@ 2017-07-04  9:11   ` Itai Berli
  2017-07-04  9:19     ` Itai Berli
  2017-07-04 14:40   ` Eli Zaretskii
  1 sibling, 1 reply; 34+ messages in thread
From: Itai Berli @ 2017-07-04  9:11 UTC (permalink / raw)
  To: 27525

I obviously didn't end up replacing the bullet points by numbers...


* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-04  9:11   ` Itai Berli
@ 2017-07-04  9:19     ` Itai Berli
  2017-07-04 14:43       ` Eli Zaretskii
  2017-07-04 23:05       ` Richard Stallman
  0 siblings, 2 replies; 34+ messages in thread
From: Itai Berli @ 2017-07-04  9:19 UTC (permalink / raw)
  To: 27525

I'd also like to add what should be obvious (considering the fact that Emacs
provides a line wrapping feature to begin with) that the "remedy" of
breaking long lines by inserting newline characters is not very usable, as
one would have to move the newlines to other locations if one had to add a
few more words to the line.



* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-04  9:10 ` Itai Berli
  2017-07-04  9:11   ` Itai Berli
@ 2017-07-04 14:40   ` Eli Zaretskii
  1 sibling, 0 replies; 34+ messages in thread
From: Eli Zaretskii @ 2017-07-04 14:40 UTC (permalink / raw)
  To: Itai Berli; +Cc: 27525

> From: Itai Berli <itai.berli@gmail.com>
> Date: Tue, 4 Jul 2017 12:10:17 +0300
> 
> I'd like to add that this behavior breaks the Unicode bidirectional algorithm (UBA), and hence invalidates
> Emacs' claim of full conformance, or indeed of weak conformance, for that matter (so-called 'implicit
> bidirectionality' -- see section 4.2 of the UBA specifications).

Yes, line wrapping when text direction is opposite to the paragraph
direction is where Emacs deviates from the UBA.  I've added this
caveat to the Emacs manuals a few days ago.

The "full conformance" part refers to the fact that all the explicit
directional controls are supported, including the isolates.






* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-04  9:19     ` Itai Berli
@ 2017-07-04 14:43       ` Eli Zaretskii
  2017-07-04 14:52         ` Itai Berli
  2017-07-04 23:05       ` Richard Stallman
  1 sibling, 1 reply; 34+ messages in thread
From: Eli Zaretskii @ 2017-07-04 14:43 UTC (permalink / raw)
  To: Itai Berli; +Cc: 27525

> From: Itai Berli <itai.berli@gmail.com>
> Date: Tue, 4 Jul 2017 12:19:42 +0300
> 
> I'd also like to add what should be obvious (considering the fact that Emacs provides a line wrapping feature to
> begin with) that the "remedy" of breaking long lines by inserting newline characters is not very usable, as one
> would have to move the newlines to other locations if one had to add a few more words to the line.

Emacs provides several convenience features for this purpose.  There's
the auto-fill mode, which will refill paragraphs automatically as you
type.  There's also the 'M-q' key which will refill, and optionally
also justify, the marked region, defaulting to the current paragraph
if the region is not active.






* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-04 14:43       ` Eli Zaretskii
@ 2017-07-04 14:52         ` Itai Berli
  2017-07-04 15:19           ` Eli Zaretskii
  0 siblings, 1 reply; 34+ messages in thread
From: Itai Berli @ 2017-07-04 14:52 UTC (permalink / raw)
  To: 27525

Is this considered a bug to be fixed? If so, what priority does it have,
and what's the time frame for a complete fix?


* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-04 14:52         ` Itai Berli
@ 2017-07-04 15:19           ` Eli Zaretskii
  0 siblings, 0 replies; 34+ messages in thread
From: Eli Zaretskii @ 2017-07-04 15:19 UTC (permalink / raw)
  To: Itai Berli; +Cc: 27525

> From: Itai Berli <itai.berli@gmail.com>
> Date: Tue, 4 Jul 2017 17:52:52 +0300
> 
> Is this considered a bug to be fixed?

I would love to see that fixed, yes.

> If so, what priority does it have, and what's the time frame for a complete
> fix?

Priority is only relevant when there's someone who intends to work on
this soon and knows how to fix it.  I don't think there's such a
person at this time.

I myself thought about this quite a lot, and I simply have no idea how
to fix this without redesigning most of the Emacs display code.  And
to do that, it will take someone more talented than myself and/or
someone with much more time on their hands.  Sorry.






* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-04  9:19     ` Itai Berli
  2017-07-04 14:43       ` Eli Zaretskii
@ 2017-07-04 23:05       ` Richard Stallman
  2017-07-05  2:29         ` Eli Zaretskii
  1 sibling, 1 reply; 34+ messages in thread
From: Richard Stallman @ 2017-07-04 23:05 UTC (permalink / raw)
  To: Itai Berli; +Cc: 27525

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > I'd also like to add what should be obvious (considering the fact that Emacs
  > provides a line wrapping feature to begin with) that the "remedy" of
  > breaking long lines by inserting newline characters is not very usable, as
  > one would have to move the newlines to other locations if one had to add a
  > few more words to the line.

It might be good if Emacs could refill lines automatically the way
some other editors do.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.







* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-04 23:05       ` Richard Stallman
@ 2017-07-05  2:29         ` Eli Zaretskii
  2017-07-05 22:59           ` Richard Stallman
  2017-07-09 18:17           ` Benjamin Riefenstahl
  0 siblings, 2 replies; 34+ messages in thread
From: Eli Zaretskii @ 2017-07-05  2:29 UTC (permalink / raw)
  To: rms; +Cc: itai.berli, 27525

> From: Richard Stallman <rms@gnu.org>
> Date: Tue, 04 Jul 2017 19:05:30 -0400
> Cc: 27525@debbugs.gnu.org
> 
>   > I'd also like to add what should be obvious (considering the fact that Emacs
>   > provides a line wrapping feature to begin with) that the "remedy" of
>   > breaking long lines by inserting newline characters is not very usable, as
>   > one would have to move the newlines to other locations if one had to add a
>   > few more words to the line.
> 
> It might be good if Emacs could refill lines automatically the way
> some other editors do.

We already have that: "M-x visual-line-mode RET".






* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-05  2:29         ` Eli Zaretskii
@ 2017-07-05 22:59           ` Richard Stallman
  2017-07-06  2:39             ` Eli Zaretskii
  2017-07-09 18:17           ` Benjamin Riefenstahl
  1 sibling, 1 reply; 34+ messages in thread
From: Richard Stallman @ 2017-07-05 22:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: itai.berli, 27525

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > We already have that: "M-x visual-line-mode RET".

That seems to break lines
but not to fill them together.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.







* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-05 22:59           ` Richard Stallman
@ 2017-07-06  2:39             ` Eli Zaretskii
  2017-07-06 16:01               ` Richard Stallman
  0 siblings, 1 reply; 34+ messages in thread
From: Eli Zaretskii @ 2017-07-06  2:39 UTC (permalink / raw)
  To: rms; +Cc: itai.berli, 27525

> From: Richard Stallman <rms@gnu.org>
> CC: itai.berli@gmail.com, 27525@debbugs.gnu.org
> Date: Wed, 05 Jul 2017 18:59:07 -0400
> 
>   > We already have that: "M-x visual-line-mode RET".
> 
> That seems to break lines
> but not to fill them together.

Then maybe I don't understand what you meant by "fill".  This feature
produces exactly the same display as M-q, but it doesn't insert
newline characters into the buffer.  Isn't that what you meant by
"filling"?






* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-06  2:39             ` Eli Zaretskii
@ 2017-07-06 16:01               ` Richard Stallman
  2017-07-06 16:17                 ` Eli Zaretskii
  0 siblings, 1 reply; 34+ messages in thread
From: Richard Stallman @ 2017-07-06 16:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: itai.berli, 27525

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > Then maybe I don't understand what you meant by "fill".  This feature
  > produces exactly the same display as M-q, but it doesn't insert
  > newline characters into the buffer.

That isn't what I saw.  What I saw is that it split individual lines
but didn't recombine them.

For instance, I inserted this text

======================================================================
Then maybe I don't understand what you meant by "fill".  This feature
produces exactly the same display as M-q, but it doesn't insert newline characters into the buffer.
Isn't that what you meant by
"filling"?
======================================================================

and enabled visual-line-mode, and what I see looks like

======================================================================
Then maybe I don't understand what you meant by "fill".  This feature
produces exactly the same display as M-q, but it doesn't insert newline
characters into the buffer.
Isn't that what you meant by
"filling"?
======================================================================

My Emacs sources are from December.  If this doesn't fail for you,
maybe it has been fixed since then.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.







* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-06 16:01               ` Richard Stallman
@ 2017-07-06 16:17                 ` Eli Zaretskii
  2017-07-07 18:23                   ` Richard Stallman
  0 siblings, 1 reply; 34+ messages in thread
From: Eli Zaretskii @ 2017-07-06 16:17 UTC (permalink / raw)
  To: rms; +Cc: itai.berli, 27525

> From: Richard Stallman <rms@gnu.org>
> CC: itai.berli@gmail.com, 27525@debbugs.gnu.org
> Date: Thu, 06 Jul 2017 12:01:42 -0400
> 
>   > Then maybe I don't understand what you meant by "fill".  This feature
>   > produces exactly the same display as M-q, but it doesn't insert
>   > newline characters into the buffer.
> 
> That isn't what I saw.  What I saw is that it split individual lines
> but didn't recombine them.

Oh, you expected Emacs to remove the hard newlines as part of the
feature?  That's right, it doesn't do that, and neither do the other
editors.  They only remove "soft" newlines, and Emacs does that as
well.  Removing hard newlines would be a misfeature, IMO, since in
this mode they are user-controlled, and Emacs has no business
second-guessing the user.

> My Emacs sources are from December.  If this doesn't fail for you,
> maybe it has been fixed since then.

No, what you see is how it's supposed to work.






* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-06 16:17                 ` Eli Zaretskii
@ 2017-07-07 18:23                   ` Richard Stallman
  2017-07-07 19:21                     ` Eli Zaretskii
  0 siblings, 1 reply; 34+ messages in thread
From: Richard Stallman @ 2017-07-07 18:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: itai.berli, 27525

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  >   Removing hard newlines would be a misfeature, IMO, since in
  > this mode they are user-controlled, and Emacs has no business
  > second-guessing the user.

That's clearly right for editing _in_ that mode.

But it might be useful to have an alternate way to get into the mode,
one which would convert hard newlines to soft ones.  Exiting would
convert them back.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.







* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-07 18:23                   ` Richard Stallman
@ 2017-07-07 19:21                     ` Eli Zaretskii
  0 siblings, 0 replies; 34+ messages in thread
From: Eli Zaretskii @ 2017-07-07 19:21 UTC (permalink / raw)
  To: rms; +Cc: itai.berli, 27525

> From: Richard Stallman <rms@gnu.org>
> CC: itai.berli@gmail.com, 27525@debbugs.gnu.org
> Date: Fri, 07 Jul 2017 14:23:47 -0400
> 
> But it might be useful to have an alternate way to get into the mode,
> one which would convert hard newlines to soft ones.  Exiting would
> convert them back.

Should be a nice new feature, yes.






* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-05  2:29         ` Eli Zaretskii
  2017-07-05 22:59           ` Richard Stallman
@ 2017-07-09 18:17           ` Benjamin Riefenstahl
  2017-07-09 18:30             ` Eli Zaretskii
  1 sibling, 1 reply; 34+ messages in thread
From: Benjamin Riefenstahl @ 2017-07-09 18:17 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 27525, itai.berli

>> From: Richard Stallman <rms@gnu.org>
>> It might be good if Emacs could refill lines automatically the way
>> some other editors do.

Eli Zaretskii writes:
> We already have that: "M-x visual-line-mode RET".

JFTR, even that does not help in this case.  With visual-line-mode the
order of the lines is still wrong with the text that the OP gave.

benny






* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-09 18:17           ` Benjamin Riefenstahl
@ 2017-07-09 18:30             ` Eli Zaretskii
  2017-07-19  8:50               ` Itai Berli
  0 siblings, 1 reply; 34+ messages in thread
From: Eli Zaretskii @ 2017-07-09 18:30 UTC (permalink / raw)
  To: Benjamin Riefenstahl; +Cc: 27525, itai.berli

> From: Benjamin Riefenstahl <b.riefenstahl@turtle-trading.net>
> Cc: 27525@debbugs.gnu.org,  itai.berli@gmail.com
> Date: Sun, 09 Jul 2017 20:17:41 +0200
> 
> >> From: Richard Stallman <rms@gnu.org>
> >> It might be good if Emacs could refill lines automatically the way
> >> some other editors do.
> 
> Eli Zaretskii writes:
> > We already have that: "M-x visual-line-mode RET".
> 
> JFTR, even that does not help in this case.  With visual-line-mode the
> order of the lines is still wrong with the text that the OP gave.

Of course.  It isn't supposed to help.  From the POV of the display
engine, visual-line-mode is just a fancy way of producing
continuation lines, so all the problems you see with continued lines
will still be there in visual-line-mode.






* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-09 18:30             ` Eli Zaretskii
@ 2017-07-19  8:50               ` Itai Berli
  2017-07-19 12:59                 ` Itai Berli
  2017-07-19 17:24                 ` Eli Zaretskii
  0 siblings, 2 replies; 34+ messages in thread
From: Itai Berli @ 2017-07-19  8:50 UTC (permalink / raw)
  To: 27525

Eli, in a different bug report, namely 27526, I recently wrote the following
remark:

> the line-wrapping bug is still a major annoyance, at best, and until it
> is fixed, Emacs cannot claim to be Unicode compliant.

to which you replied:

> I disagree, as I already said many times.

You do agree, though, that Emacs does not conform to the Unicode
Bidirectional Algorithm as specified in the Unicode Standard Annex #9.
After all, the following paragraph appears in the bidi code itself (
http://git.savannah.gnu.org/cgit/emacs.git/tree/src/bidi.c):

   Note that, because reordering is implemented below the level in
   xdisp.c that breaks glyphs into screen lines, we are violating
   paragraph 3.4 of UAX#9, which mandates that line breaking shall be
   done before reordering each screen line separately.

So the only thing on which you disagree with me is whether non-conformance
to the Unicode Bidirectional Algorithm is tantamount to non-conformance to the
Unicode Standard. Well, this disagreement is easily settled by reading
article C12 'Bidirectional Text' of section 3.2 'Conformance Requirements'
of the Unicode Standard:

A process that displays text containing supported right-to-left characters
or embedding codes shall display all visible representations of characters
(excluding format characters) in the same order as if the Bidirectional
Algorithm had been applied to the text, unless tailored by a higher-level
protocol as permitted by the specification.

* The Bidirectional Algorithm is specified in Unicode Standard Annex #9,
“Unicode Bidirectional Algorithm.”


On Sun, Jul 9, 2017 at 9:30 PM, Eli Zaretskii <eliz@gnu.org> wrote:

> > From: Benjamin Riefenstahl <b.riefenstahl@turtle-trading.net>
> > Cc: 27525@debbugs.gnu.org,  itai.berli@gmail.com
> > Date: Sun, 09 Jul 2017 20:17:41 +0200
> >
> > >> From: Richard Stallman <rms@gnu.org>
> > >> It might be good if Emacs could refill lines automatically the way
> > >> some other editors do.
> >
> > Eli Zaretskii writes:
> > > We already have that: "M-x visual-line-mode RET".
> >
> > JFTR, even that does not help in this case.  With visual-line-mode the
> > order of the lines is still wrong with the text that the OP gave.
>
> Of course.  It isn't supposed to help.  From the POV of the display
> engine, visual-line-mode is just a fancy kind of producing
> continuation lines, so all the problems you see with continued lines
> will still be there in visual-line-mode.
>

[-- Attachment #2: Type: text/html, Size: 4287 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-19  8:50               ` Itai Berli
@ 2017-07-19 12:59                 ` Itai Berli
  2017-07-19 17:28                   ` Eli Zaretskii
  2017-07-19 17:24                 ` Eli Zaretskii
  1 sibling, 1 reply; 34+ messages in thread
From: Itai Berli @ 2017-07-19 12:59 UTC (permalink / raw)
  To: 27525

[-- Attachment #1: Type: text/plain, Size: 4452 bytes --]

And in case you counter that Emacs takes advantage of the "higher-level
protocol" clause, this clause doesn't apply to paragraph 3.4, which Emacs
violates.

But don't take my word for it. I contacted Mr. Aharon Lanin in the matter.
Mr. Lanin is a senior software engineer at Google Tel Aviv as well as one
of the three editors of the Unicode Bidirectional Algorithm from version
6.3.0 till the latest one, v. 10.0.0. I presented him with the following
screenshot of a bidi paragraph in Emacs (it's the same screenshot as I
posted originally in the present ticket): http://imgur.com/Bckn7zP

I list below an excerpt from our conversation, which Mr. Lanin has given me
permission to quote.

--- EXCERPT BEGIN ---

***Me***: Just to be clear, are the following statements correct for the
Unicode Standard v. 8.0.0 and above? (I'm mentioning v. 8.0.0 because this
is the version that Emacs claims conformance to.)

1. The way Emacs handles line wrapping of bidi paragraphs does not satisfy
section 3.4 of the Unicode Bidirectional Algorithm. There are no provisions
for higher-level protocol interpretation of this section.

2. If a candidate implementation of the Unicode Bidirectional Algorithm
doesn't satisfy section 3.4, it does not conform to the Unicode
Bidirectional Algorithm.

3. If a candidate implementation of the Unicode Standard does not conform
to the Unicode Bidirectional Algorithm (of the same version), it does not
conform to the Unicode Standard.

***Lanin***: I think so, but I am a programmer, not a lawyer :-)

--- EXCERPT END ---

The Emacs manual and all official Emacs publications should make it clear
that Emacs does not conform to the Unicode Standard. Anything else is
simply not true, and is deliberately misleading.

[-- Attachment #2: Type: text/html, Size: 7719 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-19  8:50               ` Itai Berli
  2017-07-19 12:59                 ` Itai Berli
@ 2017-07-19 17:24                 ` Eli Zaretskii
  1 sibling, 0 replies; 34+ messages in thread
From: Eli Zaretskii @ 2017-07-19 17:24 UTC (permalink / raw)
  To: Itai Berli; +Cc: 27525

> From: Itai Berli <itai.berli@gmail.com>
> Date: Wed, 19 Jul 2017 11:50:54 +0300
> 
> Eli, in a different bug report, namely 27526, I recently wrote the following remark:
> 
> > the line-wrapping bug is still a major annoyance, at best, and until it is fixed, Emacs cannot claim to be
> Unicode compliant.
> 
> to which you replied:
> 
> > I disagree, as I already said many times.
> 
> You do agree, though, that Emacs does not conform to the Unicode Bidirectional Algorithm as specified in the
> Unicode Standard Annex #9.

I maintain that Emacs deviates from the UBA in a relatively minor way,
in an aspect that is only tangentially related to reordering
bidirectional text for display, and that raises its head in situations
that are relatively rare in practice, and in many of those rare cases
can be easily worked around by breaking long lines.

> So the only thing you disagree with me is that non-conformance to the Unicode Bidirectional Algorithm is
> tantamount to non-conformance to the Unicode Standard.

Not only, see above.

> Well, this disagreement is easily settled by reading
> article C12 'Bidirectional Text' of section 3.2 'Conformance Requirements' of the Unicode Standard:

No, it is not settled; see above.

And I don't really understand the purpose of your insistence on the
formal definition of this deviation.  It certainly won't help fix
this issue any time soon, not unless someone steps forward to
do the job, which IMO is quite large.  All it does is cause me to
think, for the first time in many years, whether I indeed had to
invest all that huge amount of time and energy in single-handedly
coding, testing, and debugging the bidirectional text support for
Emacs, which even today, 10 years later, still shines among all the
bidi-aware editors out there, certainly among those of the Free
Software variety.  Even the fribidi library didn't yet catch up with
Unicode 6.3 and later.

If after all that all I get is this badgering about a minor issue
whose solution needs a thorough rewrite of the related code, then I
wish I never wasted those efforts working on a feature which I naïvely
assumed would be tremendously useful to many, and that in fact causes
only negative reactions from the few who use it.





^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-19 12:59                 ` Itai Berli
@ 2017-07-19 17:28                   ` Eli Zaretskii
  2017-07-19 21:40                     ` Itai Berli
  0 siblings, 1 reply; 34+ messages in thread
From: Eli Zaretskii @ 2017-07-19 17:28 UTC (permalink / raw)
  To: Itai Berli; +Cc: 27525

> From: Itai Berli <itai.berli@gmail.com>
> Date: Wed, 19 Jul 2017 15:59:14 +0300
> 
> The Emacs manual and all official Emacs publications should make it clear that Emacs does not conform to
> the Unicode Standard. Anything else is simply not true, and is deliberately misleading.

The Emacs manual already describes this deviation.





^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-19 17:28                   ` Eli Zaretskii
@ 2017-07-19 21:40                     ` Itai Berli
  2017-07-20  5:08                       ` Eli Zaretskii
  0 siblings, 1 reply; 34+ messages in thread
From: Itai Berli @ 2017-07-19 21:40 UTC (permalink / raw)
  To: 27525

[-- Attachment #1: Type: text/plain, Size: 4249 bytes --]

I appreciate your hard work. I can imagine that it took you a lot of time,
effort and agony to pull off this feature, and your efforts were
worthwhile: you created a good and helpful feature. For some purposes the
feature is perfect as it is, especially now that you've allowed
customization of the paragraph separator. I believe Emacs can be the #1
plain-text bidi editor out there, but this hinges on fixing this bug.

> I maintain that Emacs deviates from the UBA in a relatively minor way,
in an aspect that is only tangentially related to reordering
bidirectional text for display, and that raises its head in situations
that are relatively rare in practice, and in many of those rare cases
can be easily worked around by breaking long lines.

One of the valuable aspects of an ISO standard is that it is not left to
the personal judgment of a programmer to decide what is worth implementing,
and how to do so. It is not for you to decide what is a minor detail and
what is a major one, what is tangential and what is core. You need to
implement it to the letter, or else you can't claim conformance, no matter
how slight you imagine your deviation to be.

On what do you base your claim that this problem occurs relatively rarely
in practice? This is the kind of statement that only a specialist
linguist/statistician can make. And have you taken into account the type of
demographics who use Emacs' bidi feature and the kinds of texts they're
likely to type?

Contrary to what you said, my personal experience shows that this is a major
inconvenience, and that it is a situation that occurs very often, almost
every paragraph, in fact, since I write primarily LaTeX documents where
English markup is intermixed with predominantly Hebrew text containing
frequent quotes from English textbooks and articles.

Yes, breaking lines is a possible workaround for LaTeX, but it makes for
ugly and erratic looking paragraphs that are difficult to read and edit.
And what about documents that are not LaTeX? What workaround do they have?

You mention breaking "long lines", but this is not just a problem of long
lines. It takes just two English words inside a Hebrew paragraph that
happen to fall on a line break, to manifest this bug.

>  even today, 10 years later, still shines among all the
bidi-aware editors out there, certainly among those of the Free
Software variety.

Yes, Emacs shines as one of the very rare bidi-aware text editors that
enable entering explicit directional formatting characters. This is indeed
to Emacs' credit and is a very helpful feature.

However, Emacs also shines as possibly the only bidi-aware text editor that
botches the line wrapping of bidi paragraphs. Every single editor that I've
checked gets it right: from Word to Kate to GEdit to Google Docs to
BlueFish to TextEdit.

> And I don't really understand what is the purpose of your insistence
on the formal definition of this deviation.  It certainly won't help
fixing this issue any time soon, not unless someone steps forward to
do the job, which IMO is quite large.

I don't know what you mean by 'the formal definition of this deviation'. I
think that Emacs should not mislead users and potential users, and that it
should not claim to conform to a standard when it does not. I think that
when prospective users google "Emacs bidi" or "Emacs unicode" they should be
able to easily see that there's a problem with bidi line wrapping and that
if they require a text editor that is Unicode compliant they should look
elsewhere. The keywords are: transparency, truth in advertising,
user-friendliness, and standards orientation.

> The Emacs manual already describes this deviation.

In the online manual, sections 22.19 (Bidirectional Editing) and 37.26
(Bidirectional Display) claim that Emacs implements the Unicode
Bidirectional Algorithm.


[-- Attachment #2: Type: text/html, Size: 6557 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-19 21:40                     ` Itai Berli
@ 2017-07-20  5:08                       ` Eli Zaretskii
  2017-07-20  7:01                         ` Itai Berli
  0 siblings, 1 reply; 34+ messages in thread
From: Eli Zaretskii @ 2017-07-20  5:08 UTC (permalink / raw)
  To: Itai Berli; +Cc: 27525

> From: Itai Berli <itai.berli@gmail.com>
> Date: Thu, 20 Jul 2017 00:40:15 +0300
> 
> I believe Emacs
> can be the #1 plain-text bidi editor out there, but this hinges on fixing this bug.

And I believe you exaggerate the importance of this issue, and how
much it diminishes the usefulness of the Emacs bidi support.  Can we
agree to disagree about that, now that we've reiterated that
disagreement many times, and all of that is forever recorded in the
bug tracker?

> 
> > I maintain that Emacs deviates from the UBA in a relatively minor way,
> in an aspect that is only tangentially related to reordering
> bidirectional text for display, and that raises its head in situations
> that are relatively rare in practice, and in many of those rare cases
> can be easily worked around by breaking long lines.
> 
> One of the valuable aspects of an ISO standard is that it is not left to the personal judgment of a programmer
> to decide what is worth implementing, and how to do so. It is not for you to decide what is a minor detail and
> what is a major one, what is tangential and what is core. You need to implement it to the letter, or else you
> can't claim conformance, no matter how slight you imagine your deviation to be.

Of course, it's for me to decide.  Emacs is not an implementation of
the Unicode Standard: Emacs _follows_ the Unicode recommendations
where we decide it to be useful/practical, and doesn't where we don't.

> On what do you base your claim that this problem occurs relatively rarely in practice? This is the kind of
> statement that only a specialist linguist/statistician can make. And have you taken into account the type of
> demographics who use Emacs' bidi feature and the kinds of texts they're likely to type?

It doesn't take a statistician/linguist to realize that

  . long lines that wrap on Emacs display are rare to begin with
  . lines with predominantly RTL text in LTR paragraphs are rare, and
    likewise lines with predominantly LTR text in RTL paragraphs
  . multiplying 2 rare cases makes the result very rare
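
The last bullet treats the two conditions as independent. Writing W for "the line is long enough to wrap" and M for "the line mixes directions enough to be affected" (both symbols are notation introduced here for illustration, not taken from the thread), the general relation is:

```latex
P(W \cap M) = P(W)\,P(M \mid W),
\qquad\text{which reduces to}\qquad
P(W \cap M) = P(W)\,P(M)
\quad\text{only if } W \text{ and } M \text{ are independent.}
```

So the product of two small probabilities estimates the combined frequency only to the extent that wrapping and direction mixing are unrelated in the text being edited.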

> Contrary to what you said, my personal experience shows that this is a major inconvenience, and that it is a
> situation that occurs very often, almost every paragraph, in fact, since I write primarily LaTeX documents
> where English markup is intermixed with predominantly Hebrew text containing frequent quotes from English
> textbooks and articles.

LaTeX documents can easily work around the problem by breaking long
lines into shorter ones.

> Yes, breaking lines is a possible workaround for LaTeX, but it makes for ugly and erratic looking paragraphs
> that are difficult to read and edit.

I fail to see why it would be ugly or hard to read.  Especially since
you can now have a different paragraph direction after every newline.
Perhaps you need to break lines more judiciously, not at random
points.

> And what about documents that are not LaTeX? What workaround do they
> have?

Plain text documents, and documents that are "nearly plain text", like
TeX, Texinfo, and other similar systems, rarely if ever consider
newlines as significant.  So this workaround is available there as
well.  About the only exception I know of is poetry, where over-long
lines are even rarer.

Btw, on GUI terminals there's one other workaround: make your Emacs
window wider.  That works with any file/buffer, not just text-like
ones.

> You mention breaking "long lines", but this is not just a problem of long lines. It takes just two English words
> inside a Hebrew paragraph that happen to fall on a line break, to manifest this bug.

Yeah, and how frequently does that happen?

> However, Emacs also shines as possibly the only bidi-aware text editor that botches the line wrapping of bidi
> paragraphs. Every single editor that I've checked gets it right: from Word to Kate to GEdit to Google Docs to
> BlueFish to TextEdit.

You are free to use those other bidi-aware editors, if they suit your
needs better.  They don't have half the bidi features you get in
Emacs, but if line-wrapping is so much more important for you than all
the rest of the UBA, you don't have to use Emacs.

> > The Emacs manual already describes this deviation.
> 
> In the online manual sections 22.19 (Bidirectional Editing) and 37.26 (Bidirectional Display) claim that Emacs
> implements the Unicode Bidirectional Algorithm.

You have the latest sources in the Git repository you cloned, look
there for the latest text.

Once again: this is an annoyance, and I'd love to see it fixed.  But
it's a minor annoyance, which happens rarely, and in most cases there
are workarounds.  Fixing it is a large job, and will take a motivated
volunteer with a lot of talent or a lot of free time (or both).  Until
we are lucky to have that, we will have to live with this annoyance.

Can we PLEASE finally agree to disagree about this?  I see no reason
for discussing this further, as we are just repeating the same
arguments again and again.





^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-20  5:08                       ` Eli Zaretskii
@ 2017-07-20  7:01                         ` Itai Berli
  2017-07-20 11:09                           ` Eli Zaretskii
  0 siblings, 1 reply; 34+ messages in thread
From: Itai Berli @ 2017-07-20  7:01 UTC (permalink / raw)
  To: 27525

[-- Attachment #1: Type: text/plain, Size: 5583 bytes --]

I see no reason to continue this discussion any further.

One thing I'm curious about, though. What bidi features exist in Emacs,
half of which the other editors don't have? Which features were you
referring to when you wrote that, thanks to them, "10 years later, Emacs
still shines among all the bidi-aware editors out there"?

[-- Attachment #2: Type: text/html, Size: 6700 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-20  7:01                         ` Itai Berli
@ 2017-07-20 11:09                           ` Eli Zaretskii
  2017-07-21  6:19                             ` Itai Berli
  0 siblings, 1 reply; 34+ messages in thread
From: Eli Zaretskii @ 2017-07-20 11:09 UTC (permalink / raw)
  To: Itai Berli; +Cc: 27525

> Resent-Sender: help-debbugs@gnu.org
> From: Itai Berli <itai.berli@gmail.com>
> Date: Thu, 20 Jul 2017 10:01:33 +0300
> 
> I see no reason to continue this discussion any further.

Thank you.

> One thing I'm curious about, though. What bidi features exist in Emacs, half of which the other editors don't
> have? Which features were you referring to when you wrote that, thanks to them, "10 years later, Emacs still
> shines among all the bidi-aware editors out there"?

 . For starters, all the UBA features are fully supported, including
   the directional isolates and bracket-matching (a.k.a. "BPA").
 . Support for bidirectional display on both GUI and text-mode terminals.
 . Both logical-order and visual-order cursor motion.
 . Full support for Arabic shaping and other complex-script shaping
   features in bidirectional text (e.g., Hebrew "nikkud").
 . Bidi formatting controls are visible on screen, so users don't need
   to guess why display looks like it does.
 . Variables to control where paragraphs begin and end, for the
   purposes of determining base paragraph direction.
 . Variables to disable mirroring of parentheses due to bidi context,
   and even disable bidi reordering entirely, if needed.
 . Lisp functions that are necessary when writing bidi-aware
   customizations and features:
   . a function that returns base paragraph direction at point
   . a function that returns resolved bidi levels for a line
   . a function that takes a string and wraps it so that it could be
     concatenated with other strings without fear of producing jumbled
     display due to reordering (important for tabular display)
   . a function to find characters whose directionality was overridden
     by bidi controls (which could be used to maliciously dupe the
     user to think a string is not what it really is)
   . a function to return a substring of buffer text surrounded by
     bidi controls that make sure its visual appearance will not
     change when copied to a different portion of text

And that is even before we consider Emacs-only features, which those
other editors can only dream about, like mouse-highlight, display and
overlay strings, invisible text, text alignment on display, etc. --
all of which are bidi-aware in Emacs.





^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-20 11:09                           ` Eli Zaretskii
@ 2017-07-21  6:19                             ` Itai Berli
  2017-07-21  8:37                               ` Eli Zaretskii
  0 siblings, 1 reply; 34+ messages in thread
From: Itai Berli @ 2017-07-21  6:19 UTC (permalink / raw)
  To: 27525

[-- Attachment #1: Type: text/plain, Size: 3295 bytes --]

You sure put a lot of thought and effort into it. Thank you for that.

Now that I have downloaded the source code, I'd like to take a look at this
problem first hand. I'm not a programmer, not even an amateur one, but I
can sometimes make sense of the general gist of code when I read it, and
I'd like to take a look at the part of code that's responsible for the
present bug, maybe put a breakpoint here and there and give it a test run
to get a feel of how it works, and why it misses the mark when it comes to
line wrapping bidi paragraphs.

Could you please give me some pointers: what files should I look into, what
functions should I read, possibly even suggestions for where to put
breakpoints and which variables to watch? I'm not asking for a
comprehensive and detailed rundown of this feature; just a starting
point or two. Every tip and suggestion will be welcome.

[-- Attachment #2: Type: text/html, Size: 4036 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-21  6:19                             ` Itai Berli
@ 2017-07-21  8:37                               ` Eli Zaretskii
  2017-07-21  9:44                                 ` Itai Berli
  0 siblings, 1 reply; 34+ messages in thread
From: Eli Zaretskii @ 2017-07-21  8:37 UTC (permalink / raw)
  To: Itai Berli; +Cc: 27525

> From: Itai Berli <itai.berli@gmail.com>
> Date: Fri, 21 Jul 2017 09:19:25 +0300
> 
> Now that I have downloaded the source code, I'd like to take a look at this problem first hand. I'm not a
> programmer, not even an amateur one, but I can sometimes make sense of the general gist of code when I
> read it, and I'd like to take a look at the part of code that's responsible for the present bug, maybe put a
> breakpoint here and there and give it a test run to get a feel of how it works, and why it misses the mark when
> it comes to line wrapping bidi paragraphs.
> 
> Could you please give me some pointers: what files should I look into, what functions should I read, possibly
> even suggestions for where to put breakpoints and which variables to watch. I'm not asking for a
> comprehensive and detailed run down of this feature; just a starting point(s). Every tip and suggestion will be
> welcome.

The relevant files are bidi.c and xdisp.c.  There's a long comment at
the beginning of xdisp.c, whose last parts deal with how the bidi
reordering is incorporated into the display engine, and a long comment
at the beginning of bidi.c that has more details about the reordering
itself.

Note that this is not an implementation bug; it's a consequence of how
the bidi reordering engine's integration with the rest of the display
code was designed: we reorder text for display _before_ making the
layout decisions.  IOW, the layout layer of the display engine is fed
characters in _visual_ order, already reordered by bidi.c functions
which the layout layer calls when it needs another character.  The
advantage of this design is that the display engine knows almost
nothing about the reordering stuff, it doesn't care about resolved
levels etc., because all that was already taken care of.
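
As a rough illustration of that ordering (a toy model, not the actual
xdisp.c or bidi.c code), the Python sketch below feeds a layout routine
that knows nothing about levels with words that were already reordered.
The names visual_stream and layout, the hand-assigned levels, the
HEB1/HEB2 placeholders standing in for Hebrew words, and the limit of
four words per row are all assumptions made for the example.  It only
covers a right-to-left paragraph whose embedded runs are all at level 2.

```python
from itertools import groupby


def visual_stream(words):
    """Yield (text, level) pairs in the order a right-to-left row is filled,
    starting at its right edge.

    Toy special case: base level 1 with embedded level-2 runs only.
    """
    for level, run in groupby(words, key=lambda w: w[1]):
        run = list(run)
        # Scanning from the right edge, a left-to-right run is met
        # last word first, so it is delivered reversed.
        yield from (reversed(run) if level == 2 else run)


def layout(stream, words_per_row):
    """Toy layout layer: breaks rows without looking at bidi levels."""
    rows, row = [], []
    for text, _level in stream:
        row.append(text)
        if len(row) == words_per_row:
            rows.append(row)
            row = []
    if row:
        rows.append(row)
    # Each row was filled right to left; mirror it to get screen order.
    return [" ".join(reversed(r)) for r in rows]


# Logical order: two "Hebrew" words, then English text.
paragraph = [("HEB1", 1), ("HEB2", 1), ("Four", 2), ("score", 2),
             ("and", 2), ("seven", 2), ("years", 2), ("ago", 2)]

rows = layout(visual_stream(paragraph), 4)
print("\n".join(rows))
# years ago HEB2 HEB1
# Four score and seven
```

In this model the English words flow from the bottom row to the top
one, which is the behavior reported in this bug: the rows are broken
only after the whole paragraph has been put into visual order.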

To make line-wrapping do what the UBA describes, we would need to feed
the display engine with characters in logical order, but record with
each character its resolved bidi level, resulting from partial
processing by bidi.c.  Then, when a line is completely laid out, we'd
need to reorder the glyphs prepared for that line according to UBA
rules L1, L2, and L4, using the resolved levels recorded by bidi.c
code.  (L3 is tricky, because combining marks are applied when
producing glyphs, so it has to be solved by "some other method".)
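
A minimal Python sketch of that alternative, under toy assumptions
(words instead of characters, hand-assigned levels, a fixed number of
words per row): rows are broken in logical order, each glyph carries
its resolved level, and only then is rule L2 applied to each row.
Glyph, reorder_row and wrap_then_reorder are hypothetical names, the
HEB1/HEB2 strings are placeholders for Hebrew words, and rules L1, L3
and L4 are not modeled at all.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    text: str
    level: int  # resolved embedding level recorded with the glyph


def reorder_row(row):
    """UBA rule L2 on one laid-out row: from the highest level down to the
    lowest odd level, reverse every maximal run at that level or higher."""
    row = list(row)
    odd = [g.level for g in row if g.level % 2]
    if not odd:
        return row  # no odd level on this row, nothing to reverse
    for threshold in range(max(g.level for g in row), min(odd) - 1, -1):
        i = 0
        while i < len(row):
            j = i
            while j < len(row) and row[j].level >= threshold:
                j += 1
            if j > i:
                row[i:j] = row[i:j][::-1]
                i = j
            else:
                i += 1
    return row


def wrap_then_reorder(glyphs, per_row):
    """Break rows in logical order first, then reorder each row on its own."""
    rows = [glyphs[i:i + per_row] for i in range(0, len(glyphs), per_row)]
    return [" ".join(g.text for g in reorder_row(r)) for r in rows]


paragraph = [Glyph("HEB1", 1), Glyph("HEB2", 1), Glyph("Four", 2),
             Glyph("score", 2), Glyph("and", 2), Glyph("seven", 2),
             Glyph("years", 2), Glyph("ago", 2)]

rows = wrap_then_reorder(paragraph, 4)
print("\n".join(rows))
# Four score HEB2 HEB1
# and seven years ago
```

Here the English text reads top to bottom, because the row breaks were
decided before any reordering took place.  Right-alignment of the rows
in a right-to-left paragraph is left out of the sketch.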

The above means we need to redesign the interface between xdisp.c and
bidi.c, and then rewrite the current reordering function into
something that will work on the glyphs of a laid-out line.

That in itself is more or less straightforward refactoring of the
existing code, but unfortunately it isn't the scary part of the job.
The scary part is all the subtleties of the Emacs display engine and
the features it provides, when bidirectional text is involved.  For
example, many places need to calculate layout metrics without
displaying anything.  A typical example is vertical-motion when
line-move-visual is in effect -- it needs to determine what buffer
position is displayed one screen line up or down from a given
character.  Another example is how we process a mouse click, which
starts by determining which buffer position (more accurately, which
offset of what object) is displayed at given pixel coordinates.

These places use functions that "simulate" display -- they perform all
the layout calculations, but don't create glyphs (because nothing
needs to be displayed).  Since glyphs are not created, the "line" to
be displayed doesn't exist, and thus the reordering step will have
nothing to work on.  Whoever will work on fixing line-wrapping will
have to figure out how to solve this problem in a way that is
compatible with the 2nd sentence of the UBA's section 3.4.  There are
many complications in this part of the display code, because
oftentimes Emacs ends the display "simulation" before reaching the end
of the line, and sometimes even starts it in the middle of a line.
All this needs to be figured out and implemented when reordering needs
to see a full screen line, and implemented in a way that doesn't hurt
performance in any significant way.

Then there are complications with invisible text: the 'invisible' text
property can start and/or end in the middle of a non-base embedding
level, and the question is how to produce the result that the user
expects, when some of the characters that affect reordering are
effectively hidden from the reordering code, because the invisible
text is simply skipped and never fed to the layout layer.  (With the
current design, reordering is done before the text invisibility is
considered, so the result is quite naturally the expected one.)
Similar problems arise with display properties and overlays which hide
portions of buffer text, optionally replacing them with some other
text or image -- the reordering step will somehow need to avoid
reordering the text of a display string as if it were part of the
surrounding buffer text, because that's not what the user expects.

Another complication is where glyph production and layout decisions
are mixed with bidi level resolution.  One such situation is how we
implement the display property of the form '(space :align-to HPOS)'
which is treated as a paragraph separator for the purposes of bidi
reordering (thus supporting display of tables with bidirectional
text).  If we separate reordering from level resolution, this will
have to be rethought if not reimplemented.

And I'm quite sure there are other complications that I forget.  This
is what took the lion's share of the work on making the display engine
bidi-aware (because the basic reordering engine which is now bidi.c
was written and debugged, as a stand-alone program, 15 years ago).
Whoever will work on fixing the line-wrapping issue will have to do at
least part of that anew.  I surely hope a motivated individual will
step forward for the job at some point, but they need to know what
they will face.





* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-21  8:37                               ` Eli Zaretskii
@ 2017-07-21  9:44                                 ` Itai Berli
  2017-07-21 10:58                                   ` Itai Berli
  2017-07-21 13:01                                   ` Eli Zaretskii
  0 siblings, 2 replies; 34+ messages in thread
From: Itai Berli @ 2017-07-21  9:44 UTC (permalink / raw)
  To: 27525

Thank you.

I just want to make sure I understand. Please correct me if I'm wrong.

1. The bidi logic is entirely contained in the file bidi.c.

2. The display logic is entirely contained in the file xdisp.c.

3. The interface between the two modules is minimal. If I wish to cancel
Emacs' bidi features, all I need to do is comment out a couple lines in
xdisp.c and a user who doesn't use bidi documents will never know the
difference.

4. All the complications you mentioned are limited to code in xdisp.c.


* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-21  9:44                                 ` Itai Berli
@ 2017-07-21 10:58                                   ` Itai Berli
  2017-07-21 13:19                                     ` Eli Zaretskii
  2017-07-21 13:01                                   ` Eli Zaretskii
  1 sibling, 1 reply; 34+ messages in thread
From: Itai Berli @ 2017-07-21 10:58 UTC (permalink / raw)
  To: 27525

I have a suggestion for another approach to tackling this bug. It doesn't
fix the problem, but it offers a better workaround, in my opinion, than the
present one of requiring the user to break lines manually. Is this approach
easier to implement?

Instead of requiring Emacs to wrap lines correctly as they are being typed,
only require it to display correctly wrapped lines once a file is opened,
as well as once the user explicitly invokes a certain function while
editing the file.

* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-21  9:44                                 ` Itai Berli
  2017-07-21 10:58                                   ` Itai Berli
@ 2017-07-21 13:01                                   ` Eli Zaretskii
  1 sibling, 0 replies; 34+ messages in thread
From: Eli Zaretskii @ 2017-07-21 13:01 UTC (permalink / raw)
  To: Itai Berli; +Cc: 27525

> From: Itai Berli <itai.berli@gmail.com>
> Date: Fri, 21 Jul 2017 12:44:40 +0300
> 
> 1. The bidi logic is entirely contained in the file bidi.c.

"Bidi logic" is not well defined.  If you mean the implementation of
the UBA, then yes, it's in bidi.c, with the sole exception of
mirroring of characters due to bidi context, which is in xdisp.c.

> 2. The display logic is entirely contained in the file xdisp.c.

If by "display logic" you mean the layout parts, i.e. the code which
constructs screen lines out of characters and breaks physical lines
into logical lines, then yes.

> 3. The interface between the two modules is minimal. If I wish to cancel Emacs' bidi features, all I need to do
> is comment out a couple lines in xdisp.c and a user who doesn't use bidi documents will never know the
> difference.

Not exactly.  Numerous code snippets all over xdisp.c handle
bidi-related complications, like the fact that buffer position no
longer increases monotonically with screen coordinates.  But if
bidi.c functions are never called, these snippets will most probably
be no-ops.

> 4. All the complications you mentioned are limited to code in xdisp.c

No.  The current implementation includes a couple of functions in
bidi.c that perform on-the-fly reordering of characters into visual
order.  These functions will have to be bypassed, and instead there
should be a new implementation of reordering which works on a laid-out
screen line.  In addition, the data structures shared by bidi.c and
xdisp.c will have to be changed to include the resolved level of each
character and some other information (part of that is already there,
but it will have to be revisited to make sure it doesn't hide some of
the info needed for reordering, because currently the information is
stored post-reordering).





* bug#27525: 25.1; Line wrapping of bidi paragraphs
  2017-07-21 10:58                                   ` Itai Berli
@ 2017-07-21 13:19                                     ` Eli Zaretskii
  0 siblings, 0 replies; 34+ messages in thread
From: Eli Zaretskii @ 2017-07-21 13:19 UTC (permalink / raw)
  To: Itai Berli; +Cc: 27525

> From: Itai Berli <itai.berli@gmail.com>
> Date: Fri, 21 Jul 2017 13:58:57 +0300
> 
> Instead of requiring Emacs to wrap lines correctly as they are being typed, only require it to display correctly
> wrapped lines once a file is opened, as well as once the user explicitly invokes a certain function while editing
> the file.

Thanks, but this proposal is unworkable, for several reasons.

First, reordering of bidirectional text is a display-time feature.
That means text in the buffer is left intact, in its original logical
order, and is reordered only when some portion of that text is being
considered for possible redisplay.  (I wrote "considered for possible
redisplay" on purpose, because many times the display engine will
decide that certain parts of the display have not changed, and
therefore this potential will not be realized for some of the text.)
The reordered characters appear in their visual order only in the data
structures used as part of redisplay; the buffer text retains its
original order.  These data structures are ephemeral and not
accessible from Lisp.

This means that wrapping lines when a file is first visited by Emacs
is meaningless, because most of the file will not be shown on-screen.
Likewise with the user invoking a command: it can only be done when an
incorrect display is already on the screen, and only for the part that
is actually displayed.

But there is a more fundamental reason why this will not be useful.
Almost everything the user does in Emacs causes a redisplay cycle.
Even when the user just moves the cursor, there is a redisplay
immediately after that.  If your cursor blinks (as it does by
default), each blink causes a redisplay cycle.  Each such redisplay
cycle compares the actual screen display with the desired one, and
redraws any parts that are different.  So unless the correct line
wrapping is part of the "normal" display, it will be very short-lived,
since the very next redisplay cycle might replace it with the
"incorrectly" wrapped lines.

All of this is due to the basics of the Emacs design: changes to
buffer text and other related objects made by the Lisp interpreter do
not directly drive the display.  Instead, when the Lisp interpreter
becomes idle (meaning that it is done with changing buffers and other
objects, as instructed by the last command), redisplay is
automatically called, and is responsible for figuring out what, if
anything, needs to be changed on display due to the changes made by
the Lisp interpreter.  It follows that you cannot "fix" redisplay
problems by something you do in Lisp.
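
To make the point concrete, here is a toy Python model of such a
redisplay cycle.  It is only a sketch: render, ToyScreen and the
ten-character row width are invented for the example, and the step
that patches a row by hand is purely hypothetical, since the real
structures are not reachable from Lisp.

```python
def render(text, width=10):
    """Stand-in for the display engine: derive the desired rows from
    the buffer text alone."""
    return [text[i:i + width] for i in range(0, len(text), width)]


class ToyScreen:
    def __init__(self):
        self.current = []  # rows presently "on screen"

    def redisplay(self, text):
        """Compare desired rows with the current ones, redraw the rows
        that differ, and return their indices."""
        desired = render(text)
        redrawn = [i for i, row in enumerate(desired)
                   if i >= len(self.current) or self.current[i] != row]
        self.current = desired
        return redrawn


buffer_text = "abcdefghijklmnopqrstuvwxyz"
screen = ToyScreen()
screen.redisplay(buffer_text)            # first display of the buffer

screen.current[0] = "PATCHED"            # hypothetical one-shot "fix"
redrawn = screen.redisplay(buffer_text)  # next cycle, e.g. a cursor blink

print(redrawn)            # [0]
print(screen.current[0])  # abcdefghij
```

The hand-made change survives only until the next cycle, because the
desired rows are always recomputed from the buffer text by the same
layout code.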





Thread overview: 34+ messages
2017-06-29  7:23 bug#27525: 25.1; Line wrapping of bidi paragraphs Itai Berli
2017-06-29 14:55 ` Eli Zaretskii
2017-06-29 18:35 ` Itai Berli
2017-07-04  9:10 ` Itai Berli
2017-07-04  9:11   ` Itai Berli
2017-07-04  9:19     ` Itai Berli
2017-07-04 14:43       ` Eli Zaretskii
2017-07-04 14:52         ` Itai Berli
2017-07-04 15:19           ` Eli Zaretskii
2017-07-04 23:05       ` Richard Stallman
2017-07-05  2:29         ` Eli Zaretskii
2017-07-05 22:59           ` Richard Stallman
2017-07-06  2:39             ` Eli Zaretskii
2017-07-06 16:01               ` Richard Stallman
2017-07-06 16:17                 ` Eli Zaretskii
2017-07-07 18:23                   ` Richard Stallman
2017-07-07 19:21                     ` Eli Zaretskii
2017-07-09 18:17           ` Benjamin Riefenstahl
2017-07-09 18:30             ` Eli Zaretskii
2017-07-19  8:50               ` Itai Berli
2017-07-19 12:59                 ` Itai Berli
2017-07-19 17:28                   ` Eli Zaretskii
2017-07-19 21:40                     ` Itai Berli
2017-07-20  5:08                       ` Eli Zaretskii
2017-07-20  7:01                         ` Itai Berli
2017-07-20 11:09                           ` Eli Zaretskii
2017-07-21  6:19                             ` Itai Berli
2017-07-21  8:37                               ` Eli Zaretskii
2017-07-21  9:44                                 ` Itai Berli
2017-07-21 10:58                                   ` Itai Berli
2017-07-21 13:19                                     ` Eli Zaretskii
2017-07-21 13:01                                   ` Eli Zaretskii
2017-07-19 17:24                 ` Eli Zaretskii
2017-07-04 14:40   ` Eli Zaretskii
