unofficial mirror of emacs-devel@gnu.org 
* Column numbering in bidirectional display
@ 2010-05-21  9:08 Eli Zaretskii
  2010-05-21  9:30 ` David Kastrup
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Eli Zaretskii @ 2010-05-21  9:08 UTC (permalink / raw
  To: emacs-devel; +Cc: emacs-bidi

With most of the basic features needed for displaying bidirectional text
out of my way (the notable omission so far is reordering display
strings), development now enters the application level, albeit on a
very basic level for now.

One of the major issues on this level is the semantics of the column
numbering.  In the unidirectional case, this is trivial: column
numbers start at zero at the left margin and increase linearly as we
move to the right.

In the bidirectional case, we have two complications.  First, there
are right-to-left (R2L) lines made entirely of R2L characters.  They
are displayed starting at the right margin of the window, like this:

                                      ZYX WVU TSRQ PONMLKJIH GFEDCBA

What should current-column return when point is before A, i.e. at the
first character of the line in the reading order, which is at the
right margin of the window on display?

The other complication is mixed L2R and R2L text.  Example of how we
display a L2R line that includes some R2L characters:

  EDCBA abcde fghij

Here A is the first character of the line in the buffer's logical order.
What should current-column return when point is before A?

A similar example for displaying a R2L line that includes some L2R
text:

                                                 JIHGF EDCBA abcde

and we have the same dilemma regarding the value of current-column
when point is before a.

Currently, current-column (and move-to-column, and other primitives in
indent.c) work in the buffer's logical order, disregarding the reordering
of characters for display.  That is why current-column returns zero
for all the situations I described above.  It also counts columns in
strict logical order.  For example, here are the column numbers for
each character of the last example (numbers that need more than one
digit are written vertically):
                                                 JIHGF EDCBA abcde
                                                 11111111987612345
                                                 76543210

This might surprise at first, and might even look terribly wrong, but
it turns out that users expect that in bidirectional text.  At least
MS Word behaves _exactly_ like this, AFAICS.
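
As a rough cross-check of the numbering in the diagram above, here is a
toy model in Python.  It is not the code in bidi.c or indent.c, and the
names levels and reorder are made up for illustration.  It assumes, as
in the examples, that uppercase letters stand for R2L characters and
lowercase for L2R, that every character is one column wide, and that a
space takes the direction of its neighbours when they agree and the
paragraph direction otherwise.  It reports zero-based columns, as
current-column does; the digits in the diagram run from 1 to 17, i.e.
one higher.

```python
def levels(text, rtl_paragraph):
    """Toy embedding levels: uppercase = R2L, lowercase = L2R, the rest neutral."""
    base = 1 if rtl_paragraph else 0

    def strong(ch):
        if ch.isupper():
            return "R"
        if ch.islower():
            return "L"
        return None

    def level_of(direction):
        if direction == "R":
            return 1
        # L2R text nests one level deeper inside a R2L paragraph.
        return 2 if rtl_paragraph else 0

    result = []
    for i, ch in enumerate(text):
        d = strong(ch)
        if d is None:
            # Neutral: look at the nearest strong character on each side.
            before = next((strong(c) for c in reversed(text[:i]) if strong(c)), None)
            after = next((strong(c) for c in text[i + 1:] if strong(c)), None)
            d = before if before == after else None
        result.append(base if d is None else level_of(d))
    return result


def reorder(text, rtl_paragraph):
    """Return the logical indices of TEXT in left-to-right screen order."""
    lv = levels(text, rtl_paragraph)
    order = list(range(len(text)))
    # From the highest level down to 1, reverse each run at that level or higher.
    for level in range(max(lv, default=0), 0, -1):
        i = 0
        while i < len(order):
            if lv[order[i]] >= level:
                j = i
                while j < len(order) and lv[order[j]] >= level:
                    j += 1
                order[i:j] = reversed(order[i:j])
                i = j
            else:
                i += 1
    return order


logical = "abcde ABCDE FGHIJ"              # buffer order of the last example
order = reorder(logical, rtl_paragraph=True)
visual = "".join(logical[i] for i in order)
# With one-column characters, the logical column equals the logical index.
print(visual)
print(" ".join(str(col) for col in order))
```

Under these assumptions the first line printed is JIHGF EDCBA abcde and
the second is 16 15 14 13 12 11 10 9 8 7 6 5 0 1 2 3 4, so point before
a is at column zero.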

Moreover, this makes a surprising number of basic Emacs features work
correctly even though the underlying Lisp code is entirely oblivious
to bidi reordering.  One example is Dired, when file names include R2L
characters: I was pleasantly surprised to see that it puts the cursor
on the correct place within the file name.  Another example is the
various features that manipulate indentation.

If we decide that columns should be numbered in their screen order,
from left to right, then we will need to:

  . Rewrite primitives in indent.c to be bidi-aware, i.e. advance by
    calling functions from bidi.c rather than just incrementing
    character positions.  This would complicate the parts that move
    backwards, because there's no code in bidi.c that can do that, and
    it's not trivial to write such code.

  . Fix all the Lisp code that uses these primitives to not assume
    that column zero is necessarily the first character of the line
    that follows a newline.
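
To make the second point concrete: under screen-order numbering, the
character that follows the newline need not sit at column zero.  A
small Python sketch, assuming one-column characters and taking the
screen order of the mixed L2R example as given rather than computing
it; visual_column is an illustrative name, not an Emacs primitive.

```python
# Buffer order of the L2R example that is displayed as "EDCBA abcde fghij".
logical = "ABCDE abcde fghij"

# Logical index of the character shown at each screen position, left to right.
screen_order = [4, 3, 2, 1, 0] + list(range(5, len(logical)))


def visual_column(logical_index):
    """Screen-order column, counted from the left, of a logical position."""
    return screen_order.index(logical_index)


first_char_column = visual_column(0)          # column of A, right after the newline
char_at_column_zero = logical[screen_order[0]]
print(first_char_column, char_at_column_zero)
```

Here A, the first character of the line, would be at column 4, and
column zero would belong to E, so Lisp code that treats column zero as
the line start would land in the wrong place.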

Admittedly, there are some features which need to be fixed even if we
keep the current semantics of column numbering.  C-e (just fixed 2
days ago) is one example.  But I think the number of such features is
much smaller than if we number columns in visual screen order.

So on balance, I think we should keep the current semantics of the
column numbering, whereby columns are numbered in strict logical order.

If we decide to go that way, we will need to provide primitives or
subroutines to get to the visually first and last characters of a
visual line.  That's because some features need that; see the thread
"Re: Hl-line and visual-line" for one example.  beginning-of-visual-line
and end-of-visual-line sound like a good starting point.

Comments are welcome.


* Re: Column numbering in bidirectional display
  2010-05-21  9:08 Column numbering in bidirectional display Eli Zaretskii
@ 2010-05-21  9:30 ` David Kastrup
  2010-05-21 11:17   ` Eli Zaretskii
  2010-05-21 13:20 ` Yair F
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: David Kastrup @ 2010-05-21  9:30 UTC (permalink / raw
  To: emacs-bidi; +Cc: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> This might surprise at first, and might even look terribly wrong, but
> it turns out that users expect that in bidirectional text.  At least
> MS Word behaves _exactly_ like this, AFAICS.
>
> Moreover, this makes a surprising number of basic Emacs features work
> correctly even though the underlying Lisp code is entirely oblivious
> to bidi reordering.  One example is Dired, when file names include R2L
> characters: I was pleasantly surprised to see that it puts the cursor
> on the correct place within the file name.  Another example is the
> various features that manipulate indentation.
>
> If we decide that columns should be numbered in their screen order,
> from left to right, then we will need:
>
>   . Rewrite primitives in indent.c to be bidi-aware, i.e. advance by
>     calling functions from bidi.c rather than just incrementing
>     character positions.  This would complicate the parts that move
>     backwards, because there's no code in bidi.c that can do that, and
>     it's not trivial to write such code.
>
>   . Fix all the Lisp code that uses these primitives to not assume
>     that column zero is necessarily the first character of the line
>     that follows a newline.

It is my opinion that bidi reordering should be kept strictly a display
feature.

The function move-to-column is sort of a hybrid ("The column of a
character is calculated by adding together the widths as displayed of
the previous characters in the line.")  It is obvious that the quoted
sentence from the description raises more questions than it answers:
what are "previous characters" in this context?  Does the "width as
displayed" count positively?

Should the relation
(eq (<= (progn (move-to-column x) (point))
        (progn (move-to-column y) (point)))
    (<= x y))
be preserved?  That would make table formatting complex.
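
One way to see what is at stake with that relation, sketched in Python
rather than Lisp: with logical-order columns, moving to a larger column
never lands on an earlier buffer position, while with screen-order
columns it can.  The sketch assumes one-column characters and the mixed
L2R line from Eli's message; point_visual models a hypothetical
screen-order move-to-column, which is not existing Emacs behaviour.

```python
logical = "ABCDE abcde fghij"          # uppercase = R2L, as in Eli's examples

# Logical index of the character shown at each screen position, left to right.
screen_order = [4, 3, 2, 1, 0] + list(range(5, len(logical)))


def point_logical(column):
    """Buffer offset reached by a logical-order move-to-column."""
    return column


def point_visual(column):
    """Buffer offset reached by a hypothetical screen-order move-to-column."""
    return screen_order[column]


def preserves_order(point_at):
    """True if (point_at(x) <= point_at(y)) always agrees with (x <= y)."""
    cols = range(len(logical))
    return all((point_at(x) <= point_at(y)) == (x <= y)
               for x in cols for y in cols)


print(preserves_order(point_logical), preserves_order(point_visual))
```

In this model the relation holds for logical-order columns and fails
for screen-order ones, because columns 0 and 1 map to buffer offsets 4
and 3.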

A command like vertical-motion acts on a display text presentation
rather than a logical representation: it would heed bidi (where
applicable).

Programmatically, text manipulation should keep as far away from those
display-oriented functions as possible (except where indeed the display
representation should be manipulated).  And all basic text manipulation
should stay

-- 
David Kastrup


* Re: Re: Column numbering in bidirectional display
  2010-05-21  9:30 ` David Kastrup
@ 2010-05-21 11:17   ` Eli Zaretskii
  0 siblings, 0 replies; 9+ messages in thread
From: Eli Zaretskii @ 2010-05-21 11:17 UTC (permalink / raw
  To: David Kastrup; +Cc: emacs-bidi, emacs-devel

> From: David Kastrup <dak@gnu.org>
> Date: Fri, 21 May 2010 11:30:12 +0200
> Cc: emacs-devel@gnu.org
> 
> It is my opinion that bidi reordering should be kept strictly a display
> feature.

Just so I'm sure I understand what you are saying: do you agree that
current-column should return a logical-order column number, as it does
today?

> A command like vertical-motion acts on a display text presentation
> rather than a logical representation: it would heed bidi (where
> applicable).

This already works, as long as all paragraphs have the same direction,
either L2R or R2L.  The cursor is placed on characters whose visual
distance from the window margin is the same (as far as the line's
length allows that).  That's because the display engine internally
keeps the correct horizontal position of each glyph, after reordering,
and the various routines that move in ``display line'' use bidi
iteration.

I will probably need to fix this for when paragraph direction changes;
currently, Emacs puts the cursor at the same distance from the other
edge of the window, which is not terribly wrong, but I think users
will not expect that.  However, note that if this is fixed, the value
of current-column will change when point moves from a L2R paragraph to
a R2L one or vice versa.
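
A small arithmetic sketch of that last point, in Python.  It assumes a
window 40 columns wide, one-column characters, lines long enough to
reach the target position, a L2R line made only of L2R characters, and
a R2L line made only of R2L characters; the function names are
illustrative only.

```python
WINDOW_WIDTH = 40   # assumed window width in columns


def screen_x(column, rtl_paragraph):
    """Screen position, counted from the left edge, of a logical column."""
    return WINDOW_WIDTH - 1 - column if rtl_paragraph else column


def column_at(x, rtl_paragraph):
    """Logical column of the character shown at screen position x."""
    return WINDOW_WIDTH - 1 - x if rtl_paragraph else x


start = 5                                    # current-column on the L2R line

# Same distance from the paragraph's own starting edge: the column is unchanged.
same_edge_distance = start

# Same screen position after moving to the R2L line: the column changes.
same_screen_x = column_at(screen_x(start, False), True)

print(same_edge_distance, same_screen_x)
```

With these numbers, keeping the distance from each paragraph's own edge
leaves the column at 5, while keeping the screen position gives column
34 on the R2L line.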

> Programmatically, text manipulation should keep as far away from those
> display-oriented functions as possible (except where indeed the display
> representation should be manipulated).  And all basic text manipulation
> should stay

Hmm, looks unfinished.

Anyway, thanks for the feedback.


* Re: Column numbering in bidirectional display
  2010-05-21  9:08 Column numbering in bidirectional display Eli Zaretskii
  2010-05-21  9:30 ` David Kastrup
@ 2010-05-21 13:20 ` Yair F
  2010-05-21 14:07   ` David Kastrup
  2010-05-22  0:34 ` Stefan Monnier
  2010-06-30  4:30 ` "Martin J. Dürst"
  3 siblings, 1 reply; 9+ messages in thread
From: Yair F @ 2010-05-21 13:20 UTC (permalink / raw
  To: Eli Zaretskii; +Cc: emacs-bidi, emacs-devel

I would stick to logical ordering and have previous/next-line move
accordingly. In the long term there is no escape - think of region
copying and yanking.
Keeping to a strict logical model prevents surprises in the long term -
see Mozilla for a confusing mixed logical/visual implementation.




* Re: Column numbering in bidirectional display
  2010-05-21 13:20 ` Yair F
@ 2010-05-21 14:07   ` David Kastrup
  0 siblings, 0 replies; 9+ messages in thread
From: David Kastrup @ 2010-05-21 14:07 UTC (permalink / raw
  To: emacs-bidi; +Cc: emacs-devel

Yair F <yair.f.lists@gmail.com> writes:

> I would stick to logical ordering and have previous/next-line move
> accordingly. In the long term there is no escape - think of region
> copying and yanking.

That's nothing compared to the headaches for rectangular regions...

Overwrite-mode could also be fun when replacing R-L text (or R-L marks)
with L-R text.

-- 
David Kastrup


* Re: Column numbering in bidirectional display
  2010-05-21  9:08 Column numbering in bidirectional display Eli Zaretskii
  2010-05-21  9:30 ` David Kastrup
  2010-05-21 13:20 ` Yair F
@ 2010-05-22  0:34 ` Stefan Monnier
  2010-06-30  4:30 ` "Martin J. Dürst"
  3 siblings, 0 replies; 9+ messages in thread
From: Stefan Monnier @ 2010-05-22  0:34 UTC (permalink / raw
  To: Eli Zaretskii; +Cc: emacs-bidi, emacs-devel

[...snip...]
> So on balance, I think we should keep the current semantics of the
> column numbering, whereby columns are numbered in strict logical order.

It does sound "too sweet to be true", but if you say it's so, I'm all
too happy to believe you.

I do think we'll need to provide better "visual movement" commands, but
these should really focus on being commands, like the current
line-move stuff.

It does point to a relevant detail: we had intended to (and someone
started working on) using the display iterator to (re)implement
current-column so as to take things like proportional fonts (and
variable font sizes) into account and return pixel-precise info in the
form of floating-point column numbers.  In light of your message, it
seems that maybe this kind of functionality should not replace the
current code.
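
For concreteness, the pixel-based idea could look roughly like the
following Python sketch.  It uses made-up glyph widths rather than the
display iterator, and it reports the column as the pixel offset divided
by the width of a default character, which is only one possible reading
of "floating-point column numbers".

```python
DEFAULT_CHAR_WIDTH = 8.0     # assumed pixel width of a default-font character


def pixel_column(glyph_widths, index):
    """Fractional column of the character at INDEX, from summed pixel widths."""
    return sum(glyph_widths[:index]) / DEFAULT_CHAR_WIDTH


widths = [8, 4, 12, 8, 6]    # made-up pixel widths for five glyphs
columns = [pixel_column(widths, i) for i in range(len(widths) + 1)]
print(columns)
```

With these widths the columns come out as 0.0, 1.0, 1.5, 3.0, 4.0 and
4.75, so the result is no longer an integer count of characters, which
is why it fits poorly as a drop-in replacement for current-column.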


        Stefan




* Re: Column numbering in bidirectional display
  2010-05-21  9:08 Column numbering in bidirectional display Eli Zaretskii
                   ` (2 preceding siblings ...)
  2010-05-22  0:34 ` Stefan Monnier
@ 2010-06-30  4:30 ` "Martin J. Dürst"
  2010-06-30 17:22   ` Eli Zaretskii
  3 siblings, 1 reply; 9+ messages in thread
From: "Martin J. Dürst" @ 2010-06-30  4:30 UTC (permalink / raw
  To: Eli Zaretskii; +Cc: emacs-bidi, emacs-devel

Hello Eli,

On 2010/05/21 18:08, Eli Zaretskii wrote:

> So on balance, I think we should keep the current semantics of the
> column numbering, whereby columns are numbered in strict logical order.
>
> If we decide to go that way, we will need to provide primitives or
> subroutines to get to the visually first and last characters of a
> visual line.  That's because some features need that; see the thread
> Re: Hl-line and visual-line for one example.  beginning-of-visual-line
> and end-of-visual-line sound like a good starting point.
>
> Comments are welcome.

I agree with keeping column numbering in logical order, as it is
currently. I think that at least in the long run, you will need more
visual-related functions than just beginning-of-visual-line and
end-of-visual-line.

Regards,   Martin.

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp


* Re: Column numbering in bidirectional display
  2010-06-30  4:30 ` "Martin J. Dürst"
@ 2010-06-30 17:22   ` Eli Zaretskii
  2010-07-01  1:36     ` "Martin J. Dürst"
  0 siblings, 1 reply; 9+ messages in thread
From: Eli Zaretskii @ 2010-06-30 17:22 UTC (permalink / raw
  To: "Martin J. Dürst"; +Cc: emacs-bidi, emacs-devel

> Date: Wed, 30 Jun 2010 13:30:33 +0900
> From: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
> CC: emacs-devel@gnu.org, emacs-bidi@gnu.org
> 
> I think that at least in the long run, you will need more
> visual-related functions than just beginning-of-visual-line and
> end-of-visual-line.

Like what, for example?


* Re: Column numbering in bidirectional display
  2010-06-30 17:22   ` Eli Zaretskii
@ 2010-07-01  1:36     ` "Martin J. Dürst"
  0 siblings, 0 replies; 9+ messages in thread
From: "Martin J. Dürst" @ 2010-07-01  1:36 UTC (permalink / raw
  To: Eli Zaretskii; +Cc: emacs-bidi, emacs-devel



On 2010/07/01 2:22, Eli Zaretskii wrote:
>> Date: Wed, 30 Jun 2010 13:30:33 +0900
>> From: "Martin J. Dürst"<duerst@it.aoyama.ac.jp>
>> CC: emacs-devel@gnu.org, emacs-bidi@gnu.org
>>
>> I think that at least in the long run, you will need more
>> visual-related functions than just beginning-of-visual-line and
>> end-of-visual-line.
>
> Like what, for example?

For example functions that return the character at a given visual 
position from the left or the right, or the logical position of that 
character,...
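
A rough Python sketch of what such functions might return, assuming
one-column characters and taking the screen order of the R2L example
from the first message as given.  The names logical_position_at and
char_at are hypothetical; no such Emacs functions are implied.

```python
logical = "abcde ABCDE FGHIJ"    # buffer order; displayed as "JIHGF EDCBA abcde"

# Logical index of the character shown at each screen position, left to right.
screen_order = list(range(16, 4, -1)) + [0, 1, 2, 3, 4]


def logical_position_at(n, from_right=False):
    """Logical index of the character at visual position N (zero-based)."""
    return screen_order[-1 - n] if from_right else screen_order[n]


def char_at(n, from_right=False):
    """Character at visual position N, counted from the left or the right."""
    return logical[logical_position_at(n, from_right)]


print(char_at(0), char_at(0, from_right=True), logical_position_at(10))
```

In this model the leftmost character is J, the rightmost is e, and the
character at visual position 10 from the left is A, at logical index 6.
Positions 0 from the left and 0 from the right also give the visually
first and last characters of the line.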

Regards,    Martin.


-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp

