unofficial mirror of emacs-devel@gnu.org 
* BIDI, LaTeX (auctex) and the «evil» backslash
@ 2016-05-21 13:55 Uwe Brauer
  2016-05-21 17:34 ` Eli Zaretskii
  0 siblings, 1 reply; 6+ messages in thread
From: Uwe Brauer @ 2016-05-21 13:55 UTC (permalink / raw)
  To: emacs-devel; +Cc: auctex-devel

Hello

I was going to continue my message about displaying LaTeX environments,
such as equations, in a BIDI file, until I realized that the basic
problem is the LaTeX backslash.

In my understanding, UTF distinguishes between

    -  LTR chars such as a,b,c

    -  RTL chars such as א,ב,ג

    -  «neutral» chars such as (),\ etc.

The problem is that \ is part of a LaTeX command. So a LaTeX file
with Hebrew text is displayed incorrectly, as the screenshot below
(bidi-pargraph-nil, taken with bidi-paragraph-direction set to nil)
shows. For example, I see

begin{equation}\
instead of
\begin{equation}

Worse
[\ x^2=2x-1 \]

Instead of
\[ 2x-1=x^2 \]

Now there are at least four solutions to this problem:


    -  set bidi-paragraph-direction to left-to-right (shown in the
       next screenshot). The display is correct; however, typing
       Hebrew when bidi-paragraph-direction is set to left-to-right is
       as unpleasant as writing English with it set to right-to-left.

    -  use LRM chars before the backslash (see the last screenshot,
       taken with `glyphless-char-display-control' set to `acronym').
       This looks fine too, but adding these chars is cumbersome.

    -  hack auctex (CC to the auctex list): a new variable is
       introduced, say bidi-support, which is nil by default, but if it
       is t, then LRM chars are inserted before a backslash. I am
       pretty sure the auctex team will not like this idea very much.

    -  hack emacs: in a LaTeX buffer, backslash is treated as LTR. I
       don't know whether this can be done on the Lisp level or whether
       it can be done at all.
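
The second and third items could be approximated without touching
AUCTeX itself. A minimal sketch, assuming a command of one's own is
acceptable; the name `my-insert-lrm-backslash' is made up for
illustration and is not part of Emacs or AUCTeX:

```elisp
;; Hypothetical command: insert LEFT-TO-RIGHT MARK (U+200E) followed by
;; a backslash, so the backslash stays attached to the LTR command name
;; that follows it.
(defun my-insert-lrm-backslash ()
  "Insert an LRM character (U+200E) and then a backslash at point."
  (interactive)
  ;; `insert' accepts characters as well as strings.
  (insert #x200E ?\\))
```

The command can be run with M-x or bound to a key of one's choice in
LaTeX buffers. The LRM characters end up in the file, so whether this
is acceptable depends on the TeX engine used.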

Comments?

regards


Uwe Brauer




[-- Attachment #2: bidi-pargraph-left-to-right --]
[-- Type: application/octet-stream, Size: 74159 bytes --]

[-- Attachment #3: bidi-pargraph-left-to-right-LRM --]
[-- Type: application/octet-stream, Size: 77816 bytes --]

[-- Attachment #4: bidi-pargraph-nil --]
[-- Type: application/octet-stream, Size: 72472 bytes --]


* Re: BIDI, LaTeX (auctex) and the «evil» backslash
  2016-05-21 13:55 BIDI, LaTeX (auctex) and the «evil» backslash Uwe Brauer
@ 2016-05-21 17:34 ` Eli Zaretskii
  2016-05-21 17:37   ` Eli Zaretskii
  2016-05-21 17:44   ` Uwe Brauer
  0 siblings, 2 replies; 6+ messages in thread
From: Eli Zaretskii @ 2016-05-21 17:34 UTC (permalink / raw)
  To: Uwe Brauer; +Cc: emacs-devel

> From: Uwe Brauer <oub@mat.ucm.es>
> Date: Sat, 21 May 2016 13:55:18 +0000
> Cc: auctex-devel <auctex-devel@gnu.org>
> 
> In my understanding, UTF distinguishes between

What is "UTF" in this context?

>     -  LTR chars such as a,b,c
> 
>     -  RTL chars such as א,ב,ג
> 
>     -  «neutral» chars such as (),\ etc.

You should read the description of UBA, the Unicode Bidirectional
Algorithm (which Emacs implements).  There you will see that there are
actually 4 classes of characters:

 . string (LTR and RTL)
 . weak (numbers, number separators, diacriticals)
 . neutral (punctuation and whitespace)
 . formatting control characters (RLM etc.)

So:

  (get-char-code-property ?\\ 'bidi-class) => ON

("ON" stands for "other neutral", see the node "Character Properties"
in the ELisp manual.)
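
A minimal sketch for inspecting these classes interactively, e.g. in
*scratch*, assuming a stock Emacs with its Unicode property tables
available. The helper name `my-bidi-class-report' is hypothetical:

```elisp
;; Hypothetical helper: look up the UBA bidi class of several characters.
(defun my-bidi-class-report (chars)
  "Return an alist mapping each character in CHARS to its bidi class."
  (mapcar (lambda (ch)
            ;; `get-char-code-property' returns a symbol such as ON.
            (cons (string ch)
                  (get-char-code-property ch 'bidi-class)))
          chars))

;; Compare a Latin letter, HEBREW LETTER ALEF (U+05D0), a digit, the
;; backslash, a parenthesis and LEFT-TO-RIGHT MARK (U+200E).
(my-bidi-class-report (list ?a #x5D0 ?1 ?\\ ?\( #x200E))
```

The entry for the backslash should show ON, as above; the other
entries show which class each kind of character falls into.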

>     -  set bidi-paragraph-direction to left-to-right (shown in the
>        next screenshot). The display is correct; however, typing
>        Hebrew when bidi-paragraph-direction is set to left-to-right is
>        as unpleasant as writing English with it set to right-to-left.
> 
>     -  use LRM chars before the backslash (see the last screenshot,
>        taken with `glyphless-char-display-control' set to `acronym').
>        This looks fine too, but adding these chars is cumbersome.
> 
>     -  hack auctex (CC to the auctex list): a new variable is
>        introduced, say bidi-support, which is nil by default, but if it
>        is t, then LRM chars are inserted before a backslash. I am
>        pretty sure the auctex team will not like this idea very much.
> 
>     -  hack emacs: in a LaTeX buffer, backslash is treated as LTR. I
>        don't know whether this can be done on the Lisp level or whether
>        it can be done at all.
> 
> Comments?

The last one is possible, of course (this is Emacs), but that way lies
madness: arbitrarily changing bidirectional properties of characters
will bite you elsewhere, because the corresponding tables are global.

The other 3 alternatives are indeed the available solutions.
Personally, I recommend the 1st one; I see no problem with typing RTL
text in a left-to-right paragraph (and vice versa), and don't
understand what unpleasant things you bump into when doing that.  TeX
files are fundamentally left-to-right, as any program text, so that
would be my suggestion.
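
A minimal sketch of the first alternative as an init-file fragment,
assuming AUCTeX is in use. The function name
`my-latex-force-ltr-paragraphs' is made up for illustration:

```elisp
;; Force left-to-right paragraph direction in LaTeX buffers only.
;; `bidi-paragraph-direction' becomes buffer-local when set, so other
;; buffers keep the default (nil, i.e. detect direction per paragraph).
(defun my-latex-force-ltr-paragraphs ()
  "Display every paragraph of the current buffer left to right."
  (setq bidi-paragraph-direction 'left-to-right))

;; `LaTeX-mode-hook' is AUCTeX's hook; the built-in tex-mode.el uses
;; `latex-mode-hook' instead.
(add-hook 'LaTeX-mode-hook #'my-latex-force-ltr-paragraphs)
```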





* Re: BIDI, LaTeX (auctex) and the «evil» backslash
  2016-05-21 17:34 ` Eli Zaretskii
@ 2016-05-21 17:37   ` Eli Zaretskii
  2016-05-21 17:44   ` Uwe Brauer
  1 sibling, 0 replies; 6+ messages in thread
From: Eli Zaretskii @ 2016-05-21 17:37 UTC (permalink / raw)
  To: oub; +Cc: emacs-devel

> You should read the description of UBA, the Unicode Bidirectional
> Algorithm (which Emacs implements).  There you will see that there are
> actually 4 classes of characters:
> 
>  . string (LTR and RTL)
     ^^^^^^
Should be "strong", obviously.  Sorry.




* Re: BIDI, LaTeX (auctex) and the «evil» backslash
  2016-05-21 17:34 ` Eli Zaretskii
  2016-05-21 17:37   ` Eli Zaretskii
@ 2016-05-21 17:44   ` Uwe Brauer
  2016-05-21 18:34     ` Eli Zaretskii
  1 sibling, 1 reply; 6+ messages in thread
From: Uwe Brauer @ 2016-05-21 17:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Uwe Brauer, emacs-devel

>>> "Eli" == Eli Zaretskii <eliz@gnu.org> writes:

   >> From: Uwe Brauer <oub@mat.ucm.es>
   >> Date: Sat, 21 May 2016 13:55:18 +0000
   >> Cc: auctex-devel <auctex-devel@gnu.org>
   >> 
   >> In my understanding, UTF distinguishes between

   > What is "UTF" in this context?

My bad, I meant Unicode.

   >> -  LTR chars such as a,b,c
   >> 
   >> -  RTL chars such as א,ב,ג
   >> 
   >> -  «neutral» chars such as (),\ etc.

   > You should read the description of UBA, the Unicode Bidirectional
   > Algorithm (which Emacs implements).  There you will see that there are
   > actually 4 classes of characters:

   >  . string (LTR and RTL)
   >  . weak (numbers, number separators, diacriticals)
   >  . neutral (punctuation and whitespace)
   >  . formatting control characters (RLM etc.)

   > So:

   >   (get-char-code-property ?\\ 'bidi-class) => ON

   > ("ON" stands for "other neutral", see the node "Character Properties"
   > in the ELisp manual.)

   >> -  set bidi-paragraph-direction to left-to-right (shown in the
   >> next screenshot). The display is correct; however, typing
   >> Hebrew when bidi-paragraph-direction is set to left-to-right is
   >> as unpleasant as writing English with it set to right-to-left.
   >> 
   >> -  use LRM chars before the backslash (see the last screenshot,
   >> taken with `glyphless-char-display-control' set to `acronym').
   >> This looks fine too, but adding these chars is cumbersome.
   >> 
   >> -  hack auctex (CC to the auctex list): a new variable is
   >> introduced, say bidi-support, which is nil by default, but if it
   >> is t, then LRM chars are inserted before a backslash. I am
   >> pretty sure the auctex team will not like this idea very much.
   >> 
   >> -  hack emacs: in a LaTeX buffer, backslash is treated as LTR. I
   >> don't know whether this can be done on the Lisp level or whether
   >> it can be done at all.
   >> 
   >> Comments?

   > The last one is possible, of course (this is Emacs), but that way lies
   > madness: arbitrarily changing bidirectional properties of characters
   > will bite you elsewhere, because the corresponding tables are global.

So that cannot be restricted just to LaTeX major modes?

In any case, how could I change the backslash on the Lisp level, i.e.
change

(get-char-code-property ?\\ 'bidi-class)

from ON to whatever is necessary?

I would just like to check and see what happens.

   > The other 3 alternatives are indeed the available solutions.
   > Personally, I recommend the 1st one; I see no problem with typing
   > RTL text in a left-to-right paragraph (and vice versa), and don't
   > understand what unpleasant things you bump into when doing that.
   > TeX files are fundamentally left-to-right, as any program text, so
   > that would be my suggestion.

Well, I don't like the cursor movements in such a situation, and that is
why for the moment I use the LRM chars, which don't cause any
problems, at least not with Unicode and XeLaTeX.

Again, thanks for all your efforts in providing BIDI support.




* Re: BIDI, LaTeX (auctex) and the «evil» backslash
  2016-05-21 17:44   ` Uwe Brauer
@ 2016-05-21 18:34     ` Eli Zaretskii
  2016-05-21 20:20       ` Uwe Brauer
  0 siblings, 1 reply; 6+ messages in thread
From: Eli Zaretskii @ 2016-05-21 18:34 UTC (permalink / raw)
  To: Uwe Brauer; +Cc: oub, emacs-devel

> From: Uwe Brauer <oub@mat.ucm.es>
> Cc: Uwe Brauer <oub@mat.ucm.es>, emacs-devel@gnu.org
> Date: Sat, 21 May 2016 17:44:32 +0000
> 
>    > The last one is possible, of course (this is Emacs), but that way lies
>    > madness: arbitrarily changing bidirectional properties of characters
>    > will bite you elsewhere, because the corresponding tables are global.
> 
> So that cannot be restricted just to LaTeX major modes?

No, the tables are global (they are quite large).

> In any case, how could I change the backslash on the Lisp level, i.e.
> change
> 
> (get-char-code-property ?\\ 'bidi-class)
> 
> from ON to whatever is necessary?

You want put-char-code-property, I think.  (Never tried this myself.)

>    > The other 3 alternatives are indeed the available solutions.
>    > Personally, I recommend the 1st one; I see no problem with typing
>    > RTL text in a left-to-right paragraph (and vice versa), and don't
>    > understand what unpleasant things you bump into when doing that.
>    > TeX files are fundamentally left-to-right, as any program text, so
>    > that would be my suggestion.
> 
> Well I don't like the cursor movements in such a situation

Did you try setting visual-order-cursor-movement non-nil?  Maybe
that's all you need to solve your problems?
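
For reference, a minimal init-file fragment that turns this option on.
It is a global user option, so the setting applies to all buffers:

```elisp
;; Make left/right arrow movement follow the visual order on screen
;; rather than the logical order of characters in the buffer.
(setq visual-order-cursor-movement t)
```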




* Re: BIDI, LaTeX (auctex) and the «evil» backslash
  2016-05-21 18:34     ` Eli Zaretskii
@ 2016-05-21 20:20       ` Uwe Brauer
  0 siblings, 0 replies; 6+ messages in thread
From: Uwe Brauer @ 2016-05-21 20:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Uwe Brauer, emacs-devel



   > No, the tables are global (they are quite large).


   > You want put-char-code-property, I think.  (Never tried this myself.)

I played around with that a bit:


(put-char-code-property ?\\ 'bidi-class 'L)

(I am not sure whether to use L or LRO.)

That works as expected; that is, in a LaTeX buffer the backslash now
behaves as I expect it to.

Hm, I could write a small hack to switch back and forth between 'L and
'ON...
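
Such a hack might look roughly like the following. This is only a
sketch: the command name `my-toggle-backslash-bidi-class' is made up,
and since the property tables are global, the change affects every
buffer, not just LaTeX ones:

```elisp
;; Hypothetical command: flip the bidi class of the backslash between
;; L (strong left-to-right) and ON (other neutral, the Unicode default).
(defun my-toggle-backslash-bidi-class ()
  "Toggle the bidi class of the backslash between L and ON.
The change is global because the character property tables are global."
  (interactive)
  (let* ((current (get-char-code-property ?\\ 'bidi-class))
         (new (if (eq current 'L) 'ON 'L)))
    (put-char-code-property ?\\ 'bidi-class new)
    ;; Ask for a full redisplay so visible windows pick up the change.
    (redraw-display)
    (message "Backslash bidi-class is now %s" new)))
```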



   > Did you try setting visual-order-cursor-movement non-nil?  Maybe
   > that's all you need to solve your problems?

I have always had this set to t[1]. But my problem with writing Hebrew
when bidi-paragraph-direction is set to left-to-right is:

    -  first, the cursor stays fixed and spits out the Hebrew chars,
       which I find counterintuitive

    -  worse: beginning-of-line and end-of-line are confusing in this
       setting


Anyhow, thanks for the hint
with (put-char-code-property ?\\ 'bidi-class 'L)



Footnotes: 
[1]  In fact, if memory serves me well, it was I who nagged so much that
     you finally implemented visual-order-cursor-movement :-D



