unofficial mirror of emacs-devel@gnu.org 
From: "Ehud Karni" <ehud@unix.mvs.co.il>
To: eliz@gnu.org
Cc: emacs-bidi@gnu.org, emacs-devel@gnu.org
Subject: Re: Bidirectional editing in Emacs -- main design decisions
Date: Sat, 10 Oct 2009 16:57:59 +0200
Message-ID: <200910101457.n9AEvxrW000735@beta.mvs.co.il>
In-Reply-To: <83bpkgl113.fsf@gnu.org> (message from Eli Zaretskii on Fri, 09 Oct 2009 23:18:00 +0200)

On Fri, 09 Oct 2009 23:18:00 Eli Zaretskii wrote:
>
> Here's what I can tell about the subject (bidi display) at this point

In general I agree with your decisions.

> 1. Text storage
>
>    Bidirectional text in Emacs buffers and strings is stored in strict
>    logical order (a.k.a. "reading order").  This is how most (if not
>    all) other implementations handle bidirectional text.  The
>    advantage of this is that file and process I/O is trivial, as well
>    as text search.  [snip]

The search has many problems but this should not influence your bidi
reordering. The changes to various search functions can be done later.

The user ALWAYS searches for the visual text s/he sees (s/he never knows
the logical order unless s/he visits the file literally).

The problems have several causes:
  1. Different logical inputs, even without formatting characters, can
     result in the same visual output.
     e.g. with logical Hebrew text plus a number in LTR reading order,
     the number may come before or after the Hebrew text, but in the
     visual output the number will always be after (to the left of)
     the text.
     Logical "123 HEBREW 456" appears as "123 456 WERBEH".
  2. Formatting characters are not seen and should not be searched.
  3. The visual appearance of the searched string may be different from
     what it will match.  e.g. a search for logical "HEBREW 3." in
     RTL reading order will appear as ".3 WERBEH" but will also match
     something like logical "HEBREW 3.14159", whose visual appearance
     is "3.14159 WERBEH". This may be what the user wants, but it may
     also disturb her because she really wants to find only (visual)
     ".3 WERBEH".
     There is also a technical question: how will Emacs show a found
     string that is not visually contiguous, as in the "3.14159 WERBEH"
     example above?
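
Problems 1 and 3 can be reproduced with a toy model. The Python sketch
below is not Emacs code and is far from a full implementation of the
Unicode Bidirectional Algorithm. It assumes the notation used in this
message (uppercase stands for RTL letters, lowercase for LTR letters)
and handles only letters, digits, spaces and '.' or ',' between digits.
The names `resolve_levels' and `visual' are made up for illustration.

```python
def resolve_levels(text, base_rtl=False):
    """Assign a bidi embedding level to each character (toy subset of UAX#9)."""
    def classify(ch):
        if ch.isdigit():
            return "EN"          # European number
        if ch.isupper():
            return "R"           # assumption: uppercase stands for RTL letters
        if ch.islower():
            return "L"
        if ch in ".,":
            return "CS"          # number separator
        return "ON"              # spaces and everything else: neutral

    types = [classify(ch) for ch in text]
    n = len(types)
    base = "R" if base_rtl else "L"

    # A separator between two digits becomes part of the number (rule W4).
    for i in range(1, n - 1):
        if types[i] == "CS" and types[i - 1] == "EN" and types[i + 1] == "EN":
            types[i] = "EN"
    # Any other separator is treated as a neutral (rule W6).
    types = ["ON" if t == "CS" else t for t in types]

    # A number whose nearest preceding strong character is L becomes L (rule W7).
    last_strong = base
    for i, t in enumerate(types):
        if t in ("L", "R"):
            last_strong = t
        elif t == "EN" and last_strong == "L":
            types[i] = "L"

    # Neutrals take the direction of their neighbours if both sides agree,
    # otherwise the base direction (rules N1/N2); numbers count as R here.
    i = 0
    while i < n:
        if types[i] != "ON":
            i += 1
            continue
        j = i
        while j < n and types[j] == "ON":
            j += 1
        before = base if i == 0 else ("L" if types[i - 1] == "L" else "R")
        after = base if j == n else ("L" if types[j] == "L" else "R")
        types[i:j] = [before if before == after else base] * (j - i)
        i = j

    # Implicit levels (rules I1/I2) for base level 0 (LTR) or 1 (RTL).
    table = {"R": 1, "L": 2, "EN": 2} if base_rtl else {"L": 0, "R": 1, "EN": 2}
    return [table[t] for t in types]


def visual(text, base_rtl=False):
    """Return the visual (left-to-right on screen) order of a logical string."""
    levels = resolve_levels(text, base_rtl)
    chars = list(text)
    # Reverse runs from the highest level down to level 1 (rule L2).
    for level in range(max(levels, default=0), 0, -1):
        i = 0
        while i < len(chars):
            if levels[i] < level:
                i += 1
                continue
            j = i
            while j < len(chars) and levels[j] >= level:
                j += 1
            chars[i:j] = chars[i:j][::-1]
            i = j
    return "".join(chars)
```

Under this model, logical "HEBREW 456" and logical "456 HEBREW" in LTR
reading order both come out as "456 WERBEH" (problem 1), and in RTL
reading order the logical string "HEBREW 3." is a substring of
"HEBREW 3.14159" while visual ".3 WERBEH" is not a substring of
"3.14159 WERBEH" (problem 3).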

As a minimum adjustment, I think the search must ignore the formatting
characters. An option to show (or operate on, in search & replace) only
those matches that also look the same visually is recommended.
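
A minimal Python sketch of the "ignore the formatting characters"
adjustment follows. It is an illustration only, not how Emacs search is
implemented. The set of ignored characters is an assumption limited to
the ones named in this thread (LRM, RLM, LRE, RLE, PDF, LRO, RLO), and
`find_ignoring_controls' is a hypothetical helper name.

```python
# Assumption: only the bidi formatting characters named in this thread.
BIDI_CONTROLS = frozenset(
    "\u200e\u200f"              # LRM, RLM
    "\u202a\u202b\u202c"        # LRE, RLE, PDF
    "\u202d\u202e"              # LRO, RLO
)


def strip_controls(text):
    """Return text with the bidi formatting characters removed."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)


def find_ignoring_controls(haystack, needle):
    """Find needle in haystack, skipping formatting characters in both.

    Returns (start, end) offsets into the ORIGINAL haystack, so the
    caller can highlight the match, or None if there is no match.
    """
    # Remember where each kept character came from in the original text.
    kept = [(i, ch) for i, ch in enumerate(haystack) if ch not in BIDI_CONTROLS]
    plain = "".join(ch for _, ch in kept)
    target = strip_controls(needle)
    if not target:
        return None
    pos = plain.find(target)
    if pos < 0:
        return None
    start = kept[pos][0]
    end = kept[pos + len(target) - 1][0] + 1
    return start, end
```

The returned span may itself contain formatting characters, which is
what a replace operation would then have to decide how to treat.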

> 3. Bidi formatting codes are retained

Agreed, but see my comment on search.

> 7. Paragraph base direction
>
>    There is a buffer-specific variable `paragraph-direction' that
>    allows to override this dynamic detection of the direction of each
>    paragraph, and force a certain base direction on all paragraphs in
>    the buffer.  I expect, for example, each major mode for a
>    programming language to force the left-to-right paragraph
>    direction, because programming languages are written left to right,
>    and right-to-left scripts appear in such buffers only in strings
>    embedded in the program or in comments.

I think a better name is `bidi-paragraphs-direction' or even
`bidi-paragraphs-reading-direction'. Note the `s' in paragraphs,
because it influences all the paragraphs in the buffer.

There should be a key to toggle this variable. It will be very
useful for the minibuffer.
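
To make the override being discussed concrete, here is a small Python
model of it. It is a sketch under stated assumptions, not the Emacs
implementation: the dynamic detection is assumed to use the first
strong directional character of each paragraph (as in rules P2 and P3
of the Unicode Bidirectional Algorithm), paragraphs are assumed to be
separated by blank lines, and the function names and the `forced'
parameter are hypothetical.

```python
import unicodedata


def detect_direction(paragraph):
    """Guess the base direction from the first strong character."""
    for ch in paragraph:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "left-to-right"
        if bidi_class in ("R", "AL"):
            return "right-to-left"
    # No strong character at all: assume left-to-right.
    return "left-to-right"


def paragraph_directions(text, forced=None):
    """Return one direction per paragraph.

    forced plays the role of the buffer-wide variable: when it is set,
    it overrides the per-paragraph detection for ALL paragraphs.
    """
    paragraphs = text.split("\n\n")
    return [forced or detect_direction(p) for p in paragraphs]
```

A toggle key would then amount to switching `forced' between the two
directions (or back to None) for the current buffer.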

> 8. User control of visual order

Do you intend to support all the explicit formatting characters (LRO is
especially important, as it allows storing visual strings as is) or just
the implicit (and more commonly used) LRM and RLM?

>    This design kills two birds: (a) it produces text that is compliant
>    with other applications, and will display the same as in Emacs, and
>    (b) it avoids the need to invent yet another Emacs infrastructure
>    feature to keep information such as paragraph direction outside of
>    the text itself.

While you can store the LRM and RLM in ISO-8859-8 encoding, there is no
way to store the other formatting characters.
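
The encoding limitation can be checked with a short Python sketch. It
relies on Python's built-in "iso-8859-8" codec as a stand-in for the
encoding tables and says nothing about how Emacs itself encodes files.
The helper name `storable_in_iso_8859_8' is made up for illustration.

```python
FORMATTING_CHARACTERS = {
    "LRM": "\u200e",
    "RLM": "\u200f",
    "LRE": "\u202a",
    "RLE": "\u202b",
    "PDF": "\u202c",
    "LRO": "\u202d",
    "RLO": "\u202e",
}


def storable_in_iso_8859_8(ch):
    """Return True if the character has a byte in ISO-8859-8."""
    try:
        ch.encode("iso-8859-8")
        return True
    except UnicodeEncodeError:
        return False


# Names of the formatting characters that survive a save in ISO-8859-8.
storable = sorted(
    name for name, ch in FORMATTING_CHARACTERS.items()
    if storable_in_iso_8859_8(ch)
)
```

Any text relying on the characters that fail this check would need a
Unicode encoding such as UTF-8 to round-trip.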

> That is all for now.  If you have comments or questions, you are
> welcome to voice them.

I found an editor that supports all the formatting characters, Yudit
(http://www.yudit.org/). It is GPLed; maybe you can use it.

The W3C recommends not using explicit formatting characters (i.e.
RLO/LRO/RLE/LRE/PDF) and using markup instead (see
http://www.w3.org/International/questions/qa-bidi-controls ,
especially the "reasons" section).

Ehud.


--
 Ehud Karni           Tel: +972-3-7966-561  /"\
 Mivtach - Simon      Fax: +972-3-7976-561  \ /  ASCII Ribbon Campaign
 Insurance agencies   (USA) voice mail and   X   Against   HTML   Mail
 http://www.mvs.co.il  FAX:  1-815-5509341  / \
 GnuPG: 98EA398D <http://www.keyserver.net/>    Better Safe Than Sorry
