From: Eli Zaretskii
Newsgroups: gmane.emacs.bidi,gmane.emacs.devel
Subject: Column numbering in bidirectional display
Date: Fri, 21 May 2010 12:08:48 +0300
Message-ID: <83tyq1pqov.fsf@gnu.org>
Reply-To: Eli Zaretskii
To: emacs-devel@gnu.org
Cc: emacs-bidi@gnu.org

With most of the basic features needed for displaying bidirectional text out of my way (the notable omission so far is reordering display strings), development now enters the application level, albeit on a very basic level for now.

One of the major issues on this level is the semantics of the column numbering. In the unidirectional case, this is trivial: column numbers start at zero at the left margin and increase linearly as we move to the right. In the bidirectional case, we have two complications.

First, there are right-to-left (R2L) lines made entirely of R2L characters. They are displayed starting at the right margin of the window, like this:

                                  ZYX WVU TSRQ PONMLKJIH GFEDCBA

What should current-column return when point is before A, i.e. at the first character of the line in the reading order, which is at the right margin of the window on display?

The other complication is mixed L2R and R2L text. Example of how we display a L2R line that includes some R2L characters:

  EDCBA abcde fghij

Here A is the first character of the line in buffer's logical order. What should current-column return when point is before A?

A similar example for displaying a R2L line that includes some L2R text:

                                  JIHGF EDCBA abcde

and we have the same dilemma regarding the value of current-column when point is before a.

Currently, current-column (and move-to-column, and other primitives in indent.c) work in buffer's logical order, disregarding the reordering of characters for display. That is why current-column returns zero for all the situations I described above. It also counts columns in strict logical order. For example, here are the column numbers for each character of the last example (numbers that need more than one digit are written vertically):

  JIHGF EDCBA abcde
  11111111987612345
  76543210

This might surprise at first, and might even look terribly wrong, but it turns out that users expect that in bidirectional text. At least MS Word behaves _exactly_ like this, AFAICS.

Moreover, this makes a surprising number of basic Emacs features work correctly even though the underlying Lisp code is entirely oblivious to bidi reordering. One example is Dired, when file names include R2L characters: I was pleasantly surprised to see that it puts the cursor in the correct place within the file name. Another example is the various features that manipulate indentation.

If we decide that columns should be numbered in their screen order, from left to right, then we will need to:

 . Rewrite primitives in indent.c to be bidi-aware, i.e. advance by calling functions from bidi.c rather than just incrementing character positions. This would complicate the parts that move backwards, because there's no code in bidi.c that can do that, and it's not trivial to write such code.

 . Fix all the Lisp code that uses these primitives to not assume that column zero is necessarily the first character of the line that follows a newline.

Admittedly, there are some features which need to be fixed even if we keep the current semantics of column numbering. C-e (just fixed 2 days ago) is one example. But I think the number of such features is much smaller than if we number columns in visual screen order.

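To make the logical-order numbering concrete, here is a toy model in Python. It is only a sketch under loud assumptions, not the algorithm in bidi.c: it uses the convention of the examples (uppercase is R2L, lowercase is L2R, everything else is neutral), it handles neutrals with one simplified rule, and visual_order is a made-up name. It prints zero-based columns, the way current-column counts them, so each value is one less than the number in the figure.

```python
def visual_order(logical, base_rtl):
    """Return the logical indices of `logical` in left-to-right screen order.

    Toy model only: uppercase = R2L, lowercase = L2R, anything else neutral.
    """
    n = len(logical)
    base_dir = "R" if base_rtl else "L"

    # Strong direction of each character, or None for neutrals.
    dirs = ["R" if ch.isupper() else "L" if ch.islower() else None
            for ch in logical]

    # A run of neutrals takes the direction of its neighbours when they
    # agree, and the paragraph direction otherwise (simplified rule).
    i = 0
    while i < n:
        if dirs[i] is not None:
            i += 1
            continue
        j = i
        while j < n and dirs[j] is None:
            j += 1
        before = dirs[i - 1] if i > 0 else base_dir
        after = dirs[j] if j < n else base_dir
        fill = before if before == after else base_dir
        dirs[i:j] = [fill] * (j - i)
        i = j

    # Embedding levels: even levels read L2R, odd levels read R2L.
    if base_rtl:
        levels = [1 if d == "R" else 2 for d in dirs]
    else:
        levels = [0 if d == "L" else 1 for d in dirs]

    # From the highest level down to 1, reverse every maximal run of
    # characters whose level is at least the current one.
    order = list(range(n))
    for lvl in range(max(levels, default=0), 0, -1):
        pos = 0
        while pos < n:
            end = pos
            while end < n and levels[order[end]] >= lvl:
                end += 1
            if end > pos:
                order[pos:end] = order[pos:end][::-1]
                pos = end
            else:
                pos += 1
    return order


# The R2L line from the last example, in buffer (logical) order.
logical = "abcde ABCDE FGHIJ"
order = visual_order(logical, base_rtl=True)
print("".join(logical[i] for i in order))  # JIHGF EDCBA abcde
print(order)  # logical column of each screen position, left to right
```

Reading the second line of output from left to right gives the logical column of each character as it appears on screen: the numbers fall from 16 to 5 across the R2L part and then climb from 0 to 4 across abcde.
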
So on balance, I think we should keep the current semantics of the column numbering, whereby columns are numbered in strict logical order.

If we decide to go that way, we will need to provide primitives or subroutines to get to the visually first and last characters of a visual line. That's because some features need that; see the thread "Re: Hl-line and visual-line" for one example. beginning-of-visual-line and end-of-visual-line sound like a good starting point.

Comments are welcome.
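To show why separate primitives for the visual ends of a line are needed, here is a small Python sketch. It is an illustration only: visual_ends is a hypothetical name, not a proposed Emacs primitive, and the embedding levels are assigned by hand for the R2L example line (level 2 for the L2R run abcde, level 1 for the rest) rather than computed by real bidi code.

```python
def visual_ends(levels):
    """Return (leftmost, rightmost) logical indices for one displayed line.

    `levels` holds one embedding level per character in logical order;
    even levels read left to right, odd levels right to left.
    Returns None for an empty line.
    """
    order = list(range(len(levels)))
    # From the highest level down to 1, reverse every maximal run of
    # characters whose level is at least the current one.
    for lvl in range(max(levels, default=0), 0, -1):
        pos = 0
        while pos < len(order):
            end = pos
            while end < len(order) and levels[order[end]] >= lvl:
                end += 1
            if end > pos:
                order[pos:end] = order[pos:end][::-1]
                pos = end
            else:
                pos += 1
    if not order:
        return None
    return order[0], order[-1]


logical = "abcde ABCDE FGHIJ"      # buffer order of the R2L example line
levels = [2] * 5 + [1] * 12        # hand-assigned levels (assumption)
left, right = visual_ends(levels)
print(left, logical[left])         # 16 J
print(right, logical[right])       # 4 e
```

In this model the leftmost character on screen is the last one in the buffer, and the rightmost is in the middle of the line in logical order, so neither visual end can be found by moving to column zero or to the end of the line.
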