From: Eli Zaretskii
Newsgroups: gmane.emacs.bidi,gmane.emacs.devel
Subject: Re: Mixed L2R and R2L paragraphs and horizontal scroll
Date: Tue, 02 Feb 2010 21:30:54 +0200
Message-ID: <83mxzrh1u9.fsf@gnu.org>
References: <83tyu3iu6b.fsf@gnu.org> <4B645FF4.30205@gmx.at>
 <83ockbil1v.fsf@gnu.org> <4B646AD3.1010102@gmx.at>
 <83mxzviio5.fsf@gnu.org> <4B647AE5.5090001@gmx.at>
 <83ljffif09.fsf@gnu.org> <4B648C6E.8080905@gmx.at>
 <83eil7i84h.fsf@gnu.org> <4B654F24.5020603@gmx.at>
 <83aavui23z.fsf@gnu.org> <4B65E199.6070100@gmx.at>
 <833a1lioie.fsf@gnu.org> <4B66922D.9060304@gmx.at>
 <83tyu0hfll.fsf@gnu.org> <4B67DD88.6060602@gmx.at>
In-reply-to: <4B67DD88.6060602@gmx.at>
Reply-To: Eli Zaretskii
To: martin rudalics
Cc: emacs-bidi@gnu.org, emacs-devel@gnu.org

> Date: Tue, 02 Feb 2010 09:08:40 +0100
> From: martin rudalics
> CC: emacs-bidi@gnu.org, emacs-devel@gnu.org
>
> > I already implemented such a feature: a per-buffer variable that
> > forces all paragraphs to be either L2R or R2L.  A value of `nil' means
> > the direction of each paragraph is dynamically determined by applying
> > the rules described in the Unicode Standard Annex 9 (UAX#9).
>
> I meant a function which does (1) set such a variable

You mean, besides "M-x set-variable RET"?

> and (2) apply it to one or all windows showing a buffer.

Currently, the variable is per-buffer, so it affects all the windows
showing that buffer.  Why would one need to do that only in some
windows showing a buffer?

> Calling this function would temporarily override any L2R/R2L
> specifications specified for a file, buffer, or paragraph.

There are no specifications for a file (unless you set the variable
I'm talking about in the file's local variables section).

As for individual paragraphs, control of their base direction is not
by some Emacs setting, but by inserting special formatting characters
at the beginning of each paragraph.  These characters (LRM and RLM)
are supposed to be invisible by default, i.e. displayed as a
zero-width space, but they have strong directionality, L for LRM and
R for RLM.  Since UAX#9 says that a paragraph's base direction is
determined by its first strong directional character, each one of
these two characters sets the paragraph direction according to its
own directionality.

It would be easy enough to write a command that inserts LRM or RLM at
the beginning of each paragraph in a buffer or region.  But that's
application level, and I still have a lot of turf to cover before I
get to that.

> BTW, do UAX#9 paragraphs require new definitions for `paragraph-start'
> or `paragraph-separate'?

UAX#9 does have its own definition of a paragraph:

  Paragraphs are divided by the Paragraph Separator or appropriate
  Newline Function [...].  Paragraphs may also be determined by
  higher-level protocols: for example, the text in two different
  cells of a table will be in different paragraphs.

and the table of Bidirectional Character Types says that a Paragraph
Separator type is assigned to the following characters:

  Paragraph separator, appropriate Newline Functions, higher-level
  protocol paragraph determination

Accordingly, in the Unicode Database, the characters CR and LF
(a.k.a. NL) that normally separate lines have the Paragraph Separator
(B) type.  This could sound like a disaster (each line being a
separate paragraph), since Emacs uses hard newlines to fill
paragraphs.  Fortunately, UAX#9 leaves a fire escape: it says (see
above) that paragraphs can also be determined by ``higher-level
protocols''.

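The character types mentioned above can be checked against the Unicode
Database from outside Emacs.  The following is a small illustrative
sketch in Python (not Emacs code); it assumes only the standard
`unicodedata` module, and the `samples` table and its labels are made
up for this example.

```python
import unicodedata

# Hypothetical labels for the characters discussed in the message.
samples = {
    "LF": "\n",                 # line feed (a.k.a. NL)
    "CR": "\r",                 # carriage return
    "PARAGRAPH SEPARATOR": "\u2029",
    "LRM": "\u200e",            # LEFT-TO-RIGHT MARK
    "RLM": "\u200f",            # RIGHT-TO-LEFT MARK
}

# Map each label to the bidirectional class recorded in the Unicode Database.
bidi_types = {name: unicodedata.bidirectional(ch) for name, ch in samples.items()}

for name, bidi in bidi_types.items():
    print(f"{name}: {bidi}")
```

CR and LF report the same B (Paragraph Separator) class as U+2029,
while LRM and RLM report the strong classes L and R respectively.
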
I used this fire escape to preserve the normal Emacs notion of a
paragraph, including the usual sense of `paragraph-start' and
`paragraph-separate'.  For instance, the code that determines the base
direction of each paragraph looks back for a position that matches
`paragraph-start', and then finds the first strong directional
character after that.

So UAX#9 does define a default for paragraph start that is different
from the Emacs one, but gives us a way to preserve ours.  Which we
did.
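
The lookup described above (find the paragraph start, then the first
strong directional character) can be sketched roughly as follows.
This is an illustration in Python, not the Emacs implementation: the
`PARAGRAPH_BREAK` regexp is a hypothetical stand-in for
`paragraph-start' that treats blank lines as paragraph boundaries, the
function names are invented, and explicit embedding or override
characters are ignored.

```python
import re
import unicodedata

# Assumption: a blank line separates paragraphs, so a single hard
# newline does NOT start a new paragraph (the "higher-level protocol").
PARAGRAPH_BREAK = re.compile(r"\n[ \t]*\n")


def paragraph_bounds(text, pos):
    """Return (start, end) of the paragraph containing index POS."""
    start = 0
    # Look back: the last paragraph break that ends at or before POS.
    for m in PARAGRAPH_BREAK.finditer(text, 0, pos):
        start = m.end()
    # Look forward: the next paragraph break, or the end of the text.
    m = PARAGRAPH_BREAK.search(text, start)
    end = m.start() if m else len(text)
    return start, end


def base_direction(text, pos, default="L2R"):
    """Base direction of the paragraph at POS, from its first strong character."""
    start, end = paragraph_bounds(text, pos)
    for ch in text[start:end]:
        bidi = unicodedata.bidirectional(ch)
        if bidi == "L":
            return "L2R"
        if bidi in ("R", "AL"):
            return "R2L"
    # No strong character found (e.g. digits or whitespace only).
    return default


text = "abc\ndef\n\n\u05d0\u05d1 xyz\nmore\n\n\u200fabc"
print(base_direction(text, 0))                   # first paragraph
print(base_direction(text, text.index("more")))  # second paragraph
print(base_direction(text, len(text) - 1))       # third paragraph, starts with RLM
```

In the sample text, the second paragraph spans a hard newline yet is
treated as one paragraph whose direction comes from its leading Hebrew
letters, and the third paragraph is forced to R2L by the RLM placed
before its Latin text.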