From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Message Mode and bidi
Date: Tue, 20 Feb 2024 16:36:32 +0200
Message-ID: <86o7cbnqxb.fsf@gnu.org>
References: <87v86ldzpw.fsf@aura.christopherculver.com>
 <867cj1qg4m.fsf@gnu.org>
 <8734to16tw.fsf@aura.christopherculver.com>
 <86bk8cw20e.fsf@p200300d627023a0ad1f3c3db8ccb4c50.dip0.t-ipconnect.de>
 <87frxoyuq4.fsf@aura.christopherculver.com>
 <87ttm4klnl.fsf@ericabrahamsen.net>
 <86y1bfolgf.fsf@gnu.org>
 <87h6i3lnp8.fsf@ericabrahamsen.net>
In-Reply-To: <87h6i3lnp8.fsf@ericabrahamsen.net> (message from Eric
 Abrahamsen on Mon, 19 Feb 2024 21:16:51 -0800)
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
To: Eric Abrahamsen
Cc: emacs-devel@gnu.org
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:316397

> From: Eric Abrahamsen
> Date: Mon, 19 Feb 2024 21:16:51 -0800
>
>
> Eli Zaretskii writes:
>
> >> From: Eric Abrahamsen
> >> Date: Mon, 19 Feb 2024 16:46:22 -0800
> >>
> >> Christopher Culver via "Emacs development discussions."
> >> writes:
> >>
> >> > Joost Kremers writes:
> >> >> When you compose a new message, is there a line "--text follows this line--"
> >> >> separating the headers and the message text? In my case, there is (I use mu4e)
> >> >> and when I type Arabic text on the line below this text, I get the effect you
> >> >> mention. If I leave an empty line after "--text follows this line--", bidi works
> >> >> as expected.
> >> >
> >> > Indeed, if I just go down one line and then begin typing, bidi works as
> >> > expected. I am feeling very foolish that I did not even try this. Thank
> >> > you for clearing up this problem, and for shedding light on how Emacs
> >> > considers paragraphs.
> >>
> >> This is a bit weird, because the value of `mail-header-separator'
> >> ("--text follows this line--") is added to both `paragraph-start' and
> >> `paragraph-separate') in `message-mode'. You'd think one of those would
> >> do it.
> >
> > Emacs doesn't use paragraph-separate and paragraph-start to define
> > where a paragraph starts and ends, for the purposes of determining the
> > base directionality of a paragraph. It uses separate variables for
> > that, see the node "Bidirectional Editing" in the Emacs user manual.
> > The reason for using separate variables is because several modes,
> > including (but not limited to) message-mode, set the former variables
> > to regexps that get in the way of bidi reordering, and could easily
> > produce wrong results on display.
>
> Do you think we'd stand a chance of finding values for
> bidi-paragraph-start|separate-re that would resolve this particular
> issue?

We already have those values described in the Emacs manual:

     Each paragraph of bidirectional text can have its own “base
  direction”, either right-to-left or left-to-right.  Text in
  left-to-right paragraphs begins on the screen at the left margin of the
  window and is truncated or continued when it reaches the right margin.
  By contrast, text in right-to-left paragraphs is displayed starting at
  the right margin and is continued or truncated at the left margin.  By
  default, paragraph boundaries are empty lines, i.e., lines consisting
  entirely of whitespace characters.  To change that, you can customize
  the two variables ‘bidi-paragraph-start-re’ and
  ‘bidi-paragraph-separate-re’, whose values should be regular expressions
  (strings); e.g., to have a single newline start a new paragraph, set
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  both of these variables to ‘"^"’.  These two variables are buffer-local
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  (*note Locals::).

But beware: doing this in Emacs will cause effects that are unpleasant
to readers of bidirectional text, because you could have "chess-like"
text display, like this:

  asasasasasasasasasasasasassa
                                     ASASASASASASASASASASASASA
  xcxcxcxcxcxcxcxcxcxcxcxcxc
                                            JKJNJKKKNKNKNKNKNK

etc.  Here upper-case letters stand for RTL (like Arabic or Farsi) text
and lower-case letters stand for LTR (like Latin or Cyrillic) text.

Let me explain.  UBA, the Unicode Bidirectional Algorithm which Emacs
implements, was designed for text-processing programs that wrap long
lines without inserting hard newlines.  For those applications, a
single hard newline indicates a new paragraph, and thus recalculating
the base direction of a paragraph after a newline is justified.  But
in Emacs, we have a lot of text filled and wrapped using hard
newlines, and filling can insert a newline in an arbitrary place
inside the paragraph.  If it happens that a newline was inserted
before a strong right-to-left character, under the strict UBA rules
the next line will be considered as a paragraph with right-to-left
base direction, and rendered starting at the right window edge.  Which
leads to the above "chess-like" display.  Since that is basically
unacceptable for Emacs users, we require at least one empty line to
separate paragraphs.

A single empty line before the body of an email message that needs to
be rendered right to left is a small price to pay for avoiding the
horrible display effect of changing the paragraph's base direction
after each newline.

An alternative would be to use visual-line-mode instead of wrapping
using hard newlines, but that is still relatively rare in Emacs,
especially in email messages.
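
To make the effect concrete, here is a rough Python sketch of the
"first strong character" rule that decides a paragraph's base
direction.  It is only an illustration, not how Emacs implements it:
the function names and sample strings are made up for this example, it
ignores directional isolates and explicit embeddings, and it only
aligns lines to one edge without reordering any characters.

```python
import re
import unicodedata

# Hypothetical sample: one hard-filled paragraph mixing Latin and Hebrew lines.
SAMPLE = (
    "english text that was filled\n"
    "\u05e9\u05dc\u05d5\u05dd \u05e2\u05d5\u05dc\u05dd\n"
    "more english text\n"
    "\u05e2\u05d5\u05d3 \u05d8\u05e7\u05e1\u05d8"
)


def base_direction(paragraph, default="LTR"):
    """Return "LTR" or "RTL" according to the first strong character.

    Simplification: directional isolates and embeddings are ignored.
    """
    for ch in paragraph:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "LTR"
        if bidi_class in ("R", "AL"):
            return "RTL"
    # No strong character at all (digits, punctuation, whitespace only).
    return default


def split_paragraphs(text, single_newline):
    """Split TEXT into paragraphs.

    single_newline=True mimics setting both bidi regexps to "^";
    single_newline=False mimics the default (empty lines separate).
    """
    if single_newline:
        parts = text.split("\n")
    else:
        parts = re.split(r"\n[ \t]*\n", text)
    return [p for p in parts if p.strip()]


def render(text, single_newline, width=40):
    """Align each line to the edge implied by its paragraph's direction."""
    lines = []
    for para in split_paragraphs(text, single_newline):
        rtl = base_direction(para) == "RTL"
        for line in para.split("\n"):
            lines.append(line.rjust(width) if rtl else line)
    return lines


if __name__ == "__main__":
    print("\n".join(render(SAMPLE, single_newline=True)))   # chess-like
    print()
    print("\n".join(render(SAMPLE, single_newline=False)))  # one edge
```

With single_newline=True every line becomes its own paragraph and the
sample alternates between the two edges.  With the default rule the
whole block is one left-to-right paragraph.  The same function also
shows the original problem in this thread: Arabic text typed directly
under "--text follows this line--" belongs to a paragraph whose first
strong character is the Latin "t", so it gets a left-to-right base
direction until an empty line separates the two.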