* bug#22429: Force character to be recognized as LTR inside RTL paragraph
From: Filipe Moreira @ 2016-01-21 21:14 UTC
To: 22429

Hi everyone,

I’m using Emacs as a LaTeX editor, with the AUCTeX mode. One document I’m
authoring is written in English with some paragraphs in Hebrew or Greek.

The issue I have is with mixing some neutral characters that need to be LTR
inside a paragraph which is RTL. An example of this is the backslash (i.e. ‘\’)
character used by LaTeX to signal its commands. Inside an RTL paragraph I
ideally want to force Emacs to always interpret the backslash character, as
well as the opening and closing braces (i.e. {}), as LTR.

This is not what happens at the moment. Here I have a visual representation of
the problem:
http://emacs.stackexchange.com/questions/19696/handling-left-to-right-inside-right-to-left-paragraphs-using-emacs-and-auctex

Is it possible to whitelist some characters that should always be interpreted
as LTR?

Thanks
Filipe Moreira

--
Freelance Web Developer (Ruby & Javascript)
http://coderelax.com/
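For readers who want to check which bidirectional class the characters named in the report actually carry, here is a minimal sketch. It assumes Python's standard `unicodedata` module as a stand-in for the Unicode Character Database; it is not how Emacs looks the property up, and the helper name `bidi_classes` is made up for illustration.

```python
import unicodedata


def bidi_classes(text):
    """Return (character, bidi class) pairs from Python's bundled UCD."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


# The LaTeX syntax characters from the report: backslash and braces.
latex_syntax = bidi_classes("\\{}")

# A Hebrew letter and a Latin letter for comparison (both strong classes).
hebrew_alef = unicodedata.bidirectional("\u05D0")  # HEBREW LETTER ALEF
latin_f = unicodedata.bidirectional("f")

if __name__ == "__main__":
    for ch, cls in latex_syntax:
        print(repr(ch), cls)
    print("alef:", hebrew_alef, "f:", latin_f)
```

All three LaTeX characters come back as ON ("Other Neutral"), while the Hebrew letter is R and the Latin letter is L, which is why the backslash and braces have no direction of their own.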
* bug#22429: Force character to be recognized as LTR inside RTL paragraph
From: Eli Zaretskii @ 2016-01-22 8:08 UTC
To: Filipe Moreira; +Cc: 22429

> Date: Thu, 21 Jan 2016 13:14:22 -0800
> From: "Filipe Moreira" <famoreira@gmail.com>
>
> I’m using Emacs as a LaTeX editor, with the AUCTeX mode. One document I’m
> authoring is written in English with some paragraphs in Hebrew or Greek.
>
> The issue I have is with mixing some neutral characters that need to be LTR
> inside a paragraph which is RTL. An example of this is the backslash (i.e. ‘\’)
> character used by LaTeX to signal its commands. Inside an RTL paragraph I
> ideally want to force Emacs to always interpret the backslash character, as
> well as the opening and closing braces (i.e. {}), as LTR.
>
> This is not what happens at the moment. Here I have a visual representation of
> the problem:
> http://emacs.stackexchange.com/questions/19696/handling-left-to-right-inside-right-to-left-paragraphs-using-emacs-and-auctex
>
> Is it possible to whitelist some characters that should always be interpreted
> as LTR?

The directionality of characters is determined by their bidirectional
class property as defined by the Unicode Character Database. Emacs
uses those definitions in its implementation of the UBA, the Unicode
Bidirectional Algorithm, when it lays out text for display.
Punctuation characters such as \, {, and } have "neutral
directionality": they take the directionality of the surrounding text,
and if the directionality on either side is different, they default to
the paragraph's base direction, which is RTL in your case. So that is
what you see.

Emacs being Emacs, you can programmatically change the bidirectional
class of any character, but that change has global effect: it will
affect the directionality of that character everywhere in the Emacs
session. So this is not recommended.

The correct solution to these problems is to wrap the footnote block
in the LRE..PDF or LRI..PDI control characters, so that the footnote
is rendered independently of the surrounding bidirectional context.
See the example below. Not sure if LaTeX will DTRT with directional
control characters, but if it doesn't, that's a bug/misfeature in
LaTeX.

    \begin{hebrew}
    \pstart

    בְּרֵאשִׁ֖ית\footnoteA{This is a Hebrew related footnote} בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃

    \pend
    \end{hebrew}

Another possibility is to insert newlines between the footnote and the
surrounding text, as shown below. Not sure if LaTeX will be happy
with that, and I think it's uglier anyway.

    \begin{hebrew}
    \pstart

    בְּרֵאשִׁ֖ית

    \footnoteA{This is a Hebrew related footnote}

    בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃

    \pend
    \end{hebrew}

I don't think there's a bug to fix here, so I'm going to close this
bug report. Any objections?
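Because the directional control characters are invisible, it can help to see the wrapping written out with explicit code points. The following is only a sketch under assumptions: the helper names `embed_ltr` and `isolate_ltr` are hypothetical, and the thread itself does not confirm whether LaTeX accepts these characters in its input.

```python
import unicodedata

# Explicit directional formatting characters defined by Unicode.
LRE, PDF = "\u202A", "\u202C"  # LEFT-TO-RIGHT EMBEDDING .. POP DIRECTIONAL FORMATTING
LRI, PDI = "\u2066", "\u2069"  # LEFT-TO-RIGHT ISOLATE .. POP DIRECTIONAL ISOLATE


def embed_ltr(fragment):
    """Wrap fragment in an LRE..PDF pair."""
    return LRE + fragment + PDF


def isolate_ltr(fragment):
    """Wrap fragment in an LRI..PDI pair."""
    return LRI + fragment + PDI


footnote = r"\footnoteA{This is a Hebrew related footnote}"
embedded = embed_ltr(footnote)
isolated = isolate_ltr(footnote)

if __name__ == "__main__":
    # Print the code points so the invisible characters become visible.
    for ch in (isolated[0], isolated[-1], embedded[0], embedded[-1]):
        print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

The wrapped string would then be placed between the two Hebrew runs, in the same position the footnote occupies in the example above.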
* bug#22429: Force character to be recognized as LTR inside RTL paragraph
From: Eli Zaretskii @ 2016-01-22 8:24 UTC
To: famoreira; +Cc: 22429

> Date: Fri, 22 Jan 2016 10:08:06 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: 22429@debbugs.gnu.org
>
> The correct solution to these problems is to wrap the footnote block
> in the LRE..PDF or LRI..PDI control characters, so that the footnote
> is rendered independently of the surrounding bidirectional context.

Actually, LRM should also work, you just need to put it on both sides
of the footnote, like below:

    \begin{hebrew}
    \pstart

    בְּרֵאשִׁ֖ית\footnoteA{This is a Hebrew related footnote} בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃

    \pend
    \end{hebrew}
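The LRM suggestion could be automated along these lines. This is a rough sketch, not something proposed in the thread: the function name `wrap_footnotes_with_lrm` and the regular expression are assumptions, the pattern only handles footnote arguments without nested braces, and running it twice on the same text adds a second pair of marks.

```python
import re

LRM = "\u200E"  # LEFT-TO-RIGHT MARK: invisible, with strong class L

# Matches \footnote, \footnoteA, ... followed by one brace group
# that contains no nested braces (a deliberate simplification).
FOOTNOTE = re.compile(r"\\footnote[A-Za-z]*\{[^{}]*\}")


def wrap_footnotes_with_lrm(line):
    """Put an LRM on both sides of every simple footnote command."""
    return FOOTNOTE.sub(lambda m: LRM + m.group(0) + LRM, line)


# Two Hebrew letters, a footnote, a space, two more Hebrew letters.
sample = "\u05D0\u05D1\\footnoteA{note} \u05D2\u05D3"
wrapped = wrap_footnotes_with_lrm(sample)

if __name__ == "__main__":
    print([f"U+{ord(c):04X}" for c in wrapped if c == LRM])
```

With a strong L character on each side, the backslash and braces sit between left-to-right neighbours, so they no longer fall back to the paragraph's RTL base direction.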
* bug#22429: Force character to be recognized as LTR inside RTL paragraph
From: Andy Moreton @ 2016-01-22 9:31 UTC
To: 22429

On Fri 22 Jan 2016, Eli Zaretskii wrote:

>> Date: Fri, 22 Jan 2016 10:08:06 +0200
>> From: Eli Zaretskii <eliz@gnu.org>
>> Cc: 22429@debbugs.gnu.org
>>
>> The correct solution to these problems is to wrap the footnote block
>> in the LRE..PDF or LRI..PDI control characters, so that the footnote
>> is rendered independently of the surrounding bidirectional context.
>
> Actually, LRM should also work, you just need to put it on both sides
> of the footnote, like below:
>
>     \begin{hebrew}
>     \pstart
>
>     בְּרֵאשִׁ֖ית\footnoteA{This is a Hebrew related footnote} בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃
>
>     \pend
>     \end{hebrew}

While reading this message, I noticed odd behaviour of cursor motion
with <right> and <left> (i.e. right-char and left-char).

I would expect repeated <right> to move in logical order until the end
of the buffer, but it gets stuck on the newline after "\pstart".
Likewise repeated <left> from the end gets stuck at the newline before
"\pend".

Saving this text in a file "foo.txt" showed the same behaviour (using the
latest emacs-25 branch with "emacs -Q"). Is this expected ?

AndyM
* bug#22429: Force character to be recognized as LTR inside RTL paragraph
From: Eli Zaretskii @ 2016-01-22 14:03 UTC
To: Andy Moreton; +Cc: 22429

> From: Andy Moreton <andrewjmoreton@gmail.com>
> Date: Fri, 22 Jan 2016 09:31:39 +0000
>
> While reading this message, I noticed odd behaviour of cursor motion
> with <right> and <left> (i.e. right-char and left-char).
>
> I would expect repeated <right> to move in logical order until the end
> of the buffer, but it gets stuck on the newline after "\pstart".
> Likewise repeated <left> from the end gets stuck at the newline before
> "\pend".
>
> Saving this text in a file "foo.txt" showed the same behaviour (using the
> latest emacs-25 branch with "emacs -Q"). Is this expected ?

Yes, expected. The paragraph direction changes when you enter a
paragraph that has a different base direction, and the arrow keys are
sensitive to the paragraph base direction.
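To see why the sample text contains paragraphs with different base directions, here is a simplified sketch of the "first strong character" rule. It assumes the Emacs defaults (base direction derived from the text rather than forced through `bidi-paragraph-direction`, and blank lines separating paragraphs), and it ignores the isolate handling the full UBA requires. The function names are hypothetical.

```python
import unicodedata


def base_direction(paragraph, default="L"):
    """Return 'L' or 'R' from the first strong character, else default."""
    for ch in paragraph:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            return "L"
        if cls in ("R", "AL"):
            return "R"
    return default


def paragraph_directions(text):
    """Split on blank lines and report each paragraph's base direction."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    return [base_direction(p) for p in paragraphs]


# Same layout as the example in the thread, with a shortened Hebrew run.
sample = (
    "\\begin{hebrew}\n\\pstart\n\n"
    "\u05D1\u05E8\u05D0\u05E9\u05D9\u05EA \\footnoteA{note}\n\n"
    "\\pend\n\\end{hebrew}"
)
directions = paragraph_directions(sample)

if __name__ == "__main__":
    print(directions)
```

The three paragraphs come out as L, R, L: the lines around `\pstart` and `\pend` start with Latin letters, while the middle paragraph starts with a Hebrew letter. That is the boundary where the arrow keys change their visual meaning.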
* bug#22429: Force character to be recognized as LTR inside RTL paragraph
From: Filipe Moreira @ 2016-01-22 11:54 UTC
To: Eli Zaretskii; +Cc: 22429

Hi Eli,

Thanks for taking the time to look into this.

On Fri, Jan 22, 2016 at 8:08 AM, Eli Zaretskii <eliz@gnu.org> wrote:

> > Date: Thu, 21 Jan 2016 13:14:22 -0800
> > From: "Filipe Moreira" <famoreira@gmail.com>
> >
> > I’m using Emacs as a LaTeX editor, with the AUCTeX mode. One document I’m
> > authoring is written in English with some paragraphs in Hebrew or Greek.
> >
> > The issue I have is with mixing some neutral characters that need to be LTR
> > inside a paragraph which is RTL. An example of this is the backslash (i.e. ‘\’)
> > character used by LaTeX to signal its commands. Inside an RTL paragraph I
> > ideally want to force Emacs to always interpret the backslash character, as
> > well as the opening and closing braces (i.e. {}), as LTR.
> >
> > This is not what happens at the moment. Here I have a visual representation of
> > the problem:
> > http://emacs.stackexchange.com/questions/19696/handling-left-to-right-inside-right-to-left-paragraphs-using-emacs-and-auctex
> >
> > Is it possible to whitelist some characters that should always be interpreted
> > as LTR?
>
> The directionality of characters is determined by their bidirectional
> class property as defined by the Unicode Character Database. Emacs
> uses those definitions in its implementation of the UBA, the Unicode
> Bidirectional Algorithm, when it lays out text for display.
> Punctuation characters such as \, {, and } have "neutral
> directionality": they take the directionality of the surrounding text,
> and if the directionality on either side is different, they default to
> the paragraph's base direction, which is RTL in your case. So that is
> what you see.
>
> Emacs being Emacs, you can programmatically change the bidirectional
> class of any character, but that change has global effect: it will
> affect the directionality of that character everywhere in the Emacs
> session. So this is not recommended.

Although this is not recommended, I would be willing to have the bidi class
property of some characters set to left-to-right, like the example of the
backslash character. Can you point me somewhere regarding this? I saw the
get-char-code-property function but could not find any way to actually change
the setting.

> The correct solution to these problems is to wrap the footnote block
> in the LRE..PDF or LRI..PDI control characters, so that the footnote
> is rendered independently of the surrounding bidirectional context.
> See the example below. Not sure if LaTeX will DTRT with directional
> control characters, but if it doesn't, that's a bug/misfeature in
> LaTeX.
>
>     \begin{hebrew}
>     \pstart
>
>     בְּרֵאשִׁ֖ית\footnoteA{This is a Hebrew related footnote} בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃
>
>     \pend
>     \end{hebrew}

In this example the direction of the surrounding Hebrew text has been changed.
The word בְּרֵאשִׁ֖ית should come before (i.e. to the right of) the word בָּרָ֣א. So
while the footnote command is correctly shown as LTR, the Hebrew text has been
changed. I don't think this is expected. See the updated image
(http://emacs.stackexchange.com/questions/19696/handling-left-to-right-inside-right-to-left-paragraphs-using-emacs-and-auctex)
that shows TextEdit's correct handling of this.

> Another possibility is to insert newlines between the footnote and the
> surrounding text, as shown below. Not sure if LaTeX will be happy
> with that, and I think it's uglier anyway.
>
>     \begin{hebrew}
>     \pstart
>
>     בְּרֵאשִׁ֖ית
>
>     \footnoteA{This is a Hebrew related footnote}
>
>     בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃
>
>     \pend
>     \end{hebrew}

Unfortunately for my use case this is not possible.

> I don't think there's a bug to fix here, so I'm going to close this
> bug report. Any objections?

Is there any chance of having a way to set the Unicode bidirectionality of a
character within each separate mode? Could this be considered a feature?
* bug#22429: Force character to be recognized as LTR inside RTL paragraph
From: Eli Zaretskii @ 2016-01-22 14:01 UTC
To: Filipe Moreira; +Cc: 22429

> From: Filipe Moreira <famoreira@gmail.com>
> Date: Fri, 22 Jan 2016 11:54:45 +0000
> Cc: 22429@debbugs.gnu.org
>
> Emacs being Emacs, you can programmatically change the bidirectional
> class of any character, but that change has global effect: it will
> affect the directionality of that character everywhere in the Emacs
> session. So this is not recommended.
>
> Although this is not recommended, I would be willing to have the bidi class
> property of some characters set to left-to-right, like the example of the
> backslash character.

Can you tell why? There are ways to produce the display you expect
without changing the character properties; I described 3 such ways.
If you change the properties, the text will only display correctly on
your system; any other user who displays your text, either in Emacs or
in another editor that supports bidirectional display, will see the text
in the same jumbled order you wanted to avoid. So I see very little
sense in such changes.

> Can you point me somewhere regarding this? I saw the
> get-char-code-property function but could not find any way to
> actually change the setting.

You want put-char-code-property. Again, I very much recommend not to
do that.

>     \begin{hebrew}
>     \pstart
>
>     בְּרֵאשִׁ֖ית\footnoteA{This is a Hebrew related footnote} בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃
>
>     \pend
>     \end{hebrew}
>
> In this example the direction of the surrounding Hebrew text has been changed.
> The word בְּרֵאשִׁ֖ית should come before (i.e. to the right of) the word בָּרָ֣א. So
> while the footnote command is correctly shown as LTR, the Hebrew text has been
> changed. I don't think this is expected. See the updated image
> (http://emacs.stackexchange.com/questions/19696/handling-left-to-right-inside-right-to-left-paragraphs-using-emacs-and-auctex)
> that shows TextEdit's correct handling of this.

What version of Emacs do you have? The above renders correctly for
me, both in Emacs 24.5 and in the development version. The word
בְּרֵאשִׁ֖ית is shown to the right of the footnote, and all the rest is
shown to the left of it. Maybe you have an older Emacs which somehow
has a bug?

> Is there any chance of having a way to set the Unicode bidirectionality of a
> character within each separate mode? Could this be considered a feature?

I think it would be a misfeature, for the reasons explained above.
It's the same as using a private font to display some character in a
different shape: you are the only one who will enjoy that shape.

However, nothing prevents a mode from using put-char-code-property in
some ingenious ways to do what you want.
* bug#22429: Force character to be recognized as LTR inside RTL paragraph
From: Filipe Moreira @ 2016-01-22 15:15 UTC
To: Eli Zaretskii; +Cc: 22429

On Fri, Jan 22, 2016 at 2:01 PM, Eli Zaretskii <eliz@gnu.org> wrote:

> > From: Filipe Moreira <famoreira@gmail.com>
> > Date: Fri, 22 Jan 2016 11:54:45 +0000
> > Cc: 22429@debbugs.gnu.org
> >
> > Emacs being Emacs, you can programmatically change the bidirectional
> > class of any character, but that change has global effect: it will
> > affect the directionality of that character everywhere in the Emacs
> > session. So this is not recommended.
> >
> > Although this is not recommended, I would be willing to have the bidi class
> > property of some characters set to left-to-right, like the example of the
> > backslash character.
>
> Can you tell why? There are ways to produce the display you expect
> without changing the character properties; I described 3 such ways.
> If you change the properties, the text will only display correctly on
> your system; any other user who displays your text, either in Emacs or
> in another editor that supports bidirectional display, will see the text
> in the same jumbled order you wanted to avoid. So I see very little
> sense in such changes.
>
> > Can you point me somewhere regarding this? I saw the
> > get-char-code-property function but could not find any way to
> > actually change the setting.
>
> You want put-char-code-property. Again, I very much recommend not to
> do that.
>
> >     \begin{hebrew}
> >     \pstart
> >
> >     בְּרֵאשִׁ֖ית\footnoteA{This is a Hebrew related footnote} בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃
> >
> >     \pend
> >     \end{hebrew}
> >
> > In this example the direction of the surrounding Hebrew text has been changed.
> > The word בְּרֵאשִׁ֖ית should come before (i.e. to the right of) the word בָּרָ֣א. So
> > while the footnote command is correctly shown as LTR, the Hebrew text has been
> > changed. I don't think this is expected. See the updated image
> > (http://emacs.stackexchange.com/questions/19696/handling-left-to-right-inside-right-to-left-paragraphs-using-emacs-and-auctex)
> > that shows TextEdit's correct handling of this.
>
> What version of Emacs do you have? The above renders correctly for
> me, both in Emacs 24.5 and in the development version. The word
> בְּרֵאשִׁ֖ית is shown to the right of the footnote, and all the rest is
> shown to the left of it. Maybe you have an older Emacs which somehow
> has a bug?

I have just tested wrapping the footnote command within LRM (on both ends) in a
clean Emacs 24.5.1 (started with -Q) and it worked! This wasn't working in my
normal environment, so I will need to investigate why that is.

> > Is there any chance of having a way to set the Unicode bidirectionality of a
> > character within each separate mode? Could this be considered a feature?
>
> I think it would be a misfeature, for the reasons explained above.
> It's the same as using a private font to display some character in a
> different shape: you are the only one who will enjoy that shape.
>
> However, nothing prevents a mode from using put-char-code-property in
> some ingenious ways to do what you want.

I appreciate your help. This is all new to me and I've already learned a lot
from you and others regarding this. Thank you for making Emacs so great.