* ncr (numeric character reference) to unicode
@ 2009-04-13 20:17 B. T. Raven
2009-04-13 20:52 ` Eli Zaretskii
2009-04-14 3:07 ` Miles Bader
0 siblings, 2 replies; 12+ messages in thread
From: B. T. Raven @ 2009-04-13 20:17 UTC (permalink / raw)
To: help-gnu-emacs
Does any of you know whether nxhtml has the capability to convert
sequences like this:
שַׁלוֹם.
(shalom in Hebrew)
to the equivalent Unicode string? N. Walsh had a couple of .el files
that implemented this, I think, but they also required cl to be loaded.
Will Emacs 23.1 also support bidi when it is released?
Thanks,
Ed
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: ncr (numeric character reference) to unicode
2009-04-13 20:17 ncr (numeric character reference) to unicode B. T. Raven
@ 2009-04-13 20:52 ` Eli Zaretskii
2009-04-14 3:07 ` Miles Bader
1 sibling, 0 replies; 12+ messages in thread
From: Eli Zaretskii @ 2009-04-13 20:52 UTC (permalink / raw)
To: help-gnu-emacs
> Date: Mon, 13 Apr 2009 15:17:47 -0500
> From: "B. T. Raven" <nihil@nihilo.net>
>
> Will Emacs 23.1 also support bidi when it is released?
Sadly, no. Bidirectional editing needs quite a bit of supporting
code and major changes to some fundamental Emacs features, such as
text fill, and no one has succeeded in writing that code yet.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: ncr (numeric character reference) to unicode
2009-04-13 20:17 ncr (numeric character reference) to unicode B. T. Raven
2009-04-13 20:52 ` Eli Zaretskii
@ 2009-04-14 3:07 ` Miles Bader
2009-04-14 16:42 ` B. T. Raven
1 sibling, 1 reply; 12+ messages in thread
From: Miles Bader @ 2009-04-14 3:07 UTC (permalink / raw)
To: help-gnu-emacs
"B. T. Raven" <nihil@nihilo.net> writes:
> Does any of you know whether nxhtml has the capability to convert
> sequences like this:
>
> שַׁלוֹם.
> (shalom in Hebrew)
The following should work:
(defun expand-html-encoded-chars (start end)
  (interactive "r")
  (save-excursion
    (goto-char start)
    (while (re-search-forward "&#\\([0-9]+\\);" end t)
      (replace-match
       (char-to-string
        (decode-char 'ucs (string-to-number (match-string 1))))
       t t))))
-Miles
--
Guilt, n. The condition of one who is known to have committed an indiscretion,
as distinguished from the state of him who has covered his tracks.
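
A possible variation on the function above, offered only as a sketch. It
assumes Emacs 23 or later, where characters are Unicode code points, so
the number can be passed to char-to-string directly. The name
expand-ncr-region and the support for hexadecimal references such as
&#x5DC; are additions for illustration, not something from the thread.
Narrowing to the region keeps the search inside the original text even
though the buffer shrinks with every replacement.

```elisp
;; Sketch only: `expand-ncr-region' is a hypothetical name, not from the thread.
(defun expand-ncr-region (start end)
  "Replace decimal and hexadecimal NCRs between START and END with characters."
  (interactive "r")
  (save-excursion
    (save-restriction
      ;; Narrowing makes the end of the region follow the edits, so no
      ;; stale END position is used as a search bound.
      (narrow-to-region start end)
      (goto-char (point-min))
      (while (re-search-forward
              "&#\\(?:[xX]\\([0-9a-fA-F]+\\)\\|\\([0-9]+\\)\\);" nil t)
        (let ((code (if (match-beginning 1)
                        (string-to-number (match-string 1) 16) ; hex form
                      (string-to-number (match-string 2)))))   ; decimal form
          ;; Assumes CODE is a valid character; `char-to-string' signals
          ;; an error otherwise.
          (replace-match (char-to-string code) t t))))))
```

Called interactively it works on the active region, like the original.
References outside the region are left alone.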
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: ncr (numeric character reference) to unicode
2009-04-14 3:07 ` Miles Bader
@ 2009-04-14 16:42 ` B. T. Raven
2009-04-15 21:43 ` Stephen Berman
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: B. T. Raven @ 2009-04-14 16:42 UTC (permalink / raw)
To: help-gnu-emacs
Miles Bader wrote:
> "B. T. Raven" <nihil@nihilo.net> writes:
>> Does any of you know whether nxhtml has the capability to convert
>> sequences like this:
>>
>> שַׁלוֹם.
>> (shalom in Hebrew)
>
> The following should work:
>
> (defun expand-html-encoded-chars (start end)
> (interactive "r")
> (save-excursion
> (goto-char start)
> (while (re-search-forward "&#\\([0-9]+\\);" end t)
> (replace-match
> (char-to-string
> (decode-char 'ucs (string-to-number (match-string 1))) )
> t t))))
>
> -Miles
>
Thanks, Eli and Miles. The conversion works fine (with uncomposed
glyphs, that is, points as separate characters, same as in the html
codes). I referenced the command in an alias:
(defalias 'xhc 'expand-html-encoded-chars)
and then tried to do the same with this function:
(defun reverse-string (beg end)
  (interactive "r")
  (setq str (buffer-substring beg end))
  (apply #'string (nreverse (string-to-list str))))
but it doesn't seem to work, although it doesn't produce errors in a
traceback buffer. What am I missing?
Thanks,
Ed
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: ncr (numeric character reference) to unicode
2009-04-14 16:42 ` B. T. Raven
@ 2009-04-15 21:43 ` Stephen Berman
[not found] ` <mailman.5401.1239831823.31690.help-gnu-emacs@gnu.org>
` (2 subsequent siblings)
3 siblings, 0 replies; 12+ messages in thread
From: Stephen Berman @ 2009-04-15 21:43 UTC (permalink / raw)
To: help-gnu-emacs
On Tue, 14 Apr 2009 11:42:09 -0500 "B. T. Raven" <nihil@nihilo.net> wrote:
> Miles Bader wrote:
>> "B. T. Raven" <nihil@nihilo.net> writes:
>>> Does any of you know whether nxhtml has the capability to convert
>>> sequences like this:
>>>
>>> שַׁלוֹם.
>>> (shalom in Hebrew)
>>
>> The following should work:
>>
>> (defun expand-html-encoded-chars (start end)
>> (interactive "r")
>> (save-excursion
>> (goto-char start)
>> (while (re-search-forward "&#\\([0-9]+\\);" end t)
>> (replace-match (char-to-string
>> (decode-char 'ucs (string-to-number (match-string 1))) )
>> t t))))
>>
>> -Miles
>>
>
> Thanks, Eli and Miles. The conversion works fine (with uncomposed glyphs, that
> is, points as separate characters, same as in the html codes). I referenced
> the command in an alias:
>
> (defalias 'xhc 'expand-html-encoded-chars)
>
> and then tried to do the same with this function:
>
> (defun reverse-string (beg end)
> (interactive "r")
> (setq str (buffer-substring beg end))
> (apply #'string (nreverse (string-to-list str))))
>
> but it doesn't seem to work, although it doesn't produce errors in a traceback
> buffer. What am I missing?
>
> Thanks,
>
> Ed
Does this do what you want?
(defun reverse-string (beg end)
  (interactive "r")
  (xhc beg end)
  (let* ((beg (region-beginning))
         (end (region-end))
         (str1 (buffer-substring beg end))
         (str2 (apply #'string (nreverse (string-to-list str1)))))
    (replace-string str1 str2 nil beg end)))
Steve Berman
^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <mailman.5401.1239831823.31690.help-gnu-emacs@gnu.org>]
* Re: ncr (numeric character reference) to unicode
[not found] ` <mailman.5401.1239831823.31690.help-gnu-emacs@gnu.org>
@ 2009-04-16 1:42 ` B. T. Raven
2009-04-16 4:35 ` Kevin Rodgers
2009-04-16 13:23 ` Stephen Berman
0 siblings, 2 replies; 12+ messages in thread
From: B. T. Raven @ 2009-04-16 1:42 UTC (permalink / raw)
To: help-gnu-emacs
Stephen Berman wrote:
> On Tue, 14 Apr 2009 11:42:09 -0500 "B. T. Raven" <nihil@nihilo.net> wrote:
>
>> Miles Bader wrote:
>>> "B. T. Raven" <nihil@nihilo.net> writes:
>>>> Does any of you know whether nxhtml has the capability to convert
>>>> sequences like this:
>>>>
>>>> שַׁלוֹם.
>>>> (shalom in Hebrew)
>>> The following should work:
>>>
>>> (defun expand-html-encoded-chars (start end)
>>> (interactive "r")
>>> (save-excursion
>>> (goto-char start)
>>> (while (re-search-forward "&#\\([0-9]+\\);" end t)
>>> (replace-match (char-to-string
>>> (decode-char 'ucs (string-to-number (match-string 1))) )
>>> t t))))
>>>
>>> -Miles
>>>
>> Thanks, Eli and Miles. The conversion works fine (with uncomposed glyphs, that
>> is, points as separate characters, same as in the html codes). I referenced
>> the command in an alias:
>>
>> (defalias 'xhc 'expand-html-encoded-chars)
>>
>> and then tried to do the same with this function:
>>
>> (defun reverse-string (beg end)
>> (interactive "r")
>> (setq str (buffer-substring beg end))
>> (apply #'string (nreverse (string-to-list str))))
>>
>> but it doesn't seem to work, although it doesn't produce errors in a traceback
>> buffer. What am I missing?
>>
>> Thanks,
>>
>> Ed
>
> Does this do what you want?
>
> (defun reverse-string (beg end)
> (interactive "r")
> (xhc beg end)
> (let* ((beg (region-beginning))
> (end (region-end))
> (str1 (buffer-substring beg end))
> (str2 (apply #'string (nreverse (string-to-list str1)))))
> (replace-string str1 str2 nil beg end)))
>
> Steve Berman
>
>
>
That would probably do a little more than I want. Miles' expand html
function is only needed if someone sends these ncr sequences in email.
Btw, why are beg and end calculated in the function if they are passed
to it? This almost does what I want:
(defun reverse-bufsubstring (beg end)
  (interactive "r")
  (let* ((str1 (buffer-substring beg end))
         (str2 (apply #'string (nreverse (string-to-list str1)))))
    (replace-string str1 str2 nil beg end)))
except that it converts
same
one
as
before
into this:
erofeb
sa
eno
emas
so now that has to be reversed line by line rather than character by
character. Anyway, all of this is just a kludge until the gurus come up
with real bidi functionality.
Thanks again,
Ed
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: ncr (numeric character reference) to unicode
2009-04-16 1:42 ` B. T. Raven
@ 2009-04-16 4:35 ` Kevin Rodgers
2009-04-16 13:23 ` Stephen Berman
1 sibling, 0 replies; 12+ messages in thread
From: Kevin Rodgers @ 2009-04-16 4:35 UTC (permalink / raw)
To: help-gnu-emacs
B. T. Raven wrote:
> That would probably do a little more than I want. Miles' expand html
> function is only needed if someone sends these ncr sequences in email.
> Btw, why are beg and end calculated in the function if they are passed
> to it? This almost does what I want:
>
> (defun reverse-bufsubstring (beg end)
> (interactive "r")
> (let* (
> (str1 (buffer-substring beg end))
> (str2 (apply #'string (nreverse (string-to-list str1)))))
> (replace-string str1 str2 nil beg end)))
>
>
> except that it converts
>
> same
> one
> as
> before
>
> into this:
>
> erofeb
> sa
> eno
> emas
>
> so now that has to be reversed line by line rather than character by
> character. Anyway, all of this is just a kludge until the gurus come up
> with real bidi functionality.
(defun reverse-region-by-line (beg end)
  (interactive "r")
  (save-excursion
    (goto-char beg)
    (while (and (< (point) end) (re-search-forward "\\=.*$" end t))
      (replace-match (apply #'string
                            (nreverse (string-to-list (match-string 0)))))
      (forward-line))))
--
Kevin Rodgers
Denver, Colorado, USA
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: ncr (numeric character reference) to unicode
2009-04-16 1:42 ` B. T. Raven
2009-04-16 4:35 ` Kevin Rodgers
@ 2009-04-16 13:23 ` Stephen Berman
1 sibling, 0 replies; 12+ messages in thread
From: Stephen Berman @ 2009-04-16 13:23 UTC (permalink / raw)
To: help-gnu-emacs
On Wed, 15 Apr 2009 20:42:49 -0500 "B. T. Raven" <nihil@nihilo.net> wrote:
> Stephen Berman wrote:
[...]
>> Does this do what you want?
>>
>> (defun reverse-string (beg end)
>> (interactive "r")
>> (xhc beg end)
>> (let* ((beg (region-beginning))
>> (end (region-end))
>> (str1 (buffer-substring beg end))
>> (str2 (apply #'string (nreverse (string-to-list str1)))))
>> (replace-string str1 str2 nil beg end)))
>>
>> Steve Berman
>>
>>
>>
>
> That would probably do a little more than I want. Miles' expand html function
> is only needed if someone sends these ncr sequences in email. Btw, why are beg
> and end calculated in the function if they are passed to it?
When xhc is called on beg and end (since I mistakenly thought you wanted
to convert the HTML entities and reverse the result in one blow) the
region is changed, so it has to be recalculated for the arguments of
buffer-substring (actually, only region-end changes, so beg really
shouldn't be recalculated). Of course, new variables could have been
used in the let* clause.
Steve Berman
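
The point about the region end moving can be seen in isolation. The
sketch below is not from the thread; the function name
ncr-demo-end-after-expansion is made up for illustration, and it assumes
Emacs 23 or later. It records the end of the text as a marker rather
than as a number, expands the decimal references, and returns where the
marker ends up.

```elisp
;; Sketch only: `ncr-demo-end-after-expansion' is a hypothetical name.
(defun ncr-demo-end-after-expansion (text)
  "Insert TEXT in a scratch buffer, expand decimal NCRs, return the new end."
  (with-temp-buffer
    (insert text)
    (let ((end (copy-marker (point-max))))   ; a marker, not a fixed integer
      (goto-char (point-min))
      (while (re-search-forward "&#\\([0-9]+\\);" end t)
        (replace-match
         (char-to-string (string-to-number (match-string 1))) t t))
      ;; Return the marker's final position, then detach the marker.
      (prog1 (marker-position end)
        (set-marker end nil)))))
```

For the 14-character input "&#1513;&#1500;" the text starts out ending
at position 15; after both references are replaced the marker should sit
at position 3. An integer END captured before the call would still say
15, which is why the region had to be recalculated.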
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: ncr (numeric character reference) to unicode
2009-04-14 16:42 ` B. T. Raven
2009-04-15 21:43 ` Stephen Berman
[not found] ` <mailman.5401.1239831823.31690.help-gnu-emacs@gnu.org>
@ 2009-04-16 4:20 ` Kevin Rodgers
[not found] ` <mailman.5427.1239855645.31690.help-gnu-emacs@gnu.org>
3 siblings, 0 replies; 12+ messages in thread
From: Kevin Rodgers @ 2009-04-16 4:20 UTC (permalink / raw)
To: help-gnu-emacs
B. T. Raven wrote:
> (defalias 'xhc 'expand-html-encoded-chars)
>
> and then tried to do the same with this function:
>
> (defun reverse-string (beg end)
> (interactive "r")
> (setq str (buffer-substring beg end))
> (apply #'string (nreverse (string-to-list str))))
>
> but it doesn't seem to work, although it doesn't produce errors in a
> traceback buffer. What am I missing?
Miles' expand-html-encoded-chars function modifies the buffer, with
replace-match. Your reverse-string function generates a value, but
does not modify the buffer.
--
Kevin Rodgers
Denver, Colorado, USA
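
To make the distinction concrete, here is a minimal sketch of a command
that writes the reversed text back into the buffer instead of only
returning it. The name reverse-region-chars is hypothetical and does not
appear in the thread. It also binds the string with let rather than
setq, so no global variable str is created as a side effect.

```elisp
;; Hypothetical helper, named here for illustration only.
(defun reverse-region-chars (beg end)
  "Replace the text between BEG and END with the same characters reversed."
  (interactive "r")
  (let ((reversed (apply #'string
                         (nreverse
                          (string-to-list (buffer-substring beg end))))))
    (save-excursion
      (goto-char beg)
      (delete-region beg end)   ; remove the original text
      (insert reversed))))      ; put the reversed copy in its place
```

Like the versions discussed in the thread, this reverses across line
breaks, so a multi-line region also comes out with its lines in reverse
order.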
^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <mailman.5427.1239855645.31690.help-gnu-emacs@gnu.org>]
* Re: ncr (numeric character reference) to unicode
[not found] ` <mailman.5427.1239855645.31690.help-gnu-emacs@gnu.org>
@ 2009-04-17 3:39 ` B. T. Raven
2009-04-17 15:19 ` Stephen Berman
[not found] ` <mailman.5538.1239981609.31690.help-gnu-emacs@gnu.org>
0 siblings, 2 replies; 12+ messages in thread
From: B. T. Raven @ 2009-04-17 3:39 UTC (permalink / raw)
To: help-gnu-emacs
Kevin Rodgers wrote:
> B. T. Raven wrote:
>> (defalias 'xhc 'expand-html-encoded-chars)
>>
>> and then tried to do the same with this function:
>>
>> (defun reverse-string (beg end)
>> (interactive "r")
>> (setq str (buffer-substring beg end))
>> (apply #'string (nreverse (string-to-list str))))
>>
>> but it doesn't seem to work, although it doesn't produce errors in a
>> traceback buffer. What am I missing?
>
> Miles' expand-html-encoded-chars function modifies the buffer, with
> replace-match. Your reverse-string function generates a value, but
> does not modify the buffer.
>
Yes, of course. That finally dawned on me. Thanks Kevin and Steve. This
is finally what I want:
(defun reverse-string (str)
  (apply #'string (nreverse (string-to-list str))))

(defun reverse-region-by-line (beg end)
  (interactive "r")
  (save-excursion
    (goto-char beg)
    (while (and (< (point) end) (re-search-forward "\\=.*$" end t))
      (replace-match (reverse-string (match-string 0)))
      (forward-line))))
But now I find that if I copy-paste from Emacs 23.0.90.1, the Greek letters
αβγδ
appear in Mozilla Tbird (here) in the original order but
בִּּרֵאשׁיתבָּרָּאא לֹהִים אלתשָּׁמַיִם וְ אלת הָ ּאָרֶ ׃
is automatically reversed without running the above command on its
region. ??? Is there invisible bidi info in the string or is it just the
fact that the characters are Hebrew that causes this?
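
One side effect of reverse-string in the message above is that it
reverses code point by code point, so in pointed Hebrew each vowel point
ends up in front of its consonant instead of after it. The following is
only a sketch of one way around that, not something proposed in the
thread. The name reverse-string-keeping-marks is made up, and the code
assumes an Emacs that provides get-char-code-property with the
general-category property (Emacs 23 or later, as far as I know).

```elisp
;; Sketch only: `reverse-string-keeping-marks' is a hypothetical name.
(defun reverse-string-keeping-marks (str)
  "Return STR reversed, keeping each combining mark after its base character."
  (let ((clusters '())
        (current '()))
    (dolist (ch (string-to-list str))
      (if (and current
               (memq (get-char-code-property ch 'general-category)
                     '(Mn Mc Me)))
          (push ch current)                     ; mark: join current cluster
        (when current
          (push (nreverse current) clusters))   ; close the previous cluster
        (setq current (list ch))))              ; start a new cluster
    (when current
      (push (nreverse current) clusters))
    ;; `push' has already left CLUSTERS in reverse order.
    (apply #'string (apply #'append clusters))))
```

It could be dropped into reverse-region-by-line in place of
reverse-string. It treats only the Unicode mark categories as combining,
which is a simplification of full grapheme cluster rules.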
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: ncr (numeric character reference) to unicode
2009-04-17 3:39 ` B. T. Raven
@ 2009-04-17 15:19 ` Stephen Berman
[not found] ` <mailman.5538.1239981609.31690.help-gnu-emacs@gnu.org>
1 sibling, 0 replies; 12+ messages in thread
From: Stephen Berman @ 2009-04-17 15:19 UTC (permalink / raw)
To: help-gnu-emacs
On Thu, 16 Apr 2009 22:39:43 -0500 "B. T. Raven" <nihil@nihilo.net> wrote:
> But now I find that if I copy-paste from Emacs 23.0.90.1, the Greek letters
>
>
> αβγδ
>
> appear in Mozilla Tbird (here) in the original order but
>
>
> בִּּרֵאשׁיתבָּרָּאא לֹהִים אלתשָּׁמַיִם וְ אלת הָ ּאָרֶ ׃
>
>
> is automatically reversed without running the above command on its region. ???
> Is there invisible bidi info in the string or is it just the fact that the
> characters are Hebrew that causes this?
Presumably the latter. I guess Thunderbird works like OpenOffice.org,
which also automatically reverses the Hebrew text, and whose Help entry
for "bi-directional writing" says:
,----
| Currently, OpenOffice.org supports Hindi, Thai, Hebrew, and Arabic as
| CTL [Complex Text Layout] languages. If you select the text flow from
| right to left, embedded Western text still runs from left to
| right. The cursor responds to the arrow keys in that Right Arrow moves
| it "to the text end" and Left Arrow "to the text start".
`----
Steve Berman
^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <mailman.5538.1239981609.31690.help-gnu-emacs@gnu.org>]
* Re: ncr (numeric character reference) to unicode
[not found] ` <mailman.5538.1239981609.31690.help-gnu-emacs@gnu.org>
@ 2009-04-17 23:20 ` B. T. Raven
0 siblings, 0 replies; 12+ messages in thread
From: B. T. Raven @ 2009-04-17 23:20 UTC (permalink / raw)
To: help-gnu-emacs
Stephen Berman wrote:
> On Thu, 16 Apr 2009 22:39:43 -0500 "B. T. Raven" <nihil@nihilo.net> wrote:
>
>> But now I find that if I copy-paste from Emacs 23.0.90.1, the Greek letters
>>
>>
>> αβγδ
>>
>> appear in Mozilla Tbird (here) in the original order but
>>
>>
>>
Saw this right to left order in Emacs 23:
בִּּרֵאשׁיתבָּרָּאא לֹהִים אלתשָּׁמַיִם וְ אלת הָ ּאָרֶ ׃
Copy-pasted here in Tbird:
׃ ֶרָאּ ָה תלא ְו םִיַמָּׁשתלא םיִהֹל אאָּרָּבתיׁשאֵרִּּב
First line copied and pasted with C-c C-v (CUA) here in Tbird:
בִּּרֵאשׁיתבָּרָּאא לֹהִים אלתשָּׁמַיִם וְ אלת הָ ּאָרֶ ׃
It doesn't matter in which direction the Hebrew text is selected. In
fact, S-arrow won't move over the Hebrew character by character;
instead it selects the whole line, whether the characters are in forward
or reverse order and whether the cursor starts at the left or the right.
Ed
>>
>>
>> is automatically reversed without running the above command on its region. ???
>> Is there invisible bidi info in the string or is it just the fact that the
>> characters are Hebrew that causes this?
>
> Presumably the latter. I guess Thunderbird works like OpenOffice.org,
> which also automatically reverses the Hebrew text, and whose Help entry
> for "bi-directional writing" says:
> ,----
> | Currently, OpenOffice.org supports Hindi, Thai, Hebrew, and Arabic as
> | CTL [Complex Text Layout] languages. If you select the text flow from
> | right to left, embedded Western text still runs from left to
> | right. The cursor responds to the arrow keys in that Right Arrow moves
> | it "to the text end" and Left Arrow "to the text start".
> `----
>
> Steve Berman
>
>
>
^ permalink raw reply [flat|nested] 12+ messages in thread