* Internal coding system - Need advice 22 v. 23
@ 2009-11-11 5:00 Sebastian Rose
2009-11-11 6:55 ` Kenichi Handa
0 siblings, 1 reply; 3+ messages in thread
From: Sebastian Rose @ 2009-11-11 5:00 UTC (permalink / raw)
To: emacs-devel Mailinglist
Hi emacs experts,
please, could someone help me to understand the internal coding system
of emacs-22? Or simply point me to a function that correctly decodes
multibyte characters in emacs-22?
Here is why:
To send Unicode text to Emacs via org-protocol, I wrote a function that
decodes all those hex-encoded characters (e.g. `%C3%B6' is the German
umlaut `ö'). That function, `org-protocol-unhex-string', is defined in
org/lisp/org-protocol.el. It works perfectly in emacs-23.
In emacs-22, however, it does _not_ work.
The problem occurs when `char-to-string' is called with a Unicode
character, e.g.
(char-to-string 8211)
returns a dash in emacs-23, as expected. `org-protocol-unhex-string' works
like a charm for texts like those on http://www.welcome2japan.cn/ - in
emacs-23.
If I evaluate the same expression in emacs-22, I get an error (see
backtrace below) and (prin1-char 8211) returns nil.
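For reference, 8211 is U+2013 (EN DASH), which is what the `%E2%80%93' in
the backtrace below encodes. A minimal sketch of the arithmetic, assuming a
well-formed three-byte UTF-8 sequence; the helper name is hypothetical and
is not part of org-protocol.el:

```elisp
;; Hypothetical helper, for illustration only: combine the three bytes of
;; a well-formed 3-byte UTF-8 sequence into a Unicode code point.
(defun my-utf8-3byte-to-codepoint (b1 b2 b3)
  "Return the code point encoded by the UTF-8 bytes B1, B2 and B3."
  (logior (ash (logand b1 #x0F) 12)   ; low 4 bits of the lead byte
          (ash (logand b2 #x3F) 6)    ; low 6 bits of the first continuation byte
          (logand b3 #x3F)))          ; low 6 bits of the second continuation byte

(my-utf8-3byte-to-codepoint #xE2 #x80 #x93) ; => 8211, i.e. #x2013
```

The result is a Unicode code point, which is not necessarily a valid
Emacs character code in emacs-22.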
There must be a way to correctly decode multibyte characters in
emacs-22.
Thanks in advance,
Sebastian
Debugger entered--Lisp error: (error "Invalid character: 8211, #o20023, #x2013")
char-to-string(8211)
(concat ret (char-to-string sum))
(setq ret (concat ret (char-to-string sum)))
(progn (setq ret (concat ret ...)) (setq sum 0))
(if (= 0 eat) (progn (setq ret ...) (setq sum 0)))
(when (= 0 eat) (setq ret (concat ret ...)) (setq sum 0))
(let* ((b ...) (a ...) (b ...) (c1 ...) (c2 ...) (val ...)
(shift ...) (xor ...)) (if (>= val 192) (setq eat shift)) (setq val
(logxor val xor)) (setq sum (+ ... val)) (if (> eat 0) (setq eat ...))
(when (= 0 eat) (setq ret ...) (setq sum 0))) (while bytes (let*
(... ... ... ... ... ... ... ...) (if ... ...) (setq val ...) (setq
sum ...) (if ... ...) (when ... ... ...))) (let* ((bytes ...) (ret "")
(eat 0) (sum 0)) (while bytes (let* ... ... ... ... ... ...)) ret)
org-protocol-unhex-compound("%20%E2%80%93%20") (let* ((start ...)
(end ...) (hex ...) (replacement ...)) (setq tmp (concat tmp ...
replacement)) (setq str (substring str end))) (while (string-match "\\(%
[0-9a-f][0-9a-f]\\)+" str) (let* (... ... ... ...) (setq tmp ...) (setq
str ...))) (let ((tmp "") (case-fold-search t)) (while (string-match "\
\(%[0-9a-f][0-9a-f]\\)+" str) (let* ... ... ...)) (setq tmp (concat tmp
str)) tmp) org-protocol-unhex-string("org-protocol.el%20%E2%80%93%
20Intercept%20calls%20from%20emacsclient%20to%20trigger%20custom%
20actions") mapcar(org-protocol-unhex-string ("http%3A%2F%2Forgmode.org%
2Fworg%2Forg-contrib%2Forg-protocol.php" "org-protocol.el%20%E2%80%93%
20Intercept%20calls%20from%20emacsclient%20to%20trigger%20custom%
20actions" "")) (if (fboundp unhexify) (mapcar unhexify split-parts)
(mapcar (quote org-protocol-unhex-string) split-parts)) (if unhexify
(if (fboundp unhexify) (mapcar unhexify split-parts) (mapcar ...
split-parts)) split-parts) (let* ((sep ...) (split-parts ...)) (if
unhexify (if ... ... ...) split-parts)) org-protocol-split-data("http%3A
%2F%2Forgmode.org%2Fworg%2Forg-contrib%
2Forg-protocol.php/org-protocol.el%20%E2%80%93%20Intercept%20calls%
20from%20emacsclient%20to%20trigger%20custom%20actions/" t) (let*
((parts ...) (template ...) (url ...) (type ...) (title ...)
(region ...) (orglink ...) remember-annotation-functions) (setq
org-stored-links (cons ... org-stored-links)) (kill-new orglink)
(org-store-link-props :type type :link url :description title :initial
region) (raise-frame) (org-remember nil (string-to-char template))) (if
(and (boundp ...) (fboundp ...)) (let* (... ... ... ... ... ... ...
remember-annotation-functions) (setq org-stored-links ...) (kill-new
orglink) (org-store-link-props :type type :link url :description
title :initial region) (raise-frame) (org-remember nil ...)) (message
"Org-mode not loaded.")) org-protocol-remember("http%3A%2F%2Forgmode.org
%2Fworg%2Forg-contrib%2Forg-protocol.php/org-protocol.el%20%E2%80%93%
20Intercept%20calls%20from%20emacsclient%20to%20trigger%20custom%
20actions/") funcall(org-protocol-remember "http%3A%2F%2Forgmode.org%
2Fworg%2Forg-contrib%2Forg-protocol.php/org-protocol.el%20%E2%80%93%
20Intercept%20calls%20from%20emacsclient%20to%20trigger%20custom%
20actions/") (throw (quote fname) (funcall func result)) (if greedy nil
(throw (quote fname) (funcall func result))) (unless greedy (throw
(quote fname) (funcall func result))) (progn (unless greedy
(throw ... ...)) (funcall func result) (throw (quote fname) t)) (if
(fboundp func) (progn (unless greedy ...) (funcall func result)
(throw ... t))) (when (fboundp func) (unless greedy (throw ... ...))
(funcall func result) (throw (quote fname) t)) (let* ((func ...)
(greedy ...) (splitted ...) (result ...)) (when
(plist-get ... :kill-client) (message "Greedy org-protocol handler.
Killing client.") (server-edit)) (when (fboundp func) (unless
greedy ...) (funcall func result) (throw ... t))) (progn (let*
(... ... ... ...) (when ... ... ...) (when ... ... ... ...))) (if
(string-match proto fname) (progn (let* ... ... ...))) (when
(string-match proto fname) (let* (... ... ... ...) (when ... ... ...)
(when ... ... ... ...))) (let ((proto ...)) (when (string-match proto
fname) (let* ... ... ...))) (while --cl-dolist-temp-- (setq prolist
(car --cl-dolist-temp--)) (let (...) (when ... ...)) (setq
--cl-dolist-temp-- (cdr --cl-dolist-temp--))) (let ((--cl-dolist-temp--
sub-protocols) prolist) (while --cl-dolist-temp-- (setq prolist ...)
(let ... ...) (setq --cl-dolist-temp-- ...)) nil) (catch (quote
--cl-block-nil--) (let (... prolist) (while
--cl-dolist-temp-- ... ... ...) nil)) (cl-block-wrapper (catch (quote
--cl-block-nil--) (let ... ... nil))) (block nil (let (... prolist)
(while --cl-dolist-temp-- ... ... ...) nil)) (dolist (prolist
sub-protocols) (let (...) (when ... ...))) (progn (dolist (prolist
sub-protocols) (let ... ...))) (if (string-match the-protocol fname)
(progn (dolist ... ...))) (when (string-match the-protocol fname)
(dolist (prolist sub-protocols) (let ... ...))) (let
((the-protocol ...)) (when (string-match the-protocol fname)
(dolist ... ...))) (catch (quote fname) (let (...) (when ... ...))
fname) (let ((sub-protocols ...)) (catch (quote fname) (let ... ...)
fname)) org-protocol-check-filename-for-protocol
("/home/andy/org-protocol:/remember:/http%3A%2F%2Forgmode.org%2Fworg%
2Forg-contrib%2Forg-protocol.php/org-protocol.el%20%E2%80%93%20Intercept
%20calls%20from%20emacsclient%20to%20trigger%20custom%
20actions/" (("/home/andy/org-protocol:/remember:/http%3A%2F%
2Forgmode.org%2Fworg%2Forg-contrib%2Forg-protocol.php/org-protocol.el%
20%E2%80%93%20Intercept%20calls%20from%20emacsclient%20to%20trigger%
20custom%20actions/" 1 0)) (#<process server <*8*>>)) byte-code
(<bytecode removed> [flist var --cl-dolist-temp-- fname client files
nil expand-file-name org-protocol-check-filename-for-protocol t throw
greedy delq] 5) server-visit-files
((("/home/andy/org-protocol:/remember:/http%3A%2F%2Forgmode.org%2Fworg%
2Forg-contrib%2Forg-protocol.php/org-protocol.el%20%E2%80%93%20Intercept
%20calls%20from%20emacsclient%20to%20trigger%20custom%20actions/" 1 0))
(#<process server <*8*>>) nil) byte-code(<bytecode removed> [proc
string prev --cl-proc-- default-enable-multibyte-characters
file-name-coding-system process-get :authenticated string-match "-auth \
\(.*?\\)\n" match-string 1 :auth-key 0 nil process-put t server-log
"Authentication successful" "Authentication failed" process-send-string
delete-process throw
--cl-block-server-process-filter-- :previous-string recursion-depth
run-with-timer make-symbol "--proc--" lambda (&rest --cl-rest--) apply #
[(G47000) <bytecode removed> [G47000 server-process-filter ""] 3] quote
--cl-rest-- top-level (byte-code <bytecode removed> [mapc #[...
<bytecode removed> [buffer isearch-mode boundp isearch-cancel] 2]
buffer-list] 3) ((quit ...)) "\n" "[^ ]* " "-nowait" "-eval" "-display"
"\\([^ ]*\\) " server-unquote-arg err (byte-code <bytecode removed>
[display tmp-frame server-select-display] 2) ((error ...)) "\\`\\+[0-9]+
\\'" string-to-number ...] 10) server-process-filter(#<process server
<*8*>> "/home/andy/org-protocol://remember://http%3A%2F%2Forgmode.org%
2Fworg%2Forg-contrib%2Forg-protocol.php/org-protocol.el%20%E2%80%93%
20Intercept%20calls%20from%20emacsclient%20to%20trigger%20custom%
20actions/ \n")
* Re: Internal coding system - Need advice 22 v. 23
From: Kenichi Handa @ 2009-11-11 6:55 UTC (permalink / raw)
To: Sebastian Rose; +Cc: emacs-devel
In article <87pr7pzme8.fsf@gmx.de>, Sebastian Rose <sebastian_rose@gmx.de> writes:
> To send Unicode text to Emacs via org-protocol, I wrote a function that
> decodes all those hex-encoded characters (e.g. `%C3%B6' is the German
> umlaut `ö'). That function, `org-protocol-unhex-string', is defined in
> org/lisp/org-protocol.el. It works perfectly in emacs-23.
> In emacs-22, however, it does _not_ work.
> The problem occurs when `char-to-string' is called with a Unicode
> character, e.g.
> (char-to-string 8211)
> returns a dash in emacs-23, as expected. `org-protocol-unhex-string' works
> like a charm for texts like those on http://www.welcome2japan.cn/ - in
> emacs-23.
Emacs-23's character code is a superset of Unicode, but
Emacs-22's character code is not compatible with Unicode.
To get Emacs-22's character code from a Unicode character
code, you must use the `decode-char' function. For instance, the
above should be done by (string (decode-char 'ucs 8211)).
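The conversion can be wrapped so that callers pass Unicode code points
only. This is a minimal sketch with a hypothetical function name; it
relies on `decode-char' returning the code point unchanged for `ucs' in
Emacs-23 and the corresponding internal character code in Emacs-22:

```elisp
;; Hypothetical wrapper, for illustration only.
(defun my-unicode-to-string (code)
  "Return a one-character string for the Unicode code point CODE."
  ;; `decode-char' maps a code point of the given charset to an Emacs
  ;; character; `string' then builds a string from that character.
  (string (decode-char 'ucs code)))

(my-unicode-to-string 8211) ; the en dash, U+2013
```

This sketch does not handle code points that `decode-char' cannot map.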
And please note that Emacs-22 supports just a subset of
Unicode (U+0000..U+33FF, U+E000..U+FFFF, and some of CJK
characters contained in such legacy charsets as JISX0208,
GB2312, KSC5601).
---
Kenichi Handa
handa@m17n.org
* Re: Internal coding system - Need advice 22 v. 23
From: Sebastian Rose @ 2009-11-11 8:10 UTC (permalink / raw)
To: Kenichi Handa; +Cc: emacs-devel
Kenichi Handa <handa@m17n.org> writes:
> In article <87pr7pzme8.fsf@gmx.de>, Sebastian Rose <sebastian_rose@gmx.de> writes:
>
>> To send Unicode text to Emacs via org-protocol, I wrote a function that
>> decodes all those hex-encoded characters (e.g. `%C3%B6' is the German
>> umlaut `ö'). That function, `org-protocol-unhex-string', is defined in
>> org/lisp/org-protocol.el. It works perfectly in emacs-23.
>
>> In emacs-22, however, it does _not_ work.
>
>> The problem occurs when `char-to-string' is called with a Unicode
>> character, e.g.
>
>> (char-to-string 8211)
>
>> returns a dash in emacs-23, as expected. `org-protocol-unhex-string' works
>> like a charm for texts like those on http://www.welcome2japan.cn/ - in
>> emacs-23.
>
> Emacs-23's character code is a superset of Unicode, but
> Emacs-22's character code is not compatible with Unicode.
> To get Emacs-22's character code from a Unicode character
> code, you must use the `decode-char' function. For instance, the
> above should be done by (string (decode-char 'ucs 8211)).
Ahh - OK. Seems to work perfectly.
> And please note that Emacs-22 supports just a subset of
> Unicode (U+0000..U+33FF, U+E000..U+FFFF, and some of CJK
> characters contained in such legacy charsets as JISX0208,
> GB2312, KSC5601).
Well, there's nothing I can do about that except check for errors and
fail.
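A rough sketch of what "check for errors and fail" could look like,
assuming `decode-char' returns nil for a code point it cannot map. The
function name and the error message are hypothetical, not taken from
org-protocol.el:

```elisp
;; Hypothetical helper, for illustration only.
(defun my-unicode-to-string-or-fail (code)
  "Return a string for the Unicode code point CODE, or signal an error."
  (let ((ch (decode-char 'ucs code)))
    (if ch
        (string ch)
      ;; Assumption: nil means this Emacs has no character for CODE.
      (error "Cannot represent U+%04X in this Emacs" code))))

;; A caller can trap the error and report it instead of entering the debugger.
(condition-case err
    (my-unicode-to-string-or-fail 8211)
  (error (message "Decoding failed: %s" (error-message-string err))
         nil))
```

Signaling an error rather than returning a placeholder keeps the failure
visible to the caller, which can then decide whether to abort.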
Thanks a lot for this explanation!
Best wishes
Sebastian