* 23.0.50; utf7-decode failed with non latin-1 charactor
From: Topia @ 2007-11-01 17:29 UTC
To: emacs-pretest-bug
I want to use IMAP's internationalized (Japanese) folder names
from Wanderlust, but I get the error "Unable to convert from Unicode"
from utf7-u16-latin1-char-converter.
You can reproduce it with this code:
;; "Sent Mail" in Japanese, imap's UTF-7
(utf7-decode "&kAFP4W4IMH8w4TD8MOs-" 'imap)
I found that the problem is that utf7-utf-16-coding-system is nil, because:
* Emacs 23 has utf-16-be, but this coding system emits a BOM.
* utf-16-be-nosig does not exist.
I found a variant of utf-16-be without a BOM, utf-16be. If I evaluate
the above code after (setq utf7-utf-16-coding-system 'utf-16be), no
error occurs.
Could you modify utf7-utf-16-coding-system to add utf-16be?
 (defconst utf7-utf-16-coding-system
   (cond ((mm-coding-system-p 'utf-16-be-no-signature) ; Mule-UCS
          'utf-16-be-no-signature)
         ((and (mm-coding-system-p 'utf-16-be) ; Emacs 21.3, Emacs 22
               ;; Avoid versions with BOM.
               (= 2 (length (encode-coding-string "a" 'utf-16-be))))
          'utf-16-be)
+        ((and (mm-coding-system-p 'utf-16be) ; Emacs 23?
+              ;; Avoid versions with BOM.
+              (= 2 (length (encode-coding-string "a" 'utf-16be))))
+         'utf-16be)
         ((mm-coding-system-p 'utf-16-be-nosig) ; ?
          'utf-16-be-nosig))
   "Coding system which encodes big endian UTF-16 without a BOM signature.")
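The length test in the patch above relies on "a" encoding to exactly two
bytes in BOM-less UTF-16, and to four bytes when a BOM is prepended. A
minimal sketch of that check follows. `my-coding-system-emits-bom-p' is a
hypothetical helper, not part of utf7.el, and the sketch assumes an Emacs
in which both utf-16be and utf-16be-with-signature are defined.

```elisp
;; Hypothetical helper for illustration; not part of utf7.el.
(defun my-coding-system-emits-bom-p (coding-system)
  "Return non-nil if encoding with CODING-SYSTEM prepends a UTF-16 BOM.
CODING-SYSTEM is assumed to be some flavor of UTF-16."
  (let ((bytes (encode-coding-string "a" coding-system)))
    ;; "a" is a single 16-bit code unit (two bytes); a BOM adds two more.
    (> (length bytes) 2)))

;; Compare the BOM-less and the BOM-carrying big-endian variants.
(list (my-coding-system-emits-bom-p 'utf-16be)
      (my-coding-system-emits-bom-p 'utf-16be-with-signature))
```

A coding system that passes this check with nil is a candidate value for
utf7-utf-16-coding-system.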
In GNU Emacs 23.0.50.2 (x86_64-unknown-linux-gnu, GTK+ Version 2.12.0)
of 2007-11-01 on undine
configured using `configure '--prefix=/usr/opt/emacs/23.0.50' '--with-x-toolkit=gtk' '--with-x' '--with-xpm' '--with-jpeg' '--with-tiff' '--with-gif' '--with-png' '--with-kerberos5''
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: C
value of $LANG: ja_JP.UTF-8
locale-coding-system: utf-8
default-enable-multibyte-characters: t
Major mode: Emacs-Lisp
Regards,
--
Topia <topia@clovery.jp>
* Re: 23.0.50; utf7-decode failed with non latin-1 charactor
From: Jason Rumney @ 2007-11-02 12:40 UTC
To: Topia; +Cc: emacs-pretest-bug
Topia wrote:
> I want to use IMAP's internationalized (Japanese) folder names
> from Wanderlust, but I get the error "Unable to convert from Unicode"
> from utf7-u16-latin1-char-converter.
>
> You can reproduce it with this code:
> ;; "Sent Mail" in Japanese, imap's UTF-7
> (utf7-decode "&kAFP4W4IMH8w4TD8MOs-" 'imap)
>
There appear to be two different implementations of utf-7 in Emacs, one
in lisp/international/utf-7.el, and one in lisp/gnus/utf7.el
The former seems to work for decoding, but always returns nil on
encoding without changing the buffer contents (the correctly encoded
text is in a buffer called " *temp*" however).
The latter only seems to work on Latin-1 text (as documented in the
commentary) and returns results from the encoder that are inconsistent
with iconv.
Probably lisp/international/utf-7.el should be fixed, and the Gnus one
dropped.
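For reference, the sample name quoted above can be decoded by hand, which
gives a result to compare either implementation against. IMAP's modified
UTF-7 (RFC 3501, section 5.1.3) is base64 over big-endian UTF-16 with `,'
in place of `/' and without `=' padding. This is only a sketch:
`my-imap-utf7-decode-segment' is a hypothetical helper, it handles a single
segment between `&' and `-', and it assumes the BOM-less utf-16be coding
system is available.

```elisp
;; Hypothetical helper for illustration; not part of Emacs or Gnus.
(defun my-imap-utf7-decode-segment (segment)
  "Decode SEGMENT, the text between `&' and `-' in an IMAP mailbox name."
  (let* (;; Modified base64 uses "," where standard base64 uses "/".
         (b64 (replace-regexp-in-string "," "/" segment))
         ;; Restore the "=" padding that modified base64 omits.
         (pad (make-string (mod (- (length b64)) 4) ?=)))
    ;; The decoded bytes are big-endian UTF-16 without a BOM.
    (decode-coding-string (base64-decode-string (concat b64 pad))
                          'utf-16be)))

(my-imap-utf7-decode-segment "kAFP4W4IMH8w4TD8MOs")
```

For this segment the result should be the seven-character name
送信済みメール.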
* Re: 23.0.50; utf7-decode failed with non latin-1 charactor
From: Richard Stallman @ 2007-11-03 3:58 UTC
To: Jason Rumney, handa; +Cc: emacs-pretest-bug, topia
    There appear to be two different implementations of utf-7 in Emacs, one
    in lisp/international/utf-7.el, and one in lisp/gnus/utf7.el
    The former seems to work for decoding, but always returns nil on
    encoding without changing the buffer contents (the correctly encoded
    text is in a buffer called " *temp*" however).
    The latter only seems to work on Latin-1 text (as documented in the
    commentary) and returns results from the encoder that are inconsistent
    with iconv.
    Probably lisp/international/utf-7.el should be fixed, and the Gnus one
    dropped.
Handa, can you fix international/utf-7.el?
* Re: 23.0.50; utf7-decode failed with non latin-1 charactor
From: Kenichi Handa @ 2007-11-05 7:02 UTC
To: Jason Rumney; +Cc: emacs-pretest-bug, topia
In article <472B1AD5.3090006@gnu.org>, Jason Rumney <jasonr@gnu.org> writes:
[...]
> There appear to be two different implementations of utf-7 in Emacs, one
> in lisp/international/utf-7.el, and one in lisp/gnus/utf7.el
> The former seems to work for decoding, but always returns nil on
> encoding without changing the buffer contents (the correctly encoded
> text is in a buffer called " *temp*" however).
[...]
> Probably lisp/international/utf-7.el should be fixed, and the Gnus one
> dropped.
It seems that the functions utf-7-decode and utf-7-encode are
designed to be called only as the pre-write/post-read functions
of the coding system utf-7 (and the commented-out coding system
utf-7-imap) in lisp/international/utf-7.el.
I think the right thing is to uncomment all the code for
utf-7-imap in utf-7.el, and to modify Gnus to use the normal
encode/decode-coding-region/string functions with utf-7-imap.
I've just committed the former change. Could someone make the
latter change?
---
Kenichi Handa
handa@ni.aist.go.jp
* imap.el: international/utf-7.el vs. gnus/utf7.el (was: 23.0.50; utf7-decode failed with non latin-1 charactor)
From: Reiner Steib @ 2007-11-06 19:53 UTC
To: Kenichi Handa, ding, emacs-devel; +Cc: topia, Jason Rumney
On Mon, Nov 05 2007, Kenichi Handa wrote:
> Jason Rumney <jasonr@gnu.org> writes:
>> Probably lisp/international/utf-7.el should be fixed, and the Gnus one
>> dropped.
Please keep in mind that we want to keep the Gnus versions in
Emacs/trunk (Gnus 5.13) and Gnus/trunk (aka No Gnus) in sync. We want
to keep No Gnus compatible with Emacs 21+ (and XEmacs 21.4+). So we
need to add some compatibility code.
> I think the right thing is to uncomment all the code for
> utf-7-imap in utf-7.el, and to modify Gnus to use the normal
> encode/decode-coding-region/string functions with utf-7-imap.
>
> I've just committed the former change.
AFAICS, `utf-7-encode' also accepts a string as FROM, but this is
not documented. Can we rely on this? Could you document it, please?
> Could someone do the latter change?
A grep through Gnus' sources suggests that it only uses the functions
`utf7-encode' and `utf7-decode' in `imap.el'.
How about the following patch to gnus/utf7.el (untested)? Could you
suggest a better test instead of `(<= 23 emacs-major-version)'?
Instead of -OLD and -NEW we should use more suitable names or include
the defuns directly.
--8<---------------cut here---------------start------------->8---
--- utf7.el	04 Aug 2007 20:36:34 +0200	1.12
+++ utf7.el	06 Nov 2007 20:42:08 +0100
@@ -207,7 +207,7 @@
     (mm-decode-coding-region (point-min) (point-max) 'iso-8859-1)
     (mm-enable-multibyte))
 
-(defun utf7-encode (string &optional for-imap)
+(defun utf7-encode-OLD (string &optional for-imap)
   "Encode UTF-7 STRING.  Use IMAP modification if FOR-IMAP is non-nil."
   (let ((default-enable-multibyte-characters t))
     (with-temp-buffer
@@ -215,7 +215,7 @@
       (utf7-encode-internal for-imap)
       (buffer-string))))
 
-(defun utf7-decode (string &optional for-imap)
+(defun utf7-decode-OLD (string &optional for-imap)
   "Decode UTF-7 STRING.  Use IMAP modification if FOR-IMAP is non-nil."
   (let ((default-enable-multibyte-characters nil))
     (with-temp-buffer
@@ -224,6 +224,31 @@
       (mm-enable-multibyte)
       (buffer-string))))
 
+(defun utf7-encode-NEW (string &optional for-imap)
+  (with-temp-buffer
+    ;; (utf-7-encode FROM TO IMAP)
+    ;;
+    ;; `utf-7-encode' also accepts that FROM is a string, but it's not
+    ;; documented.
+    (utf-7-encode string nil for-imap)
+    (buffer-string)))
+
+(defun utf7-decode-NEW (string &optional for-imap)
+  (with-temp-buffer
+    (insert string)
+    (goto-char (point-min))
+    ;; (utf-7-decode LEN IMAP)
+    (utf-7-decode (buffer-size) for-imap)
+    (buffer-string)))
+
+(if (and (require 'utf-7 nil t) ;; Additional test for XEmacs?
+         (<= 23 emacs-major-version)) ;; A feature test would be better
+    (progn
+      (defalias 'utf7-encode 'utf7-encode-NEW)
+      (defalias 'utf7-decode 'utf7-decode-NEW))
+  (defalias 'utf7-encode 'utf7-encode-OLD)
+  (defalias 'utf7-decode 'utf7-decode-OLD))
+
 (provide 'utf7)
 
 ;;; arch-tag: 96078b55-85c7-4161-aed2-932c24b282c7
--8<---------------cut here---------------end--------------->8---
Bye, Reiner.
--
,,,
(o o)
---ooO-(_)-Ooo--- | PGP key available | http://rsteib.home.pages.de/
* Re: imap.el: international/utf-7.el vs. gnus/utf7.el (was: 23.0.50; utf7-decode failed with non latin-1 charactor)
From: Kenichi Handa @ 2007-11-07 0:36 UTC
To: Reiner Steib; +Cc: jasonr, topia, ding, emacs-devel
In article <v9prynjgm3.fsf_-_@marauder.physik.uni-ulm.de>, Reiner Steib <reinersteib+gmane@imap.cc> writes:
> > I think the right thing is to uncomment all the code for
> > utf-7-imap in utf-7.el, and to modify Gnus to use the normal
> > encode/decode-coding-region/string functions with utf-7-imap.
> >
> > I've just committed the former change.
> AFAICS, `utf-7-encode' also accepts a string as FROM, but this is
> not documented. Can we rely on this? Could you document it, please?
No, don't use utf-7-encode directly; use
encode-coding-string instead.
For instance, this:
> +(defun utf7-encode-NEW (string &optional for-imap)
> +  (with-temp-buffer
> +    ;; (utf-7-encode FROM TO IMAP)
> +    ;;
> +    ;; `utf-7-encode' also accepts that FROM is a string, but it's not
> +    ;; documented.
> +    (utf-7-encode string nil for-imap)
> +    (buffer-string)))
can simply be:
(defun utf7-encode-NEW (string &optional for-imap)
  (encode-coding-string string (if for-imap 'utf-7-imap 'utf-7)))
And the test for availability is:
(and (coding-system-p 'utf-7) (coding-system-p 'utf-7-imap))
---
Kenichi Handa
handa@ni.aist.go.jp
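Putting the two suggestions above together, a wrapper pair could look
roughly like the sketch below. The decode function is the obvious
counterpart of the encode function shown in the message, using
decode-coding-string; it does not appear in the thread. All `my-' names
are hypothetical, and what to do when the coding systems are missing
(for example, falling back to the old Gnus code) is left out.

```elisp
;; Hypothetical names for illustration; not the code committed to Gnus.
(defun my-utf7-coding-systems-available-p ()
  "Return non-nil if both the utf-7 and utf-7-imap coding systems exist."
  (and (coding-system-p 'utf-7) (coding-system-p 'utf-7-imap)))

(defun my-utf7-encode (string &optional for-imap)
  "Encode STRING as UTF-7, or as IMAP's modified UTF-7 if FOR-IMAP."
  (encode-coding-string string (if for-imap 'utf-7-imap 'utf-7)))

(defun my-utf7-decode (string &optional for-imap)
  "Decode UTF-7 STRING, or IMAP's modified UTF-7 if FOR-IMAP."
  (decode-coding-string string (if for-imap 'utf-7-imap 'utf-7)))

;; Round trip of a two-character Japanese name through utf-7-imap.
(when (my-utf7-coding-systems-available-p)
  (let ((name "\u9001\u4fe1"))
    (equal name (my-utf7-decode (my-utf7-encode name t) t))))
```

Testing for the coding systems rather than for emacs-major-version keeps
the check tied to the feature that is actually needed.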
* Re: 23.0.50; utf7-decode failed with non latin-1 charactor
From: Jason Rumney @ 2007-11-07 12:12 UTC
To: Kenichi Handa; +Cc: emacs-pretest-bug, topia
Kenichi Handa wrote:
> It seems that the functions utf-7-decode and utf-7-encode are
> designed to be called only as the pre-write/post-read functions
> of the coding system utf-7 (and the commented-out coding system
> utf-7-imap) in lisp/international/utf-7.el.
>
Is it a requirement for such functions to return nil?
If not, can we return the encoded string instead of nil, to make the
undocumented string FROM argument useful (as a drop-in replacement for
the Gnus utf7-encode)?
* Re: 23.0.50; utf7-decode failed with non latin-1 charactor
From: Kenichi Handa @ 2007-11-07 12:51 UTC
To: Jason Rumney; +Cc: emacs-pretest-bug, topia
In article <4731ABC9.4080604@gnu.org>, Jason Rumney <jasonr@gnu.org> writes:
> Kenichi Handa wrote:
> > It seems that the functions utf-7-decode and utf-7-encode are
> > designed to be called only as the pre-write/post-read functions
> > of the coding system utf-7 (and the commented-out coding system
> > utf-7-imap) in lisp/international/utf-7.el.
> >
> Is it a requirement for such functions to return nil?
> If not, can we return the encoded string instead of nil,
utf-7-encode is called from the pre-write functions
utf-7-pre-write-conversion and
utf-7-imap-pre-write-conversion, and they expect
utf-7-encode to put the encoded result in a new buffer.
So, it's possible to make utf-7-encode return a string, but
it's inefficient to make a string that is just ignored when
called from utf-7-pre-write-conversion.
> to make the
> undocumented string FROM argument useful (as a drop in replacement for
> the gnus utf7-encode).
Why is that necessary? We can use encode-coding-string.
I want to restrict the entry points for encoding and decoding
to the functions decode/encode-coding-region/string.
---
Kenichi Handa
handa@ni.aist.go.jp
* Re: imap.el: international/utf-7.el vs. gnus/utf7.el
From: Reiner Steib @ 2007-11-20 21:08 UTC
To: Kenichi Handa; +Cc: jasonr, topia, ding, emacs-devel
On Wed, Nov 07 2007, Kenichi Handa wrote:
>> > I think the right thing is to uncomment all the code for
>> > utf-7-imap in utf-7.el, and to modify Gnus to use the normal
>> > encode/decode-coding-region/string functions with utf-7-imap.
>> >
>> > I've just committed the former change.
I have modified gnus/utf7.el accordingly:
* utf7.el (utf7-encode, utf7-decode): Use coding system
`utf-7'/`utf-7-imap' from `utf-7.el' if available.
> No, don't use utf-7-encode directly but use encode-coding-string.
>
> For instance, this[...] can simply be:
>
> (defun utf7-encode-NEW (string &optional for-imap)
>   (encode-coding-string string (if for-imap 'utf-7-imap 'utf-7)))
>
> And the test for the availability is:
>
> (and (coding-system-p 'utf-7) (coding-system-p 'utf-7-imap))
Thanks for the hints.
Bye, Reiner.
--
,,,
(o o)
---ooO-(_)-Ooo--- | PGP key available | http://rsteib.home.pages.de/
Thread overview: 9+ messages
2007-11-01 17:29 23.0.50; utf7-decode failed with non latin-1 charactor Topia
2007-11-02 12:40 ` Jason Rumney
2007-11-03 3:58 ` Richard Stallman
2007-11-05 7:02 ` Kenichi Handa
2007-11-06 19:53 ` imap.el: international/utf-7.el vs. gnus/utf7.el (was: 23.0.50; utf7-decode failed with non latin-1 charactor) Reiner Steib
2007-11-07 0:36 ` Kenichi Handa
2007-11-20 21:08 ` imap.el: international/utf-7.el vs. gnus/utf7.el Reiner Steib
2007-11-07 12:12 ` 23.0.50; utf7-decode failed with non latin-1 charactor Jason Rumney
2007-11-07 12:51 ` Kenichi Handa