* 23.0.50; utf7-decode failed with non latin-1 charactor
From: Topia @ 2007-11-01 17:29 UTC
To: emacs-pretest-bug
I want to use IMAP's internationalized (Japanese) folder names
from Wanderlust, but I get the error "Unable to convert from Unicode"
from utf7-u16-latin1-char-converter.
You can reproduce it with this code:
;; "Sent Mail" in Japanese, imap's UTF-7
(utf7-decode "&kAFP4W4IMH8w4TD8MOs-" 'imap)
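As an Emacs-independent cross-check of what the call above should return, here is a minimal Python sketch of IMAP's modified UTF-7 decoding as described in RFC 3501, section 5.1.3: each "&...-" run is base64 (with "," standing in for "/") over big-endian UTF-16 without a BOM, and "&-" is a literal "&". The function name decode_imap_utf7 is hypothetical and is not part of utf7.el or Wanderlust; handling of malformed input is omitted.

```python
import base64


def decode_imap_utf7(name: str) -> str:
    """Decode an IMAP modified-UTF-7 mailbox name (sketch, no error recovery)."""
    out = []
    i = 0
    while i < len(name):
        if name[i] != "&":
            # Characters outside "&...-" runs stand for themselves.
            out.append(name[i])
            i += 1
            continue
        end = name.index("-", i)          # the run is terminated by "-"
        chunk = name[i + 1:end]
        if chunk == "":
            out.append("&")               # "&-" encodes a literal ampersand
        else:
            b64 = chunk.replace(",", "/")  # modified base64 uses "," for "/"
            b64 += "=" * (-len(b64) % 4)   # restore the padding IMAP strips
            out.append(base64.b64decode(b64).decode("utf-16-be"))
        i = end + 1
    return "".join(out)


if __name__ == "__main__":
    print(decode_imap_utf7("&kAFP4W4IMH8w4TD8MOs-"))  # 送信済みメール
```

The test string decodes to seven characters, 送信済みメール, none of which are in Latin-1, so a converter limited to Latin-1 cannot represent them.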
I found that the problem is that utf7-utf-16-coding-system is nil, because:
* Emacs 23 has utf-16-be, but this coding system has a BOM.
* utf-16-be-nosig is not found.
I found a version of utf-16-be without a BOM, utf-16be. Evaluating the utf7-decode call above
after (setq utf7-utf-16-coding-system 'utf-16be) raises no error.
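The length test used in utf7-utf-16-coding-system (encode "a" and require exactly two bytes) works because a BOM adds two bytes in front of the single UTF-16 code unit for "a". The same effect can be sketched in Python, with the caveat that Python's codec names are not Emacs coding-system names: Python's "utf-16" writes a BOM and "utf-16-be" does not, which is a statement about Python only and says nothing about any given Emacs version. The helper encodes_without_bom is hypothetical.

```python
import codecs


def encodes_without_bom(codec_name: str) -> bool:
    """Return True if encoding "a" yields exactly one 2-byte UTF-16 code unit."""
    return len("a".encode(codec_name)) == 2


if __name__ == "__main__":
    for name in ("utf-16", "utf-16-be"):
        data = "a".encode(name)
        starts_with_bom = data[:2] in (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)
        print(name, len(data), starts_with_bom)
```

A codec that fails this check would put a BOM in front of every encoded chunk, which is not valid inside a UTF-7 base64 run.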
Could you modify utf7-utf-16-coding-system to add utf-16be?
 (defconst utf7-utf-16-coding-system
   (cond ((mm-coding-system-p 'utf-16-be-no-signature) ; Mule-UCS
          'utf-16-be-no-signature)
         ((and (mm-coding-system-p 'utf-16-be) ; Emacs 21.3, Emacs 22
               ;; Avoid versions with BOM.
               (= 2 (length (encode-coding-string "a" 'utf-16-be))))
          'utf-16-be)
+        ((and (mm-coding-system-p 'utf-16be) ; Emacs 23?
+              ;; Avoid versions with BOM.
+              (= 2 (length (encode-coding-string "a" 'utf-16be))))
+         'utf-16be)
         ((mm-coding-system-p 'utf-16-be-nosig) ; ?
          'utf-16-be-nosig))
   "Coding system which encodes big endian UTF-16 without a BOM signature.")
In GNU Emacs 23.0.50.2 (x86_64-unknown-linux-gnu, GTK+ Version 2.12.0)
of 2007-11-01 on undine
configured using `configure '--prefix=/usr/opt/emacs/23.0.50' '--with-x-toolkit=gtk' '--with-x' '--with-xpm' '--with-jpeg' '--with-tiff' '--with-gif' '--with-png' '--with-kerberos5''
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: C
value of $LANG: ja_JP.UTF-8
locale-coding-system: utf-8
default-enable-multibyte-characters: t
Major mode: Emacs-Lisp
Regards,
--
Topia <topia@clovery.jp>
* 23.0.50; utf7-decode failed with non latin-1 charactor
From: Topia @ 2007-11-01 17:45 UTC
To: bug-gnu-emacs
I want to use IMAP's internationalized (Japanese) folder names
from Wanderlust, but I get the error "Unable to convert from Unicode"
from utf7-u16-latin1-char-converter.
You can reproduce it with this code:
;; "Sent Mail" in Japanese, imap's UTF-7
(utf7-decode "&kAFP4W4IMH8w4TD8MOs-" 'imap)
Tested on lisp/gnus/utf7.el:
;;; arch-tag: 96078b55-85c7-4161-aed2-932c24b282c7
I found that the problem is that utf7-utf-16-coding-system is nil, because:
* Emacs 23 has utf-16-be, but this coding system has a BOM.
* utf-16-be-nosig is not found.
I found a version of utf-16-be without a BOM, utf-16be. Evaluating the utf7-decode call above
after (setq utf7-utf-16-coding-system 'utf-16be) raises no error.
Could you modify utf7-utf-16-coding-system to add utf-16be?
 (defconst utf7-utf-16-coding-system
   (cond ((mm-coding-system-p 'utf-16-be-no-signature) ; Mule-UCS
          'utf-16-be-no-signature)
         ((and (mm-coding-system-p 'utf-16-be) ; Emacs 21.3, Emacs 22 (BOM?)
               ;; Avoid versions with BOM.
               (= 2 (length (encode-coding-string "a" 'utf-16-be))))
          'utf-16-be)
+        ((and (mm-coding-system-p 'utf-16be) ; Emacs 22 and later
+              ;; Avoid versions with BOM.
+              (= 2 (length (encode-coding-string "a" 'utf-16be))))
+         'utf-16be)
         ((mm-coding-system-p 'utf-16-be-nosig) ; ?
          'utf-16-be-nosig))
   "Coding system which encodes big endian UTF-16 without a BOM signature.")
In GNU Emacs 23.0.50.2 (x86_64-unknown-linux-gnu, GTK+ Version 2.12.0)
of 2007-11-01 on undine
configured using `configure '--prefix=/usr/opt/emacs/23.0.50' '--with-x-toolkit=gtk' '--with-x' '--with-xpm' '--with-jpeg' '--with-tiff' '--with-gif' '--with-png' '--with-kerberos5''
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: C
value of $LANG: ja_JP.UTF-8
locale-coding-system: utf-8
default-enable-multibyte-characters: t
Major mode: Emacs-Lisp
Sorry for bad English.
Regards,
--
Topia <topia@clovery.jp>
* Re: 23.0.50; utf7-decode failed with non latin-1 charactor
From: Jason Rumney @ 2007-11-02 12:40 UTC
To: Topia; +Cc: emacs-pretest-bug
Topia wrote:
> I want to use imap's internationalized (japanese) folder name,
> from Wanderlust. but I see error "Unable to convert from Unicode"
> from utf7-u16-latin1-char-converter.
>
> You can see with this code:
> ;; "Sent Mail" in Japanese, imap's UTF-7
> (utf7-decode "&kAFP4W4IMH8w4TD8MOs-" 'imap)
>
There appear to be two different implementations of UTF-7 in Emacs, one
in lisp/international/utf-7.el, and one in lisp/gnus/utf7.el.
The former seems to work for decoding, but always returns nil on
encoding without changing the buffer contents (the correctly encoded
text is in a buffer called " *temp*" however).
The latter only seems to work on Latin-1 text (as documented in the
commentary) and returns results from the encoder that are inconsistent
with iconv.
Probably lisp/international/utf-7.el should be fixed, and the Gnus one
dropped.
* Re: 23.0.50; utf7-decode failed with non latin-1 charactor
From: Richard Stallman @ 2007-11-03 3:58 UTC
To: Jason Rumney, handa; +Cc: emacs-pretest-bug, topia
    There appear to be two different implementations of utf-7 in Emacs, one
    in lisp/international/utf-7.el, and one in lisp/gnus/utf7.el
    The former seems to work for decoding, but always returns nil on
    encoding without changing the buffer contents (the correctly encoded
    text is in a buffer called " *temp*" however).
    The latter only seems to work on Latin-1 text (as documented in the
    commentary) and returns results from the encoder that are inconsistent
    with iconv.
    Probably lisp/international/utf-7.el should be fixed, and the Gnus one
    dropped.
Handa, can you fix international/utf-7.el?
* Re: 23.0.50; utf7-decode failed with non latin-1 charactor
From: Kenichi Handa @ 2007-11-05 7:02 UTC
To: Jason Rumney; +Cc: emacs-pretest-bug, topia
In article <472B1AD5.3090006@gnu.org>, Jason Rumney <jasonr@gnu.org> writes:
[...]
> There appear to be two different implementations of utf-7 in Emacs, one
> in lisp/international/utf-7.el, and one in lisp/gnus/utf7.el
> The former seems to work for decoding, but always returns nil on
> encoding without changing the buffer contents (the correctly encoded
> text is in a buffer called " *temp*" however).
[...]
> Probably lisp/international/utf-7.el should be fixed, and the Gnus one
> dropped.
It seems that the functions utf-7-decode and utf-7-encode are
designed to be called only as the pre-write/post-read functions
of the coding system utf-7 (and of the commented-out coding system
utf-7-imap) in lisp/international/utf-7.el.
I think the right thing is to uncomment all the code for
utf-7-imap in utf-7.el, and modify Gnus to use the normal
encode/decode-coding-region/string with utf-7-imap.
I've just committed the former change. Could someone do the
latter change?
---
Kenichi Handa
handa@ni.aist.go.jp
* Re: 23.0.50; utf7-decode failed with non latin-1 charactor
From: Jason Rumney @ 2007-11-07 12:12 UTC
To: Kenichi Handa; +Cc: emacs-pretest-bug, topia
Kenichi Handa wrote:
> It seems that functions utf-7-decode and utf-7-encode are
> designed to be called only as pre-write/post-read functions
> of a coding system utf-7 (and commented out coding system
> utf-7-imap) in lisp/international/utf-7.el.
>
Is it a requirement for such functions to return nil?
If not, can we return the encoded string instead of nil, to make the
undocumented string FROM argument useful (as a drop-in replacement for
the Gnus utf7-encode)?
* Re: 23.0.50; utf7-decode failed with non latin-1 charactor
From: Kenichi Handa @ 2007-11-07 12:51 UTC
To: Jason Rumney; +Cc: emacs-pretest-bug, topia
In article <4731ABC9.4080604@gnu.org>, Jason Rumney <jasonr@gnu.org> writes:
> Kenichi Handa wrote:
> > It seems that functions utf-7-decode and utf-7-encode are
> > designed to be called only as pre-write/post-read functions
> > of a coding system utf-7 (and commented out coding system
> > utf-7-imap) in lisp/international/utf-7.el.
> >
> Is it a requirement for such functions to return nil?
> If not, can we return the encoded string instead of nil,
utf-7-encode is called from the pre-write functions
utf-7-pre-write-conversion and
utf-7-imap-pre-write-conversion, and they expect
utf-7-encode to put the encoded result in a new buffer.
So, it's possible to make utf-7-encode return a string, but
it's inefficient to make a string that is just ignored when
called from utf-7-pre-write-conversion.
> to make the
> undocumented string FROM argument useful (as a drop in replacement for
> the gnus utf7-encode).
Why is that necessary? We can use encode-coding-string.
I want to limit the entry points for encoding and decoding
to the functions decode/encode-coding-region/string.
---
Kenichi Handa
handa@ni.aist.go.jp