From: Kenichi Handa <handa@etl.go.jp>
Cc: d.love@dl.ac.uk, monnier+gnu/emacs@rum.cs.yale.edu,
keichwa@gmx.net, emacs-devel@gnu.org
Subject: Re: Several serious problems
Date: Mon, 2 Sep 2002 10:28:25 +0900 (JST)
Message-ID: <200209020128.KAA08644@etlken.m17n.org>
In-Reply-To: <E17lefC-0003IF-00@fencepost.gnu.org> (message from Richard Stallman on Sun, 01 Sep 2002 20:01:54 -0400)
In article <E17lefC-0003IF-00@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:
> That depends on whether you include code in utf-8.el that encodes
> those charsets. If not, you need that change.
> In that case, I will install that change presently, and then we can
> study the question of whether to include the code in utf-8.el instead.
> What does that code in utf-8.el do, and how safe a change is it?
It defines two CCL programs, one to decode and one to encode UTF-8
byte sequences, and builds the coding system mule-utf-8 from those
CCL programs.

I'll attach the change needed to enable RC's utf-8 to encode latin-X
and a few other charsets (e.g. thai).  The docstring of mule-utf-8
may need improvement.

As the change is very small and that code has been in HEAD for more
than a month, I think the change is quite safe.  I recommend
installing it in RC.

I also checked the code to some extent with this test suite:

;; For each safe charset of mule-utf-8 (except ASCII and the eight-bit
;; charsets), build a two-character string and check that it encodes.
(dolist (charset (delq 'ascii
                       (delq 'eight-bit-control
                             (delq 'eight-bit-graphic
                                   (coding-system-get 'mule-utf-8
                                                      'safe-charsets)))))
  (let ((dimension (charset-dimension charset))
        str)
    (if (= dimension 1)
        (setq str (string (make-char charset 33) (make-char charset 34)))
      (setq str (string (make-char charset 33 33) (make-char charset 33 34))))
    (or (memq 'mule-utf-8 (find-coding-systems-string str))
        (not (string-match "\357\277\275" ; UTF-8 form of U+FFFD
                           (encode-coding-string str 'mule-utf-8)))
        ;; `error' takes a format string directly; no need for `format'.
        (error "%s is not supported" charset))))
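
For readers wondering where the octal string "\357\277\275" in the test
comes from, here is a minimal sketch of the standard UTF-8 bit layout
for code points up to U+FFFF.  The function name `sketch-utf-8-bytes'
is hypothetical and is not part of utf-8.el; the real encoder is the
CCL program, which this does not reproduce.

```elisp
;; Hypothetical helper, for illustration only (not from utf-8.el).
(defun sketch-utf-8-bytes (code)
  "Return the UTF-8 bytes of CODE, a code point up to U+FFFF, as a list."
  (cond ((< code #x80)
         ;; 1 byte: 0xxxxxxx
         (list code))
        ((< code #x800)
         ;; 2 bytes: 110xxxxx 10xxxxxx
         (list (logior #xC0 (ash code -6))
               (logior #x80 (logand code #x3F))))
        (t
         ;; 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
         (list (logior #xE0 (ash code -12))
               (logior #x80 (logand (ash code -6) #x3F))
               (logior #x80 (logand code #x3F))))))

;; Compare the bytes for U+FFFD with the bytes of the octal string
;; that the test searches for.
(equal (sketch-utf-8-bytes #xFFFD)
       (string-to-list "\357\277\275"))
```

The test treats the presence of those three bytes in the encoded
output as a sign that a character could not be encoded and was
replaced by U+FFFD.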
---
Ken'ichi HANDA
handa@etl.go.jp
*** utf-8.el.~1.9.4.2.~ Tue Jul 23 13:54:13 2002
--- utf-8.el Mon Sep 2 10:28:26 2002
***************
*** 269,275 ****
(loop
(if (r5 < 0)
((r1 = -1)
! (read-multibyte-character r0 r1))
(;; We have already done read-multibyte-character.
(r0 = r5)
(r1 = r6)
--- 269,277 ----
(loop
(if (r5 < 0)
((r1 = -1)
! (read-multibyte-character r0 r1)
! (translate-character ucs-mule-to-mule-unicode r0 r1))
!
(;; We have already done read-multibyte-character.
(r0 = r5)
(r1 = r6)
***************
*** 392,397 ****
--- 394,423 ----
mule-unicode-0100-24ff
mule-unicode-2500-33ff
mule-unicode-e000-ffff
+ latin-iso8859-2 (*)
+ latin-iso8859-3 (*)
+ latin-iso8859-4 (*)
+ cyrillic-iso8859-5 (*)
+ arabic-iso8859-6 (*)
+ greek-iso8859-7 (*)
+ hebrew-iso8859-8 (*)
+ latin-iso8859-9 (*)
+ latin-iso8859-14 (*)
+ latin-iso8859-15 (*)
+ chinese-sisheng (*)
+ ethiopic (*)
+ ipa (*)
+ lao (*)
+ katakana-jisx0201 (*)
+ thai-tis620 (*)
+ tibetan (*)
+ vietnamese-viscii-lower (*)
+ vietnamese-viscii-upper (*)
+
+ Among them, the charsets labeled \"(*)\" are supported only on
+ encoding. That means, they are correctly encoded to UTF-8, but are
+ decoded back to charsets latin-iso8859-1, mule-unicode-0100-24ff, or
+ mule-unicode-2500-33ff, not to the original charsets.
Unicode characters out of the ranges U+0000-U+33FF and U+E200-U+FFFF
are decoded into sequences of eight-bit-control and eight-bit-graphic
***************
*** 409,415 ****
latin-iso8859-1
mule-unicode-0100-24ff
mule-unicode-2500-33ff
! mule-unicode-e000-ffff)
(mime-charset . utf-8)
(coding-category . coding-category-utf-8)
(valid-codes (0 . 255))))
--- 435,460 ----
latin-iso8859-1
mule-unicode-0100-24ff
mule-unicode-2500-33ff
! mule-unicode-e000-ffff
! latin-iso8859-2
! latin-iso8859-3
! latin-iso8859-4
! cyrillic-iso8859-5
! arabic-iso8859-6
! greek-iso8859-7
! hebrew-iso8859-8
! latin-iso8859-9
! latin-iso8859-14
! latin-iso8859-15
! chinese-sisheng
! ethiopic
! ipa
! lao
! katakana-jisx0201
! thai-tis620
! tibetan
! vietnamese-viscii-lower
! vietnamese-viscii-upper)
(mime-charset . utf-8)
(coding-category . coding-category-utf-8)
(valid-codes (0 . 255))))