unofficial mirror of emacs-devel@gnu.org 
From: Kenichi Handa <handa@etl.go.jp>
Cc: d.love@dl.ac.uk, monnier+gnu/emacs@rum.cs.yale.edu,
	keichwa@gmx.net, emacs-devel@gnu.org
Subject: Re: Several serious problems
Date: Mon, 2 Sep 2002 10:28:25 +0900 (JST)
Message-ID: <200209020128.KAA08644@etlken.m17n.org>
In-Reply-To: <E17lefC-0003IF-00@fencepost.gnu.org> (message from Richard Stallman on Sun, 01 Sep 2002 20:01:54 -0400)

In article <E17lefC-0003IF-00@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:
>     That depends on whether you include code in utf-8.el that encodes
>     those charsets.  If not, you need that change.

> In that case, I will install that change presently, and then we can
> study the question of whether to include the code in utf-8.el instead.

> What does that code in utf-8.el do, and how safe a change is it?

It defines two CCL programs that decode and encode UTF-8 byte
sequences, and defines the coding system mule-utf-8 using
those CCL programs.
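
As a rough illustration of what the encoding side has to produce
at the byte level, here is a minimal sketch in plain Emacs Lisp.
It is not the CCL code from utf-8.el; the function name
my-utf-8-bytes is hypothetical, and it only handles code points
up to U+FFFF (it does not reject surrogates).

```elisp
;; Illustrative only: plain Emacs Lisp, not the CCL program in utf-8.el.
;; `my-utf-8-bytes' is a hypothetical name chosen for this sketch.
(defun my-utf-8-bytes (code)
  "Return the list of UTF-8 bytes for Unicode code point CODE.
Only code points up to U+FFFF are handled."
  (cond
   ((< code #x80)                       ; 1 byte:  0xxxxxxx
    (list code))
   ((< code #x800)                      ; 2 bytes: 110xxxxx 10xxxxxx
    (list (logior #xC0 (ash code -6))
          (logior #x80 (logand code #x3F))))
   ((< code #x10000)                    ; 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    (list (logior #xE0 (ash code -12))
          (logior #x80 (logand (ash code -6) #x3F))
          (logior #x80 (logand code #x3F))))
   (t (error "Code point U+%X is outside the range handled here" code))))

(my-utf-8-bytes #xFFFD)
;; => (239 191 189), i.e. "\357\277\275" in octal
```

The last form shows where the octal string for U+FFFD (the
replacement character) comes from.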

I'm attaching the change needed to let RC's utf-8 encode the
latin-X charsets and a few others (e.g. Thai).  The docstring
of mule-utf-8 may need improvement.

As the change is very small and that code has been in HEAD
for more than a month, I think the change is quite safe.
I recommend installing it in RC.

I also checked the code to some extent with this test suite:

(dolist (charset (delq 'ascii
                       (delq 'eight-bit-control
                             (delq 'eight-bit-graphic
                                   ;; Copy first: delq is destructive.
                                   (copy-sequence
                                    (coding-system-get 'mule-utf-8
                                                       'safe-charsets))))))
  (let ((dimension (charset-dimension charset))
        str)
    ;; Build a two-character sample string from the charset.
    (if (= dimension 1)
        (setq str (string (make-char charset 33) (make-char charset 34)))
      (setq str (string (make-char charset 33 33) (make-char charset 33 34))))
    ;; Signal an error only when both checks fail.
    (or (memq 'mule-utf-8 (find-coding-systems-string str))
        (not (string-match "\357\277\275" ; UTF-8 form of U+FFFD
                           (encode-coding-string str 'mule-utf-8)))
        (error "%s is not supported" charset))))
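
The U+FFFD check used above can also be tried on its own.  The
following is a minimal sketch that assumes a current Emacs and
its `utf-8' coding system rather than the 2002 mule-utf-8; the
helper name my-encodes-cleanly-p is hypothetical.

```elisp
;; Sketch for a current Emacs; `my-encodes-cleanly-p' is a hypothetical name.
(defun my-encodes-cleanly-p (str coding)
  "Return non-nil if STR encoded with CODING contains no U+FFFD bytes."
  (not (string-match-p "\357\277\275"   ; UTF-8 form of U+FFFD
                       (encode-coding-string str coding))))

(my-encodes-cleanly-p "caf\u00e9" 'utf-8)   ; => t
(my-encodes-cleanly-p "\ufffd" 'utf-8)      ; => nil
```

The second call returns nil because the input is itself the
replacement character, so the encoded bytes contain its UTF-8
form.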

---
Ken'ichi HANDA
handa@etl.go.jp

*** utf-8.el.~1.9.4.2.~	Tue Jul 23 13:54:13 2002
--- utf-8.el	Mon Sep  2 10:28:26 2002
***************
*** 269,275 ****
       (loop
        (if (r5 < 0)
  	  ((r1 = -1)
! 	   (read-multibyte-character r0 r1))
  	(;; We have already done read-multibyte-character.
  	 (r0 = r5)
  	 (r1 = r6)
--- 269,277 ----
       (loop
        (if (r5 < 0)
  	  ((r1 = -1)
! 	   (read-multibyte-character r0 r1)
! 	   (translate-character ucs-mule-to-mule-unicode r0 r1))
! 
  	(;; We have already done read-multibyte-character.
  	 (r0 = r5)
  	 (r1 = r6)
***************
*** 392,397 ****
--- 394,423 ----
     mule-unicode-0100-24ff
     mule-unicode-2500-33ff
     mule-unicode-e000-ffff
+    latin-iso8859-2 (*)
+    latin-iso8859-3 (*)
+    latin-iso8859-4 (*)
+    cyrillic-iso8859-5 (*)
+    arabic-iso8859-6 (*)
+    greek-iso8859-7 (*)
+    hebrew-iso8859-8 (*)
+    latin-iso8859-9 (*)
+    latin-iso8859-14 (*)
+    latin-iso8859-15 (*)
+    chinese-sisheng (*)
+    ethiopic (*)
+    ipa (*)
+    lao (*)
+    katakana-jisx0201 (*)
+    thai-tis620 (*)
+    tibetan (*)
+    vietnamese-viscii-lower (*)
+    vietnamese-viscii-upper (*)
+ 
+ Among them, the charsets labeled \"(*)\" are supported only on
+ encoding.  That means, they are correctly encoded to UTF-8, but are
+ decoded back to charsets latin-iso8859-1, mule-unicode-0100-24ff, or
+ mule-unicode-2500-33ff, not to the original charsets.
  
  Unicode characters out of the ranges U+0000-U+33FF and U+E200-U+FFFF
  are decoded into sequences of eight-bit-control and eight-bit-graphic
***************
*** 409,415 ****
      latin-iso8859-1
      mule-unicode-0100-24ff
      mule-unicode-2500-33ff
!     mule-unicode-e000-ffff)
     (mime-charset . utf-8)
     (coding-category . coding-category-utf-8)
     (valid-codes (0 . 255))))
--- 435,460 ----
      latin-iso8859-1
      mule-unicode-0100-24ff
      mule-unicode-2500-33ff
!     mule-unicode-e000-ffff
!     latin-iso8859-2 
!     latin-iso8859-3 
!     latin-iso8859-4 
!     cyrillic-iso8859-5 
!     arabic-iso8859-6 
!     greek-iso8859-7 
!     hebrew-iso8859-8 
!     latin-iso8859-9 
!     latin-iso8859-14 
!     latin-iso8859-15 
!     chinese-sisheng 
!     ethiopic 
!     ipa 
!     lao 
!     katakana-jisx0201 
!     thai-tis620 
!     tibetan 
!     vietnamese-viscii-lower 
!     vietnamese-viscii-upper)
     (mime-charset . utf-8)
     (coding-category . coding-category-utf-8)
     (valid-codes (0 . 255))))
