all messages for Emacs-related lists mirrored at yhetil.org
* Re: Several serious problems
@ 2002-08-19  7:48 Kenichi Handa
  2002-08-22 17:08 ` Dave Love
  2002-08-24 12:11 ` Richard Stallman
  0 siblings, 2 replies; 63+ messages in thread
From: Kenichi Handa @ 2002-08-19  7:48 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, keichwa, rms, emacs-devel

Dave Love <d.love@dl.ac.uk> writes:
> "Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu> writes:
>>  Indeed, the safe-charsets property of the utf-8 coding-system has not been
>>  updated to list the extra charsets it can now encode.

> I hope whatever's been changed has been properly tested if it's on the
> release branch.  Please get handa to check it if he hasn't already.

>>  I think Dave or Handa would know better how to fix that (whether
>>  unify-8859-on-encoding-mode should change the safe-charsets or whether
>>  it should simply always include the new charsets and load ucs-tables
>>  when needed.  And also which charsets should be added).

> Whoever changed it should sort it out.

I'm quite confused with the current status of utf-8.el,
ucs-tables.el, utf-16.el, utf-8-subst.el, etc in HEAD and
RC.

They differ in many parts (utf-8-subst.el and the necessary
change for that in mule.el and ccl.c don't exist in RC).

It's IMPOSSIBLE for me to figure out what the correct
behaviour of each of them is.  I had thought that the current code was
the same as what Dave had, but Dave's statement above
says that it's not.

Could someone tell me why they are different in HEAD and RC,
and why they are different from what Dave has written?

---
Ken'ichi HANDA
handa@etl.go.jp


* Re: Several serious problems
  2002-08-19  7:48 Several serious problems Kenichi Handa
@ 2002-08-22 17:08 ` Dave Love
  2002-08-29 13:25   ` Kenichi Handa
  2002-08-24 12:11 ` Richard Stallman
  1 sibling, 1 reply; 63+ messages in thread
From: Dave Love @ 2002-08-22 17:08 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, keichwa, rms, emacs-devel

Kenichi Handa <handa@etl.go.jp> writes:

> I'm quite confused with the current status of utf-8.el,
> ucs-tables.el, utf-16.el, utf-8-subst.el, etc in HEAD and
> RC.

I've been confused too, struggling to maintain several different
versions.

> It's IMPOSSIBLE for me to figure out what the correct
> behaviour of each of them is.

As far as I know, what's installed in the trunk behaves correctly, but
I'm not using that code and I don't know if I'd hear about real
problems with it (as opposed to imagined problems).  It should all be
things you have said are OK or I'm sure you will think are OK, but I
may have overlooked something.  However, it could use work for CJK, in
particular; there's a fixme in utf-8, and there could be additional
interconversion tables for CJK charsets as well as a way of
customizing the character preferences in utf-8-subst.el, and probably
other things.

> I had thought that the current code was
> the same as what Dave had, but Dave's statement above
> says that it's not.

Well, now I check, utf-8.el in the RC branch seems to be as I left it,
which is what rms (I think) told me to do.  As far as I can tell, its
safe-charsets property is correct, and I don't understand what the
complaint is about.  When I couldn't check, I assumed someone had
modified it incorrectly, but there's no sign of that in CVS.

> Could someone tell me why they are different in HEAD and RC,
> and why they are different from what Dave has written?

Most changes aren't in RC since I was only allowed to add (a version
of) ucs-tables, not changing the default behaviour, so people could
turn on (partial) character translation themselves.  It doesn't affect
utf-8 or any other ccl coding systems because they don't use the
translation table (although the useful extra coding systems in
code-pages.el aren't included either, so I think only koi,
alternativnyj and mac-roman are affected).

I think I unilaterally added some other things (a utf-8 language
environment and utf-16.el?) since they addressed somewhat misleading
entries in PROBLEMS and the arguments against the Unicode support are
either demonstrably wrong or spurious IMNSHO.

I'm afraid I've had enough of all this, and I doubt it's worth more
effort anyhow.  Especially after all the FUD about them, the Mule
additions probably won't get used much unless they're the default,
even by i18n people, unfortunately.  It's a pity your good work on
Mule 5 is rather wasted.


* Re: Several serious problems
  2002-08-19  7:48 Several serious problems Kenichi Handa
  2002-08-22 17:08 ` Dave Love
@ 2002-08-24 12:11 ` Richard Stallman
  2002-08-26 13:17   ` Kenichi Handa
  1 sibling, 1 reply; 63+ messages in thread
From: Richard Stallman @ 2002-08-24 12:11 UTC (permalink / raw)
  Cc: d.love, monnier+gnu/emacs, keichwa, emacs-devel

    I'm quite confused with the current status of utf-8.el,
    ucs-tables.el, utf-16.el, utf-8-subst.el, etc in HEAD and
    RC.

Do you understand the situation in HEAD?


* Re: Several serious problems
  2002-08-24 12:11 ` Richard Stallman
@ 2002-08-26 13:17   ` Kenichi Handa
  2002-08-26 16:15     ` Stefan Monnier
  2002-08-29 23:19     ` Dave Love
  0 siblings, 2 replies; 63+ messages in thread
From: Kenichi Handa @ 2002-08-26 13:17 UTC (permalink / raw)
  Cc: d.love, monnier+gnu/emacs, keichwa, emacs-devel

In article <200208241211.g7OCBW111768@wijiji.santafe.edu>, Richard Stallman <rms@gnu.org> writes:
>     I'm quite confused with the current status of utf-8.el,
>     ucs-tables.el, utf-16.el, utf-8-subst.el, etc in HEAD and
>     RC.

> Do you understand the situation in HEAD?

I don't understand what exactly you mean by "situation".

I don't know if they are the same as what Dave currently
has.

I understand how each function and variable is supposed
to work.  And, from reading through the code briefly, I know
that it doesn't do anything definitely wrong.

But I have not checked whether it really works as
expected.  I believe Dave has done that.

And I don't understand why many of those functions/variables
are designed the way they currently are.  For instance,

(1) Why does loadup.el have this code:
	(ucs-unify-8859 'encode-only)
instead of:
	(unify-8859-on-encoding-mode 1)

(2) Why doesn't utf-8-subst.el provide mappings of
    non-Chinese characters for ksc, gb, and jisx charsets?
    The documentation of utf-8-translate-cjk says:
----------------------------------------------------------------------
Whether the `mule-utf-8' coding system should encode many CJK characters.

Enabling this loads tables which enable the coding system to encode
characters in the charsets `korean-ksc5601', `chinese-gb2312' and
`japanese-jisx0208', and to decode the corresponding unicodes into
...
----------------------------------------------------------------------
but, currently only Chinese characters in those charsets are
handled.

(3) Why is utf-8-translate-cjk a variable, not a minor-mode
    like unify-8859-on-(de/en)coding-mode?  Or, why is the
    latter not a simple variable?   By the way, it seems
    that once we customize utf-8-translate-cjk to t,
    customizing it back to nil doesn't cancel the translation.

(4) It seems that the variable name
    utf-8-fragment-on-decoding is not appropriate because it
    is also used in utf-16.el.  Perhaps
    ucs-fragment-on-decoding is better.

(5) It seems that mule-utf-16 can handle the same range of
    characters as mule-utf-8, but its `safe-charsets' property
    doesn't contain, for instance, `latin-iso8859-2'.
    Perhaps this is simply a bug that can be fixed easily.

---
Ken'ichi HANDA
handa@etl.go.jp


* Re: Several serious problems
  2002-08-26 13:17   ` Kenichi Handa
@ 2002-08-26 16:15     ` Stefan Monnier
  2002-08-29 23:18       ` Dave Love
  2002-08-29 23:19     ` Dave Love
  1 sibling, 1 reply; 63+ messages in thread
From: Stefan Monnier @ 2002-08-26 16:15 UTC (permalink / raw)
  Cc: rms, d.love, monnier+gnu/emacs, keichwa, emacs-devel

> (1) Why does loadup.el have this code:
> 	(ucs-unify-8859 'encode-only)
> instead of:
> 	(unify-8859-on-encoding-mode 1)

It might have been my "fault".  I think it's because I expect(ed)
unify-8859-on-encoding-mode to disappear (because there's no benefit
in turning it off, except for working around some bugs maybe).


	Stefan


* Re: Several serious problems
  2002-08-22 17:08 ` Dave Love
@ 2002-08-29 13:25   ` Kenichi Handa
  2002-08-29 17:32     ` Stefan Monnier
                       ` (3 more replies)
  0 siblings, 4 replies; 63+ messages in thread
From: Kenichi Handa @ 2002-08-29 13:25 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, keichwa, rms, emacs-devel

In article <rzqlm6ybz38.fsf@albion.dl.ac.uk>,
  Dave Love <d.love@dl.ac.uk> writes:
> As far as I know, what's installed in the trunk behaves correctly, but
> I'm not using that code

Why aren't you using that code?  Does it mean that you
changed some of them locally?

> and I don't know if I'd hear about real
> problems with it (as opposed to imagined problems).  It should all be
> things you have said are OK or I'm sure you will think are OK, but I
> may have overlooked something.  However, it could use work for CJK, in
> particular; there's a fixme in utf-8, and there could be additional
> interconversion tables for CJK charsets as well as a way of
> customizing the character preferences in utf-8-subst.el, and probably
> other things.

I noticed those `fixme's.   Yes, it is better to solve all
of them, but, for the moment, I want to concentrate on
fixing the problem of RC.

>>  I had thought that the current code was
>>  the same as what Dave had, but Dave's statement above
>>  says that it's not.

> Well, now I check, utf-8.el in the RC branch seems to be as I left it,
> which is what rms (I think) told me to do.  As far as I can tell, its
> safe-charsets property is correct,

The safe-charsets property of utf-8 in RC is this:

ascii eight-bit-control eight-bit-graphic latin-iso8859-1
mule-unicode-0100-24ff mule-unicode-2500-33ff
mule-unicode-e000-ffff ethiopic tibetan thai-tis620
katakana-jisx0201 ipa chinese-sisheng lao
vietnamese-viscii-lower vietnamese-viscii-upper

It doesn't contain latin-iso8859-[23...].

> and I don't understand what the complaint is about.  When
> I couldn't check, I assumed someone had modified it
> incorrectly, but there's no sign of that in CVS.

The complaint is that the coding-system utf-8 can't encode
latin-2 characters in RC even if loadup.el has these lines.

(load "international/ucs-tables")
(ucs-unify-8859 'encode-only)

The reason, as far as I can see, is that the ccl program
`ccl-encode-mule-utf-8' doesn't have this line near
its head.

	   (translate-character ucs-mule-to-mule-unicode r0 r1))

So, even if we set up the translation table
`ucs-mule-to-mule-unicode' at loadup time, it is not used in
utf-8.

>>  Could someone tell me why they are different in HEAD and RC,
>>  and why they are different from what Dave has written?

> Most changes aren't in RC since I was only allowed to add (a version
> of) ucs-tables, not changing the default behaviour, so people could
> turn on (partial) character translation themselves.  It doesn't affect
> utf-8 or any other ccl coding systems because they don't use the
> translation table (although the useful extra coding systems in
> code-pages.el aren't included either, so I think only koi,
> alternativnyj and mac-roman are affected).

Hmmm, I think I now understand the situation in RC.  It can unify
charsets between iso-8859-X, but utf-8 can't encode
iso-8859-X (intentionally), correct?

Richard, is that what you asked Dave to install for RC?

I think RC should also allow utf-8 to encode 8859-X
correctly like in HEAD.  I see no harm in it.

> I think I unilaterally added some other things (a utf-8 language
> environment and utf-16.el?) since they addressed somewhat misleading
> entries in PROBLEMS and the arguments against the Unicode support are
> either demonstrably wrong or spurious IMNSHO.

I don't object to that.  I found one problem with utf-16.
It seems that utf-16-le/be can handle 8859-X correctly
because of this line in ccl-encode-mule-utf-16-le/be,
      (translate-character ucs-mule-to-mule-unicode r0 r1)
but the safe-charsets property lists only these:
      ascii
      eight-bit-control
      latin-iso8859-1
      mule-unicode-0100-24ff
      mule-unicode-2500-33ff
      mule-unicode-e000-ffff
thus, they can't be regarded as a safe coding system for
them.
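The mismatch described above can be illustrated with a small sketch.
It is not code from utf-16.el or ucs-tables.el: the helper name
`my-missing-safe-charsets' and both `my-...' variables are hypothetical,
and the list of encodable charsets is an assumption made only for the
example (the safe-charsets list is the one quoted above).

```elisp
;; Hypothetical helper, not part of utf-16.el or ucs-tables.el.
(defun my-missing-safe-charsets (encodable safe)
  "Return the charsets in ENCODABLE that are absent from SAFE.
Both arguments are lists of charset symbols."
  (let (missing)
    (dolist (cs encodable (nreverse missing))
      (unless (memq cs safe)
        (push cs missing)))))

;; The safe-charsets list quoted above for utf-16-le/be.
(defvar my-utf-16-safe-charsets
  '(ascii eight-bit-control latin-iso8859-1
    mule-unicode-0100-24ff mule-unicode-2500-33ff
    mule-unicode-e000-ffff))

;; Assumption for illustration only: charsets the encoder could handle
;; once `ucs-mule-to-mule-unicode' is applied.
(defvar my-utf-16-encodable
  (append my-utf-16-safe-charsets
          '(latin-iso8859-2 latin-iso8859-3)))

(my-missing-safe-charsets my-utf-16-encodable my-utf-16-safe-charsets)
;; => (latin-iso8859-2 latin-iso8859-3)
```

Any charset the helper reports is one the encoder would accept but that
the property does not advertise, which is the situation described for
utf-16-le/be.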

> I'm afraid I've had enough of all this,

Yah, you have done an excellent hack!  When I implemented
the translation table stuff, I didn't expect that it could be
used this thoroughly.

> and I doubt it's worth more effort anyhow.  Especially
> after all the FUD about them, the Mule additions probably
> won't get used much unless they're the default, even by
> i18n people, unfortunately.

I thought including ucs-tables etc. in RC was at least
for making unify-on-encoding the default, INCLUDING for utf-8.

---
Ken'ichi HANDA
handa@etl.go.jp


* Re: Several serious problems
  2002-08-29 13:25   ` Kenichi Handa
@ 2002-08-29 17:32     ` Stefan Monnier
  2002-08-29 23:15       ` Dave Love
  2002-08-30  6:09       ` Richard Stallman
  2002-08-29 23:09     ` Several serious problems Dave Love
                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 63+ messages in thread
From: Stefan Monnier @ 2002-08-29 17:32 UTC (permalink / raw)
  Cc: d.love, monnier+gnu/emacs, keichwa, rms, emacs-devel

> I noticed those `fixme's.   Yes, it is better to solve all
> of them, but, for the moment, I want to concentrate on
> fixing the problem of RC.

I think the only "problem" in RC is that latin-N chars cannot
be saved to utf-8.

> >>  I had thought that the current code was
> >>  the same as what Dave had, but Dave's statement above
> >>  says that it's not.
> 
> > Well, now I check, utf-8.el in the RC branch seems to be as I left it,
> > which is what rms (I think) told me to do.  As far as I can tell, its
> > safe-charsets property is correct,
> 
> The safe-charsets property of utf-8 in RC is this:
> 
> ascii eight-bit-control eight-bit-graphic latin-iso8859-1
> mule-unicode-0100-24ff mule-unicode-2500-33ff
> mule-unicode-e000-ffff ethiopic tibetan thai-tis620
> katakana-jisx0201 ipa chinese-sisheng lao
> vietnamese-viscii-lower vietnamese-viscii-upper
> 
> It doesn't contain latin-iso8859-[23...].

And it's correct as long as ucs-tables is not loaded.
And since RC is "only bug-fixes" it's important that we don't make
any change outside of ucs-tables.el except for bug-fixes, so
we can't just change the safe-charsets property.  I.e.
we have to either accept the current situation or else
change the safe-charsets property of utf-8 from ucs-tables.el.
Unless RMS accepts to make changes to utf-8.el which are not
bug-fixes but improvements to the utf-8 support.

On the trunk it's easier since we just changed the safe-charsets
property directly in utf-8.el and made sure that ucs-tables.el
is loaded when necessary.


	Stefan


* Re: Several serious problems
  2002-08-29 13:25   ` Kenichi Handa
  2002-08-29 17:32     ` Stefan Monnier
@ 2002-08-29 23:09     ` Dave Love
  2002-08-30  6:11       ` Richard Stallman
  2002-08-29 23:17     ` Dave Love
  2002-08-30  6:09     ` Richard Stallman
  3 siblings, 1 reply; 63+ messages in thread
From: Dave Love @ 2002-08-29 23:09 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, keichwa, rms, emacs-devel

Kenichi Handa <handa@etl.go.jp> writes:

> In article <rzqlm6ybz38.fsf@albion.dl.ac.uk>,
>   Dave Love <d.love@dl.ac.uk> writes:
> > As far as I know, what's installed in the trunk behaves correctly, but
> > I'm not using that code
> 
> Why aren't you using that code?

I don't want to use an unstable Emacs with all sorts of things I don't
understand.

> I noticed those `fixme's.   Yes, it is better to solve all
> of them, but, for the moment, I want to concentrate on
> fixing the problem of RC.

I was trying to sort out RC, but I don't understand this problem.

> The safe-charsets property of utf-8 in RC is this:
> 
> ascii eight-bit-control eight-bit-graphic latin-iso8859-1
> mule-unicode-0100-24ff mule-unicode-2500-33ff
> mule-unicode-e000-ffff ethiopic tibetan thai-tis620
> katakana-jisx0201 ipa chinese-sisheng lao
> vietnamese-viscii-lower vietnamese-viscii-upper

I see:

 '((safe-charsets
    ascii
    eight-bit-control
    eight-bit-graphic
    latin-iso8859-1
    mule-unicode-0100-24ff
    mule-unicode-2500-33ff
    mule-unicode-e000-ffff)

in what appears to be revision 1.9.4.2 with sticky tag `EMACS_21_1_RC'.

> It doesn't contain latin-iso8859-[23...].

Indeed.

> The complaint is that the coding-system utf-8 can't encode
> latin-2 characters in RC even if loadup.el has these lines.

Indeed, but the complaint seemed to be that it could encode latin-2
and safe-charsets didn't say so.  That's why I thought someone had
changed it.

> The reason, as far as I can see, is that the ccl program
> `ccl-encode-mule-utf-8' doesn't have this line near
> its head.
> 
> 	   (translate-character ucs-mule-to-mule-unicode r0 r1))

Yes.  

> So, even if we set up the translation table
> `ucs-mule-to-mule-unicode' at loadup time, it is not used in
> utf-8.

Nor in other CCL coding systems.

> Hmmm, I think I now understand the situation in RC.  It can unify
> charsets between iso-8859-X, but utf-8 can't encode
> iso-8859-X (intentionally), correct?

Yes.

> Richard, is that what you asked Dave to install for RC?

I'm pretty sure ucs-tables was only allowed to be installed because
just adding the file couldn't break anything.

> I think RC should also allow utf-8 to encode 8859-X
> correctly like in HEAD.  I see no harm in it.

I'm sure there's no harm in my Mule changes generally, but that's not
what everyone has been told, unfortunately.  

> > I think I unilaterally added some other things (a utf-8 language
> > environment and utf-16.el?) since they addressed somewhat misleading
> > entries in PROBLEMS and the arguments against the Unicode support are
> > either demonstrably wrong or spurious IMNSHO.
> 
> I don't object to that.

I didn't think you would.

> I found one problem with utf-16.
> It seems that utf-16-le/be can handle 8859-X correctly
> because of this line in ccl-encode-mule-utf-16-le/be,
>       (translate-character ucs-mule-to-mule-unicode r0 r1)

I guess that's an error, and I should have taken that out for
consistency with utf-8.

> > I'm afraid I've had enough of all this,
> 
> Yah, you have done an excellent hack!

I don't mean anything to do with useful work.  It's after being told
for so long it's impossible/broken/not wanted, wasting time, and then
having to sort out the situation in adverse circumstances.  It's very
unfortunate not to have an active maintainer for Mule generally.

> When I implemented the translation table stuff, I didn't expect that it
> could be used this thoroughly.

Strange!  I thought that was exactly what they were for, and the only
thing that was missing initially to satisfy the complaining Europeans
was char-coding-system-table.  The names were even
`...-unification-...' originally.

> I thought including ucs-tables etc. in RC was at least
> for making unify-on-encoding the default, INCLUDING for utf-8.

I've no idea.  As far as I remember, it was due to pressure from users
of both Latin-1 and Latin-9 who must have actually tried it despite
what they were told.  I was surprised it was eventually allowed in.


* Re: Several serious problems
  2002-08-29 17:32     ` Stefan Monnier
@ 2002-08-29 23:15       ` Dave Love
  2002-08-30 14:36         ` Stefan Monnier
  2002-08-30  6:09       ` Richard Stallman
  1 sibling, 1 reply; 63+ messages in thread
From: Dave Love @ 2002-08-29 23:15 UTC (permalink / raw)
  Cc: Kenichi Handa, keichwa, rms, emacs-devel

"Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu> writes:

> I think the only "problem" in RC is that latin-N chars cannot
> be saved to utf-8.

In that case, I wasted considerable time...  I know, for instance,
that people whinge that keyboard input doesn't conform to the buffer
file coding system, and that other coding systems &c are needed --
windows-1252 probably most importantly.

> > It doesn't contain latin-iso8859-[23...].
> 
> And it's correct as long as ucs-tables is not loaded.

What handa showed isn't correct.  The utf-8 coding system on the RC
branch doesn't encode lao, for instance.

> And since RC is "only bug-fixes"

For some value of `bug fix'...

> it's important that we don't make
> any change outside of ucs-tables.el except for bug-fixes, so
> we can't just change the safe-charsets property.

I don't understand.  Of course you can't just change safe-charsets --
it has to reflect what the coding system actually encodes.

> On the trunk it's easier since we just changed the safe-charsets
> property directly in utf-8.el and made sure that ucs-tables.el
> is loaded when necessary.

Last I looked, it was preloaded.  I don't see why it shouldn't be, and
it would have been designed to be if I hadn't had to write it just as
an add-on initially.


* Re: Several serious problems
  2002-08-29 13:25   ` Kenichi Handa
  2002-08-29 17:32     ` Stefan Monnier
  2002-08-29 23:09     ` Several serious problems Dave Love
@ 2002-08-29 23:17     ` Dave Love
  2002-08-30  6:11       ` Richard Stallman
  2002-08-30  6:09     ` Richard Stallman
  3 siblings, 1 reply; 63+ messages in thread
From: Dave Love @ 2002-08-29 23:17 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, keichwa, rms, emacs-devel

Kenichi Handa <handa@etl.go.jp> writes:

> The safe-charsets property of utf-8 in RC is this:
> 
> ascii eight-bit-control eight-bit-graphic latin-iso8859-1
> mule-unicode-0100-24ff mule-unicode-2500-33ff
> mule-unicode-e000-ffff ethiopic tibetan thai-tis620
> katakana-jisx0201 ipa chinese-sisheng lao
> vietnamese-viscii-lower vietnamese-viscii-upper

I've just realized that you probably used coding-system-get, and
there's a problem with what I installed.  I didn't cut out this from
my working version:

*** ucs-tables.el.~1.12.4.1.~	Wed Jul  3 15:38:14 2002
--- ucs-tables.el	Thu Aug 29 19:27:15 2002
***************
*** 2443,2453 ****
  	       (coding-system-put cs 'translation-table-for-input cs)))))
      (optimize-char-table ucs-mule-to-mule-unicode)
      (dolist (c safe-charsets)
!       (aset table (make-char c) t))
!     (coding-system-put 'mule-utf-8 'safe-charsets
! 		       (append (coding-system-get 'mule-utf-8 'safe-charsets)
! 			       safe-charsets))
!     (register-char-codings 'mule-utf-8 table)))
  
  (defvar translation-table-for-input (make-translation-table))
  
--- 2443,2449 ----
  	       (coding-system-put cs 'translation-table-for-input cs)))))
      (optimize-char-table ucs-mule-to-mule-unicode)
      (dolist (c safe-charsets)
!       (aset table (make-char c) t))))
  
  (defvar translation-table-for-input (make-translation-table))


* Re: Several serious problems
  2002-08-26 16:15     ` Stefan Monnier
@ 2002-08-29 23:18       ` Dave Love
  2002-08-30 14:36         ` Stefan Monnier
  0 siblings, 1 reply; 63+ messages in thread
From: Dave Love @ 2002-08-29 23:18 UTC (permalink / raw)
  Cc: Kenichi Handa, rms, keichwa, emacs-devel

"Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu> writes:

> It might have been my "fault".  I think it's because I expect(ed)
> unify-8859-on-encoding-mode to disappear (because there's no benefit
> in turning it off, except for working around some bugs maybe).

What bugs?


* Re: Several serious problems
  2002-08-26 13:17   ` Kenichi Handa
  2002-08-26 16:15     ` Stefan Monnier
@ 2002-08-29 23:19     ` Dave Love
  1 sibling, 0 replies; 63+ messages in thread
From: Dave Love @ 2002-08-29 23:19 UTC (permalink / raw)
  Cc: rms, monnier+gnu/emacs, keichwa, emacs-devel

Kenichi Handa <handa@etl.go.jp> writes:

> I don't know if they are the same as what Dave currently
> has.

I tried to install all the relevant stuff I had, but for the CVS head,
it's modified versions of what I've actually been using, and is
basically untested.  I wanted someone who was actually using that code
base to install it and test it, but no-one could or would -- I can't
remember, but rms leant on me to install it.

> But I have not checked whether it really works as
> expected.  I believe Dave has done that.

Only in more-or-less Emacs 21.2.

> And I don't understand why many of those functions/variables
> are designed the way they currently are.  For instance,
> 
> (1) Why does loadup.el have this code:
> 	(ucs-unify-8859 'encode-only)
> instead of:
> 	(unify-8859-on-encoding-mode 1)

Indeed.  I didn't do that.  The obvious thing to do is to change the
default in the defcustom, if ucs-tables is preloaded.

> (2) Why doesn't utf-8-subst.el provide mappings of
>     non-Chinese characters for ksc, gb, and jisx charsets?
>     The documentation of utf-8-translate-cjk says:
> ----------------------------------------------------------------------
> Whether the `mule-utf-8' coding system should encode many CJK characters.
> 
> Enabling this loads tables which enable the coding system to encode
> characters in the charsets `korean-ksc5601', `chinese-gb2312' and
> `japanese-jisx0208', and to decode the corresponding unicodes into
> ...
> ----------------------------------------------------------------------
> but, currently only Chinese characters in those charsets are
> handled.

I didn't realize that.  It may be coincidence.  What should be
translated is the set of characters

(japanese-jisx0208 ∪ chinese-gb2312 ∪ korean-ksc5601) \ mule-unicode-2500-33ff
                   ^                                  ^
                   union                              set difference

according to the Mule-UCS tables -- I just took the relevant codes
from there above U+33FF.  Perhaps that isn't how it actually is.

It needs someone with an interest in the CJK range to redo that stuff
anyhow; it shouldn't hardwire japanese-jisx0208 as the
preferred set, the sets used should probably be configurable, and it
should allow translating the relevant characters below U+3400.  (I
didn't think much about how best to do that without keeping large
tables on the heap that aren't actually used to do the translation.)
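The set expression above can be written out as a short sketch.  This is
not code from utf-8-subst.el: the function name `my-cjk-translation-set'
is hypothetical, and the code points in the example call are toy values
standing in for the real charset contents.

```elisp
;; Hypothetical helper for illustration; not taken from utf-8-subst.el.
(defun my-cjk-translation-set (jisx gb ksc covered)
  "Return the union of JISX, GB and KSC minus COVERED.
Each argument is a list of code points (integers).  The result
keeps first-seen order and contains no duplicates."
  (let (result)
    (dolist (chars (list jisx gb ksc) (nreverse result))
      (dolist (ch chars)
        ;; Skip code points already covered elsewhere or already collected.
        (unless (or (memql ch covered) (memql ch result))
          (push ch result))))))

;; Toy data only; the real charsets hold thousands of characters.
(my-cjk-translation-set
 '(#x3042 #x4e00)   ; stand-in for japanese-jisx0208
 '(#x4e00 #x4e01)   ; stand-in for chinese-gb2312
 '(#xac00 #x2500)   ; stand-in for korean-ksc5601
 '(#x2500))         ; stand-in for mule-unicode-2500-33ff
```

Because the first list wins on duplicates, the argument order also
expresses which charset is preferred; a configurable version would let
the caller choose that order.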

> (3) Why is utf-8-translate-cjk a variable, not a minor-mode
>     like unify-8859-on-(de/en)coding-mode?

I think because it can't be turned off.

>     Or, why is the
>     latter not a simple variable?   By the way, it seems
>     that once we customize utf-8-translate-cjk to t,
>     customizing it back to nil doesn't cancel the translation.
> 
> (4) It seems that the variable name
>     utf-8-fragment-on-decoding is not appropriate because it
>     is also used in utf-16.el.  Perhaps
>     ucs-fragment-on-decoding is better.

Probably.  It was defined before I wrote utf-16.el.  Much of that
stuff would have been written differently for installation in 21.1,
but it was done during the campaign against anything Unicode-based, so
that users could have it in Emacs 21.2 as conveniently as possible.

> (5) It seems that mule-utf-16 can handle the same range of
>     characters as mule-utf-8, but its `safe-charsets' property
>     doesn't contain, for instance, `latin-iso8859-2'.
>     Perhaps this is simply a bug that can be fixed easily.

Yes.  The coding system needs to register the relevant translation
table(s) for safe-chars, that would have to be updated in sync with
any changes.  I don't know why that didn't get done.


* Re: Several serious problems
  2002-08-29 17:32     ` Stefan Monnier
  2002-08-29 23:15       ` Dave Love
@ 2002-08-30  6:09       ` Richard Stallman
  2002-08-31 17:30         ` Dave Love
  1 sibling, 1 reply; 63+ messages in thread
From: Richard Stallman @ 2002-08-30  6:09 UTC (permalink / raw)
  Cc: handa, d.love, monnier+gnu/emacs, keichwa, emacs-devel

    And since RC is "only bug-fixes" it's important that we don't make
    any change outside of ucs-tables.el except for bug-fixes, so
    we can't just change the safe-charsets property.

I don't follow the logic here.  Why can't we just change the
safe-charsets property?  Is there some obstacle to doing that?  Do you
think other things would fail to work if we did?  Are other changes
needed as well to make it work?

    Unless RMS accepts to make changes to utf-8.el which are not
    bug-fixes but improvements to the utf-8 support.

If we can't save latin-N characters as utf-8, that is a bug.
If the fix is safe and clear, we may as well install it in RC.


* Re: Several serious problems
  2002-08-29 13:25   ` Kenichi Handa
                       ` (2 preceding siblings ...)
  2002-08-29 23:17     ` Dave Love
@ 2002-08-30  6:09     ` Richard Stallman
  3 siblings, 0 replies; 63+ messages in thread
From: Richard Stallman @ 2002-08-30  6:09 UTC (permalink / raw)
  Cc: d.love, monnier+gnu/emacs, keichwa, emacs-devel

    Hmmm, I think I now understand the situation in RC.  It can unify
    charsets between iso-8859-X, but utf-8 can't encode
    iso-8859-X (intentionally), correct?

    Richard, is that what you asked Dave to install for RC?

I can't remember after this much time has gone by.
Chances are I never knew about this specific issue
and that I did not say anything to him about it one way or another,
but I can't remember.

If you can make this case work with a clean and safe change,
please do.


* Re: Several serious problems
  2002-08-29 23:09     ` Several serious problems Dave Love
@ 2002-08-30  6:11       ` Richard Stallman
  2002-09-04 17:21         ` Dave Love
  0 siblings, 1 reply; 63+ messages in thread
From: Richard Stallman @ 2002-08-30  6:11 UTC (permalink / raw)
  Cc: handa, monnier+gnu/emacs, keichwa, emacs-devel

    > I think RC should also allow utf-8 to encode 8859-X
    > correctly like in HEAD.  I see no harm in it.

    I'm sure there's no harm in my Mule changes generally, but that's not
    what everyone has been told, unfortunately.  

We would not have installed your changes in the trunk if they were
harmful.  The issue about RC is not harm, it is risk of bugs.  Any
change has a risk of bugs, even if it is a great improvement.  But the
risk is not proportional to the improvement; they depend on different
factors.  In RC we try to keep this risk down.


* Re: Several serious problems
  2002-08-29 23:17     ` Dave Love
@ 2002-08-30  6:11       ` Richard Stallman
  2002-08-31 17:31         ` Dave Love
  0 siblings, 1 reply; 63+ messages in thread
From: Richard Stallman @ 2002-08-30  6:11 UTC (permalink / raw)
  Cc: handa, monnier+gnu/emacs, keichwa, emacs-devel

    I've just realized that you probably used coding-system-get, and
    there's a problem with what I installed.  I didn't cut out this from
    my working version:

Is this a change we should install in RC now?


* Re: Several serious problems
  2002-08-29 23:15       ` Dave Love
@ 2002-08-30 14:36         ` Stefan Monnier
  2002-09-04 17:23           ` Dave Love
  0 siblings, 1 reply; 63+ messages in thread
From: Stefan Monnier @ 2002-08-30 14:36 UTC (permalink / raw)
  Cc: Stefan Monnier, Kenichi Handa, keichwa, rms, emacs-devel

> > I think the only "problem" in RC is that latin-N chars cannot
> > be saved to utf-8.
> 
> In that case, I wasted considerable time...  I know, for instance,
> that people whinge that keyboard input doesn't conform to the buffer
> file coding system, and that other coding systems &c are needed --
> windows-1252 probably most importantly.

By "in RC" I meant "in RC as it currently stands", not "in RC before you
installed ucs-tables.el".  As you know, I'm a big fan of ucs-tables.el.
Please don't try and find offense where there isn't, it makes me rather sad.

> > > It doesn't contain latin-iso8859-[23...].
> > 
> > And it's correct as long as ucs-tables is not loaded.
> 
> What handa showed isn't correct.  The utf-8 coding system on the RC
> branch doesn't encode lao, for instance.

I was referring to what's in the utf-8.el file.

> > And since RC is "only bug-fixes"
> For some value of `bug fix'...

Obviously.

> > it's important that we don't make
> > any change outside of ucs-tables.el except for bug-fixes, so
> > we can't just change the safe-charsets property.
> 
> I don't understand.  Of course you can't just change safe-charsets --
> it has to reflect what the coding system actually encodes.

IIRC, on the trunk you changed utf-8.el directly and simply enforced
that ucs-tables.el be loaded when necessary.


	Stefan

* Re: Several serious problems
  2002-08-29 23:18       ` Dave Love
@ 2002-08-30 14:36         ` Stefan Monnier
  0 siblings, 0 replies; 63+ messages in thread
From: Stefan Monnier @ 2002-08-30 14:36 UTC (permalink / raw)
  Cc: Stefan Monnier, Kenichi Handa, rms, keichwa, emacs-devel

> > It might have been my "fault".  I think it's because I expect(ed)
> > unify-8859-on-encoding-mode to disappear (because there's no benefit
> > in turning it off, except for working around some bugs maybe).
> 
> What bugs?

None that I know of.  I meant the sentence to mean "to be able to turn
it off in case a bug showed up".


	Stefan

* Re: Several serious problems
  2002-08-30  6:09       ` Richard Stallman
@ 2002-08-31 17:30         ` Dave Love
  2002-09-02  0:01           ` Richard Stallman
  0 siblings, 1 reply; 63+ messages in thread
From: Dave Love @ 2002-08-31 17:30 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, handa, keichwa, emacs-devel

Richard Stallman <rms@gnu.org> writes:

> I don't follow the logic here.  Why can't we just change the
> safe-charsets property?

If you change safe-charsets without changing what the CCL actually
encodes, you're just courting data corruption.
E.g. find-coding-systems-... will report utf-8 for lao text, but if
you encode it, you'll just get U+FFFDs.
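
One way to catch the mismatch described above is to check not only that
a coding system is advertised for a string, but also that encoding and
decoding actually round-trip it.  The following is a minimal sketch,
assuming a recent Unicode-based Emacs in which decoding restores the
same characters; `my-coding-system-honest-p' is a hypothetical helper
name, not an existing function.  In Emacs 21 a unified encoding decodes
back to the mule-unicode charsets, so there one would instead look for
the U+FFFD bytes in the encoded output.

```elisp
;; Hypothetical helper for illustration; not part of Emacs.
(defun my-coding-system-honest-p (str coding)
  "Return t if CODING is advertised for STR and really round-trips it.
CODING should be a base coding system name, as returned by
`find-coding-systems-string'."
  (let ((candidates (find-coding-systems-string str)))
    (and
     ;; For pure ASCII text the list is just (undecided), meaning
     ;; that any coding system will do.
     (or (equal candidates '(undecided))
         (memq coding candidates))
     ;; The advertised capability must match what the encoder does.
     (string= str (decode-coding-string
                   (encode-coding-string str coding)
                   coding))
     t)))
```

A coding system whose safe-charsets promised more than its encoder
delivers would pass the first check and fail the second.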

> If we can't save latin-N characters as utf-8, that is a bug.

[You argued against that before.]

Why just Latin-N, and why just as utf-8?  There shouldn't be anything
special about Latin.  That version of utf-8.el can't encode
cyrillic-iso8859-5, for instance, and the Cyrillic coding systems
can't encode the relevant characters from mule-unicode-0100-24ff.

Is it also a bug that utf-8 can't encode the CJK space or that the CJK
sets can't encode equivalent characters from other sets (which I
haven't tried to address and people probably don't care about)?

* Re: Several serious problems
  2002-08-30  6:11       ` Richard Stallman
@ 2002-08-31 17:31         ` Dave Love
  2002-09-02  0:01           ` Richard Stallman
  0 siblings, 1 reply; 63+ messages in thread
From: Dave Love @ 2002-08-31 17:31 UTC (permalink / raw)
  Cc: handa, monnier+gnu/emacs, keichwa, emacs-devel

Richard Stallman <rms@gnu.org> writes:

>     I've just realized that you probably used coding-system-get, and
>     there's a problem with what I installed.  I didn't cut out this from
>     my working version:
> 
> Is this a change we should install in RC now?

That depends on whether you include code in utf-8.el that encodes
those charsets.  If not, you need that change.

* Re: Several serious problems
  2002-08-31 17:31         ` Dave Love
@ 2002-09-02  0:01           ` Richard Stallman
  2002-09-02  1:28             ` Kenichi Handa
  0 siblings, 1 reply; 63+ messages in thread
From: Richard Stallman @ 2002-09-02  0:01 UTC (permalink / raw)
  Cc: handa, monnier+gnu/emacs, keichwa, emacs-devel

    That depends on whether you include code in utf-8.el that encodes
    those charsets.  If not, you need that change.

In that case, I will install that change presently, and then we can
study the question of whether to include the code in utf-8.el instead.

What does that code in utf-8.el do, and how safe a change is it?

* Re: Several serious problems
  2002-08-31 17:30         ` Dave Love
@ 2002-09-02  0:01           ` Richard Stallman
  2002-09-04 17:15             ` Dave Love
  0 siblings, 1 reply; 63+ messages in thread
From: Richard Stallman @ 2002-09-02  0:01 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, handa, keichwa, emacs-devel

    Why just Latin-N, and why just as utf-8?

I am talking about that issue because that is the issue someone
raised.  I don't know what other issue there is.  Could you tell us?

					      There shouldn't be anything
    special about Latin.

Latin-N character sets are very important in practice.  It is also
possible that they are easier to handle than some other character sets
(but I don't know whether that is the case here).  Those two factors
are directly relevant to whether it is worth fixing this case in RC.
The factors might be different for another character set.

    Is it also a bug that utf-8 can't encode the CJK space or that the CJK
    sets can't encode equivalent characters from other sets (which I
    haven't tried to address and people probably don't care about)?

That is certainly a bug.  The question is whether this bug may not be
worth fixing in RC.

* Re: Several serious problems
  2002-09-02  0:01           ` Richard Stallman
@ 2002-09-02  1:28             ` Kenichi Handa
  2002-09-05 13:41               ` Dave Love
  2002-09-10 16:36               ` Richard Stallman
  0 siblings, 2 replies; 63+ messages in thread
From: Kenichi Handa @ 2002-09-02  1:28 UTC (permalink / raw)
  Cc: d.love, monnier+gnu/emacs, keichwa, emacs-devel

In article <E17lefC-0003IF-00@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:
>     That depends on whether you include code in utf-8.el that encodes
>     those charsets.  If not, you need that change.

> In that case, I will install that change presently, and then we can
> study the question of whether to include the code in utf-8.el instead.

> What does that code in utf-8.el do, and how safe a change is it?

It defines two CCL programs, one to decode and one to encode
UTF-8 byte sequences, and creates the coding system mule-utf-8
from those CCL programs.

I'll attach the necessary change to enable RC's utf-8 to
encode latin-X plus alpha (e.g. thai).  The docstring of
mule-utf-8 may need improvement.

As the change is very small and that code has been in HEAD
for more than one month, I think the change is quite safe.
I recommend installing it in RC.

I also checked the code to some extent with this test suite.

(dolist (charset (delq 'ascii
		       (delq 'eight-bit-control
			     (delq 'eight-bit-graphic
				   (coding-system-get 'mule-utf-8
						      'safe-charsets)))))
  (let ((dimension (charset-dimension charset))
	str)
    (if (= dimension 1)
	(setq str (string (make-char charset 33) (make-char charset 34)))
      (setq str (string (make-char charset 33 33) (make-char charset 33 34))))
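    ;; Fail unless mule-utf-8 is advertised for STR *and* encoding STR
    ;; produces no replacement character.
    (or (and (memq 'mule-utf-8 (find-coding-systems-string str))
             (not (string-match "\357\277\275" ; UTF-8 form of U+FFFD
                                (encode-coding-string str 'mule-utf-8))))
        (error "%s is not supported" charset))))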
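
The octal string in the test above is the UTF-8 encoding of U+FFFD
REPLACEMENT CHARACTER, which is what appears in the output when a
character cannot be translated.  A minimal check of that byte sequence
follows, assuming a Unicode-based Emacs (23 or later) where
`(string #xFFFD)' yields that character; in the Emacs 21 code discussed
here one would presumably build it with `(decode-char 'ucs #xFFFD)'.
`my-utf-8-replacement-bytes' is a name made up for illustration.

```elisp
;; Hypothetical helper, not part of utf-8.el.
(defun my-utf-8-replacement-bytes ()
  "Return the UTF-8 encoding of U+FFFD as a unibyte string."
  (encode-coding-string (string #xFFFD) 'utf-8))

;; Compare against the octal literal used in the test suite.
(equal (my-utf-8-replacement-bytes) "\357\277\275")
```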

---
Ken'ichi HANDA
handa@etl.go.jp

*** utf-8.el.~1.9.4.2.~	Tue Jul 23 13:54:13 2002
--- utf-8.el	Mon Sep  2 10:28:26 2002
***************
*** 269,275 ****
       (loop
        (if (r5 < 0)
  	  ((r1 = -1)
! 	   (read-multibyte-character r0 r1))
  	(;; We have already done read-multibyte-character.
  	 (r0 = r5)
  	 (r1 = r6)
--- 269,277 ----
       (loop
        (if (r5 < 0)
  	  ((r1 = -1)
! 	   (read-multibyte-character r0 r1)
! 	   (translate-character ucs-mule-to-mule-unicode r0 r1))
! 
  	(;; We have already done read-multibyte-character.
  	 (r0 = r5)
  	 (r1 = r6)
***************
*** 392,397 ****
--- 394,423 ----
     mule-unicode-0100-24ff
     mule-unicode-2500-33ff
     mule-unicode-e000-ffff
+    latin-iso8859-2 (*)
+    latin-iso8859-3 (*)
+    latin-iso8859-4 (*)
+    cyrillic-iso8859-5 (*)
+    arabic-iso8859-6 (*)
+    greek-iso8859-7 (*)
+    hebrew-iso8859-8 (*)
+    latin-iso8859-9 (*)
+    latin-iso8859-14 (*)
+    latin-iso8859-15 (*)
+    chinese-sisheng (*)
+    ethiopic (*)
+    ipa (*)
+    lao (*)
+    katakana-jisx0201 (*)
+    thai-tis620 (*)
+    tibetan (*)
+    vietnamese-viscii-lower (*)
+    vietnamese-viscii-upper (*)
+ 
+ Among them, the charsets labeled \"(*)\" are supported only on
+ encoding.  That means, they are correctly encoded to UTF-8, but are
+ decoded back to charsets latin-iso8859-1, mule-unicode-0100-24ff, or
+ mule-unicode-2500-33ff, not to the original charsets.
  
  Unicode characters out of the ranges U+0000-U+33FF and U+E200-U+FFFF
  are decoded into sequences of eight-bit-control and eight-bit-graphic
***************
*** 409,415 ****
      latin-iso8859-1
      mule-unicode-0100-24ff
      mule-unicode-2500-33ff
!     mule-unicode-e000-ffff)
     (mime-charset . utf-8)
     (coding-category . coding-category-utf-8)
     (valid-codes (0 . 255))))
--- 435,460 ----
      latin-iso8859-1
      mule-unicode-0100-24ff
      mule-unicode-2500-33ff
!     mule-unicode-e000-ffff
!     latin-iso8859-2 
!     latin-iso8859-3 
!     latin-iso8859-4 
!     cyrillic-iso8859-5 
!     arabic-iso8859-6 
!     greek-iso8859-7 
!     hebrew-iso8859-8 
!     latin-iso8859-9 
!     latin-iso8859-14 
!     latin-iso8859-15 
!     chinese-sisheng 
!     ethiopic 
!     ipa 
!     lao 
!     katakana-jisx0201 
!     thai-tis620 
!     tibetan 
!     vietnamese-viscii-lower 
!     vietnamese-viscii-upper)
     (mime-charset . utf-8)
     (coding-category . coding-category-utf-8)
     (valid-codes (0 . 255))))

* Re: Several serious problems
  2002-09-02  0:01           ` Richard Stallman
@ 2002-09-04 17:15             ` Dave Love
  2002-09-08 12:54               ` Richard Stallman
  0 siblings, 1 reply; 63+ messages in thread
From: Dave Love @ 2002-09-04 17:15 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, handa, keichwa, emacs-devel

Richard Stallman <rms@gnu.org> writes:

>     Why just Latin-N, and why just as utf-8?
> 
> I am talking about that issue because that is the issue someone
> raised.  I don't know what other issue there is.  Could you tell us?

The issue is just the same for the other charsets that have
translation tables in the head code, and for other CCL coding systems.
For instance, the RC version of mule-utf-8 doesn't translate
cyrillic-iso8859-5, and the Cyrillic coding systems don't translate
mule-unicode-0100-24ff.

> Latin-N character sets are very important in practice.

I think the only thing which distinguishes Latin-N is that Latin-1 is
(was?) the Internet default and its code points are a Unicode subset.
I see no reason to treat, say, Latin-2 as more important than
Cyrillic; I guess it has fewer users for a start.  I also guess
windows-1252 is more widely used than Latin-1, like it or not.

> It is also possible that they are easier to handle than some other
> character sets (but I don't know whether that is the case here).

They're treated identically to the others that ucs-tables handles.
You have to work to remove them.  (The sets that are handled are just
the ones I could conveniently make tables for.)

>     Is it also a bug that utf-8 can't encode the CJK space or that the CJK
>     sets can't encode equivalent characters from other sets (which I
>     haven't tried to address and people probably don't care about)?
> 
> That is certainly a bug.

I actually agree with your previous opinion that lack of translations
isn't a bug as such, despite what PROBLEMS implied -- the features
behave as designed and documented.

I definitely don't agree that general lack of unification of Japanese
characters is a bug.  I got detailed information on the problems with
jisx mappings to Unicode, and we were asked not to confuse matters by
providing jisx0213 tables in Emacs 22, which is designed not to force
that.  (The jisx0208 that utf-8-subst.el uses is a case in point, but
I assume the Mule-UCS table I used is what Japanese linguists agree
on.)  It's also not clear that one should unify double-width
characters with iso8859, for instance.

* Re: Several serious problems
  2002-08-30  6:11       ` Richard Stallman
@ 2002-09-04 17:21         ` Dave Love
  0 siblings, 0 replies; 63+ messages in thread
From: Dave Love @ 2002-09-04 17:21 UTC (permalink / raw)
  Cc: handa, monnier+gnu/emacs, keichwa, emacs-devel

Richard Stallman <rms@gnu.org> writes:

> We would not have installed your changes in the trunk if they were
> harmful.

I was referring to what people have been told about them, including in
PROBLEMS.

[I'm not sure you'd actually know a priori whether what I installed
was harmful; it wasn't properly tested.  Obviously I think it's OK
modulo the bugs I haven't heard about, but that doesn't mean it
couldn't corrupt data.]

> The issue about RC is not harm, it is risk of bugs.  Any
> change has a risk of bugs, even if it is a great improvement.

Of course, and I'm surprised at some of what's been added.

> But the risk is not proportional to the improvement; they depend on
> different factors.  In RC we try to keep this risk down.

Of course.  I happen to be in the best position to evaluate the
factors in this case.

* Re: Several serious problems
  2002-08-30 14:36         ` Stefan Monnier
@ 2002-09-04 17:23           ` Dave Love
  0 siblings, 0 replies; 63+ messages in thread
From: Dave Love @ 2002-09-04 17:23 UTC (permalink / raw)
  Cc: Kenichi Handa, keichwa, rms, emacs-devel

"Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu> writes:

> > In that case, I wasted considerable time...  I know, for instance,
> > that people whinge that keyboard input doesn't conform to the buffer
> > file coding system, and that other coding systems &c are needed --
> > windows-1252 probably most importantly.
> 
> By "in RC" I meant "in RC as it currently stands", not "in RC before you
> installed ucs-tables.el".

So did I, or at least as it stood a few days ago.  I don't understand
this (or the rest of the message).  It's a non sequitur as far as I
can tell.

> Please don't try and find offense where there isn't, it makes me
> rather sad.

I don't know what you mean.  I'm just sticking up for a large set of
users.  However I guess they are likely to find offence if maintainers
dismiss -- or appear to -- m17n features they need.

As far as I know, my opinions are roughly the same as handa's --
apologies if not -- and he was the one proposing more changes in this
case.  I'm glad he eventually gets listened to, anyhow.

* Re: Several serious problems
  2002-09-02  1:28             ` Kenichi Handa
@ 2002-09-05 13:41               ` Dave Love
  2002-09-05 23:32                 ` Kenichi Handa
  2002-09-10 16:36               ` Richard Stallman
  1 sibling, 1 reply; 63+ messages in thread
From: Dave Love @ 2002-09-05 13:41 UTC (permalink / raw)
  Cc: rms, monnier+gnu/emacs, keichwa, emacs-devel

Kenichi Handa <handa@etl.go.jp> writes:

> + Among them, the charsets labeled \"(*)\" are supported only on
> + encoding.

I assume they still are only encodable if unify-8859-on-encoding-mode
is on.

> That means, they are correctly encoded to UTF-8, but are
> + decoded back to charsets latin-iso8859-1, mule-unicode-0100-24ff, or
> + mule-unicode-2500-33ff, not to the original charsets.

[That's actually customizable through a decoding table, of course.]

* Re: Several serious problems
  2002-09-05 13:41               ` Dave Love
@ 2002-09-05 23:32                 ` Kenichi Handa
  2002-09-06 11:38                   ` Robert J. Chassell
  2002-09-07 23:19                   ` Dave Love
  0 siblings, 2 replies; 63+ messages in thread
From: Kenichi Handa @ 2002-09-05 23:32 UTC (permalink / raw)
  Cc: rms, monnier+gnu/emacs, keichwa, emacs-devel

In article <rzqy9ag7dux.fsf@albion.dl.ac.uk>, Dave Love <d.love@dl.ac.uk> writes:
> Kenichi Handa <handa@etl.go.jp> writes:
>>  + Among them, the charsets labeled \"(*)\" are supported only on
>>  + encoding.

> I assume they still are only encodable if unify-8859-on-encoding-mode
> is on.

Yes.  But, that mode is on by default in RC too.

>>  That means, they are correctly encoded to UTF-8, but are
>>  + decoded back to charsets latin-iso8859-1, mule-unicode-0100-24ff, or
>>  + mule-unicode-2500-33ff, not to the original charsets.

> [That's actually customizable through a decoding table, of course.]

How about adding this paragraph?

See also the documentation of:
  `unify-8859-on-decoding-mode', `unify-8859-on-encoding-mode',
  `utf-8-fragment-on-decoding'
to customize the behaviour of this coding system."

---
Ken'ichi HANDA
handa@etl.go.jp

* Re: Several serious problems
  2002-09-05 23:32                 ` Kenichi Handa
@ 2002-09-06 11:38                   ` Robert J. Chassell
  2002-09-07 23:19                   ` Dave Love
  1 sibling, 0 replies; 63+ messages in thread
From: Robert J. Chassell @ 2002-09-06 11:38 UTC (permalink / raw)


[This started as a question regarding `unify-8859-on-encoding-mode', but
has evolved to a `themes' related question!]

   Yes.  But, that mode is on by default in RC too.

How do I easily determine whether unify-8859-on-encoding-mode is on or
off by default in a particular instance of Emacs?  Currently, I am
running two instances, one a `plain vanilla' Emacs, and another that
loads a 150kb .emacs file.  I would like to know whether
`unify-8859-on-encoding-mode' is on or off in my `plain vanilla'
Emacs.
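
For a single mode, the question can be answered by inspecting the
mode's variable, which by convention has the same name as the mode
command.  A minimal sketch follows; `my-report-mode-state' is a
hypothetical helper, and it assumes the mode follows that naming
convention.

```elisp
;; Hypothetical helper for illustration; not part of Emacs.
(defun my-report-mode-state (mode)
  "Return a string saying whether minor mode MODE is on in this Emacs.
MODE is a symbol naming both the mode command and its variable."
  (format "%s is %s"
          mode
          (cond ((not (boundp mode)) "not defined here")
                ((symbol-value mode) "on")
                (t "off"))))
```

Evaluating (my-report-mode-state 'unify-8859-on-encoding-mode) in each
running instance would then report the state for that instance.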

I am not actually trying to track down the code (which I have done
anyhow: evidently, `ucs-fragment-8859' sets properties to `nil',
but I don't know whether they are changed elsewhere).

Rather I am looking for a mechanism that reports the complete current
status.

The `mule-diag' command does this for other features, and I thought
it might provide the unify status, too, but it does not.  (Probably
for the good reason that eventually, unify will always be on.)

Instead, it turns out that I am looking for a reporter that tells me
everything about the current state of a particular instance of Emacs,
including variables and properties; in other words, including the
values of `(mule-diag)', `(describe-bindings)',
`(current-frame-configuration)', `load-path', and so on.

This reporter would be useful for anyone working on themes, since it
would mean you could go back to any number of previous states.

(And yes, the resulting status files will be big, perhaps too big for
any normal use.  But right now I am concerned more about the
capability than about optimization.  I don't know whether the
capability merits optimization but think it is a simplification worth
providing to moderately knowledgeable hackers.)

-- 
    Robert J. Chassell            bob@rattlesnake.com  bob@gnu.org
    Rattlesnake Enterprises       http://www.rattlesnake.com
    Free Software Foundation      http://www.gnu.org   GnuPG Key ID: 004B4AC8

* Re: Several serious problems
  2002-09-05 23:32                 ` Kenichi Handa
  2002-09-06 11:38                   ` Robert J. Chassell
@ 2002-09-07 23:19                   ` Dave Love
  2002-09-09  0:21                     ` Richard Stallman
  2002-09-26  4:51                     ` Kenichi Handa
  1 sibling, 2 replies; 63+ messages in thread
From: Dave Love @ 2002-09-07 23:19 UTC (permalink / raw)
  Cc: rms, monnier+gnu/emacs, keichwa, emacs-devel

Kenichi Handa <handa@etl.go.jp> writes:

> Yes.  But, that mode is on by default in RC too.

Gosh.  However, it appears to be done wrongly.  Custom will show it
isn't on, and would turn it off if you tried to turn it on.  Surely if
it's preloaded and meant to be the default, the defcustom initial
value should just be changed.

> How about adding this paragraph?
> 
> See also the documentation of:
>   `unify-8859-on-decoding-mode', `unify-8859-on-encoding-mode',
>   `utf-8-fragment-on-decoding'
> to customize the behaviour of this coding system."

Fine, but that shouldn't be specific to mule-utf-8.  Those variables
affect more coding systems, and other CCL ones should use the
appropriate translation tables.

* Re: Several serious problems
  2002-09-04 17:15             ` Dave Love
@ 2002-09-08 12:54               ` Richard Stallman
  2002-09-12 22:38                 ` Dave Love
  0 siblings, 1 reply; 63+ messages in thread
From: Richard Stallman @ 2002-09-08 12:54 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, handa, keichwa, emacs-devel

    For instance, the RC version of mule-utf-8 doesn't translate
    cyrillic-iso8859-5, and the Cyrillic coding systems don't translate
    mule-unicode-0100-24ff.

We could consider adding that support in RC.  Is it a safe change?

* Re: Several serious problems
  2002-09-07 23:19                   ` Dave Love
@ 2002-09-09  0:21                     ` Richard Stallman
  2002-09-12 22:43                       ` Dave Love
  2002-09-26  4:51                     ` Kenichi Handa
  1 sibling, 1 reply; 63+ messages in thread
From: Richard Stallman @ 2002-09-09  0:21 UTC (permalink / raw)
  Cc: handa, monnier+gnu/emacs, keichwa, emacs-devel

    > Yes.  But, that mode is on by default in RC too.

    Gosh.  However, it appears to be done wrongly.  Custom will show it
    isn't on, and would turn it off if you tried to turn it on.  Surely if
    it's preloaded and meant to be the default, the defcustom initial
    value should just be changed.

That sounds right to me.  Can you send a patch?

* Re: Several serious problems
  2002-09-02  1:28             ` Kenichi Handa
  2002-09-05 13:41               ` Dave Love
@ 2002-09-10 16:36               ` Richard Stallman
  1 sibling, 0 replies; 63+ messages in thread
From: Richard Stallman @ 2002-09-10 16:36 UTC (permalink / raw)
  Cc: d.love, monnier+gnu/emacs, keichwa, emacs-devel

    I'll attach the necessary change to enable RC's utf-8 to
    encode latin-X plus alpha (e.g. thai).  The docstring of
    mule-utf-8 may need improvement.

    As the change is very small and that code has been in HEAD
    for more than one month, I think the change is quite safe.
    I recommend installing it in RC.

Ok, would you please install it when your conference is over?

* Re: Several serious problems
  2002-09-08 12:54               ` Richard Stallman
@ 2002-09-12 22:38                 ` Dave Love
  2002-09-13 19:34                   ` Richard Stallman
  2002-09-25  7:01                   ` status of utf-8.el, etc [Re: Several serious problems] Kenichi Handa
  0 siblings, 2 replies; 63+ messages in thread
From: Dave Love @ 2002-09-12 22:38 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, handa, keichwa, emacs-devel

Richard Stallman <rms@gnu.org> writes:

>     For instance, the RC version of mule-utf-8 doesn't translate
>     cyrillic-iso8859-5, and the Cyrillic coding systems don't translate
>     mule-unicode-0100-24ff.
> 
> We could consider adding that support in RC.  Is it a safe change?

It won't break anything if done correctly, but I don't remember how
much of a change it is relative to the 21.2 code and I don't know who
might have been testing it, if anyone.  My Cyrillic changes also
filled in the koi8-r and alternativnj translation tables properly, and
that may be mixed up with it.

* Re: Several serious problems
  2002-09-09  0:21                     ` Richard Stallman
@ 2002-09-12 22:43                       ` Dave Love
  0 siblings, 0 replies; 63+ messages in thread
From: Dave Love @ 2002-09-12 22:43 UTC (permalink / raw)
  Cc: handa, monnier+gnu/emacs, keichwa, emacs-devel

Richard Stallman <rms@gnu.org> writes:

>     > Yes.  But, that mode is on by default in RC too.
> 
>     Gosh.  However, it appears to be done wrongly.  Custom will show it
>     isn't on, and would turn it off if you tried to turn it on.  Surely if
>     it's preloaded and meant to be the default, the defcustom initial
>     value should just be changed.
> 
> That sounds right to me.  Can you send a patch?

I should have said `define-minor-mode', not defcustom.  Just change
:init-value nil to t and take out the function call from loadup.
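
The shape of that suggestion can be sketched as follows.  This is not
the actual ucs-tables.el definition: the mode name and body are
hypothetical stand-ins.  Note that `:init-value' only sets the initial
value of the mode variable; it does not run the mode function, so
whatever the mode sets up must already be in place for preloaded code.

```elisp
;; Hypothetical stand-in for a global mode that is on by default.
(define-minor-mode my-unify-demo-mode
  "Toggle a demonstration global mode that starts out enabled."
  :global t
  :init-value t
  ;; The body runs only when the mode command is called, not when the
  ;; definition is loaded.
  (message "my-unify-demo-mode is now %s"
           (if my-unify-demo-mode "on" "off")))
```

With the initial value declared this way, the variable and the
declared default agree, which is the consistency Custom relies on.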

* Re: Several serious problems
  2002-09-12 22:38                 ` Dave Love
@ 2002-09-13 19:34                   ` Richard Stallman
  2002-09-25  7:01                   ` status of utf-8.el, etc [Re: Several serious problems] Kenichi Handa
  1 sibling, 0 replies; 63+ messages in thread
From: Richard Stallman @ 2002-09-13 19:34 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, handa, keichwa, emacs-devel

    It won't break anything if done correctly, but I don't remember how
    much of a change it is relative to the 21.2 code and I don't know who
    might have been testing it, if anyone.  My Cyrillic changes also
    filled in the koi8-r and alternativnj translation tables properly, and
    that may be mixed up with it.

If you want to extract the precise changes that would make sense
to install in Emacs 21.3, we could possibly do that.  Otherwise
I guess we have nothing to install.

* status of utf-8.el, etc [Re: Several serious problems]
  2002-09-12 22:38                 ` Dave Love
  2002-09-13 19:34                   ` Richard Stallman
@ 2002-09-25  7:01                   ` Kenichi Handa
  2002-09-25 14:35                     ` Stefan Monnier
  2002-09-27 13:55                     ` Dave Love
  1 sibling, 2 replies; 63+ messages in thread
From: Kenichi Handa @ 2002-09-25  7:01 UTC (permalink / raw)
  Cc: rms, monnier+gnu/emacs, keichwa, emacs-devel

In article <rzqd6rin893.fsf@albion.dl.ac.uk>, Dave Love <d.love@dl.ac.uk> writes:
> Richard Stallman <rms@gnu.org> writes:
>>      For instance, the RC version of mule-utf-8 doesn't translate
>>      cyrillic-iso8859-5, and the Cyrillic coding systems don't translate
>>      mule-unicode-0100-24ff.
>>  
>>  We could consider adding that support in RC.  Is it a safe change?

> It won't break anything if done correctly, but I don't remember how
> much of a change it is relative to the 21.2 code and I don't know who
> might have been testing it, if anyone.

I noticed that some combinations of unify-8859-on-encoding-mode,
utf-8-fragment-on-decoding, and utf-8-translate-cjk don't
work in HEAD.  So, I made a fairly comprehensive test suite
for testing them (attached at the end).

As the test suite revealed several bugs, I decided to fix
them in HEAD first, before working on RC.  I've finished
the following:

(1) Fixing the following bugs.

(1-1) unify-8859-on-encoding-mode can't be turned off
safely.  For instance, once it is off, iso-latin-1 can't
encode Latin-1 chars.

(1-2) utf-8-translate-cjk can never be turned off once
turned on.

(1-3) When utf-8-fragment-on-decoding is non-nil, utf-16-*
doesn't encode CJK chars correctly even if
utf-8-translate-cjk is non-nil.

(1-4) encode-char/decode-char don't reflect utf-8-translate-cjk.


(2) Renaming tables/variables.   We should have cleaner
    names before people start using them.

(2-1) As utf-8-fragment-on-decoding and utf-8-translate-cjk are
also applicable to utf-16, I cut off "-8" from them.

(2-2) Make translation table names and their body
char-tables different to avoid confusion.

The result is as follows:

(2-2-1) Translation-tables and translation-hash-tables (not variables)

old					new
---					---
ucs-mule-to-mule-unicode		utf-translation-table-for-encode
  (mule-utf-8/16 use it for encoding)

utf-translation-table-for-decode	utf-translation-table-for-decode
  (mule-utf-8/16 use it for decoding)

utf-8-subst-rev-table			utf-subst-table-for-encode
  (mule-utf-8/16 use it for encoding)

utf-8-subst-table			utf-subst-table-for-decode
  (mule-utf-8/16 use it for decoding)

(2-2-2) Mapping tables (variables) populating above.

old					new
---					---
ucs-mule-to-mule-unicode		ucs-mule-to-mule-unicode
  (this populates utf-translation-table-for-encode
   when unify-8859-on-encoding-mode is non-nil)

utf-8-subst-table			ucs-unicode-to-mule-cjk
  (this populates utf-subst-table-for-decode
   when utf-translate-cjk is non-nil)

utf-8-subst-rev-table			ucs-mule-cjk-to-unicode
  (this populates utf-subst-table-for-encode
   when utf-translate-cjk is non-nil)

utf-8-fragmentation-table		utf-fragmentation-table
  (this populates utf-translation-table-for-decode
   when utf-fragment-on-decoding is non-nil)

--not_exist--				utf-defragmentation-table
  (this populates utf-translation-table-for-encode
   when unify-8859-on-encoding-mode is nil
   and utf-fragment-on-decoding is non-nil)

utf-8-translation-table-for-decode	--deleted--
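
Read as a whole, the two lists above say which mapping table populates
which translation table under each setting.  One reading of them is
sketched below as a pure function.  `my-utf-table-sources' is a
hypothetical name; the function only returns symbols and does not touch
any real table, and the handling of cases the lists do not mention
(returning nil) is an assumption.

```elisp
;; Hypothetical summary of the lists above; not part of ucs-tables.el.
(defun my-utf-table-sources (unify-on-encoding fragment-on-decoding
                                               translate-cjk)
  "Return an alist (TABLE . SOURCE) for the new table names.
The arguments stand for `unify-8859-on-encoding-mode',
`utf-fragment-on-decoding' and `utf-translate-cjk'.  SOURCE is the
mapping table that populates TABLE, or nil if none is listed."
  (list
   (cons 'utf-translation-table-for-encode
         (cond (unify-on-encoding 'ucs-mule-to-mule-unicode)
               (fragment-on-decoding 'utf-defragmentation-table)))
   (cons 'utf-translation-table-for-decode
         (and fragment-on-decoding 'utf-fragmentation-table))
   (cons 'utf-subst-table-for-encode
         (and translate-cjk 'ucs-mule-cjk-to-unicode))
   (cons 'utf-subst-table-for-decode
         (and translate-cjk 'ucs-unicode-to-mule-cjk))))
```

For example, (my-utf-table-sources nil t nil) describes the case in
which only fragmentation on decoding is enabled.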


Do you have better ideas for these names?   If not, I'll
install the changes soon.

---
Ken'ichi HANDA
handa@etl.go.jp

Two files: utf-test.el and result.txt.

result.txt is the result of loading utf-test.el, running M-x
utf-testsuite RET, and viewing the variable
utf-testsuite-result in the current Emacs.  After my
modification, all elements are `t'.

begin 664 temp.tar.gz
M'XL(`'9=D3T``^U:6W/;-A9V9O=%>M@_D!>LIU-3K<DEJ:OMZ>ZJJ:?U)K%G
M7'?[X&0F%`7)M"G2(4$G_O=[#@#>Q(LOL;M*!\<C&I=SP\&'`Q!2PA8ZHS$S
MJ+_U7&1:ICD:#;9,TS:MTG\DVQH/H0R/@3W$#N"W+7.P1<QG\ZA`2<R<B)"M
M"R>8.RU\=_5_I71P0/3O=.*&<R]8[A,O#G7;M&U]//,8[^IVM3E=W$",?(=Y
M@6[I[@54M)5S145Q1W2`Z&0RW-,M\J]W?=/L]=8$[8<++B-*KZIBHCD5&U?$
M+KWX,T!L4I6\=*Z=@,943UG(8$+Z?9#EPFX8Q(PD<DG$B<=`D$=&CV]C1E=Q
MEY!SC)&,!51)7K7+U3VL9K[J$X@HMKBWD>?[GEMJ9!>.IS,O'MDF5C,_BSRK
MQ*<Z.C<IU:R1/J/O6T8P=Y@#$C#3&`0]<%945.-KGVN&HAQ/)P@"$@2,!`S^
M,T88U!G4&=09*S';'<,PB@U[HE(9,1JN&3,T9P.&,DU<_?):E--!RB(?H:C^
M]NI70@.8$S&C_YC3K`P,.YJ6@G2J+R/G!J6`M"K0/K\R>[)S&^&SW>DPG(>F
M3\H<>'ZG`P_2\EE7O,ZR_7)W]N^7VH_;[:7'F-P$7AQV_YV-0[]72(%_/(+/
M>$O$ZUY"UIY]UXS!DBA,OJV?ZHZ;L'I(V&5(/&:6GQP^7QLO!F+XSK8'VS6(
M_V;7(DSB^\YR!1DOSIY,9W]@UJW(A^E:A];T3FA9Q2A9]4EA.DT30'-I7<U]
M5UB3S5FFO[FT22![3"*RZL#ST&CG*>K+U5E[_<<#H(R]/?TPB<+Z;6Z(R)L.
MGC_,]F#P9TV`,/,#&S`$GV%38F-O\@317J[@B3RIVDE_-&A(;@]0E@),'.2F
M_O6%(_57#]Z%Q+8ALW7_K+@A#ELCV#,+.27UDT_)MQ_SZ9'E^K/37[Y`R9Y5
M2$?WDY2B`BC9Z?[H1&HLGO<Y5D0JVBBL/&R:[,'_'5<PU[99WLY*DZ7]/9\L
M6:['RE]?/%*#9=K#!J0TBLIL<O!R]PR;";ZAAH$#[Y3;;P[/S@Y/R>L3\GIZ
M)*#$WX-?G^AI@VP1+XI?-X8V`F^XG0&&QF;3L0BF[T<GG\KV<@59?WOQ9#K[
MP\FP82][@+)T+\NNA%[]Y[4^E2::;H,V"`B;P`M@')D"D/=]VQ@,`&-P9.K+
MLPWVU_ZO(.CX`4(`85N>=YIY>^GE'MX,EB_&(AHG/D,^4+A]*FKA@K`+"FEJ
MM7*">5G"Z!ZQG9@XY(:Z+(S(.7GUR_14-_73PU]_>W-&#,,0+6_2EO=&M]S@
MQ5Q]E!L#Y5ZP)`O0)UAKK)S\='3\<]%.VO2V:&BMJ=&4,,,MBJ'B#2<1-YR9
M%J*]91>$^G1%`R[]H>U>=*=7]?KMR4^')9]YP[C@,6\X?B)_O8#WKL(Y)<<&
M(<>HT#2,L=$]PW;J!"@&>GF/$Y,9]<-/^S#W1PMX=PJ7.-_'9-#C@KLD";S%
MK<Z/+V&@\UM''#$W`!P`&[!R`C:C3UY,=SDH/8Q!`.S8NZ[9SC1#)!>1L\30
MHFY^BXG./4JM553+(B>(X3T0)NCRZI[ZUL)3F95BJ-@^CS).#;EVXIC.29RX
M+HWC1>+[M^B<N0]3%$4``_^6S"F#`G"!$AY"9^93@H[G/.)"=RY6]2QAY%,4
M!DLNS#O(S'&O4+-UM^:BN-2+DC#.?4#(7=*`L$1.-=8!TPL2ARLJL\**!S1D
MA.U""'G,$+,.P\Z(<OC/DF4*13<!:P'K'JX<-S:V,0^!`:VT<O1KLI/=<8O$
MID7T8^)%-.U($U@2E+,1T9!?\P)&(\=EW@V7UWS*B`;/P,)RL(157/T^`'1V
ML-=NX"FO[IX\[VLQ91_KDZCXED6N?6X<=Q`I-@^9MZ(Q^,J[9/Y.706ALKAF
M?8]\L%RX#G35`<OUACU,.,B2&;G,%4`[F.'":.6R;&<B]6>GA$P%7^.3S$\/
MEAM"5IN%23"7,S:I7<+<HMQC`2@M.:3`^4.VFKEAR$"F</W@@/P>1E?$B=`P
M!Q7B"]"8N'%N'_7#(O\=EG$"ZY(E42"EV04@EBL-`UC$D$(I[T;XAHN%P>/3
MENBL7A8%%W2'*QV"J<->ZN$"B<F'-NENIR/'N,MC^,^:<<(6C3NZY/R@"<YB
MH!O"+"0:.01#^TRU>F??S[M2PBV[5>HJ^5,5:G'$RAW)YB)#*JX'W9M_E@N+
MKVKV':PJBHLR@(6=<=1G@12#Z]^5:BX^4($%IP`PG5:*;7Q0O,$NM1<5QRSB
M#]Q=4'/):*S['B!6<R*Z$%]/EK,CXVE9B(G%7,XEME26C5JXH7T/>0$F,'52
MV))FVC(=\3+^?%<0*B&G\&5+:)D%]A`^:Q@OZ-72KR>E7C%P#(++\ZB03'<V
ME,RTI%]FEB33WJ)TFG-38?0R!Y%P9T57'T&(R!!GR.2=L,$X?N:R-)&QD#*;
M])7W8C_O$MY=Y_9[63]2R2W9"B(2,5AEL@1)ZH.Y4]Q+.?R\..^OG!1PE\^[
MZT\+HMLL#`HM68V64I;F8\4:(Z;WBF=&SF))T_Q-*FULF1I\#<GC7UI$8@<3
MZ*W#L,1#NM+A/"B1(A*(Z$X74&6E\&-`<;'<"?T<^`)!HK$5E0)/^3?W(O8[
ML(W)X7.&'YIX:C`J$9HOFY2YS%M@;L:L"&H^;+,H2[*TQ9M8<9GD^+(*:;PX
MYW<,/!?(:H6I?_#$RY3869]YKK+N]S^"PV"?V?/]QJC]]U_F<#@:I[__&@T'
MIOC]EZE^__5'4!V>X#WFQO$3_I8;7X2?`O'Z9W2[ZHY$W9&H.Q)U1_)<=R3=
M[G\Q\>QWS\_/UV^DF?A[#YZ>,U+ZV_@V:#S?&&>PK2VXC^E3`U0#?*8!?DF[
M&M0?.:@_+12Q)!PL/&5/I1V?K7TPOMI7,46*%"E2I$B1(D6*%"E2I$B1(D6*
7%"E2I$B1(D6*%"GZ(OH?U(],I@!0````
`
end

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-09-25  7:01                   ` status of utf-8.el, etc [Re: Several serious problems] Kenichi Handa
@ 2002-09-25 14:35                     ` Stefan Monnier
  2002-09-25 23:47                       ` Kenichi Handa
  2002-09-27 13:55                     ` Dave Love
  1 sibling, 1 reply; 63+ messages in thread
From: Stefan Monnier @ 2002-09-25 14:35 UTC (permalink / raw)
  Cc: d.love, rms, monnier+gnu/emacs, keichwa, emacs-devel

> (1-1) unify-8859-on-encoding-mode can't be turned off safely.

I still haven't heard of any reason why it should ever be turned off anyway.
I think we should simply get rid of this minor-mode.

> (2-1) As utf-8-fragment-on-decoding and utf-8-translate-cjk are
> also applicable to utf-16, I cut off "-8" from them.

Why not use the `ucs-' prefix ?


	Stefan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-09-25 14:35                     ` Stefan Monnier
@ 2002-09-25 23:47                       ` Kenichi Handa
  2002-09-26 13:56                         ` Stefan Monnier
  0 siblings, 1 reply; 63+ messages in thread
From: Kenichi Handa @ 2002-09-25 23:47 UTC (permalink / raw)
  Cc: d.love, rms, monnier+gnu/emacs, keichwa, emacs-devel

In article <200209251435.g8PEZKh10820@rum.cs.yale.edu>, "Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu> writes:
>>  (1-1) unify-8859-on-encoding-mode can't be turned off safely.
> I still haven't heard of any reason why it should ever be turned off anyway.
> I think we should simply get rid of this minor-mode.

I tend to agree with getting rid of it.  But, I have not yet
considered that possibility deeply.  If we are going to
remove it, I think we should do that in 21.3.  Introducing
something in 21.3 and removing it in 21.4 is not a good thing.

>>  (2-1) As utf-8-fragment-on-decoding and utf-8-translate-cjk are
>>  also applicable to utf-16, I cut off "-8" from them.

> Why not use the `ucs-' prefix ?

Because they directly influence UTF (UCS Transformation
Format) and are tightly related to UTF.  On the other hand, such
variables as ucs-mule-to-mule-unicode and
ucs-unicode-to-mule-cjk are more neutral tables that are not
tightly related to UTF, I think.

---
Ken'ichi HANDA
handa@etl.go.jp

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: Several serious problems
  2002-09-07 23:19                   ` Dave Love
  2002-09-09  0:21                     ` Richard Stallman
@ 2002-09-26  4:51                     ` Kenichi Handa
  1 sibling, 0 replies; 63+ messages in thread
From: Kenichi Handa @ 2002-09-26  4:51 UTC (permalink / raw)
  Cc: rms, monnier+gnu/emacs, keichwa, emacs-devel

In article <rzqelc5s7zb.fsf@albion.dl.ac.uk>, Dave Love <d.love@dl.ac.uk> writes:
>>  See also the documentations of:
>>    `unify-8859-on-decoding-mode', `unify-8859-on-encoding-mode',
>>    `utf-8-fragment-on-decoding'
>>  to customize the behaviour of this coding system."

> Fine, but that shouldn't be specific to mule-utf-8.  Those variables
> affect more coding systems,

I'm going to introduce a `dependency' coding-system
property.  The value will be a list of symbols whose values
affect the behaviour of the coding system.  mule-utf-* can
have this property from the start.  For iso-8859-?, we can
add this property in ucs-tables.el.

Then, describe-coding-system can check it and produce a
proper description, something like the one below:
----------------------------------------------------------------------
1 -- iso-latin-1 (alias: iso-8859-1 latin-1)

ISO 2022 based 8-bit encoding for Latin-1 (MIME:ISO-8859-1).

See also the documentation of these customizable variables
which alter the behaviour of this coding system.
	`unify-8859-on-encoding-mode'
	`unify-8859-on-decoding-mode'
[...]
----------------------------------------------------------------------
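
The proposal above could be sketched roughly as follows.  This is only an
illustration under stated assumptions: the `my-' functions and the
`my-dependency' property are hypothetical names, and the property is kept
on the symbol's plist rather than in the real coding-system property list.

```elisp
;; Hypothetical sketch: every `my-' name is illustrative, not Emacs API.
(defun my-coding-system-set-dependency (coding-system variables)
  "Record VARIABLES as the symbols whose values affect CODING-SYSTEM."
  (put coding-system 'my-dependency variables))

(defun my-coding-system-dependency-text (coding-system)
  "Return the \"See also\" paragraph for CODING-SYSTEM, or \"\" if none."
  (let ((vars (get coding-system 'my-dependency)))
    (if (null vars)
        ""
      (concat "See also the documentation of these customizable variables\n"
              "which alter the behaviour of this coding system.\n"
              ;; One tab-indented, quoted variable name per line.
              (mapconcat (lambda (v) (format "\t`%s'" v)) vars "\n")
              "\n"))))

;; Something ucs-tables.el might do for the iso-8859-? coding systems.
(my-coding-system-set-dependency
 'iso-latin-1
 '(unify-8859-on-encoding-mode unify-8859-on-decoding-mode))
```

A describe-coding-system-like command would then append the result of
(my-coding-system-dependency-text 'iso-latin-1) to its usual output.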

> and other CCL ones should use the appropriate translation
> tables.

Sure.  I'll work on it later.

---
Ken'ichi HANDA
handa@m17n.org

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-09-25 23:47                       ` Kenichi Handa
@ 2002-09-26 13:56                         ` Stefan Monnier
  2002-09-27 13:22                           ` Kenichi Handa
  2002-09-27 13:59                           ` Dave Love
  0 siblings, 2 replies; 63+ messages in thread
From: Stefan Monnier @ 2002-09-26 13:56 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, d.love, rms, keichwa, emacs-devel

> In article <200209251435.g8PEZKh10820@rum.cs.yale.edu>, "Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu> writes:
> >>  (1-1) unify-8859-on-encoding-mode can't be turned off safely.
> > I still haven't heard of any reason why it should ever be turned off anyway.
> > I think we should simply get rid of this minor-mode.
> 
> I tend to agree with getting rid of it.  But, I have not yet
> considered that possibility deeply.  If we are going to
> remove it, I think we should do that in 21.3.  Introducing
> something in 21.3 and remove it in 21.4 is not a good thing.

But that would make 21.3 into much less of a "bug-fix only" release.
I think we're wasting way too much effort on 21.3.  We should get
it out the door quickly so we can concentrate on getting 21.4 ready.
Backporting features from 21.4 to 21.3 is not very constructive.


	Stefan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-09-26 13:56                         ` Stefan Monnier
@ 2002-09-27 13:22                           ` Kenichi Handa
  2002-09-28  3:19                             ` Richard Stallman
  2002-09-27 13:59                           ` Dave Love
  1 sibling, 1 reply; 63+ messages in thread
From: Kenichi Handa @ 2002-09-27 13:22 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, d.love, rms, keichwa, emacs-devel

In article <200209261356.g8QDuTO15360@rum.cs.yale.edu>, "Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu> writes:
>>  In article <200209251435.g8PEZKh10820@rum.cs.yale.edu>, "Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu> writes:
>>  >>  (1-1) unify-8859-on-encoding-mode can't be turned off safely.
>>  > I still haven't heard of any reason why it should ever be turned off anyway.
>>  > I think we should simply get rid of this minor-mode.
>>  
>>  I tend to agree with getting rid of it.  But, I have not yet
>>  considered that possibility deeply.  If we are going to
>>  remove it, I think we should do that in 21.3.  Introducing
>>  something in 21.3 and remove it in 21.4 is not a good thing.

> But that would make 21.3 into much less of a "bug-fix only" release.

??? Why would not adding a new mode in 21.3 make 21.3 much
less of a bug-fix-only release?

> I think we're wasting way too much effort on 21.3.  We should get
> it out the door quickly so we can concentrate on getting 21.4 ready.
> Backporting features from 21.4 to 21.3 is not very constructive.

Generally I agree that we should release 21.3 as soon
as possible.  But, the current code in RC and HEAD both
have bugs as I wrote previously.  Some of them are related
to the new features existing only in HEAD, thus don't exist
in RC.

But, I also noticed that when one toggles
unify-8859-on-decoding-mode on and off,
unify-8859-on-encoding-mode also stops working, both in RC
and HEAD.  At least such a bug should be fixed.

This week I think I fixed all bugs related to
unify-8859-on-encoding-mode.  I'll try to fix bugs related
to unify-8859-on-decoding-mode this weekend.

---
Ken'ichi HANDA
handa@m17n.org

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-09-25  7:01                   ` status of utf-8.el, etc [Re: Several serious problems] Kenichi Handa
  2002-09-25 14:35                     ` Stefan Monnier
@ 2002-09-27 13:55                     ` Dave Love
  2002-09-28  3:19                       ` Richard Stallman
  2002-09-30  9:09                       ` Kenichi Handa
  1 sibling, 2 replies; 63+ messages in thread
From: Dave Love @ 2002-09-27 13:55 UTC (permalink / raw)
  Cc: rms, monnier+gnu/emacs, keichwa, emacs-devel

Kenichi Handa <handa@m17n.org> writes:

> As the testsuite revealed several bugs, before working on
> RC, I decided to fix them in HEAD at first.

Sorry about that.  It's a bit of a mess and badly needed testing
properly.

> I've finished
> these:
> 
> (1) Fixing the following bugs.

I don't understand some of these straight off, but I don't have time
to look at it now.

> (1-2) utf-8-translate-cjk can never be turned off once
> turned on.

I don't think it should be toggled; is there a reason you'd want to
avoid it?  What it needs is to be made more flexible about how the
translation is done, allowing you to prefer Chinese to Japanese, for
instance.  Since people moan about the lack of CJK Unicode support, I
hoped someone else would do such work but there's been zero interest
as far as I know...

> (2) Renaming tables/variables.   We should have cleaner
>     names before people starting to use it.

The whole thing needs reorganizing.  I wouldn't have written it like
that for inclusion in Emacs directly.  Things which are in separate
files probably shouldn't be, for instance.

> (2-1) As utf-8-fragment-on-decoding and utf-8-translate-cjk are
> also applicable to utf-16, I cut off "-8" from them.

The `utf-8' comes from them being in utf-8.el.  At least
utf-8-fragment-on-decoding isn't actually specific to Unicode coding
systems.

> (2-2) Make translation table names and their body
> char-tables different to avoid confusion.

As far as I remember, some of the confusion goes away if these are
assumed always to be present, and loaded in the right order.

> Don't you have better ideas for these names?   If not, I'll
> install the changes soon.

Some of those names don't seem right.  For instance,
ucs-mule-to-mule-unicode isn't only used by utf-8/16 as far as I
remember.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-09-26 13:56                         ` Stefan Monnier
  2002-09-27 13:22                           ` Kenichi Handa
@ 2002-09-27 13:59                           ` Dave Love
  2002-09-27 15:24                             ` Stefan Monnier
  2002-09-28  3:19                             ` Richard Stallman
  1 sibling, 2 replies; 63+ messages in thread
From: Dave Love @ 2002-09-27 13:59 UTC (permalink / raw)
  Cc: Kenichi Handa, rms, keichwa, emacs-devel

"Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu> writes:

> > > I think we should simply get rid of this minor-mode.

Seconded.

> But that would make 21.3 into much less of a "bug-fix only" release.

I don't understand that.

> I think we're wasting way too much effort on 21.3.

I'm not sure there's _enough_ effort going into it.  There seem to be
various trivial things being done of the sort I thought were banned,
but important problems aren't getting attention.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-09-27 13:59                           ` Dave Love
@ 2002-09-27 15:24                             ` Stefan Monnier
  2002-09-28  3:20                               ` Richard Stallman
  2002-10-04 22:26                               ` Dave Love
  2002-09-28  3:19                             ` Richard Stallman
  1 sibling, 2 replies; 63+ messages in thread
From: Stefan Monnier @ 2002-09-27 15:24 UTC (permalink / raw)
  Cc: Stefan Monnier, Kenichi Handa, rms, keichwa, emacs-devel

> > I think we're wasting way too much effort on 21.3.
> I'm not sure there _enough_ effort going into it.  There seem to be
> various trivial things being done of the sort I thought were banned,
> but important problems aren't getting attention.

From my point of view, all the effort put into 21.3 is effort not put
into 21.4.  Since 21.4 fixes most/all of those problems already, I think
all this work done on 21.3 is wasted.


	Stefan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-09-27 13:22                           ` Kenichi Handa
@ 2002-09-28  3:19                             ` Richard Stallman
  0 siblings, 0 replies; 63+ messages in thread
From: Richard Stallman @ 2002-09-28  3:19 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, monnier+gnu/emacs, d.love, keichwa,
	emacs-devel

    But, I also noticed that when one toggles
    unify-8859-on-decoding-mode on and off,
    unify-8859-on-encoding-mode also stops working, both in RC
    and HEAD.  At least such a bug should be fixed.

I agree.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-09-27 13:55                     ` Dave Love
@ 2002-09-28  3:19                       ` Richard Stallman
  2002-09-30  9:09                       ` Kenichi Handa
  1 sibling, 0 replies; 63+ messages in thread
From: Richard Stallman @ 2002-09-28  3:19 UTC (permalink / raw)
  Cc: handa, monnier+gnu/emacs, keichwa, emacs-devel

    > (1-2) utf-8-translate-cjk can never be turned off once
    > turned on.

    I don't think it should be toggled; is there a reason you'd want to
    avoid it?

I think it should work to turn this off.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-09-27 13:59                           ` Dave Love
  2002-09-27 15:24                             ` Stefan Monnier
@ 2002-09-28  3:19                             ` Richard Stallman
  1 sibling, 0 replies; 63+ messages in thread
From: Richard Stallman @ 2002-09-28  3:19 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, handa, keichwa, emacs-devel

    > I think we're wasting way too much effort on 21.3.

    I'm not sure there _enough_ effort going into it.  There seem to be
    various trivial things being done of the sort I thought were banned,
    but important problems aren't getting attention.

Could you send a clear description of these important problems?

Please try not to be terse--when you mention something tersely,
I often do not know what it refers to, and then it won't do any good.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-09-27 15:24                             ` Stefan Monnier
@ 2002-09-28  3:20                               ` Richard Stallman
  2002-10-04 22:26                               ` Dave Love
  1 sibling, 0 replies; 63+ messages in thread
From: Richard Stallman @ 2002-09-28  3:20 UTC (permalink / raw)
  Cc: d.love, monnier+gnu/emacs, handa, keichwa, emacs-devel

    From my point of view, all the effort put into 21.3 is effort not put
    into 21.4.  Since 21.4 fixes most/all of those problems already, I think
    all this work done on 21.3 is wasted.

21.4 is likely to be many months away.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-09-27 13:55                     ` Dave Love
  2002-09-28  3:19                       ` Richard Stallman
@ 2002-09-30  9:09                       ` Kenichi Handa
  2002-09-30 13:29                         ` Stefan Monnier
  2002-10-04 22:32                         ` Dave Love
  1 sibling, 2 replies; 63+ messages in thread
From: Kenichi Handa @ 2002-09-30  9:09 UTC (permalink / raw)
  Cc: rms, monnier+gnu/emacs, keichwa, emacs-devel

In article <rzq4rcby1t4.fsf@albion.dl.ac.uk>, Dave Love <d.love@dl.ac.uk> writes:

> Kenichi Handa <handa@m17n.org> writes:
>>  As the testsuite revealed several bugs, before working on
>>  RC, I decided to fix them in HEAD at first.

I've just installed fixes in HEAD.   Could people please
test these customizable variables:
	unify-8859-on-encoding-mode
	unify-8859-on-decoding-mode
	utf-fragment-on-decoding
	utf-translate-cjk

>>  (1-2) utf-8-translate-cjk can never be turned off once
>>  turned on.

> I don't think it should be toggled;

Then, why do we have this now?
	(defcustom utf-translate-cjk nil ...)
As long as it's a customizable variable, one should be able
to turn it off.
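
One common way to make such a variable safe to set and reset is a :set
function that updates the dependent state in both directions.  A minimal
sketch, assuming hypothetical `my-' names and a flag standing in for the
real translation tables (this is not the code installed in HEAD):

```elisp
;; Hypothetical sketch; every `my-' name is illustrative, not from utf-8.el.
(defvar my-utf-cjk-tables-active nil
  "Non-nil while the (imaginary) CJK translation tables are installed.")

(defun my-utf-translate-cjk-set (symbol value)
  "Set SYMBOL to VALUE and install or remove the CJK tables to match."
  (set-default symbol value)
  ;; A real setter would populate or clear the translation tables here.
  (setq my-utf-cjk-tables-active (and value t)))

(defcustom my-utf-translate-cjk nil
  "Whether UTF decoding and encoding translate CJK characters."
  :type 'boolean
  :set #'my-utf-translate-cjk-set)
```

With this shape, (customize-set-variable 'my-utf-translate-cjk nil) undoes
what setting it to t did, rather than leaving the tables behind.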

> is there a reason you'd want to avoid it?

One may or may not want select-safe-coding-system to decide
utf-8 as the default for a buffer that contains CJK charsets,
etc.  I'm not sure.

> What it needs is to be made more flexible about how the
> translation is done, allowing you to prefer Chinese to Japanese, for
> instance.  Since people moan about the lack of CJK Unicode support, I
> hoped someone else would do such work but there's been zero interest
> as far as I know...

One reason for the zero interest is perhaps that such people are
already using Mule-UCS.

>>  (2-1) As utf-8-fragment-on-decoding and utf-8-translate-cjk are
>>  also applicable to utf-16, I cut off "-8" from them.

> The `utf-8' comes from them being in utf-8.el.  At least
> utf-8-fragment-on-decoding isn't actually specific to Unicode coding
> systems.

Sure.  If utf(-8)-fragment-on-decoding is non-nil, even if
unify-8859-on-decoding-mode is on, cyrillic-iso-8bit, etc
shouldn't decode characters into mule-unicode-*.  But, I
don't have time to find a better name, and at least,
removing "-8" is better.

>>  (2-2) Make translation table names and their body
>>  char-tables different to avoid confusion.

> As far as I remember, some of the confusion goes away if these are
> assumed always to be present, and loaded in the right order.

Of course, by disabling the facility of turning them off, we
can make things simpler.  But, that is not the case at least
now.

> Some of those names don't seem right.  For instance,
> ucs-mule-to-mule-unicode isn't only used by utf-8/16 as far as I
> remember.

???  So, it doesn't contain "utf".

---
Ken'ichi HANDA
handa@m17n.org

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-09-30  9:09                       ` Kenichi Handa
@ 2002-09-30 13:29                         ` Stefan Monnier
  2002-10-01  7:37                           ` Kenichi Handa
  2002-10-04 22:38                           ` Dave Love
  2002-10-04 22:32                         ` Dave Love
  1 sibling, 2 replies; 63+ messages in thread
From: Stefan Monnier @ 2002-09-30 13:29 UTC (permalink / raw)
  Cc: d.love, rms, monnier+gnu/emacs, keichwa, emacs-devel

> > is there a reason you'd want to avoid it?
> 
> One may or may not want select-safe-coding-system to decide
> utf-8 as the default for a buffer that contains CJK charsets
> and etc.  I'm not sure.

select-safe-coding-system is supposed to find a coding system
that's safe.  If utf-8 is safe, then it should definitely be among the
ones that might be selected.  We may want to make the selection among
the safe encodings more flexible (I recently pointed out that I want
to try utf-8 first when decoding but want to try latin-1 first when
encoding, for example), but it shouldn't be done indirectly by
making it impossible to save CJK chars into a utf-8 file.

I.e. for the same reason that unify-8859-on-encoding-mode has no reason
to ever be turned off, translation of CJK into utf-8 upon encoding should
never be turned off either.


	Stefan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-09-30 13:29                         ` Stefan Monnier
@ 2002-10-01  7:37                           ` Kenichi Handa
  2002-10-01 20:03                             ` Richard Stallman
  2002-10-04 22:38                           ` Dave Love
  1 sibling, 1 reply; 63+ messages in thread
From: Kenichi Handa @ 2002-10-01  7:37 UTC (permalink / raw)
  Cc: d.love, rms, monnier+gnu/emacs, keichwa, emacs-devel

I've just fixed the code in RC.

I don't have a strong opinion on whether we should provide
these two customizable variables or not:
	unify-8859-on-encoding-mode
	utf-translate-cjk

I just insist that:

o They must be safely set and reset as far as they are
  provided.  I hope this is accomplished by my recent
  changes.

o If we decide that we don't provide them, it should be done
  in 21.3.  With the latest code, removing them can be
  done quite easily (and safely).

I'm a little bit tired from the work on this code.  Please
decide whether we provide them or not.  Once it is decided not to
provide them, I'll adjust the code.

---
Ken'ichi HANDA
handa@m17n.org

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-10-01  7:37                           ` Kenichi Handa
@ 2002-10-01 20:03                             ` Richard Stallman
  2002-10-10 12:25                               ` Kenichi Handa
  0 siblings, 1 reply; 63+ messages in thread
From: Richard Stallman @ 2002-10-01 20:03 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, d.love, monnier+gnu/emacs, keichwa,
	emacs-devel

    I don't have a strong opinion on whether we should provide
    these two customizable variables or not:
	    unify-8859-on-encoding-mode
	    utf-translate-cjk

Are these decisions users will really want to change?

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-09-27 15:24                             ` Stefan Monnier
  2002-09-28  3:20                               ` Richard Stallman
@ 2002-10-04 22:26                               ` Dave Love
  2002-10-05 16:59                                 ` Eli Zaretskii
  1 sibling, 1 reply; 63+ messages in thread
From: Dave Love @ 2002-10-04 22:26 UTC (permalink / raw)
  Cc: Kenichi Handa, rms, keichwa, emacs-devel

"Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu> writes:

> Since 21.4 fixes most/all of those problems already, I think
> all this work done on 21.3 is wasted.

I'm actually thinking about things that as far as I know haven't been
fixed anywhere.  There's no version of Emacs 21 which runs properly on
the latest Irix, for instance, and people shouldn't have to wait
another year or whatever to be able to use the translation tables or
get a working Emacs on the current Irix, for instance.  It needs
attention particularly so that it isn't released with more problems,
such as the lossage introduced by the compound text stuff which I had
to fix unilaterally.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-09-30  9:09                       ` Kenichi Handa
  2002-09-30 13:29                         ` Stefan Monnier
@ 2002-10-04 22:32                         ` Dave Love
  2002-10-09  1:26                           ` Kenichi Handa
  1 sibling, 1 reply; 63+ messages in thread
From: Dave Love @ 2002-10-04 22:32 UTC (permalink / raw)
  Cc: rms, monnier+gnu/emacs, keichwa, emacs-devel

Kenichi Handa <handa@m17n.org> writes:

> >>  (1-2) utf-8-translate-cjk can never be turned off once
> >>  turned on.
> 
> > I don't think it should be toggled;
> 
> Then, why do we have this now?
> 	(defcustom utf-translate-cjk nil ...)

It's there so that people can find the facility and have the option
not to load it, since it involves quite large tables.  I don't know
how much the extra heap space is worth worrying about, but the tables
should probably be made bigger anyhow.  (This presumably wouldn't be
important if it was preloaded, with the tables in purespace, but I
think you want to be able to customize the charsets used, so they
can't be frozen.)

> As far as it's a customizalbe variable, one should be able
> to turn it off.

[I can think of examples where it probably only makes sense to
customize things per session.]

In this case I think I either forgot or ran out of enthusiasm, but I
don't think it's something you'd want to turn off after loading it.

> One may or may not want select-safe-coding-system to decide
> utf-8 as the default for a buffer that contains CJK charsets
> and etc.  I'm not sure.

That should be taken care of by coding priorities, surely, just as
with Mule-UCS.

> One reason of zero interest is perhaps that such a people is
> already using Mule-UCS.

Or that they've been told doing it in Emacs 21 is impossible.  I'm
sure people have complained about lack of built-in support, but if
they aren't willing to work on it, I guess they aren't very justified,
as with iso-8859 character translation.  The basic Emacs 21 support
actually has advantages over Mule-UCS, such as not clobbering
unrepresentable characters and being something we can understand; I
thought that would make it attractive anyway.

> > Some of those names don't seem right.  For instance,
> > ucs-mule-to-mule-unicode isn't only used by utf-8/16 as far as I
> > remember.
> 
> ???  So, it doesn't contain "utf".

I thought you re-named it to something that did contain "utf".

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-09-30 13:29                         ` Stefan Monnier
  2002-10-01  7:37                           ` Kenichi Handa
@ 2002-10-04 22:38                           ` Dave Love
  1 sibling, 0 replies; 63+ messages in thread
From: Dave Love @ 2002-10-04 22:38 UTC (permalink / raw)
  Cc: Kenichi Handa, rms, keichwa, emacs-devel

"Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu> writes:

> I.e. for the same reason that unify-8859-on-encoding-mode has no reason
> to ever be turned off, translation of CJK into utf-8 upon encoding should
> never be turned off either.

Note that you could be talking up to ~40k slots in each of two hash
tables, even if data for constructing them is ephemeral.  I think
that's the only reason people might not want it.

[I haven't checked how much of the space could reasonably be filled
from the Mule 5 charsets, but there's a lot of chinese-cns11643, for
instance.]

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-10-04 22:26                               ` Dave Love
@ 2002-10-05 16:59                                 ` Eli Zaretskii
  2002-10-11 17:21                                   ` Dave Love
  0 siblings, 1 reply; 63+ messages in thread
From: Eli Zaretskii @ 2002-10-05 16:59 UTC (permalink / raw)
  Cc: keichwa, emacs-devel

> From: Dave Love <d.love@dl.ac.uk>
> Date: 04 Oct 2002 23:26:45 +0100
> 
> There's no version of Emacs 21 which runs properly on
> the latest Irix

FWIW, I use stock Emacs 21.2 on Irix every day, with no problems at
all.  YMMV, of course.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-10-04 22:32                         ` Dave Love
@ 2002-10-09  1:26                           ` Kenichi Handa
  2002-10-15 17:38                             ` Dave Love
  0 siblings, 1 reply; 63+ messages in thread
From: Kenichi Handa @ 2002-10-09  1:26 UTC (permalink / raw)
  Cc: rms, monnier+gnu/emacs, keichwa, emacs-devel

In article <rzq7kgxal99.fsf@albion.dl.ac.uk>, Dave Love <d.love@dl.ac.uk> writes:

> Kenichi Handa <handa@m17n.org> writes:
>>  >>  (1-2) utf-8-translate-cjk can never be turned off once
>>  >>  turned on.
>>  
>>  > I don't think it should be toggled;
>>  
>>  Then, why do we have this now?
>>  	(defcustom utf-translate-cjk nil ...)

> It's there so that people can find the facility and have the option
> not to load it, since it involves quite large tables.

For such a case, isn't `feature' enough?  Those who want
this feature just do (require FEATURE).

>>  One may or may not want select-safe-coding-system to decide
>>  utf-8 as the default for a buffer that contains CJK charsets
>>  and etc.  I'm not sure.

> That should be taken care of by coding priorities, surely, just as
> with Mule-UCS.

Ah, hmmm, yes.

---
Ken'ichi HANDA
handa@m17n.org

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-10-01 20:03                             ` Richard Stallman
@ 2002-10-10 12:25                               ` Kenichi Handa
  0 siblings, 0 replies; 63+ messages in thread
From: Kenichi Handa @ 2002-10-10 12:25 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, d.love, monnier+gnu/emacs, keichwa,
	emacs-devel

In article <E17wTEb-0000eO-00@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:
>     I don't have a strong opinion on whether we should provide
>     these two customizable variables or not:
> 	    unify-8859-on-encoding-mode
> 	    utf-translate-cjk

> Are these decisions users will really want to change?

I think most users don't turn off
unify-8859-on-encoding-mode, thus they don't need this
variable.

But, if unify-8859-on-encoding-mode is on, writing, for
instance, an iso-8859-5 file gets about 1.20 times slower.  So,
people who frequently edit 10-megabyte files in the iso-8859-5
encoding may want to turn it off.  Though, that's surely very
rare.

As for utf-translate-cjk, the slowdown is almost
negligible.  So, once one turns it on, he won't turn it off.

---
Ken'ichi HANDA
handa@m17n.org

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-10-05 16:59                                 ` Eli Zaretskii
@ 2002-10-11 17:21                                   ` Dave Love
  2002-10-12  8:27                                     ` Eli Zaretskii
  0 siblings, 1 reply; 63+ messages in thread
From: Dave Love @ 2002-10-11 17:21 UTC (permalink / raw)
  Cc: keichwa, emacs-devel

"Eli Zaretskii" <eliz@is.elta.co.il> writes:

> > From: Dave Love <d.love@dl.ac.uk>
> > Date: 04 Oct 2002 23:26:45 +0100
> > 
> > There's no version of Emacs 21 which runs properly on
> > the latest Irix
> 
> FWIW, I use stock Emacs 21.2 on Irix every day, with no problems at
> all.  YMMV, of course.

It's not worth very much unless you're talking about the latest Irix,
in which case I'd be interested to know how you get it to run.  As far
as I know, there's a (mmap?) problem with 21.2 anyhow, per PROBLEMS.


* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-10-11 17:21                                   ` Dave Love
@ 2002-10-12  8:27                                     ` Eli Zaretskii
  0 siblings, 0 replies; 63+ messages in thread
From: Eli Zaretskii @ 2002-10-12  8:27 UTC (permalink / raw)
  Cc: keichwa, emacs-devel

> From: Dave Love <d.love@dl.ac.uk>
> Date: 11 Oct 2002 18:21:22 +0100
> 
> > FWIW, I use stock Emacs 21.2 on Irix every day, with no problems at
> > all.  YMMV, of course.
> 
> It's not worth very much unless you're talking about the latest Irix,

IIRC, the OS version is 6.5.16 (I could check tomorrow if this is
important).  I don't know if it's the latest.

> in which case I'd be interested to know how you get it to run.

Plain "configure; make; make install", if memory serves.


* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-10-09  1:26                           ` Kenichi Handa
@ 2002-10-15 17:38                             ` Dave Love
  2002-10-16  4:38                               ` Richard Stallman
  0 siblings, 1 reply; 63+ messages in thread
From: Dave Love @ 2002-10-15 17:38 UTC (permalink / raw)
  Cc: rms, monnier+gnu/emacs, keichwa, emacs-devel

Kenichi Handa <handa@m17n.org> writes:

> > It's there so that people can find the facility and have the option
> > not to load it, since it involves quite large tables.
> 
> For such a case, isn't `feature' enough?  Those who want
> this feature just do (require FEATURE).

Well, that's not supposed to change the way Emacs works, but also, I
think such things should be in Custom so that people can find them in
the first place.
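
To illustrate the distinction, here is a hedged sketch of a Custom
option that loads large tables only when it is turned on.  The names
my-use-big-tables and my-big-tables are hypothetical; they are not
real Emacs options or libraries.

```elisp
;; Hypothetical: `my-big-tables' stands in for a library that
;; contains large translation tables.
(defcustom my-use-big-tables nil
  "Non-nil means load the large translation tables and use them.
Being a Custom option, it can be found with M-x customize."
  :type 'boolean
  :group 'mule
  :set (lambda (symbol value)
         ;; Load the tables only when the user enables the option.
         (when value
           (require 'my-big-tables))
         (set-default symbol value)))
```

With the default of nil nothing is loaded, so users who never enable
the option pay no cost for the tables.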


* Re: status of utf-8.el, etc [Re: Several serious problems]
  2002-10-15 17:38                             ` Dave Love
@ 2002-10-16  4:38                               ` Richard Stallman
  0 siblings, 0 replies; 63+ messages in thread
From: Richard Stallman @ 2002-10-16  4:38 UTC (permalink / raw)
  Cc: handa, monnier+gnu/emacs, keichwa, emacs-devel

    > For such a case, isn't `feature' enough?  Those who want
    > this feature just do (require FEATURE).

    Well, that's not supposed to change the way Emacs works, but also, I
    think such things should be in Custom so that people can find them in
    the first place.

Both of these reasons are good ones.


Thread overview: 63+ messages
2002-08-19  7:48 Several serious problems Kenichi Handa
2002-08-22 17:08 ` Dave Love
2002-08-29 13:25   ` Kenichi Handa
2002-08-29 17:32     ` Stefan Monnier
2002-08-29 23:15       ` Dave Love
2002-08-30 14:36         ` Stefan Monnier
2002-09-04 17:23           ` Dave Love
2002-08-30  6:09       ` Richard Stallman
2002-08-31 17:30         ` Dave Love
2002-09-02  0:01           ` Richard Stallman
2002-09-04 17:15             ` Dave Love
2002-09-08 12:54               ` Richard Stallman
2002-09-12 22:38                 ` Dave Love
2002-09-13 19:34                   ` Richard Stallman
2002-09-25  7:01                   ` status of utf-8.el, etc [Re: Several serious problems] Kenichi Handa
2002-09-25 14:35                     ` Stefan Monnier
2002-09-25 23:47                       ` Kenichi Handa
2002-09-26 13:56                         ` Stefan Monnier
2002-09-27 13:22                           ` Kenichi Handa
2002-09-28  3:19                             ` Richard Stallman
2002-09-27 13:59                           ` Dave Love
2002-09-27 15:24                             ` Stefan Monnier
2002-09-28  3:20                               ` Richard Stallman
2002-10-04 22:26                               ` Dave Love
2002-10-05 16:59                                 ` Eli Zaretskii
2002-10-11 17:21                                   ` Dave Love
2002-10-12  8:27                                     ` Eli Zaretskii
2002-09-28  3:19                             ` Richard Stallman
2002-09-27 13:55                     ` Dave Love
2002-09-28  3:19                       ` Richard Stallman
2002-09-30  9:09                       ` Kenichi Handa
2002-09-30 13:29                         ` Stefan Monnier
2002-10-01  7:37                           ` Kenichi Handa
2002-10-01 20:03                             ` Richard Stallman
2002-10-10 12:25                               ` Kenichi Handa
2002-10-04 22:38                           ` Dave Love
2002-10-04 22:32                         ` Dave Love
2002-10-09  1:26                           ` Kenichi Handa
2002-10-15 17:38                             ` Dave Love
2002-10-16  4:38                               ` Richard Stallman
2002-08-29 23:09     ` Several serious problems Dave Love
2002-08-30  6:11       ` Richard Stallman
2002-09-04 17:21         ` Dave Love
2002-08-29 23:17     ` Dave Love
2002-08-30  6:11       ` Richard Stallman
2002-08-31 17:31         ` Dave Love
2002-09-02  0:01           ` Richard Stallman
2002-09-02  1:28             ` Kenichi Handa
2002-09-05 13:41               ` Dave Love
2002-09-05 23:32                 ` Kenichi Handa
2002-09-06 11:38                   ` Robert J. Chassell
2002-09-07 23:19                   ` Dave Love
2002-09-09  0:21                     ` Richard Stallman
2002-09-12 22:43                       ` Dave Love
2002-09-26  4:51                     ` Kenichi Handa
2002-09-10 16:36               ` Richard Stallman
2002-08-30  6:09     ` Richard Stallman
2002-08-24 12:11 ` Richard Stallman
2002-08-26 13:17   ` Kenichi Handa
2002-08-26 16:15     ` Stefan Monnier
2002-08-29 23:18       ` Dave Love
2002-08-30 14:36         ` Stefan Monnier
2002-08-29 23:19     ` Dave Love
