unofficial mirror of emacs-devel@gnu.org 
* ctext-pre-write-conversion barfs
@ 2002-02-23  6:53 Tak Ota
  2002-02-23  8:48 ` Eli Zaretskii
  0 siblings, 1 reply; 10+ messages in thread
From: Tak Ota @ 2002-02-23  6:53 UTC


2002-02-22  Eli Zaretskii  <eliz@is.elta.co.il>

	Support for ICCCM Extended Segments in X selections:

	* international/mule-conf.el (ctext-no-compositions): New coding
	system.
	(compound-text-no-extensions): Renamed from compound-text.
	(x-ctext-no-extensions, ctext-no-extensions): Aliases for
	compound-text-no-extensions.
	(compound-text): Redefined using post-read and pre-write
	conversions.

	* international/mule.el (non-standard-icccm-encodings-alist)
	(non-standard-designations-alist): New variables.
	(ctext-post-read-conversion, ctext-pre-write-conversion): New
	functions.

`ctext-pre-write-conversion' is called from `write-region' as

  annotations = build_annotations_2 (start, end,
				     coding.pre_write_conversion, annotations);

As an irregular case write-region sometimes passes a string in
`start'.  It seems like ctext-pre-write-conversion is not prepared to
receive a string in START.
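
A minimal sketch of how a pre-write-conversion function might cope with
both calling conventions, assuming (as described above) that it receives
either two buffer positions or a string in START with END nil.  The names
`my-region-or-string-text' and `my-pre-write-conversion' are hypothetical,
and this is not the code in international/mule.el.

```elisp
;; Hypothetical helper: not part of Emacs or of the actual fix.
(defun my-region-or-string-text (from to)
  "Return the text designated by FROM and TO as a string.
FROM is either a string (TO is then ignored) or a buffer position,
in which case the text between FROM and TO in the current buffer
is returned."
  (if (stringp from)
      from
    (buffer-substring-no-properties from to)))

;; Hypothetical pre-write-conversion function built on the helper.
(defun my-pre-write-conversion (from to)
  "Copy the text to be written into a work buffer and make it current.
Assumption: the caller encodes the contents of whatever buffer is
current on return, and expects a nil return value."
  ;; TEXT is computed first, while the original buffer is still current.
  (let ((text (my-region-or-string-text from to))
        (work (get-buffer-create " *my-pre-write-work*")))
    (set-buffer work)
    (erase-buffer)
    (insert text)
    ;; Any real transformation of the text would go here.
    nil))
```

The point of the helper is only that the string case is checked before
FROM is ever used as a buffer position.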

-Tak

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/emacs-devel



* Re: ctext-pre-write-conversion barfs
  2002-02-23  6:53 ctext-pre-write-conversion barfs Tak Ota
@ 2002-02-23  8:48 ` Eli Zaretskii
  2002-02-23 16:11   ` Tak Ota
  0 siblings, 1 reply; 10+ messages in thread
From: Eli Zaretskii @ 2002-02-23  8:48 UTC
  Cc: emacs-devel

> From: Tak Ota <Takaaki.Ota@am.sony.com>
> Date: Fri, 22 Feb 2002 22:53:55 -0800 (PST)
> 
> `ctext-pre-write-conversion' is called from `write-region' as
> 
>   annotations = build_annotations_2 (start, end,
> 				     coding.pre_write_conversion, annotations);
> 
> As an irregular case write-region sometimes passes a string in
> `start'.  It seems like ctext-pre-write-conversion is not prepared to
> receive a string in START.

Thanks, I will look into this.

Do you have any real-life example of using compound-text in a way that
causes it to be called from write-region?  Note that compound-text is
generally inappropriate for use in file I/O, as its string says (it
can't DTRT with multibyte text).


* Re: ctext-pre-write-conversion barfs
  2002-02-23  8:48 ` Eli Zaretskii
@ 2002-02-23 16:11   ` Tak Ota
  2002-02-23 18:51     ` (no subject) Eli Zaretskii
  2002-02-26 16:51     ` ctext-pre-write-conversion barfs Eli Zaretskii
  0 siblings, 2 replies; 10+ messages in thread
From: Tak Ota @ 2002-02-23 16:11 UTC
  Cc: emacs-devel

Sat, 23 Feb 2002 10:48:42 +0200: "Eli Zaretskii" <eliz@is.elta.co.il> wrote:

> Do you have any real-life example of using compound-text in a way that
> causes it to be called from write-region?  Note that compound-text is
> generally inappropriate for use in file I/O, as its string says (it
> can't DTRT with multibyte text).

I don't know exactly why ctext-pre-write-conversion was called, but
that is where debug-on-error took me while I was using the mail
package 'Mew' (3.0.54).  The following is the last Mew function
(in mew-mark.el) to run; it calls write-region with a string for the
argument START.

(defun mew-summary-clean-folder-cache (folder)
  "Erase Summary mode then remove and touch the cache file."
  (if (get-buffer folder)
      (save-excursion
	(set-buffer folder)
	(mew-erase-buffer)
	(set-buffer-modified-p nil)))
  (let ((cfile (mew-expand-folder folder mew-summary-cache-file)))
    (if (file-exists-p cfile)
	(write-region "" nil cfile nil 'no-msg))))
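
The same call shape can be tried outside Mew with a sketch like the one
below.  `my-ctext-write-repro' is a hypothetical name, and binding
`coding-system-for-write' is an assumption standing in for the coding
system Mew sets on its buffers.  On an Emacs where the problem is fixed
this should simply succeed.

```elisp
;; Hypothetical helper for reproducing the call pattern outside Mew.
(defun my-ctext-write-repro ()
  "Write an empty string to a temporary file encoded as ctext-unix.
Return the size of the resulting file in bytes."
  (let ((file (make-temp-file "ctext-repro-"))
        ;; Assumption: forcing the coding system here stands in for
        ;; the buffer-level setting made by Mew.
        (coding-system-for-write 'ctext-unix))
    (unwind-protect
        (progn
          ;; START is a string and END is nil: the same shape as Mew's call.
          (write-region "" nil file nil 'no-msg)
          (nth 7 (file-attributes file)))
      (delete-file file))))
```

Evaluating (my-ctext-write-repro) with debug-on-error set to t should
show whether the pre-write conversion accepts a string START.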

BTW, I just now tried to save this buffer and noticed that
ctext-pre-write-conversion was invoked.  It is called 3 times for
each save-buffer.  Here is the output from describe-coding-system.

-Tak


Coding system for saving this buffer:
  x -- ctext-unix

Default coding system (for new files):
  S -- sjis (alias of japanese-shift-jis)

Coding system for keyboard input:
  S -- sjis (alias of japanese-shift-jis)

Coding system for terminal output:
  S -- sjis (alias of japanese-shift-jis)

Defaults for subprocess I/O:
  decoding: S -- sjis (alias of japanese-shift-jis)

  encoding: S -- sjis (alias of japanese-shift-jis)


Priority order for recognizing coding systems when reading files:
  1. iso-2022-jp (alias: junet)
  2. japanese-iso-8bit (alias: euc-japan-1990 euc-japan euc-jp)
  3. japanese-shift-jis (alias: shift_jis sjis)
  4. iso-2022-jp-2 
  5. iso-latin-1 (alias: iso-8859-1 latin-1)
  6. iso-2022-7bit 
  7. iso-2022-8bit-ss2 
  8. emacs-mule 
  9. raw-text (alias: mew-cs-text mew-cs-text-lf mew-cs-text-crlf mew-cs-text-cr mew-cs-text-net)
  10. chinese-big5 (alias: big5 cn-big5)
  11. no-conversion (alias: binary)
  12. mule-utf-8 (alias: utf-8)

  Other coding systems cannot be distinguished automatically
  from these, and therefore cannot be recognized automatically
  with the present coding system priorities.

  The following are decoded correctly but recognized as iso-2022-jp-2:
    iso-2022-7bit-ss2 iso-2022-7bit-lock iso-2022-7bit-lock-ss2 iso-2022-cn iso-2022-cn-ext iso-2022-kr

Particular coding systems specified for certain file names:

  OPERATION	TARGET PATTERN		CODING SYSTEM(s)
  ---------	--------------		----------------
  File I/O	"\\.g?z\\(~\\|\\.~[0-9]+~\\)?\\'"
					(no-conversion . no-conversion)
		"\\.tgz\\'"		(no-conversion . no-conversion)
		"\\.bz2\\'"		(no-conversion . no-conversion)
		"\\.Z\\(~\\|\\.~[0-9]+~\\)?\\'"
					(no-conversion . no-conversion)
		"\\.elc\\'"		(emacs-mule . emacs-mule)
		"\\.utf\\(-8\\)?\\'"	utf-8
		"\\(\\`\\|/\\)loaddefs.el\\'"
					(raw-text . raw-text-unix)
		"\\.tar\\'"		(no-conversion . no-conversion)
		""			find-buffer-file-type-coding-system
  Process I/O	nothing specified
  Network I/O	"nntp"			(junet-unix . junet-unix)
		110			(no-conversion . no-conversion)
		25			(no-conversion . no-conversion)


* (no subject)
  2002-02-23 16:11   ` Tak Ota
@ 2002-02-23 18:51     ` Eli Zaretskii
  2002-02-23 23:11       ` Tak Ota
  2002-02-26 16:51     ` ctext-pre-write-conversion barfs Eli Zaretskii
  1 sibling, 1 reply; 10+ messages in thread
From: Eli Zaretskii @ 2002-02-23 18:51 UTC
  Cc: emacs-devel

> Date: Sat, 23 Feb 2002 08:11:49 -0800 (PST)
> From: Tak Ota <Takaaki.Ota@am.sony.com>
> 
> BTW, I just now tried to save this buffer and noticed that
> ctext-pre-write-conversion was invoked.  It is called 3 times for
> each save-buffer.  Here is the output from describe-coding-system.
> 
> Coding system for saving this buffer:
>   x -- ctext-unix

Could you please find out how come the buffer's encoding got set to
ctext-unix?  It's a very unusual coding system for buffers.
Compound-text is normally used for X selections only.

The reason I'm asking you to look into this is that the assumption
behind the code I wrote for the ctext extensions is that ctext is not
normally used for file I/O.  There are limitations of the pre-write
and post-read conversions that make the modified ctext coding system
inappropriate for reading and writing text to/from multibyte buffers.
If ctext is used for file I/O, I will have to revert the decision to
call the new encoding `ctext', and will find some other name.

So please look into this.  Thanks in advance.


* Re: (no subject)
  2002-02-23 18:51     ` (no subject) Eli Zaretskii
@ 2002-02-23 23:11       ` Tak Ota
  2002-02-25  1:11         ` [mew-int 00737] " Kazu Yamamoto
  0 siblings, 1 reply; 10+ messages in thread
From: Tak Ota @ 2002-02-23 23:11 UTC
  Cc: eliz, emacs-devel

Yamamoto-san,

We have found a small conflict between a recent change in the
development version of Emacs and the coding system used in the Mew
package.

Could you answer the following question?  As far as I can tell,
mew-mule3.el defines:

;; ctext for consistency --unibyte, -unix for XEmacs's ^M
(defvar mew-cs-m17n (if (mew-coding-system-p 'ctext-unix) 'ctext-unix 'ctext))

Therefore many buffers in Mew set ctext-unix as the default coding
system.  However, I have no idea how this decision was made.

-Tak

Sat, 23 Feb 2002 13:51:54 -0500: Eli Zaretskii <eliz@gnu.org> wrote:

> > Date: Sat, 23 Feb 2002 08:11:49 -0800 (PST)
> > From: Tak Ota <Takaaki.Ota@am.sony.com>
> > 
> > BTW, I just now tried to save this buffer and noticed that
> > ctext-pre-write-conversion was invoked.  It is called 3 times for
> > each save-buffer.  Here is the output from describe-coding-system.
> > 
> > Coding system for saving this buffer:
> >   x -- ctext-unix
> 
> Could you please find out how come the buffer's encoding got set to
> ctext-unix?  It's a very unusual coding system for buffers.
> Compound-text is normally used for X selections only.
> 
> The reason I'm asking you to look into this is that the assumption
> behind the code I wrote for the ctext extensions is that ctext is not
> normally used for file I/O.  There are limitations of the pre-write
> and post-read conversions that make the modified ctext coding system
> inappropriate for reading and writing text to/from multibyte buffers.
> If ctext is used for file I/O, I will have to revert the decision to
> call the new encoding `ctext', and will find some other name.
> 
> So please look into this.  Thanks in advance.

* Re: [mew-int 00737] Re: (no subject)
  2002-02-23 23:11       ` Tak Ota
@ 2002-02-25  1:11         ` Kazu Yamamoto
  2002-02-25  6:55           ` Eli Zaretskii
  0 siblings, 1 reply; 10+ messages in thread
From: Kazu Yamamoto @ 2002-02-25  1:11 UTC


From: Tak Ota <Takaaki.Ota@am.sony.com>
Subject: [mew-int 00737] Re: (no subject)

> Therefore many buffers in Mew sets ctext-unix as the default coding
> system.  However, I have no idea how this decision was made.

Yes, this is intentional. 

Ctext is the only character set which can cover ISO-2022 related
character sets and ISO-8859-1.

If we use US-ASCII and the 8bit portion of ctext, this is identical to
ISO-8859-1. This is friendly to non-mule Emacs, Emacs --unibyte for
example.

Of course, Emacs (--multibyte) can read ctext.

So, to the best of my knowledge, ctext is the only character set which
can survive in the old Emacs world and the multilingual Emacs world.

Since Mew uses the CR character as a separator in Summary mode (for
historical reasons: selective-display was used), ctext-unix is
necessary. (CR MUST not be converted to LF).

What I can say is as follows:

(1) Since selective-display is not used anymore, a separator other
    than CR can be used. ctext will be sufficient.

(2) Even if we continue to use CR, ctext-unix is necessary only for
    Summary mode. ctext is enough for other buffers.

--Kazu


* Re: [mew-int 00737] Re: (no subject)
  2002-02-25  1:11         ` [mew-int 00737] " Kazu Yamamoto
@ 2002-02-25  6:55           ` Eli Zaretskii
  0 siblings, 0 replies; 10+ messages in thread
From: Eli Zaretskii @ 2002-02-25  6:55 UTC
  Cc: mew-int, emacs-devel


On Mon, 25 Feb 2002, Kazu Yamamoto wrote:

> From: Tak Ota <Takaaki.Ota@am.sony.com>
> Subject: [mew-int 00737] Re: (no subject)
> 
> > Therefore many buffers in Mew sets ctext-unix as the default coding
> > system.  However, I have no idea how this decision was made.
> 
> Yes, this is intentional. 
> 
> Ctext is the only character set which can cover ISO-2022 related
> character sets and ISO-8859-1.
> 
> If we use US-ASCII and the 8bit portion of ctext, this is identical to
> ISO-8859-1. This is friendly to non-mule Emacs, Emacs --unibyte for
> example.

Handa-san, it sounds like the new encoding with ICCCM Extended Segments 
support should not be called ctext, because ctext is used in file I/O, at 
least by Mew.  It sounds like we should leave ctext as it was working 
before, including the fact that it didn't support the ICCCM Extended 
Segments, and use another name (e.g., compound-text-with-extensions) for 
the new coding system.  We will then have to make that new coding system 
be the default for X selections in CVS head.

Do you agree?

> So, to the best of my knowledge, ctext is the only character set which
> can survive in the old Emacs world and the multilingual Emacs world.

It was IMHO an unfortunate decision to use ctext for file I/O, since 
ctext must support the ICCCM spec which is inappropriate for encoding 
anything but X selections.  However, given that Mew has used it for quite 
some time, Emacs shouldn't break it, I think.

> (1) Since selective-display is not used anymore, another separator
> rather than CR can be used. ctext will be sufficient.
> 
> (2) Even if we continue to use CR, ctext-unix is necessary only for
>     Summary mode. ctext is enough for other buffers.

The -unix part is not the problem.  The problem is that ctext was changed 
in Emacs CVS to support Extended Segments, in accordance with the ICCCM 
spec (because some versions of X are using that ICCCM feature to encode 
ISO8859-15 characters in selections).  That change in ctext caused it to 
call a pre-write-conversion function which couldn't handle being called 
on a string.

While I can (and most probably shall) fix the new coding system to be 
able to be called from write-region, there are other subtle aspects of 
the new coding system that make it inappropriate for file I/O.  So I 
think we had better leave the original ctext alone.


* Re: [mew-int 00737] Re: (no subject)
@ 2002-02-25  7:10 Kenichi Handa
  2002-02-26 16:52 ` Eli Zaretskii
  0 siblings, 1 reply; 10+ messages in thread
From: Kenichi Handa @ 2002-02-25  7:10 UTC
  Cc: kazu, mew-int, emacs-devel

Eli Zaretskii <eliz@is.elta.co.il> writes:
> Handa-san, it sounds like the new encoding with ICCCM Extended Segments 
> support should not be called ctext, because ctext is used in file I/O, at 
> least by Mew.  It sounds like we should leave ctext as it was working 
> before, including the fact that it didn't support the ICCCM Extended 
> Segments, and use another name (e.g., compound-text-with-extensions) for 
> the new coding system.  We will then have to make that new coding system 
> be the default for X selections in CVS head.

> Do you agree?

>>  So, to the best of my knowledge, ctext is the only character set which
>>  can survive in the old Emacs world and the multilingual Emacs world.

> It was IMHO an unfortunate decision to use ctext for file I/O, since 
> ctext must support the ICCCM spec which is inappropriate for encoding 
> anything but X selections.  However, given that Mew uses that for quite 
> some time, Emacs shouldn't break it, I think.

Considering this situation, I agree with the name change.

---
Ken'ichi HANDA
handa@etl.go.jp


* Re: ctext-pre-write-conversion barfs
  2002-02-23 16:11   ` Tak Ota
  2002-02-23 18:51     ` (no subject) Eli Zaretskii
@ 2002-02-26 16:51     ` Eli Zaretskii
  1 sibling, 0 replies; 10+ messages in thread
From: Eli Zaretskii @ 2002-02-26 16:51 UTC
  Cc: emacs-devel

> Date: Sat, 23 Feb 2002 08:11:49 -0800 (PST)
> From: Tak Ota <Takaaki.Ota@am.sony.com>
> 
> > Do you have any real-life example of using compound-text in a way that
> > causes it to be called from write-region?  Note that compound-text is
> > generally inappropriate for use in file I/O, as its string says (it
> > can't DTRT with multibyte text).
> 
> I don't know the exact mechanism why ctext-pre-write-conversion was
> summoned.  But it was where the debug-on-error brought me to, while
> using a mail package 'Mew' (3.0.54).  Following is the last function
> issued in Mew (mew-mark.el) where write-region was called with a
> string for the argument START.

This should be fixed now in CVS head.  Thanks for reporting this
problem.


* Re: [mew-int 00737] Re: (no subject)
  2002-02-25  7:10 [mew-int 00737] Re: (no subject) Kenichi Handa
@ 2002-02-26 16:52 ` Eli Zaretskii
  0 siblings, 0 replies; 10+ messages in thread
From: Eli Zaretskii @ 2002-02-26 16:52 UTC
  Cc: kazu, mew-int, emacs-devel

> Date: Mon, 25 Feb 2002 16:10:20 +0900 (JST)
> From: Kenichi Handa <handa@etl.go.jp>
> 
> >>  So, to the best of my knowledge, ctext is the only character set which
> >>  can survive in the old Emacs world and the multilingual Emacs world.
> 
> > It was IMHO an unfortunate decision to use ctext for file I/O, since 
> > ctext must support the ICCCM spec which is inappropriate for encoding 
> > anything but X selections.  However, given that Mew uses that for quite 
> > some time, Emacs shouldn't break it, I think.
> 
> Considering this situation, I agree with the name change.

I made the change in CVS head.  (It isn't required in the RC, since I
didn't change the names there to begin with.)
