* ctext-pre-write-conversion barfs
From: Tak Ota @ 2002-02-23 6:53 UTC
2002-02-22 Eli Zaretskii <eliz@is.elta.co.il>
Support for ICCCM Extended Segments in X selections:
* international/mule-conf.el (ctext-no-compositions): New coding
system.
(compound-text-no-extensions): Renamed from compound-text.
(x-ctext-no-extensions, ctext-no-extensions): Aliases for
compound-text-no-extensions.
(compound-text): Redefined using post-read and pre-write
conversions.
* international/mule.el (non-standard-icccm-encodings-alist)
(non-standard-designations-alist): New variables.
(ctext-post-read-conversion, ctext-pre-write-conversion): New
functions.
`ctext-pre-write-conversion' is called from `write-region' as
annotations = build_annotations_2 (start, end,
coding.pre_write_conversion, annotations);
As an irregular case write-region sometimes passes a string in
`start'. It seems like ctext-pre-write-conversion is not prepared to
receive a string in START.
-Tak
_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/emacs-devel
* Re: ctext-pre-write-conversion barfs
From: Eli Zaretskii @ 2002-02-23 8:48 UTC
Cc: emacs-devel
> From: Tak Ota <Takaaki.Ota@am.sony.com>
> Date: Fri, 22 Feb 2002 22:53:55 -0800 (PST)
>
> `ctext-pre-write-conversion' is called from `write-region' as
>
> annotations = build_annotations_2 (start, end,
> coding.pre_write_conversion, annotations);
>
> As an irregular case write-region sometimes passes a string in
> `start'. It seems like ctext-pre-write-conversion is not prepared to
> receive a string in START.
Thanks, I will look into this.
Do you have any real-life example of using compound-text in a way that
causes it to be called from write-region? Note that compound-text is
generally inappropriate for use in file I/O, as its string says (it
can't DTRT with multibyte text).
* Re: ctext-pre-write-conversion barfs
From: Tak Ota @ 2002-02-23 16:11 UTC
Cc: emacs-devel
Sat, 23 Feb 2002 10:48:42 +0200: "Eli Zaretskii" <eliz@is.elta.co.il> wrote:
> Do you have any real-life example of using compound-text in a way that
> causes it to be called from write-region? Note that compound-text is
> generally inappropriate for use in file I/O, as its string says (it
> can't DTRT with multibyte text).
I don't know the exact mechanism why ctext-pre-write-conversion was
summoned. But it was where the debug-on-error brought me to, while
using a mail package 'Mew' (3.0.54). Following is the last function
issued in Mew (mew-mark.el) where write-region was called with a
string for the argument START.
(defun mew-summary-clean-folder-cache (folder)
  "Erase Summary mode then remove and touch the cache file."
  (if (get-buffer folder)
      (save-excursion
        (set-buffer folder)
        (mew-erase-buffer)
        (set-buffer-modified-p nil)))
  (let ((cfile (mew-expand-folder folder mew-summary-cache-file)))
    (if (file-exists-p cfile)
        (write-region "" nil cfile nil 'no-msg))))
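The call at the end of that function relies on the documented behavior of `write-region': when START is a string, the string itself is written and END is ignored, so writing "" empties the file. A minimal sketch of that idiom follows; `sketch-truncate-file' and the temporary file name are hypothetical names used only for illustration, not part of Mew or Emacs.

```elisp
;; Hypothetical helper illustrating the string form of START in
;; `write-region'.  It is not part of Mew or Emacs.
(defun sketch-truncate-file (file)
  "Empty FILE by writing the empty string to it, as the Mew code does."
  (write-region "" nil file nil 'no-msg))

(let ((file (make-temp-file "write-region-sketch")))
  (unwind-protect
      (progn
        ;; START is a string, so it is written instead of buffer text.
        (write-region "some cached summary\n" nil file nil 'no-msg)
        (sketch-truncate-file file)
        ;; Element 7 of `file-attributes' is the file size in bytes.
        (nth 7 (file-attributes file)))
    (delete-file file)))
```

Because no region of the current buffer is involved, any pre-write conversion attached to the coding system in effect receives the string rather than buffer positions.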
BTW, I just now tried to save this buffer and noticed that
ctext-pre-write-conversion was invoked. It is called 3 times for
each save-buffer. Here is the output from describe-coding-system.
-Tak
Coding system for saving this buffer:
x -- ctext-unix
Default coding system (for new files):
S -- sjis (alias of japanese-shift-jis)
Coding system for keyboard input:
S -- sjis (alias of japanese-shift-jis)
Coding system for terminal output:
S -- sjis (alias of japanese-shift-jis)
Defaults for subprocess I/O:
decoding: S -- sjis (alias of japanese-shift-jis)
encoding: S -- sjis (alias of japanese-shift-jis)
Priority order for recognizing coding systems when reading files:
1. iso-2022-jp (alias: junet)
2. japanese-iso-8bit (alias: euc-japan-1990 euc-japan euc-jp)
3. japanese-shift-jis (alias: shift_jis sjis)
4. iso-2022-jp-2
5. iso-latin-1 (alias: iso-8859-1 latin-1)
6. iso-2022-7bit
7. iso-2022-8bit-ss2
8. emacs-mule
9. raw-text (alias: mew-cs-text mew-cs-text-lf mew-cs-text-crlf mew-cs-text-cr mew-cs-text-net)
10. chinese-big5 (alias: big5 cn-big5)
11. no-conversion (alias: binary)
12. mule-utf-8 (alias: utf-8)
Other coding systems cannot be distinguished automatically
from these, and therefore cannot be recognized automatically
with the present coding system priorities.
The following are decoded correctly but recognized as iso-2022-jp-2:
iso-2022-7bit-ss2 iso-2022-7bit-lock iso-2022-7bit-lock-ss2 iso-2022-cn iso-2022-cn-ext iso-2022-kr
Particular coding systems specified for certain file names:
  OPERATION    TARGET PATTERN                     CODING SYSTEM(s)
  ---------    --------------                     ----------------
  File I/O     "\\.g?z\\(~\\|\\.~[0-9]+~\\)?\\'"  (no-conversion . no-conversion)
               "\\.tgz\\'"                        (no-conversion . no-conversion)
               "\\.bz2\\'"                        (no-conversion . no-conversion)
               "\\.Z\\(~\\|\\.~[0-9]+~\\)?\\'"    (no-conversion . no-conversion)
               "\\.elc\\'"                        (emacs-mule . emacs-mule)
               "\\.utf\\(-8\\)?\\'"               utf-8
               "\\(\\`\\|/\\)loaddefs.el\\'"      (raw-text . raw-text-unix)
               "\\.tar\\'"                        (no-conversion . no-conversion)
               ""                                 find-buffer-file-type-coding-system
  Process I/O  nothing specified
  Network I/O  "nntp"                             (junet-unix . junet-unix)
               110                                (no-conversion . no-conversion)
               25                                 (no-conversion . no-conversion)
* (no subject)
From: Eli Zaretskii @ 2002-02-23 18:51 UTC
Cc: emacs-devel
Subject: Re: ctext-pre-write-conversion barfs
> Date: Sat, 23 Feb 2002 08:11:49 -0800 (PST)
> From: Tak Ota <Takaaki.Ota@am.sony.com>
>
> BTW, I just now tried to save this buffer and noticed that
> ctext-pre-write-conversion was invoked. It is called 3 times for
> each save-buffer. Here is the output from describe-coding-system.
>
> Coding system for saving this buffer:
> x -- ctext-unix
Could you please find out how come the buffer's encoding got set to
ctext-unix? It's a very unusual coding system for buffers.
Compound-text is normally used for X selections only.
The reason I'm asking you to look into this is that the assumption
behind the code I wrote for the ctext extensions is that ctext is not
normally used for file I/O. There are limitations of the pre-write
and post-read conversions that make the modified ctext coding system
inappropriate for reading and writing text to/from multibyte buffers.
If ctext is used for file I/O, I will have to revert the decision to
call the new encoding `ctext', and will find some other name.
So please look into this. Thanks in advance.
* Re: (no subject)
From: Tak Ota @ 2002-02-23 23:11 UTC
Cc: eliz, emacs-devel
Yamamoto-san,
We found a small conflict between a recent change made to the
development version of Emacs and the coding system used in the Mew package.
Could you answer the following question? What I can tell is that
mew-mule3.el defines it as:
;; ctext for consistency --unibyte, -unix for XEmacs's ^M
(defvar mew-cs-m17n (if (mew-coding-system-p 'ctext-unix) 'ctext-unix 'ctext))
Therefore many buffers in Mew set ctext-unix as the default coding
system. However, I have no idea how this decision was made.
-Tak
* Re: [mew-int 00737] Re: (no subject)
From: Kazu Yamamoto @ 2002-02-25 1:11 UTC
From: Tak Ota <Takaaki.Ota@am.sony.com>
Subject: [mew-int 00737] Re: (no subject)
> Therefore many buffers in Mew set ctext-unix as the default coding
> system. However, I have no idea how this decision was made.
Yes, this is intentional.
Ctext is the only character set which can cover ISO-2022-related
character sets and ISO-8859-1.
If we use US-ASCII and the 8-bit portion of ctext, this is identical to
ISO-8859-1. This is friendly to non-Mule Emacs, Emacs --unibyte for
example.
Of course, Emacs (--multibyte) can read ctext.
So, to my best knowledge, ctext is the only character set which can
survive in the old Emacs world and the multilingual Emacs world.
Since Mew uses the CR character as a separator in Summary mode (for
historical reasons: selective-display was used), ctext-unix is
necessary. (CR MUST NOT be converted to LF.)
What I can say is as follows:
(1) Since selective-display is not used anymore, another separator
rather than CR can be used. ctext will be sufficient.
(2) Even if we continue to use CR, ctext-unix is necessary only for
Summary mode. ctext is enough for other buffers.
--Kazu
* Re: [mew-int 00737] Re: (no subject)
From: Eli Zaretskii @ 2002-02-25 6:55 UTC
Cc: mew-int, emacs-devel
On Mon, 25 Feb 2002, Kazu Yamamoto wrote:
> From: Tak Ota <Takaaki.Ota@am.sony.com>
> Subject: [mew-int 00737] Re: (no subject)
>
> > Therefore many buffers in Mew set ctext-unix as the default coding
> > system. However, I have no idea how this decision was made.
>
> Yes, this is intentional.
>
> Ctext is the only character set which can cover ISO-2022-related
> character sets and ISO-8859-1.
>
> If we use US-ASCII and the 8-bit portion of ctext, this is identical to
> ISO-8859-1. This is friendly to non-Mule Emacs, Emacs --unibyte for
> example.
Handa-san, it sounds like the new encoding with ICCCM Extended Segments
support should not be called ctext, because ctext is used in file I/O, at
least by Mew. It sounds like we should leave ctext as it was working
before, including the fact that it didn't support the ICCCM Extended
Segments, and use another name (e.g., compound-text-with-extensions) for
the new coding system. We will then have to make that new coding system
be the default for X selections in CVS head.
Do you agree?
> So, to my best knowledge, ctext is the only character set which can
> survive in the old Emacs world and the multilingual Emacs world.
It was IMHO an unfortunate decision to use ctext for file I/O, since
ctext must support the ICCCM spec which is inappropriate for encoding
anything but X selections. However, given that Mew has used it for quite
some time, Emacs shouldn't break it, I think.
> (1) Since selective-display is not used anymore, another separator
> rather than CR can be used. ctext will be sufficient.
>
> (2) Even if we continue to use CR, ctext-unix is necessary only for
> Summary mode. ctext is enough for other buffers.
The -unix part is not the problem. The problem is that ctext was changed
in Emacs CVS to support Extended Segments, in accordance with the ICCCM
spec (because some versions of X are using that ICCCM feature to encode
ISO-8859-15 characters in selections). That change in ctext caused it to
call a pre-write-conversion function which couldn't handle being called on
a string.
While I can (and most probably shall) fix the new coding system to be
able to be called from write-region, there are other subtle aspects of
the new coding system that make it inappropriate for file I/O. So I
think we had better leave the original ctext alone.
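One way a pre-write conversion could cope with being called from write-region on a string is sketched below. This is only an illustration under stated assumptions, not the actual `ctext-pre-write-conversion': the name `sketch-pre-write-conversion', the work buffer name, and the use of `upcase-region' as a stand-in for the real conversion are all hypothetical.

```elisp
;; Hypothetical sketch, not the actual ctext-pre-write-conversion.
(defun sketch-pre-write-conversion (from to)
  "Prepare text for encoding; FROM may be a string or a buffer position.
When FROM is a string, copy it into a fresh work buffer, make that
buffer current, and operate on the whole buffer.  Otherwise operate
in place on the region FROM..TO of the current buffer."
  (when (stringp from)
    ;; write-region passed a string: there is no region to narrow to,
    ;; so build one in a temporary work buffer.
    (set-buffer (generate-new-buffer " *sketch-pre-write*"))
    (insert from)
    (setq from (point-min)
          to (point-max)))
  (save-restriction
    (narrow-to-region from to)
    (goto-char (point-min))
    ;; Stand-in for the real conversion: upcase the text so that the
    ;; effect is visible.
    (upcase-region (point-min) (point-max)))
  ;; This sketch simply returns nil.
  nil)

;; Usage: the string case leaves the converted text in the work
;; buffer, which is current when the function returns.
(save-current-buffer
  (sketch-pre-write-conversion "abc" nil)
  (prog1 (buffer-string)
    (kill-buffer (current-buffer))))
```

The design choice here is to normalize the string case into the region case up front, so the conversion body only ever deals with buffer positions.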
* Re: [mew-int 00737] Re: (no subject)
From: Kenichi Handa @ 2002-02-25 7:10 UTC
Cc: kazu, mew-int, emacs-devel
Eli Zaretskii <eliz@is.elta.co.il> writes:
> Handa-san, it sounds like the new encoding with ICCCM Extended Segments
> support should not be called ctext, because ctext is used in file I/O, at
> least by Mew. It sounds like we should leave ctext as it was working
> before, including the fact that it didn't support the ICCCM Extended
> Segments, and use another name (e.g., compound-text-with-extensions) for
> the new coding system. We will then have to make that new coding system
> be the default for X selections in CVS head.
> Do you agree?
>> So, to my best knowledge, ctext is the only character set which can
>> survive in the old Emacs world and the multilingual Emacs world.
> It was IMHO an unfortunate decision to use ctext for file I/O, since
> ctext must support the ICCCM spec which is inappropriate for encoding
> anything but X selections. However, given that Mew has used it for quite
> some time, Emacs shouldn't break it, I think.
Considering this situation, I agree with the name change.
---
Ken'ichi HANDA
handa@etl.go.jp
* Re: ctext-pre-write-conversion barfs
From: Eli Zaretskii @ 2002-02-26 16:51 UTC
Cc: emacs-devel
> Date: Sat, 23 Feb 2002 08:11:49 -0800 (PST)
> From: Tak Ota <Takaaki.Ota@am.sony.com>
>
> > Do you have any real-life example of using compound-text in a way that
> > causes it to be called from write-region? Note that compound-text is
> > generally inappropriate for use in file I/O, as its string says (it
> > can't DTRT with multibyte text).
>
> I don't know the exact mechanism why ctext-pre-write-conversion was
> summoned. But it was where the debug-on-error brought me to, while
> using a mail package 'Mew' (3.0.54). Following is the last function
> issued in Mew (mew-mark.el) where write-region was called with a
> string for the argument START.
This should be fixed now in CVS head. Thanks for reporting this
problem.
* Re: [mew-int 00737] Re: (no subject)
From: Eli Zaretskii @ 2002-02-26 16:52 UTC
Cc: kazu, mew-int, emacs-devel
> Date: Mon, 25 Feb 2002 16:10:20 +0900 (JST)
> From: Kenichi Handa <handa@etl.go.jp>
>
> >> So, to my best knowledge, ctext is the only character set which can
> >> survive in the old Emacs world and the multilingual Emacs world.
>
> > It was IMHO an unfortunate decision to use ctext for file I/O, since
> > ctext must support the ICCCM spec which is inappropriate for encoding
> > anything but X selections. However, given that Mew has used it for quite
> > some time, Emacs shouldn't break it, I think.
>
> Considering this situation, I agree with the name change.
I made the change in CVS head. (It isn't required in the RC, since I
didn't change the names there to begin with.)