* ctext-pre-write-conversion barfs
From: Tak Ota @ 2002-02-23 6:53 UTC

2002-02-22  Eli Zaretskii  <eliz@is.elta.co.il>

    Support for ICCCM Extended Segments in X selections:

    * international/mule-conf.el (ctext-no-compositions): New coding
    system.
    (compound-text-no-extensions): Renamed from compound-text.
    (x-ctext-no-extensions, ctext-no-extensions): Aliases for
    compound-text-no-extensions.
    (compound-text): Redefined using post-read and pre-write
    conversions.

    * international/mule.el (non-standard-icccm-encodings-alist)
    (non-standard-designations-alist): New variables.
    (ctext-post-read-conversion, ctext-pre-write-conversion): New
    functions.

`ctext-pre-write-conversion' is called from `write-region' as

    annotations = build_annotations_2 (start, end,
                                       coding.pre_write_conversion,
                                       annotations);

As an irregular case, write-region sometimes passes a string in
`start'.  It seems that ctext-pre-write-conversion is not prepared to
receive a string in START.

-Tak

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/emacs-devel
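The failure reported above comes down to a function that assumes START
is always a buffer position.  As a minimal sketch only (this is not the
actual `ctext-pre-write-conversion' code, and `demo-pre-write-text' is
a hypothetical name introduced here for illustration), a conversion
helper could test for the string case before touching the buffer:

```elisp
;; Hypothetical helper for illustration only; it is not part of Emacs.
;; It shows the type check a pre-write function needs when START may be
;; either a buffer position or a string, as write-region allows.
(defun demo-pre-write-text (start end)
  "Return the text designated by START and END.
If START is a string, return it and ignore END.  Otherwise treat
START and END as positions in the current buffer."
  (if (stringp start)
      start
    (buffer-substring-no-properties start end)))

;; Region case: START and END are buffer positions.
(with-temp-buffer
  (insert "hello world")
  (demo-pre-write-text 1 6))

;; String case, mirroring a call such as (write-region "" nil FILE).
(demo-pre-write-text "" nil)
```

A real conversion function would go on to encode the text it obtains;
the sketch stops at the point where the two calling conventions have
been reconciled.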
* Re: ctext-pre-write-conversion barfs
From: Eli Zaretskii @ 2002-02-23 8:48 UTC
Cc: emacs-devel

> From: Tak Ota <Takaaki.Ota@am.sony.com>
> Date: Fri, 22 Feb 2002 22:53:55 -0800 (PST)
>
> `ctext-pre-write-conversion' is called from `write-region' as
>
>     annotations = build_annotations_2 (start, end,
>                                        coding.pre_write_conversion,
>                                        annotations);
>
> As an irregular case, write-region sometimes passes a string in
> `start'.  It seems that ctext-pre-write-conversion is not prepared to
> receive a string in START.

Thanks, I will look into this.

Do you have any real-life example of using compound-text in a way that
causes it to be called from write-region?  Note that compound-text is
generally inappropriate for use in file I/O, as its string says (it
can't DTRT with multibyte text).
* Re: ctext-pre-write-conversion barfs
From: Tak Ota @ 2002-02-23 16:11 UTC
Cc: emacs-devel

Sat, 23 Feb 2002 10:48:42 +0200: "Eli Zaretskii" <eliz@is.elta.co.il> wrote:

> Do you have any real-life example of using compound-text in a way that
> causes it to be called from write-region?  Note that compound-text is
> generally inappropriate for use in file I/O, as its string says (it
> can't DTRT with multibyte text).

I don't know the exact mechanism why ctext-pre-write-conversion was
summoned.  But it was where the debug-on-error brought me to, while
using a mail package 'Mew' (3.0.54).  Following is the last function
issued in Mew (mew-mark.el) where write-region was called with a
string for the argument START.

(defun mew-summary-clean-folder-cache (folder)
  "Erase Summary mode then remove and touch the cache file."
  (if (get-buffer folder)
      (save-excursion
        (set-buffer folder)
        (mew-erase-buffer)
        (set-buffer-modified-p nil)))
  (let ((cfile (mew-expand-folder folder mew-summary-cache-file)))
    (if (file-exists-p cfile)
        (write-region "" nil cfile nil 'no-msg))))

BTW, I just now tried to save this buffer and noticed that
ctext-pre-write-conversion was invoked.  It is called 3 times for
each save-buffer.  Here is the output from describe-coding-system.

-Tak

Coding system for saving this buffer:
  x -- ctext-unix

Default coding system (for new files):
  S -- sjis (alias of japanese-shift-jis)

Coding system for keyboard input:
  S -- sjis (alias of japanese-shift-jis)

Coding system for terminal output:
  S -- sjis (alias of japanese-shift-jis)

Defaults for subprocess I/O:
  decoding: S -- sjis (alias of japanese-shift-jis)
  encoding: S -- sjis (alias of japanese-shift-jis)

Priority order for recognizing coding systems when reading files:
  1. iso-2022-jp (alias: junet)
  2. japanese-iso-8bit (alias: euc-japan-1990 euc-japan euc-jp)
  3. japanese-shift-jis (alias: shift_jis sjis)
  4. iso-2022-jp-2
  5. iso-latin-1 (alias: iso-8859-1 latin-1)
  6. iso-2022-7bit
  7. iso-2022-8bit-ss2
  8. emacs-mule
  9. raw-text (alias: mew-cs-text mew-cs-text-lf mew-cs-text-crlf
     mew-cs-text-cr mew-cs-text-net)
  10. chinese-big5 (alias: big5 cn-big5)
  11. no-conversion (alias: binary)
  12. mule-utf-8 (alias: utf-8)

  Other coding systems cannot be distinguished automatically
  from these, and therefore cannot be recognized automatically
  with the present coding system priorities.

  The following are decoded correctly but recognized as iso-2022-jp-2:
    iso-2022-7bit-ss2 iso-2022-7bit-lock iso-2022-7bit-lock-ss2
    iso-2022-cn iso-2022-cn-ext iso-2022-kr

Particular coding systems specified for certain file names:

  OPERATION     TARGET PATTERN                        CODING SYSTEM(s)
  ---------     --------------                        ----------------
  File I/O      "\\.g?z\\(~\\|\\.~[0-9]+~\\)?\\'"     (no-conversion . no-conversion)
                "\\.tgz\\'"                           (no-conversion . no-conversion)
                "\\.bz2\\'"                           (no-conversion . no-conversion)
                "\\.Z\\(~\\|\\.~[0-9]+~\\)?\\'"       (no-conversion . no-conversion)
                "\\.elc\\'"                           (emacs-mule . emacs-mule)
                "\\.utf\\(-8\\)?\\'"                  utf-8
                "\\(\\`\\|/\\)loaddefs.el\\'"         (raw-text . raw-text-unix)
                "\\.tar\\'"                           (no-conversion . no-conversion)
                ""                                    find-buffer-file-type-coding-system
  Process I/O   nothing specified
  Network I/O   "nntp"                                (junet-unix . junet-unix)
                110                                   (no-conversion . no-conversion)
                25                                    (no-conversion . no-conversion)
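The `write-region' call in the Mew function quoted in this message can
be tried in isolation.  The following is a sketch under assumptions:
the temporary file and the explicit binding of
`coding-system-for-write' are introduced here for illustration (Mew
sets the coding system on its buffers instead), and
`demo-truncate-with-ctext' is a hypothetical name.  In an Emacs whose
conversion function copes with a string START, the call should leave
an empty file and signal no error:

```elisp
;; Hypothetical reproduction helper; not part of Emacs or Mew.
(defun demo-truncate-with-ctext ()
  "Write an empty string to a temp file using ctext-unix.
Return the resulting file size in bytes."
  (let ((file (make-temp-file "ctext-demo"))
        ;; Assumption: force the coding system that Mew's buffers use.
        (coding-system-for-write 'ctext-unix))
    (unwind-protect
        (progn
          ;; START is a string here, so END (nil) is ignored.
          (write-region "" nil file nil 'no-msg)
          ;; Element 7 of the attribute list is the size in bytes.
          (nth 7 (file-attributes file)))
      ;; Clean up the temporary file whether or not the write succeeded.
      (delete-file file))))

(demo-truncate-with-ctext)
```

Running the same form with `debug-on-error' set to t is one way to see
where a conversion function fails if it does not handle the string
case.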
* (no subject)
From: Eli Zaretskii @ 2002-02-23 18:51 UTC
Cc: emacs-devel

Tak Ota on Sat, 23 Feb 2002 08:11:49 -0800 (PST))
Subject: Re: ctext-pre-write-conversion barfs
Reply-to: Eli Zaretskii <eliz@is.elta.co.il>
References: <20020222.225355.01365596.Takaaki.Ota@am.sony.com>
    <9743-Sat23Feb2002104842+0200-eliz@is.elta.co.il>
    <20020223.081149.60852929.Takaaki.Ota@am.sony.com>
--text follows this line--

> Date: Sat, 23 Feb 2002 08:11:49 -0800 (PST)
> From: Tak Ota <Takaaki.Ota@am.sony.com>
>
> BTW, I just now tried to save this buffer and noticed that
> ctext-pre-write-conversion was invoked.  It is called 3 times for
> each save-buffer.  Here is the output from describe-coding-system.
>
> Coding system for saving this buffer:
>   x -- ctext-unix

Could you please find out how come the buffer's encoding got set to
ctext-unix?  It's a very unusual coding system for buffers.
Compound-text is normally used for X selections only.

The reason I'm asking you to look into this is that the assumption
behind the code I wrote for the ctext extensions is that ctext is not
normally used for file I/O.  There are limitations of the pre-write
and post-read conversions that make the modified ctext coding system
inappropriate for reading and writing text to/from multibyte buffers.
If ctext is used for file I/O, I will have to revert the decision to
call the new encoding `ctext', and will find some other name.

So please look into this.  Thanks in advance.
* Re: (no subject)
From: Tak Ota @ 2002-02-23 23:11 UTC
Cc: eliz, emacs-devel

Yamamoto-san,

We find that there is a small conflict between a recent change made to
the development version of Emacs and the coding system used in the Mew
package.  Could you answer the following question?

What I can tell is that mew-mule3.el has this definition:

;; ctext for consistency --unibyte, -unix for XEmacs's ^M
(defvar mew-cs-m17n
  (if (mew-coding-system-p 'ctext-unix) 'ctext-unix 'ctext))

Therefore many buffers in Mew set ctext-unix as the default coding
system.  However, I have no idea how this decision was made.

-Tak

Sat, 23 Feb 2002 13:51:54 -0500: Eli Zaretskii <eliz@gnu.org> wrote:

> Tak Ota on Sat, 23 Feb 2002 08:11:49 -0800 (PST))
> Subject: Re: ctext-pre-write-conversion barfs
> Reply-to: Eli Zaretskii <eliz@is.elta.co.il>
> References: <20020222.225355.01365596.Takaaki.Ota@am.sony.com>
>     <9743-Sat23Feb2002104842+0200-eliz@is.elta.co.il>
>     <20020223.081149.60852929.Takaaki.Ota@am.sony.com>
> --text follows this line--
> > Date: Sat, 23 Feb 2002 08:11:49 -0800 (PST)
> > From: Tak Ota <Takaaki.Ota@am.sony.com>
> >
> > BTW, I just now tried to save this buffer and noticed that
> > ctext-pre-write-conversion was invoked.  It is called 3 times for
> > each save-buffer.  Here is the output from describe-coding-system.
> >
> > Coding system for saving this buffer:
> >   x -- ctext-unix
>
> Could you please find out how come the buffer's encoding got set to
> ctext-unix?  It's a very unusual coding system for buffers.
> Compound-text is normally used for X selections only.
>
> The reason I'm asking you to look into this is that the assumption
> behind the code I wrote for the ctext extensions is that ctext is not
> normally used for file I/O.  There are limitations of the pre-write
> and post-read conversions that make the modified ctext coding system
> inappropriate for reading and writing text to/from multibyte buffers.
> If ctext is used for file I/O, I will have to revert the decision to
> call the new encoding `ctext', and will find some other name.
>
> So please look into this.  Thanks in advance.
* Re: [mew-int 00737] Re: (no subject)
From: Kazu Yamamoto @ 2002-02-25 1:11 UTC

From: Tak Ota <Takaaki.Ota@am.sony.com>
Subject: [mew-int 00737] Re: (no subject)

> Therefore many buffers in Mew set ctext-unix as the default coding
> system.  However, I have no idea how this decision was made.

Yes, this is intentional.

Ctext is the only character set which can cover both the ISO-2022
related character sets and ISO-8859-1.

If we use only the US-ASCII and 8-bit portions of ctext, it is
identical to ISO-8859-1.  This is friendly to non-Mule Emacs, Emacs
--unibyte for example.

Of course, Emacs (--multibyte) can read ctext.

So, to the best of my knowledge, ctext is the only character set which
can survive in both the old Emacs world and the multilingual Emacs
world.

Since Mew uses the CR character as a separator in Summary mode (for
historical reasons: selective-display was used), ctext-unix is
necessary.  (CR MUST NOT be converted to LF.)

What I can say is as follows:

(1) Since selective-display is not used anymore, a separator other
    than CR can be used.  ctext will then be sufficient.

(2) Even if we continue to use CR, ctext-unix is necessary only for
    Summary mode.  ctext is enough for other buffers.

--Kazu
* Re: [mew-int 00737] Re: (no subject)
From: Eli Zaretskii @ 2002-02-25 6:55 UTC
Cc: mew-int, emacs-devel

On Mon, 25 Feb 2002, Kazu Yamamoto wrote:

> From: Tak Ota <Takaaki.Ota@am.sony.com>
> Subject: [mew-int 00737] Re: (no subject)
>
> > Therefore many buffers in Mew set ctext-unix as the default coding
> > system.  However, I have no idea how this decision was made.
>
> Yes, this is intentional.
>
> Ctext is the only character set which can cover both the ISO-2022
> related character sets and ISO-8859-1.
>
> If we use only the US-ASCII and 8-bit portions of ctext, it is
> identical to ISO-8859-1.  This is friendly to non-Mule Emacs, Emacs
> --unibyte for example.

Handa-san, it sounds like the new encoding with ICCCM Extended Segments
support should not be called ctext, because ctext is used in file I/O,
at least by Mew.

It sounds like we should leave ctext as it was working before,
including the fact that it didn't support the ICCCM Extended Segments,
and use another name (e.g., compound-text-with-extensions) for the new
coding system.  We will then have to make that new coding system the
default for X selections in CVS head.

Do you agree?

> So, to the best of my knowledge, ctext is the only character set
> which can survive in both the old Emacs world and the multilingual
> Emacs world.

It was IMHO an unfortunate decision to use ctext for file I/O, since
ctext must support the ICCCM spec, which is inappropriate for encoding
anything but X selections.  However, given that Mew has used it for
quite some time, Emacs shouldn't break it, I think.

> (1) Since selective-display is not used anymore, a separator other
>     than CR can be used.  ctext will then be sufficient.
>
> (2) Even if we continue to use CR, ctext-unix is necessary only for
>     Summary mode.  ctext is enough for other buffers.

The -unix part is not the problem.  The problem is that ctext was
changed in Emacs CVS to support Extended Segments, in accordance with
the ICCCM spec (because some versions of X are using that ICCCM feature
to encode ISO8859-15 characters in selections).  That change caused
ctext to call a pre-write-conversion function which couldn't handle
being called on a string.

While I can (and most probably shall) fix the new coding system so that
it can be called from write-region, there are other subtle aspects of
the new coding system that make it inappropriate for file I/O.  So I
think we had better leave the original ctext alone.
* Re: ctext-pre-write-conversion barfs
From: Eli Zaretskii @ 2002-02-26 16:51 UTC
Cc: emacs-devel

> Date: Sat, 23 Feb 2002 08:11:49 -0800 (PST)
> From: Tak Ota <Takaaki.Ota@am.sony.com>
>
> > Do you have any real-life example of using compound-text in a way that
> > causes it to be called from write-region?  Note that compound-text is
> > generally inappropriate for use in file I/O, as its string says (it
> > can't DTRT with multibyte text).
>
> I don't know the exact mechanism why ctext-pre-write-conversion was
> summoned.  But it was where the debug-on-error brought me to, while
> using a mail package 'Mew' (3.0.54).  Following is the last function
> issued in Mew (mew-mark.el) where write-region was called with a
> string for the argument START.

This should be fixed now in CVS head.  Thanks for reporting this
problem.