From: Kenichi Handa
Newsgroups: gmane.emacs.devel
Subject: unibyte<->multibyte conversion [Re: Emacs-diffs Digest, Vol 2, Issue 28]
Date: Mon, 20 Jan 2003 11:29:51 +0900 (JST)
Message-ID: <200301200229.LAA16287@etlken.m17n.org>
References: <3405-Sat18Jan2003154003+0200-eliz@is.elta.co.il>
In-reply-to: <3405-Sat18Jan2003154003+0200-eliz@is.elta.co.il>
Original-To: eliz@is.elta.co.il
Original-cc: rms@gnu.org
Cc: emacs-devel@gnu.org

In article <3405-Sat18Jan2003154003+0200-eliz@is.elta.co.il>, "Eli Zaretskii" writes:

>> If you want multibyte strings "without decoding", would emacs-mule
>> give you that?

> I don't think so. emacs-mule is for reading text that is already in
> the internal Emacs representation, like auto-save files.

Yes.

> AFAIK, raw-text does decode the text in the sense that
> 8-bit characters which have their 8th bit set are decoded
> into the eight-bit-* charsets.

Yes, but that happens only when you read a file into a multibyte buffer with raw-text. This conversion from a raw byte sequence to multibyte form is what is done by string-to-multibyte, which I described in the previous mail.

When reading from a process, if raw-text is used, the process output is first read as a unibyte string, the string is converted to multibyte by string-as-multibyte (not by the not-yet-existing string-to-multibyte), and then it is inserted into a multibyte buffer.

I don't remember why the current code works that way. I think the behaviour Eli described is more consistent with the behaviour of file reading.

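A minimal sketch of the AS/TO difference described above. It assumes a later Emacs in which string-to-multibyte has actually been added (it did not exist when this message was written), and the sample byte string is only an illustration. The result of the AS conversion depends on the internal representation used by the running Emacs, so no particular output is claimed for it.

```elisp
;; Sketch only: assumes an Emacs that provides `string-to-multibyte'.
(let* ((raw "\303\251")                  ; unibyte string holding two raw bytes
       (to  (string-to-multibyte raw)))  ; TO: each byte >= #x80 becomes one eight-bit character
  (message "unibyte input: multibyte=%s length=%d"
           (multibyte-string-p raw) (length raw))
  (message "TO conversion: multibyte=%s length=%d"
           (multibyte-string-p to) (length to))
  ;; AS: reinterpret the same bytes as if they were already in the
  ;; internal multibyte form.  Guarded because the function may be
  ;; obsolete or absent in the Emacs at hand.
  (when (fboundp 'string-as-multibyte)
    (let ((as (string-as-multibyte raw)))
      (message "AS conversion: multibyte=%s length=%d"
               (multibyte-string-p as) (length as))))
  ;; The TO conversion can be undone: converting back yields the
  ;; original bytes.
  (equal raw (string-to-unibyte to)))
```

The TO conversion never merges bytes into one character, which is why it matches what file reading with raw-text does; the AS conversion may merge them when the bytes happen to form a valid internal sequence.
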
Shall I change the code to do what Eli described (by introducing the new function string-to-multibyte)?

By the way, it may be cleaner to have all these functions in parallel, and to devote one section of the info manual to describing the differences among the MAKE, AS, and TO conversions.

string-make-multibyte
string-as-multibyte
string-to-multibyte

string-make-unibyte
string-as-unibyte
string-to-unibyte (perhaps the same as string-as-unibyte, or it should signal an error if the string contains a character that is neither ASCII nor eight-bit-XXX)

buffer-make-multibyte
buffer-as-multibyte (same as (set-buffer-multibyte BUFFER t))
buffer-to-multibyte

buffer-make-unibyte
buffer-as-unibyte (same as (set-buffer-multibyte BUFFER nil))
buffer-to-unibyte

---
Ken'ichi HANDA
handa@m17n.org