From: Kenichi Handa
Newsgroups: gmane.emacs.devel
Subject: Re: eight-bit char handling in emacs-unicode
Date: Thu, 27 Nov 2003 10:34:45 +0900 (JST)
Message-ID: <200311270134.KAA28664@etlken.m17n.org>
References: <200311250107.KAA24646@etlken.m17n.org> <200311260007.JAA26617@etlken.m17n.org>
To: monnier@IRO.UMontreal.CA
Cc: jas@extundo.com, emacs-devel@gnu.org
In-reply-to: (message from Stefan Monnier on 26 Nov 2003 09:14:03 -0500)

In article , Stefan Monnier writes:

>>> but since the output is a unibyte string,
>>> that restrict it to cases where the code-points can be encoded in 8 bits,
>>> thus it doesn't sound very generic

>> Yes. But I thought generic or not is not a point here.

> Except that if it's not generic (in the sense that it does not behave
> meaningfully in all language environments), then it can't be used in generic
> elisp code, right?

Yes. But, it simply means that insertion of multibyte string
in a unibyte buffer can't be generic.

>> My examples shows that we can't use encode-coding-string.
>> How can we use encode-coding-string without knowing what
>> coding system to use? I haven't heard your answer yet.

> I can't answer this question without knowing the answer to my question:
> what is string-make-unibyte used for.

It is used for converting a multibyte string to unibyte
before it is inserted in a unibyte buffer.

> I'm not saying that we can do something like:
> (defun string-make-unibyte (s) (encode-coding-string s ))

??? I thought you were saying exactly that, because you
wrote below:

> To do what your string-make-unibyte does you should use
> `encode-coding-string' where the coding system is passed explicitly.

Anyway,

> but I'm saying that everywhere where the current string-make-unibyte is
> used, we should be able to easily replace it by a call to
> encode-coding-string or a code to my make-string-unibyte (which does
> not pay attention to the language environment and only accepts multibyte
> chars that correspond to bytes, i.e. eight-bit-control or
> eight-bit-graphic, or ASCII, and multibyte chars whose internal code point
> is 128-255).

That statement is ambiguous. Which are you saying? Replace
string-make-unibyte by:

(1) encode-coding-string or make-string-unibyte.

(2) code that applies encode-coding-string or
    make-string-unibyte to the whole string depending on
    something (perhaps on the input string?).

(3) code that applies encode-coding-string to substrings
    where that is appropriate, and applies
    make-string-unibyte to the remaining substrings.

(4) something that I still don't understand.

>>> I don't see any use of string-make-unibyte in your two examples.

>> Again, I'd like to ask how to use encode-coding-string
>> without knowing the proper coding-system in each case.

> How could I know the coding-system to use when replacing
> `string-make-unibyte' if I don't have any actual call to
> string-make-unibyte to work with ?

What strange logic?!? You have been arguing that we should
replace string-make-unibyte with something that uses
encode-coding-string. Then you should have an idea about
what coding-system to use for encode-coding-string.

---
Ken'ichi HANDA
handa@m17n.org
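
For readers following the thread, the two kinds of conversion being contrasted above can be sketched roughly as below. This is only an illustration under stated assumptions, not code from either participant: the helper name `my-string-to-bytes` is hypothetical, and the strict branch uses `string-to-unibyte` as found in present-day Emacs, which accepts only ASCII and eight-bit (raw byte) characters. That is close to, but not identical with, the make-string-unibyte described in the quoted text, which would also accept multibyte chars whose internal code point is 128-255.

```elisp
;; Hypothetical helper (not part of Emacs), roughly in the spirit of
;; option (2): choose one conversion for the whole string.
(defun my-string-to-bytes (string &optional coding-system)
  "Return STRING as a unibyte string.
If CODING-SYSTEM is non-nil, encode STRING with it explicitly.
Otherwise convert strictly: only ASCII and raw-byte characters are
accepted, and the language environment is never consulted."
  (if coding-system
      ;; The caller states the coding system; nothing is guessed.
      (encode-coding-string string coding-system)
    ;; Signals an error for any other non-ASCII character.
    (string-to-unibyte string)))

;; With an explicit coding system, any encodable text is accepted.
(my-string-to-bytes "\u00e9" 'utf-8)

;; Without one, the same string is rejected instead of being converted
;; according to the current language environment.
(condition-case err
    (my-string-to-bytes "\u00e9")
  (error (message "not convertible: %S" err)))
```

The open question in the thread is exactly the choice that this sketch pushes onto the caller: which coding system to pass, or whether the strict conversion is acceptable, at each place where string-make-unibyte is currently used.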