From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: eight-bit char handling in emacs-unicode
Date: 26 Nov 2003 09:14:03 -0500
References: <200311250107.KAA24646@etlken.m17n.org> <200311260007.JAA26617@etlken.m17n.org>
In-Reply-To: <200311260007.JAA26617@etlken.m17n.org>
Original-To: Kenichi Handa
Cc: jas@extundo.com, emacs-devel@gnu.org

>> but since the output is a unibyte string,
>> that restrict it to cases where the code-points can be encoded in 8 bits,
>> thus it doesn't sound very generic
> Yes.  But I thought generic or not is not a point here.

Except that if it's not generic (in the sense that it does not behave
meaningfully in all language environments), then it can't be used in generic
elisp code, right?

>> and I don't see any application for it
>> (nor do I see any practical difference with using encode-coding-string
>> since the output AFAIK would be the same).
> My examples shows that we can't use encode-coding-string.
> How can we use encode-coding-string without knowing what
> coding system to use?  I haven't heard your answer yet.

I can't answer this question without knowing the answer to my question:
what is string-make-unibyte used for.

I'm not saying that we can do something like:

   (defun string-make-unibyte (s)
     (encode-coding-string s ))

but I'm saying that everywhere the current string-make-unibyte is used, we
should be able to easily replace it with a call to encode-coding-string or
a call to my make-string-unibyte (which does not pay attention to the
language environment and only accepts multibyte chars that correspond to
bytes, i.e. eight-bit-control or eight-bit-graphic, or ASCII, and multibyte
chars whose internal code point is 128-255).

> But, my understanding is that
> string-make-unibyte/multibyte are designed not to change the
> number of characters to make the difference of
> unibyte/multibyte transparent in Lisp.

That is indeed an absolute requirement.

>> Of course: that's pretty much what I suggested: make-string-unibyte only
>> accepts multibyte chars that correspond to "bytes".
> I agree with that.  But, it just changes the behaviour of
> the function on error case.  It doesn't change the concept
> of what it does.

Except that I said "byte" not "code point", which makes a difference in
non-latin-1 locales.

>> I don't see any use of string-make-unibyte in your two examples.
> Again, I'd like to ask how to use encode-coding-string
> without knowing the proper coding-system in each case.

How could I know the coding-system to use when replacing
`string-make-unibyte' if I don't have any actual call to
string-make-unibyte to work with?


        Stefan
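
For illustration only, here is a rough sketch of the kind of function
described above: one byte out per character in, no look at the language
environment, and an error for any character that is not a byte.  The name
my-make-string-unibyte is hypothetical, and the sketch assumes a recent
Emacs in which multibyte-char-to-unibyte returns -1 for a character that
does not represent a byte and unibyte-string builds a unibyte string from
byte values.  It is not the code discussed in the thread.

```elisp
;; Sketch only: `my-make-string-unibyte' is a hypothetical name,
;; not an existing Emacs function.
(defun my-make-string-unibyte (s)
  "Return a unibyte string with exactly one byte per character of S.
Signal an error if a character of S does not correspond to a byte.
The language environment is never consulted."
  (apply #'unibyte-string
         (mapcar (lambda (ch)
                   ;; Assumption: in recent Emacs this returns CH itself
                   ;; for values below 256, the byte value for a raw
                   ;; eight-bit character, and -1 for anything else.
                   (let ((byte (multibyte-char-to-unibyte ch)))
                     (when (< byte 0)
                       (error "Character %d does not correspond to a byte"
                              ch))
                     byte))
                 s)))
```

Because each character maps to exactly one byte or signals an error, the
length of the result always equals the length of the argument, which is the
"number of characters" requirement mentioned above.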