From: Harald Hanche-Olsen
Newsgroups: gmane.emacs.devel
Subject: Re: (aset UNIBYTE-STRING MULTIBYTE-CHAR)
Date: Wed, 14 May 2008 08:54:38 +0200 (CEST)
Message-ID: <20080514.085438.56819933.hanche@math.ntnu.no>
References: <20080507.213112.10351177.hanche@math.ntnu.no>
To: emacs-devel@gnu.org
In-Reply-To: <20080507.213112.10351177.hanche@math.ntnu.no>
X-Mailer: Mew version 5.2.51 on Emacs 23.0.0 / Mule 6.0 (HANACHIRUSATO)

My message on this topic of a week ago elicited no responses, so I did
a little more research on my own (which I should have done in the
first place, maybe). This time I hope to see some discussion:

+ Harald Hanche-Olsen:

> This works as it should in the latest CVS:
>
>   (setq foo (make-string 4 ?a))
>   (aset foo 1 ?€) ; <= that's a euro sign
>
> But this fails:
>
>   (setq foo (make-string 4 ?a))
>   (aset foo 1 ?å)
>   (aset foo 1 ?€) ; => Error: args out of range

I went back in the mail archives and read the whole thread (it was in
February and April this year), and I realize that the whole idea of
changing a unibyte string into a multibyte one on the fly in order to
support aset on them is somewhat controversial.

Be that as it may, the above example shows that the fix put in by
Kenichi Handa does not fix it right. Moreover, it is clear from the
commit message that he was well aware of this limitation at the time:

  Working file: data.c
  revision 1.291
  date: 2008-04-17 03:10:58 +0200;  author: handa;  state: Exp;
  lines: +11 -1;  commitid: yW6gyKxwbZ4EPoZs;
  (Faset): Allow setting a multibyte character in an ASCII-only
  unibyte string.
It seems to me that in order to get it right, one has to reallocate
the data in the case of a non-ASCII-only unibyte string, using code
like what is already there for the case when aset replaces an ASCII
character with a non-ASCII one (which will increase the byte count of
the string). The end result will be ugly and inefficient, but I see
no other way if we are going to lay this one to rest.

Comments?

- Harald
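
To make the proposed reallocation concrete, here is a rough,
self-contained sketch in C. It is not the code in data.c and does not
use Emacs's Lisp_String or its macros: the struct and every function
name below are hypothetical. It also assumes, purely for illustration,
that the multibyte form is plain UTF-8 and that bytes >= 0x80 in a
unibyte string stand for Latin-1 characters. Under those assumptions it
shows the two steps described above: re-encode the whole string when a
character that does not fit in one byte arrives, then reallocate again
if the replaced character changes the byte count.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical string type for this sketch; NOT Emacs's Lisp_String. */
struct sketch_string {
    unsigned char *data;
    size_t nchars;    /* length in characters */
    size_t nbytes;    /* length in bytes */
    int multibyte;    /* 0: one byte per character, 1: UTF-8 (assumed) */
};

/* Number of UTF-8 bytes needed for code point c. */
static size_t utf8_len(unsigned c)
{
    return c < 0x80 ? 1 : c < 0x800 ? 2 : c < 0x10000 ? 3 : 4;
}

/* Length of the UTF-8 sequence that starts with lead byte b. */
static size_t utf8_seq_len(unsigned char b)
{
    return b < 0x80 ? 1 : b < 0xE0 ? 2 : b < 0xF0 ? 3 : 4;
}

/* Write code point c at p as UTF-8; return the number of bytes written. */
static size_t utf8_put(unsigned char *p, unsigned c)
{
    size_t n = utf8_len(c);
    if (n == 1) {
        p[0] = (unsigned char)c;
    } else if (n == 2) {
        p[0] = (unsigned char)(0xC0 | (c >> 6));
        p[1] = (unsigned char)(0x80 | (c & 0x3F));
    } else if (n == 3) {
        p[0] = (unsigned char)(0xE0 | (c >> 12));
        p[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        p[2] = (unsigned char)(0x80 | (c & 0x3F));
    } else {
        p[0] = (unsigned char)(0xF0 | (c >> 18));
        p[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
        p[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        p[3] = (unsigned char)(0x80 | (c & 0x3F));
    }
    return n;
}

/* Rough analogue of (make-string n fill) for a one-byte fill character. */
static int sketch_make(struct sketch_string *s, size_t n, unsigned char fill)
{
    s->data = malloc(n + 1);
    if (!s->data) return -1;
    memset(s->data, fill, n);
    s->data[n] = 0;
    s->nchars = s->nbytes = n;
    s->multibyte = 0;
    return 0;
}

/* Re-encode a unibyte string as multibyte.
   ASSUMPTION: bytes >= 0x80 are treated as Latin-1 code points. */
static int to_multibyte(struct sketch_string *s)
{
    size_t need = 0, out = 0, i;
    unsigned char *buf;
    for (i = 0; i < s->nbytes; i++) need += utf8_len(s->data[i]);
    buf = malloc(need + 1);
    if (!buf) return -1;
    for (i = 0; i < s->nbytes; i++) out += utf8_put(buf + out, s->data[i]);
    buf[out] = 0;
    free(s->data);
    s->data = buf;
    s->nbytes = need;
    s->multibyte = 1;
    return 0;
}

/* Set character idx to code point c; return 0 on success, -1 on error. */
static int sketch_aset(struct sketch_string *s, size_t idx, unsigned c)
{
    size_t off = 0, oldlen, newlen, i;
    if (idx >= s->nchars || c > 0x10FFFF) return -1;
    if (!s->multibyte) {
        if (c < 0x100) { s->data[idx] = (unsigned char)c; return 0; }
        /* c does not fit in one byte: convert the whole string first. */
        if (to_multibyte(s) != 0) return -1;
    }
    for (i = 0; i < idx; i++) off += utf8_seq_len(s->data[off]);
    oldlen = utf8_seq_len(s->data[off]);
    newlen = utf8_len(c);
    if (newlen != oldlen) {
        /* Byte count changes, so reallocate and move the tail. */
        size_t tail = s->nbytes - off - oldlen;
        unsigned char *buf = malloc(s->nbytes - oldlen + newlen + 1);
        if (!buf) return -1;
        memcpy(buf, s->data, off);
        memcpy(buf + off + newlen, s->data + off + oldlen, tail);
        free(s->data);
        s->data = buf;
        s->nbytes = s->nbytes - oldlen + newlen;
        s->data[s->nbytes] = 0;
    }
    utf8_put(s->data + off, c);
    return 0;
}
```

With this sketch, the failing sequence from the quoted example
(sketch_make with 4 copies of 'a', then sketch_aset at index 1 with
0xE5, then with 0x20AC) succeeds. The cost is the one the post
anticipates: a full pass over the string plus up to two reallocations
for a single aset.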