From: Harald Hanche-Olsen
Newsgroups: gmane.emacs.devel
Subject: Re: (aset UNIBYTE-STRING MULTIBYTE-CHAR)
Date: Wed, 14 May 2008 14:50:43 +0200 (CEST)
Message-ID: <20080514.145043.228449419.hanche@math.ntnu.no>
References: <20080507.213112.10351177.hanche@math.ntnu.no> <20080514.085438.56819933.hanche@math.ntnu.no>
To: monnier@IRO.UMontreal.CA
Cc: emacs-devel@gnu.org
X-Mailer: Mew version 6.0.51 on Emacs 23.0.60 / Mule 6.0 (HANACHIRUSATO)

+ Stefan Monnier:

> > + Harald Hanche-Olsen:
>
> >> This works as it should in the latest CVS:
> >>
> >>   (setq foo (make-string 4 ?a))
> >>   (aset foo 1 ?€) ; <= that's a euro sign
> >>
> >> But this fails:
> >>
> >>   (setq foo (make-string 4 ?a))
> >>   (aset foo 1 ?å)
> >>   (aset foo 1 ?€) ; => Error: args out of range
>
> Show us the real code that bumped into the problem and I'll tell you
> how to do it so as to avoid the risk of such problems.

You'd have to tell the author of mew (http://mew.org/), Kazu Yamamoto.
Actually, I have a one-line patch to mew that fixes the problem, but
he seems unwilling to apply it.

Now don't get me wrong: I am not asking for a change in emacs to fix a
problem in mew. I am suggesting a change in emacs for the sake of
robustness: I think that if the problem of inserting multibyte
characters in unibyte strings is worth fixing at all, it is worth
fixing so that it works in all cases. Otherwise, why bother?

I do understand the arguments against fixing it, but the current
situation, where it often works but sometimes fails, does not seem
good to me.
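To make the two cases concrete, here is a small sketch that checks the string's representation with `multibyte-string-p' at each step. It assumes, as the examples quoted above suggest, that `make-string' with an ASCII character returns a unibyte string; the variable names `foo' and `bar' are only for illustration.

```elisp
;; Case 1: a pure-ASCII unibyte string.  Storing a euro sign succeeds,
;; and the string has to be multibyte afterwards to hold it.
(setq foo (make-string 4 ?a))
(message "fresh string multibyte? %S" (multibyte-string-p foo))
(aset foo 1 ?€)
(message "after euro, multibyte? %S" (multibyte-string-p foo))

;; Case 2: ?å fits in a single byte, so it is stored without changing
;; the representation.  The string then contains a non-ASCII byte, and
;; the later attempt to store a euro sign is the one reported to fail.
(setq bar (make-string 4 ?a))
(aset bar 1 ?å)
(message "after a-ring, multibyte? %S" (multibyte-string-p bar))
(condition-case err
    (aset bar 1 ?€)
  (args-out-of-range
   (message "aset failed: %S" err)))
```

The `condition-case' wrapper is there only so that the snippet can be evaluated as a whole without stopping at the error.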
But at least it's documented, I see that now:

  4.4 Modifying Strings
  =====================

     The most basic way to alter the contents of an existing string is with
  `aset' (*note Array Functions::).  `(aset STRING IDX CHAR)' stores CHAR
  into STRING at index IDX.  Each character occupies one or more bytes,
  and if CHAR needs a different number of bytes from the character
  already present at that index, `aset' signals an error.

That last bit actually seems to be outdated: an error is not ALWAYS
signaled in the indicated situation, only sometimes.

Anyway, the code you're asking for (in case you're really curious):

In mew-header.el

(defun mew-addrstr-parse-syntax-list (str sep addrp &optional depth allow-spc)
  (when str
    (let* ((i 0) (len (length str))
           (par-cnt 0) (tmp-cnt 0) (sep-cnt 0)
           (tmp (mew-make-string len))
           c ret prevc)
      (catch 'max
        (while (< i len)
          (setq c (aref str i))    ; <= problem occurs here
          ... deleted ...)))))

My one-line fix consists of changing the definition (elsewhere)

(defun mew-make-string (len)
  (make-string len ?a))

into one that makes a multibyte string at the outset.

(I like mew (a lot), so I am willing to put up with its various
idiosyncrasies (and there are some).)

- Harald
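For illustration only, a definition that makes a multibyte string at the outset might look like the sketch below. This is not the actual patch to mew: the function name `my-make-multibyte-string' is hypothetical, and `string-to-multibyte' is just one way of getting a multibyte string.

```elisp
;; Hypothetical stand-in for mew-make-string; NOT the actual mew patch.
(defun my-make-multibyte-string (len)
  "Return a multibyte string of LEN copies of the character ?a."
  ;; make-string with an ASCII character yields a unibyte string;
  ;; string-to-multibyte converts it to the multibyte representation.
  (string-to-multibyte (make-string len ?a)))

;; With a multibyte string, the sequence that failed before should go
;; through, because aset no longer has to convert a unibyte string.
(let ((s (my-make-multibyte-string 4)))
  (aset s 1 ?å)
  (aset s 1 ?€)
  (message "result: %s (multibyte: %S)" s (multibyte-string-p s)))
```

The point of converting up front is that the result no longer depends on which characters happen to have been stored earlier.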