From mboxrd@z Thu Jan 1 00:00:00 1970
From: ludo@gnu.org (Ludovic Courtès)
Newsgroups: gmane.lisp.guile.devel
Subject: Re: Wide strings
Date: Mon, 26 Jan 2009 21:24:13 +0100
Message-ID: <871vupolxu.fsf@gnu.org>
References: <470889.75847.qm@web37904.mail.mud.yahoo.com> <87wscjvwyq.fsf@gnu.org> <49dd78620901251532s77264727mc6a8c40456fcf561@mail.gmail.com>
To: guile-devel@gnu.org

Hello!

Neil Jerram writes:

> But what about the other possible debate, about the API?  Are you
> thinking that we should accept R6RS's choice?

No, I think we have SRFI-1[34] to start with, both of which are well
defined in the context of Unicode.

> (I really haven't read up on all this enough - however when reading
> Tom Lord's analysis just now, I was thinking "why not just specify
> that things like char-upcase don't work in the difficult cases", and
> it seems to me that this is what R6RS chose to do.  So at first glance
> the R6RS API looks OK to me.

Regarding `ß' (German eszet), which is one of the "difficult cases"
mentioned by Tom Lord, SRFI-13 reads:

  Some characters case-map to more than one character.  For example,
  the Latin-1 German eszet character upper-cases to "SS."

    * This means that the R5RS function char-upcase is not
      well-defined, since it is defined to produce a (single)
      character result.

    * It means that an in-place string-upcase! procedure cannot be
      reliably defined, since the original string may not be long
      enough to contain the result -- an N-character string might
      upcase to a 2N-character result.

    * It means that case-insensitive string-matching or searching is
      quite tricky.  For example, an n-character string s might match
      a 2N-character string s'.

And then:

  SRFI 13 makes no attempt to deal with these issues; it uses a simple
  1-1 locale- and context-independent case-mapping

I think it's reasonable to stick to this approach at first, at least.
Locale-dependent case folding is part of `(ice-9 i18n)' anyway.

Thanks,
Ludo'.
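
The SRFI-13 points quoted above can be illustrated with a minimal sketch.
It is written in Python rather than Guile, only because Python's built-in
str.upper() applies the full (possibly one-to-many) Unicode mapping.  The
helper upcase_1to1 is a hypothetical name introduced here; it merely
approximates a 1-1 mapping by leaving alone any character whose upper-case
form is longer than one character.  It is not SRFI-13's implementation,
nor Unicode's simple case-mapping table.

```python
def upcase_1to1(s: str) -> str:
    """Upcase character by character so the length never changes.

    Hypothetical helper: an approximation of a 1-1 case mapping.
    """
    out = []
    for ch in s:
        up = ch.upper()
        # Keep the original character when its upper-case form would
        # expand to more than one character (e.g. the German eszet).
        out.append(up if len(up) == 1 else ch)
    return "".join(out)


word = "straße"

# Full mapping: the result is longer than the input, so an in-place
# string-upcase! could not store it in the original string.
full = word.upper()

# 1-1 mapping: same length as the input, eszet left untouched.
simple = upcase_1to1(word)

# Case-insensitive comparison via case folding: strings of different
# lengths can still match.
same = word.casefold() == "STRASSE".casefold()

print(full, len(full))
print(simple, len(simple))
print(same)
```

With the 1-1 approach the length is preserved at the cost of leaving `ß'
as it is, which is the trade-off the SRFI-13 text describes.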