From mboxrd@z Thu Jan 1 00:00:00 1970
From: ludo@gnu.org (Ludovic Courtès)
Newsgroups: gmane.lisp.guile.devel
Subject: Re: fencepost error in encoding processing
Date: Mon, 16 Nov 2009 22:51:48 +0100
Message-ID: <87r5rygmtn.fsf@gnu.org>
References: <30581C9F-01D3-4B5F-B413-EF46E1A3D365@raeburn.org> <87k4xqoc3k.fsf@gnu.org> <06D624B4-D409-4FC3-9EF5-12E90DBE37D0@raeburn.org>
In-Reply-To: <06D624B4-D409-4FC3-9EF5-12E90DBE37D0@raeburn.org> (Ken Raeburn's message of "Mon, 16 Nov 2009 12:25:17 -0500")
To: Ken Raeburn
Cc: guile-devel@gnu.org
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8

Hi Ken,

Ken Raeburn writes:

> On Nov 16, 2009, at 08:03, Ludovic Courtès wrote:
>> As far as encoding names are concerned, Bruno Haible pointed me to
>> http://www.iana.org/assignments/character-sets and I added a link to it
>> in the manual a couple of days ago.
>
> Between your link and Mike's, it looks to me like we should add
> several more characters.

Yes.

> Since we're scanning an Emacs-style coding specification, as long as
> whitespace and semicolon aren't on the list, I think we can be
> expansive, so let's go ahead and include all of ":,+=/()" to the
> allowed set.  The results will still be constrained by whatever the OS
> supports; we just don't want Guile to impose additional constraints.

Agreed.  Note that we follow whatever libunistring implements, which
happens to be IANA AIUI (though it’s case-insensitive.)

> * libguile/read.c (scm_i_scan_for_encoding): Allow more punctuation
>   symbols in coding system names.
>
> diff --git a/libguile/read.c b/libguile/read.c
> index 775612a..657e101 100644
> --- a/libguile/read.c
> +++ b/libguile/read.c
> @@ -1506,8 +1506,7 @@ scm_i_scan_for_encoding (SCM port)
>    i = 0;
>    while (pos + i - header <= SCM_ENCODING_SEARCH_SIZE
>           && pos + i - header < bytes_read
> -         && (isalnum((int) pos[i]) || pos[i] == '_' || pos[i] == '-'
> -             || pos[i] == '.'))
> +         && (isalnum((int) pos[i]) || strchr("_-.:/,+=()", pos[i]) != NULL))

Sounds good to me, except for the missing whitespace before ‘(’.  ;-)

Ludo’.
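
For illustration only, the relaxed check from the patch could be factored into a small predicate, roughly as sketched below. The names `encoding_name_char_p` and `encoding_name_length` are hypothetical and do not exist in libguile. The sketch also assumes two precautions that are not in the patch: `strchr` matches the terminating NUL of its first argument, so a `'\0'` byte is rejected explicitly, and the argument to `isalnum` is cast to `unsigned char`, since passing a negative `char` value is undefined.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of libguile): return nonzero if C may
   appear in a coding system name under the relaxed rule, i.e. it is
   alphanumeric or one of "_-.:/,+=()".  */
int
encoding_name_char_p (char c)
{
  /* strchr would match the string's terminating NUL, so reject it.  */
  if (c == '\0')
    return 0;

  /* isalnum needs a value representable as unsigned char (or EOF).  */
  return isalnum ((unsigned char) c)
         || strchr ("_-.:/,+=()", c) != NULL;
}

/* Hypothetical helper: length of the leading run of coding-system-name
   characters in BUF, looking at no more than LEN bytes.  */
size_t
encoding_name_length (const char *buf, size_t len)
{
  size_t i = 0;

  assert (buf != NULL);
  while (i < len && encoding_name_char_p (buf[i]))
    i++;
  return i;
}
```

With these definitions, `encoding_name_length ("utf-8 -*-", 9)` stops at the space after `utf-8`, and a name such as `ISO_8859-1:1987` is accepted in full because `:` is now in the allowed set, while whitespace and semicolon still end the scan.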