From: Mark H Weaver
Newsgroups: gmane.lisp.guile.devel
Subject: Re: Needed: per-port reader options
Date: Thu, 18 Oct 2012 13:48:15 -0400
Message-ID: <87pq4fpt7k.fsf@tines.lan>
To: ludo@gnu.org (Ludovic Courtès)
Cc: guile-devel@gnu.org

ludo@gnu.org (Ludovic Courtès) writes:

> Hi Mark!
>
> Mark H Weaver skribis:
>
>> ludo@gnu.org (Ludovic Courtès) writes:
>
> [...]
>
>>> Of course.  I just meant that, if you can call ‘make-reader’ with
>>> whatever options you’re interested in, instead of globally changing the
>>> reader’s option as is currently the case, then you’re halfway through.
>>>
>>> And in that case, the reader doesn’t need to be associated with the
>>> port.  Instead, since ‘primitive-load’ honors ‘current-reader’, it just
>>> needs to be set when loading a file.  This way, any changes to the
>>> reader’s options will be local to that file.
>>
>> I see a few problems with that.
>>
>> First of all, since the reader directives can be placed anywhere that
>> comments are permitted, the read options must be changed while the
>> reader is in the middle of reading a single datum,
>
> Yes, but the reader can modify its own options data structure.

True, but it cannot arrange for subsequent calls to 'read' on that port
to use the new options data structure without stashing those changed
options somewhere.

Also, modifying its own options data structure cannot work for a reader
directive like #!curly-infix unless curly-infix is supported by the
default reader.

>> In other words, if a program uses 'read' on a data file, the
>> reader directives '#!fold-case' et al should affect all future calls to
>> 'read' on that file.
>
> Just on that file, or on any file subsequently loaded?

Just on that file.  That's the only sane thing.  Think about it.
Suppose you're reading multiple files in an interleaved fashion (perhaps
via a lazy stream that reads the files on demand), and a #!fold-case in
one file changed the way the other files were read.  That would be
totally broken, don't you agree?

FWIW, I tested the behavior of Chibi Scheme, and it does the right
thing, exactly as I have described and implemented.

>> Fluids cannot solve this problem, because the program might be
>> performing interleaved reads of multiple files within the same thread.
>
> SRFI-105 reads:
>
>   An implementation of this SRFI MUST accept the marker #!curly-infix
>   followed by a whitespace character in its standard datum readers [...]
>
>   After reading this marker, the reader MUST accept curly-infix
>   expressions in subsequent datums until it reaches an end-of-file [...]
>
> To me, this sounds like global reader options (reset on EOF), not like
> per-port options.

Really?  "until it reaches an end-of-file" sounds like per-port to me.

Not convinced?  Imagine the same thought experiment I proposed above,
with one thread reading multiple files in an interleaved way via lazy
streams.
With your proposal, not only does the appearance of #!fold-case
mysteriously change the way the other files are read at some random
point depending on the interleaving, but now, when the original file's
EOF is found, the other files mysteriously change back to
case-sensitive mode!

Can you find _any_ Scheme implementation that handles reader directives
that way?  If so, please let me know, because I have a strongly worded
bug report to file :)

>>> Concretely, this would mean changing read.c such that each token reader
>>> takes the reader options as an additional first parameter.  Instead of
>>> looking up the global ‘scm_read_opts’, they would look at this explicit
>>> parameter.
>>
>> This is almost exactly what my patch does.  I added an explicit
>> parameter of type 'scm_t_read_opts *' to most of the helper functions in
>> read.c, and that parameter is consulted instead of the global options.
>
> I like it.

Excellent!

>> When reader directives such as '#!fold-case' are encountered, both the
>> 'scm_t_read_opts' struct and the per-port options are mutated.
>>
>> 'scm_read' initializes a local 'scm_t_read_opts' struct based on both
>> the global read options and the per-port overrides (if any), and a
>> pointer to that struct is passed down to all the helper functions in
>> read.c that need it.
>>
>> What do you think?
>
> The patch you posted (“Implement per-port reader options, #!fold-case
> and #!no-fold-case.”) does all three things at once: (1) explicit
> instead of global reader options, (2) per-port reader options, and (3)
> fold-case.
>
> Do you think you could split it into 3 patches?

Fair enough, I agree that this makes sense.  I'll work on it.

> I’m happy with (1) and (3).  I remain skeptical about (2), because of
> the mixture of concerns.

I don't think it's really a mixture of concerns.  The port just provides
an alist for anyone to use, without caring what is put there.
'read' needs to know about ports, but that's always been the case.

> Sorry for the extra work, but thank you for pushing these things!

No problem, thanks for discussing it :)

    Mark
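
For illustration only, the per-port behavior argued for above can be
sketched in C roughly as follows.  This is a toy model, not the patch
and not Guile's read.c: the names toy_read_opts, toy_port, toy_read,
copy_token and fold_case_override are all hypothetical, tokens are just
whitespace-delimited words, and the per-port override is a single
integer field rather than an alist.  It only shows the shape of the
idea: each call builds a local options struct from the global defaults
plus the port's overrides, helpers take that struct as an explicit
parameter, and a directive mutates both the local struct and the port.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an options struct; not Guile's real layout. */
typedef struct { bool fold_case; } toy_read_opts;

/* Toy port: an input cursor plus a per-port override.
   fold_case_override is -1 when unset (use the global default). */
typedef struct {
  const char *pos;
  int fold_case_override;
} toy_port;

/* Global defaults, consulted only when a read begins. */
static toy_read_opts toy_global_opts = { false };

/* Helper that consults the explicit options parameter, never the global. */
static void copy_token(const toy_read_opts *opts, const char *start,
                       size_t len, char *out)
{
  for (size_t i = 0; i < len; i++)
    out[i] = opts->fold_case ? (char) tolower((unsigned char) start[i])
                             : start[i];
  out[len] = '\0';
}

/* Read one whitespace-delimited token into out (capacity cap > 0).
   Returns false at end of input.  Long tokens are truncated. */
static bool toy_read(toy_port *port, char *out, size_t cap)
{
  assert(cap > 0);

  /* Local options: global defaults, then per-port overrides (if any). */
  toy_read_opts opts = toy_global_opts;
  if (port->fold_case_override >= 0)
    opts.fold_case = port->fold_case_override != 0;

  for (;;) {
    while (isspace((unsigned char) *port->pos))
      port->pos++;
    if (*port->pos == '\0')
      return false;

    const char *start = port->pos;
    while (*port->pos != '\0' && !isspace((unsigned char) *port->pos))
      port->pos++;
    size_t len = (size_t) (port->pos - start);

    if (len == 11 && strncmp(start, "#!fold-case", 11) == 0) {
      /* Directive: change this read and stash the change on the port. */
      opts.fold_case = true;
      port->fold_case_override = 1;
    } else if (len == 14 && strncmp(start, "#!no-fold-case", 14) == 0) {
      opts.fold_case = false;
      port->fold_case_override = 0;
    } else {
      if (len >= cap)
        len = cap - 1;
      copy_token(&opts, start, len, out);
      return true;
    }
  }
}
```

With two toy ports read in an interleaved way, one containing
"#!fold-case FOO BAR" and the other "BAZ QUX", the first port yields
"foo" and "bar" while the second still yields "BAZ" and "QUX", because
the directive is stashed on the port where it appeared and the global
defaults are never touched.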