From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: "Philip McGrath" <philip@philipmcgrath.com>
Newsgroups: gmane.lisp.guile.devel
Subject: Re: [PATCH] add language/wisp to Guile?
Date: Sun, 26 Feb 2023 02:45:12 -0500
Message-ID: <981b0e74-96c0-4430-b693-7fc8026e3ead@app.fastmail.com>
References: <87h6w2fkz8.fsf@web.de>
 <f31a073e-6b8c-ce48-2917-53a13d76b108@telenet.be> <877cwxe4ar.fsf@web.de>
 <2f7d015d-ceb4-ef8f-b4fe-b69e39b723f8@telenet.be> <87357ldqaq.fsf@web.de>
 <1a70460e-11fb-9f5d-0d5f-1eb507d5af0d@telenet.be> <87ilg4j65e.fsf@web.de>
 <87edqsj5vt.fsf@web.de> <01212259-37dd-5d67-7bbc-101e01d96d01@telenet.be>
 <1a6c8dda-0124-124c-f932-937a11386ced@gmail.com> <87fsb5i912.fsf@web.de>
 <ece631e3-8b04-d538-1d8d-fd09fc8562cb@telenet.be>
 <08c725bd-84d4-4df9-a18c-6ee55d00634f@app.fastmail.com>
 <a20e0a05-8257-ee22-4af5-5d47871c0c90@telenet.be>
Mime-Version: 1.0
Content-Type: text/plain;charset=utf-8
Content-Transfer-Encoding: quoted-printable
Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214";
	logging-data="9959"; mail-complaints-to="usenet@ciao.gmane.io"
User-Agent: Cyrus-JMAP/3.9.0-alpha0-172-g9a2dae1853-fm-20230213.001-g9a2dae18
Cc: "Christine Lemmer-Webber" <cwebber@dustycloud.org>
To: "Maxime Devos" <maximedevos@telenet.be>,
 =?UTF-8?Q?Ludovic_Court=C3=A8s?= <ludo@gnu.org>,
 "Matt Wette" <matt.wette@gmail.com>, guile-devel@gnu.org
Original-X-From: guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org Sun Feb 26 08:46:20 2023
Return-path: <guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org>
Envelope-to: guile-devel@m.gmane-mx.org
Original-Received: from lists.gnu.org ([209.51.188.17])
	by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
	(Exim 4.92)
	(envelope-from <guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org>)
	id 1pWBjd-0002MP-SX
	for guile-devel@m.gmane-mx.org; Sun, 26 Feb 2023 08:46:19 +0100
Original-Received: from localhost ([::1] helo=lists1p.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.90_1)
	(envelope-from <guile-devel-bounces@gnu.org>)
	id 1pWBjG-0003Vw-Sg; Sun, 26 Feb 2023 02:45:54 -0500
Original-Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <philip@philipmcgrath.com>)
 id 1pWBj3-0003Uq-SA
 for guile-devel@gnu.org; Sun, 26 Feb 2023 02:45:43 -0500
Original-Received: from out4-smtp.messagingengine.com ([66.111.4.28])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <philip@philipmcgrath.com>)
 id 1pWBiy-00075h-Ui; Sun, 26 Feb 2023 02:45:40 -0500
Original-Received: from compute1.internal (compute1.nyi.internal [10.202.2.41])
 by mailout.nyi.internal (Postfix) with ESMTP id 4CC585C00CE;
 Sun, 26 Feb 2023 02:45:33 -0500 (EST)
Original-Received: from imap52 ([10.202.2.102])
 by compute1.internal (MEProxy); Sun, 26 Feb 2023 02:45:33 -0500
Original-Received: by mailuser.nyi.internal (Postfix, from userid 501)
 id EC67FC60091; Sun, 26 Feb 2023 02:45:32 -0500 (EST)
X-Mailer: MessagingEngine.com Webmail Interface
In-Reply-To: <a20e0a05-8257-ee22-4af5-5d47871c0c90@telenet.be>
X-BeenThere: guile-devel@gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Developers list for Guile,
 the GNU extensibility library" <guile-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/guile-devel>,
 <mailto:guile-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <https://lists.gnu.org/archive/html/guile-devel>
List-Post: <mailto:guile-devel@gnu.org>
List-Help: <mailto:guile-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/guile-devel>,
 <mailto:guile-devel-request@gnu.org?subject=subscribe>
Errors-To: guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org
Original-Sender: guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org
Xref: news.gmane.io gmane.lisp.guile.devel:21744
Archived-At: <http://permalink.gmane.org/gmane.lisp.guile.devel/21744>

Hi,

On Sat, Feb 18, 2023, at 10:58 AM, Maxime Devos wrote:
> On 18-02-2023 04:50, Philip McGrath wrote:
>> I haven't read the patch or this thread closely,
>
> I'll assume you have read it non-closely.
>
>> but R6RS has an answer to any concerns about compatibility with `#lan=
g`. At the beginning of Chapter 4, "Lexical and Datum Syntax" (<http://w=
ww.r6rs.org/final/html/r6rs/r6rs-Z-H-7.html#node_chap_4>) the report spe=
cifies:
>>=20
>>>   An implementation must not extend the lexical or datum syntax in a=
ny way, with one exception: it need not treat the syntax `#!<identifier>=
`, for any <identifier> (see section 4.2.4) that is not `r6rs`, as a syn=
tax violation, and it may use specific `#!`-prefixed identifiers as flag=
s indicating that subsequent input contains extensions to the standard l=
exical or datum syntax. The syntax `#!r6rs` may be used to signify that =
the input afterward is written with the lexical syntax and datum syntax =
described by this report. `#!r6rs` is otherwise treated as a comment; se=
e section 4.2.3.
>
> That is for '#!lang', not '#lang'.  R6RS allows the former, but the=20
> patch does the latter.  As such, R6RS does not have an answer about=20
> incompatibility with `#lang', unless you count =E2=80=98it's incompati=
ble=E2=80=99 as an=20
> answer.
>

Let me try to be more concrete.

If you want a portable, RnRS-standardized lexical syntax for `#lang`, us=
e `#!<identifier>`, and systems that understand `#lang` will treat it (i=
n appropriate contexts) as an alias for `#lang `.

Alternatively, you could embrace that Guile (like every other Scheme sys=
tem I'm aware of) starts by default in a mode with implementation-specif=
ic extensions. Indeed, R6RS Appendix A specifically recognizes that "the=
 default mode offered by a Scheme implementation may be non-conformant, =
and such a Scheme implementation may require special settings or declara=
tions to enter the report-conformant mode" [1]. Then you could just writ=
e `#lang` and worry about the non-portable block comments some other day=
. This is what I would personally prefer.

>> In Racket, in the initial configuration of the reader when reading a file, "`#!` is an alias for `#lang` followed by a space when `#!` is followed by alphanumeric ASCII, `+`, `-`, or `_`." (See <https://docs.racket-lang.org/reference/reader.html#%28part._parse-reader%29>.) [...]
>> (Guile does not handle `#!r6rs` properly, presumably because of the legacy `#!`/`!#` block comments. I think this should be a surmountable obstacle, though, especially since Guile does support standard `#|`/`|#` block comments.)
>
> =E2=80=98#! ... !#=E2=80=99 comments aren't legacy; they exist to allo=
w putting the=20
> shebang in the first line of a script, and to pass additional argument=
s=20
> to the Guile interpreter (see: (guile)The Top of a Script File) (*).  =
As=20
> such, you can't just replace them with #| ... |# (unless you patch the=20
> kernel to recognise "#| ..." as a shebang line).
>
> (*) Maybe they exist for other purposes too.

According to "(guile)Block Comments", the `#!...!#` syntax existed befor=
e Guile 2.0 added support for `#|...|#` comments from SRFI 30 and R6RS.

>
> Furthermore, according to the kernel, #!r6rs would mean that the scrip=
t=20
> needs to be interpreted by a program named 'r6rs', but 'guile' is name=
d=20
> 'guile', not 'r6rs'.  (I assume this is in POSIX somewhere, though I=20
> couldn't find it.)
>
> (This is an incompatibility between R6RS and any system that has sheba=
ngs.)
>

This is not an incompatibility, because the `#!r6rs` lexeme (or `#!<iden=
tifier>`, more generally) is not the shebang line for the script. R6RS A=
ppendix D [2] gives this example of a Scheme script:

```
#!/usr/bin/env scheme-script
#!r6rs
(import (rnrs base)
        (rnrs io ports)
        (rnrs programs))
(put-bytes (standard-output-port)
           (call-with-port
               (open-file-input-port
                 (cadr (command-line)))
             get-bytes-all))
```

The appendix says that, "if the first line of a script begins with `#!/`=
 or `#!<space>`, implementations should ignore it on all platforms, even=
 if it does not conform to the recommended syntax". Admittedly this is n=
ot handled as consistently as I would prefer: I wish they had just stand=
ardized `#!/` and `#! ` as special comment syntax, as Racket does, and c=
larified the interaction with `#!<identifier>`. But Matt points out that=
 JavaScript also has very similar special treatment for a single initial=
 shebang comment. Lua has a similar mechanism: my vague recollection is =
that many languages do.=20
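
To make the first-line cases above concrete, here is a rough sketch in Python (purely as a neutral modeling language; it is not Guile's or Racket's actual reader). The function name `classify_first_line` and its labels are made up for illustration. It assumes the two rules quoted in this thread: R6RS Appendix D's "ignore a first line beginning with `#!/` or `#!<space>`", and Racket's "`#!` followed by alphanumeric ASCII, `+`, `-`, or `_` is an alias for `#lang `".

```python
import re

# Racket's alias rule as quoted in this thread: "#!" acts like "#lang "
# when the next character is alphanumeric ASCII, "+", "-", or "_".
_ALIAS_START = re.compile(r"#![A-Za-z0-9+_-]")


def classify_first_line(line: str) -> str:
    """Classify how a reader might treat the first line of a source file."""
    if line.startswith("#!/") or line.startswith("#! "):
        return "shebang"      # ignored by the reader (R6RS Appendix D)
    if line.startswith("#lang "):
        return "lang"
    if _ALIAS_START.match(line):
        return "lang-alias"   # e.g. "#!r6rs"
    return "other"


print(classify_first_line("#!/usr/bin/env scheme-script"))
print(classify_first_line("#!r6rs"))
```

The point of the sketch is only that the shebang forms and the `#!<identifier>` forms are distinguishable from their first three characters, so the two uses of `#!` need not collide.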

>>>
>>> (^) it doesn't integrate with the module system -- more concretely,
>>> (use-modules (foo)) wouldn't try loading foo.js -- adding '-x' argum=
ents
>>> would solve that, but we agree that that would be unreasonable in ma=
ny
>>> situations.  (Alternatively one could place ECMAScript code in a file
>>> with extension '.scm' with a '#lang' / '-*- mode: ecmascript -*-', b=
ut
>>> ... no.)

Generally I would use `.scm` (or `.rkt`), and certainly I would do so if=
 there isn't some well-established other extension. If you are just usin=
g the file, you shouldn't necessarily have to care what language it's im=
plemented in internally.

In particular, I don't think the `#lang` concept should be conflated with editor configuration like `-*- mode: ecmascript -*-`. As an example, consider these two Racket programs:

```
#!datalog
parent(anchises, aeneas).
parent(aeneas, ascanius).
ancestor(A, B) :- parent(A, B).
ancestor(A, B) :- parent(A, C), ancestor(C, B).
ancestor(A, ascanius)?
```

```
#lang algol60
begin
    comment Credit to Rosetta Code;
    integer procedure fibonacci(n); value n; integer n;
    begin
        integer i, fn, fn1, fn2;
        fn2 := 1;
        fn1 := 0;
        fn  := 0;
        for i := 1 step 1 until n do begin
            fn  := fn1 + fn2;
            fn2 := fn1;
            fn1 := fn
        end;
        fibonacci := fn
    end;

    integer i;
    for i := 0 step 1 until 20 do printnln(fibonacci(i))
end
```

While I'm sure there are Emacs modes available for Datalog and Algol 60,=
 and some people might want to use them for these programs, I would prob=
ably want to edit them both in racket-mode: because racket-mode supports=
 the `#lang` protocol, it can obtain the syntax highlighting, indentatio=
n, and other support defined by each language, while also retaining the =
global features that all `#lang`-based languages get "for free", like a =
tool to rename variables that respects the actual model of scope. This i=
s one of the value propositions of the `#lang` system.

>>=20
>> Racket has a mechanism to enable additional source file extensions wi=
thout needing explicit command-line arguments by defining `module-suffix=
es` or `doc-modules-suffixes` in a metadata module that is consulted whe=
n the collection is "set up": https://docs.racket-lang.org/raco/setup-in=
fo.html However, this mechanism is not widely used.
>
> I guess this is an improvement over the runtime 'guile -x extension'.
> However, if I'm understanding 'setup-info.html' correctly, the downsid=
e=20
> is that you now need a separate file containing compilation settings.
>
> I have previously proposed a mechanism that makes the '-x' +=20
> '--language' a compile-time thing (i.e., embed the source file extensi=
on=20
> in the compiled .go; see previous e-mails in this thread), without=20
> having to make a separate file containing compilation settings.
>
> How is Racket's method an improvement over my proposal?
>

My focus in this thread is explaining and advocating for `#lang`. I see =
the whole business with file extensions as basically orthogonal to `#lan=
g`, and my opinions about it are much less strong, but I'll try to answe=
r your question. I think it would make sense for `.go` files to record t=
he file extension of their corresponding source files: Racket's `.zo` fi=
les do likewise. I don't object to a command-line option *at compile-tim=
e* (as you said) to enable additional file extensions, and I agree that =
there isn't a huge difference between that and an approach with a separa=
te configuration file, though I do find the configuration-file approach =
somewhat more declarative, which I prefer.

What I was really trying to argue here is that the file extension should=
 not determine the meaning of the program it contains: more on that belo=
w.

>> Overall, the experience of the Racket community strongly suggests tha=
t a file should say what language it is written in. Furthermore, that la=
nguage is a property of the code, not of its runtime environment, so env=
ironment variables, command-line options, and similar extralinguistic me=
chanism are a particularly poor fit for controlling it.
>
> Agreed on the 'no environment variables' thing, disagreed on the 'no=20
> command-line options'.  In the past e-mails in this thread, there was=20
> agreement on the =E2=80=98embed the source file extension in the compi=
led .go or=20
> something like that; and add -x extension stuff _when compiling_ (not=20
> runtime!) the software that uses the extension=E2=80=99.
>
> Do you any particular issues with that proposal?  AFAICT, it solves=20
> everything and is somewhat more straightforward that Racket.
>

I don't have particular issues with a compile-time command-line option t=
o determine which files to compile. I do object to using command-line op=
tions or file extensions to determine what language a file is written in=
.=20

>> File extensions are not the worst possible mechanisms, but they have =
similar problems: code written in an unsaved editor or a blog post may n=
ot have a file extension.
>
> With the proposal I wrote, it remains possible to override any 'file=20
> extension -> language' mapping.  It's not in any way incompatible with=20
> "-*- lang: whatever -*-"-like comments.
>
> Additionally, Guile can only load files that exist (i.e, 'saved'); Gui=
le=20
> is not an editor or blog reader, so these do not appear problems for=20
> Guile to me.
>

While it's true that the only files Guile can load are "files that exist=
", it's not true that "Guile can only load files": consider procedures l=
ike `eval-string`, `compile`, and, ultimately, `read-syntax`.

AFAICT, to the extent that Guile's current implementations of such proce=
dures support multiple languages, they rely on out-of-band configuration=
, like an optional `#:language` argument, which is just as extra-linguis=
tic as relying on command-line options, environment variables, or file e=
xtensions. What I'm trying to advocate is that programs should say in-ba=
nd, as part of their source code, what language they are written in.

> If the editor needs to determine the language for syntax highlighting =
or=20
> such, then there exist constructs like ';; -*- mode: scheme -*-' that=20
> are valid Scheme, but that's not a Guile matter.
>

See above for why the `#!language/wisp` option is perfectly valid R6RS S=
cheme and for some of my concerns about overloading editor configuration=
 to determine the semantics of programs.

More broadly, everyone who reads a piece of source code, including human=
s as well as editors and the `guile` executable, needs to know what lang=
uage it's written in to hope to understand it.

>> (For more on this theme, see the corresponding point of the Racket Ma=
nifesto: <https://cs.brown.edu/~sk/Publications/Papers/Published/fffkbmt=
-racket-manifesto/paper.pdf>) Actually writing the language into the sou=
rce code has proven to work well.
>
> What is the corresponding point?  I'm not finding any search results f=
or=20
> 'file extension' or 'file name', and I'm not finding any relevant sear=
ch=20
> results for 'editor'.  Could you give me a page reference and a releva=
nt=20
> quote?
>

I was trying to refer to section 5, "Racket Internalizes Extra-Linguisti=
c Mechanisms", which begins on p. 121 (p. 9 of the PDF). Admittedly, the=
 connection between the main set of examples they discuss and this conve=
rsation is non-obvious. Maybe the most relevant quote is the last paragr=
aph of that section, on p. 123 (PDF p. 11): "Finally, Racket also intern=
alizes other aspects of its context. Dating back to the beginning, Racke=
t programs can programmatically link modules and classes. In conventiona=
l languages, programmers must resort to extra-linguistic tools to abstra=
ct over such linguistic constructs; only ML-style languages and some scr=
ipting languages make modules and classes programmable, too." (Internal =
citations omitted.)

>> To end with an argument from authority, this is from Andy Wingo's "le=
ssons learned from guile, the ancient & spry" (<https://wingolog.org/arc=
hives/2020/02/07/lessons-learned-from-guile-the-ancient-spry>):
>>=20

Sorry, this was meant to be tongue-in-cheek, and it seems that didn't co=
me across. "Argument from authority" is often considered a category of l=
ogical fallacy, and ending with a quote is sometimes considered to be ba=
d style or to weaken a piece of persuasive writing.

>    * I previously pointed out some problems with that proposal
>      -- i.e., '#lang whatever' is bogus Scheme / Wisp / ...,

I hope I've explained why something like `#!language/wisp` is perfectly =
within the bounds of R6RS.

Also, given that Guile already starts with non-standard extensions enabl=
ed by default, I don't see any reason not to also support `#lang languag=
e/wisp`. In particular, the spelling of `#lang` proceeds directly from t=
he Scheme tradition. This is from the R6RS Rationale document, chapter 4=
, "Lexical Syntax", section 3, "Future Extensions" [3]:

>>>> The `#` is the prefix of several different kinds of syntactic entit=
ies: vectors, bytevectors, syntactic abbreviations related to syntax con=
struction, nested comments, characters, `#!r6rs`, and implementation-spe=
cific extensions to the syntax that start with `#!`. In each case, the c=
haracter following the `#` specifies what kind of syntactic datum follow=
s. In the case of bytevectors, the syntax anticipates several different =
kinds of homogeneous vectors, even though R6RS specifies only one. The `=
u8` after the `#v` identifies the components of the vector as unsigned 8=
-bit entities or octets.=20

>      and
>      'the module system won't find it, because of the unexpected
>      file extensions'.
>

This is indeed something that needs to be addressed, but it seems like a=
 very solvable problem. Using the extension ".scm" for everything would =
be one trivial solution. Something like your proposal to enable file ext=
ensions based on a compile-time option could likewise be part of a solut=
ion.

In general, I'll say that, while using Guile, I've often missed Racket's=
 more flexible constructs for importing modules. I especially miss `(req=
uire "foo/bar.rkt")`, which imports a module at a path relative to the m=
odule where the `require` form appears: it makes it easy to organize sma=
ll programs into multiple files without having to mess with a load path.
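
As a small illustration of what "relative to the module where the `require` form appears" means, here is a Python sketch. The helper name `resolve_relative` and the example paths are hypothetical; this is not how Racket implements `require`, only the path arithmetic behind the idea.

```python
from pathlib import PurePosixPath


def resolve_relative(requiring_module: str, spec: str) -> str:
    """Resolve `spec` against the directory of the module that names it."""
    base = PurePosixPath(requiring_module).parent
    return str(base / spec)


# The result depends only on where the requiring file lives, not on any
# load path or on the process's current directory.
print(resolve_relative("/home/me/proj/main.rkt", "foo/bar.rkt"))
```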

More messages have come since I started writing this reply, so I'll try =
to address them, too.

On Thu, Feb 23, 2023, at 1:04 PM, Maxime Devos wrote:
> On 23-02-2023 09:51, Dr. Arne Babenhauserheide wrote:
>>> Thinking a bit more about it, it should be possible to special-case
>>> Guile's interpretation of "#!" such that "#!r6rs" doesn't require a
>>> closing "!#".  (Technically backwards-incompatible, but I don't think
>>> people are writing #!r6rs ...!# in the wild.)
>> Do you need the closing !# if you restrict yourself to the first line?
>
> I thought so at first, but doing a little experiment, it appears you=20
> don't need to:
>
> $ guile
> scheme@(guile-user)> #!r6rs
> (display "hi") (newline)
>
> (output: hi)
>
> Apparently Guile already has required behaviour.
>

All the `#!r6rs` examples I've tried since I got Ludo's mail have worked, but I remember some not working as I'd expected in the past. I'll see if I can come up with any problematic examples again.

On Thu, Feb 23, 2023, at 1:42 PM, Maxime Devos wrote:
> Have you seen my messages on how the "#lang" construct is problematic=20
> for some languages, and how alternatives like "[comment delimiter] -*-=20
> stuff: scheme/ecmascript/... -*- [comment delimiter]" appear to be=20
> equally simple (*) and not have any downsides (**).
>
> (*) The port encoding detection supports "-*- coding: whatever -*-",=20
> presumably that functionality could be reused.
>

IMO, the use of  "-*- coding: whatever -*-" to detect encoding is an ugl=
y hack and should not be extended further.

I tried to raise some objections above to conflating editor configuratio=
n with syntax saying what a file's language is.

More broadly, I find "magic comments" highly objectionable. The whole point of comments is to be able to communicate freely to human readers without affecting the interpreter/compiler/evaluator. Introducing magic comments means you must constantly think about whether what you are writing for humans might change the meaning of your program. Magic comments *without knowing a priori what is a comment* are even worse: now, you have to beware of accidental "magic" in ALL of the lexical syntax of your program. (Consider that something like `(define (-*- mode: c++ -*-) 14)` is perfectly good Scheme.)
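
Here is a minimal Python sketch of that failure mode. The regular expression `MAGIC` and the helper `naive_magic` are my own invention for illustration, not Guile's or Emacs's actual detection logic. The assumption is a detector that scans raw text because it cannot yet know the file's comment syntax.

```python
import re

# A naive detector in the spirit of "-*- key: value -*-" magic comments.
# It scans raw text, since the comment syntax depends on the language
# that the magic comment is supposed to select.
MAGIC = re.compile(r"-\*-\s*(\w+):\s*(\S+)\s*-\*-")


def naive_magic(source: str):
    """Return (key, value) from the first magic-looking span, or None."""
    m = MAGIC.search(source)
    return (m.group(1), m.group(2)) if m else None


comment = ";; -*- mode: scheme -*-"
code = "(define (-*- mode: c++ -*-) 14)"

print(naive_magic(comment))
print(naive_magic(code))
```

The detector reports a `mode` for both inputs, even though the second one is ordinary code rather than a comment.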

(It's not really relevant for the `#lang`-like case, but something I fin=
d especially ironic about encoding "magic comments" or, say, `<?xml vers=
ion=3D"1.0" encoding=3D"UTF-8"?>`, is that suddenly if you encode the Un=
icode text in some other encoding it becomes a lie.)

On Fri, Feb 24, 2023, at 6:51 PM, Maxime Devos wrote:
> On 25-02-2023 00:48, Maxime Devos wrote:
>>>> (**)=C2=A0For=C2=A0compatibility=C2=A0with=C2=A0Racket,=C2=A0it's=C2=
=A0not=C2=A0like=C2=A0we=C2=A0couldn't
>>>> implement=C2=A0both=C2=A0"#lang"=C2=A0and=C2=A0"-*-=C2=A0stuff:=C2=A0=
language=C2=A0-*-".
>
> TBC, I mean =E2=80=98only support #lang' for values of 'lang' that Rac=
ket=20
> supports=E2=80=99

If I understand what you're proposing here, I don't think it's a viable =
option.

The fundamental purpose of the `#lang` construct (however you spell it) =
is to provide an open, extensible protocol for defining languages. Thus,=
 "values of 'lang' that Racket supports" are unbounded, provided that a =
module has been installed where the language specification says to look.=
 From The Racket Reference [4]:

>>>> The `#lang` reader form is similar to `#reader`, but more constrain=
ed: the `#lang` must be followed by a single space (ASCII 32), and then =
a non-empty sequence of alphanumeric ASCII, `+`, `-`, `_`, and/or `/` ch=
aracters terminated by whitespace or an end-of-file. The sequence must n=
ot start or end with `/`. A sequence `#lang =E2=80=B9name=E2=80=BA` is e=
quivalent to either `#reader (submod =E2=80=B9name=E2=80=BA reader)` or =
`#reader =E2=80=B9name=E2=80=BA/lang/reader`, where the former is tried =
first guarded by a `module-declared?` check (but after filtering by `cur=
rent-reader-guard`, so both are passed to the value of `current-reader-g=
uard` if the latter is used). Note that the terminating whitespace (if a=
ny) is not consumed before the external reading procedure is called.
>>>>
>>>> Finally, `#!` is an alias for `#lang` followed by a space when `#!`=
 is followed by alphanumeric ASCII, `+`, `-`, or `_`. Use of this alias =
is discouraged except as needed to construct programs that conform to ce=
rtain grammars, such as that of R6RS [Sperber07].

(The rationale for the constraints, which Racketeers generally tend to chafe against, is that the syntax of `#lang ‹name›` is the one and only thing that `#lang` doesn't give us a way to compatibly change. We can quickly get to a less constrained syntax by using a chaining "meta-language": see `#lang s-exp` and `#lang reader` on that page for two of many examples.)
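
The constraints in the quoted passage are simple enough to sketch as a recognizer. The following Python is only a model of the grammar as quoted from The Racket Reference, not Racket's reader; the names `parse_lang`, `_NAME`, and `_LANG` are hypothetical.

```python
import re

# ‹name›: ASCII alphanumerics, "+", "-", "_", "/"; it must be non-empty
# and must not start or end with "/".
_NAME = r"[A-Za-z0-9+_-](?:[A-Za-z0-9+_/-]*[A-Za-z0-9+_-])?"
# "#lang" needs exactly one space; "#!" is the alias form. The name must
# be terminated by whitespace or end of input (checked, not consumed).
_LANG = re.compile(r"(?:#lang |#!)(" + _NAME + r")(?=\s|\Z)")


def parse_lang(source: str):
    """Return (name, rest) for a leading #lang/#! form, else None.

    `rest` starts at the terminating whitespace, which is left in place
    for whatever reader takes over next.
    """
    m = _LANG.match(source)
    if m is None:
        return None
    return m.group(1), source[m.end():]


print(parse_lang("#lang racket/base\n(define x 1)"))
print(parse_lang("#!/usr/bin/env racket"))
```

Because a name may not begin with `/`, a shebang line such as `#!/usr/bin/env racket` is rejected by this recognizer rather than misread as a language name.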

I expect reading this would raise more questions, because that page give=
s lots of details on Racket's `#lang` protocol. Do I really expect Guile=
 to implement all of those details? If not, in what sense is what I'm ad=
vocating actually compatible with `#lang`?

I am definitely **not** suggesting that Guile implement all the details =
of Racket's `#lang` implementation. What I do strongly advocate is that =
you design Guile's support for `#lang` (or `#!`) to leave open a pathway=
 for compatibility in the future.

I think the best way to explain how that would work is to take as an extended example Zuo, the tiny Scheme-like language created last year to replace the build scripts for Racket and Racket's branch of Chez Scheme. Zuo was initially prototyped in Racket as a `#lang` language. Since the goal was to use Zuo to build Racket, the primary implementation is an interpreter implemented in a single file of C code, avoiding bootstrapping issues. There isn't a working Zuo implementation as a Racket `#lang` at the moment. (There's a shim implementation, and there's some work in progress, as people have time and interest, to get a real implementation working again.)

Zuo is based on `#lang`, but its protocol [5][6] is quite different than=
 Racket's. Nevertheless, as I will explain, they are compatible.

The C code in fact implements not `#lang zuo` or even `#lang zuo/base` but `#lang zuo/kernel`: the rest of `#lang zuo` is implemented in Zuo, building up to `#lang zuo` through a series of internal languages. A module written in `#lang zuo/kernel` is a single expression which produces an immutable symbol-keyed hash table, which is Zuo's core representation of a module. When Zuo encounters `#lang whatever`, it looks up the symbol `'read-and-eval` in the hash table representing the module `whatever`: the result should be a procedure that, given a Zuo string (a Scheme bytevector) with the source of the module, returns a hash table to be used as the module's representation.
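
A toy model of that protocol, in Python, might look like the following. Everything here is an assumption for illustration: the `MODULES` registry, the `shout` language, and the choice to hand the procedure only the text after the `#lang` line (the real Zuo protocol is documented at [6] and may pass more than this).

```python
# Hypothetical registry: module name -> module representation. Python
# dicts stand in for Zuo's immutable symbol-keyed hash tables.
MODULES = {}


def shout_read_and_eval(source: str) -> dict:
    """A toy language: the module exports its body, upper-cased."""
    return {"body": source.strip().upper()}


# A language is itself just a module whose table maps 'read-and-eval
# to a procedure from source text to a new module table.
MODULES["shout"] = {"read-and-eval": shout_read_and_eval}


def load_module(source: str) -> dict:
    """Dispatch on the leading `#lang name` line."""
    first, _, body = source.partition("\n")
    if not first.startswith("#lang "):
        raise ValueError("module must start with #lang")
    language = MODULES[first[len("#lang "):].strip()]
    return language["read-and-eval"](body)


print(load_module("#lang shout\nhello"))
```

The only thing the loader knows about a language is one key in one table, which is what makes the protocol easy to bridge to other systems.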

An implementation of `#lang zuo/kernel` in Racket would bridge this prot=
ocol with Racket's `#lang` by synthesizing `reader` submodules implement=
ing the procedures the Racket protocol expects by wrapping the procedure=
 mapped to `'read-and-eval` in the Zuo-level hash table. The wrappers wo=
uld propagate themselves, so a language implemented in a language implem=
ented in `#lang zuo/kernel` would likewise be automatically bridged, and=
 so on ad infinitum. Racket's submodules [7] make this work especially e=
legantly.

In Guile, my experience with the tower of languages is limited, but AIUI many of the existing facilities are like `lookup-language` [8] in expecting language X to be implemented by a language object bound to X in the module `(language X spec)`. I'd suggest that Guile support `#lang language/X` (or `#!language/X`, if you prefer to spell it that way) by likewise looking up X in the `(language X spec)` module. One day, compatibility could be achieved by adding trivial bridge (sub)modules: for an illustration of how trivial this can be, see [9], a one-line module that makes SRFI 11 available as `(import (srfi :11))` for R6RS by wrapping its historical PLT Scheme location, `(require srfi/11)`.

I would NOT suggest supporting arbitrary things after `#lang`, because o=
ne part of planning for compatibility is avoiding future namespace colli=
sions. Happily, `language/` is not otherwise in use in the Racket world,=
 so I suggest that Guile claim it. I don't think this should be overly r=
estrictive: if it seems worth-while to support languages from other modu=
les, you could implement the "chaining meta-language" approach I mention=
ed above: imagine something like `#!language/other (@ (some other module=
) exported-language)`, where the `other` export of `(language other spec=
)` is responsible for reading the next datum and using it to obtain the =
language object to be used for the rest of the module.
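
To sketch the name mapping I have in mind (as a Python model, not actual Guile code): the helper `spec_module_for` is hypothetical, and restricting X to a single path component is my own assumption rather than something Guile requires.

```python
def spec_module_for(lang_name: str):
    """Map `language/X` to ((language X spec), X).

    The first element models the Guile module name and the second the
    binding to look up in it, following the (language X spec) convention.
    """
    prefix = "language/"
    if not lang_name.startswith(prefix):
        # Names outside the claimed namespace are rejected, not guessed.
        raise ValueError("only names under language/ are supported")
    x = lang_name[len(prefix):]
    if not x or "/" in x:
        # Assumption of this sketch: exactly one component after the prefix.
        raise ValueError("expected exactly one component after language/")
    return ("language", x, "spec"), x


print(spec_module_for("language/wisp"))
```

Rejecting everything outside `language/` is the design choice that matters: it keeps the rest of the `#lang` namespace free for later compatibility work.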

(Other kinds of potential namespace collisions are easier to manage: for=
 example, we could imagine that `(use-modules (foo bar baz))` might not =
access the same module as `(require foo/bar/baz)`. This is in a way an e=
xample of where it makes sense to be constrained in the syntax of `#lang=
` itself and let `#lang` unlock endless possibilities.)

I've sort of alluded above to my pipe dream of a grand unified future fo=
r Racket-and-Guile-on-Chez, Guile-and-Racket-on-the-Guile-VM, and endles=
s other possibilities. I wrote about it in more detail on the guix-devel=
 list at [10]. (These thoughts were inspired by conversations with Chris=
tine Lemmer-Webber, though she bears no responsibility for my zany imagi=
nings.)

Finally, I looked into the history of `#!` in R6RS a bit, and I'll leave a few pointers here for posterity. Will Clinger's 2015 Scheme Workshop paper [11] says in section 3.1 that "Kent Dybvig suggested the `#!r6rs` flag in May 2006", Clinger "formally proposed addition of Dybvig’s suggestion" [12], and, "less than six weeks later," `#!r6rs` was "in the R6RS editors’ status report". (I am not persuaded by all of the arguments about `#!r6rs` in that paper: in particular, the analysis doesn't seem to account for R6RS Appendix A [1].) As best as I can tell, the suggestion from Kent Dybvig is [13]:

On Wed May 10 15:40:13 EDT 2006, Kent Dybvig wrote:
> We already have (as of last week's meeting) a syntax for dealing with
> implementation-dependent lexical exceptions, which is to allow for
> #!<symbol-like-thing>, e.g.:
>
>  #!mzsceheme
>  #!larceny
>  ...
>
> Perhaps we can plan on using the same tool for future extensions to the
> syntax:
>
>  #!r7rs
>
> We can even require #!r6rs to appear at the top of a library now, or at
> least allow it to be included.
>
> This is a lot more concise than a MIME content-type line.
>
> Kent

I haven't tracked down any older writing about `#!<symbol-like-thing>` f=
or "implementation-dependent lexical exceptions": it may have been a con=
ference call.

-Philip

[1]: http://www.r6rs.org/final/html/r6rs-app/r6rs-app-Z-H-3.html#node_chap_A
[2]: http://www.r6rs.org/final/html/r6rs-app/r6rs-app-Z-H-6.html#node_chap_D
[3]: http://www.r6rs.org/final/html/r6rs-rationale/r6rs-rationale-Z-H-6.html#node_chap_4
[4]: https://docs.racket-lang.org/reference/reader.html#%28part._parse-reader%29
[5]: https://docs.racket-lang.org/zuo/Zuo_Overview.html#%28part._.Zuo_.Implementation_and_.Macros%29
[6]: https://docs.racket-lang.org/zuo/Zuo_Overview.html#%28part._module-protocol%29
[7]: https://www-old.cs.utah.edu/plt/publications/gpce13-f-color.pdf
[8]: https://www.gnu.org/software/guile/manual/html_node/Compiler-Tower.html#index-lookup_002dlanguage
[9]: https://github.com/racket/srfi/blob/25eb1c0e1ab8a1fa227750aa7f0689a2c531f8c8/srfi-lib/srfi/%253a11.rkt
[10]: https://lists.gnu.org/archive/html/guix-devel/2021-10/msg00010.html
[11]: https://andykeep.com/SchemeWorkshop2015/papers/sfpw1-2015-clinger.pdf
[12]: http://www.r6rs.org/r6rs-editors/2006-May/001251.html
[13]: http://www.r6rs.org/r6rs-editors/2006-May/001248.html