On 04-02-2023 16:46, Dr. Arne Babenhauserheide wrote:
> [...]
>>> So I’d like to ask: can we merge Wisp as supported language into Guile?
>>
>>  From some conversations elsewhere, I got the impression that
>>
>> (use-modules (foo))
>>
>> will search for foo.scm and not in foo.w.  I think you'll need to
>> tweak the loading mechanism to also look for foo.w instead of only
>> foo.scm, if not done already.
> 
> This needs an addition to the extensions via guile -x .w — I wrote that
> in the documentation. I didn’t want to do that unconditionally, because
> detecting a wisp file as scheme import would cause errors.

If done carefully, I don't think this situation would happen.
More precisely:

   * .w would be in the file extensions list.

   * Instead of a list, it would actually be a map from extensions to
     languages:

       .scm -> scheme
       .w -> wisp

     With this change, (use-modules (foo)) will load 'foo.scm' as Scheme
     and 'foo.w' as Wisp.  (Assuming that foo.go is out-of-date or
     doesn't exist.)

     (For backwards compatibility, I think %load-extensions needs to
     remain a list of strings, but a %extension-language variable could
     be defined.)

   * "guile --language=whatever foo" loads foo as whatever, regardless
     of the extension of 'foo' (if a specific language is requested,
     then the user knows best).

   * "guile foo" without --language will look up the extension of foo in
     the extension map. If an entry exists, it would use the
     corresponding language.  If no entry exists, it would use
     a default language (scheme).

With these changes, I don't think that Wisp code would be detected as 
Scheme or the other way around.
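
To make the lookup order above concrete, here is a minimal sketch in Guile
Scheme.  Everything in it is hypothetical: %extension-language is only the
variable name suggested above, and language-for-file is a helper invented
for illustration, not an existing Guile procedure.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical map from file extensions to language names.
;; %load-extensions itself would stay a plain list of strings.
(define %extension-language
  '((".scm" . scheme)
    (".w"   . wisp)))

;; Pick the language for FILE.  An explicitly REQUESTED language
;; (think --language=...) always wins; otherwise the extension map is
;; consulted; otherwise fall back to Scheme.
(define* (language-for-file file #:optional requested)
  (or requested
      (any (lambda (entry)
             (and (string-suffix? (car entry) file)
                  (cdr entry)))
           %extension-language)
      'scheme))
```

With this, (language-for-file "foo.w") picks wisp, a file without a known
extension falls back to scheme, and an explicit second argument overrides
both.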

> Is there a way to only extend the loading mechanism to detect .w when
> language is changed to wisp?

Regardless of whether it's technically possible, that sounds 
insufficient to me.

Suppose someone writes a library 'Foo' in Wisp.
Suppose I write a library 'Bar' in parenthese-y Scheme, that happens to 
use the Foo library as a dependency.

Then when compiling Bar or running its tests, it will be done in the 
Scheme language, and additionally assuming that compiled .go are 
available for Foo, then the language will never be changed to Wisp, and 
hence .w will never be added to %load-extensions.

As such, the Makefile.am or equivalent of Foo would need to be converted 
to Wisp, or '-x w' would need to be added.

I don't care what language the library Foo is written in, and my library 
Bar isn't written in Wisp so it seems unreasonable to have to add -x w. 
(It wouldn't be too much trouble, but still not something that should 
have to be done _in Bar_, as the Wispyness of Foo is just an 
implementation detail of Foo, not Bar.)

Worse, adding the Wispy library Foo as a dependency of the parenthese-y 
library Bar would be an incompatible change, as parenthese-y dependents of 
Foo would need to add '-x w' in places where they didn't have to 
previously.  It's easily resolvable, but I think it would be very annoying 
as well.

> readable uses

This sentence appears to be incomplete; I might have misinterpreted it 
below (I don't know what you mean by 'readable' -- it's an adjective 
and you are using it as a noun?).

> (set! %load-extensions (cons ".sscm" %load-extensions))
> 
> Would that be the correct way of doing this?

I assume you meant ".w" instead of ".sscm".  I don't quite see how this 
would be an answer to:

   Is there a way to only extend the loading mechanism to detect .w when
   language is changed to wisp?

More precisely, I'm missing how it addresses 'only ... when the language 
is changed to wisp'.
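
For reference, the unconditional variant (with ".w" substituted for
".sscm") would look roughly like the sketch below.  %load-extensions is the
existing Guile variable; add-load-extension! is a hypothetical helper name.
Nothing in it depends on the current language, which is the part I'm
missing.

```scheme
;; Hypothetical helper: add EXT to %load-extensions once, unconditionally.
(define (add-load-extension! ext)
  (unless (member ext %load-extensions)
    (set! %load-extensions (cons ext %load-extensions))))

(add-load-extension! ".w")
```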

FWIW, it appears to be an answer to the following unasked question:

   How to make Guile accept "foo.go" when "foo.w" exists and is
   up-to-date.

>> Also, I think that when foo.go exists, but foo.scm doesn't, then Guile
>> refuses to load foo.scm, though I'm less sure of that. If this is the
>> case, I propose removing the requirement that the source code is
>> available, or alternatively keep the 'source code available'
>> requirement and also accept 'foo.w', if not done already.
> 
> I think accepting any extension supported by any language in Guile would
> be better.

This sounds like the second proposal ('alternatively ...'), but the way 
it is written, you appear to be proposing it as a third proposal.  Is this 
the case?

(I mean, after this patch, Wisp is a supported language, so it seems 
equivalent to me.)

>>> +; Set locale to something which supports unicode. Required to avoid using fluids.
>>> +(catch #t
>>
>>   * Why avoid fluids?
> 
> I’m not sure anymore. It has been years since I wrote that code …
> 
> I think it was because I did not understand what that would mean for the
> program. And I actually still don’t know …
> 
> How would I do that instead with fluids?
> 
>>   * Assuming for sake of argument that fluids are to be avoided,
>>     what is the point of setting the locale to something supporting
>>     Unicode?
> 
> I had problems with reading unicode symbols. Things like
> define (Σ . args) : apply + args
> [...]
>
> This is to ensure that Wisp are always read as Unicode. Since it uses
> regular (read) as part of parsing, it must affect (read), too.

OK.  So, Wisp files are supposed to be UTF-8, no matter the locale?
AFAICT, the SRFI-119 document does not mention this UTF-8 (or UTF-16, or 
...) requirement anywhere; this seems like an omission in 
<https://srfi.schemers.org/srfi-119/srfi-119.html> to me.

First, I would like to point out the following part of
‘(guile)The Top of a Script File’:

    • If this source code file is not ASCII or ISO-8859-1 encoded, a
      coding declaration such as ‘coding: utf-8’ should appear in a
      comment somewhere in the first five lines of the file: see *note
      Character Encoding of Source Files::.

Going by this, it is already possible to ask Guile to read the Scheme 
files as UTF-8; presumably the relevant bits could be copied over to 
Wisp. (I don't know if this applies to non-script files, but I'd assume so.)

It's not 'UTF-8 by default', but it can be 'close enough', and doing 
'always UTF-8 even if coding: something-else' would be inconsistent with 
the Scheme language, so I ask you to consider whether it's worth it (and 
perhaps the answer is 'yes').

(OTOH, (guile)Character Encoding says 'In the absence of any hints, 
UTF-8 is assumed.' which appears to suffice for you, but it also 
contradicts "If this source file is not ASCII or ISO-8859-1 encoded, 
...", so I don't know what precisely is going on here.)

If you aren't going for the 'coding: ...' stuff or porting the encoding 
autodetection from Scheme to Wisp, here's an alternative solution:

Keep in mind that encodings are a per-port property -- the locale might 
have a default encoding, and ports by default take the encoding from 
%default-port-encoding or the locale (I think), but you can override the 
port encoding:

  -- Scheme Procedure: set-port-encoding! port enc
  -- C Function: scm_set_port_encoding_x (port, enc)
      Sets the character encoding that will be used to interpret I/O to
      PORT.  ENC is a string containing the name of an encoding.  Valid
      encoding names are those defined by IANA
      (http://www.iana.org/assignments/character-sets), for example
      ‘"UTF-8"’ or ‘"ISO-8859-1"’.

As such, I propose calling set-port-encoding! right in the beginning of 
read-one-wisp-sexp.
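
A minimal sketch of what I mean, assuming the reader receives the port as
an argument.  read-as-utf8 is a hypothetical stand-in for the first lines
of read-one-wisp-sexp, not the actual Wisp code; set-port-encoding! and
port-encoding are the existing Guile procedures.  The Σ is built from its
code point so that the example does not depend on the encoding of the
example itself.

```scheme
(use-modules (ice-9 binary-ports)
             (rnrs bytevectors))

;; Hypothetical stand-in for the beginning of read-one-wisp-sexp:
;; force the port to UTF-8 before reading anything from it.
(define (read-as-utf8 port)
  (set-port-encoding! port "UTF-8")
  (read port))

;; U+03A3 GREEK CAPITAL LETTER SIGMA, built from its code point.
(define sigma (string (integer->char #x3A3)))

;; A port over raw UTF-8 bytes, independent of the current locale.
(define port
  (open-bytevector-input-port
   (string->utf8
    (string-append "(define (" sigma " . args) (apply + args))"))))

(define form (read-as-utf8 port))
```

Here form is the list read from the port, and (caadr form) is the symbol
named Σ, whatever the locale says.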

Also, unrelated, I now noticed some dead code you can remove:

+(define wisp-pending-sexps (list))

 > [...]

>>> +(define (wisp-replace-paren-quotation-repr code)
>>> +         "Replace lists starting with a quotation symbol by
>>> +         quoted lists."
>>> +         (match code
>>> +             (('REPR-QUOTE-e749c73d-c826-47e2-a798-c16c13cb89dd a ...)
>>> +                (list 'quote (map wisp-replace-paren-quotation-repr a)))
>>> [...]
>>> +(define wisp-uuid "e749c73d-c826-47e2-a798-c16c13cb89dd")
>>> +; define an intermediate dot replacement with UUID to avoid clashes.
>>> +(define repr-dot ; .
>>> +       (string->symbol (string-append "REPR-DOT-" wisp-uuid)))
>>
>> There is a risk of collision -- e.g., suppose that someone translates
>> your implementation of Wisp into Wisp.  I imagine there might be a
>> risk of misinterpreting the 'REPR-QUOTE-...' in
>> wisp-replace-paren-quotation-repr, though I haven't tried it out.
> 
> This is actually auto-translated from wisp via wisp2lisp :-)
> 
>> As such, assuming this actually works, I propose using uninterned
>> symbols instead, e.g.:
>>
>> (define repr-dot (make-symbol "REPR-DOT")).
> 
> That looks better — does uninterned symbol mean it can’t be
> mis-interpreted?

Yes.  This is because 'read' only reads interned symbols; uninterned 
symbols are unreadable:

scheme@(guile-user)> (make-symbol "foo")
$1 = #<uninterned-symbol foo 7f17efab7240>
scheme@(guile-user)> #<uninterned-symbol foo 7f17efab7240>
While reading expression:
#<unknown port>:2:3: Unknown # object: "#<"

Also: (eq? (make-symbol "stuff") 'stuff) -> #false.

> Can I (match l ...) on uninterned symbols? They are used to match on
> precisely these symbols later.

Yes, but it's going to look different and more verbose:

(define uninterned-symbol1 (make-symbol "foo1"))
(define uninterned-symbol2 (make-symbol "foo2"))
(match symbol
   ((? (lambda (x)
         (eq? x uninterned-symbol1)))
    stuff1)
   ((? (lambda (x)
         (eq? x uninterned-symbol2)))
    stuff2)
   [...])

-- basically, replace 'stuff by (? (lambda (x) ...)).
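
A self-contained version of the same idea, as a sketch: repr-quote? and
unwrap-quotation are hypothetical names, and this is not the actual Wisp
code.  Naming the predicate keeps the pattern a bit shorter than an inline
lambda.

```scheme
(use-modules (ice-9 match))

;; Uninterned marker: 'read' can never produce this exact object.
(define repr-quote (make-symbol "REPR-QUOTE"))

(define (repr-quote? x)
  (eq? x repr-quote))

;; Turn (<marker> a ...) into (quote (a ...)); leave anything else alone.
(define (unwrap-quotation code)
  (match code
    (((? repr-quote?) a ...) (list 'quote a))
    (other other)))
```

(unwrap-quotation (list repr-quote 'x 'y)) is rewritten, whereas a list
that merely starts with the interned symbol 'REPR-QUOTE is left untouched.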

> Can I write it into a string and then read it back?

No.  If you could, then uninterned symbols wouldn't be uninterned 
anymore, but rather a separation of symbols into two kinds that pretty 
much behave the same, and then you would again have a (very low) risk of 
a collision.

> When I see them, I have to turn them into a different representation
> that I can then write back into the string and allow it to be read by
> the normal reader.

That's the case for the old code, but AFAIK it is only done in the 
following ...

> 
>> If this change is done, you might need to replace
>>
>> +             ;; literal array as start of a line: # (a b) c -> (#(a b) c)
>> +             ((#\# a ...)
>> +               (with-input-from-string ;; hack to defer to read
>> +                   (string-append "#"
>> +                       (with-output-to-string
>> +                           (λ ()
>> +                             (write (map wisp-replace-paren-quotation-repr a)
>> +                                     (current-output-port)))))
>> +                   read))
>>
>> (unverified -- I think removing this is unneeded but I don't
>> understand this REPR-... stuff well enough).

..., for which I proposed a replacement, so do you still need to turn it 
into a string & back?

> 
> The REPR supports the syntactic sugar like '(...) for (quote ...) by turning
> (' ...) into '(...).
> 
> Also it is needed to turn ((. a b c)) into (a b c).
> 
> However the literal array is used to make it possible to define
> procedure properties which need a literal array.
> 
>> Also, I wonder if you could just do something like
>>
>>    (apply vector (map wisp-replace-paren-quotation-repr a))
>>
>> instead of this 'hack to defer to read' thing.  This seems simpler to
>> me and equivalent.
> 
> That looks much cleaner. Thank you!

This sounds positive, but it is unclear to me if I have found a 
solution, because of your negative "However the literal array is used to 
make it possible to define procedure properties which need a literal 
array." comment.

Do I need to look into solving the 'literal array and procedure 
properties' stuff, or does the (apply vector (map ...)) suffice as-is?
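
For what it's worth, here is a rough sketch of why I consider the two
approaches equivalent for plain data.  vector-via-read and vector-via-apply
are hypothetical names, and the sketch leaves out the
wisp-replace-paren-quotation-repr step; it has not been tried against the
actual Wisp reader.

```scheme
;; The 'hack to defer to read': print the items, prepend "#", read back.
(define (vector-via-read items)
  (with-input-from-string
      (string-append "#"
                     (with-output-to-string
                       (lambda ()
                         (write items (current-output-port)))))
    read))

;; The proposed replacement: build the vector directly.
(define (vector-via-apply items)
  (apply vector items))

(define sample '(a 1 "two"))
(define same?
  (equal? (vector-via-read sample) (vector-via-apply sample)))
```

The direct version also keeps working for items that cannot be written and
read back, such as uninterned symbols.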

(If there is 'literal array and procedure properties' stuff to be 
solved, you will need to elaborate on what you mean, because arrays 
aren't procedures and procedures aren't arrays -- maybe you meant 
'object properties'?)

Greetings,
Maxime.