unofficial mirror of guile-devel@gnu.org 
* Recursive Macros generating Definitions
@ 2022-10-03 11:32 Frank Terbeck
  2022-10-03 12:48 ` Maxime Devos
  0 siblings, 1 reply; 5+ messages in thread
From: Frank Terbeck @ 2022-10-03 11:32 UTC (permalink / raw)
  To: guile-devel

Good day, good people!

There might  be a bug  in recursive macro  expansion, at least  when the
definition of parameters, using (define …) and similar is involved. Here
is a slightly simplified example.

The purpose  of this macro  is to define a  couple of short-hands  for a
generic encoder/decoder pair  of functions. The intention is  to call it
like this:

  (generate-shorthands (unsigned-integer twos-complement zig-zag)
                       (32 64 128 256 512))

…to generate 5*3*2 = 30 functions,  that call the generic functions with
the proper concrete  arguments. The macro is  implemented recursively to
generate all desired combinations.

If called like that, this implementation generates names like:

  varint:sint32-decode-ea351ae5fca3566

This seems to be connected to the recursiveness of the macro. When the
base case is called manually (see the example at the end to reproduce),
the intended name is generated:

  varint:sint32-decode

This happens  with guile 3.0.5,  3.0.8 as well  as the current  git main
branch HEAD. It does not seem to happen in 2.0.0.

I've bisected this down to:

    commit de41e56492666801078e73860a358e1c63cbc8c2
    Author: Andy Wingo <wingo@pobox.com>
    Date:   Fri Nov 4 19:34:22 2011 +0100

    hygienically rename macro-introduced bindings, reproducibly

    * module/ice-9/psyntax.scm (chi-top-sequence): Detect bindings to
    identifiers introduced by macros.  In that case, in order to preserve
    hygiene, uniquify the variable's name, but in a way that is
    reproduceable (i.e., yields the same uniquified name after a
    recompile).

    module/ice-9/psyntax.scm | 22 ++++++++++++++++++++--
    1 file changed, 20 insertions(+), 2 deletions(-)


When looking at  this, I also saw the following,  which might be related
if ‘syntax-rules’ is implemented using  ‘syntax-case’ (I didn't check if
this is the case):

    (define-syntax-rule (foobar n) (define quux n))
    ,exp (foobar 23)
  → (define quux-ea7bdcf8675f4a4 23)

Here's the code, which can be loaded into a REPL, followed by example
REPL macro expansion calls that reproduce the issue:


(use-modules (ice-9 match))

(define-syntax generate-shorthands
  (lambda (x)
    ;; This is a helper that makes a name depending on semantics and width. It is
    ;; completely inconsequential to the issue and can be ignored.
    (define (make-base-name s w)
      (symbol-append 'varint:
                     (match (syntax->datum s)
                       ('unsigned-integer 'uint)
                       ('twos-complement  'int)
                       ('zig-zag          'sint))
                     (string->symbol (number->string (syntax->datum w)))))

    ;; The first two cases of this syntax-case recur on generate-shorthands, to
    ;; iterate on the list input to generate all desired combinations.
    (syntax-case x ()
      ;; (_ LIST-OF-SEMANTICS-SYMBOLS LIST-OF-WIDTH-LITERALS)
      ((_ (sems ...) (widths ...))
       (format #t "# Outer~%")  ;; (format #t …) returns #t, so it can be
                                ;; called in guard position to get a trace.
       #'(begin (generate-shorthands sems (widths ...)) ...))

      ;; (_ SEMANTICS-SYMBOL LIST-OF-WIDTH-LITERALS)
      ((_ sem (widths ...))
       (and (format #t "# Middle~%")
            (identifier? #'sem))
       #'(begin (generate-shorthands sem widths) ...))

      ;; Base case:
      ;; (_ SEMANTICS-SYMBOL WIDTH-LITERAL)
      ((_ s w)
       (and (format #t "# Inner~%")
            (identifier? #'s)
            (integer? (syntax->datum #'w)))
       (let ((base (make-base-name #'s #'w)))
         (with-syntax ((enc (datum->syntax x (symbol-append base '-encode)))
                       (dec (datum->syntax x (symbol-append base '-decode))))
           #'(begin (define (dec bv) (varint-decode bv w s))
                    (define (enc  n) (varint-encode  n w s)))))))))


;; Example expansions:

;; ,exp (generate-shorthands (zig-zag) (32))
;; # Outer
;; # Middle
;; # Inner
;; (begin (define (varint:sint32-decode-ea351ae5fca3566 bv) (varint-decode bv 32 zig-zag))
;;        (define (varint:sint32-encode-e47ba11af8c0627  n) (varint-encode  n 32 zig-zag)))

;; ,exp (generate-shorthands zig-zag (32))
;; # Middle
;; # Inner
;; (begin (define (varint:sint32-decode-ea351ae5fca3566 bv) (varint-decode bv 32 zig-zag))
;;        (define (varint:sint32-encode-e47ba11af8c0627  n) (varint-encode  n 32 zig-zag)))


;; ,exp (generate-shorthands zig-zag 32)
;; # Inner
;; (begin (define (varint:sint32-decode bv) (varint-decode bv 32 zig-zag))
;;        (define (varint:sint32-encode  n) (varint-encode  n 32 zig-zag)))




* Re: Recursive Macros generating Definitions
  2022-10-03 11:32 Recursive Macros generating Definitions Frank Terbeck
@ 2022-10-03 12:48 ` Maxime Devos
  2022-10-03 13:41   ` Frank Terbeck
  0 siblings, 1 reply; 5+ messages in thread
From: Maxime Devos @ 2022-10-03 12:48 UTC (permalink / raw)
  To: Frank Terbeck, guile-devel



On 03-10-2022 13:32, Frank Terbeck wrote:
> When looking at  this, I also saw the following,  which might be related
> if ‘syntax-rules’ is implemented using  ‘syntax-case’

It is, IIRC.

> (I didn't check if
> this is the case):
> 
>      (define-syntax-rule (foobar n) (define quux n))
>      ,exp (foobar 23)
>    → (define quux-ea7bdcf8675f4a4 23)

This is correct (as in, functioning as intended and not a bug) to my 
understanding -- 'quux' does not appear in the pattern of 'foobar', so, 
for hygiene, the 'quux' inside shouldn't be the 'quux' outside.

Compare:

(define-syntax-rule (define-pair-contents pair the-car the-cdr)
   (begin
     ;; Only compute it once.  Due to lexical hygiene, this won't
     ;; interfere with any 'p' in the environment.
     (define p pair)
     (define the-car (car p))
     (define the-cdr (cdr p))))

-- this shouldn't be expanded to

(define p pair)
(define the-car (car p))
(define the-cdr (cdr p))

because of hygiene (the environment might already be using 'p' for 
something else).

It's sometimes a bit inconvenient -- sometimes you _want_ to define 
'quux' so that it is available outside the macro as well, but that's 
easily resolved by adding an additional 'quux' argument to 'foobar':

       (define-syntax-rule (foobar quux n) (define quux n))
       ,exp (foobar quux 23)
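
To make the contrast concrete, here is a minimal sketch that can be loaded as a script. The macro names `define-hidden` and `define-named` are made up for this illustration and do not come from the thread. It assumes a Guile version that renames macro-introduced top-level bindings, as described above.

```scheme
;; 'quux' appears only in the template, so the expander treats the
;; binding as macro-introduced and gives it a uniquified name.
(define-syntax-rule (define-hidden n)
  (define quux n))

;; Here the name is supplied by the caller, so it is bound as written.
(define-syntax-rule (define-named name n)
  (define name n))

(define-hidden 23)      ; binds a renamed variable, not 'quux'
(define-named quux 42)  ; binds 'quux' itself

(display quux)
(newline)
```

Only the second form creates a binding that code outside the macro can refer to as `quux`.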

 > (define-syntax generate-shorthands [...]

Your recursive macro is, well, recursive.  This is fine, but IIUC a 
consequence of this is that the recursive 'call' to generate-shorthands 
is a new lexical environment (hence, hygiene, so -?????? stuff).

As such, I consider this not a bug in Guile, but a bug in your code.

My proposal would be to change the 'x' in (datum->syntax x ...) -- instead 
of using x (which refers to the whole expression, which in a recursive 
call has an undesired lexical environment), use something that comes from 
the 'end-user' of generate-shorthands, say, #'s (i.e., SEMANTICS-SYMBOL) 
(for the right lexical environment).

If I make that change, I get some reasonable output (no -????? suffixes):

$1 = (begin
   (define (varint:sint32-decode bv)
     (varint-decode bv 32 zig-zag))
   (define (varint:sint32-encode n)
     (varint-encode n 32 zig-zag)))
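
For reference, the macro with that change applied could look like the sketch below. It is the code from the original message with the tracing `format` calls dropped and `x` replaced by `#'s` in the two `datum->syntax` calls. The stub definitions of `varint-decode`, `varint-encode` and the three semantics variables are assumptions added only so the file loads on its own; the real ones are not shown in this thread.

```scheme
(use-modules (ice-9 match))

;; Hypothetical stand-ins, not part of the original code.
(define unsigned-integer 'unsigned-integer)
(define twos-complement  'twos-complement)
(define zig-zag          'zig-zag)
(define (varint-decode bv width semantics) (list 'decode bv width semantics))
(define (varint-encode n width semantics)  (list 'encode n width semantics))

(define-syntax generate-shorthands
  (lambda (x)
    (define (make-base-name s w)
      (symbol-append 'varint:
                     (match (syntax->datum s)
                       ('unsigned-integer 'uint)
                       ('twos-complement  'int)
                       ('zig-zag          'sint))
                     (string->symbol (number->string (syntax->datum w)))))
    (syntax-case x ()
      ;; (_ LIST-OF-SEMANTICS-SYMBOLS LIST-OF-WIDTH-LITERALS)
      ((_ (sems ...) (widths ...))
       #'(begin (generate-shorthands sems (widths ...)) ...))
      ;; (_ SEMANTICS-SYMBOL LIST-OF-WIDTH-LITERALS)
      ((_ sem (widths ...))
       (identifier? #'sem)
       #'(begin (generate-shorthands sem widths) ...))
      ;; Base case: (_ SEMANTICS-SYMBOL WIDTH-LITERAL)
      ((_ s w)
       (and (identifier? #'s)
            (integer? (syntax->datum #'w)))
       (let ((base (make-base-name #'s #'w)))
         ;; #'s was written by the caller, so names derived from it are
         ;; created in the caller's context rather than the macro's.
         (with-syntax ((enc (datum->syntax #'s (symbol-append base '-encode)))
                       (dec (datum->syntax #'s (symbol-append base '-decode))))
           #'(begin (define (dec bv) (varint-decode bv w s))
                    (define (enc  n) (varint-encode  n w s)))))))))

(generate-shorthands (unsigned-integer zig-zag) (32 64))

(display (varint:sint32-decode 'some-bytevector))
(newline)
```

With the stubs above, the generated procedures simply return a list of their arguments, which is enough to check that the plain names are bound.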

Greetings,
Maxime.


* Re: Recursive Macros generating Definitions
  2022-10-03 12:48 ` Maxime Devos
@ 2022-10-03 13:41   ` Frank Terbeck
  2022-10-03 18:42     ` Jean Abou Samra
  0 siblings, 1 reply; 5+ messages in thread
From: Frank Terbeck @ 2022-10-03 13:41 UTC (permalink / raw)
  To: Maxime Devos; +Cc: guile-devel

Hey Maxime!

Maxime Devos wrote:
> On 03-10-2022 13:32, Frank Terbeck wrote:
>> When looking at  this, I also saw the following,  which might be related
>> if ‘syntax-rules’ is implemented using  ‘syntax-case’
>
> It is, IIRC.
>
>> (I didn't check if
>> this is the case):
>>      (define-syntax-rule (foobar n) (define quux n))
>>      ,exp (foobar 23)
>>    → (define quux-ea7bdcf8675f4a4 23)
>
> This is correct (as in, functioning as intended and not a bug) to my
> understanding -- 'quux' does not appear in the pattern of 'foobar', so,
> for hygiene, the 'quux' inside shouldn't be the 'quux' outside.
[…]
> It's sometimes a bit inconvenient -- sometimes you _want_ to define 'quux' (and
> not just only available to the macro), but that's easily resolved by adding an
> additional 'quux' argument to 'foobar':
>
>       (define-syntax-rule (foobar quux n) (define quux n))
>       ,exp (foobar quux 23)

I get the point, but I think it's sort of surprising, when everything in
the macro-language is  otherwise quite literal, to  my understanding. It
may be warranted to  point this out in the documentation  that this is a
side effect of hygienic macros, I think.


>> (define-syntax generate-shorthands [...]
>
> Your recursive macro is, well, recursive.  This is fine, but IIUC a consequence
> of this is that the recursive 'call' to generate-shorthands is a new lexical
> environment (hence, hygiene, so -?????? stuff).
>
> As such, I consider this not a bug in Guile, but a bug in your code.
>
> My proposal would be to change the 'x' in (datum->syntax x) -- instead of using
> #'x (which refers to the whole expression, which in a recursive call has an
> undesired lexical environment), use something of the 'end-user' of
> generate-shorthands, say, #'s (i.e., SEMANTICS-SYMBOL) (for the right lexical
> environment).
>
> If I make that change, I get some reasonable output (no -????? suffixes):
>
> $1 = (begin
>   (define (varint:sint32-decode bv)
>     (varint-decode bv 32 zig-zag))
>   (define (varint:sint32-encode n)
>     (varint-encode n 32 zig-zag)))

Thanks, this does work indeed!

So, clearly I don't fully understand the first argument to
‘datum->syntax’: I thought ‘x’ would be exactly right, since I assumed
it captured the context of where the macro was called in the original
code. But every time I perform an indirection through recursion, ‘x’ is
bound to something new that lives in the context of the outer level of
macro expansion.

If I understand this correctly, #'s refers to something that was
created at the source level, where the initial expansion of the macro
happened. If that is the case, it shouldn't really matter whether I use
#'s or #'w. And that seems to be the case.

And if I wanted to do this:

  ((op s w)
   (let ((base (make-base-name #'s #'w)))
     (with-syntax ((enc (datum->syntax #'op (symbol-append base '-encode)))
                   (dec (datum->syntax #'op (symbol-append base '-decode))))
       #'(begin (define (dec bv) (varint-decode bv w s))
                (define (enc  n) (varint-encode  n w s))))))

…I'd still end up with symbol names that have the -HEX suffix appended,
because in the recursing cases I am using a literal
‘generate-shorthands’ symbol. But if I did this:

  ((op (sems ...) (widths ...)) #'(begin (op sems (widths ...)) ...))
  ((op sem        (widths ...)) #'(begin (op sem widths) ...))

It should work, because ‘op’ always refers to the syntax object used at
the source level, where the user invoked the macro. In the base case,
the ‘op’ pattern variable therefore still refers to the original source
position, the scopes match up, and the introduced bindings don't have to
be renamed for hygiene.
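
The same idea can be tried in isolation with a much smaller recursive macro. This is only a sketch under the reasoning above, not something confirmed elsewhere in the thread; the macro `define-flags` and the `flag:` prefix are invented for the illustration.

```scheme
(define-syntax define-flags
  (lambda (x)
    (syntax-case x ()
      ;; Recursive case: re-emit the caller's keyword through the
      ;; pattern variable OP instead of writing the macro name literally.
      ((op (name ...))
       #'(begin (op name) ...))
      ;; Base case: derive the new name from OP, which still carries
      ;; the context of the form the user wrote.
      ((op name)
       (identifier? #'name)
       (with-syntax ((flag (datum->syntax
                            #'op
                            (symbol-append 'flag: (syntax->datum #'name)))))
         #'(define flag 'name))))))

(define-flags (red green))

(display (list flag:red flag:green))
(newline)
```

If the generated definitions were renamed for hygiene, the references to `flag:red` and `flag:green` on the last lines would fail as unbound variables, so loading the file doubles as a check.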

In hindsight, I should have sent this to -users after all. :)

This behaviour  is probably  explained in one  of the  ‘syntax-case’ and
‘datum->syntax’ examples  in the manual,  but it  wasn't clear to  me at
all. Not sure how, but I think there's room for improvement here. :)

Thanks for  clearing this up!  And feel free  to correct anything  I got
wrong in what I wrote in the above.


Regards, Frank




* Re: Recursive Macros generating Definitions
  2022-10-03 13:41   ` Frank Terbeck
@ 2022-10-03 18:42     ` Jean Abou Samra
  2022-10-03 20:29       ` Frank Terbeck
  0 siblings, 1 reply; 5+ messages in thread
From: Jean Abou Samra @ 2022-10-03 18:42 UTC (permalink / raw)
  To: Frank Terbeck, Maxime Devos; +Cc: guile-devel



Le 03/10/2022 à 15:41, Frank Terbeck a écrit :
> I get the point, but I think it's sort of surprising, when everything in
> the macro-language is  otherwise quite literal, to  my understanding. It
> may be warranted to  point this out in the documentation  that this is a
> side effect of hygienic macros, I think.


It *is* extensively documented.

https://www.gnu.org/software/guile/manual/html_node/Hygiene-and-the-Top_002dLevel.html#Hygiene-and-the-Top_002dLevel


> This behaviour  is probably  explained in one  of the  ‘syntax-case’ and
> ‘datum->syntax’ examples  in the manual,  but it  wasn't clear to  me at
> all. Not sure how, but I think there's room for improvement here. :)
>
> Thanks for  clearing this up!  And feel free  to correct anything  I got
> wrong in what I wrote in the above.


I think it is worth taking a look not just at the Guile documentation 
but also at the Scheme standards, which are more verbose on the details 
of syntax->datum and such. See

http://www.r6rs.org/final/html/r6rs-lib/r6rs-lib-Z-H-13.html





* Re: Recursive Macros generating Definitions
  2022-10-03 18:42     ` Jean Abou Samra
@ 2022-10-03 20:29       ` Frank Terbeck
  0 siblings, 0 replies; 5+ messages in thread
From: Frank Terbeck @ 2022-10-03 20:29 UTC (permalink / raw)
  To: Jean Abou Samra; +Cc: Maxime Devos, guile-devel

Hey!

Jean Abou Samra wrote:
> Le 03/10/2022 à 15:41, Frank Terbeck a écrit :
>> I get the point, but I think it's sort of surprising, when everything in
>> the macro-language is  otherwise quite literal, to  my understanding. It
>> may be warranted to  point this out in the documentation  that this is a
>> side effect of hygienic macros, I think.
>
> It *is* extensively documented.
>
> https://www.gnu.org/software/guile/manual/html_node/Hygiene-and-the-Top_002dLevel.html#Hygiene-and-the-Top_002dLevel

Thanks! I had to check when that was added. I would have guessed
recently, but it wasn't. …so, I guess I don't have an excuse. :)


[…]
> I think it is worth taking a look not just at the Guile documentation but also
> at the Scheme standards, which are more verbose on the details of syntax->datum
> and such. See
>
> http://www.r6rs.org/final/html/r6rs-lib/r6rs-lib-Z-H-13.html
>

Indeed. That's a nice reference, thanks! I didn't realise there was such
a document about the library section of r6rs. Useful!


Regards, Frank


