On Fri, Nov 16, 2018 at 01:01, Mark H Weaver <mhw@netris.org> wrote:
Hi Marc,

Marc Nieper-Wißkirchen <marc@nieper-wisskirchen.de> writes:

>  > Let's assume we are writing a macro that reimplements syntax (or some
>  > variation thereof) and which has to check whether identifiers are
>  > ellipses. For example, the following could be given:
>  >
>  > (with-ellipsis e
>  >   (my-syntax a e))
>  > 
>  > Now, this could be a result of a macro expansion and e could carry
>  > different marks than with-syntax or my-syntax. This is why I have been
>  > thinking that one also needs the lexical context of my-syntax and not
>  > only the context of e.
>
>  I don't see what problem would be caused by 'e' carrying different marks
>  than 'my-syntax'.
>
>  As far as I can tell, in the end, the two instances of 'e' above will
>  effectively be compared to one another using 'bound-identifier=?'.  They
>  must have the same name and the same marks to match.  The marks on
>  'my-syntax' are irrelevant here.
>
> I have been thinking of the scope in which $sc-ellipsis is bound by
> `with-syntax'.

You've written 'with-syntax' in several places, in both this email and
in your previous email, and I'm guessing that you meant to write
'with-ellipsis' in each of those places.  Is that right?

Yes! I shouldn't do things in parallel (writing emails for this list and programming syntax-case transformers, which use with-syntax but not with-ellipsis, at the same time). I hope it didn't cause too much confusion.
 
> If `my-syntax' is within the scope of `with-ellipsis', the binding of
> $sc-ellipsis introduced by this `with-syntax' will be relevant; if
> `my-syntax' is not in the lexical scope of `with-ellipsis', the
> binding should be irrelevant; thus my thought that we need the lexical
> information of my-syntax as well.

>  Operationally, when (with-ellipsis e (my-syntax a e)) is expanded, 'e'
>  will be added to the macro expansion environment as the innermost
>  binding of the ellipsis identifier, and then (my-syntax a e) will be
>  expanded within that new expansion environment.  That is the expansion
>  environment that will be consulted by the 'ellipsis-identifier?'
>  predicate to find the current ellipsis identifier, which is compared
>  with its argument (after stripping its anti-mark) using
>  'bound-identifier=?'.
>
> Aha, so maybe I have misunderstood the scope of `with-syntax'. Please
> consider the following example:
>
> (define-syntax foo
>   (lambda (stx)
>     (with-ellipsis e
>       (syntax-case stx ()
>         ((_ x e) (bar #'(x e)))))))
>
> (eval-when (expand)
>   (define (bar x*)
>     (syntax-case x* ()
>       ((x ...) ---))))
>
> I would have thought that the `...' identifier in `bar' is recognized
> as an ellipsis,

It is.

> but from what you are saying it seems that the binding `with-syntax'
> is dynamic with respect to macro expansion (like syntax
> parameters). Is this really what we want?

I agree that it's not what we want, and if I understand correctly, it's
not what we have in Guile.

In Psyntax, lexical lookups of identifiers are done in two steps, using
two different data structures.  First, the deferred substitutions in the
wrap are applied to the identifier, which yields a gensym if the
identifier is lexically bound.  Next, the gensym is looked up in the
expansion environment 'r' to find the actual binding.

The deferred substitutions are applied to the inner bodies of each core
binding construct.  When the macro expander encounters a core binding
construct, a fresh gensym is created for the binding, and that gensym is
effectively substituted for all free occurrences of the identifier
within the inner body.  Mostly for efficiency reasons, this substitution
is done lazily, by adding it to the wrap.  The expansion environment is
also extended each time the macro expander encounters a core binding
construct.
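
To make the two-step lookup concrete, here is a toy model in plain Scheme. It is only a sketch: representing the wrap and the environment 'r' as association lists, and the names 'apply-substitutions' and 'lookup-binding', are assumptions made for illustration and are not Psyntax's actual data structures or procedures.

```scheme
;; Toy model only, NOT Psyntax's real representation.
;; wrap: alist of deferred substitutions, symbol -> gensym
;; r:    alist modelling the expansion environment, gensym -> binding

(define (apply-substitutions id wrap)
  ;; Step 1: yields the gensym if id is lexically bound, else #f.
  (let ((entry (assq id wrap)))
    (and entry (cdr entry))))

(define (lookup-binding id wrap r)
  ;; Step 2: look the gensym up in r to find the actual binding.
  (let ((label (apply-substitutions id wrap)))
    (cond ((not label) '(global))
          ((assq label r) => cdr)
          (else '(out-of-scope)))))

;; What happens at a binding construct for x: create a fresh gensym,
;; record the substitution in the wrap of the body, and extend r.
(define x-label (gensym "x-"))
(define body-wrap (list (cons 'x x-label)))
(define r (list (cons x-label '(lexical . x))))

(define inside  (lookup-binding 'x body-wrap r))    ; wrapped, r extended
(define outside (lookup-binding 'x '() r))          ; no substitution applies
(define escaped (lookup-binding 'x body-wrap '()))  ; gensym not in r
```

In this model, 'inside' finds the lexical binding, 'outside' falls through to a global lookup, and 'escaped' shows an identifier whose wrap mentions a gensym that the current environment does not contain.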

With this in mind, let's examine your example above more closely.  The
ellipsis binding for 'e' is only in the transformer environment when the
'syntax-case' form is expanded.  It is _not_ in the transformer
environment when your 'foo' macro is later used.

I agree and I see that my example doesn't demonstrate what it should have demonstrated because `bar' is not executed before `foo' is used as a macro. The example should have been more like the following:

(define-syntax foo
  (lambda (stx)
    (with-ellipsis e
      (syntax-case (third-party-macro-transformer-helper-macro stx) ()
        ---))))

Here, the helper macro may expand into another instance of syntax-case. That instance should not recognize `e' as the ellipsis; it should use whatever ellipsis was in effect where the helper macro was defined.
 
But let's go one step further.  Let's consider what will happen if 'foo'
is used within 'with-ellipsis':

  (with-ellipsis --- (foo a b))

When this is expanded, a fresh gensym will be generated, and an ellipsis
binding will be added for that gensym in the expansion environment 'r'.
Also, a substitution from #{ $sc-ellipsis }# to that gensym will be
added to the wrap of (foo a b).

Now let's consider how 'bar' will be affected by this.  In the example
you give, where 'bar' uses 'syntax-case', the ellipsis identifier will
be looked up in the transformer environment where 'bar' is *defined*,
not the transformer environment where 'bar' is called.

But let's suppose that we change 'bar' to use 'ellipsis-identifier?' at
run-time, like this:

  (define-syntax foo
    (lambda (stx)
      (with-ellipsis e
        (syntax-case stx ()
          ((_ x e) (bar #'(x e)))))))

  (eval-when (expand)
    (define (bar x*)
      (syntax-case x* ()
        ((x dots)
         (ellipsis-identifier? #'dots)
         #'#true)
        (_
         #'#false))))

We now see this behavior with my draft patch:

  (with-ellipsis --- (foo a b))    => #false
  (with-ellipsis --- (foo a ...))  => #false
  (with-ellipsis --- (foo a ---))  => #true

I think this is what we want, right?

I think it looks correct.
 
When 'bar' is called, there will be a binding in the transformer
environment 'r' that maps a gensym to an ellipsis binding, which
specifies '---' as the ellipsis identifier.  However, that binding will
only apply when testing identifiers that have been wrapped to include a
substitution from #{ $sc-ellipsis }# to the same gensym, so it will only
apply to identifiers that are in the body of the same 'with-ellipsis' form.

> Therefore I think, we want `with-ellipsis' to be lexically scoped (in
> the macro transformer code).

Yes, that was certainly my intent.
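
For reference, a minimal use of 'with-ellipsis' looks like the following sketch. It assumes a Guile version that provides 'with-ellipsis'; the macro name 'my-list' and the choice of ':::' as the ellipsis identifier are made up for this example.

```scheme
;; Inside the body of with-ellipsis, ::: plays the role that ...
;; normally plays in syntax-case patterns and syntax templates.
(define-syntax my-list
  (lambda (stx)
    (with-ellipsis :::
      (syntax-case stx ()
        ((_ x :::)
         #'(list x :::))))))

(define result (my-list 1 2 3))
```

The custom ellipsis applies only to the 'syntax-case' and 'syntax' forms that appear lexically inside the 'with-ellipsis' body.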

Let's run the following example:

(eval-when (expand)
  (define-syntax bar
    (syntax-rules ()
      ((_ stx)
       (syntax-case stx ()
         ((_ a (... ...))
          #'#t)
         ((_ a b c)
          #'#f))))))

(define-syntax foo
  (lambda (stx)
    (with-ellipsis e (bar stx))))

(display (foo 1 2 3))
(newline)

This one displays `#t' in Guile, which is exactly what we want. I guess the reason is that the macro invocation `(bar stx)' creates a new transformer environment, in which `#{ $sc-ellipsis }#' becomes unbound again.

Now, why does the following work (i.e. why does it print `#t')?

(eval-when (expand)
  (define-syntax bar2
    (syntax-rules ()
      ((_ e body)
       (with-ellipsis e body)))))

(define-syntax foo2
  (lambda (stx)
    (bar2 f (syntax-case stx ()
              ((_ a ...)
               #'#t)
              ((_ a b c)
               #'#f)))))

(display (foo2 1 2 3))
(newline)

If I change `f' to `e' in `foo2', `#f' will be printed, so `with-ellipsis' affects the body `body' if and only if `e' has the same marks as the body (or, rather, as `...' in the body). Is this the right semantics? I am inclined to think so, but I am not sure.

>  > Thanks for the explanation. I have been toying with my own
>  > implementation of the syntax-case system. In my implementation the
>  > (shared) lexical environments are part of the wraps (so the
>  > identifiers are in some way self-contained).
>
>  Interesting.  Are locally-bound macro transformers included in those
>  lexical environments?  If so, how do you implement 'letrec-syntax'?
>
> My environments are lists of ribs where each rib corresponds to a
> lexical frame. Given the form
>
> (letrec-syntax ((var init)) body ...)
>
> I create a new rib that contains the binding of var to init and add a
> wrap around each expression in body ... that contains the new rib but
> no new marks.

I think that you also need to apply the same wrap to 'init', no?

Yes, for `letrec-syntax'. For `let-syntax', I don't add the new rib to each `init'.
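
Whether `init' sees the new rib can be observed with a small test. This is only a sketch; the macro names `m' and `n' and the variable names are made up for illustration.

```scheme
(define-syntax m
  (syntax-rules () ((_) 'outer)))

;; let-syntax: the transformer for n does not see the new bindings,
;; so the m in its template refers to the outer m.
(define via-let-syntax
  (let-syntax ((m (syntax-rules () ((_) 'inner)))
               (n (syntax-rules () ((_) (m)))))
    (n)))

;; letrec-syntax: the new bindings are also visible in each init,
;; so the m in the template refers to the inner m.
(define via-letrec-syntax
  (letrec-syntax ((m (syntax-rules () ((_) 'inner)))
                  (n (syntax-rules () ((_) (m)))))
    (n)))
```

Here `via-let-syntax' should evaluate to `outer' and `via-letrec-syntax' to `inner'.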
 
> When the body is examined by the expander, the wraps are
> gradually pushed down (like in the original description of
> `syntax-case' by Dybvig and Hieb) so that eventually the environments
> stored with the identifiers in body gain another rib.

>  > Will ellipsis? also work outside of macros? Say, what would be the
>  > result of the following (run-time) code?
>  >
>  > (with-syntax e
>  >   (ellipsis? #'e))
>
>  No, this is an error.  Like 'syntax-local-binding', the
>  'ellipsis-identifier?' predicate must be called within the dynamic
>  extent of a macro transformer call by the macro expander.
>
> Is this related to the question above of whether `with-syntax' has
> lexical or dynamic scope? In the former case I don't see a theoretical
> reason why it has to be restricted to the dynamic extent of a macro
> transformer call.

I'll assume that you meant to write 'with-ellipsis' above, not
'with-syntax', as I did in my earlier responses.

Yep.
 
The reason this can't work is ultimately because the ellipsis bindings
are stored in the transformer environment, which simply does not exist
at run time when 'ellipsis?' is called here.

It might be possible to make this work with your approach to storing
full binding information in the syntax objects, but that's not how
Psyntax works.

FWIW, I will say that in Guile, the size of syntax objects in Psyntax is
already quite significant in practice, and most of that information is
never used unless the syntax objects are passed as the first argument to
'datum->syntax'.  Many years ago, I reduced the size of 'psyntax-pp.scm'
by an order of magnitude by stripping out most of that information from
the syntax objects in the expanded code.

I should also add that my syntax-case expander is still a work in progress and I haven't profiled the code yet. In particular, I may have to improve structure sharing between syntax objects to make them less fat. My eventual goal is to write a Scheme compiler in Guile, but it will still take some time.
 
Therefore, I would be reluctant to make the syntax objects any larger
than they already are in Psyntax.

That's reasonable. I don't think that one actually needs to use `ellipsis-identifier?' in run-time code.
 
>  > P.S.: By the way, the module (system syntax) and in particular the
>  > procedure syntax-local-binding has already helped me a lot because I
>  > needed to attach extra information to symbols and Guile doesn't (yet)
>  > support Chez's define-property (well, this would be another feature
>  > request).
>
>  Hmm.  Can you tell me more specifically how you are using
>  'syntax-local-binding' to accomplish this?  As the Guile manual warns,
>  those interfaces are subject to change in future versions of Guile, and
>  therefore it is best to avoid them where possible.
>
> What I have been implementing is a pattern matcher and rewriter as a
> macro in Guile that works much like syntax-case/syntax. Let's call it
> my-syntax-case/my-syntax. When `my-syntax' is given a template, it has
> to check whether an identifier appearing in the template is a
> "my-"pattern variable or not. For that, `my-syntax-case' introduces
> (via `let-syntax') lexical bindings of the identifiers that are used
> as pattern variables. The associated syntax transformer just outputs
> an error (namely that the pattern variable is used outside of
> `my-syntax'). However, I also attach a custom property (with
> `make-object-property') to this syntax transformer that holds
> information about the match and the nesting depth of the pattern
> variable. In order to retrieve this information in `my-syntax', I use
> `syntax-local-binding' to get hold of the associated syntax
> transformer.

Okay.  I would suggest another approach that is more portable: instead
of having the associated syntax transformers always return an error, add
a clause so that when they are applied to a special keyword, they expand
into something that includes the information about the match.

For example, you might take a look at 'define-tagged-inlinable' in
Guile's implementation of srfi-9.scm, where I did something like this.
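
A minimal sketch of that idea follows. The macro name 'pv', the keyword '%pattern-info', and the '(depth 1)' payload are hypothetical names chosen for this example; this is not the code from srfi-9.scm.

```scheme
;; A transformer that reports an error on ordinary use, but expands
;; into its attached information when applied to a special keyword.
(define-syntax pv
  (lambda (stx)
    (syntax-case stx (%pattern-info)
      ;; Special clause: expand into the (hypothetical) match info.
      ((_ %pattern-info)
       #''(depth 1))
      ;; Any other use is an error, as before.
      (_
       (syntax-violation 'pv
                         "pattern variable used outside of my-syntax"
                         stx)))))

(define info (pv %pattern-info))
```

In this sketch the information is retrieved simply by expanding '(pv %pattern-info)'; a real matcher would more likely have the special clause expand into a call to a continuation macro that receives the information.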

The problem is that I have to be able to test arbitrary identifiers for whether they are pattern variables. Now if `x' is any identifier and I expand something like `(x :keyword k ...)' and `x' is not a pattern variable that I control, anything may happen. Another (minor) problem is that the macro expansion of `(x :keyword k ...)' would create new marks, so I would have to pass many identifiers in `k ...' to the continuation. By contrast, calling a procedure during transformation does not introduce fresh marks.
 
> In Chez Scheme, I would have used `define-property' to define my
> custom property directly on the identifier standing for the pattern
> variable. I haven't found an equivalent feature in Guile. I don't know
> how to nicely code my-syntax-case/my-syntax in standard R6RS.

Sure, that sounds like a nice feature.  I'll add it to my TODO list :)

That would be great! :-)

All the best,

Marc