From: "Marc Nieper-Wißkirchen" <marc@nieper-wisskirchen.de>
To: mhw@netris.org
Cc: guile-devel@gnu.org
Subject: Re: Feature request: Expose `ellipsis?' from psyntax.ss
Date: Fri, 16 Nov 2018 14:37:31 +0100
Message-ID: <CAEYrNrSGJ9XWH888FFv+j=-Bd01OHMFfJFS08GaJpukM-59wqw@mail.gmail.com>
In-Reply-To: <87lg5tj3gn.fsf@netris.org>
On Fri, 16 Nov 2018 at 01:01, Mark H Weaver <mhw@netris.org> wrote:
> Hi Marc,
>
> Marc Nieper-Wißkirchen <marc@nieper-wisskirchen.de> writes:
>
> > > Let's assume we are writing a macro that reimplements syntax (or some
> > > variation thereof) and which has to check whether identifiers are
> > > ellipses. For example, the following could be given:
> > >
> > > (with-ellipsis e
> > >   (my-syntax a e))
> > >
> > > Now, this could be a result of a macro expansion and e could carry
> > > different marks than with-syntax or my-syntax. This is why I have been
> > > thinking that one also needs the lexical context of my-syntax and not
> > > only the context of e.
> >
> > I don't see what problem would be caused by 'e' carrying different marks
> > than 'my-syntax'.
> >
> > As far as I can tell, in the end, the two instances of 'e' above will
> > effectively be compared to one another using 'bound-identifier=?'. They
> > must have the same name and the same marks to match. The marks on
> > 'my-syntax' are irrelevant here.
> >
> > I have been thinking of the scope in which $sc-ellipsis is bound by
> > `with-syntax'.
>
> You've written 'with-syntax' in several places, in both this email and
> in your previous email, and I'm guessing that you meant to write
> 'with-ellipsis' in each of those places. Is that right?
>
Yes! I shouldn't do things in parallel (writing emails for this list and
programming syntax-case transformers, which use with-syntax but not
with-ellipsis, at the same time). I hope it didn't cause too much confusion.
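
For readers following along, here is a minimal sketch of the difference
between the two forms, assuming a Guile version that provides
`with-ellipsis'. The macro names `swap-pair' and `my-list' are made up for
illustration and do not appear elsewhere in this thread.

```scheme
;; with-syntax binds pattern variables (here p and q) for use in templates.
(define-syntax swap-pair
  (lambda (stx)
    (syntax-case stx ()
      ((_ a b)
       (with-syntax ((p #'b) (q #'a))
         #'(cons p q))))))

;; with-ellipsis (Guile-specific) changes which identifier acts as the
;; ellipsis inside its body; here ::: plays the role of ...
(define-syntax my-list
  (lambda (stx)
    (with-ellipsis :::
      (syntax-case stx ()
        ((_ x :::) #'(list x :::))))))

(display (swap-pair 1 2))   ; a pair with the arguments swapped
(newline)
(display (my-list 1 2 3))   ; a list of the three arguments
(newline)
```
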
> > If `my-syntax' is within the scope of `with-ellipsis', the binding of
> > $sc-ellipsis introduced by this `with-syntax' will be relevant; if
> > `my-syntax' is not in the lexical scope of `with-ellipsis', the
> > binding should be irrelevant; thus my thought that we need the lexical
> > information of my-syntax as well.
> >
> > Operationally, when (with-ellipsis e (my-syntax a e)) is expanded, 'e'
> > will be added to the macro expansion environment as the innermost
> > binding of the ellipsis identifier, and then (my-syntax a e) will be
> > expanded within that new expansion environment. That is the expansion
> > environment that will be consulted by the 'ellipsis-identifier?'
> > predicate to find the current ellipsis identifier, which is compared
> > with its argument (after stripping its anti-mark) using
> > 'bound-identifier=?'.
> >
> > Aha, so maybe I have misunderstood the scope of `with-syntax'. Please
> > consider the following example:
> >
> > (define-syntax foo
> >   (lambda (stx)
> >     (with-ellipsis e
> >       (syntax-case stx ()
> >         ((_ x e) (bar #'(x e)))))))
> >
> > (eval-when (expand)
> >   (define (bar x*)
> >     (syntax-case x* ()
> >       ((x ...) ---))))
> >
> > I would have thought that the `...' identifier in `bar' is recognized
> > as an ellipsis,
>
> It is.
>
> > but from what you are saying it seems that the binding `with-syntax'
> > is dynamic with respect to macro expansion (like syntax
> > parameters). Is this really what we want?
>
> I agree that it's not what we want, and if I understand correctly, it's
> not what we have in Guile.
>
> In Psyntax, lexical lookups of identifiers are done in two steps, using
> two different data structures. First, the deferred substitutions in the
> wrap are applied to the identifier, which yields a gensym if the
> identifier is lexically bound. Next, the gensym is looked up in the
> expansion environment 'r' to find the actual binding.
>
> The deferred substitutions are applied to the inner bodies of each core
> binding construct. When the macro expander encounters a core binding
> construct, a fresh gensym is created for the binding, and that gensym is
> effectively substituted for all free occurrences of the identifier
> within the inner body. Mostly for efficiency reasons, this substitution
> is done lazily, by adding it to the wrap. The expansion environment is
> also extended each time the macro expander encounters a core binding
> construct.
>
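
The two-step lookup described above can be illustrated with a toy model.
This is only a sketch under strong simplifying assumptions and is not how
Psyntax is actually implemented: a wrap is reduced to an association list
of deferred substitutions, the expansion environment `r' to an association
list from labels to bindings, and marks are ignored entirely. All procedure
names (`extend', `resolve', and so on) are hypothetical.

```scheme
;; Toy model only; not Psyntax's real data structures.
;; wrap: alist of deferred substitutions (name . label)
;; r:    alist of bindings              (label . binding)

;; Step 1: apply the substitutions in the wrap to a name.
;; Returns the label if the name is lexically bound, else the name itself.
(define (apply-substitutions name wrap)
  (let ((entry (assq name wrap)))
    (if entry (cdr entry) name)))

;; Step 2: look the result up in the expansion environment.
(define (lookup-binding label r)
  (let ((entry (assq label r)))
    (if entry (cdr entry) 'no-binding)))

(define (resolve name wrap r)
  (lookup-binding (apply-substitutions name wrap) r))

;; Entering a binding construct: create a fresh label (a fresh list stands
;; in for a gensym), extend the wrap and the environment, and return both
;; as a pair (new-wrap . new-r).
(define (extend name binding wrap r)
  (let ((label (list name)))
    (cons (cons (cons name label) wrap)
          (cons (cons label binding) r))))

(define extended (extend 'x 'lexical '() '()))
(define wrap1 (car extended))
(define r1 (cdr extended))

(display (resolve 'x wrap1 r1))   ; wrap and environment agree
(newline)
(display (resolve 'y wrap1 r1))   ; no substitution for y
(newline)
(display (resolve 'x wrap1 '()))  ; substitution present, binding gone
(newline)
```

The last call models the situation discussed next: a substitution recorded
in a wrap is of no use once the environment that held the matching binding
is no longer the one being consulted.
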
> With this in mind, let's examine your example above more closely. The
> ellipsis binding for 'e' is only in the transformer environment when the
> 'syntax-case' form is expanded. It is _not_ in the transformer
> environment when your 'foo' macro is later used.
>
I agree, and I see that my example doesn't demonstrate what it should have
demonstrated, because `bar' is not executed before `foo' is used as a macro.
The example should have been more like the following:

(define-syntax foo
  (lambda (stx)
    (with-ellipsis e
      (syntax-case (third-party-macro-transformer-helper-macro stx) ()
        ---))))

Here, the helper macro may expand into another instance of syntax-case.
That instance should not recognize `e' as the ellipsis but whatever the
ellipsis was where the helper macro was defined.

> But let's go one step further. Let's consider what will happen if 'foo'
> is used within 'with-ellipsis':
>
> (with-ellipsis --- (foo a b))
>
> When this is expanded, a fresh gensym will be generated, and an ellipsis
> binding will be added for that gensym in the expansion environment 'r'.
> Also, a substitution from #{ $sc-ellipsis }# to that gensym will be
> added to the wrap of (foo a b).
>
> Now let's consider how 'bar' will be affected by this. In the example
> you give, where 'bar' uses 'syntax-case', the ellipsis identifier will
> be looked up in the transformer environment where 'bar' is *defined*,
> not the transformer environment where 'bar' is called.
>
> But let's suppose that we change 'bar' to use 'ellipsis-identifier?' at
> run-time, like this:
>
> (define-syntax foo
>   (lambda (stx)
>     (with-ellipsis e
>       (syntax-case stx ()
>         ((_ x e) (bar #'(x e)))))))
>
> (eval-when (expand)
>   (define (bar x*)
>     (syntax-case x* ()
>       ((x dots)
>        (ellipsis-identifier? #'dots)
>        #'#true)
>       (_
>        #'#false))))
>
> We now see this behavior with my draft patch:
>
> (with-ellipsis --- (foo a b)) => #false
> (with-ellipsis --- (foo a ...)) => #false
> (with-ellipsis --- (foo a ---)) => #true
>
> I think this is what we want, right?
>
I think it looks correct.
> When 'bar' is called, there will be a binding in the transformer
> environment 'r' that maps a gensym to an ellipsis binding, which
> specifies '---' as the ellipsis identifier. However, that binding will
> only apply when testing identifiers that have been wrapped to include a
> substitution from #{ $sc-ellipsis }# to the same gensym, so it will only
> apply to identifiers that are in the body of the same 'with-ellipsis' form.
>
> > Therefore I think, we want `with-ellipsis' to be lexically scoped (in
> > the macro transformer code).
>
> Yes, that was certainly my intent.
>
Let's run the following example:

(eval-when (expand)
  (define-syntax bar
    (syntax-rules ()
      ((_ stx)
       (syntax-case stx ()
         ((_ a (... ...))
          #'#t)
         ((_ a b c)
          #'#f))))))

(define-syntax foo
  (lambda (stx)
    (with-ellipsis e (bar stx))))

(display (foo 1 2 3))
(newline)

This one displays `#t' in Guile, which is exactly what we want. I guess the
reason is that the macro invocation `(bar stx)' creates a new transformer
environment, in which `#{ $sc-ellipsis }#' becomes unbound again.

Now, why does the following work (i.e. why does it print `#t')?

(eval-when (expand)
  (define-syntax bar2
    (syntax-rules ()
      ((_ e body)
       (with-ellipsis e body)))))

(define-syntax foo2
  (lambda (stx)
    (bar2 f (syntax-case stx ()
              ((_ a ...)
               #'#t)
              ((_ a b c)
               #'#f)))))

(display (foo2 1 2 3))
(newline)

If I change `f' to `e' in `foo2', `#f' will be printed, so the
`with-ellipsis' affects the body `body' if and only if `e' has the same
marks as the body (or, rather, `...' in the body). Is this the right
semantics? I am inclined to think so, but I am not sure.

> > Thanks for the explanation. I have been toying with my own
> > > implementation of the syntax-case system. In my implementation the
> > > (shared) lexical environments are part of the wraps (so the
> > > identifiers are in some way self-contained).
> >
> > Interesting. Are locally-bound macro transformers included in those
> > lexical environments? If so, how do you implement 'letrec-syntax'?
> >
> > My environments are lists of ribs where each rib corresponds to a
> > lexical frame. Given the form
> >
> > (letrec-syntax ((var init)) body ...)
> >
> > I create a new rib that contains the binding of var to init and add a
> > wrap around each expression in body ... that contains the new rib but
> > no new marks.
>
> I think that you also need to apply the same wrap to 'init', no?
>
Yes, for `letrec-syntax'. For `let-syntax', I don't add the new rib to each
`init'.
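
The visible effect of that difference can be shown in standard Scheme,
independently of either implementation. This is a small sketch; the
procedure names `via-let-syntax' and `via-letrec-syntax' are made up for
illustration. In the first procedure the transformer for `g' is defined in
a scope that does not contain the new macro `f', so its template refers to
the outer procedure `f'. In the second, the transformer for `g' sees the
new rib and its template refers to the macro `f'.

```scheme
(define (via-let-syntax)
  (let ((f (lambda () 'outer)))
    (let-syntax ((f (syntax-rules () ((_) 'inner)))
                 (g (syntax-rules () ((_) (f)))))
      ;; The f in g's template is the procedure bound by the enclosing let.
      (g))))

(define (via-letrec-syntax)
  (let ((f (lambda () 'outer)))
    (letrec-syntax ((f (syntax-rules () ((_) 'inner)))
                    (g (syntax-rules () ((_) (f)))))
      ;; The f in g's template is the macro f bound by this letrec-syntax.
      (g))))

(display (via-let-syntax))      ; outer
(newline)
(display (via-letrec-syntax))   ; inner
(newline)
```
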
> > When the body is examined by the expander, the wraps are
> > gradually pushed down (like in the original description of
> > `syntax-case' by Dybvig and Hieb) so that eventually the environments
> > stored with the identifiers in body gain another rib.
> >
> > > Will ellipsis? also work outside of macros? Say, what would be the
> > > result of the following (run-time) code?
> > >
> > > (with-syntax e
> > >   (ellipsis? #'e))
> >
> > No, this is an error. Like 'syntax-local-binding', the
> > 'ellipsis-identifier?' predicate must be called within the dynamic
> > extent of a macro transformer call by the macro expander.
> >
> > Is this related to the question above of whether `with-syntax' has
> > lexical or dynamic scope? In the former case I don't see a theoretical
> > reason why it has to be restricted to the dynamic extent of a macro
> > transformer call.
>
> I'll assume that you meant to write 'with-ellipsis' above, not
> 'with-syntax', as I did in my earlier responses.
>
Yep.
> The reason this can't work is ultimately because the ellipsis bindings
> are stored in the transformer environment, which simply does not exist
> at run time when 'ellipsis?' is called here.
>
> It might be possible to make this work with your approach to storing
> full binding information in the syntax objects, but that's not how
> Psyntax works.
>
> FWIW, I will say that in Guile, the size of syntax objects in Psyntax is
> already quite significant in practice, and most of that information is
> never used unless the syntax objects are passed as the first argument to
> 'datum->syntax'. Many years ago, I reduced the size of 'psyntax-pp.scm'
> by an order of magnitude by stripping out most of that information from
> the syntax objects in the expanded code.
>
I should also add that my syntax-case expander is still a work in progress
and I haven't profiled the code yet. In particular, I may have to improve
structure sharing between syntax objects to make them less fat. My eventual
goal is to write a Scheme compiler in Guile, but it will still take some
time.
> Therefore, I would be reluctant to make the syntax objects any larger
> than they already are in Psyntax.
>
That's reasonable. I don't think that one actually needs to use
`ellipsis-identifier?' in run-time code.
> > > P.S.: By the way, the module (system syntax) and in particular the
> > > procedure syntax-local-binding has already helped me a lot because I
> > > needed to attach extra information to symbols and Guile doesn't (yet)
> > > support Chez's define-property (well, this would be another feature
> > > request).
> >
> > Hmm. Can you tell me more specifically how you are using
> > 'syntax-local-binding' to accomplish this? As the Guile manual warns,
> > those interfaces are subject to change in future versions of Guile, and
> > therefore it is best to avoid them where possible.
> >
> > What I have been implementing is a pattern matcher and rewriter as a
> > macro in Guile that works much like syntax-case/syntax. Let's call it
> > my-syntax-case/my-syntax. When `my-syntax' is given a template, it has
> > to check whether an identifier appearing in the template is a
> > "my-"pattern variable or not. For that, `my-syntax-case' introduces
> > (via `let-syntax') lexical bindings of the identifiers that are used
> > as pattern variables. The associated syntax transformer just outputs
> > an error (namely that the pattern variable is used outside of
> > `my-syntax'). However, I also attach a custom property (with
> > `make-object-property') to this syntax transformer that holds
> > information about the match and the nesting depth of the pattern
> > variable. In order to retrieve this information in `my-syntax', I use
> > `syntax-local-binding' to get hold of the associated syntax
> > transformer.
>
> Okay. I would suggest another approach that is more portable: instead
> of having the associated syntax transformers always return an error, add
> a clause so that when they are applied to a special keyword, they expand
> into something that includes the information about the match.
>
> For example, you might take a look at 'define-tagged-inlinable' in
> Guile's implementation of srfi-9.scm, where I did something like this.
>
The problem is that I have to be able to test arbitrary identifiers for
whether they are pattern variables. Now if `x' is any identifier and I
expand something like `(x :keyword k ...)' and `x' is not a pattern
variable that I control, anything may happen. Another (minor) problem would
be that the macro expansion of `(x :keyword k ...)' would create new marks,
so I would have to pass many identifiers in `k ...' to the continuation.
By contrast, calling a procedure during transformation does not
introduce fresh marks.

> > In Chez Scheme, I would have used `define-property' to define my
> > custom property directly on the identifier standing for the pattern
> > variable. I haven't found an equivalent feature in Guile. I don't know
> > how to nicely code my-syntax-case/my-syntax in standard R6RS.
>
> Sure, that sounds like a nice feature. I'll add it to my TODO list :)
>
That would be great! :-)
All the best,
Marc