unofficial mirror of guile-devel@gnu.org 
* How to record source properties for all symbols?
@ 2018-06-03 19:48 Fis Trivial
  2018-06-04 10:07 ` Mark H Weaver
From: Fis Trivial @ 2018-06-03 19:48 UTC
  To: guile-devel

Hi, guiles.

I'm new to Scheme and Guile. Most of the time, I am a C/C++ and Python
programmer. One thing that bugs me about writing Scheme with Guile is
that I can't get full source information when reading a backtrace.

For example, in the following snippet (test.scm):
----------------
(use-modules (ice-9 pretty-print))
(define (function)
  (format #t "function"))
(define* (function-asterisk)
  (format #t "asterisk"))

(format #t "function properties:~a~%" (source-properties function))
(format #t "asterisk properties:~a~%" (source-properties function-asterisk))

(define var "normal var")

(format #t "normal var properties:~a~%" (source-properties var))
----------------

With guile-2.2.3, running `guile ./test.scm' gives the following
result:

----------------
function properties:()
asterisk properties:()
normal var properties:((line . 9) (column . 12) (filename . /home/fis/Others/git-repos/compliers/guile/./test.scm))
----------------

As you can see, only `var' has non-empty source-properties. Further,
if I enter the above snippet in the Guile shell, source-properties are
not recorded even for `var'. I tried digging into the Guile source
code, and so far I have found two related functions:

`scm_primitive_load' in load.c
`maybe_annotate_source' in read.c.

Breaking on these two functions in gdb doesn't stop the process. Only
running (primitive-load "test.scm") inside the Guile shell invokes them.
This finding is still far from telling me how to make Guile record
source information for every symbol.

Can you give me some guidance on how to achieve this goal (recording
source information for every symbol)? If it's theoretically impossible
or hard to achieve, can you give me some insight into why, and advice on
making a best effort? I intend to help, so pointers to internal
structures are also welcome.

Thanks in advance.




* Re: How to record source properties for all symbols?
  2018-06-03 19:48 How to record source properties for all symbols? Fis Trivial
@ 2018-06-04 10:07 ` Mark H Weaver
  2018-06-04 15:41   ` Fis Trivial
From: Mark H Weaver @ 2018-06-04 10:07 UTC
  To: Fis Trivial; +Cc: guile-devel

Hi,

Fis Trivial <ybbs.daans@hotmail.com> writes:
> Can you give me some guidance on how to achieve this goal (recording
> source information for every symbol)? If it's theoretically impossible
> or hard to achieve, can you give me some insight into why, and advice on
> making a best effort? I intend to help, so pointers to internal
> structures are also welcome.

The problem is that there's no place to store the source information for
symbols in the standard S-expression representation.

The principal defining characteristic of symbols -- that "two symbols
are identical (in the sense of 'eqv?') if and only if their names are
spelled the same way" (R5RS § 6.3.3) -- combined with the fact that
'eq?' is specified to be the same as 'eqv?' for symbols, leaves us no
way to distinguish two instances of the same symbol, and therefore no
way to store per-instance annotations such as source information.
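
As a minimal sketch of the point above (not part of the original
message; the `locations' table and the `annotate!' and `location'
helpers are hypothetical names chosen for illustration), the following
Guile code shows that two occurrences of a symbol are one and the same
object, so a side table keyed on object identity can hold only one
location for both:

```scheme
;; Hypothetical side table mapping an object (by identity) to a line number.
(define locations (make-hash-table))
(define (annotate! obj line) (hashq-set! locations obj line))
(define (location obj) (hashq-ref locations obj))

;; Two "occurrences" of the symbol foo are the same interned object.
(define first-foo 'foo)
(define second-foo (string->symbol "foo"))
(display (eq? first-foo second-foo)) (newline)   ; prints #t

;; Annotating the second occurrence overwrites the first one's entry.
(annotate! first-foo 1)
(annotate! second-foo 5)
(display (location first-foo)) (newline)         ; prints 5, not 1

;; Pairs are freshly allocated, so each one can carry its own entry.
(define p1 (list 'foo))
(define p2 (list 'foo))
(annotate! p1 1)
(annotate! p2 5)
(display (eq? p1 p2)) (newline)                  ; prints #f
(display (location p1)) (newline)                ; prints 1
```

Each call to `list' allocates a fresh pair, so `p1' and `p2' keep
separate entries, while both names for `foo' share a single one.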

Fixing this would require abandoning the plain S-expression
representation in favor of one in which symbols are represented by a
different data structure.  Our reader would need to be extended to
support the option of returning this new data representation instead of
plain S-expressions, and our macro expander would need to be modified to
accept this new representation as input.
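
To illustrate what such a representation could look like, here is a
rough sketch using SRFI-9 records. It is only an assumption made for
illustration, not Guile's actual design: the `<located-symbol>' type,
its accessors, and the `strip' helper are all hypothetical names.

```scheme
(use-modules (srfi srfi-9))

;; Hypothetical wrapper: a symbol plus the position where this
;; particular occurrence was read.
(define-record-type <located-symbol>
  (make-located-symbol name line column)
  located-symbol?
  (name   located-symbol-name)
  (line   located-symbol-line)
  (column located-symbol-column))

;; Two occurrences of x are now distinct objects with their own positions.
(define x1 (make-located-symbol 'x 3 4))
(define x2 (make-located-symbol 'x 7 10))
(display (eq? x1 x2)) (newline)                  ; prints #f
(display (eq? (located-symbol-name x1)
              (located-symbol-name x2)))
(newline)                                        ; prints #t

;; A consumer such as a macro expander would have to unwrap the new
;; representation; this helper recovers the plain S-expression.
(define (strip form)
  (cond ((located-symbol? form) (located-symbol-name form))
        ((pair? form) (cons (strip (car form)) (strip (cdr form))))
        (else form)))

(display (strip (list x1 (list x2 1)))) (newline) ; prints (x (x 1))
```

The wrapped form is no longer a plain S-expression, which is why both
the reader and the expander would need to know about it.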

      Mark




* Re: How to record source properties for all symbols?
  2018-06-04 10:07 ` Mark H Weaver
@ 2018-06-04 15:41   ` Fis Trivial
  2018-06-05 22:09     ` Mark H Weaver
From: Fis Trivial @ 2018-06-04 15:41 UTC
  To: Mark H Weaver; +Cc: guile-devel


Mark H Weaver writes:

> The problem is that there's no place to store the source information for
> symbols in the standard S-expression representation.
>
> The principal defining characteristic of symbols -- that "two symbols
> are identical (in the sense of 'eqv?') if and only if their names are
> spelled the same way" (R5RS § 6.3.3) -- combined with the fact that
> 'eq?' is specified to be the same as 'eqv?' for symbols, leaves us no
> way to distinguish two instances of the same symbol, and therefore no
> way to store per-instance annotations such as source information.
>
> Fixing this would require abandoning the plain S-expression
> representation in favor of one in which symbols are represented by a
> different data structure.  Our reader would need to be extended to
> support the option of returning this new data representation instead of
> plain S-expressions, and our macro expander would need to be modified to
> accept this new representation as input.

I still believe it's crucial to give users correct and detailed error
messages, since these days the software world is so large that we have
to learn things by trial and error. New languages strive to embed a full
tutorial in their error messages. Rust, for example (I am NOT promoting
it), gives an explanation for basically every syntax and semantic error.
Users now basically take correct error messages for granted.

After poking around for a few days, I find that I have a hard time
understanding the code. Can you give me some hints for reading it, so
that I can understand how to encode the source information? Currently, I
am still trying to make a baby step: encoding source information into a
symbol. I know that will break everything, but at least I will gain a
basic understanding of the underlying mechanics. Later I will try to
redefine whatever special structure is needed.

I have some experience with simple compilers, like the one from the
Dragon Book, or some simple DSLs. But Guile's compiler seems quite
different from those I know. I'm still trying to understand what
happens when a symbol is read.

I didn't find any keywords like `parser' or `lexer', but I tried
digging into `scm_read_expression', which returns a stringbuf. In
`scm_read_sexp', tmp is somehow encoded in tl by these three lines of
code:

      new_tail = scm_cons (tmp, SCM_EOL);
      SCM_SETCDR (tl, new_tail);
      tl = new_tail;

After `scm_cons', `tmp' is in the GC, but is it still a stringbuf, or
has it been turned into a meaningful symbol? Does the GC somehow also
play the role of the "environment" in other compiler front ends? What is
the effect of applying `maybe_annotate_source' to the variable `tl'?
Well, I tried that last one, and it doesn't do anything.

I guess it will take a long time before I understand Guile's internals;
I'm not a smart person. I will keep working on it in my spare time, but
if the maintainers have any desire to make this feature happen before I
make another baby step, please let me know. I can offer my help.


* Re: How to record source properties for all symbols?
  2018-06-04 15:41   ` Fis Trivial
@ 2018-06-05 22:09     ` Mark H Weaver
From: Mark H Weaver @ 2018-06-05 22:09 UTC
  To: Fis Trivial; +Cc: guile-devel

Hi,

Fis Trivial <ybbs.daans@hotmail.com> writes:

> Mark H Weaver writes:
>
>> The problem is that there's no place to store the source information for
>> symbols in the standard S-expression representation.
>>
>> The principal defining characteristic of symbols -- that "two symbols
>> are identical (in the sense of 'eqv?') if and only if their names are
>> spelled the same way" (R5RS § 6.3.3) -- combined with the fact that
>> 'eq?' is specified to be the same as 'eqv?' for symbols, leaves us no
>> way to distinguish two instances of the same symbol, and therefore no
>> way to store per-instance annotations such as source information.
>>
>> Fixing this would require abandoning the plain S-expression
>> representation in favor of one in which symbols are represented by a
>> different data structure.  Our reader would need to be extended to
>> support the option of returning this new data representation instead of
>> plain S-expressions, and our macro expander would need to be modified to
>> accept this new representation as input.
>
> I still believe it's crucial to give users correct and detailed error
> messages,

I agree.  I didn't intend to dispute the importance of good error
messages.  I merely intended to explain what the problem is, and why it
hasn't yet been done.

> After poking around for a few days, I find that I have a hard time
> understanding the code. Can you give me some hints for reading it, so
> that I can understand how to encode the source information? Currently,
> I am still trying to make a baby step: encoding source information
> into a symbol.

It can't be done.  I tried to explain the reason in my earlier message,
but I guess you don't understand.  I don't have time right now to make
another attempt.

> I know that will break everything, but at least I will gain a basic
> understanding of the underlying mechanics.

No, it simply won't work at all.

> I didn't find any keywords like `parser' or `lexer', but I tried
> digging into `scm_read_expression', which returns a stringbuf.

What makes you think 'scm_read_expression' returns a stringbuf?
It returns an S-expression.

I appreciate your willingness to help with this, but I think that you've
chosen a task that's too difficult given your level of experience with
Guile.  If it was straightforward, it would have been done long ago.

      Mark


