* Musings: Supposed places of safety, guaranteed by parse-partial-sexp are not safe.
@ 2011-12-03 23:23 Alan Mackenzie
2011-12-03 23:40 ` Daniel Colascione
2011-12-04 3:39 ` Stefan Monnier
0 siblings, 2 replies; 16+ messages in thread
From: Alan Mackenzie @ 2011-12-03 23:23 UTC
To: emacs-devel
Hi, Emacs.
There's a problem with parse-partial-sexp. If one scans to the middle
of a comment opener, i.e. to the position between the "/" and the "*"
of "/*", parse-partial-sexp gives no indication that we might be half
inside a comment. In particular, checking (nth 3 state) and (nth 4
state) is insufficient to know that one is at a "safe place".
parse-partial-sexp does, however, notify the caller when it is just
after a quote character such as a backslash, a somewhat analogous
situation.
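The contrast can be reproduced with a small sketch. It assumes a
hypothetical C-like syntax table, `my/c-like-syntax-table', in which
"/*" opens and "*/" closes a comment; neither that table nor the helper
`my/state-at' is part of Emacs, both are made up for illustration. The
state between "/" and "*" has neither (nth 3 state) nor (nth 4 state)
set, whereas the state just after a quote character (a backslash here)
has (nth 5 state) set.

```elisp
;; Hypothetical C-like table (an assumption, not shipped with Emacs):
;; "/*" opens and "*/" closes a comment, backslash is an escape char.
(defvar my/c-like-syntax-table
  (let ((st (make-syntax-table)))
    (modify-syntax-entry ?/ ". 14" st)
    (modify-syntax-entry ?* ". 23" st)
    (modify-syntax-entry ?\\ "\\" st)
    st))

(defun my/state-at (text pos)
  "Insert TEXT into a temporary buffer and return the parse state at POS."
  (with-temp-buffer
    (set-syntax-table my/c-like-syntax-table)
    (insert text)
    (parse-partial-sexp (point-min) pos)))

;; Position 4 lies between the "/" and the "*" of "a /* b */".
(let ((state (my/state-at "a /* b */" 4)))
  (list (nth 3 state) (nth 4 state)))   ; neither element is set

;; Position 3 lies just after the backslash of the text a\b.
(nth 5 (my/state-at "a\\b" 3))          ; set: just after a quote char
```

One character further on, at position 5 of the first text, (nth 4
state) is set, so the only position that goes unreported is the one
inside the opener.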
No doubt there is some record of this state hidden away in (nth 9
state).
I think it would be a good idea to provide a function to test for this
"half comment" state, somewhat like `syntax-ppss-toplevel-pos'. This
new defun could be called something like
`syntax-ppss-comment-half-opener' and calling it would return nil
usually, but ?/ in these circumstances.
What do other people think?
--
Alan Mackenzie (Nuremberg, Germany).
* Re: Musings: Supposed places of safety, guaranteed by parse-partial-sexp are not safe.
2011-12-03 23:23 Musings: Supposed places of safety, guaranteed by parse-partial-sexp are not safe Alan Mackenzie
@ 2011-12-03 23:40 ` Daniel Colascione
2011-12-04 3:39 ` Stefan Monnier
1 sibling, 0 replies; 16+ messages in thread
From: Daniel Colascione @ 2011-12-03 23:40 UTC
To: Alan Mackenzie; +Cc: emacs-devel
On 12/3/11 3:23 PM, Alan Mackenzie wrote:
> I think it would be a good idea to provide a function to test for this
> "half comment" state, somewhat like `syntax-ppss-toplevel-pos'. This
> new defun could be called something like
> `syntax-ppss-comment-half-opener' and calling it would return nil
> usually, but ?/ in these circumstances.
>
> What do other people think?
Why not just scan one character further ahead, using the previous position and
parse state, to see whether you then enter a comment?
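A minimal sketch of this suggestion, under stated assumptions: the
syntax table `my/c-like-syntax-table' and the function
`my/splits-comment-opener-p' are hypothetical names, and START with
START-STATE stands for whatever earlier position and state the caller
already has. The idea is to parse to POS and to POS + 1 from the same
starting point and compare the results.

```elisp
;; Hypothetical C-like table (an assumption): "/*" opens, "*/" closes.
(defvar my/c-like-syntax-table
  (let ((st (make-syntax-table)))
    (modify-syntax-entry ?/ ". 14" st)
    (modify-syntax-entry ?* ". 23" st)
    st))

(defun my/splits-comment-opener-p (start start-state pos)
  "Return non-nil if POS falls inside a two-character comment opener.
START and START-STATE are an earlier position and its parse state
\(START-STATE may be nil when START is outside any construct)."
  (save-excursion                       ; parse-partial-sexp moves point
    (and (< pos (point-max))
         (let ((here (parse-partial-sexp start pos nil nil start-state))
               (next (parse-partial-sexp start (1+ pos)
                                         nil nil start-state)))
           (and (not (nth 3 here))
                (not (nth 4 here))
                (nth 4 next)
                ;; The comment seen one character later must have begun
                ;; before POS; otherwise POS merely precedes an opener.
                (< (nth 8 next) pos))))))

;; Usage: position 4 lies between "/" and "*" in this text.
(with-temp-buffer
  (set-syntax-table my/c-like-syntax-table)
  (insert "a /* b */")
  (my/splits-comment-opener-p (point-min) nil 4))
```

The extra test on (nth 8 next) is a design choice of this sketch: it
keeps a position that simply sits in front of a one-character comment
starter from being reported.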
* Re: Musings: Supposed places of safety, guaranteed by parse-partial-sexp are not safe.
2011-12-03 23:23 Musings: Supposed places of safety, guaranteed by parse-partial-sexp are not safe Alan Mackenzie
2011-12-03 23:40 ` Daniel Colascione
@ 2011-12-04 3:39 ` Stefan Monnier
2011-12-04 10:41 ` martin rudalics
1 sibling, 1 reply; 16+ messages in thread
From: Stefan Monnier @ 2011-12-04 3:39 UTC
To: Alan Mackenzie; +Cc: emacs-devel
> There's a problem with parse-partial-sexp. If one scans to the middle
> of a comment opener /* (that is, between the "/" and the "*"),
> parse-partial-sexp gives no indication that we might be half inside a
[...]
> No doubt there is some record of this state hidden away in (nth 9 state).
IIRC you're just a bit too optimistic: parse-partial-sexp does not
record this info anywhere. And yes, if my recollection is right, that
means it's got a bug.
The better way to fix it is probably to change the (nth 5 ppss) value so
it holds something like "buffer position actually described by PPSS in
case the requested buffer position is in the middle of a lexeme" and
so it can be used for both backslashes and multi-char comment markers.
Stefan
* Re: Musings: Supposed places of safety, guaranteed by parse-partial-sexp are not safe.
2011-12-04 3:39 ` Stefan Monnier
@ 2011-12-04 10:41 ` martin rudalics
2011-12-04 15:21 ` Stefan Monnier
0 siblings, 1 reply; 16+ messages in thread
From: martin rudalics @ 2011-12-04 10:41 UTC
To: Stefan Monnier; +Cc: Alan Mackenzie, emacs-devel
> The better way to fix it is probably to change the (nth 5 ppss) value so
> it holds something like "buffer position actually described by PPSS in
> case the requested buffer position is in the middle of a lexeme" and
> so it can be used for both backslashes and multi-char comment markers.
If you change (nth 5 ppss) you would still have to say that (nth 4 ppss)
is unreliable in this special case. I think Daniel is right here: Check
whether the character following TO completes a comment begin (or comment
end) lexeme and in that case return consistently the in-comment value.
martin
* Re: Musings: Supposed places of safety, guaranteed by parse-partial-sexp are not safe.
2011-12-04 10:41 ` martin rudalics
@ 2011-12-04 15:21 ` Stefan Monnier
2011-12-04 17:06 ` martin rudalics
0 siblings, 1 reply; 16+ messages in thread
From: Stefan Monnier @ 2011-12-04 15:21 UTC
To: martin rudalics; +Cc: Alan Mackenzie, emacs-devel
>> The better way to fix it is probably to change the (nth 5 ppss) value so
>> it holds something like "buffer position actually described by PPSS in
>> case the requested buffer position is in the middle of a lexeme" and
>> so it can be used for both backslashes and multi-char comment markers.
> If you change (nth 5 ppss) you would still have to say that (nth 4 ppss)
> is unreliable in this special case.
Not if (nth 5 ppss) says that the buffer position is the one *after* the
"/*" sequence. Of course for "*/" we'd conversely want to use the state
*before* "*/".
Stefan
* Re: Musings: Supposed places of safety, guaranteed by parse-partial-sexp are not safe.
2011-12-04 15:21 ` Stefan Monnier
@ 2011-12-04 17:06 ` martin rudalics
2011-12-04 20:47 ` Andreas Röhler
` (2 more replies)
0 siblings, 3 replies; 16+ messages in thread
From: martin rudalics @ 2011-12-04 17:06 UTC
To: Stefan Monnier; +Cc: Alan Mackenzie, emacs-devel
>> If you change (nth 5 ppss) you would still have to say that (nth 4 ppss)
>> is unreliable in this special case.
>
> Not if (nth 5 ppss) says that the buffer position is the one *after* the
> "/*" sequence. Of course for "*/" we'd conversely want to use the state
> *before* "*/".
What I meant was that the caller would have to care about (nth 5 ppss)
too, wherever she now looked only at (nth 3 ppss) and (nth 4 ppss). If
we say that a comment is everything in between and including both
delimiters she won't have to care about (nth 5 ppss) in the first place.
Admittedly, it's not entirely trivial to implement. But the fact that
between "/" and "*" we are not in a comment whilst between "*" and "/"
we are doesn't strike me as very intuitive.
martin
* Re: Musings: Supposed places of safety, guaranteed by parse-partial-sexp are not safe.
2011-12-04 17:06 ` martin rudalics
@ 2011-12-04 20:47 ` Andreas Röhler
2011-12-05 3:33 ` Stefan Monnier
2011-12-05 11:25 ` Alan Mackenzie
2 siblings, 0 replies; 16+ messages in thread
From: Andreas Röhler @ 2011-12-04 20:47 UTC
To: emacs-devel
On 04.12.2011 18:06, martin rudalics wrote:
> >> If you change (nth 5 ppss) you would still have to say that (nth 4 ppss)
> >> is unreliable in this special case.
> >
> > Not if (nth 5 ppss) says that the buffer position is the one *after* the
> > "/*" sequence. Of course for "*/" we'd conversely want to use the state
> > *before* "*/".
>
> What I meant was that the caller would have to care about (nth 5 ppss)
> too, wherever she now looked only at (nth 3 ppss) and (nth 4 ppss). If
> we say that a comment is everything in between and including both
> delimiters she won't have to care about (nth 5 ppss) in the first place.
>
> Admittedly, it's not entirely trivial to implement. But the fact that
> between "/" and "*" we are not in a comment whilst between "*" and "/"
> we are doesn't strike me as very intuitive.
>
> martin
Hi,
a more striking example might be comments in HTML:
<!-- base href="https://blub+index" -->
I think only the comment opener needs care beyond what parse-partial-sexp
reports. I worked around it with
- looking-at comment-start
- a check, using string-match, whether point is inside the opener string
Andreas
* Re: Musings: Supposed places of safety, guaranteed by parse-partial-sexp are not safe.
2011-12-04 17:06 ` martin rudalics
2011-12-04 20:47 ` Andreas Röhler
@ 2011-12-05 3:33 ` Stefan Monnier
2011-12-05 7:41 ` martin rudalics
2011-12-05 11:35 ` Alan Mackenzie
2011-12-05 11:25 ` Alan Mackenzie
2 siblings, 2 replies; 16+ messages in thread
From: Stefan Monnier @ 2011-12-05 3:33 UTC
To: martin rudalics; +Cc: Alan Mackenzie, emacs-devel
>>> If you change (nth 5 ppss) you would still have to say that (nth 4 ppss)
>>> is unreliable in this special case.
>> Not if (nth 5 ppss) says that the buffer position is the one *after* the
>> "/*" sequence. Of course for "*/" we'd conversely want to use the state
>> *before* "*/".
> What I meant was that the caller would have to care about (nth 5 ppss)
> too, wherever she now looked only at (nth 3 ppss) and (nth 4 ppss).
That's what I understood and my suggestion does address this issue (tho
it means that (nth 5 ppss) will sometimes refer to a buffer position
after (point) and sometimes before).
A case that needs to work is "/*/" in C mode, for example.
Stefan
* Re: Musings: Supposed places of safety, guaranteed by parse-partial-sexp are not safe.
2011-12-05 3:33 ` Stefan Monnier
@ 2011-12-05 7:41 ` martin rudalics
2011-12-05 14:01 ` Stefan Monnier
2011-12-05 11:35 ` Alan Mackenzie
1 sibling, 1 reply; 16+ messages in thread
From: martin rudalics @ 2011-12-05 7:41 UTC
To: Stefan Monnier; +Cc: Alan Mackenzie, emacs-devel
>> What I meant was that the caller would have to care about (nth 5 ppss)
>> too, wherever she now looked only at (nth 3 ppss) and (nth 4 ppss).
>
> That's what I understood and my suggestion does address this issue (tho
> it means that (nth 5 ppss) will sometimes refer to a buffer position
> after (point) and sometimes before).
I still don't see what you need (nth 5 ppss) for here. Is it for providing
the OLDSTATE argument in another call to `parse-partial-sexp'?
martin
* Re: Musings: Supposed places of safety, guaranteed by parse-partial-sexp are not safe.
2011-12-05 7:41 ` martin rudalics
@ 2011-12-05 14:01 ` Stefan Monnier
0 siblings, 0 replies; 16+ messages in thread
From: Stefan Monnier @ 2011-12-05 14:01 UTC
To: martin rudalics; +Cc: Alan Mackenzie, emacs-devel
>>> What I meant was that the caller would have to care about (nth 5 ppss)
>>> too, wherever she now looked only at (nth 3 ppss) and (nth 4 ppss).
>> That's what I understood and my suggestion does address this issue (tho
>> it means that (nth 5 ppss) will sometimes refer to a buffer position
>> after (point) and sometimes before).
> I still don't see what you need (nth 5 ppss) for here. Is it for providing
> the OLDSTATE argument in another call to `parse-partial-sexp'?
Yes.
Think of calling parse-partial-sexp twice, passing the first result to
the second call, where the first result is in the middle of a "/*/".
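The scenario can be set up as follows, as a sketch for experimenting
rather than a statement of what any Emacs version returns. The table
`my/c-like-syntax-table' and the function `my/one-shot-vs-resumed' are
hypothetical names. The function parses TEXT once in a single call, and
once in two calls that hand the first state to the second as OLDSTATE,
then returns both (nth 4 ...) values so they can be compared.

```elisp
;; Hypothetical C-like table (an assumption): "/*" opens, "*/" closes.
(defvar my/c-like-syntax-table
  (let ((st (make-syntax-table)))
    (modify-syntax-entry ?/ ". 14" st)
    (modify-syntax-entry ?* ". 23" st)
    st))

(defun my/one-shot-vs-resumed (text split)
  "Parse TEXT in one call, and in two calls split at position SPLIT.
Return a list (ONE-SHOT RESUMED) of the two final (nth 4 STATE) values."
  (with-temp-buffer
    (set-syntax-table my/c-like-syntax-table)
    (insert text)
    (let* ((end (point-max))
           (one-shot (parse-partial-sexp (point-min) end))
           ;; First call stops at SPLIT; its result is the OLDSTATE
           ;; argument of the second call, which resumes from SPLIT.
           (first (parse-partial-sexp (point-min) split))
           (resumed (parse-partial-sexp split end nil nil first)))
      (list (nth 4 one-shot) (nth 4 resumed)))))

;; Split in the middle of the opener of "/*/ x" (between "/" and "*").
(my/one-shot-vs-resumed "/*/ x" 2)
```

The single-call parse ends inside the comment, because the "*" belongs
to the opener and the following "/" therefore closes nothing. Whether
the resumed parse agrees is precisely the question raised here, so no
value is claimed for the second element.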
Stefan
* Re: Musings: Supposed places of safety, guaranteed by parse-partial-sexp are not safe.
2011-12-05 3:33 ` Stefan Monnier
2011-12-05 7:41 ` martin rudalics
@ 2011-12-05 11:35 ` Alan Mackenzie
1 sibling, 0 replies; 16+ messages in thread
From: Alan Mackenzie @ 2011-12-05 11:35 UTC
To: Stefan Monnier; +Cc: martin rudalics, emacs-devel
Hello, Stefan,
On Sun, Dec 04, 2011 at 10:33:37PM -0500, Stefan Monnier wrote:
> >>> If you change (nth 5 ppss) you would still have to say that (nth 4 ppss)
> >>> is unreliable in this special case.
> >> Not if (nth 5 ppss) says that the buffer position is the one *after* the
> >> "/*" sequence. Of course for "*/" we'd conversely want to use the state
> >> *before* "*/".
> > What I meant was that the caller would have to care about (nth 5 ppss)
> > too, wherever she now looked only at (nth 3 ppss) and (nth 4 ppss).
> That's what I understood and my suggestion does address this issue (tho
> it means that (nth 5 ppss) will sometimes refer to a buffer position
> after (point) and sometimes before).
I think this is very wrong, and will lead to unwanted complications. I
would suggest this:
5. `t' if point is just after a quote character. The character just
scanned if that might be part of a double character comment boundary.
This should be straightforward to hack.
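The proposed value can be approximated in Lisp without touching the C
scanner. This is only a sketch under assumptions: the table
`my/c-like-syntax-table' and the function `my/half-delimiter-char' are
hypothetical, and the flag tests rely on the raw syntax descriptor
layout, where flag `1' is bit 16 and flag `3' is bit 18. It parses only
up to the character just scanned and never looks to the right of POS.

```elisp
;; Hypothetical C-like table (an assumption): "/*" opens, "*/" closes.
(defvar my/c-like-syntax-table
  (let ((st (make-syntax-table)))
    (modify-syntax-entry ?/ ". 14" st)
    (modify-syntax-entry ?* ". 23" st)
    st))

(defun my/half-delimiter-char (pos)
  "Return the char before POS if it may begin a two-char comment delimiter.
Return nil otherwise.  Works on the current buffer and syntax table."
  (when (> pos (point-min))
    (save-excursion                     ; parse-partial-sexp moves point
      (let* ((prev (1- pos))
             ;; State in front of the character just scanned.
             (before (parse-partial-sexp (point-min) prev))
             ;; Raw syntax flags of that character (0 if unavailable).
             (flags (or (car (syntax-after prev)) 0)))
        (cond
         ;; Inside a string or after a quote char: nothing to report.
         ((or (nth 3 before) (nth 5 before)) nil)
         ;; Inside a comment: a char flagged `3' may begin the closer.
         ((nth 4 before)
          (and (/= 0 (logand flags (ash 1 18))) (char-before pos)))
         ;; Outside comments: a char flagged `1' may begin an opener.
         (t
          (and (/= 0 (logand flags (ash 1 16))) (char-before pos))))))))

;; Usage on the "/*/" case: position 2 follows the first "/".
(with-temp-buffer
  (set-syntax-table my/c-like-syntax-table)
  (insert "/*/ x */")
  (list (my/half-delimiter-char 2) (my/half-delimiter-char 3)))
```

Taking the state in front of the character, rather than after it, is
what keeps the "*" of "/*/" from being mistaken for the start of a
closer: at that point the parse is not yet inside a comment.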
However, there will be crazy hackers who have tested (nth 5 ppss) as
being non-nil, rather than looking for t. :-( I say, tough on them.
> A case that needs to work is "/*/" in C mode, for example.
The above suggestion would handle this appropriately.
> Stefan
--
Alan Mackenzie (Nuremberg, Germany).
* Re: Musings: Supposed places of safety, guaranteed by parse-partial-sexp are not safe.
2011-12-04 17:06 ` martin rudalics
2011-12-04 20:47 ` Andreas Röhler
2011-12-05 3:33 ` Stefan Monnier
@ 2011-12-05 11:25 ` Alan Mackenzie
2011-12-06 10:15 ` martin rudalics
2 siblings, 1 reply; 16+ messages in thread
From: Alan Mackenzie @ 2011-12-05 11:25 UTC
To: martin rudalics; +Cc: Stefan Monnier, emacs-devel
Hello, Martin.
On Sun, Dec 04, 2011 at 06:06:16PM +0100, martin rudalics wrote:
> >> If you change (nth 5 ppss) you would still have to say that (nth 4 ppss)
> >> is unreliable in this special case.
> > Not if (nth 5 ppss) says that the buffer position is the one *after* the
> > "/*" sequence. Of course for "*/" we'd conversely want to use the state
> > *before* "*/".
> What I meant was that the caller would have to care about (nth 5 ppss)
> too, wherever she now looked only at (nth 3 ppss) and (nth 4 ppss). If
> we say that a comment is everything in between and including both
> delimiters she won't have to care about (nth 5 ppss) in the first place.
The parse-partial scanner works strictly left to right. If (nth 5 ppss)
records the left hand bit of "/*", we are not yet in a comment. We're
probably about to do a division. Similarly, after * of "*/", we're still
in the comment, probably just passed a comment prefix.
Admittedly CC Mode records the entire comment, including /* and */.
> Admittedly, it's not entirely trivial to implement. But the fact that
> between "/" and "*" we are not in a comment whilst between "*" and "/"
> we are doesn't strike me as very intuitive.
I disagree. I think keeping the strictly L to R invariant of the parse is
critically important (but don't ask me why :-).
> martin
--
Alan Mackenzie (Nuremberg, Germany).
* Re: Musings: Supposed places of safety, guaranteed by parse-partial-sexp are not safe.
2011-12-05 11:25 ` Alan Mackenzie
@ 2011-12-06 10:15 ` martin rudalics
2011-12-06 10:33 ` Alan Mackenzie
0 siblings, 1 reply; 16+ messages in thread
From: martin rudalics @ 2011-12-06 10:15 UTC
To: Alan Mackenzie; +Cc: Stefan Monnier, emacs-devel
> The parse-partial scanner works strictly left to right. If (nth 5 ppss)
> records the left hand bit of "/*", we are not yet in a comment. We're
> probably about to do a division. Similarly, after * of "*/", we're still
> in the comment, probably just passed a comment prefix.
If we can look ahead by one character, there is no probability but
certainty. And the latter is what you want in (nth 4 ppss). The
remaining case is with an "/" at the end of a buffer and that case
wouldn't trouble me.
> I disagree. I think keeping the strictly L to R invariant of the parse is
> critically important (but don't ask me why :-).
Why would looking ahead violate a L to R rule?
martin
* Re: Musings: Supposed places of safety, guaranteed by parse-partial-sexp are not safe.
2011-12-06 10:15 ` martin rudalics
@ 2011-12-06 10:33 ` Alan Mackenzie
2011-12-06 13:39 ` martin rudalics
2011-12-06 13:50 ` Stefan Monnier
0 siblings, 2 replies; 16+ messages in thread
From: Alan Mackenzie @ 2011-12-06 10:33 UTC
To: martin rudalics; +Cc: Stefan Monnier, emacs-devel
Hello, Martin.
On Tue, Dec 06, 2011 at 11:15:22AM +0100, martin rudalics wrote:
> > The parse-partial scanner works strictly left to right. If (nth 5
> > ppss) records the left hand bit of "/*", we are not yet in a
> > comment. We're probably about to do a division. Similarly, after *
> > of "*/", we're still in the comment, probably just passed a comment
> > prefix.
> If we can look ahead by one character, there is no probability but
> certainty. And the latter is what you want in (nth 4 ppss). The
> remaining case is with an "/" at the end of a buffer and that case
> wouldn't trouble me.
One can delete anything inside a comment and it is still a comment. We
(i.e. I :-) don't want to introduce an extra special case about the first
character of a comment.
> > I disagree. I think keeping the strictly L to R invariant of the
> > parse is critically important (but don't ask me why :-).
> Why would looking ahead violate a L to R rule?
Think of it as the direction one's head is turned on a British street
when about to cross it suicidally. At the moment, parse-partial-sexp
looks only at the characters to the left; it never pays any attention
whatsoever to characters on the right.
p-p-s is a finite state machine. If it starts looking to the right, it
will still be a fsm, but with many more states.
Again, what of "/*/" mentioned by Stefan? If we're already in the
comment after the first "/", then we're apparently looking at a comment
ender. This complication (and it is complicated) surely condemns the
approach.
I think we should use the same approach as for escape characters: record
the fact in (nth 5 state) that we've passed one, but otherwise take no
action.
> martin
--
Alan Mackenzie (Nuremberg, Germany).
* Re: Musings: Supposed places of safety, guaranteed by parse-partial-sexp are not safe.
2011-12-06 10:33 ` Alan Mackenzie
@ 2011-12-06 13:39 ` martin rudalics
2011-12-06 13:50 ` Stefan Monnier
1 sibling, 0 replies; 16+ messages in thread
From: martin rudalics @ 2011-12-06 13:39 UTC
To: Alan Mackenzie; +Cc: Stefan Monnier, emacs-devel
> One can delete anything inside a comment and it is still a comment. We
> (i.e. I :-) don't want to introduce an extra special case about the first
> character of a comment.
What is "the first character of a comment"? With current Emacs sources
the first character of a "/* ... */" comment is the leading "/" when
looking at (nth 8 ppss). But at the position to the right of that
character we're still not "within" that comment. Doesn't that strike
you as paradoxical at least?
> p-p-s is a finite state machine. If it starts looking to the right, it
> will still be a fsm, but with many more states.
I think there won't be any more states than with your proposal.
> Again, what of "/*/" mentioned by Stefan? If we're already in the
> comment after the first "/", then we're apparently looking at a comment
> ender. This complication (and it is complicated) surely condemns the
> approach.
This complication exists already as you can verify by looking at the
corresponding code. The value of the last comment start position (the
position before the leading "/") is IMHO sufficient to handle this case
well.
> I think we should use the same approach as for escape characters: record
> the fact in (nth 5 state) that we've passed one, but otherwise take no
> action.
Since you're the person most affected, the choice should be yours.
Nevertheless, I think that your initial claim
In particular, checking (nth 3 state) and (nth 4 state) is
insufficient to know that one is at a "safe place".
could be easily corrected.
martin
* Re: Musings: Supposed places of safety, guaranteed by parse-partial-sexp are not safe.
2011-12-06 10:33 ` Alan Mackenzie
2011-12-06 13:39 ` martin rudalics
@ 2011-12-06 13:50 ` Stefan Monnier
1 sibling, 0 replies; 16+ messages in thread
From: Stefan Monnier @ 2011-12-06 13:50 UTC
To: Alan Mackenzie; +Cc: martin rudalics, emacs-devel
> I think we should use the same approach as for escape characters: record
> the fact in (nth 5 state) that we've passed one, but otherwise take no
> action.
I think I agree. But I suspect it's going to be painful to write the
patch for it. It's probably going to be easier to store in
(nth 5 state) a buffer position from where to pick up the parse.
Stefan