bug#36432: 26.2; SMIE does not request forward tokens when point is at point-max

unofficial mirror of bug-gnu-emacs@gnu.org 

* bug#36432: 26.2; SMIE does not request forward tokens when point is at point-max
@ 2019-06-29 12:14 Sam Halliday
  2019-06-29 12:23 ` Eli Zaretskii
  2019-06-29 21:39 ` Stefan Monnier
  0 siblings, 2 replies; 13+ messages in thread
From: Sam Halliday @ 2019-06-29 12:14 UTC (permalink / raw)
  To: 36432

SMIE (via `indent-for-tab-command') does not request forward tokens
from the lexer when point is at `point-max'.

This might sound like a strange bug report: why should smie expect
there to be any tokens when it is already at point-max? The answer is:
virtual tokens. For example, Haskell may have many closing curly
brackets that live at the end of the buffer.

A workaround is to add a few stray newlines to the end of any buffer
that uses SMIE for indentation. Then SMIE will request the next tokens
(even though there is only whitespace left until the end of the
buffer) and will receive at least one of those virtuals.
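
A minimal sketch of the trailing-newline workaround described above. The
helper name `sketch-ensure-trailing-newlines' is hypothetical; it is not
part of SMIE or haskell-tng, and how many newlines are enough is an
assumption left to the caller.

```elisp
(defun sketch-ensure-trailing-newlines (count)
  "Make sure the current buffer ends with at least COUNT newlines.
Hypothetical helper illustrating the workaround; it modifies the buffer."
  (save-excursion
    (goto-char (point-max))
    ;; `skip-chars-backward' returns zero or a negative distance, so
    ;; negating it gives the number of newlines already present.
    (let ((present (- (skip-chars-backward "\n"))))
      (goto-char (point-max))
      (when (< present count)
        (insert (make-string (- count present) ?\n))))))
```

Such a helper changes the file contents, which is why it is only a
workaround and not a fix.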


* bug#36432: 26.2; SMIE does not request forward tokens when point is at point-max
  2019-06-29 12:14 bug#36432: 26.2; SMIE does not request forward tokens when point is at point-max Sam Halliday
@ 2019-06-29 12:23 ` Eli Zaretskii
  2019-06-29 12:34   ` Sam Halliday
  2019-06-29 21:39 ` Stefan Monnier
  1 sibling, 1 reply; 13+ messages in thread
From: Eli Zaretskii @ 2019-06-29 12:23 UTC (permalink / raw)
  To: Sam Halliday; +Cc: 36432

> From: Sam Halliday <sam.halliday@gmail.com>
> Date: Sat, 29 Jun 2019 13:14:01 +0100
> 
> SMIE (via `indent-for-tab-command') does not request forward tokens
> from the lexer when point is at `point-max'.
> 
> This might sound like a strange bug report: why should smie expect
> there to be any tokens when it is already at point-max? The answer is:
> virtual tokens. For example, Haskell may have many closing curly
> brackets that live at the end of the buffer.

??? There can be nothing at point-max, as that position is beyond the
last buffer position.  Did you mean the position just before that?  Or
am I missing something here?






* bug#36432: 26.2; SMIE does not request forward tokens when point is at point-max
  2019-06-29 12:23 ` Eli Zaretskii
@ 2019-06-29 12:34   ` Sam Halliday
  2019-06-29 12:42     ` Eli Zaretskii
  0 siblings, 1 reply; 13+ messages in thread
From: Sam Halliday @ 2019-06-29 12:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 36432

On 29/06/2019, Eli Zaretskii <eliz@gnu.org> wrote:
>> From: Sam Halliday <sam.halliday@gmail.com>
>> Date: Sat, 29 Jun 2019 13:14:01 +0100
>>
>> SMIE (via `indent-for-tab-command') does not request forward tokens
>> from the lexer when point is at `point-max'.
>>
>> This might sound like a strange bug report: why should smie expect
>> there to be any tokens when it is already at point-max? The answer is:
>> virtual tokens. For example, Haskell may have many closing curly
>> brackets that live at the end of the buffer.
>
> ??? There can be nothing at point-max, as that position is beyond the
> last buffer position.  Did you mean the position just before that?  Or
> am I missing something here?
>

I mean at point-max.

Consider this layout algorithm

https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/haskell-tng-layout.el

and this lexer

https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/haskell-tng-lexer.el

that can continue to produce tokens even when the point is at the very
end of the buffer.

e.g.

input https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/test/src/layout.hs
with layout https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/test/src/layout.hs.layout
just tokens https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/test/src/layout.hs.lexer

note the trailing } that exists at the very end of the file. SMIE
always misses this.






* bug#36432: 26.2; SMIE does not request forward tokens when point is at point-max
  2019-06-29 12:34   ` Sam Halliday
@ 2019-06-29 12:42     ` Eli Zaretskii
  2019-06-29 12:51       ` Eli Zaretskii
  2019-06-29 12:51       ` Sam Halliday
  0 siblings, 2 replies; 13+ messages in thread
From: Eli Zaretskii @ 2019-06-29 12:42 UTC (permalink / raw)
  To: Sam Halliday; +Cc: 36432

> From: Sam Halliday <sam.halliday@gmail.com>
> Date: Sat, 29 Jun 2019 13:34:06 +0100
> Cc: 36432@debbugs.gnu.org
> 
> > ??? There can be nothing at point-max, as that position is beyond the
> > last buffer position.  Did you mean the position just before that?  Or
> > am I missing something here?
> >
> 
> I mean at point-max.
> 
> Consider this layout algorithm
> 
> https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/haskell-tng-layout.el
> 
> and this lexer
> 
> https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/haskell-tng-lexer.el
> 
> that can continue to produce tokens even when the point is at the very
> end of the buffer.

So you create an illusion of characters beyond the EOB?

How would Emacs know this is the case?  Why don't you also override
point-max to make it consistent with those illusory characters?






* bug#36432: 26.2; SMIE does not request forward tokens when point is at point-max
  2019-06-29 12:42     ` Eli Zaretskii
@ 2019-06-29 12:51       ` Eli Zaretskii
  2019-06-29 13:01         ` Sam Halliday
  2019-06-29 12:51       ` Sam Halliday
  1 sibling, 1 reply; 13+ messages in thread
From: Eli Zaretskii @ 2019-06-29 12:51 UTC (permalink / raw)
  To: sam.halliday; +Cc: 36432

> Date: Sat, 29 Jun 2019 15:42:27 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: 36432@debbugs.gnu.org
> 
> How would Emacs know this is the case?  Why don't you also override
> point-max to make it consistent with those illusory characters?

Or actually insert those characters at EOB, but make them invisible?






* bug#36432: 26.2; SMIE does not request forward tokens when point is at point-max
  2019-06-29 12:42     ` Eli Zaretskii
  2019-06-29 12:51       ` Eli Zaretskii
@ 2019-06-29 12:51       ` Sam Halliday
  2019-06-29 13:06         ` Eli Zaretskii
  1 sibling, 1 reply; 13+ messages in thread
From: Sam Halliday @ 2019-06-29 12:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 36432

On 29/06/2019, Eli Zaretskii <eliz@gnu.org> wrote:
>> From: Sam Halliday <sam.halliday@gmail.com>
>> Date: Sat, 29 Jun 2019 13:34:06 +0100
>> Cc: 36432@debbugs.gnu.org
>>
>> > ??? There can be nothing at point-max, as that position is beyond the
>> > last buffer position.  Did you mean the position just before that?  Or
>> > am I missing something here?
>> >
>>
>> I mean at point-max.
>>
>> Consider this layout algorithm
>>
>> https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/haskell-tng-layout.el
>>
>> and this lexer
>>
>> https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/haskell-tng-lexer.el
>>
>> that can continue to produce tokens even when the point is at the very
>> end of the buffer.
>
> So you create an illusion of characters beyond the EOB?
>
> How would Emacs know this is the case?

When testing it is possible to keep polling the lexer until it returns
nil when at point-max, rather than looking at `point-max` and giving
up. I think that could work in general inside SMIE.
https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/test/haskell-tng-lexer-test.el
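
The polling idea can be sketched as follows. This is an illustration
under stated assumptions, not the code in haskell-tng or smie.el: the
names `sketch-owed-closers', `sketch-toy-forward-token' and
`sketch-drain-lexer' are hypothetical, and the toy lexer uses a plain
counter where a real layout algorithm would compute the virtual tokens.

```elisp
(defvar sketch-owed-closers 0
  "Hypothetical count of virtual \"}\" tokens still owed at end of buffer.")

(defun sketch-toy-forward-token ()
  "Toy forward lexer: whitespace-separated words, then virtual tokens.
At end of buffer it returns one owed \"}\" per call, then nil."
  (skip-chars-forward " \t\n")
  (cond
   ((not (eobp))
    (let ((start (point)))
      (skip-chars-forward "^ \t\n")
      (buffer-substring-no-properties start (point))))
   ((> sketch-owed-closers 0)
    (setq sketch-owed-closers (1- sketch-owed-closers))
    "}")))

(defun sketch-drain-lexer (forward-token)
  "Call FORWARD-TOKEN from point until it returns nil or \"\".
The loop never consults `point-max' itself; only the lexer decides
when the token stream is finished.  Return the tokens in order."
  (let ((tokens '())
        token)
    (while (and (setq token (funcall forward-token))
                (not (string= token "")))
      (push token tokens))
    (nreverse tokens)))
```

A lexer that keeps returning a non-empty token without ever running out
would make this loop spin forever, which is one reason polling existing
lexers past the end of the buffer needs care.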

I suspect the example forward lexer, from the documentation
https://www.gnu.org/software/emacs/manual/html_mono/elisp.html#SMIE-Lexer
would be ok in this situation. I'd be concerned that existing lexers
would throw an error if they were polled when at the beginning/end of
the buffer unexpectedly.

BTW, this also happens at the start of the buffer. SMIE doesn't ask
for backwards tokens when at the beginning.

> Why don't you also override
> point-max to make it consistent with those illusory characters?

Hmm, that is a workaround worth exploring. I'm not sure what the
consequences would be of changing something so fundamental. I think
changing SMIE would probably be easier, even with a monkey patch of
the relevant function or advice. I can have a go at trying to do that.
I just need to figure out exactly which function is doing the check.


* bug#36432: 26.2; SMIE does not request forward tokens when point is at point-max
  2019-06-29 12:51       ` Eli Zaretskii
@ 2019-06-29 13:01         ` Sam Halliday
  2019-06-29 13:08           ` Eli Zaretskii
  0 siblings, 1 reply; 13+ messages in thread
From: Sam Halliday @ 2019-06-29 13:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 36432

On 29/06/2019, Eli Zaretskii <eliz@gnu.org> wrote:
>> Date: Sat, 29 Jun 2019 15:42:27 +0300
>> From: Eli Zaretskii <eliz@gnu.org>
>> Cc: 36432@debbugs.gnu.org
>>
>> How would Emacs know this is the case?  Why don't you also override
>> point-max to make it consistent with those illusory characters?
>
> Or actually insert those characters at EOB, but make them invisible?

I think I'd like to avoid doing that unless I do it for all of the
virtual tokens. I don't know how to do that without it impacting the
underlying source code. Is there more documentation about doing it
that way? It'd be a drastic change in how the lexer is written.

BTW I had a look through smie.el and I can't see anywhere obvious
where (eobp) or (point-max) are called that would lead to the bug I'm
seeing. I'll most likely have to step debug at some point to get to
the bottom of this.


* bug#36432: 26.2; SMIE does not request forward tokens when point is at point-max
  2019-06-29 12:51       ` Sam Halliday
@ 2019-06-29 13:06         ` Eli Zaretskii
  2019-06-29 13:13           ` Sam Halliday
  0 siblings, 1 reply; 13+ messages in thread
From: Eli Zaretskii @ 2019-06-29 13:06 UTC (permalink / raw)
  To: Sam Halliday; +Cc: 36432

> From: Sam Halliday <sam.halliday@gmail.com>
> Date: Sat, 29 Jun 2019 13:51:31 +0100
> Cc: 36432@debbugs.gnu.org
> 
> > So you create an illusion of characters beyond the EOB?
> >
> > How would Emacs know this is the case?
> 
> When testing it is possible to keep polling the lexer until it returns
> nil when at point-max, rather than looking at `point-max` and giving
> up. I think that could work in general inside SMIE.
> https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/test/haskell-tng-lexer-test.el

But SMIE is just an application on top of Emacs basic handling of
buffer positions.  The assumption that there can be nothing at EOB is
hardcoded into many Emacs primitives, into its display engine, and
into core Lisp infrastructure.  You are playing with fire trying to
force Emacs to think there are some characters beyond EOB.  Just grep the
C sources for ZV, and you will see the enormous height of the hill you
will need to fight up.  I wouldn't recommend that to anyone.

It should be easier to modify SMIE to take characters from a string,
then you could put whatever you want into that string.  Or maybe SMIE
already supports reading from strings, I don't know.

> BTW, this also happens at the start of the buffer. SMIE doesn't ask
> for backwards tokens when at the beginning.

For the same basic reasons.


* bug#36432: 26.2; SMIE does not request forward tokens when point is at point-max
  2019-06-29 13:01         ` Sam Halliday
@ 2019-06-29 13:08           ` Eli Zaretskii
  0 siblings, 0 replies; 13+ messages in thread
From: Eli Zaretskii @ 2019-06-29 13:08 UTC (permalink / raw)
  To: Sam Halliday; +Cc: 36432

> From: Sam Halliday <sam.halliday@gmail.com>
> Date: Sat, 29 Jun 2019 14:01:52 +0100
> Cc: 36432@debbugs.gnu.org
> 
> On 29/06/2019, Eli Zaretskii <eliz@gnu.org> wrote:
> >> Date: Sat, 29 Jun 2019 15:42:27 +0300
> >> From: Eli Zaretskii <eliz@gnu.org>
> >> Cc: 36432@debbugs.gnu.org
> >>
> >> How would Emacs know this is the case?  Why don't you also override
> >> point-max to make it consistent with those illusory characters?
> >
> > Or actually insert those characters at EOB, but make them invisible?
> 
> I think I'd like to avoid doing that unless I do it for all of the
> virtual tokens. I don't know how to do that without it impacting the
> underlying source code. Is there more documentation about doing it
> that way? It'd be a drastic change in how the lexer is written.

Why?  Does the lexer only pay attention to visible characters?

> BTW I had a look through smie.el and I can't see anywhere obvious
> where (eobp) or (point-max) are called that would lead to the bug I'm
> seeing.

It's likely in lower-level code.  Like I said: this assumption is
everywhere in Emacs.






* bug#36432: 26.2; SMIE does not request forward tokens when point is at point-max
  2019-06-29 13:06         ` Eli Zaretskii
@ 2019-06-29 13:13           ` Sam Halliday
  2019-06-29 13:43             ` Eli Zaretskii
  0 siblings, 1 reply; 13+ messages in thread
From: Sam Halliday @ 2019-06-29 13:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 36432

On 29/06/2019, Eli Zaretskii <eliz@gnu.org> wrote:
> It should be easier to modify SMIE to take characters from a string,
> then you could put whatever you want into that string.  Or maybe SMIE
> already supports reading from strings, I don't know.

I agree. SMIE should be operating on a list of tokens and a lookup
from those tokens to the original buffer (and content). In most cases
SMIE is written this way but I suspect there are a few lapses.
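
One way to picture "a list of tokens and a lookup from those tokens to
the original buffer" is sketched below. This is only an illustration of
the idea, not how smie.el is structured; `sketch-token' and
`sketch-tokenize-buffer' are hypothetical names, and the caller supplies
the number of virtual closers instead of a layout algorithm.

```elisp
(require 'cl-lib)

(cl-defstruct (sketch-token (:constructor sketch-token-create))
  text      ; the token string
  start     ; buffer position where the token begins
  end       ; buffer position where the token ends
  virtual)  ; non-nil when the token has no characters in the buffer

(defun sketch-tokenize-buffer (owed-closers)
  "Return a list of `sketch-token' objects for the current buffer.
Whitespace-separated words become real tokens carrying their buffer
positions.  OWED-CLOSERS virtual \"}\" tokens are appended, each
anchored at `point-max' with zero width."
  (save-excursion
    (goto-char (point-min))
    (let ((tokens '()))
      (while (progn (skip-chars-forward " \t\n") (not (eobp)))
        (let ((start (point)))
          (skip-chars-forward "^ \t\n")
          (push (sketch-token-create
                 :text (buffer-substring-no-properties start (point))
                 :start start
                 :end (point))
                tokens)))
      (dotimes (_ owed-closers)
        (push (sketch-token-create
               :text "}" :start (point-max) :end (point-max) :virtual t)
              tokens))
      (nreverse tokens))))
```

With such a list, the consumer walks tokens and never needs to ask
whether point has reached the end of the buffer.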






* bug#36432: 26.2; SMIE does not request forward tokens when point is at point-max
  2019-06-29 13:13           ` Sam Halliday
@ 2019-06-29 13:43             ` Eli Zaretskii
  0 siblings, 0 replies; 13+ messages in thread
From: Eli Zaretskii @ 2019-06-29 13:43 UTC (permalink / raw)
  To: Sam Halliday, Stefan Monnier; +Cc: 36432

> From: Sam Halliday <sam.halliday@gmail.com>
> Date: Sat, 29 Jun 2019 14:13:16 +0100
> Cc: 36432@debbugs.gnu.org
> 
> On 29/06/2019, Eli Zaretskii <eliz@gnu.org> wrote:
> > It should be easier to modify SMIE to take characters from a string,
> > then you could put whatever you want into that string.  Or maybe SMIE
> > already supports reading from strings, I don't know.
> 
> I agree. SMIE should be operating on a list of tokens and a lookup
> from those tokens to the original buffer (and content). In most cases
> SMIE is written this way but I suspect there are a few lapses.

In any case, I'm CC'ing Stefan who knows much more about SMIE than I
do.  Apologies in advance if Stefan says my fears have no basis.






* bug#36432: 26.2; SMIE does not request forward tokens when point is at point-max
  2019-06-29 12:14 bug#36432: 26.2; SMIE does not request forward tokens when point is at point-max Sam Halliday
  2019-06-29 12:23 ` Eli Zaretskii
@ 2019-06-29 21:39 ` Stefan Monnier
  2019-06-30  8:50   ` Sam Halliday
  1 sibling, 1 reply; 13+ messages in thread
From: Stefan Monnier @ 2019-06-29 21:39 UTC (permalink / raw)
  To: Sam Halliday; +Cc: 36432

> SMIE (via `indent-for-tab-command') does not request forward tokens
> from the lexer when point is at `point-max'.

After looking at the smie.el code I think this bug report is not
sufficiently detailed: it definitely sometimes does, and I don't see any
obvious place where it doesn't.  Can you clarify if it happens during
something like smie-forward-sexp or rather within the smie-indent*
code itself.

Or do you mean when you trigger indent-according-to-mode with point at EOB?

        Stefan


* bug#36432: 26.2; SMIE does not request forward tokens when point is at point-max
  2019-06-29 21:39 ` Stefan Monnier
@ 2019-06-30  8:50   ` Sam Halliday
  0 siblings, 0 replies; 13+ messages in thread
From: Sam Halliday @ 2019-06-30  8:50 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 36432

I'm seeing this when doing indentation.

e.g. in https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/test/src/indentation.hs
move the point to the end of the last line and do a
`newline-and-indent'. Then do it again when you have two newlines after
that last point. The results are different.
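
The comparison above can be scripted roughly like this. It is a sketch
under assumptions: `sketch-eob-indent-column' is a hypothetical helper,
and the major mode to test is passed in by the caller (the name
`haskell-tng-mode' in the comment is assumed, not confirmed here).

```elisp
(defun sketch-eob-indent-column (text mode trailing-newlines)
  "Return the column chosen by `newline-and-indent' at the end of TEXT.
TEXT is inserted into a temporary buffer in major mode MODE, followed
by TRAILING-NEWLINES newline characters; point is then put back at the
end of TEXT before the command runs."
  (with-temp-buffer
    (funcall mode)
    (insert text)
    (let ((end-of-text (point)))
      (insert (make-string trailing-newlines ?\n))
      (goto-char end-of-text))
    (newline-and-indent)
    (current-column)))

;; Usage idea (mode name assumed): if the two columns differ, the
;; indentation depends on whether point was at `point-max'.
;;
;;   (list (sketch-eob-indent-column source 'haskell-tng-mode 0)
;;         (sketch-eob-indent-column source 'haskell-tng-mode 2))
```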

BTW, in addition to the edebug support you've added, I also have

(bind-key "C-M-<return>" 'haskell-tng-smie:debug-newline haskell-tng-mode-map)
(bind-key "C-M-<tab>" 'haskell-tng-smie:debug-tab haskell-tng-mode-map)

that are useful for seeing what's going on, with some haskell-tng
specific things.

On 29/06/2019, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>> SMIE (via `indent-for-tab-command') does not request forward tokens
>> from the lexer when point is at `point-max'.
>
> After looking at the smie.el code I think this bug report is not
> sufficiently detailed: it definitely sometimes does, and I don't see any
> obvious place where it doesn't.  Can you clarify if it happens during
> something like smie-forward-sexp or rather within the smie-indent*
> code itself.
>
> Or do you mean when you trigger indent-according-to-mode with point at EOB?
>
>
>         Stefan
>
>




