* Re: Performance of getting line number for positions around point?
From: Yuan Fu @ 2025-02-05 8:06 UTC
To: Emacs Devel; +Cc: Eli Zaretskii
> On Jan 19, 2025, at 11:18 PM, Yuan Fu <casouri@gmail.com> wrote:
>
> When sending buffer edits to tree-sitter, we’re supposed to pass both the byte position and the (line, col) position. Until now we’ve passed only the byte position, along with a dummy (line, col) position. Most of the time, the (line, col) position just gets carried around inside tree-sitter and comes back out when tree-sitter reports positions to us, so as long as we don’t use the (line, col) positions, that’s fine.
>
> However, it turns out that for some languages the (line, col) information is significant and can affect parsing, as shown in [1]. Most editors do track lines and columns, so I don’t think tree-sitter will ever track line and column itself just for Emacs.
>
> So now the question is, is there an existing, performant way to get line numbers? We only need to send line and column positions for buffer edits, so we just need line numbers for positions around point. Also, because narrowing is transparent to tree-sitter, when the buffer is narrowed, the line number also needs to change with it.
>
> IIUC count_line just counts lines, and redisplay_count_line depends on redisplay?
>
> [1] https://github.com/tree-sitter/tree-sitter/issues/4001
I haven’t yet started seriously looking into this, but my initial idea is to keep a cache in buffer local variables that stores the line number of a position near point. And in insert/delete functions, we update the line number cache. If we need to get the line number of a point, just scan the content between the cached position and that point.
Does that sound like a good idea? Or are there existing, better ways to do it?
Yuan
* Re: Performance of getting line number for positions around point?
From: Eli Zaretskii @ 2025-02-05 14:20 UTC
To: Yuan Fu; +Cc: emacs-devel
> From: Yuan Fu <casouri@gmail.com>
> Date: Wed, 5 Feb 2025 00:06:29 -0800
> Cc: Eli Zaretskii <eliz@gnu.org>
>
> > IIUC count_line just counts lines, and redisplay_count_line depends on redisplay?
It's display_count_lines, and it doesn't depend on redisplay, AFAICT.
> I haven’t yet started seriously looking into this, but my initial idea is to keep a cache in buffer local variables that stores the line number of a position near point. And in insert/delete functions, we update the line number cache. If we need to get the line number of a point, just scan the content between the cached position and that point.
>
> Does that sound like a good idea? Or are there existing, better ways to do it?
If you look at decode_mode_spec, you will see that it uses
display_count_lines and caches the results for faster operation (since
mode-line updates are frequent and must be fast). Its cache is
per-window, but you can make a similar cache local to buffer (or maybe
to a parser?). Of course, the main issue with any cache is when to
invalidate it...
As for column number, this could be tricky: what exactly is
tree-sitter's definition of a column? In particular, is that a
character offset or a byte offset from the beginning of a line? And
what does tree-sitter expect from a column if characters are composed
into grapheme clusters, as in á (there are two characters there, not
one)? And what about a TAB -- how many "columns" does it take, from
tree-sitter's POV (remember that Emacs has tab-width which affects
that)? And there are probably other complications...
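
To illustrate how far these definitions can diverge, here is a small Python sketch. The function name column_measures is hypothetical, UTF-8 is assumed for the byte count, and the tab handling is a simplified stand-in for what tab-width does in Emacs. It does not claim to say which value tree-sitter expects, only that the candidates differ for the same text.

```python
def column_measures(line_prefix: str, tab_width: int = 8) -> dict:
    """Return candidate 'column' values for the text between the start
    of a line and some position on that line (hypothetical helper)."""
    visual = 0
    for ch in line_prefix:
        if ch == "\t":
            # Advance to the next tab stop.
            visual += tab_width - (visual % tab_width)
        else:
            # Simplification: every other code point counts as one column,
            # ignoring wide and combining characters.
            visual += 1
    return {
        "utf8_bytes": len(line_prefix.encode("utf-8")),  # byte offset
        "code_points": len(line_prefix),                 # character offset
        "tab_expanded": visual,                          # tab-aware column
    }


# A TAB followed by "a" plus COMBINING ACUTE ACCENT (U+0301).
measures = column_measures("\ta\u0301")
```

For that prefix the byte offset is 4, the character offset is 3, and the tab-expanded column is 10, so whichever definition tree-sitter uses has to be matched exactly.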