unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Ihor Radchenko <yantar92@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: casouri@gmail.com,  emacs-devel@gnu.org
Subject: Re: Exposing buffer text modifications to Lisp (was: Tree-sitter integration on feature/tree-sitter)
Date: Sat, 18 Jun 2022 13:52:59 +0800	[thread overview]
Message-ID: <878rpuwm9w.fsf@localhost> (raw)
In-Reply-To: <834k0jplcm.fsf@gnu.org>

Eli Zaretskii <eliz@gnu.org> writes:

> [I've changed the Subject, since this is not longer about tree-sitter.]

Well. I had some hope that we can generalize the tree-sitter interface
to allow Elisp-based parsers, but it is just a wish.

> OK, but that still doesn't tell what you need from the Emacs core.
> Can you describe those needs?  I presume that modification hooks (of
> any kind) are just the means; the real need is something else.  What
> is it?  If (as I presume) you need to know about changes to the
> buffer, then can you enumerate the changes that are of interest?  For
> example, are changes in text properties and overlays of interest, and
> if so, what kind of properties/overlays?  (But please don't limit your
> answers to just text properties and overlays, because I asked about
> them explicitly.)

Valid question. I am a bit too familiar with Org parser code and assume
that some things are "obvious" when they are not.

I will first answer about AST.

> Next, what kind of ASTs do you want to build, and how do you
> represent text as AST?  In particular, is the AST defined by regexps
> or some other Lisp data structures?

Org AST represents semantic objects using nested lists.
Similar to tree-sitter (AFAIU), each object in the tree is represented
by

(object-type (object-plist) object-children ...)

for example:

* test headline :tag:

is represented as

(headline
  (:raw-value "test headline" :begin 292 :end 314 ... :tags ("tag") ... :parent (...))
  ;; no children
   )

Upon modifying text inside the headline, we need to update :begin/:end
properties to reflect the new headline boundaries in buffer and possibly
update headline properties (e.g. :tags).

The same should be done for all the elements containing the headline.

Updating the elements require the following information:

1. Whether modified text contained terminal symbols or text contributing
   to object-plist _before_ modification.
2. The boundaries of the edited text in buffer and change in the text
   length.
3. Whether the modified text contain terminal symbols/text contributing
   to object-plist _after_ modification.

Org does not care about text property changes or overlay changes.
We just perform a series of regexp searches over the changed parts of
buffer (possibly with extended boundaries) before and after the
modification + know which region of text has been modified (its begin,
end, and change in length).

Missing any significant change (the one involving terminal symbols or
changing region length) will make the AST invalid.

Hope it clarifies the needs.

Best,
Ihor



  reply	other threads:[~2022-06-18  5:52 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-19  1:35 Tree-sitter integration on feature/tree-sitter Kiong-Ge Liau
2022-05-20  2:01 ` Yuan Fu
2022-06-16 19:03   ` Yuan Fu
2022-06-16 19:25     ` [External] : " Drew Adams
2022-06-17  1:11       ` Yuan Fu
2022-06-17 14:22         ` Drew Adams
2022-06-17  1:24     ` Po Lu
2022-06-18  0:09       ` Yuan Fu
2022-06-17  2:00     ` Ihor Radchenko
2022-06-17  2:25       ` Lower-level change hook immune to with-silent-modifications Yuan Fu
2022-06-17  2:55         ` Stefan Monnier
2022-06-17  5:28           ` Eli Zaretskii
2022-06-17 10:10             ` Ihor Radchenko
2022-06-17 11:03               ` Eli Zaretskii
2022-06-17  5:23       ` Tree-sitter integration on feature/tree-sitter Eli Zaretskii
2022-06-17 10:40         ` Ihor Radchenko
2022-06-17 11:42           ` Exposing buffer text modifications to Lisp (was: Tree-sitter integration on feature/tree-sitter) Eli Zaretskii
2022-06-18  5:52             ` Ihor Radchenko [this message]
2022-06-18  7:01               ` Eli Zaretskii
2022-06-18  7:23                 ` Ihor Radchenko
2022-06-18  7:44                   ` Eli Zaretskii
2022-06-18  8:13                     ` Ihor Radchenko
2022-06-18  8:47                       ` Exposing buffer text modifications to Lisp Eli Zaretskii
2022-06-20 11:58                         ` Ihor Radchenko
2022-06-20 12:32                           ` Eli Zaretskii
2022-06-20 14:14                             ` Stefan Kangas
2022-06-21  3:56                               ` Ihor Radchenko
2022-06-21  4:36                             ` Ihor Radchenko
2022-06-21 12:27                               ` Eli Zaretskii
2022-06-25  4:47                                 ` Optimizing performance of buffer markers (was: Exposing buffer text modifications to Lisp) Ihor Radchenko
2022-06-25  8:29                                   ` Optimizing performance of buffer markers Stefan Monnier
2022-06-25  8:44                                     ` Eli Zaretskii
2022-06-25  9:07                                       ` Stefan Monnier
2022-06-25  9:20                                         ` Eli Zaretskii
2022-06-25  9:27                                           ` Stefan Monnier
2022-06-25  9:47                                         ` Ihor Radchenko
2022-06-25  9:53                                           ` Stefan Monnier
2022-06-26 10:32                                   ` Robert Pluim
2022-06-22 15:45                             ` Exposing buffer text modifications to Lisp Basil L. Contovounesios
2022-06-22 16:13                               ` Eli Zaretskii
2022-06-25  4:54                                 ` Ihor Radchenko
2022-06-25  5:46                                   ` Eli Zaretskii
2022-06-29 12:24                                     ` Ihor Radchenko
2022-06-20 14:33                           ` Alan Mackenzie
2022-06-21  3:58                             ` Ihor Radchenko
2022-06-17  6:15     ` Tree-sitter integration on feature/tree-sitter Eli Zaretskii
2022-06-17  7:17       ` Yuan Fu
2022-06-17 10:37         ` Eli Zaretskii
2022-06-18  0:14           ` Yuan Fu
2022-06-18  6:22             ` Eli Zaretskii
2022-06-18  8:25               ` Yuan Fu
2022-06-18  8:50                 ` Eli Zaretskii
2022-06-18 20:07                   ` Yuan Fu
2022-06-19  5:39                     ` Eli Zaretskii
2022-06-20  3:00                       ` Yuan Fu
2022-06-20 11:44                         ` Eli Zaretskii
2022-06-20 20:01                           ` Yuan Fu
2022-06-21  2:26                             ` Eli Zaretskii
2022-06-21  4:39                               ` Yuan Fu
2022-06-21 10:18                                 ` Eli Zaretskii
2022-06-22  0:34                                   ` Yuan Fu
2022-06-17 11:06     ` Jostein Kjønigsen
2022-06-18  0:28       ` Yuan Fu
2022-06-18 20:57         ` Jostein Kjønigsen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=878rpuwm9w.fsf@localhost \
    --to=yantar92@gmail.com \
    --cc=casouri@gmail.com \
    --cc=eliz@gnu.org \
    --cc=emacs-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).