Re: Tokenizing - Stephen Leake

From: Stephen Leake <stephen_leake@stephe-leake.org>
To: emacs-devel@gnu.org
Subject: Re: Tokenizing
Date: Sun, 21 Sep 2014 10:32:29 -0500
Message-ID: <85ha01dm5u.fsf@stephe-leake.org>
In-Reply-To: <CAAs=0-05L9nV69HPWTfEgtEq_x8LiGivmCo7vOJEr5dDJMVCsg@mail.gmail.com> (Vladimir Kazanov's message of "Sat, 20 Sep 2014 19:36:16 +0300")

Vladimir Kazanov <vekazanov@gmail.com> writes:

> Okay, I'll give text properties a try.
>
> Right now my vision for this mode is the following:
>
> - avoid retokenizing undamaged buffer parts at all costs (as a main
> feature meant for incremental parsing);

You might look at what I did in Ada mode (current source is in ELPA);
see wisi-before-change and wisi-after-change in wisi.el.
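
For readers who have not looked at that code, the general idea of
remembering where the buffer was damaged and rescanning only from there
can be sketched as below. This is an illustrative Python model and not
code from wisi.el or Ada mode; the names `TokenCache`, `note_change`
and `retokenize` and the toy token regexp are all assumptions made for
the example. It is also deliberately simple: tokens before the edit are
reused, and everything after the edit is rescanned.

```python
import re

# Toy token grammar (an assumption): identifiers/numbers or single punctuation.
TOKEN_RE = re.compile(r"\s*(\w+|[^\w\s])")


class TokenCache:
    """Hypothetical cache of (start, end, text) tokens for one buffer."""

    def __init__(self):
        self.tokens = []       # kept sorted by start offset
        self.valid_up_to = 0   # text before this offset is tokenized and undamaged

    def note_change(self, pos):
        """Record an edit at offset `pos`: drop every token ending at or after it."""
        self.tokens = [t for t in self.tokens if t[1] < pos]
        resume = self.tokens[-1][1] if self.tokens else 0
        self.valid_up_to = min(self.valid_up_to, resume)

    def retokenize(self, text):
        """Rescan from the damage point only; return the number of tokens scanned."""
        pos = self.valid_up_to
        scanned = 0
        while True:
            m = TOKEN_RE.match(text, pos)
            if m is None:
                break
            self.tokens.append((m.start(1), m.end(1), m.group(1)))
            pos = m.end()
            scanned += 1
        self.valid_up_to = len(text)
        return scanned


cache = TokenCache()
cache.retokenize("foo = bar + 1")   # full scan of a fresh buffer
cache.note_change(8)                # user edits inside "bar"
cache.retokenize("foo = baz + 1")   # "foo" and "=" are reused, the rest is rescanned
```

A smarter version would also try to resynchronize with the old tokens
after the edit, which is where most of the difficulty lies.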

> - collect damages and do reparsing only when user stops editing,
> similar to the font-lock-mode (js2-mode, nxml-mode...);

Ada mode only reparses when the user requests an action that requires a
parse. How else do you tell when a user "stops editing"?

font-lock runs after 'idle-time', which appears to be about 2 seconds (I
could not figure out from the structure of 'timer-idle-list' what the
actual idle time is). I guess that serves as the approximation of when
the user stops editing.

I don't normally edit 7000-line files, so the Ada mode parsing delay is
not noticeable to me; I therefore prefer the current Ada mode approach
of not using the idle timer to trigger a parse. But it could be a user
option.
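
To make the two triggering policies concrete, here is a minimal Python
sketch. It is not Ada mode code: `LazyParser`, `ensure_parsed`,
`on_idle` and the `parse_when_idle` option are hypothetical names, and
`parse_fn` stands in for whatever parser the mode provides. Edits only
mark the buffer dirty; a parse happens when some action asks for the
result, or on idle if the user has enabled that option.

```python
class LazyParser:
    """Hypothetical on-demand parser wrapper with an optional idle trigger."""

    def __init__(self, parse_fn, parse_when_idle=False):
        self.parse_fn = parse_fn                # stand-in for the real parser
        self.parse_when_idle = parse_when_idle  # the assumed user option
        self.text = ""
        self.dirty = True
        self.result = None
        self.parse_count = 0

    def on_change(self, new_text):
        """Called after every edit: remember the text, but do not parse."""
        self.text = new_text
        self.dirty = True

    def ensure_parsed(self):
        """Called by any action that needs parse results (indent, navigate, ...)."""
        if self.dirty:
            self.result = self.parse_fn(self.text)
            self.parse_count += 1
            self.dirty = False
        return self.result

    def on_idle(self):
        """Called by an idle timer; parses only if the user opted in."""
        if self.parse_when_idle:
            self.ensure_parsed()


parser = LazyParser(str.split)       # str.split as a toy "parser"
parser.on_change("if a then")
parser.on_change("if a then b")      # two edits, still no parse
parser.on_idle()                     # option is off, so nothing happens
tokens = parser.ensure_parsed()      # the first parse happens here
```

With `parse_when_idle=True` the same object would parse in `on_idle`,
so the result is usually ready by the time an action needs it, at the
cost of parses that may never be used.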

> - the incremental logic should have two interfaces, the first one
> meant for language-specific tokenizing code and a second one - for the
> user code, be it code beautifiers or advanced incremental parsers;
>
> - it should be possible to completely replace the font-lock-mode with
> this mode, given a concrete language tokenizer;
>
> You said two things basically: 1) I must use text properties, 2) it is
> possible to improve text properties interfaces to help the tokenizer.
> I suggest the following plan:
>
> 1) try to implement the tokenizer using available text property
> mechanics;

Ada mode uses text properties to store parse results; the tokenizer
results are part of that, but are not stored separately. I don't see
much point in separating the tokenizer from the parser; the tokenizer
results are not useful by themselves (at least, not in Ada mode).

> 2) see if there are slow-downs or problems, or space for improvements
> on the Emacs side.

I have not noticed any problems with the text properties interface; in
particular, storing and retrieving text properties is fast compared to
parsing. Ada mode stores about two parse result text properties per
source line on average.

-- 
-- Stephe
