From: Vladimir Kazanov <vekazanov@gmail.com>
To: rms@gnu.org
Cc: emacs-devel@gnu.org
Subject: Re: Overlay mechanic improvements
Date: Sat, 20 Sep 2014 11:08:27 +0300
Message-ID: <CAAs=0-0Jr7kiJuqmG_sxrO6VAMf7NEoJ-FAed-ZNE0H5srBJ8A@mail.gmail.com>
In-Reply-To: <E1XV2XA-0002kq-7t@fencepost.gnu.org>

The problem here is that tokens are not just lexemes (the token text).
To be general and granular enough, one also has to remember the
dependencies between tokens. So a token is: ["lexeme", lookback,
lookahead, position]. Here

 - "lexeme" is the string itself;

 - "lookahead" is the number of chars *after* the lexeme that had to be
read to delimit it (in the simplest case just the whitespace between
symbol names, i.e. a single char);

 - "lookback" is the number of previous tokens whose lookaheads reach
into the current token's zone of responsibility (this one can be
recalculated on the fly);

 - "position" is just the token's beginning/end.

Thus, we can define "a token" as the sum of a lexeme and the context
required to extract that lexeme. Whenever something changes within a
particular lexeme+lookahead interval, the token has to be reparsed,
and the n=lookback previous tokens also have to be fixed.
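
To make the rule above concrete, here is a minimal Python sketch. It
is not taken from the paper or from any existing code; the names
Token, covers and tokens_to_relex are made up for illustration, and
"position" is reduced to a start offset with the end derived from the
lexeme length. It applies the rule literally: a token is dirty if the
edit falls inside its lexeme+lookahead interval, and its n=lookback
predecessors are marked dirty as well.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    # Field order follows ["lexeme", lookback, lookahead, position].
    lexeme: str
    lookback: int   # previous tokens whose lookahead reaches into this one
    lookahead: int  # chars read past the end of the lexeme to delimit it
    start: int      # offset of the first char (assumed representation)

    @property
    def end(self) -> int:
        return self.start + len(self.lexeme)

    def covers(self, pos: int) -> bool:
        # True if pos lies inside the lexeme+lookahead interval.
        return self.start <= pos < self.end + self.lookahead


def tokens_to_relex(tokens: List[Token], pos: int) -> List[int]:
    """Indexes of the tokens that must be fixed after an edit at pos."""
    dirty = set()
    for i, tok in enumerate(tokens):
        if tok.covers(pos):
            dirty.add(i)
            # Also mark the n=lookback tokens just before this one.
            dirty.update(range(max(0, i - tok.lookback), i))
    return sorted(dirty)


# Tokens for the text "x+y zz"; whitespace is not a token here.
tokens = [
    Token("x", 0, 1, 0),
    Token("+", 1, 1, 1),
    Token("y", 1, 1, 2),
    Token("zz", 0, 0, 4),
]
print(tokens_to_relex(tokens, 3))  # edit on the space after "y"
print(tokens_to_relex(tokens, 5))  # edit inside "zz"
```

The first call reports "+" and "y", the second only "zz". The linear
scan is only for brevity; a real implementation would need a faster
way to find the intervals around an edit.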

More on this can be found in this paper:
http://harmonia.cs.berkeley.edu/papers/twagner-lexing.pdf. Quite a
lengthy read, I must warn. What I like about it is that it *proves*
token stream consistency after fixing. My system will hopefully be
simplified compared to the original idea while keeping the proved
part.

>      All I need
>     is an ability to save position pairs, the positions should survive text
>     insertion/deletion
>
> I see multiple meanings for that; could you clarify?

I mean something like the relative positioning of markers/overlays:
when text is inserted or deleted elsewhere in the buffer, the saved
positions should shift accordingly.
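
As a rough illustration of what I am asking for, here is a small
Python sketch of position pairs that follow edits and can be looked up
by point. Span, SpanStore and the two shift functions are hypothetical
names, not an Emacs API. I assume that a position exactly at an
insertion point does not move, which is how a default marker behaves.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    start: int
    end: int  # exclusive


def shift_for_insert(p: int, at: int, n: int) -> int:
    # Positions strictly after the insertion point move right by n.
    # A position exactly at the insertion point stays put (assumption).
    return p + n if p > at else p


def shift_for_delete(p: int, at: int, n: int) -> int:
    # Positions after the deleted range [at, at+n) move left by n;
    # positions inside the range collapse to its start.
    if p >= at + n:
        return p - n
    return min(p, at)


class SpanStore:
    def __init__(self) -> None:
        self.spans: List[Span] = []

    def add(self, start: int, end: int) -> Span:
        span = Span(start, end)
        self.spans.append(span)
        return span

    def insert(self, at: int, n: int) -> None:
        # Record that n chars were inserted at offset `at`.
        for s in self.spans:
            s.start = shift_for_insert(s.start, at, n)
            s.end = shift_for_insert(s.end, at, n)

    def delete(self, at: int, n: int) -> None:
        # Record that n chars starting at offset `at` were deleted.
        for s in self.spans:
            s.start = shift_for_delete(s.start, at, n)
            s.end = shift_for_delete(s.end, at, n)

    def at(self, point: int) -> List[Span]:
        # All spans containing point (linear scan, for brevity only).
        return [s for s in self.spans if s.start <= point < s.end]


store = SpanStore()
a = store.add(0, 3)
b = store.add(10, 13)
store.insert(5, 4)   # an edit between the two spans
print(a, b)          # a is unchanged, b has moved to 14..17
print(store.at(15))  # finds b
```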

>                        and there should be a way to find those pairs given a
>     buffer point.
>
> Text properties are designed to be preserved through copying of text.
> Overlays are not.  So it seems to me that you must use text properties.

Yes, properties do survive copying, but the token's context might
change, and the whole thing will have to be reparsed anyway. We can
take a piece of a comment and drop it into the middle of an
expression. I don't care whether the moved text was part of a token or
not; I just need to know that something changed within a particular
interval (lexeme+lookahead).

> For each token, you put a text property 'token' onto the characters in
> the token.  The value of the property would say what token they are.
> The property would be eq for all the characters in one token.
>
> Then you can use 'next-single-char-property-change' and
> 'previous-single-char-property-change' to find the end and the
> beginning of the token.

I chose overlays mostly because they let me control *intervals of
text*, and the intervals can overlap. For example, I can put
modification-hooks on an overlay, and they won't be called for every
character, just for the interval(s) that overlap the change. Citing
the "Special Properties" documentation page for text properties: "you
can't predict how many times the function will be called". One can
imagine a workaround for this, but it would be just too cumbersome.

> If you run into any difficulties using the existing interfaces
> for text properties, we should improve the interfaces to make your
> program easier to write.
>

Should I just do that, or try both (improving overlays along the way)?
After all, overlays do have to be fixed anyway.


-- 
Yours sincerely,


Vladimir Kazanov




