unofficial mirror of emacs-devel@gnu.org 
* Indirect text properties
@ 2019-11-17 17:05 Alan Mackenzie
  2019-11-17 22:55 ` Dmitry Gutov
  2019-11-18 18:11 ` Lars Ingebrigtsen
  0 siblings, 2 replies; 4+ messages in thread
From: Alan Mackenzie @ 2019-11-17 17:05 UTC (permalink / raw)
  To: emacs-devel

Hello, Emacs.

This is an idea I had a couple of years ago, and has recently resurfaced
in discussions with Dmitry (Subject: Several major modes).

The idea is that there could be several alternative sets of text
properties with the same symbol simultaneously in a buffer, the Lisp
code selecting which to use by binding a dynamic variable.  This would
be most useful for the syntax-table text property.

How would this work?  In textprop.c, the code would, on any access to a
text property, check its symbol's property 'indirect-text-property, and
if that is a non-nil symbol, access its value (another symbol) and use
that as the symbol for the text property instead.  It's easier to say in
code, which would look something like:

    #define TEXP_PROP_END_NAME(sym) \
        (!NILP (itp = Fget (sym, Qindirect_text_property)) && SYMBOLP (itp) \
         && !NILP (etp = find_symbol_value (itp)) && SYMBOLP (etp) \
         ? etp : sym)

To switch to a different set of, e.g., syntax-table text properties
it would suffice to bind the Lisp variable i-t-p to, say, the gensym
syntax-table-13.  Of course low level caches, e.g. in syntax.c, would
have to be kept synchronised, too.
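
As a rough illustration of the lookup described above, here is a
self-contained toy model in C.  It is only a sketch under stated
assumptions: `struct symbol`, its fields and `effective_prop` are
hypothetical stand-ins, not Emacs internals.  Plain pointers replace
Lisp_Object, so the SYMBOLP checks reduce to NULL checks.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a Lisp symbol (hypothetical, NOT Emacs's Lisp_Object).  */
struct symbol {
    const char *name;
    struct symbol *indirect;  /* models the 'indirect-text-property entry, or NULL */
    struct symbol *value;     /* models the symbol's current value, or NULL */
};

/* Return the symbol actually used as the text property name when
   code asks for SYM.  Falls back to SYM itself unless SYM names a
   selector variable and that selector is currently bound.  */
const struct symbol *
effective_prop (const struct symbol *sym)
{
    assert (sym != NULL);
    const struct symbol *itp = sym->indirect;   /* the selector variable */
    if (itp != NULL && itp->value != NULL)
        return itp->value;                      /* e.g. syntax-table-13 */
    return sym;
}
```

With the selector unbound, `effective_prop` returns the symbol it was
given; binding the selector's value to a symbol named syntax-table-13
redirects every lookup of syntax-table to that symbol, and unbinding
restores the default.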

So, what use would it be?  What I have proposed to Dmitry is having a
distinct set of syntax-table properties for each major mode chunk of an
MMM Mode ("multiple major mode") buffer.  Say syntax-table-13 would be
the set for a CC Mode chunk.  Outside of that chunk, every character
would be given a space syntax-table-13 text property.  This is the
critical thing.

Thus all actions dependent upon syntax (and there are a LOT), could be
performed by CC Mode in the chunk without the other chunks getting in
the way.  It may not even be necessary to narrow to the chunk.
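
To make the "everything outside the chunk looks like whitespace" point
concrete, here is a small toy scanner in C.  It is a sketch only: the
function name `net_brace_depth` and the per-position `mask` array are
hypothetical, with the mask standing in for the space syntax-table-13
property.  The real scanning code in syntax.c works on syntax classes,
not raw characters.

```c
#include <assert.h>
#include <stddef.h>

/* Net brace depth of TEXT[0..LEN).  Any position whose MASK entry is
   nonzero is skipped, as if it had whitespace syntax.  MASK may be
   NULL, meaning no position is masked.  */
int
net_brace_depth (const char *text, const unsigned char *mask, size_t len)
{
    assert (text != NULL);
    int depth = 0;
    for (size_t i = 0; i < len; i++) {
        if (mask != NULL && mask[i])
            continue;               /* foreign chunk: treated as space */
        if (text[i] == '{')
            depth++;
        else if (text[i] == '}')
            depth--;
    }
    return depth;
}
```

For a buffer such as "f() { <% } %> }", scanning without a mask sees
one opening and two closing braces, whereas masking the "<% } %>"
chunk leaves the braces of the surrounding code balanced.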

The necessary juggling with the various syntax-table-13s would be done
by MMM Mode.  This might well allow arbitrary major modes to be used in
MMM Mode with minimal, if any, modification.

Thoughts?

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: Indirect text properties
  2019-11-17 17:05 Indirect text properties Alan Mackenzie
@ 2019-11-17 22:55 ` Dmitry Gutov
  2019-11-18 18:06   ` Alan Mackenzie
  2019-11-18 18:11 ` Lars Ingebrigtsen
  1 sibling, 1 reply; 4+ messages in thread
From: Dmitry Gutov @ 2019-11-17 22:55 UTC (permalink / raw)
  To: Alan Mackenzie, emacs-devel; +Cc: Vitalie Spinu

Hi Alan,

On 17.11.2019 19:05, Alan Mackenzie wrote:

> This is an idea I had a couple of years ago, and has recently resurfaced
> in discussions with Dmitry (Subject: Several major modes).
> 
> The idea is that there could be several alternative sets of text
> properties with the same symbol simultaneously in a buffer, the Lisp
> code selecting which to use by binding a dynamic variable.  This would
> be most useful for the syntax-table text property.

Could char-property-alias-alist help?

> How would this work?  In textprop.c, the code would, on any access to a
> text property, check its symbol's property 'indirect-text-property, and
> if that is a non-nil symbol, access its value (another symbol) and use
> that as the symbol for the text property instead.  It's easier to say in
> code, which would look something like:
> 
>      #define TEXP_PROP_END_NAME(sym) \
>          (!NILP (itp = Fget (sym, Qindirect_text_property)) && SYMBOLP (itp) \
>           && !NILP (etp = find_symbol_value (itp)) && SYMBOLP (etp) \
>           ? etp : sym)
> 
> To switch to a different set of, e.g., syntax-table text properties
> it would suffice to bind the Lisp variable i-t-p to, say, the gensym
> syntax-table-13.  Of course low level caches, e.g. in syntax.c, would
> have to be kept synchronised, too.

It's a lot of work, likely with some performance overhead even in the 
default case. It sounds like it could be a piece of the puzzle, 
but let's see if we get the full picture first.

Also, I think most (all?) of this proposal could be implemented in Lisp 
by just setting the 'syntax-table' property on the overlays that cover 
different submode regions. That would mean more overhead when setting 
the values, but less overhead when accessing them.

> So, what use would it be?  What I have proposed to Dmitry is having a
> distinct set of syntax-table properties for each major mode chunk of an
> MMM Mode ("multiple major mode") buffer.  Say syntax-table-13 would be
> the set for a CC Mode chunk.  Outside of that chunk, every character
> would be given a space syntax-table-13 text property.  This is the
> critical thing.
> 
> Thus all actions dependent upon syntax (and there are a LOT), could be
> performed by CC Mode in the chunk without the other chunks getting in
> the way.  It may not even be necessary to narrow to the chunk.

It doesn't seem like it covers all problematic cases. Maybe not even the 
majority:

- Would this win over "local" syntax-table properties as assigned by 
syntax-table? By the usual logic of how we implement property 
priorities, probably not. But it should, for this to work.
- Some code can just be looking for certain characters instead of syntax 
classes with re-search-backward, etc. It wouldn't be fooled either. So 
this would likely require some "are we still in the same major mode" 
predicate. At which point we might get by without the space-syntax-table 
swapping entirely.

So what are the exact scenarios that you aim to fix with this?

/Cc Vitalie, he could have some ideas, maybe even tell us how Polymode 
solves this problem already.




* Re: Indirect text properties
  2019-11-17 22:55 ` Dmitry Gutov
@ 2019-11-18 18:06   ` Alan Mackenzie
  0 siblings, 0 replies; 4+ messages in thread
From: Alan Mackenzie @ 2019-11-18 18:06 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Vitalie Spinu, emacs-devel

Hello, Dmitry.

On Mon, Nov 18, 2019 at 00:55:50 +0200, Dmitry Gutov wrote:
> On 17.11.2019 19:05, Alan Mackenzie wrote:

> > This is an idea I had a couple of years ago, and has recently resurfaced
> > in discussions with Dmitry (Subject: Several major modes).

> > The idea is that there could be several alternative sets of text
> > properties with the same symbol simultaneously in a buffer, the Lisp
> > code selecting which to use by binding a dynamic variable.  This would
> > be most useful for the syntax-table text property.

> Could char-property-alias-alist help?

That is kind of the other way round from indirect properties.  It gives
several names accessing one property, whereas indirect properties
give one name accessing several properties.
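
The two directions can be contrasted in a small toy model in C.  This
is only a sketch: `struct prop`, `get_prop`, `get_aliased` and
`get_indirect` are hypothetical names, and a flat array of name/value
pairs stands in for a character's property list.  The alias lookup
follows the documented behaviour of char-property-alias-alist (direct
name first, then the alternatives in order); the indirect lookup
models the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct prop { const char *name; int value; };

/* Value of property NAME in PROPS[0..N), or -1 if absent.  */
int
get_prop (const struct prop *props, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp (props[i].name, name) == 0)
            return props[i].value;
    return -1;
}

/* Alias direction: several names reach one stored property.  Try
   NAME, then each alternative in order; the first hit wins.  */
int
get_aliased (const struct prop *props, size_t n, const char *name,
             const char *const *alts, size_t n_alts)
{
    int v = get_prop (props, n, name);
    for (size_t i = 0; v < 0 && i < n_alts; i++)
        v = get_prop (props, n, alts[i]);
    return v;
}

/* Indirect direction: callers always ask for NAME, but SELECTED
   (when non-NULL) decides which stored property is consulted.  */
int
get_indirect (const struct prop *props, size_t n, const char *name,
              const char *selected)
{
    return get_prop (props, n, selected != NULL ? selected : name);
}
```

With the alias lookup, asking for face can find a stored
font-lock-face value.  With the indirect lookup, asking for
syntax-table yields the syntax-table-13 or the syntax-table-14 value
depending on the selector alone.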

> > How would this work?  In textprop.c, the code would, on any access to a
> > text property, check its symbol's property 'indirect-text-property, and
> > if that is a non-nil symbol, access its value (another symbol) and use
> > that as the symbol for the text property instead.  It's easier to say in
> > code, which would look something like:

> >      #define TEXP_PROP_END_NAME(sym) \
> >          (!NILP (itp = Fget (sym, Qindirect_text_property)) && SYMBOLP (itp) \
> >           && !NILP (etp = find_symbol_value (itp)) && SYMBOLP (etp) \
> >           ? etp : sym)

> > To switch to a different set of, e.g., syntax-table text properties
> > it would suffice to bind the Lisp variable i-t-p to, say, the gensym
> > syntax-table-13.  Of course low level caches, e.g. in syntax.c, would
> > have to be kept synchronised, too.

> It's a lot of work, likely with some performance overhead even in the 
> default case. It sounds like it could be a piece of the puzzle, 
> but let's see if we get the full picture first.

The performance overhead whilst the facility is not in use would be tiny:
a look-up of a symbol's property list which is highly likely to be empty
anyway.  This is in the context of text properties, which involve
traversing trees to find an "interval" containing the place we're looking
up.

But OK, we need firmer proposals.

> Also, I think most (all?) of this proposal could be implemented in Lisp 
> by just setting the 'syntax-table' property on the overlays that cover 
> different submode regions. That would mean more overhead when setting 
> the values, but less overhead when accessing them.

Overlays don't have a syntax-table property.  It could be implemented,
but it would probably slow down syntactic scanning a lot, since
parse-partial-sexp would have to check all the overlays on _each_
character, one by one.  Or it would need some optimisation, which might
be brittle.

> > So, what use would it be?  What I have proposed to Dmitry is having a
> > distinct set of syntax-table properties for each major mode chunk of an
> > MMM Mode ("multiple major mode") buffer.  Say syntax-table-13 would be
> > the set for a CC Mode chunk.  Outside of that chunk, every character
> > would be given a space syntax-table-13 text property.  This is the
> > critical thing.

> > Thus all actions dependent upon syntax (and there are a LOT), could be
> > performed by CC Mode in the chunk without the other chunks getting in
> > the way.  It may not even be necessary to narrow to the chunk.

> It doesn't seem like it covers all problematic cases. Maybe not even the 
> majority:

Possibly not.

> - Would this win over "local" syntax-table properties as assigned by 
> syntax-table? By the usual logic of how we implement property 
> priorities, probably not. But it should, for this to work.

I'm not clear what you mean by "local" syntax-table properties, or by
"assigned by syntax-table".  Sorry.

> - Some code can just be looking for certain characters instead of syntax 
> classes with re-search-backward, etc. It wouldn't be fooled either. So 
> this would likely require some "are we still in the same major mode" 
> predicate. At which point we might get by without the space-syntax-table 
> swapping entirely.

Yes, that is true.  So the concept of region boundary could not be done
away with altogether.

> So what are the exact scenarios that you aim to fix with this?

You mentioned having a C Mode "main" section in MMM Mode with embedded other
modes, possibly in the middle of CC Mode constructs.  Here, my new scheme
would win, since the other modes would just look like whitespace to CC
Mode whilst it is scanning for strings, comments, braces, etc.  This
seems to be a big problem at the moment with CC Mode + MMM Mode.

The same applies to other major modes, many of which will want to use
syntax-ppss.

> /Cc Vitalie, he could have some ideas, maybe even tell us how Polymode 
> solves this problem already.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: Indirect text properties
  2019-11-17 17:05 Indirect text properties Alan Mackenzie
  2019-11-17 22:55 ` Dmitry Gutov
@ 2019-11-18 18:11 ` Lars Ingebrigtsen
  1 sibling, 0 replies; 4+ messages in thread
From: Lars Ingebrigtsen @ 2019-11-18 18:11 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

Alan Mackenzie <acm@muc.de> writes:

> The idea is that there could be several alternative sets of text
> properties with the same symbol simultaneously in a buffer, the Lisp
> code selecting which to use by binding a dynamic variable.  This would
> be most useful for the syntax-table text property.

It would be useful -- it's been proposed a few times before, if I recall
correctly.  I think Stefan M called this concept "planes" or
"namespaces" or something.

It'd allow us to get rid of the `font-lock-face'/`face' thing, too, I
think -- font-lock would just work on `face' in its own plane.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



