unofficial mirror of emacs-devel@gnu.org 
* font-lock-syntactic-keywords obsolet?
@ 2016-06-16 10:18 Andreas Röhler
  2016-06-17 22:13 ` Stefan Monnier
  0 siblings, 1 reply; 95+ messages in thread
From: Andreas Röhler @ 2016-06-16 10:18 UTC (permalink / raw)
  To: emacs-devel@gnu.org

Hi,

For some time now, the following warning has been appearing:

Warning: `font-lock-syntactic-keywords' is an obsolete variable (as of 24.1);
     use `syntax-propertize-function' instead.


However, having a look at syntax-propertize-function, it seems not ready 
for production - for several reasons.

Cheers,

Andreas




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-16 10:18 font-lock-syntactic-keywords obsolet? Andreas Röhler
@ 2016-06-17 22:13 ` Stefan Monnier
  2016-06-18  7:03   ` Andreas Röhler
  0 siblings, 1 reply; 95+ messages in thread
From: Stefan Monnier @ 2016-06-17 22:13 UTC (permalink / raw)
  To: emacs-devel

> However, having a look at syntax-propertize-function, it seems not ready for
> production - for several reasons.

Please keep those reasons secret,


        Stefan





* Re: font-lock-syntactic-keywords obsolet?
  2016-06-17 22:13 ` Stefan Monnier
@ 2016-06-18  7:03   ` Andreas Röhler
  2016-06-18 15:07     ` Noam Postavsky
  0 siblings, 1 reply; 95+ messages in thread
From: Andreas Röhler @ 2016-06-18  7:03 UTC (permalink / raw)
  To: emacs-devel



On 18.06.2016 00:13, Stefan Monnier wrote:
>> However, having a look at syntax-propertize-function, it seems not ready for
>> production - for several reasons.
> Please keep those reasons secret,
>
>
>          Stefan
>
>

Anyone who looks at how syntax-propertize-function is introduced in the
source, given some decent understanding of Emacs Lisp, will see the bunch
of issues. There is nothing to explain here. As this sits pretty much at
the center of things, I'm wondering why the one-thousand-and-first bug is
discussed so thoroughly, while this basic issue is touched on only
occasionally.





* Re: font-lock-syntactic-keywords obsolet?
  2016-06-18  7:03   ` Andreas Röhler
@ 2016-06-18 15:07     ` Noam Postavsky
  2016-06-18 17:12       ` Alan Mackenzie
  0 siblings, 1 reply; 95+ messages in thread
From: Noam Postavsky @ 2016-06-18 15:07 UTC (permalink / raw)
  To: emacs-devel

On Sat, Jun 18, 2016 at 3:03 AM, Andreas Röhler
<andreas.roehler@online.de> wrote:
> Anyone who looks how syntax-propertize-function is introduced in source,
> --given some decent understanding of Emacs Lisp-- will see the bunch of
> issues. There is nothing to explain here.

Ah, is this like The Emperor's New Clothes? Anyone who doesn't see the
issues lacks a decent understanding of Emacs Lisp? I'll play the kid who
doesn't know any better. Here's the source where
syntax-propertize-function is introduced:

;;; Applying syntax-table properties where needed.

(defvar syntax-propertize-function nil
  ;; Rather than a -functions hook, this is a -function because it's easier
  ;; to do a single scan than several scans: with multiple scans,
[more commentary explaining issues with multiple scans that have been
avoided...]

Looks like a normal defvar to me.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-18 15:07     ` Noam Postavsky
@ 2016-06-18 17:12       ` Alan Mackenzie
  2016-06-18 18:13         ` Stefan Monnier
  2016-06-19 12:31         ` Dmitry Gutov
  0 siblings, 2 replies; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-18 17:12 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: emacs-devel

Hello, Noam.

On Sat, Jun 18, 2016 at 11:07:03AM -0400, Noam Postavsky wrote:
> On Sat, Jun 18, 2016 at 3:03 AM, Andreas Röhler
> <andreas.roehler@online.de> wrote:
> > Anyone who looks how syntax-propertize-function is introduced in source,
> > --given some decent understanding of Emacs Lisp-- will see the bunch of
> > issues. There is nothing to explain here.

> Ah, is this like The Emperor's New Clothes? Anyone who doesn't see the
> issues lacks a decent understanding of Emacs Lisp? I'll play the kid who
> doesn't know any better. Here's the source where
> syntax-propertize-function is introduced:

I see the issues here, and I avoid syntax-propertize and syntax-ppss
wherever possible.

> ;;; Applying syntax-table properties where needed.

> (defvar syntax-propertize-function nil
>   ;; Rather than a -functions hook, this is a -function because it's easier
>   ;; to do a single scan than several scans: with multiple scans,
> [more commentary explaining issues with multiple scans that have been
> avoided...]

> Looks like a normal defvar to me.

Let's have that excerpt in full:

  ;; Rather than a -functions hook, this is a -function because it's easier
  ;; to do a single scan than several scans: with multiple scans, one cannot
  ;; assume that the text before point has been propertized, so syntax-ppss
  ;; gives unreliable results (and stores them in its cache to boot, so we'd
  ;; have to flush that cache between each function, and we couldn't use
  ;; syntax-ppss-flush-cache since that would not only flush the cache but also
  ;; reset syntax-propertize--done which should not be done in this case).

The issue is not caused by multiple scans.  CC Mode (in particular C++
Mode) does several scans to apply syntax-table text properties, simply
because that's a natural thing to do (there being several distinct
reasons for these properties being applied).  That way, it's easier to
debug, easier to understand, and less error prone.

It is not "easier to do a single scan".  It is simply that a single scan
is virtually forced if one uses the syntax-ppss/syntax-propertize
mechanism, because:
(i) parse-partial-sexp is very likely to be needed for calculating the
  syntax-table text properties.  Because only some s-t properties will
  have been applied, precise control of p-p-s is needed.  I think users
  of syntax-ppss hold that using parse-partial-sexp directly is a Bad
  Thing.
(ii) The only way syntax-propertize gets called is by calling
  syntax-ppss for some position below where one needs the properties
  applying.

Additionally, syntax-ppss hasn't conformed with its specification w.r.t.
narrowed buffers for a long time, if ever.  (See bug #22983.)

Additionally 2, when syntax-propertize-function is non-nil, all
syntax-table text properties beyond point are wiped out by a change,
causing the need to regenerate them.  This is not always necessary, and
might be expensive in run time (see, again, C++ Mode).

In summary, the syntax-ppss/syntax-propertize mechanism might be a good
choice for implementing syntax-table text properties, but it is not
necessarily so.

-- 
Alan Mackenzie (Nuremberg, Germany).
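
For readers following the thread, here is a minimal sketch of the
syntax-propertize mechanism under discussion. The mode name
`my-demo-mode' and its single rule are hypothetical and not taken from
any real mode; the sketch only assumes the standard
`syntax-propertize-rules' macro and `define-derived-mode'.

```elisp
;; Illustrative sketch only; `my-demo-mode' and its rule are hypothetical.
(require 'syntax)

(defvar my-demo-mode-syntax-table
  (let ((st (make-syntax-table)))
    (modify-syntax-entry ?# "<" st)    ; `#' starts a comment ...
    (modify-syntax-entry ?\n ">" st)   ; ... which ends at end of line
    st)
  "Syntax table for the hypothetical `my-demo-mode'.")

(define-derived-mode my-demo-mode prog-mode "Demo"
  "Hypothetical major mode in which the `#' of `$#' is not a comment starter."
  (setq-local syntax-propertize-function
              (syntax-propertize-rules
               ;; Give the `#' in `$#' punctuation syntax via a
               ;; `syntax-table' text property.
               ("\\$\\(#\\)" (1 ".")))))
```

Nothing is propertized when the mode is switched on. The properties are
applied lazily, up to at least POS, when something calls
(syntax-propertize POS) or (syntax-ppss POS).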




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-18 17:12       ` Alan Mackenzie
@ 2016-06-18 18:13         ` Stefan Monnier
  2016-06-18 19:41           ` Noam Postavsky
  2016-06-19 12:31         ` Dmitry Gutov
  1 sibling, 1 reply; 95+ messages in thread
From: Stefan Monnier @ 2016-06-18 18:13 UTC (permalink / raw)
  To: emacs-devel

>>>>> "Alan" == Alan Mackenzie <acm@muc.de> writes:
> CC Mode (in particular C++ Mode) does several scans to apply
> syntax-table text properties,
[...]
> That way, it's easier to debug, easier to understand, and
> less error prone.

I rest my case,


        Stefan





* Re: font-lock-syntactic-keywords obsolet?
  2016-06-18 18:13         ` Stefan Monnier
@ 2016-06-18 19:41           ` Noam Postavsky
  2016-06-19  7:12             ` Andreas Röhler
  0 siblings, 1 reply; 95+ messages in thread
From: Noam Postavsky @ 2016-06-18 19:41 UTC (permalink / raw)
  To: emacs-devel

On Sat, Jun 18, 2016 at 2:13 PM, Stefan Monnier
<monnier@iro.umontreal.ca> wrote:
>>>>>> "Alan" == Alan Mackenzie <acm@muc.de> writes:
>> CC Mode (in particular C++ Mode) does several scans to apply
>> syntax-table text properties,
> [...]
>> That way, it's easier to debug, easier to understand, and
>> less error prone.
>
> I rest my case,
>
>
>         Stefan

Your case seems to be as invisible as Andreas Röhler's.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-18 19:41           ` Noam Postavsky
@ 2016-06-19  7:12             ` Andreas Röhler
  2016-06-19 13:21               ` Noam Postavsky
  2016-06-19 14:36               ` Stefan Monnier
  0 siblings, 2 replies; 95+ messages in thread
From: Andreas Röhler @ 2016-06-19  7:12 UTC (permalink / raw)
  To: emacs-devel



On 18.06.2016 21:41, Noam Postavsky wrote:
> On Sat, Jun 18, 2016 at 2:13 PM, Stefan Monnier
> <monnier@iro.umontreal.ca> wrote:
>>>>>>> "Alan" == Alan Mackenzie <acm@muc.de> writes:
>>> CC Mode (in particular C++ Mode) does several scans to apply
>>> syntax-table text properties,
>> [...]
>>> That way, it's easier to debug, easier to understand, and
>>> less error prone.
>> I rest my case,
>>
>>
>>          Stefan
> Your case seems to be as invisible as Andreas Röhler's.
>

The question is if `font-lock-syntactic-keywords' really should be 
declared obsolete.

From the documentation of `syntax-propertize-function':

"The specified function may call `syntax-ppss' on any position
before END, but it should not call `syntax-ppss-flush-cache',
which means that it should not call `syntax-ppss' on some
position and later modify the buffer on some earlier position."

So "on any position" but not "on some position"?

IMHO that's not ready.





* Re: font-lock-syntactic-keywords obsolet?
  2016-06-18 17:12       ` Alan Mackenzie
  2016-06-18 18:13         ` Stefan Monnier
@ 2016-06-19 12:31         ` Dmitry Gutov
  2016-06-19 13:31           ` Alan Mackenzie
  1 sibling, 1 reply; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-19 12:31 UTC (permalink / raw)
  To: Alan Mackenzie, Noam Postavsky; +Cc: emacs-devel

On 06/18/2016 08:12 PM, Alan Mackenzie wrote:

> CC Mode (in particular C++
> Mode) does several scans to apply syntax-table text properties, simply
> because that's a natural thing to do (there being several distinct
> reasons for these properties being applied).

How do you avoid having editing operations be O(buffer length)?




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19  7:12             ` Andreas Röhler
@ 2016-06-19 13:21               ` Noam Postavsky
  2016-06-19 14:03                 ` Andreas Röhler
  2016-06-19 14:36               ` Stefan Monnier
  1 sibling, 1 reply; 95+ messages in thread
From: Noam Postavsky @ 2016-06-19 13:21 UTC (permalink / raw)
  To: emacs-devel

On Sun, Jun 19, 2016 at 3:12 AM, Andreas Röhler
<andreas.roehler@online.de> wrote:
> The question is if `font-lock-syntactic-keywords' really should be declared
> obsolete.
>
> From docu of `syntax-propertize-function':
>
> "The specified function may call `syntax-ppss' on any position
> before END, but it should not call `syntax-ppss-flush-cache',
> which means that it should not call `syntax-ppss' on some
> position and later modify the buffer on some earlier position."
>
> So "on any position" but not "on some position"?
>
> IMHO that's not ready.
>
>

I'm not sure if you're objecting to the restriction itself, or just
the phrasing, e.g. would it be okay like this?

"The specified function may call `syntax-ppss' on any position
before END, but it should not call `syntax-ppss-flush-cache',
meaning that it should not modify the buffer at positions earlier
than those on which it called `syntax-ppss'."
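
To make the restriction concrete, here is a minimal sketch of a
function that stays within it. The function name and the apostrophe
rule are hypothetical; the sketch assumes a syntax table in which the
apostrophe is a string delimiter. It calls `syntax-ppss' at a position
and then changes properties only at that position, never earlier.

```elisp
;; Illustrative sketch only; the function and its rule are hypothetical.
(require 'syntax)

(defun my-demo-propertize-apostrophes (start end)
  "Hypothetical `syntax-propertize-function' for START..END.
Give punctuation syntax to an apostrophe that follows a word
character, unless it lies inside a string or comment (where it
may be a closing delimiter)."
  (goto-char start)
  (while (re-search-forward "\\w\\('\\)" end t)
    (let ((pos (match-beginning 1)))
      ;; `syntax-ppss' moves point, hence the `save-excursion'.
      ;; POS is before END, and the only buffer change that follows
      ;; is at POS itself, not at an earlier position.
      (unless (nth 8 (save-excursion (syntax-ppss pos)))
        (put-text-property pos (1+ pos)
                           'syntax-table (string-to-syntax "."))))))
```

A mode would install it with
(setq-local syntax-propertize-function #'my-demo-propertize-apostrophes).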




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 12:31         ` Dmitry Gutov
@ 2016-06-19 13:31           ` Alan Mackenzie
  2016-06-19 13:48             ` Dmitry Gutov
  2016-06-20  3:08             ` Stefan Monnier
  0 siblings, 2 replies; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-19 13:31 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel, Noam Postavsky

Hello, Dmitry.

On Sun, Jun 19, 2016 at 03:31:23PM +0300, Dmitry Gutov wrote:
> On 06/18/2016 08:12 PM, Alan Mackenzie wrote:

> > CC Mode (in particular C++
> > Mode) does several scans to apply syntax-table text properties, simply
> > because that's a natural thing to do (there being several distinct
> > reasons for these properties being applied).

> How do you avoid having editing operations be O(buffer length)?

Simply because these scans are done on the change region of
before/after-change-functions (as expanded), not on the entire tail of
the buffer.  In CC Mode, all the uses of the syntax-table property are
"local"; a buffer change in an earlier part of the buffer (aside from
crude syntactic things like inserting unclosed comment/string
delimiters) cannot affect the properties on the current part of the
buffer.

-- 
Alan Mackenzie (Nuremberg, Germany).
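
The approach described above, re-scanning only the (expanded) change
region from a change hook, can be sketched roughly as follows. This is
not CC Mode's actual code: the function names are hypothetical, the
region is simply widened to whole lines, and the only properties
handled are paren syntax on `<' and `>'.

```elisp
;; Illustrative sketch only; not CC Mode's implementation.
(defun my-demo-mark-angle-brackets (beg end)
  "Give `<' and `>' between BEG and END paren syntax (hypothetical)."
  (save-excursion
    (goto-char beg)
    (while (re-search-forward "[<>]" end t)
      (put-text-property (match-beginning 0) (match-end 0)
                         'syntax-table
                         (string-to-syntax
                          (if (eq (char-before) ?<) "(>" ")<"))))))

(defun my-demo-after-change (beg end _old-len)
  "Re-apply syntax-table properties on the lines touched by BEG..END."
  (save-match-data
    ;; Expand the change region to whole lines.
    (let ((beg (save-excursion (goto-char beg) (line-beginning-position)))
          (end (save-excursion (goto-char end) (line-end-position))))
      (with-silent-modifications
        ;; Only this region is cleared and re-scanned; properties
        ;; elsewhere in the buffer are left alone.
        (remove-text-properties beg end '(syntax-table nil))
        (my-demo-mark-angle-brackets beg end)))))
```

A mode using this would add the function buffer-locally with
(add-hook 'after-change-functions #'my-demo-after-change nil t) and set
`parse-sexp-lookup-properties' to t so that the properties take effect.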




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 13:31           ` Alan Mackenzie
@ 2016-06-19 13:48             ` Dmitry Gutov
  2016-06-19 14:59               ` Alan Mackenzie
  2016-06-20  3:08             ` Stefan Monnier
  1 sibling, 1 reply; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-19 13:48 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel, Noam Postavsky

On 06/19/2016 04:31 PM, Alan Mackenzie wrote:

> Simply because these scans are done on the change region of
> before/after-change-functions (as expanded), not on the entire tail of
> the buffer.  In CC Mode, all the uses of the syntax-table property are
> "local"; a buffer change in an earlier part of the buffer (aside from
> crude syntactic things like inserting unclosed comment/string
> delimiters) cannot affect the properties on the current part of the
> buffer.

So if I remove a closing brace somewhere near the beginning of the 
buffer, it still can't affect text properties near the end?

If so, what if I remove a closing double-quote instead?




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 13:21               ` Noam Postavsky
@ 2016-06-19 14:03                 ` Andreas Röhler
  0 siblings, 0 replies; 95+ messages in thread
From: Andreas Röhler @ 2016-06-19 14:03 UTC (permalink / raw)
  To: emacs-devel



On 19.06.2016 15:21, Noam Postavsky wrote:
> On Sun, Jun 19, 2016 at 3:12 AM, Andreas Röhler
> <andreas.roehler@online.de> wrote:
>> The question is if `font-lock-syntactic-keywords' really should be declared
>> obsolete.
>>
>>  From docu of `syntax-propertize-function':
>>
>> "The specified function may call `syntax-ppss' on any position
>> before END, but it should not call `syntax-ppss-flush-cache',
>> which means that it should not call `syntax-ppss' on some
>> position and later modify the buffer on some earlier position."
>>
>> So "on any position" but not "on some position"?
>>
>> IMHO that's not ready.
>>
>>
> I'm not sure if you're objecting to the restriction itself, or just
> the phrasing, e.g. would it be okay like this?
>
> "The specified function may call `syntax-ppss' on any position
> before END, but it should not call `syntax-ppss-flush-cache',
> meaning that it should not modify the buffer at positions earlier
> than those on which it called `syntax-ppss'."
>

If some specific keywords should be fontified, why should `syntax-ppss'
interfere here at all?

`font-lock-syntactic-keywords' is documented as
   "A list of the syntactic keywords to put syntax properties on.
..."

which is straightforward. Defining or changing these keywords is well
established and is all that is needed in many cases. Users are free to
specify what a syntactic keyword should be.

In contrast, let's look at the proposed replacement:

"Mode-specific function to apply `syntax-table' text properties."

Aren't specifying keywords and providing a function which applies
syntax-table properties different things?
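
For comparison, here is a minimal sketch of one rule written both ways.
Both variable names are hypothetical, and the rule itself (punctuation
syntax for the `#' in `&#') is made up for illustration. The point is
that the `syntax-propertize-rules' macro accepts a similar declarative
rule list and turns it into a function.

```elisp
;; Illustrative sketch only; both variable names are hypothetical.
(require 'syntax)

;; Obsolete style: a declarative list in the format expected by
;; `font-lock-syntactic-keywords', (MATCHER (SUBEXP SYNTAX) ...).
(defvar my-demo-syntactic-keywords
  '(("&\\(#\\)" (1 ".")))
  "Hypothetical value for `font-lock-syntactic-keywords'.")

;; Replacement style: the same rule, compiled into a function of
;; (START END) by the `syntax-propertize-rules' macro.
(defvar my-demo-syntax-propertize-function
  (syntax-propertize-rules
   ("&\\(#\\)" (1 ".")))
  "Hypothetical value for `syntax-propertize-function'.")
```

A mode would then use
(setq-local syntax-propertize-function my-demo-syntax-propertize-function).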






* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19  7:12             ` Andreas Röhler
  2016-06-19 13:21               ` Noam Postavsky
@ 2016-06-19 14:36               ` Stefan Monnier
  2016-06-19 15:12                 ` Alan Mackenzie
  2016-06-19 15:27                 ` Andreas Röhler
  1 sibling, 2 replies; 95+ messages in thread
From: Stefan Monnier @ 2016-06-19 14:36 UTC (permalink / raw)
  To: emacs-devel

> The question is if `font-lock-syntactic-keywords' really should be
> declared obsolete.
> From docu of `syntax-propertize-function':
> "The specified function may call `syntax-ppss' on any position
> before END, but it should not call `syntax-ppss-flush-cache',
> which means that it should not call `syntax-ppss' on some
> position and later modify the buffer on some earlier position."
> So "on any position" but not "on some position"?
> IMHO that's not ready.

That presumes that font-lock-syntactic-keywords does not suffer from
tricky interactions with syntax-ppss.  Of course, that's not the case,
instead these are simply undocumented.


        Stefan





* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 13:48             ` Dmitry Gutov
@ 2016-06-19 14:59               ` Alan Mackenzie
  2016-06-19 15:07                 ` Dmitry Gutov
  2016-06-20  6:40                 ` Andreas Röhler
  0 siblings, 2 replies; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-19 14:59 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Noam Postavsky, emacs-devel

Hello, Dmitry.

On Sun, Jun 19, 2016 at 04:48:05PM +0300, Dmitry Gutov wrote:
> On 06/19/2016 04:31 PM, Alan Mackenzie wrote:

> > Simply because these scans are done on the change region of
> > before/after-change-functions (as expanded), not on the entire tail of
> > the buffer.  In CC Mode, all the uses of the syntax-table property are
> > "local"; a buffer change in an earlier part of the buffer (aside from
> > crude syntactic things like inserting unclosed comment/string
> > delimiters) cannot affect the properties on the current part of the
> > buffer.

> So if I remove a closing brace somewhere near the beginning of the 
> buffer, it still can't affect text properties near the end?

No, it can't.  If you remove (from a C++ buffer) a terminating template
delimiter (">"), that will have the effect of removing the syntax-table
text property from its former matching opener ("<").

> If so, what if I remove a closing double-quote instead?

Good question.  I put printf's (in effect) into the three routines which
can expand the scanning region in the after-change-function, and
removing a closing double-quote doesn't cause that region to be expanded
beyond the current line.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 14:59               ` Alan Mackenzie
@ 2016-06-19 15:07                 ` Dmitry Gutov
  2016-06-19 15:18                   ` Alan Mackenzie
  2016-06-20  6:40                 ` Andreas Röhler
  1 sibling, 1 reply; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-19 15:07 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Noam Postavsky, emacs-devel

On 06/19/2016 05:59 PM, Alan Mackenzie wrote:

> No, it can't.  If you remove (from a C++ buffer) a terminating template
> delimiter (">"), that will have the effect of removing the syntax-table
> text property from its former matching opener ("<").

What if you remove the opener first? And then the closer? Will it try to 
find another opener then?

>> If so, what if I remove a closing double-quote instead?
>
> Good question.  I put printf's (in effect) into the three routines which
> can expand the scanning region in the after-change-function, and
> removing a closing double-quote doesn't cause that region to be expanded
> beyond the current line.

What about strings like

std::cout << "\
This is a\n\
multiline\n\
string.\
";

or

const char* s1 = R"foo(
Hello
World
)foo";

?

The latter being the case that many languages have to deal with: 
multi-line string literals.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 14:36               ` Stefan Monnier
@ 2016-06-19 15:12                 ` Alan Mackenzie
  2016-06-19 15:18                   ` Dmitry Gutov
  2016-06-20  2:58                   ` Stefan Monnier
  2016-06-19 15:27                 ` Andreas Röhler
  1 sibling, 2 replies; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-19 15:12 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello, Stefan.

On Sun, Jun 19, 2016 at 10:36:52AM -0400, Stefan Monnier wrote:
> > The question is if `font-lock-syntactic-keywords' really should be
> > declared obsolete.
> > From docu of `syntax-propertize-function':
> > "The specified function may call `syntax-ppss' on any position
> > before END, but it should not call `syntax-ppss-flush-cache',
> > which means that it should not call `syntax-ppss' on some
> > position and later modify the buffer on some earlier position."
> > So "on any position" but not "on some position"?
> > IMHO that's not ready.

> That presumes that font-lock-syntactic-keywords does not suffer from
> tricky interactions with syntax-ppss.  Of course, that's not the case,
> instead these are simply undocumented.

Wouldn't the answer be simply to document these?

I may be wrong here, but aren't these interactions only tricky when
syntax-propertize-function is not nil?

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 15:12                 ` Alan Mackenzie
@ 2016-06-19 15:18                   ` Dmitry Gutov
  2016-06-19 15:26                     ` Alan Mackenzie
  2016-06-20  2:58                   ` Stefan Monnier
  1 sibling, 1 reply; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-19 15:18 UTC (permalink / raw)
  To: Alan Mackenzie, Stefan Monnier; +Cc: emacs-devel

On 06/19/2016 06:12 PM, Alan Mackenzie wrote:

> Wouldn't the answer be simply to document these?

"Simply" is a curse word in software development.

> I may be wrong here, but aren't these interactions only tricky when
> syntax-propertize-function is not nil?

CC Mode doesn't use font-lock-syntactic-keywords either, does it?

Please don't advocate things you don't understand the implications of.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 15:07                 ` Dmitry Gutov
@ 2016-06-19 15:18                   ` Alan Mackenzie
  2016-06-19 15:22                     ` Dmitry Gutov
  0 siblings, 1 reply; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-19 15:18 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel, Noam Postavsky

Hello, Dmitry.

On Sun, Jun 19, 2016 at 06:07:32PM +0300, Dmitry Gutov wrote:
> On 06/19/2016 05:59 PM, Alan Mackenzie wrote:

> > No, it can't.  If you remove (from a C++ buffer) a terminating template
> > delimiter (">"), that will have the effect of removing the syntax-table
> > text property from its former matching opener ("<").

> What if you remove the opener first? And then the closer? Will it try to 
> find another opener then?

> >> If so, what if I remove a closing double-quote instead?

> > Good question.  I put printf's (in effect) into the three routines which
> > can expand the scanning region in the after-change-function, and
> > removing a closing double-quote doesn't cause that region to be expanded
> > beyond the current line.

> What about strings like

> std::cout << "\
> This is a\n\
> multiline\n\
> string.\
> ";

> or

> const char* s1 = R"foo(
> Hello
> World
> )foo";

> ?

> The latter being the case that many languages have to deal with: 
> multi-line string literals.

The region which will be scanned for the application of syntax-table
properties is expanded to the "logical" line containing the string.  The
critical thing is it is not expanded to EOB, even when removing the
terminating double quote.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 15:18                   ` Alan Mackenzie
@ 2016-06-19 15:22                     ` Dmitry Gutov
  2016-06-19 15:34                       ` Alan Mackenzie
  2016-06-20  3:14                       ` Stefan Monnier
  0 siblings, 2 replies; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-19 15:22 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel, Noam Postavsky

On 06/19/2016 06:18 PM, Alan Mackenzie wrote:

> The region which will be scanned for the application of syntax-table
> properties is expanded to the "logical" line containing the string.  The
> critical thing is it is not expanded to EOB, even when removing the
> terminating double quote.

What's a "logical line"?

Anyway, if all your scans are logically bound within the edit area, I 
see no reason why you can't implement a syntax-propertize-function, 
multiple scans or no.

Maybe you'll need to hand-implement the logic corresponding to 
syntax-propertize-rules, but that's just work.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 15:18                   ` Dmitry Gutov
@ 2016-06-19 15:26                     ` Alan Mackenzie
  2016-06-19 15:52                       ` Stefan Monnier
  2016-06-19 15:53                       ` Dmitry Gutov
  0 siblings, 2 replies; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-19 15:26 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Stefan Monnier, emacs-devel

Hello, Dmitry.

On Sun, Jun 19, 2016 at 06:18:20PM +0300, Dmitry Gutov wrote:
> On 06/19/2016 06:12 PM, Alan Mackenzie wrote:

> > Wouldn't the answer be simply to document these?

> "Simply" is a curse word in software development.

Then I'll go and scrub my tongue out.  :-)

> > I may be wrong here, but aren't these interactions only tricky when
> > syntax-propertize-function is not nil?

> CC Mode doesn't use font-lock-syntactic-keywords either, does it?

No it doesn't, but it was under consideration at one stage.
font-lock-syntactic-keywords can be a good solution if the syntax-table
properties are only needed for fontification, which isn't the case in CC
Mode (and hasn't been the case since (at least) syntax-table properties
started being applied to C++ template delimiters).

> Please don't advocate things you don't understand the implications of.

Have I been doing that?  I don't think so.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 14:36               ` Stefan Monnier
  2016-06-19 15:12                 ` Alan Mackenzie
@ 2016-06-19 15:27                 ` Andreas Röhler
  2016-06-19 15:51                   ` Stefan Monnier
  1 sibling, 1 reply; 95+ messages in thread
From: Andreas Röhler @ 2016-06-19 15:27 UTC (permalink / raw)
  To: emacs-devel



On 19.06.2016 16:36, Stefan Monnier wrote:
>> The question is if `font-lock-syntactic-keywords' really should be
>> declared obsolete.
>>  From docu of `syntax-propertize-function':
>> "The specified function may call `syntax-ppss' on any position
>> before END, but it should not call `syntax-ppss-flush-cache',
>> which means that it should not call `syntax-ppss' on some
>> position and later modify the buffer on some earlier position."
>> So "on any position" but not "on some position"?
>> IMHO that's not ready.
> That presumes that font-lock-syntactic-keywords does not suffer from
> tricky interactions with syntax-ppss.  Of course, that's not the case,
> instead these are simply undocumented.
>
>
>          Stefan
>

So let's clean that up. Separate the definition of a rule from the code
which applies it.







* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 15:22                     ` Dmitry Gutov
@ 2016-06-19 15:34                       ` Alan Mackenzie
  2016-06-19 15:50                         ` Dmitry Gutov
  2016-06-19 23:59                         ` Stefan Monnier
  2016-06-20  3:14                       ` Stefan Monnier
  1 sibling, 2 replies; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-19 15:34 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel, Noam Postavsky

Hello, Dmitry.

On Sun, Jun 19, 2016 at 06:22:09PM +0300, Dmitry Gutov wrote:
> On 06/19/2016 06:18 PM, Alan Mackenzie wrote:

> > The region which will be scanned for the application of syntax-table
> > properties is expanded to the "logical" line containing the string.  The
> > critical thing is it is not expanded to EOB, even when removing the
> > terminating double quote.

> What's a "logical line"?

What you get when you resolve the escaped new lines, or the non-escaped
new lines inside a C++ raw string.

> Anyway, if all your scans are logically bound within the edit area, I 
> see no reason why you can't implement a syntax-propertize-function, 
> multiple scans or no.

There's no reason to do so, and it would cost a lot of time.  The
syntax-propertize-function stuff just isn't a good way of implementing
CC Mode.

A critical reason, which I've told you before, is that on any buffer
change, the syntax-propertize-function mechanism blasts all s-t
properties out of existence from the point the change is made onwards.
This is wasteful of run-time, given that these properties are quite
expensive to apply.

> Maybe you'll need to hand-implement the logic corresponding to 
> syntax-propertize-rules, but that's just work.

In this context, both "just" and "work" are curse words.  :-)

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 15:34                       ` Alan Mackenzie
@ 2016-06-19 15:50                         ` Dmitry Gutov
  2016-06-19 17:15                           ` Alan Mackenzie
  2016-06-19 23:59                         ` Stefan Monnier
  1 sibling, 1 reply; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-19 15:50 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel, Noam Postavsky

On 06/19/2016 06:34 PM, Alan Mackenzie wrote:

>> What's a "logical line"?
>
> What you get when you resolve the escaped new lines, or the non-escaped
> new lines inside a C++ raw string.

But when the raw string is unclosed, it stretches until the end of the 
buffer, doesn't it? Hence, the effects of adding or removing a closer 
must affect the buffer until its end.

> There's no reason to do so, and it would cost a lot of time.  The
> syntax-propertize-function stuff just isn't a good way of implementing
> CC Mode.
>
> A critical reason, which I've told you before, is that on any buffer
> change, the syntax-propertize-function mechanism blasts all s-t
> properties out of existence from the point the change is made onwards.
> This is wasteful of run-time, given that these properties are quite
> expensive to apply.

They can't be too expensive, considering other language modes, which do 
use syntax-propertize-function, exhibit fewer performance problems than 
CC Mode, even at the same file sizes.

And if the automatic removal of syntax-table properties would lose 
important information, you could save them to a separate structure.

Anyway, you're welcome to propose an alternative general abstraction for 
the same kind of thing that syntax-propertize does. 
font-lock-syntactic-keywords is definitely not that.

>> Maybe you'll need to hand-implement the logic corresponding to
>> syntax-propertize-rules, but that's just work.
>
> In this context, both "just" and "work" are curse words.  :-)

"just implementation work", then. As opposed to "design and implementation".




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 15:27                 ` Andreas Röhler
@ 2016-06-19 15:51                   ` Stefan Monnier
  0 siblings, 0 replies; 95+ messages in thread
From: Stefan Monnier @ 2016-06-19 15:51 UTC (permalink / raw)
  To: emacs-devel

> So let's cleanup that.

Why would you clean up font-lock-syntactic-keywords rather than cleaning
up syntax-propertize-function?

Do you even know why syntax-propertize-function was added?


        Stefan





* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 15:26                     ` Alan Mackenzie
@ 2016-06-19 15:52                       ` Stefan Monnier
  2016-06-19 15:53                       ` Dmitry Gutov
  1 sibling, 0 replies; 95+ messages in thread
From: Stefan Monnier @ 2016-06-19 15:52 UTC (permalink / raw)
  To: emacs-devel

> font-lock-syntactic-keywords can be a good solution if the syntax-table
> properties are only needed for fontification, which isn't the case in CC
> Mode

It's never the case.  Hence the need for something that works regardless
of font-lock.


        Stefan





* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 15:26                     ` Alan Mackenzie
  2016-06-19 15:52                       ` Stefan Monnier
@ 2016-06-19 15:53                       ` Dmitry Gutov
  1 sibling, 0 replies; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-19 15:53 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Stefan Monnier, emacs-devel

On 06/19/2016 06:26 PM, Alan Mackenzie wrote:

> font-lock-syntactic-keywords can be a good solution if the syntax-table
> properties are only needed for fontification, which isn't the case in CC
> Mode (and hasn't been the case since (at least) syntax-table properties
> started being applied to C++ template delimiters).

I don't know of many (any?) modes where that would be the case.

Maybe the most trivial ones, but it's better not to optimize for those.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 15:50                         ` Dmitry Gutov
@ 2016-06-19 17:15                           ` Alan Mackenzie
  2016-06-19 17:55                             ` Dmitry Gutov
                                               ` (2 more replies)
  0 siblings, 3 replies; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-19 17:15 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel, Noam Postavsky

On Sun, Jun 19, 2016 at 06:50:26PM +0300, Dmitry Gutov wrote:
> On 06/19/2016 06:34 PM, Alan Mackenzie wrote:

> >> What's a "logical line"?

> > What you get when you resolve the escaped new lines, or the non-escaped
> > new lines inside a C++ raw string.

> But when the raw string is unclosed, it stretches until the end of the 
> buffer, doesn't it? Hence, the effects of adding or removing a closer 
> must affect the buffer until its end.

Yes, that is indeed the case.  Sorry I didn't spot that earlier.  The
same applies to an ordinary string, too, provided there are enough
escaped new lines.

> > There's no reason to do so, and it would cost a lot of time.  The
> > syntax-propertize-function stuff just isn't a good way of implementing
> > CC Mode.

> > A critical reason, which I've told you before, is that on any buffer
> > change, the syntax-propertize-function mechanism blasts all s-t
> > properties out of existence from the point the change is made onwards.
> > This is wasteful of run-time, given that these properties are quite
> > expensive to apply.

> They can't be too expensive, considering other language modes, which do 
> use syntax-propertize-function, exhibit fewer performance problems than 
> CC Mode, even at the same file sizes.

The speed problems with CC Mode are not to do with the way it puts text
properties on characters.  Anyway, that's recently got better.  :-)

> And if the automatic removal of syntax-table properties would lose 
> important information, you could save them to a separate structure.

I could, but why go to all the hassle when handling these properties in
before/after-change-functions works so well?

Consider the following non-unusual case.  In C++ Mode we have nested
template delimiters, thusly:

    A       B     C          D
    <       <     >          >

They each have parentheses syntax-table text properties such that A
matches D and B matches C.  You can, for example put point at A, do
C-M-n, and you will get to after D.

Suppose you delete the < at A, and move point to D.  What will now
happen if you do C-M-p?  At the moment, D no longer has a s-t property,
so it will not (mis)match any other character with paren syntax.

With a syntax-propertize-function instead of the current
before/after-change-functions, I simply can't picture what would happen.
The syntax-table properties would get removed from B, C, and D at some
indeterminate time.  You'd then have a race condition as to whether D
would match or mismatch some indeterminate character before A.

> Anyway, you're welcome to propose an alternative general abstraction for 
> the same kind of thing that syntax-propertize does. 
> font-lock-syntactic-keywords is definitely not that.

No, it's not.  Neither is syntax-propertize-function.  Why do we need
such a general abstraction, anyway?  before/after-change-functions
already form a good scheme for applying and removing these properties.

> >> Maybe you'll need to hand-implement the logic corresponding to
> >> syntax-propertize-rules, but that's just work.

> > In this context, both "just" and "work" are curse words.  :-)

> "just implementation work", then. As opposed to "design and implementation".

Changing CC Mode to use syntax-propertize-function would require a
substantial amount of design work, assuming such were possible.  There
doesn't seem to be a good reason to do this.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 17:15                           ` Alan Mackenzie
@ 2016-06-19 17:55                             ` Dmitry Gutov
  2016-06-19 22:20                               ` Dmitry Gutov
                                                 ` (2 more replies)
  2016-06-20  0:06                             ` Stefan Monnier
  2016-06-20  4:33                             ` Stefan Monnier
  2 siblings, 3 replies; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-19 17:55 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel, Noam Postavsky

On 06/19/2016 08:15 PM, Alan Mackenzie wrote:

> Yes, that is indeed the case.  Sorry I didn't spot that earlier.  The
> same applies to an ordinary string, too, provided there are enough
> escaped new lines.

So some operations in CC Mode are indeed O(buffer length)? That sounds 
like a problem.

>>> A critical reason, which I've told you before, is that on any buffer
>>> change, the syntax-propertize-function mechanism blasts all s-t
>>> properties out of existence from the point the change is made onwards.
>>> This is wasteful of run-time, given that these properties are quite
>>> expensive to apply.
>
>> They can't be too expensive, considering other language modes, which do
>> use syntax-propertize-function, exhibit fewer performance problems than
>> CC Mode, even at the same file sizes.
>
> The speed problems with CC Mode are not to do with the way it puts text
> properties on characters.  Anyway, that's recently got better.  :-)

My point is, the "critical reason" can't be too critical, considering 
you've been living with bigger problems for quite a while.

>> And if the automatic removal of syntax-table properties would lose
>> important information, you could save them to a separate structure.
>
> I could, but why go to all the hassle when handling these properties in
> before/after-change-functions works so well?

Is O(buffer length) "working well"?

> Consider the following non-unusual case.  In C++ Mode we have nested
> template delimiters, thusly:
>
>     A       B     C          D
>     <       <     >          >
>
> They each have parentheses syntax-table text properties such that A
> matches D and B matches C.  You can, for example put point at A, do
> C-M-n, and you will get to after D.
>
> Suppose you delete the < at A, and move point to D.  What will now
> happen if you do C-M-p?

Scan error?

> With a syntax-propertize-function instead of the current
> before/after-change-functions, I simply can't picture what would happen.

The same.

a) By the time you move point to D, font-lock has most likely run, and 
the current visible area of the buffer is already syntax-propertized 
(this is how this problem was solved in Emacs <25).

b) In addition to that, scan_lists now applies syntax-table properties 
by calling syntax-propertize-function

> Why do we need
> such a general abstraction, anyway?  before/after-change-functions
> already form a good scheme for applying and removing these properties.

Because just by the virtue of having before/after-change-functions, we 
can be sure that a given position is syntax-propertized.

The only way we could, without adding additional abstraction, is by 
agreeing that all buffers must be syntax-propertized in their entirety 
after before/after-change-functions run. And that is just too damn wasteful.

> Changing CC Mode to use syntax-propertize-function would require a
> substantial amount of design work, assuming such were possible.  There
> doesn't seem to be a good reason to do this.

One reason would be for you to get a handle on what it does, and the 
design benefits that follow. Maybe you would discover an even better 
design along the way, who knows.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 17:55                             ` Dmitry Gutov
@ 2016-06-19 22:20                               ` Dmitry Gutov
  2016-06-20 10:22                               ` Alan Mackenzie
  2016-06-20 10:58                               ` Alan Mackenzie
  2 siblings, 0 replies; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-19 22:20 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Noam Postavsky, emacs-devel

On 06/19/2016 08:55 PM, Dmitry Gutov wrote:

Sorry,

> Because just by the virtue of having before/after-change-functions, we
> can be sure that a given position is syntax-propertized.
   ^
    can't




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 15:34                       ` Alan Mackenzie
  2016-06-19 15:50                         ` Dmitry Gutov
@ 2016-06-19 23:59                         ` Stefan Monnier
  1 sibling, 0 replies; 95+ messages in thread
From: Stefan Monnier @ 2016-06-19 23:59 UTC (permalink / raw)
  To: emacs-devel

> A critical reason, which I've told you before, is that on any buffer
> change, the syntax-propertize-function mechanism blasts all s-t
> properties out of existence from the point the change is made onwards.
> This is wasteful of run-time, given that these properties are quite
> expensive to apply.

Normally, properties are only applied/needed up to window-end, so
"blasts all s-t properties out of existence from the point the change is
made onwards" really isn't a big deal at all.


        Stefan





* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 17:15                           ` Alan Mackenzie
  2016-06-19 17:55                             ` Dmitry Gutov
@ 2016-06-20  0:06                             ` Stefan Monnier
  2016-06-20 11:03                               ` Alan Mackenzie
  2016-06-20  4:33                             ` Stefan Monnier
  2 siblings, 1 reply; 95+ messages in thread
From: Stefan Monnier @ 2016-06-20  0:06 UTC (permalink / raw)
  To: emacs-devel

> With a syntax-propertize-function instead of the current
> before/after-change-functions, I simply can't picture what would happen.

I'll help you.

> The syntax-table properties would get removed from B, C, and D at some
> indeterminate time.

Indeed (tho I could tell you exactly when, but the abstraction provided
by syntax-* doesn't depend on that).
But right when you do the buffer modification, the text after the change
is immediately marked as "out-of-date".

> You'd then have a race condition as to whether D
> would match or mismatch some indeterminate character before A.

No: as soon as you need to look at the `syntax-table' property, the
parts marked as "out of date" will have their outdated properties
removed+reapplied, so you'll be sure to get up-to-date properties at that
time.
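
A rough illustration of that laziness, as a sketch only (it is not code
from syntax.el, and `my-syntax-class-at' is a made-up helper name): code
that wants to inspect the `syntax-table' property first calls
`syntax-propertize' up to the position of interest, which re-applies
whatever was marked out of date before that position.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-syntax-class-at (pos)
  "Return the syntax class of the character at POS.
Make sure `syntax-table' text properties are current up to POS first."
  ;; Re-applies properties on any out-of-date text before POS+1;
  ;; does nothing if that text has already been propertized.
  (syntax-propertize (min (1+ pos) (point-max)))
  ;; `syntax-after' only honours the text property when this is non-nil.
  (let ((parse-sexp-lookup-properties t))
    (syntax-class (syntax-after pos))))
```

The caller never needs to know when the stale properties were removed,
only that they are current once `syntax-propertize' has returned.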


        Stefan





* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 15:12                 ` Alan Mackenzie
  2016-06-19 15:18                   ` Dmitry Gutov
@ 2016-06-20  2:58                   ` Stefan Monnier
  2016-06-20 11:57                     ` Alan Mackenzie
  1 sibling, 1 reply; 95+ messages in thread
From: Stefan Monnier @ 2016-06-20  2:58 UTC (permalink / raw)
  To: emacs-devel

>> > The question is if `font-lock-syntactic-keywords' really should be
>> > declared obsolete.
[...]
>> That presumes that font-lock-syntactic-keywords does not suffer from
>> tricky interactions with syntax-ppss.  Of course, that's not the case,
>> instead these are simply undocumented.
[...]
> Wouldn't the answer be simply to document these?

The answer to what?  To the question of whether
`font-lock-syntactic-keywords' should be declared obsolete?
Well, I guess you're right that it would make it slightly more obvious
why it should be declared obsolete, but frankly, I don't care that much
if people understand why.  It just has been declared obsolete and if you
want that decision to be reverted you should provide good arguments
against it.  "syntax-propertize is not perfect" is clearly not a good
enough argument, especially when the imperfections hinted at (Andreas
Röhler still hasn't revealed what he considers are those imperfections)
also affect `font-lock-syntactic-keywords'.

If someone doesn't like something about syntax-propertize (or
syntax-ppss), they should file a bug report.  And if they want that
thing to be improved, providing a good concrete case where the
misfeature bites is always a good way to motivate people to fix it.
E.g. the caveat regarding the use of syntax-ppss from syntax-propertize
could be fixed if it's important enough (I don't think there's anything
particularly hard about it).


        Stefan





* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 13:31           ` Alan Mackenzie
  2016-06-19 13:48             ` Dmitry Gutov
@ 2016-06-20  3:08             ` Stefan Monnier
  1 sibling, 0 replies; 95+ messages in thread
From: Stefan Monnier @ 2016-06-20  3:08 UTC (permalink / raw)
  To: emacs-devel

> In CC Mode, all the uses of the syntax-table property are
> "local"; a buffer change in an earlier part of the buffer (aside from
> crude syntactic things like inserting unclosed comment/string
> delimiters) cannot affect the properties on the current part of the
> buffer.

AFAICT, this doesn't actually depend on how the syntax-table property is
applied.  I.e. it's not really a choice of CC-mode's author, it's just
the way the language(s) were defined, and indeed this also applies to
all other languages I can think of (and hence all major modes where we
use syntax-propertize).

And indeed, syntax-propertize doesn't take advantage of this property.
The reason is that it's hard to take advantage of it: you need to reason
hard about which are those "crude syntactic things" (i.e. the exceptions
to the general rule) and then add special code to recognize them, and
optimize those cases.

So far there has not been a performance issue which has justified
investing the effort into this kind of optimization.  I don't see
anything in the design of syntax-propertize which would make it
impossible/hard to retro-fit such an optimization, so if/when the need
arises, we'll probably be able to cook something up.


        Stefan





* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 15:22                     ` Dmitry Gutov
  2016-06-19 15:34                       ` Alan Mackenzie
@ 2016-06-20  3:14                       ` Stefan Monnier
  2016-06-20  3:20                         ` Dmitry Gutov
  1 sibling, 1 reply; 95+ messages in thread
From: Stefan Monnier @ 2016-06-20  3:14 UTC (permalink / raw)
  To: emacs-devel

> Anyway, if all your scans are logically bound within the edit area, I see no
> reason why you can't implement a syntax-propertize-function, multiple scans
> or no.

Actually, he just gave an example where the scan is *not* bound within
the edit area:

    If you remove (from a C++ buffer) a terminating template delimiter
    (">"), that will have the effect of removing the syntax-table text
    property from its former matching opener ("<").

In syntax-propertize, if you decided to provide the same behavior, you'd
do it via syntax-propertize-extend-region-functions.
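
A minimal sketch of such an extend-region function, under a stated
assumption: purely for illustration, a template argument list is taken
never to cross a line that ends in `;', `{' or `}'.  The `my-mode-' name
is hypothetical.  Functions on this hook take START and END and return
either nil (no change) or a cons (NEW-START . NEW-END).

```elisp
;; Hypothetical extend-region function, for illustration only.
(defun my-mode-syntax-propertize-extend-region (start end)
  "Extend START..END back to the line after the previous statement end.
Return nil when no extension is needed, else (NEW-START . NEW-END)."
  (save-excursion
    (goto-char start)
    ;; Look for the closest earlier line ending in `;', `{' or `}'.
    ;; The match ends at a beginning of line, so the result stays
    ;; stable when combined with `syntax-propertize-wholelines'.
    (let ((new-start (if (re-search-backward "[;{}][ \t]*\n" nil t)
                         (match-end 0)
                       (point-min))))
      (when (< new-start start)
        (cons new-start end)))))

;; In the major mode's setup (buffer-local):
;; (add-hook 'syntax-propertize-extend-region-functions
;;           #'my-mode-syntax-propertize-extend-region nil t)
```

The function returns nil once START already sits on such a boundary, so
the hook's repeat-until-stable loop terminates.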


        Stefan





* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20  3:14                       ` Stefan Monnier
@ 2016-06-20  3:20                         ` Dmitry Gutov
  2016-06-20  3:47                           ` Stefan Monnier
  0 siblings, 1 reply; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-20  3:20 UTC (permalink / raw)
  To: Stefan Monnier, emacs-devel

On 06/20/2016 06:14 AM, Stefan Monnier wrote:

> In syntax-propertize, if you decided to provide the same behavior, you'd
> do it via syntax-propertize-extend-region-functions.

Going back before START inside the s-p-f itself and re-propertizing 
also works (like ruby-syntax-propertize does).

But by "not found within the edit area" I rather meant the cases where 
changing something results in re-propertizing text up to eob.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20  3:20                         ` Dmitry Gutov
@ 2016-06-20  3:47                           ` Stefan Monnier
  0 siblings, 0 replies; 95+ messages in thread
From: Stefan Monnier @ 2016-06-20  3:47 UTC (permalink / raw)
  To: emacs-devel

>> In syntax-propertize, if you decided to provide the same behavior, you'd
>> do it via syntax-propertize-extend-region-functions.
> Going back before START inside the s-p-f itself and re-propertizing also
> works (like ruby-syntax-propertize does).

Hmm... maybe it works indeed, but it's definitely hackish and I wouldn't
be surprised if it bites you in some corner cases.


        Stefan





* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 17:15                           ` Alan Mackenzie
  2016-06-19 17:55                             ` Dmitry Gutov
  2016-06-20  0:06                             ` Stefan Monnier
@ 2016-06-20  4:33                             ` Stefan Monnier
  2016-06-20  5:05                               ` John Wiegley
  2 siblings, 1 reply; 95+ messages in thread
From: Stefan Monnier @ 2016-06-20  4:33 UTC (permalink / raw)
  To: emacs-devel

> Changing CC Mode to use syntax-propertize-function would require a
> substantial amount of design work, assuming such were possible.

Actually, my experience with changing existing major modes to use
syntax-propertize is that there really isn't any need for a big redesign.

That includes non-trivial cases such as cperl-mode and nxml-mode.

So far, CC-mode has proved too impenetrable for my motivation [and your
constant rejection of any patch which aligns CC-mode with "all other
major modes" without also fixing a known bug doesn't help of course], but
it really usually boils down to finding those parts of the code which
apply `syntax-table', calling them from a new foo-syntax-propertize
function, and then finding those spots in the code that need
`syntax-table' to be applied, and adding calls to `syntax-propertize'
there (if any).
And then, performing tests to weed out the problems caused by the fact
that I really had no clue about what the code was doing.
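
To make the shape of such a conversion concrete, here is a minimal
sketch for a hypothetical `foo-mode' (not taken from any real mode).
The single rule is a toy assumption: a `<...>' pair with no nested
brackets and no newline inside is treated as a paren pair.

```elisp
;; Hypothetical mode, for illustration only.
(defun foo-syntax-propertize (start end)
  "Apply `syntax-table' properties between START and END."
  (goto-char start)
  (funcall
   (syntax-propertize-rules
    ;; Toy rule: `<' becomes an opener matching `>', and vice versa.
    ("\\(<\\)[^<>\n]*\\(>\\)"
     (1 "(>")
     (2 ")<")))
   start end))

(define-derived-mode foo-mode prog-mode "Foo"
  "Toy major mode demonstrating `syntax-propertize-function'."
  (setq-local syntax-propertize-function #'foo-syntax-propertize))
```

Code elsewhere in the mode that needs the properties at some position
POS would then call (syntax-propertize POS) before looking at them.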

In the case of CC-mode, this would result in losing the benefits of your
efforts to try and avoid unnecessary re-scans, but syntax-propertize's
laziness should hopefully make up for it.

> There doesn't seem to be a good reason to do this.

Admittedly, from the point of view of someone who only maintains
CC-mode, the benefits are probably slim (tho I'm pretty sure the result
would be easier to maintain in the long run).

But from the point of view of Emacs, there are some good reasons to try
and unify the way different major modes work.  E.g. so that there's
a standard agreed upon way for a package to know if point is inside
a comment, without having to know anything about the current major mode.
[ FWIW, CC-mode already interacts well enough with syntax-ppss that this
  particular need already works OK.  ]
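
For instance, a mode-agnostic check could look like the following
sketch (`my-in-comment-p' is a made-up name).  It relies only on
`syntax-ppss', whose parse state holds the comment status in element 4.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-in-comment-p (&optional pos)
  "Return non-nil if POS (default: point) is inside a comment."
  ;; `syntax-ppss' moves point when POS is given, hence the
  ;; `save-excursion'.
  (save-excursion
    (nth 4 (syntax-ppss pos))))
```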


        Stefan





* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20  4:33                             ` Stefan Monnier
@ 2016-06-20  5:05                               ` John Wiegley
  0 siblings, 0 replies; 95+ messages in thread
From: John Wiegley @ 2016-06-20  5:05 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

>>>>> Stefan Monnier <monnier@iro.umontreal.ca> writes:

> But from the point of view of Emacs, there are some good reasons to try and
> unify the way different major modes work. E.g. so that there's a standard
> agreed upon way for a package to know if point is inside a comment, without
> having to know anything about the current major mode. [ FWIW, CC-mode
> already interacts well enough with syntax-ppss that this particular need
> already works OK. ]

It would be nice to refer major modes to a centralized library for core
details like this. It makes it easier for contributors in future to know what
to do, since the major modes could eventually behave in common in this
regard.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 14:59               ` Alan Mackenzie
  2016-06-19 15:07                 ` Dmitry Gutov
@ 2016-06-20  6:40                 ` Andreas Röhler
  1 sibling, 0 replies; 95+ messages in thread
From: Andreas Röhler @ 2016-06-20  6:40 UTC (permalink / raw)
  To: emacs-devel; +Cc: Alan Mackenzie



On 19.06.2016 16:59, Alan Mackenzie wrote:
> [ ... ]
>    If you remove (from a C++ buffer) a terminating template
> delimiter (">"), that will have the effect of removing the syntax-table
> text property from its former matching opener ("<").
>
>

Hmm, is this reasonable?

Quoting from your post later on:

;;;

Consider the following non-unusual case.  In C++ Mode we have nested
template delimiters, thusly:

     A       B     C          D
     <       <     >          >

They each have parentheses syntax-table text properties such that A
matches D and B matches C.  You can, for example put point at A, do
C-M-n, and you will get to after D.

Suppose you delete the < at A, and move point to D.  What will now
happen if you do C-M-p?  At the moment, D no longer has a s-t property,
so it will not (mis)match any other character with paren syntax.

;;;


If balance is broken by removing "A", why should C-M-p work?
It should signal an error instead, i.e. the remaining "D" _should_ mismatch.

Looks like a suitable place to reduce complexity - or am I missing the point?


Cheers,

Andreas





* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 17:55                             ` Dmitry Gutov
  2016-06-19 22:20                               ` Dmitry Gutov
@ 2016-06-20 10:22                               ` Alan Mackenzie
  2016-06-20 11:50                                 ` Dmitry Gutov
  2016-06-20 13:39                                 ` Stefan Monnier
  2016-06-20 10:58                               ` Alan Mackenzie
  2 siblings, 2 replies; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-20 10:22 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Noam Postavsky, emacs-devel

Hello, Dmitry

On Sun, Jun 19, 2016 at 08:55:52PM +0300, Dmitry Gutov wrote:
> On 06/19/2016 08:15 PM, Alan Mackenzie wrote:

> > Yes, that is indeed the case.  Sorry I didn't spot that earlier.  The
> > same applies to an ordinary string, too, provided there are enough
> > escaped new lines.

> So some operations in CC Mode are indeed O(buffer length)? That sounds 
> like a problem.

Actually, an ordinary string isn't a problem, since in practice, only a
tiny number of lines end in escaped NLs.

The code for raw strings is actually very new - it was only committed a
week and a half ago.  By their very nature, unterminated raw strings are
a problem, since the terminating delimiter could occur anywhere later in
the buffer.  The trick has got to be to apply some artificial limit for
after-change processing until that terminating delimiter is inserted.

[ .... ]

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-19 17:55                             ` Dmitry Gutov
  2016-06-19 22:20                               ` Dmitry Gutov
  2016-06-20 10:22                               ` Alan Mackenzie
@ 2016-06-20 10:58                               ` Alan Mackenzie
  2016-06-20 11:12                                 ` Andreas Röhler
  2016-06-20 12:15                                 ` Dmitry Gutov
  2 siblings, 2 replies; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-20 10:58 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Noam Postavsky, emacs-devel

Hello, Dmitry.

On Sun, Jun 19, 2016 at 08:55:52PM +0300, Dmitry Gutov wrote:
> On 06/19/2016 08:15 PM, Alan Mackenzie wrote:

[ .... ]

> > Consider the following non-unusual case.  In C++ Mode we have nested
> > template delimiters, thusly:

> >     A       B     C          D
> >     <       <     >          >

> > They each have parentheses syntax-table text properties such that A
> > matches D and B matches C.  You can, for example put point at A, do
> > C-M-n, and you will get to after D.

> > Suppose you delete the < at A, and move point to D.  What will now
> > happen if you do C-M-p?

> Scan error?

No.  In the current C++ Mode, after-change functions remove the
syntax-table text property from D, ensuring template delimiters balance
always.

> > With a syntax-propertize-function instead of the current
> > before/after-change-functions, I simply can't picture what would happen.

> The same.

> a) By the time you move point to D, font-lock has most likely run, ....

only when font-lock is enabled.  What about when it's not enabled?

You seem to be advocating that CC Mode should lose its deterministic text
property handling and replace it by a "hope it's OK" non-deterministic
handling.  You can understand me not wanting to do this.

> .... and the current visible area of the buffer is already
> syntax-propertized (this is how this problem was solved in Emacs <25).

There's no guarantee that the matching template delimiter for any given
delimiter is on the screen. It might be after the screen, it might be
before it.

> b) In addition to that, scan_lists now applies syntax-table properties 
> by calling syntax-propertize-function

Yuck!  That's the sort of ugly workaround that becomes needed when things
aren't properly thought through from the beginning.  In a proper design,
the low level routines in syntax.c wouldn't even know about s-p-function,
and nor should they.

> > Why do we need such a general abstraction, anyway?
> > before/after-change-functions already form a good scheme for applying
> > and removing these properties.

> Because just by the virtue of having before/after-change-functions, we 
> can [can't :-] be sure that a given position is syntax-propertized.

In CC Mode that certainty exists.  That certainty exists if the
before/after-change-functions are implemented properly.

> The only way we could, without adding additional abstraction, is by 
> agreeing that all buffers must be syntax-propertized in their entirety 
> after before/after-change-functions run. And that is just too damn wasteful.

How is it wasteful?  If the syntax-table properties are all "local", it's
actually an efficient way to do things.  You simply have to extend the
(beg end) region to that which might contain pertinent characters, remove
s-t properties in a before-change function, and apply them in an
after-change function.  If the s-t props aren't "local", then maybe the
syntax-propertize-function approach is a good one.  I haven't had any
reason to think this through.  Somebody (either you or Stefan) opined
that ALL s-t properties are, in practice, "local".
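
A minimal sketch of that before/after-change scheme, for illustration
only: it is not CC Mode's code, the `my-mode-' names are made up, and
it assumes the toy rule that a `<...>' pair never spans more than one
line, so the region to reprocess is just the changed lines.

```elisp
;; Hypothetical functions, for illustration only.
(defun my-mode--line-region (beg end)
  "Return (BOL . EOL) spanning the lines that contain BEG and END."
  (save-excursion
    (cons (progn (goto-char beg) (line-beginning-position))
          (progn (goto-char end) (line-end-position)))))

(defun my-mode-before-change (beg end)
  "Remove `syntax-table' properties from the lines about to change."
  (let ((region (my-mode--line-region beg end)))
    (with-silent-modifications
      (remove-text-properties (car region) (cdr region)
                              '(syntax-table nil)))))

(defun my-mode-after-change (beg end _old-len)
  "Re-apply `syntax-table' properties on the lines that changed."
  (let ((region (my-mode--line-region beg end)))
    (save-excursion
      (save-match-data
        (goto-char (car region))
        (with-silent-modifications
          ;; Only delimiters that still pair up get a property back.
          (while (re-search-forward "\\(<\\)[^<>\n]*\\(>\\)"
                                    (cdr region) t)
            (put-text-property (match-beginning 1) (match-end 1)
                               'syntax-table (string-to-syntax "(>"))
            (put-text-property (match-beginning 2) (match-end 2)
                               'syntax-table (string-to-syntax ")<"))))))))

;; In the major mode's setup (buffer-local):
;; (setq-local parse-sexp-lookup-properties t)
;; (add-hook 'before-change-functions #'my-mode-before-change nil t)
;; (add-hook 'after-change-functions #'my-mode-after-change nil t)
```

With these hooks installed, deleting a `<' leaves its former partner
`>' without any `syntax-table' property as soon as the change command
returns.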

> > Changing CC Mode to use syntax-propertize-function would require a
> > substantial amount of design work, assuming such were possible.  There
> > doesn't seem to be a good reason to do this.

> One reason would be for you to get a handle on what it does, and the 
> design benefits that follow.

The current CC Mode before/after-change-function design has been
carefully thought out.  I'm not convinced that's the case for the
syntax-propertize-function mechanism.

> Maybe you would discover an even better design along the way, who
> knows.

Maybe.  But it's not as though I'm short of things to do.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20  0:06                             ` Stefan Monnier
@ 2016-06-20 11:03                               ` Alan Mackenzie
  2016-06-20 13:53                                 ` Stefan Monnier
  0 siblings, 1 reply; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-20 11:03 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello, Stefan.

On Sun, Jun 19, 2016 at 08:06:25PM -0400, Stefan Monnier wrote:
> > With a syntax-propertize-function instead of the current
> > before/after-change-functions, I simply can't picture what would happen.

> I'll help you.

> > The syntax-table properties would get removed from B, C, and D at some
> > indeterminate time.

> Indeed (tho I could tell you exactly when, but the abstraction provided
> by syntax-* doesn't depend on that).
> But right when you do the buffer modification, the text after the change
> is immediately marked as "out-of-date".

What about the text _before_ the change, should that become out of date
(as it does in CC Mode)?

> > You'd then have a race condition as to whether D
> > would match or mismatch some indeterminate character before A.

> No: as soon as you need to look at the `syntax-table' property, the
> parts marked as "out of date" will have their outdated properties
> removed+reapplied, so you'll be sure to get up-to-date properties at that
> time.

This sounds like black magic.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 10:58                               ` Alan Mackenzie
@ 2016-06-20 11:12                                 ` Andreas Röhler
  2016-06-20 12:15                                 ` Dmitry Gutov
  1 sibling, 0 replies; 95+ messages in thread
From: Andreas Röhler @ 2016-06-20 11:12 UTC (permalink / raw)
  To: emacs-devel



On 20.06.2016 12:58, Alan Mackenzie wrote:
> [ ... ]
>> b) In addition to that, scan_lists now applies syntax-table properties
>> by calling syntax-propertize-function
> Yuck!  That's the sort of ugly workaround that becomes needed when things
> aren't properly thought through from the beginning.  In a proper design,
> the low level routines in syntax.c wouldn't even know about s-p-function,
> and nor should they.
>

+1




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 10:22                               ` Alan Mackenzie
@ 2016-06-20 11:50                                 ` Dmitry Gutov
  2016-06-20 14:50                                   ` Alan Mackenzie
  2016-06-20 13:39                                 ` Stefan Monnier
  1 sibling, 1 reply; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-20 11:50 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Noam Postavsky, emacs-devel

On 06/20/2016 01:22 PM, Alan Mackenzie wrote:

> The code for raw strings is actually very new - it was only committed a
> week and a half ago.  By their very nature, unterminated raw strings are
> a problem, since the terminating delimiter could occur anywhere later in
> the buffer.  The trick has got to be to apply some artificial limit for
> after-change processing until that terminating delimiter is inserted.

So what if you have a literal that's longer than your chosen limit? A 
s-p-f could handle that. Your solution is basically a kludge. You're 
lucky raw strings are rare in C++ now; in many other languages, a normal 
string is not limited to a single line.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20  2:58                   ` Stefan Monnier
@ 2016-06-20 11:57                     ` Alan Mackenzie
  2016-06-20 13:37                       ` Stefan Monnier
  0 siblings, 1 reply; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-20 11:57 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello, Stefan.

On Sun, Jun 19, 2016 at 10:58:44PM -0400, Stefan Monnier wrote:

> If someone doesn't like something about syntax-propertize (or
> syntax-ppss), they should file a bug report.

I've tried doing this with an actual bug, namely bug #22983 "syntax-ppss
returns wrong result".  That was over three months ago, and still there
is no fix.  You don't appear to want to take any responsibility for
getting it fixed, even though it's in "your" code.

I've no expectation anything would be done about more abstract design
flaws reported as bugs.

> And if they want that thing to be improved, providing a good concrete
> case where the misfeature bites is always a good way to motivate
> people to fix it.

What I dislike about these things, as you well know, are the failings in
their fundamental design and the restrictions these place upon other
software, the way things were implemented before being thought through
properly.  So we have bug #22983, and we have the ghastly abortion of
low level code in syntax.c calling out to high level lisp code "just to
make sure things are propertized properly".

> E.g. the caveat regarding the use of syntax-ppss from
> syntax-propertize could be fixed if it's important enough (I don't
> think there's anything particularly hard about it).

If that was the sort of thing I was unhappy with, I'd have little indeed
to be unhappy about.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 10:58                               ` Alan Mackenzie
  2016-06-20 11:12                                 ` Andreas Röhler
@ 2016-06-20 12:15                                 ` Dmitry Gutov
  2016-06-20 14:52                                   ` Noam Postavsky
  2016-06-20 15:25                                   ` Alan Mackenzie
  1 sibling, 2 replies; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-20 12:15 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Noam Postavsky, emacs-devel

Hi Alan,

On 06/20/2016 01:58 PM, Alan Mackenzie wrote:

>> a) By the time you move point to D, font-lock has most likely run, ....
>
> only when font-lock is enabled.  What about when it's not enabled?

Option b.

> You seem to be advocating that CC Mode should lose its deterministic text
> property handling and replace it by a "hope it's OK" non-deterministic
> handling.  You can understand me not wanting to do this.

You seem to be unable to understand any determinism semantics that are 
slightly more complex than the ones you are working with now.

>> b) In addition to that, scan_lists now applies syntax-table properties
>> by calling syntax-propertize-function
>
> Yuck!  That's the sort of ugly workaround that becomes needed when things
> aren't properly thought through from the beginning.  In a proper design,
> the low level routines in syntax.c wouldn't even know about s-p-function,
> and nor should they.

The way you choose to refer to this simple and sensible design as "ugly 
workaround" is incredibly annoying. Please keep that in mind when I stop 
replying.

>> The only way we could, without adding additional abstraction, is by
>> agreeing that all buffers must be syntax-propertized in their entirety
>> after before/after-change-functions run. And that is just too damn wasteful.
>
> How is it wasteful?

Imagine a big file. You open it and start editing near the beginning. 
With s-p-f, Emacs only has to syntax-propertize it in a limited area 
near the beginning, the rest of the file is completely ignored.

If you jump to the end, the whole file is syntax-propertized once, and 
then, again, as you keep working, only a small area gets re-propertized 
as long as you don't jump far away.

Without s-p-f, you have to keep the whole file up-to-date at all times, 
and that means going over its entirety when the user edits something 
near the file's beginning that affects the parse state of the rest of 
the file, e.g. opening or closing a string, or a block comment.

You could try adding kludges to that, but ultimately, if you want the 
file to always be up-to-date in its entirety, eagerly, you're forced to 
make many operations slower than they have to be.
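
To make the laziness argument concrete, here is a minimal sketch, assuming
only the public syntax.el entry points `syntax-propertize' and
`syntax-propertize-function'.  The `my-demo-' names are hypothetical, and
the exact extent that gets propertized depends on syntax.el's internal
chunking, so the sketch only checks that it stops well short of the end of
the buffer.

```elisp
;; Sketch only: the `my-demo-' names are made up for illustration.
(defvar my-demo-propertized-ranges nil
  "List of (START . END) ranges handed to the demo propertize function.")

(defun my-demo-lazy-extent ()
  "Return (LAST-END . BUFFER-END) after propertizing up to position 100."
  (with-temp-buffer
    (setq my-demo-propertized-ranges nil)
    ;; A "big" buffer: 2000 short lines.
    (dotimes (_ 2000) (insert "some text here\n"))
    ;; Record every range syntax.el asks us to propertize.
    (setq-local syntax-propertize-function
                (lambda (start end)
                  (push (cons start end) my-demo-propertized-ranges)))
    ;; Ask for up-to-date properties near the beginning only.
    (syntax-propertize 100)
    (cons (apply #'max (mapcar #'cdr my-demo-propertized-ranges))
          (point-max))))
```

Calling `(my-demo-lazy-extent)' should return a pair whose car is far
smaller than its cdr: the rest of the buffer was never visited.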

> If the syntax-table properties are all "local", it's
> actually an efficient way to do things.

In CC Mode you have it easier: strings are limited to one line, or their 
extents are obvious by escaped newlines, and an unclosed block comment 
will get closed at the end of the next block comment.

Even so, you have raw strings now, and with them you're forced to make a 
choice between being fast and being correct.

The shortcuts available to CC Mode aren't something all language modes 
can use, so syntax-propertization through before/after-change-functions 
cannot become the standard. s-p-f can, on the other hand, and already is.

> You simply have to extend the
> (beg end) region to that which might contain pertinent characters, remove
> s-t properties in a before-change function, and apply them in an
> after-change function.

Imagine a language with multiline strings (you can call it "Ruby", or, 
maybe, "Emacs Lisp"), and a big file that contains at least one string 
per every ten lines. The user goes to the first string and removes its 
closing delimiter.

What's your after-change-function going to do?
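
A minimal sketch of that scenario, assuming Emacs Lisp syntax and two
strings instead of a big file (the `my-demo-' names are hypothetical).  It
shows that deleting the first closing delimiter inverts the string state of
text arbitrarily far after the edit:

```elisp
(defun my-demo-in-string-p (pos)
  "Return non-nil if POS is inside a string, parsing from `point-min'."
  (nth 3 (parse-partial-sexp (point-min) pos)))

(defun my-demo-string-flip ()
  "Return (BEFORE AFTER): is the text `two' in a string before/after the edit?"
  (with-temp-buffer
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (insert "(setq a \"one\")\n(setq b \"two\")\n")
    (let* ((pos (progn (goto-char (point-min))
                       (search-forward "two")
                       (1- (point))))       ; a position inside "two"
           (before (my-demo-in-string-p pos)))
      ;; Remove the closing delimiter of the first string.
      (goto-char (point-min))
      (search-forward "one\"")
      (delete-char -1)
      ;; The deletion shifted the later text left by one character.
      (list (and before t)
            (and (my-demo-in-string-p (1- pos)) t)))))
```

With many strings in the buffer, every one of them after the edit flips in
the same way, which is the work an eager after-change function would have
to redo.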

> If the s-t props aren't "local", then maybe the
> syntax-propertize-function approach is a good one.  I haven't had any
> reason to think this through.

The zillion email messages on the subject still haven't encouraged you?

> Somebody (either you or Stefan) opined
> that ALL s-t properties are, in practice, "local".

You're misremembering.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 11:57                     ` Alan Mackenzie
@ 2016-06-20 13:37                       ` Stefan Monnier
  2016-06-20 13:50                         ` Dmitry Gutov
  0 siblings, 1 reply; 95+ messages in thread
From: Stefan Monnier @ 2016-06-20 13:37 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

> I've tried doing this with an actual bug, namely bug #22983 "syntax-ppss
> returns wrong result".  That was over three months ago, and still there
> is no fix.

Indeed, no fix.  A few reasons:
- Lack of a concrete case which suffers from it (not much immediate benefit).
- In many cases, it's easier to fix the other side (the caller of syntax-ppss).
- It's hard to fix it right, not because of syntax-ppss in particular,
  but because it's hard to make a generic facility which can be fast by
  using a cache, yet is not being told where is the real intended
  beginning of the buffer.  In CC-mode you just decided to punt and
  disallow the use of cc-mode where 1 is not the real beginning of the
  C code.  So your approach suffers from the same problem (just the
  other side of it) and you haven't fixed it either.

This said, a quick&dirty fix (if such was needed, e.g. because of a concrete
case which exhibits the problem) would be to make syntax-ppss
always widen (and maybe add a syntax-ppss-dont-widen).  Given that
there's no real hurry to fix it, I'd rather we fix it right.

> You don't appear to want to take any responsibility for
> getting it fixed, even though it's in "your" code.

I definitely don't want to take responsibility to fix anything in
particular, indeed.  I hate it when a core functionality is only touched
by a single individual, because it makes Emacs's development vulnerable
to the disappearance of that individual.
So I'd like for someone else to dig into syntax.el.

> I've no expectation anything would be done about more abstract design
> flaws reported as bugs.

Even more abstract than bug#22983?  No indeed, I'm not going to waste my
time on such things.

Just like you're not wasting your time trying to fix the fundamental
breakage of CC-mode relying on some relations between
before-change-functions, after-change-functions, and actual
highlighting.  As long as real concrete cases don't show up, the
motivation is very low.

> What I dislike about these things, as you well know, are the failings in
> their fundamental design and the restrictions these place upon other
> software, the way things were implemented before being thought through
> properly.

Believe me I did think about those things quite a bit before
implementing them.

> So we have bug #22983, and we have the ghastly abortion of low level
> code in syntax.c calling out to high level lisp code "just to make
> sure things are propertized properly".

There's an example of something I've thought about a lot before
implementing it.  The precise implementation is probably not perfect,
but the fundamental design is something I mulled over for a long time.
It's far from perfect, but if we want to avoid re-scanning the whole
buffer between point and point-max, we need the updates to be performed
lazily, and a call to forward-sexp can't know how far ahead to do it
until it's actually scanning.

You made a different trade off in CC-mode, of trying hard to catch most
cases where the rescan would end up not making any change.  I find it is
much too difficult to do such a thing in general to be worth the effort,
so I preferred to try and gain speed by using laziness instead.

Your trade-off also suffers from "fundamental design flaws": if
syntax-propertize wasn't lazy, major modes would be forced to try and
figure out when to optimize away the rescan, and if they don't try hard
enough, they'd suffer from poor performance on large buffers.


        Stefan




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 10:22                               ` Alan Mackenzie
  2016-06-20 11:50                                 ` Dmitry Gutov
@ 2016-06-20 13:39                                 ` Stefan Monnier
  1 sibling, 0 replies; 95+ messages in thread
From: Stefan Monnier @ 2016-06-20 13:39 UTC (permalink / raw)
  To: emacs-devel

> week and a half ago.  By their very nature, unterminated raw strings are
> a problem, since the terminating delimiter could occur anywhere later in
> the buffer.

Welcome to syntax-ppss's world: most languages actually allow \n in
their strings without having to backslash-escape them, so this problem
is the rule rather than the exception, outside of CC-mode.

> The trick has got to be to apply some artificial limit for
> after-change processing until that terminating delimiter is inserted.

We definitely don't use/want such artificial limits.


        Stefan





* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 13:37                       ` Stefan Monnier
@ 2016-06-20 13:50                         ` Dmitry Gutov
  2016-06-20 16:00                           ` Andreas Röhler
  0 siblings, 1 reply; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-20 13:50 UTC (permalink / raw)
  To: Stefan Monnier, Alan Mackenzie; +Cc: emacs-devel

On 06/20/2016 04:37 PM, Stefan Monnier wrote:
>> I've tried doing this with an actual bug, namely bug #22983 "syntax-ppss
>> returns wrong result".  That was over three months ago, and still there
>> is no fix.
>
> Indeed, no fix.  A few reasons:
> - Lack of a concrete case which suffers from it (not much immediate benefit).
> - In many cases, it's easier to fix the other side (the caller of syntax-ppss).
> - It's hard to fix it right, not because of syntax-ppss in particular,
>   but because it's hard to make a generic facility which can be fast by
>   using a cache, yet is not being told where is the real intended
>   beginning of the buffer.  In CC-mode you just decided to punt and
>   disallow the use of cc-mode where 1 is not the real beginning of the
>   C code.  So your approach suffers from the same problem (just the
>   other side of it) and you haven't fixed it either.
>
> This said, a quick&dirty fix (if such was needed, e.g. because of a concrete
> case which exhibits the problem) would be to make syntax-ppss
> always widen (and maybe add a syntax-ppss-dont-widen).  Given that
> there's no real hurry to fix it, I'd rather we fix it right.

I'm very tempted to fix it by pushing the proposed patch into master,
considering no viable alternative patch has been proposed so far, if
only to avoid seeing Alan mention that bug for the 101st time.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 11:03                               ` Alan Mackenzie
@ 2016-06-20 13:53                                 ` Stefan Monnier
  0 siblings, 0 replies; 95+ messages in thread
From: Stefan Monnier @ 2016-06-20 13:53 UTC (permalink / raw)
  To: emacs-devel

> What about the text _before_ the change, should that become out of date
> (as it does in CC Mode)?

That's for the major mode to decide, and inform syntax.el about it via
syntax-propertize-extend-region-functions.

And indeed, syntax-propertize-extend-region-functions doesn't mark the
previous text as out of date.  It only causes it to be re-propertized
when we end up re-propertizing the changed text.

This part of syntax.el is not heavily exercised, tho, because it's very
unusual to have changes at POS affect the syntax of text on
earlier lines.
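
For illustration, a function on that hook might look like the sketch
below.  It assumes a hypothetical mode in which a trailing backslash
continues a line, so a change on a continuation line can affect the lines
before it; the `my-mode-' name is made up.  The contract used here is the
documented one: return a cons (NEW-START . NEW-END), or nil when no
adjustment is needed.

```elisp
(defun my-mode-extend-region-over-continuations (start end)
  "Extend START back over lines continued by a trailing backslash.
Return (NEW-START . END) if START moved, else nil."
  (save-excursion
    (goto-char start)
    (beginning-of-line)
    ;; Walk back while the previous line ends in a backslash.
    (while (and (> (point) (point-min))
                (eq (char-before (1- (point))) ?\\))
      (forward-line -1))
    (when (< (point) start)
      (cons (point) end))))

;; In the (hypothetical) mode's setup code:
;; (add-hook 'syntax-propertize-extend-region-functions
;;           #'my-mode-extend-region-over-continuations nil t)
```

The function returns nil once START is already at the right place, which
matters because the hook functions are re-run until none of them asks for
a further extension.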

[ In CC-mode you tend to do that in places where I wouldn't.  E.g. for
  raw-strings you propertize the raw string's opener differently if the
  string is properly closed or not, whereas I wouldn't have done so
  and most other major modes where similar issues can show up don't do
  it the way you do in CC-mode either.

  Obviously, you like that kind of behavior, and maybe you're right
  that it's superior.  I tend to dislike it, but maybe I'm just biased
  because I dislike the implementation cost behind it.  ]

>> No: as soon as you need to look at the `syntax-table' property, the
>> parts marked as "out of date" will have their outdated properties
>> removed+reapplied, so you'll be sure to get uptodate properties at that
>> time.
> This sounds like black magic.

Laziness is a standard tool in the programmer's toolbox.
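
As a rough sketch of what this looks like from a major mode's side
(assuming a hypothetical shell-like mode where `#' starts a comment but the
`#' in `$#' must not; all `my-mode-' names are made up):

```elisp
(defvar my-mode-syntax-table
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?# "<" table)   ; `#' starts a comment
    (modify-syntax-entry ?\n ">" table)  ; newline ends it
    table)
  "Syntax table for the hypothetical `my-mode'.")

(defun my-mode-syntax-propertize (start end)
  "Give the `#' in `$#' symbol syntax between START and END."
  (goto-char start)
  (funcall (syntax-propertize-rules
            ("\\$\\(#\\)" (1 "_")))
           start end))

(defun my-mode-demo ()
  "Return the `syntax-table' property found on the `#' of `$#'."
  (with-temp-buffer
    (set-syntax-table my-mode-syntax-table)
    (setq-local syntax-propertize-function #'my-mode-syntax-propertize)
    (insert "echo $# # a real comment\n")
    ;; Nothing has been propertized yet; this call is what a consumer
    ;; (font-lock, syntax-ppss, ...) triggers when it needs the properties.
    (syntax-propertize (point-max))
    (get-text-property 7 'syntax-table)))   ; position 7 is the `#' of `$#'
```

The mode never touches before/after-change-functions itself; it only says
how to propertize a region when asked.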


        Stefan


PS: by the way, I was much too busy at work, so I had to procrastinate
a little, and I think I have a very rough version of cc-mode using
syntax-propertize-function working.





* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 11:50                                 ` Dmitry Gutov
@ 2016-06-20 14:50                                   ` Alan Mackenzie
  2016-06-20 15:02                                     ` Dmitry Gutov
  0 siblings, 1 reply; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-20 14:50 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Noam Postavsky, emacs-devel

Hello, Dmitry.

On Mon, Jun 20, 2016 at 02:50:49PM +0300, Dmitry Gutov wrote:
> On 06/20/2016 01:22 PM, Alan Mackenzie wrote:

> > The code for raw strings is actually very new - it was only committed a
> > week and a half ago.  By their very nature, unterminated raw strings are
> > a problem, since the terminating delimiter could occur anywhere later in
> > the buffer.  The trick has got to be to apply some artificial limit for
> > after-change processing until that terminating delimiter is inserted.

> So what if you have a literal that's longer than your chosen limit?

The limit would only apply in the case of a raw string missing a valid
terminator.  Strictly speaking that "literal" extends to EOB.

> A s-p-f could handle that. Your solution is basically a kludge. You're
> lucky raw strings are rare in C++ now; in many other languages, a
> normal string is not limited to a single line.

My solution is a good solution.  You're welcome to point out specific
technical shortcomings, but the basic design is sound.

Raw strings are no longer rare in C++.  That was what prompted Ivan
Andrus to push me for an implementation.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 12:15                                 ` Dmitry Gutov
@ 2016-06-20 14:52                                   ` Noam Postavsky
  2016-06-20 15:57                                     ` Dmitry Gutov
  2016-06-20 15:25                                   ` Alan Mackenzie
  1 sibling, 1 reply; 95+ messages in thread
From: Noam Postavsky @ 2016-06-20 14:52 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, emacs-devel

On Mon, Jun 20, 2016 at 8:15 AM, Dmitry Gutov <dgutov@yandex.ru> wrote:
> Hi Alan,
>
> On 06/20/2016 01:58 PM, Alan Mackenzie wrote:
>> Somebody (either you or Stefan) opined
>> that ALL s-t properties are, in practice, "local".
>
>
> You're misremembering.

I think Alan refers to this message from Stefan:
http://lists.gnu.org/archive/html/emacs-devel/2016-06/msg00395.html




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 14:50                                   ` Alan Mackenzie
@ 2016-06-20 15:02                                     ` Dmitry Gutov
  0 siblings, 0 replies; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-20 15:02 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Noam Postavsky, emacs-devel

On 06/20/2016 05:50 PM, Alan Mackenzie wrote:
>> So what if you have a literal that's longer than your chosen limit?
>
> The limit would only apply in the case of a raw string missing a valid
> terminator.  Strictly speaking that "literal" extends to EOB.

No, the limit would apply when you're searching for the terminator. If 
the string is longer than the limit, and you're at its beginning, you 
won't know whether it has a terminator or not. Or I don't understand 
what you're going to use the "limit" for.

>> A s-p-f could handle that. Your solution is basically a kludge. You're
>> lucky raw strings are rare in C++ now; in many other languages, a
>> normal string is not limited to a single line.
>
> My solution is a good solution.  You're welcome to point out specific
> technical shortcomings

I already did. Also see the example with lots of multiline strings in a 
buffer in another email.

> Raw strings are no longer rare in C++.  That was what prompted Ivan
> Andrus to push me for an implementation.

Raw strings are still easy, in that it's hard to mistake a raw string's 
opener for a raw string's closer.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 12:15                                 ` Dmitry Gutov
  2016-06-20 14:52                                   ` Noam Postavsky
@ 2016-06-20 15:25                                   ` Alan Mackenzie
  2016-06-20 16:45                                     ` Dmitry Gutov
  1 sibling, 1 reply; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-20 15:25 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Noam Postavsky, emacs-devel

Hello, Dmitry.

On Mon, Jun 20, 2016 at 03:15:34PM +0300, Dmitry Gutov wrote:
> Hi Alan,

> On 06/20/2016 01:58 PM, Alan Mackenzie wrote:

> > You seem to be advocating that CC Mode should lose its deterministic text
> > property handling and replace it by a "hope it's OK" non-deterministic
> > handling.  You can understand me not wanting to do this.

> You seem to be unable to understand any determinism semantics that are 
> slightly more complex than the ones you are working with now.

I hold simplicity to be an admirable thing to aim for.  KISS.

> >> b) In addition to that, scan_lists now applies syntax-table properties
> >> by calling syntax-propertize-function

> > Yuck!  That's the sort of ugly workaround that becomes needed when things
> > aren't properly thought through from the beginning.  In a proper design,
> > the low level routines in syntax.c wouldn't even know about s-p-function,
> > and nor should they.

> The way you choose to refer to this simple and sensible design as "ugly 
> workaround" is incredibly annoying. Please keep that in mind when I stop 
> replying.

Since when has ad hoc calling of high level code from primitives been
"simple and sensible design"?  You'll note that the CC Mode way simply
doesn't need this.

And I say to you quite openly, the continual niggling and nagging I've
been getting over the past months and years to adapt CC Mode to a
way of dealing with text properties I hold to be inferior has got tiring
and draining.  I'd be most obliged if you would stop doing this.

> >> The only way we could, without adding additional abstraction, is by
> >> agreeing that all buffers must be syntax-propertized in their entirety
> >> after before/after-change-functions run. And that is just too damn wasteful.

> > How is it wasteful?

> Imagine a big file. You open it and start editing near the beginning. 
> With s-p-f, Emacs only has to syntax-propertize it in a limited area 
> near the beginning, the rest of the file is completely ignored.

> If you jump to the end, the whole file is syntax-propertized once, and 
> then, again, as you keep working, only a small area gets re-propertized 
> as long as you don't jump far away.

> Without s-p-f, you have to keep the whole file up-to-date at all times, ....

True,

> and that means going over its entirety when the user edits something 
> near the file's beginning that affects the parse state of the rest of 
> the file, e.g. opening or closing a string, or a block comment.

But that's false.  The text properties get applied on the entire buffer
when it is first opened, and from then on all manipulations are "local".
A change of a s-t property near BOB, say deleting a template opening
delimiter has no effect on the text beyond the next semicolon.

If things had been as you suggest, quite likely I would have come up
with something a bit like syntax-propertize-function, though maybe not
very much like it.

> You could try adding kludges to that, but ultimately, if you want the 
> file to always be up-to-date in its entirety, eagerly, you're forced to 
> make many operations slower than they have to be.

If you still think this is true, and can demonstrate this with a test
case, I will have a look at it and attempt to fix it.

> > If the syntax-table properties are all "local", it's
> > actually an efficient way to do things.

> In CC Mode you have it easier: strings are limited to one line, or their 
> extents are obvious by escaped newlines, and an unclosed block comment 
> will get closed at the end of the next block comment.

> Even so, you have raw strings now, and with them you're forced to make a 
> choice between being fast and being correct.

We'll see.

> The shortcuts available to CC Mode aren't something all language modes 
> can use, so syntax-propertization through before/after-change-functions 
> cannot become the standard. s-p-f can, on the other hand, and already is.

I'd be interested to hear of some Mode where such shortcuts, as you call
them, aren't available.  syntax-propertize-function is just one way of
handling syntax-table text properties, and it probably uses
before/after-change-functions.  If it doesn't, then there will be
arbitrary periods of time in which the state of the s-t properties is
undefined.  This isn't simple, and I hold it to be non-good.

> > You simply have to extend the (beg end) region to that which might
> > contain pertinent characters, remove s-t properties in a
> > before-change function, and apply them in an after-change function.

> Imagine a language with multiline strings (you can call it "Ruby", or, 
> maybe, "Emacs Lisp"), and a big file that contains at least one string 
> per every ten lines. The user goes to the first string and removes its 
> closing delimiter.

> What's your after-change-function going to do?

Whatever is needed.  Sorry, but the question is too vague.  Emacs Lisp
Mode, as far as I know, doesn't have its own a-c-f, so the answer would
be "nothing".  I don't know Ruby Mode.

The point here is that, mostly, strings don't require s-t text
properties.

> > If the s-t props aren't "local", then maybe the
> > syntax-propertize-function approach is a good one.  I haven't had any
> > reason to think this through.

> The zillion email messages on the subject still haven't encouraged you?

On non-"local" syntax table text properties?  I don't recall seeing any
discussion of this, except for the one we're now having.  If I've
forgotten it, or missed it, you could perhaps point it out to me.

> > Somebody (either you or Stefan) opined
> > that ALL s-t properties are, in practice, "local".

> You're misremembering.

It must have been somebody else, then.  Are you aware of any non-"local"
use of syntax table text properties?

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 14:52                                   ` Noam Postavsky
@ 2016-06-20 15:57                                     ` Dmitry Gutov
  2016-06-20 17:23                                       ` Noam Postavsky
  0 siblings, 1 reply; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-20 15:57 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Alan Mackenzie, emacs-devel

Hi Noam,

On 06/20/2016 05:52 PM, Noam Postavsky wrote:
> On Mon, Jun 20, 2016 at 8:15 AM, Dmitry Gutov <dgutov@yandex.ru> wrote:

>> On 06/20/2016 01:58 PM, Alan Mackenzie wrote:
>>> Somebody (either you or Stefan) opined
>>> that ALL s-t properties are, in practice, "local".
>>
>>
>> You're misremembering.
>
> I think Alan refers to this message from Stefan:
> http://lists.gnu.org/archive/html/emacs-devel/2016-06/msg00395.html

How does it follow from that message?




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 13:50                         ` Dmitry Gutov
@ 2016-06-20 16:00                           ` Andreas Röhler
  2016-06-20 18:15                             ` Dmitry Gutov
  2016-06-20 18:55                             ` Alan Mackenzie
  0 siblings, 2 replies; 95+ messages in thread
From: Andreas Röhler @ 2016-06-20 16:00 UTC (permalink / raw)
  To: emacs-devel



On 20.06.2016 15:50, Dmitry Gutov wrote:
> On 06/20/2016 04:37 PM, Stefan Monnier wrote:
>>> I've tried doing this with an actual bug, namely bug #22983 "syntax-ppss
>>> returns wrong result".  That was over three months ago, and still there
>>> is no fix.
>>
>> Indeed, no fix.  A few reasons:
>> - Lack of a concrete case which suffers from it (not much immediate benefit).
>> - In many cases, it's easier to fix the other side (the caller of syntax-ppss).
>> - It's hard to fix it right, not because of syntax-ppss in particular,
>>   but because it's hard to make a generic facility which can be fast by
>>   using a cache, yet is not being told where is the real intended
>>   beginning of the buffer.  In CC-mode you just decided to punt and
>>   disallow the use of cc-mode where 1 is not the real beginning of the
>>   C code.  So your approach suffers from the same problem (just the
>>   other side of it) and you haven't fixed it either.
>>
>> This said, a quick&dirty fix (if such was needed, e.g. because of a concrete
>> case which exhibits the problem) would be to make syntax-ppss
>> always widen (and maybe add a syntax-ppss-dont-widen).  Given that
>> there's no real hurry to fix it, I'd rather we fix it right.
>
> I'm very tempted to fix it by pushing the proposed patch into master,
> considering no viable alternative patch has been proposed so far, if
> only to avoid seeing Alan mention that bug for the 101st time.
>

IMHO syntax-ppss has many design issues, not a single one. I'd prefer to
see an example where syntax-ppss can't be replaced by
parse-partial-sexp. From there, designing a syntax-ppss capable of its
tasks might be of interest.
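
As a starting point for such a comparison, here is a minimal sketch,
assuming a widened buffer and no special syntax-table properties.  The
`my-demo-' name is hypothetical.  It checks that the cached parser and a
fresh parse from `point-min' report the same depth, string and comment
state at a position:

```elisp
(defun my-demo-ppss-agrees-p (pos)
  "Return non-nil if `syntax-ppss' and `parse-partial-sexp' agree at POS.
Only the depth (0), in-string (3) and in-comment (4) fields are compared."
  (let ((cached (save-excursion (syntax-ppss pos)))
        (fresh  (save-excursion (parse-partial-sexp (point-min) pos))))
    (and (equal (nth 0 cached) (nth 0 fresh))
         (equal (nth 3 cached) (nth 3 fresh))
         (equal (nth 4 cached) (nth 4 fresh)))))
```

The difference between the two is then cost, not result: the second call
rescans from the beginning of the buffer every time, whereas the first
reuses earlier parse states.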




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 15:25                                   ` Alan Mackenzie
@ 2016-06-20 16:45                                     ` Dmitry Gutov
  2016-06-20 18:12                                       ` Alan Mackenzie
  2016-06-20 22:45                                       ` John Wiegley
  0 siblings, 2 replies; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-20 16:45 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Noam Postavsky, emacs-devel

On 06/20/2016 06:25 PM, Alan Mackenzie wrote:

> I hold simplicity to be an admirable thing to aim for.  KISS.

Too bad you have to work on CC Mode, then.

> Since when has ad hoc calling of high level code from primitives been
> "simple and sensible design"?  You'll note that the CC Mode way simply
> doesn't need this.

Emacs low-level code runs quite a few hooks already, which are 
implemented in Lisp. You'll also note that the whole Lisp runtime is 
called from low-level code.

> And I say to you quite openly, the continual niggling and nagging I've
> been getting over the past months and years to adapt CC Mode to an
> way of dealing with text properties I hold to be inferior has got tiring
> and draining.  I'd be most obliged if you would stop doing this.

I wouldn't be in that discussion (or the ones like it) if you just 
stayed inside CC Mode and didn't try to push new abstractions, poorly 
designed and duplicating what syntax.el already does, into Emacs core.

> If things had been as you suggest, quite likely I would have come up
> with something a bit like syntax-propertize-function, though maybe not
> very much like it.

Things are like I say. Maybe not quite for CC Mode, but they are for 
many languages around it.

>> You could try adding kludges to that, but ultimately, if you want the
>> file to always be up-to-date in its entirety, eagerly, you're forced to
>> make many operations slower than they have to be.
>
> If you still think this is true, and can demonstrate this with a test
> case, I will have a look at it and attempt to fix it.

I believe I explained the problem quite clearly. But if you're asking 
for a test case for CC Mode, I don't care for it.

This discussion is about general facilities, after all.

>> The shortcuts available to CC Mode aren't something all language modes
>> can use, so syntax-propertization through before/after-change-functions
>> cannot become the standard. s-p-f can, on the other hand, and already is.
>
> I'd be interested to hear of some Mode where such shortcuts, as you call
> them, aren't available.

ruby-mode, for instance. And, like already pointed out, any language 
where (these requirements are sufficient, but not all necessary):

- Double-quoted strings are allowed to span multiple lines.
- Syntax is complex enough that we need to use the syntax-table property.
- Whether a character gets a syntax-property applied, depends on whether 
it's inside a string or comment, among other things.

>> Imagine a language with multiline strings (you can call it "Ruby", or,
>> maybe, "Emacs Lisp"), and a big file that contains at least one string
>> per every ten lines. The user goes to the first string and removes its
>> closing delimiter.
>
>> What's your after-change-function going to do?
>
> Whatever is needed.  Sorry, but the question is too vague.  Emacs Lisp
> Mode, as far as I know doesn't have its own a-c-f, so the answer would
> be "nothing".  I don't know Ruby Mode.

Why don't you do us all a favor and educate yourself about other 
languages and language modes before arguing that 
before/after-change-functions can be a general solution as-is?

> The point here is that, mostly, strings don't require s-t text
> properties.

Some don't, some do. Heredoc strings do.

But text outside of strings often does need text properties. And their 
application depends on whether given text is inside a string.

>>> If the s-t props aren't "local", then maybe the
>>> syntax-propertize-function approach is a good one.  I haven't had any
>>> reason to think this through.
>
>> The zillion email messages on the subject still haven't encouraged you?
>
> On non-"local" syntax table text properties?

On syntax-ppss. Participating in discussions about a subject is usually 
a good reason to educate oneself about it. For most people, at least.

> I don't recall seeing any
> discussion of this, except for the one we're now having.  If I've
> forgotten it, or missed it, you could perhaps point it out to me.

Non-locality is one of the obvious reasons for syntax-propertize's 
design, the way that syntax-table application is performed lazily.

Maybe if you hadn't been busy writing shallow critiques and had asked 
questions instead, we'd have gotten to this a lot sooner.

>>> Somebody (either you or Stefan) opined
>>> that ALL s-t properties are, in practice, "local".
>
>> You're misremembering.
>
> It must have been somebody else, then.  Are you aware of any non-"local"
> use of syntax table text properties?

Yes. And I've been telling you about them for the last several messages.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 15:57                                     ` Dmitry Gutov
@ 2016-06-20 17:23                                       ` Noam Postavsky
  2016-06-20 18:58                                         ` Dmitry Gutov
  0 siblings, 1 reply; 95+ messages in thread
From: Noam Postavsky @ 2016-06-20 17:23 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, emacs-devel

On Mon, Jun 20, 2016 at 11:57 AM, Dmitry Gutov <dgutov@yandex.ru> wrote:
> Hi Noam,
>
> On 06/20/2016 05:52 PM, Noam Postavsky wrote:
>>
>> On Mon, Jun 20, 2016 at 8:15 AM, Dmitry Gutov <dgutov@yandex.ru> wrote:
>
>
>>> On 06/20/2016 01:58 PM, Alan Mackenzie wrote:
>>>>
>>>> Somebody (either you or Stefan) opined
>>>> that ALL s-t properties are, in practice, "local".
>>>
>>>
>>>
>>> You're misremembering.
>>
>>
>> I think Alan refers to this message from Stefan:
>> http://lists.gnu.org/archive/html/emacs-devel/2016-06/msg00395.html
>
>
> How does it follow from that message?

Alan said 'In CC Mode, all the uses of the syntax-table property are
"local";' and Stefan replied 'AFAICT, this doesn't actually depend on
how the syntax-table property is
applied[...]and indeed this also applies to all other languages I can
think of (and hence all major modes where we use syntax-propertize).'

But since you brought up multiline string literals in subsequent
discussion, it's important to note Alan's caveat (the "aside from...")
in his definition of "local":

    a buffer change in an earlier part of the buffer (aside from
    crude syntactic things like inserting unclosed comment/string
    delimiters) cannot affect the properties on the current part of the
    buffer.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 16:45                                     ` Dmitry Gutov
@ 2016-06-20 18:12                                       ` Alan Mackenzie
  2016-06-20 19:15                                         ` Dmitry Gutov
  2016-06-20 22:45                                       ` John Wiegley
  1 sibling, 1 reply; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-20 18:12 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel, Noam Postavsky

Hello, Dmitry.

On Mon, Jun 20, 2016 at 07:45:06PM +0300, Dmitry Gutov wrote:
> On 06/20/2016 06:25 PM, Alan Mackenzie wrote:

[ .... ]

> > Since when has ad hoc calling of high level code from primitives been
> > "simple and sensible design"?  You'll note that the CC Mode way simply
> > doesn't need this.

> Emacs low-level code runs quite a few hooks already, which are 
> implemented in Lisp. You'll also note that the whole Lisp runtime is 
> called from low-level code.

You're evading the point.  There's a difference between a well designed
interface from low to high (of which there are many in Emacs) and "oh, in
this circumstance we might not have syntax table properties current.
Tell you what, we'll bung a call to syntax-propertize into the lowest
level of the syntax routines, that will surely work most of the time".

> > And I say to you quite openly, the continual niggling and nagging I've
> > been getting over the past months and years to adapt CC Mode to a
> > way of dealing with text properties I hold to be inferior has got tiring
> > and draining.  I'd be most obliged if you would stop doing this.

> I wouldn't be in that discussion (or the ones like it) if you just 
> stayed inside CC Mode and didn't try to push new abstractions, poorly 
> designed and duplicating what syntax.el already does, into Emacs core.

Don't be so cheeky.  I'm part of the Emacs team and pushing abstractions,
new or otherwise, is one of the things I do.  Pointing out that syntax.el
is just one of several ways of doing what it does is also something I do;
somebody's got to do it, after all.

[ .... ]

> >> You could try adding kludges to that, but ultimately, if you want the
> >> file to always be up-to-date in its entirety, eagerly, you're forced to
> >> make many operations slower than they have to be.

> > If you still think this is true, and can demonstrate this with a test
> > case, I will have a look at it and attempt to fix it.

> I believe I explained the problem quite clearly. But if you're asking 
> for a test case for CC Mode, I don't care for it.

It's worth noting you labour under some misconceptions as to what CC Mode
does and how it does it.

> This discussion is about general facilities, after all.

Yes.

> >> The shortcuts available to CC Mode aren't something all language modes
> >> can use, so syntax-propertization through before/after-change-functions
> >> cannot become the standard. s-p-f can, on the other hand, and already is.

> > I'd be interested to hear of some Mode where such shortcuts, as you call
> > them, aren't available.

> ruby-mode, for instance. And, like already pointed out, any language 
> where (these requirements are sufficient, but not all necessary):

> - Double-quoted strings are allowed to span multiple lines.
> - Syntax is complex enough that we need to use the syntax-table property.
> - Whether a character gets a syntax-property applied, depends on whether 
> it's inside a string or comment, among other things.

All of these 3 criteria apply to C++ Mode, yet there's no need for lazy
syntax-table propertification there.

Another question for you.  Under the aforementioned laziness, how and
when do syntax-table properties get modified after a buffer change when
these s-t properties are _above_ the position of the change in the buffer?

> >> Imagine a language with multiline strings (you can call it "Ruby", or,
> >> maybe, "Emacs Lisp"), and a big file that contains at least one string
> >> per every ten lines. The user goes to the first string and removes its
> >> closing delimiter.

> >> What's your after-change-function going to do?

> > Whatever is needed.  Sorry, but the question is too vague.  Emacs Lisp
> > Mode, as far as I know doesn't have its own a-c-f, so the answer would
> > be "nothing".  I don't know Ruby Mode.

> Why don't you do us all a favor and educate yourself about other 
> languages and language modes before arguing that 
> before/after-change-functions can be a general solution as-is?

I can argue that because they're clean, well understood abstractions.
And I do argue that b/a-c-f are a good way of manipulating s-t properties
when these properties are "local".

> > The point here is that, mostly, strings don't require s-t text
> > properties.

> Some don't, some do. Heredoc strings do.

> But text outside of strings often does need text properties. And their 
> application depends on whether given text is inside a string.

Yes.

> >>> If the s-t props aren't "local", then maybe the
> >>> syntax-propertize-function approach is a good one.  I haven't had any
> >>> reason to think this through.

> >> The zillion email messages on the subject still haven't encouraged you?

> > On non-"local" syntax table text properties?

> On syntax-ppss. Participating in discussions about a subject is usually 
> a good reason to educate oneself about it. For most people, at least.

Oh, I'm pretty "educated" about syntax-ppss, thank you very much -
educated enough to submit bug reports about it.  But I was hoping you
could tell me something more about non-"local" s-t properties.

> > I don't recall seeing any discussion of this, except for the one
> > we're now having.  If I've forgotten it, or missed it, you could
> > perhaps point it out to me.

> Non-locality is one of the obvious reasons for syntax-propertize's 
> design, the way that syntax-table application is performed lazily.

And it's a good reason not to use syntax-propertize when all s-t
properties are, in fact "local", and it is desirable for these properties
to be amended instantly on buffer changes.

[ .... ]

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 16:00                           ` Andreas Röhler
@ 2016-06-20 18:15                             ` Dmitry Gutov
  2016-06-20 23:33                               ` John Wiegley
  2016-06-20 18:55                             ` Alan Mackenzie
  1 sibling, 1 reply; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-20 18:15 UTC (permalink / raw)
  To: Andreas Röhler, emacs-devel

On 06/20/2016 07:00 PM, Andreas Röhler wrote:

> IMHO syntax-ppss has many design issues, not a single one. I'd prefer to
> see an example, where syntax-ppss can't be replaced by
> parse-partial-sexp. From there designing a syntax-ppss capable of its
> tasks might be of interest.

Andreas, please go away. If you don't "see an example" by now, I doubt 
anything more anyone can write is going to help.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 16:00                           ` Andreas Röhler
  2016-06-20 18:15                             ` Dmitry Gutov
@ 2016-06-20 18:55                             ` Alan Mackenzie
  2016-06-20 20:22                               ` Andreas Röhler
  1 sibling, 1 reply; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-20 18:55 UTC (permalink / raw)
  To: Andreas Röhler; +Cc: emacs-devel

Hello, Andreas.

On Mon, Jun 20, 2016 at 06:00:01PM +0200, Andreas Röhler wrote:


> On 20.06.2016 15:50, Dmitry Gutov wrote:
> > On 06/20/2016 04:37 PM, Stefan Monnier wrote:
> >>> I've tried doing this with an actual bug, namely bug #22983 
> >>> "syntax-ppss
> >>> returns wrong result".  That was over three months ago, and still there
> >>> is no fix.

> >> Indeed, no fix.  A few reasons:
> >> - Lack of a concrete case which suffers from it (not much immediate 
> >> benefit).
> >> - In many cases, it's easier to fix the other side (the caller of 
> >> syntax-ppss).
> >> - It's hard to fix it right, not because of syntax-ppss in particular,
> >>   but because it's hard to make a generic facility which can be fast by
> >>   using a cache, yet is not being told where is the real intended
> >>   beginning of the buffer.  In CC-mode you just decided to punt and
> >>   disallow the use of cc-mode where 1 is not the real beginning of the
> >>   C code.  So your approach suffers from the same problem (just the
> >>   other side of it) and you haven't fixed it either.

> >> This said, a quick&dirty fix (if such was needed, e.g. because of a 
> >> concrete
> >> case which exhibits the problem) would be to make syntax-ppss
> >> always widen (and maybe add a syntax-ppss-dont-widen).  Given that
> >> there's no real hurry to fix it, I'd rather we fix it right.

> > I'm very tempted to fix it by pushing the proposed patch into master, 
> > considering no viable alternative patch has been proposed so far, if 
> > only to avoid seeing Alan mention that bug for the 101st time


> IMHO syntax-ppss has many design issues, not a single one.

I agree wholeheartedly.

> I'd prefer to see an example, where syntax-ppss can't be replaced by
> parse-partial-sexp.

Well, syntax-ppss was originally intended to give the result equivalent
to (parse-partial-sexp (point-min) pos), and probably does, providing the
buffer is never narrowed - with narrowing, you get a somewhat random
result.
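
As a rough sketch of that intended equivalence (the helper name
my-demo-same-parse-state-p is made up for illustration and is not part
of syntax.el), one can compare the cached parse with a fresh one over
the accessible portion of the buffer:

```elisp
;; Hypothetical helper, not part of syntax.el: compare the cached parse
;; from `syntax-ppss' with an uncached `parse-partial-sexp' at POS.
(defun my-demo-same-parse-state-p (pos)
  "Return non-nil if `syntax-ppss' and `parse-partial-sexp' agree at POS.
Only paren depth (element 0), string state (element 3) and comment
state (element 4) are compared."
  (let ((cached (save-excursion (syntax-ppss pos)))
        (fresh (save-excursion (parse-partial-sexp (point-min) pos))))
    (and (equal (nth 0 cached) (nth 0 fresh))
         (equal (nth 3 cached) (nth 3 fresh))
         (equal (nth 4 cached) (nth 4 fresh)))))
```

In a widened buffer the two should agree.  Whether they still agree
after narrow-to-region is exactly what bug #22983 is about, so I
wouldn't rely on any particular answer there.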

> From there designing a syntax-ppss capable of its tasks might be of
> interest.

One of the problems is that syntax-ppss, rather than just performing its
function, has the subsidiary function of applying syntax-table text
properties.  Eliminating this incoherence would be a good design aim.
In fact, finding a good way of applying the text properties would win
you a medal, in my eyes.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 17:23                                       ` Noam Postavsky
@ 2016-06-20 18:58                                         ` Dmitry Gutov
  0 siblings, 0 replies; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-20 18:58 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Alan Mackenzie, emacs-devel

On 06/20/2016 08:23 PM, Noam Postavsky wrote:

> Alan said 'In CC Mode, all the uses of the syntax-table property are
> "local";' and Stefan replied 'AFAICT, this doesn't actually depend on
> how the syntax-table property is
> applied[...]and indeed this also applies to all other languages I can
> think of (and hence all major modes where we use syntax-propertize).'

OK, but...

> But since you brought up multiline string literals in subsequent
> discussion, it's important to note Alan's caveat (the "aside from...")
> in his definition of "local":
>
>     a buffer change in an earlier part of the buffer (aside from
>     crude syntactic things like inserting unclosed comment/string
>     delimiters) cannot affect the properties on the current part of the
>     buffer.

...there's no reason to make special consideration for the "crude 
syntactic things", because they do have the propensity to span the whole 
buffer when unclosed, and they tend to affect the application of 
syntax-table property on the "fine" syntactic things inside and outside 
of them.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 18:12                                       ` Alan Mackenzie
@ 2016-06-20 19:15                                         ` Dmitry Gutov
  2016-06-20 20:08                                           ` Alan Mackenzie
  0 siblings, 1 reply; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-20 19:15 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel, Noam Postavsky

On 06/20/2016 09:12 PM, Alan Mackenzie wrote:

> "oh, in
> this circumstance we might not have syntax table properties current.
> Tell you what, we'll bung a call to syntax-propertize into the lowest
> level of the syntax routines, that will surely work most of the time".

It will work when it's supposed to work. You still have not provided any 
counter-examples aside from the interaction with narrowing.

> Don't be so cheeky.  I'm part of the Emacs team and pushing abstractions,
> new or otherwise, is one of the things I do.  Pointing out that syntax.el
> is just one of several ways of doing what it does is also something I do;
> somebody's got to do it, after all.

I've been cheeky in the other parts of the message; this is just the 
reality: you have no standing to complain about the pushback if you push 
alternative proposals but do not bother to get familiar with other 
programming languages and the major modes that use the current facilities.

>> - Double-quoted strings are allowed to span multiple lines.
>> - Syntax is complex enough that we need to use the syntax-table property.
>> - Whether a character gets a syntax-property applied, depends on whether
>> it's inside a string or comment, among other things.
>
> All of these 3 criteria apply to C++ Mode, yet there's no need for lazy
> syntax-table propertification there.

Please give an example of syntax-property application in C++ Mode that 
only happens inside a string. And another, which only happens outside of 
strings, if there are any.

> Another question for you.  Under the aforementioned laziness, how and
> when do syntax-table properties get modified after a buffer change when
> these s-t properties are _above_ the position of the change in the buffer?

Stefan already addressed that: 
http://lists.gnu.org/archive/html/emacs-devel/2016-06/msg00421.html

But like he said, normally, you just don't. It's rare to have the 
syntactic meaning of a construct change based on text several lines down 
from it. Or even just one line.

> I can argue that because they're clean, well understood abstractions.
> And I do argue that b/a-c-f are a good way of manipulating s-t properties
> when these properties are "local".

b/a-c-f are handy when things are easy?

Thanks! That's helpful.

> Oh, I'm pretty "educated" about syntax-ppss, thank you very much -
> educated enough to submit bug reports about it.

Just one, and you like reminding us about it every chance you can.

> But I was hoping you
> could tell me something more about non-"local" s-t properties.

What's that? Properties can't be non-local. They are just values you put 
on a piece of buffer text.

Please ask a specific question.

> And it's a good reason not to use syntax-propertize when all s-t
> properties are, in fact "local", and it is desirable for these properties
> to be amended instantly on buffer changes.

Meaning, never?




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 19:15                                         ` Dmitry Gutov
@ 2016-06-20 20:08                                           ` Alan Mackenzie
  2016-06-20 20:32                                             ` Dmitry Gutov
  0 siblings, 1 reply; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-20 20:08 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel, Noam Postavsky

Hello, Dmitry.

On Mon, Jun 20, 2016 at 10:15:58PM +0300, Dmitry Gutov wrote:
> On 06/20/2016 09:12 PM, Alan Mackenzie wrote:

[ .... ]

> >> - Double-quoted strings are allowed to span multiple lines.
> >> - Syntax is complex enough that we need to use the syntax-table property.
> >> - Whether a character gets a syntax-property applied, depends on whether
> >> it's inside a string or comment, among other things.

> > All of these 3 criteria apply to C++ Mode, yet there's no need for lazy
> > syntax-table propertification there.

> Please give an example of syntax-property application in C++ Mode that 
> only happens inside a string. And another, which only happens outside of 
> strings, if there are any.

The criterion was "... whether it's inside a string or comment, among
other things.".  There are syntax-table uses in "other things", namely
macros.  For example, in

    #error don't panic!

the "'" gets punctuation syntax, but wouldn't outside of the macro.
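
To illustrate the kind of "local" handling I mean, here is a much
simplified sketch.  It is not CC Mode's actual code; the function name
and the rule "an apostrophe on a line starting with # is punctuation"
are assumptions made for the example:

```elisp
;; Simplified illustration only; CC Mode's real handling is more involved.
(defun my-demo-after-change (beg end _old-len)
  "Re-apply punctuation syntax to apostrophes on `#' lines touching BEG..END."
  (save-excursion
    (save-match-data
      (with-silent-modifications
        (let ((lbeg (progn (goto-char beg) (line-beginning-position)))
              (lend (progn (goto-char end) (line-end-position))))
          ;; Drop stale properties on the affected lines, then recompute.
          (remove-text-properties lbeg lend '(syntax-table nil))
          (goto-char lbeg)
          (while (search-forward "'" lend t)
            ;; Only apostrophes on preprocessor-like lines are touched.
            (when (save-excursion
                    (beginning-of-line)
                    (looking-at-p "[ \t]*#"))
              (put-text-property (1- (point)) (point)
                                 'syntax-table (string-to-syntax ".")))))))))

;; A mode would install it buffer-locally, for instance:
;; (add-hook 'after-change-functions #'my-demo-after-change nil t)
```

The work done is bounded by the lines the change touches, which is the
point of calling such properties "local".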

C++ template delimiters (and the like in Java) get parenthesis syntax in
code, but not when in a string.

> > Another question for you.  Under the aforementioned laziness, how and
> > when do syntax-table properties get modified after a buffer change when
> > these s-t properties are _above_ the position of the change in the buffer?

> Stefan already addressed that: 
> http://lists.gnu.org/archive/html/emacs-devel/2016-06/msg00421.html

He sort of addressed it.  The code which implements
syntax-propertize-extend-region-functions is not fully general.  For the
general case, you'd need to supplement such a function with a mode
specific before-change function.  And if you've got to do that, you
might as well just finish the job off and write an after-change function
and bypass the complexity of syntax-propertize-extend-region-functions
entirely.

> But like he said, normally, you just don't.

He's wrong.  You have, say, in C++ "a < b, c > d", which has been given
s-t properties as a template.  You insert "=" after ">".  That
necessitates de-propertising the "<" as well as the ">".

> It's rare to have the syntactic meaning of a construct change based on
> text several lines down from it. Or even just one line.

It happens regularly.  Or, at least, often enough that it's got to be
dealt with, particularly with C++ template delimiters.

> > I can argue that because they're clean, well understood abstractions.
> > And I do argue that b/a-c-f are a good way of manipulating s-t properties
> > when these properties are "local".

> b/a-c-f are handy when things are easy?

> Thanks! That's helpful.

They're handy in the usual case.

> > Oh, I'm pretty "educated" about syntax-ppss, thank you very much -
> > educated enough to submit bug reports about it.

> Just one, and you like reminding us about it every chance you can.

I'm hoping that, by doing so, it'll get fixed by somebody who isn't me,
given how much I dislike the function.  Please do fix it, like you
suggested somewhere else today.

> > But I was hoping you
> > could tell me something more about non-"local" s-t properties.

> What's that? Properties can't be non-local. They are just values you put 
> on a piece of buffer text.

<sigh>.  The meaning we've already established over several posts is
syntax-table text properties whose setting, or lack thereof, is
influenced by arbitrarily distant text in the buffer.  You have asserted
that there exist such text properties in Ruby Mode.

> Please ask a specific question.

> > And it's a good reason not to use syntax-propertize when all s-t
> > properties are, in fact "local", and it is desirable for these properties
> > to be amended instantly on buffer changes.

> Meaning, never?

My meaning's clear enough.

Good night!

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 18:55                             ` Alan Mackenzie
@ 2016-06-20 20:22                               ` Andreas Röhler
  0 siblings, 0 replies; 95+ messages in thread
From: Andreas Röhler @ 2016-06-20 20:22 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel



On 20.06.2016 20:55, Alan Mackenzie wrote:
> Hello, Andreas.
>
> On Mon, Jun 20, 2016 at 06:00:01PM +0200, Andreas Röhler wrote:
>
>
>> On 20.06.2016 15:50, Dmitry Gutov wrote:
>>> On 06/20/2016 04:37 PM, Stefan Monnier wrote:
>>>>> I've tried doing this with an actual bug, namely bug #22983
>>>>> "syntax-ppss
>>>>> returns wrong result".  That was over three months ago, and still there
>>>>> is no fix.
>>>> Indeed, no fix.  A few reasons:
>>>> - Lack of a concrete case which suffers from it (not much immediate
>>>> benefit).
>>>> - In many cases, it's easier to fix the other side (the caller of
>>>> syntax-ppss).
>>>> - It's hard to fix it right, not because of syntax-ppss in particular,
>>>>    but because it's hard to make a generic facility which can be fast by
>>>>    using a cache, yet is not being told where is the real intended
>>>>    beginning of the buffer.  In CC-mode you just decided to punt and
>>>>    disallow the use of cc-mode where 1 is not the real beginning of the
>>>>    C code.  So your approach suffers from the same problem (just the
>>>>    other side of it) and you haven't fixed it either.
>>>> This said, a quick&dirty fix (if such was needed, e.g. because of a
>>>> concrete
>>>> case which exhibits the problem) would be to make syntax-ppss
>>>> always widen (and maybe add a syntax-ppss-dont-widen).  Given that
>>>> there's no real hurry to fix it, I'd rather we fix it right.
>>> I'm very tempted to fix it by pushing the proposed patch into master,
>>> considering no viable alternative patch has been proposed so far, if
>>> only to avoid seeing Alan mention that bug for the 101st time
>
>> IMHO syntax-ppss has many design issues, not a single one.
> I agree wholeheartedly.
>
>> I'd prefer to see an example, where syntax-ppss can't be replaced by
>> parse-partial-sexp.
> Well, syntax-ppss was originally intended to give the result equivalent
> to (parse-partial-sexp (point-min) pos), and probably does, providing the
> buffer is never narrowed - with narrowing, you get a somewhat random
> result.
>
>>  From there designing a syntax-ppss capable of its tasks might be of
>> interest.
> One of the problems is that syntax-ppss, rather than just performing its
> function, has the subsidiary function of applying syntax-table text
> properties.  Eliminating this incoherence would be a good design aim.
> In fact, finding a good way of applying the text properties would win
> you a medal, in my eyes.
>

Thanks Alan. It should be possible to find a simpler solution avoiding 
circular calls.
Nonetheless, rethinking it from scratch might take some time too.
Could you open a branch for this? Ideally with its own bug tracker?

Another problem: I have signed only the disclaimer to the FSF, not the 
complete copyright assignment paper.
So my role would be restricted to being a tester and critic rather than 
an active writer... ;)






* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 20:08                                           ` Alan Mackenzie
@ 2016-06-20 20:32                                             ` Dmitry Gutov
  2016-06-21 14:40                                               ` Alan Mackenzie
  0 siblings, 1 reply; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-20 20:32 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel, Noam Postavsky

On 06/20/2016 11:08 PM, Alan Mackenzie wrote:

> C++ template delimiters (and the like in Java) get parenthesis syntax in
> code, but not when in a string.

OK, good.

Does C++ mode in master support raw strings already? Is there a limit on 
how far you look for the end of the raw string, and if yes, how much is it?

>> Stefan already addressed that:
>> http://lists.gnu.org/archive/html/emacs-devel/2016-06/msg00421.html
>
> He sort of addressed it.  The code which implements
> syntax-propertize-extend-region-functions is not fully general.  For the
> general case, you'd need to supplement such a function with a mode
> specific before-change function.

This is unsubstantiated.

> But like he said, normally, you just don't.
>
> He's wrong.  You have, say, in C++ "a < b, c > d", which has been given
> s-t properties as a template.  You insert "=" after ">".  That
> necessitates de-propertising the "<" as well as the ">".

That's only an issue if `>' is on a different line than `<'. But yes, 
syntax-propertize-extend-region-functions exist for a reason.
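
For reference, such a function takes START and END and returns either 
nil or a cons (NEW-START . NEW-END). A minimal sketch, assuming (purely 
for illustration) that a statement never reaches back past the previous 
`;'; the function name is hypothetical and this is not what CC Mode or 
any existing mode does:

```elisp
;; Hypothetical extend-region function for a syntax-propertize based mode.
(defun my-demo-extend-to-statement-start (start end)
  "Extend START back to just after the previous `;', or to `point-min'.
Return (NEW-START . END) if the region grew, nil otherwise."
  (save-excursion
    (goto-char start)
    (let ((new-start (if (search-backward ";" nil t)
                         (match-end 0)
                       (point-min))))
      ;; Return a cons only when the region really changed.
      (when (< new-start start)
        (cons new-start end)))))

;; Installed buffer-locally from the mode function, for instance:
;; (add-hook 'syntax-propertize-extend-region-functions
;;           #'my-demo-extend-to-statement-start nil t)
```

With something like this in place, a `<' on an earlier line of the same 
statement gets re-examined whenever the line with `>' is repropertized.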

> They're handy in the usual case.

If by "usual" you just mean the majority of cases you've encountered 
while using and developing CC Mode.

>>> Oh, I'm pretty "educated" about syntax-ppss, thank you very much -
>>> educated enough to submit bug reports about it.
>
>> Just one, and you like reminding us about it every chance you can.
>
> I'm hoping that, by doing so, it'll get fixed by somebody who isn't me,
> given how much I dislike the function.  Please do fix it, like you
> suggested somewhere else today.

You've already expressed dislike for my solution. But sure, I will.

> <sigh>.  The meaning we've already established over several posts is
> syntax-table text properties whose setting, or lack thereof, is
> influenced by arbitrarily distant text in the buffer.  You have asserted
> that there exist such text properties in Ruby Mode.

You still haven't asked a specific question. By now, I've given a 
thorough enough explanation of how things can go wrong that I have a 
hard time understanding what kind of example you want.

So, again: what question do you want answered? "Tell me something more" 
is just wasting my time.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 16:45                                     ` Dmitry Gutov
  2016-06-20 18:12                                       ` Alan Mackenzie
@ 2016-06-20 22:45                                       ` John Wiegley
  2016-06-20 23:30                                         ` Dmitry Gutov
  1 sibling, 1 reply; 95+ messages in thread
From: John Wiegley @ 2016-06-20 22:45 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, emacs-devel, Noam Postavsky

>>>>> Dmitry Gutov <dgutov@yandex.ru> writes:

> Why don't you do us all a favor and educate yourself about other languages
> and language modes before arguing that before/after-change-functions can be
> a general solution as-is?

Dmitry,

There is really no need for this kind of tone. Alan is striving to solve a
problem; let's help him find the best solution, rather than criticizing his
efforts. If before/after-change-functions is not a general solution as-is, all
we need are some examples to demonstrate this.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 22:45                                       ` John Wiegley
@ 2016-06-20 23:30                                         ` Dmitry Gutov
  2016-06-20 23:56                                           ` John Wiegley
                                                             ` (2 more replies)
  0 siblings, 3 replies; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-20 23:30 UTC (permalink / raw)
  To: Alan Mackenzie, Noam Postavsky, emacs-devel

On 06/21/2016 01:45 AM, John Wiegley wrote:

> There is really no need for this kind of tone. Alan is striving to solve a
> problem;

No, he's not. He doesn't have a specific problem he's trying to solve, 
just a lot of opinions and one flimsy, tangentially related, bug report.

What he's doing, third time around now (?), is making misleading 
comments in public, then arguing a lot, and never budging in his stance 
one iota.

If you think my comments are not helpful, I can shut up, of course. And 
save a lot of time doing that.

> let's help him find the best solution, rather than criticizing his
> efforts. If before/after-change-functions is not a general solution as-is, all
> we need are some examples to demonstrate this.

I gave several explanations and examples. There is no way I can continue 
doing that if the explanations are waved off with opinionated "bad 
design" comments, and examples with "I've never had that kind of problem".

Here's an example in Ruby:

     a = `def

     b = :`

If you add ` at the end of the first line, the code will have one 
meaning (with the last ` character having syntax-table property 
"symbol"). Without it, another meaning, and no syntax-table property on 
the last character.

Now mentally insert 300000 lines of code between these lines, none of 
them containing the character `. And imagine yourself adding and 
removing the ` character at the end of the first line.

If Emacs is supposed to keep the syntax-table value on the last 
character up to date using after-change-functions, it will have to scan 
the whole 300000 line buffer after every keypress.

Addendum:

With clever enough caching (to be implemented by someone highly 
motivated), I suppose it's possible to avoid having this problem on 
*every* keypress. But having to do that even on some of them is bad enough.
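To make the cost concrete, here is a minimal sketch of what a bare after-change-functions approach to the example above might look like. It is an illustration only, not code from ruby-mode or CC Mode: the function name `my-naive-remark-backquotes` is hypothetical, and the sketch assumes "`" has string-delimiter syntax in the buffer's syntax table.

```elisp
;;; -*- lexical-binding: t -*-
;; Hypothetical sketch, not code from ruby-mode or CC Mode.
;; Assumes "`" has string-delimiter syntax in the buffer's syntax table.

(defun my-naive-remark-backquotes (beg _end _old-len)
  "Re-derive the `syntax-table' property of every \":`\" after BEG.
Meant for `after-change-functions'; the cost grows with the buffer size."
  (save-excursion
    (save-match-data
      (with-silent-modifications
        (let ((parse-sexp-lookup-properties t))
          ;; Everything after the change may now be wrong, so start over.
          (remove-text-properties beg (point-max) '(syntax-table nil))
          (goto-char beg)
          (while (re-search-forward ":\\(`\\)" nil t)
            (let ((bq (match-beginning 1)))
              ;; No cache: parse from the very start of the buffer to
              ;; learn whether BQ is inside a string.
              (unless (nth 3 (save-excursion
                               (parse-partial-sexp (point-min) bq)))
                (put-text-property bq (1+ bq) 'syntax-table
                                   (string-to-syntax "_"))))))))))

;; Buffer-local installation, e.g. from a mode function:
;; (add-hook 'after-change-functions #'my-naive-remark-backquotes nil t)
```

Each edit near the top of the buffer makes this function walk to the end of the buffer, and each candidate it finds triggers another parse from the beginning, which is the scanning cost described above.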




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 18:15                             ` Dmitry Gutov
@ 2016-06-20 23:33                               ` John Wiegley
  0 siblings, 0 replies; 95+ messages in thread
From: John Wiegley @ 2016-06-20 23:33 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Andreas Röhler, emacs-devel


>>>>> Dmitry Gutov <dgutov@yandex.ru> writes:

> Andreas, please go away. If you don't "see an example" by now, I doubt
> anything more anyone can write is going to help.

Dmitry, this is somewhat rude, and not appropriate for this list. Please
respect the time and effort it takes just for people to stay committed and
responsive on this mailing list.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2



* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 23:30                                         ` Dmitry Gutov
@ 2016-06-20 23:56                                           ` John Wiegley
  2016-06-21  0:26                                           ` John Wiegley
  2016-06-21 15:26                                           ` Alan Mackenzie
  2 siblings, 0 replies; 95+ messages in thread
From: John Wiegley @ 2016-06-20 23:56 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, emacs-devel, Noam Postavsky

>>>>> Dmitry Gutov <dgutov@yandex.ru> writes:

>> There is really no need for this kind of tone. Alan is striving to solve a
>> problem;

> No, he's not. He doesn't have a specific problem he's trying to solve, just
> a lot of opinions and one flimsy, tangentially related, bug report.
> 
> What he's doing, third time around now (?), is making misleading comments in
> public, then arguing a lot, and never budging in his stance one iota.
>
> If you think my comments are not helpful, I can shut up, of course. And save
> a lot of time doing that.

This is your assessment of the situation, which I'm fairly certain Alan does
not share. What I'm asking for is that we all work toward a solution, and not
criticize the motives/approach of one another while doing so. If you find
working with Alan on this issue aggravating, then perhaps it's time to leave
this particular discussion to others.

> Now mentally insert 300000 lines of code between these lines, none of them
> containing the character `. And imagine yourself adding and removing the `
> character at the end of the first line.

Sounds like a compelling example of some point; not sure how it relates to the
original request (I haven't followed closely enough).

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 23:30                                         ` Dmitry Gutov
  2016-06-20 23:56                                           ` John Wiegley
@ 2016-06-21  0:26                                           ` John Wiegley
  2016-06-21 15:26                                           ` Alan Mackenzie
  2 siblings, 0 replies; 95+ messages in thread
From: John Wiegley @ 2016-06-21  0:26 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, emacs-devel, Noam Postavsky

>>>>> Dmitry Gutov <dgutov@yandex.ru> writes:

> I gave several explanations and examples. There is no way I can continue
> doing that if the explanations are waved off with opinionated "bad design"
> comments, and examples with "I've never had that kind of problem".
> 
> Here's an example in Ruby:

OK, I like this: a nice, concrete example of why we wouldn't want to do this.

Alan, do you have any counter-examples that are as succinctly expressed? Maybe
both sides have merit here, but the amount of dialog makes me wonder whether
this isn't coming down to just differences of opinion.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 20:32                                             ` Dmitry Gutov
@ 2016-06-21 14:40                                               ` Alan Mackenzie
  2016-06-21 21:06                                                 ` Dmitry Gutov
  2016-06-21 23:50                                                 ` Dmitry Gutov
  0 siblings, 2 replies; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-21 14:40 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Noam Postavsky, emacs-devel

Hello, Dmitry.

On Mon, Jun 20, 2016 at 11:32:05PM +0300, Dmitry Gutov wrote:
> On 06/20/2016 11:08 PM, Alan Mackenzie wrote:

[ .... ]

> Does C++ mode in master support raw strings already? Is there a limit on 
> how far you look for the end of the raw string, and if yes, how much is it?

Yes, no, and N/A, respectively.  Try out this raw string support,
sometime.

> >> Stefan already addressed that:
> >> http://lists.gnu.org/archive/html/emacs-devel/2016-06/msg00421.html

> > He sort of addressed it.  The code which implements
> > syntax-propertize-extend-region-functions is not fully general.  For the
> > general case, you'd need to supplement such a function with a mode
> > specific before-change function.

> This is unsubstantiated.

At the risk of reigniting arguments, the current mechanism pays no
attention to the buffer text before a change.  So, if this is relevant
to the bounds of the region wanting syntax-table props applied/deleted,
the s-p-extend-r-f mechanism will need to be supplemented by a
before-change function.  The sort of situation where you'd need it is a
buffer change that deletes an escaped EOL.  If you only look at the
buffer in after-change-functions, you'll have no idea how far back the
original C macro extended, for example.

[ .... ]

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-20 23:30                                         ` Dmitry Gutov
  2016-06-20 23:56                                           ` John Wiegley
  2016-06-21  0:26                                           ` John Wiegley
@ 2016-06-21 15:26                                           ` Alan Mackenzie
  2016-06-21 16:09                                             ` Dmitry Gutov
  2016-06-21 16:28                                             ` Dmitry Gutov
  2 siblings, 2 replies; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-21 15:26 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel, Noam Postavsky

Hello, Dmitry.

On Tue, Jun 21, 2016 at 02:30:44AM +0300, Dmitry Gutov wrote:
> On 06/21/2016 01:45 AM, John Wiegley wrote:

> > There is really no need for this kind of tone. Alan is striving to solve a
> > problem;

> No, he's not. He doesn't have a specific problem he's trying to solve, 
> just a lot of opinions and one flimsy, tangentially related, bug report.

I think "lot of opinions" could be summed up in my insistence that there
are several strategies which can be adopted for handling syntax-table
text properties.  You seem to be of the opposite opinion, that there is
one single blessed way of doing this handling, and any other way is thus
the Wrong Thing.

As far as I am aware, there has never been a general discussion on
emacs-devel about this topic.  One isolated developer developed the
strategy you like, and he spread it around existing modes as far as he
could, again, without any consultation that I'm aware of.  If that
discussion had taken place, likely the strategy would be better thought
out, more widely applicable, and better implemented with less resulting
bad feeling.

[ .... ]

> Here's an example in Ruby:

>      a = `def

>      b = :`

> If you add ` at the end of the first line, the code will have one 
> meaning (with the last ` character having syntax-table property 
> "symbol"). Without it, another meaning, and no syntax-table property on 
> the last character.

> Now mentally insert 300000 lines of code between these lines, none of 
> them containing the character `. And imagine yourself adding and 
> removing the ` character at the end of the first line.

Thanks, that's an interesting example.

> If Emacs is supposed to keep the syntax-table value on the last 
> character up to date using after-change-functions, it will have to scan 
> the whole 300000 line buffer after every keypress.

Could it not restrict the scanning to cases where a "`" is inserted or
deleted?  Do you not have to do the scanning anyway when you type in "`"
at the end of the "b = :`" line?

> Addendum:

> With clever enough caching (to be implemented by someone highly 
> motivated), I suppose it's possible to avoid having this problem on 
> *every* keypress. But having to do that even on some of them is bad enough.

You've got a strategy in Ruby Mode which works, and you'll note I've
never tried to talk you into abandoning that strategy.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-21 15:26                                           ` Alan Mackenzie
@ 2016-06-21 16:09                                             ` Dmitry Gutov
  2016-06-21 18:34                                               ` Andreas Röhler
  2016-06-21 21:05                                               ` Alan Mackenzie
  2016-06-21 16:28                                             ` Dmitry Gutov
  1 sibling, 2 replies; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-21 16:09 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel, Noam Postavsky

Hi Alan,

On 06/21/2016 06:26 PM, Alan Mackenzie wrote:

> my insistence that there
> are several strategies which can be adopted for handling syntax-table
> text properties.

This is the kind of vague statement that I've seen a lot, and does not 
further the discussion. Yes, there can be lots of approaches to doing 
stuff. It's a truism.

> You seem to be of the opposite opinion, that there is
> one single blessed way of doing this handling, and any other way is thus
> the Wrong Thing.

No. Of any statements I've made the only one that sounds close is that 
when we're caching syntactic information in a buffer, there must be only 
one source of truth, and not multiple. That is from the comment-cache 
discussion.

Again, if you insist on continuing using the bare after-change-functions 
approach in CC Mode, I'm fine with that, provided you deal with all the 
performance-related consequences, and that you don't try to work around 
its problems by pushing solutions tailored to CC Mode to the core, 
ignoring the needs of the rest of the modes.

> As far as I am aware, there has never been a general discussion on
> emacs-devel about this topic.

Without consultation, meaning nobody asked you? Consider me consulted. 
And seeing how well this and related discussions are going, it's quite 
likely that the result of that would have been coming up with no new 
strategy at all, and giving up in disgust instead.

> One isolated developer developed the
> strategy you like, and he spread it around existing modes as far as he
> could, again, without any consultation that I'm aware of.

The "isolated developer" has been an Emacs maintainer for many years, 
and the strategy in question has been in use for many years now, across 
many packages. So trying to reduce its current importance to "one 
isolated developer" is disingenuous, and rather insulting.

> If that
> discussion had taken place, likely the strategy would be better thought
> out, more widely applicable, and better implemented with less resulting
> bad feeling.

By now, you've had every chance to analyze its current usage, benefits, 
drawbacks, and present a decent alternative that does not regress in 
important aspects, which I'm sure is possible (everything has space for 
improvement).

Instead, we've only seen lots of opinionated statements, complaints 
about being forced to switch (this is between you and Stefan, although I 
also suspect that it could help deal with a lot of CC Mode's performance 
problems, current and future ones), and one flimsy and rather obvious 
bug report.

> ...
> You've got a strategy in Ruby Mode which works, and you'll note I've
> never tried to talk you into abandoning that strategy.

That's not true. You've critiqued it a lot (does that not count as 
persuading to abandon?), and you've tried to push a new incompatible 
facility that would cause problems for syntax-ppss users in the long 
run. Or, at least, is very likely to.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-21 15:26                                           ` Alan Mackenzie
  2016-06-21 16:09                                             ` Dmitry Gutov
@ 2016-06-21 16:28                                             ` Dmitry Gutov
  1 sibling, 0 replies; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-21 16:28 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel, Noam Postavsky

On 06/21/2016 06:26 PM, Alan Mackenzie wrote:

>> If Emacs is supposed to keep the syntax-table value on the last
>> character up to date using after-change-functions, it will have to scan
>> the whole 300000 line buffer after every keypress.
>
> Could it not restrict the scanning to cases where a "`" is inserted or
> deleted?

Even if we could, it would be more like "cases where `, /, " or % are 
inserted or deleted" (with actually more cases, and complex logic for 
the cases' detection).

But "auuugh Ruby Mode is slow when I type /" would be a problem anyway, 
just look at http://debbugs.gnu.org/cgi/bugreport.cgi?bug=22884.

> Do you not have to do the scanning anyway when you type in "`"
> at the end of the "b = :`" line?

I don't. ` has the syntax "string delimiter" in ruby-mode-syntax-table.

When ruby-syntax-propertize is called on the last line, it just checks 
what (syntax-ppss POSITION-OF-THE-LAST-`) returns. If the return value 
says "inside a string", we do not apply syntax-table value "symbol" to 
the last `. Otherwise, we do.

This way, the full scan is replaced with a call to (syntax-ppss), which 
uses cache. In more complex cases, we can go look at the actual 
character that begins the string (or a string-like literal) that we're 
currently inside, parse the text around it locally a bit, and so on, but 
this is always faster than scanning the whole buffer.
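A minimal sketch of that check, for illustration. This is not ruby-mode's actual `ruby-syntax-propertize`; the name `my-ruby-like-syntax-propertize` is hypothetical, and the sketch assumes "`" has string-delimiter syntax in the mode's syntax table and handles only the ":`" case from the example.

```elisp
;;; -*- lexical-binding: t -*-
;; Hypothetical sketch, not ruby-mode's real implementation.
;; Assumes "`" has string-delimiter syntax in the buffer's syntax table.

(defun my-ruby-like-syntax-propertize (start end)
  "Between START and END, give the backquote in \":`\" symbol syntax.
The property is skipped when that backquote lies inside a string."
  (goto-char start)
  (while (re-search-forward ":\\(`\\)" end t)
    (let ((bq (match-beginning 1)))
      ;; (nth 3 STATE) is non-nil when BQ is inside a string.
      ;; `syntax-ppss' answers from its cache rather than rescanning
      ;; the buffer from `point-min'.  It moves point, hence the
      ;; `save-excursion'.
      (unless (nth 3 (save-excursion (syntax-ppss bq)))
        (put-text-property bq (1+ bq) 'syntax-table
                           (string-to-syntax "_"))))))

;; Installation, e.g. from a mode function:
;; (setq-local syntax-propertize-function #'my-ruby-like-syntax-propertize)
```

The function is only called lazily, on the region that `syntax-propertize` asks for, so an edit on the first line does not by itself force any work on the distant last line.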




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-21 16:09                                             ` Dmitry Gutov
@ 2016-06-21 18:34                                               ` Andreas Röhler
  2016-06-21 18:42                                                 ` John Wiegley
  2016-06-21 18:49                                                 ` Eli Zaretskii
  2016-06-21 21:05                                               ` Alan Mackenzie
  1 sibling, 2 replies; 95+ messages in thread
From: Andreas Röhler @ 2016-06-21 18:34 UTC (permalink / raw)
  To: emacs-devel



On 21.06.2016 18:09, Dmitry Gutov wrote:
[ ... ]
> The "isolated developer" has been an Emacs maintainer for many years, 
[ ... ]

Which points at the underlying reasons of the problem. "Emacs hampered 
by legacy code of maintainers" would be worth a study. Hopefully John 
knows how to avoid that trap, which is purely social, not technical.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-21 18:34                                               ` Andreas Röhler
@ 2016-06-21 18:42                                                 ` John Wiegley
  2016-06-21 18:52                                                   ` Eli Zaretskii
  2016-06-21 19:23                                                   ` Andreas Röhler
  2016-06-21 18:49                                                 ` Eli Zaretskii
  1 sibling, 2 replies; 95+ messages in thread
From: John Wiegley @ 2016-06-21 18:42 UTC (permalink / raw)
  To: Andreas Röhler; +Cc: emacs-devel


>>>>> Andreas Röhler <andreas.roehler@online.de> writes:

> Which points at the underlying reasons of the problem. "Emacs hampered by
> legacy code of maintainers" would be worth a study. Hopefully John knows how
> to avoid that trap, which is purely social, not technical.

It's fairly easy to avoid: Be objective, professional and courteous in
analysis and discussion of issues; let evidence drive the discussion, and show
real comparisons between alternatives. If the discussion becomes too abstract,
or is based on "what other people have done in the past", then in my opinion
it lacks the force of argument.

There is no reason that personalities should have anything to do with how
Emacs grows, or which code we use or don't use. If something is objectively
better, we should consider it; if it's not, we'll stick with what works
(however badly) until that better thing comes along.

I'm willing to rip out any code that hampers us, no matter who it was written
by. Show me clear proof, borne by consensus among our developers, and we'll
push the commit to master whenever you're ready.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2



* Re: font-lock-syntactic-keywords obsolet?
  2016-06-21 18:34                                               ` Andreas Röhler
  2016-06-21 18:42                                                 ` John Wiegley
@ 2016-06-21 18:49                                                 ` Eli Zaretskii
  1 sibling, 0 replies; 95+ messages in thread
From: Eli Zaretskii @ 2016-06-21 18:49 UTC (permalink / raw)
  To: Andreas Röhler; +Cc: emacs-devel

> From: Andreas Röhler <andreas.roehler@online.de>
> Date: Tue, 21 Jun 2016 20:34:56 +0200
> 
> On 21.06.2016 18:09, Dmitry Gutov wrote:
> [ ... ]
> > The "isolated developer" has been an Emacs maintainer for many years, 
> [ ... ]
> 
> Which points at the underlying reasons of the problem. "Emacs hampered 
> by legacy code of maintainers" would be worth a study. Hopefully John 
> knows how to avoid that trap, which is purely social, not technical.

Sorry, but that's baloney.  There's no way "to avoid that trap", nor
is there any need to.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-21 18:42                                                 ` John Wiegley
@ 2016-06-21 18:52                                                   ` Eli Zaretskii
  2016-06-27 11:50                                                     ` Andreas Röhler
  2016-06-21 19:23                                                   ` Andreas Röhler
  1 sibling, 1 reply; 95+ messages in thread
From: Eli Zaretskii @ 2016-06-21 18:52 UTC (permalink / raw)
  To: John Wiegley; +Cc: andreas.roehler, emacs-devel

> From: John Wiegley <jwiegley@gmail.com>
> Date: Tue, 21 Jun 2016 11:42:27 -0700
> Cc: emacs-devel@gnu.org
> 
> I'm willing to rip out any code that hampers us, no matter who it was written
> by.

There's no such code in Emacs, not by and large.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-21 18:42                                                 ` John Wiegley
  2016-06-21 18:52                                                   ` Eli Zaretskii
@ 2016-06-21 19:23                                                   ` Andreas Röhler
  1 sibling, 0 replies; 95+ messages in thread
From: Andreas Röhler @ 2016-06-21 19:23 UTC (permalink / raw)
  To: emacs-devel; +Cc: John Wiegley



On 21.06.2016 20:42, John Wiegley wrote:
>>>>>> Andreas Röhler <andreas.roehler@online.de> writes:
>> Which points at the underlying reasons of the problem. "Emacs hampered by
>> legacy code of maintainers" would be worth a study. Hopefully John knows how
>> to avoid that trap, which is purely social, not technical.
> It's fairly easy to avoid: Be objective, professional and courteous in
> analysis and discussion of issues; let evidence drive the discussion, and show
> real comparisons between alternatives. If the discussion becomes too abstract,
> or is based on "what other people have done in the past", then in my opinion
> it lacks the force of argument.
>
> There is no reason that personalities should have anything to do with how
> Emacs grows, or which code we use or don't use. If something is objectively
> better, we should consider it; if it's not, we'll stick with what works
> (however badly) until that better thing comes along.
>
> I'm willing to rip out any code that hampers us, no matter who it was written
> by. Show me clear proof, borne by consensus among our developers, and we'll
> push the commit to master whenever you're ready.
>

So as not to be misunderstood, maybe I should stress my compassion and 
respect for the tremendous work that maintaining Emacs has meant and 
still means.

Looking forward,

Andreas




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-21 16:09                                             ` Dmitry Gutov
  2016-06-21 18:34                                               ` Andreas Röhler
@ 2016-06-21 21:05                                               ` Alan Mackenzie
  2016-06-21 21:17                                                 ` Dmitry Gutov
  1 sibling, 1 reply; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-21 21:05 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Noam Postavsky, emacs-devel

Hello, Dmitry.

On Tue, Jun 21, 2016 at 07:09:56PM +0300, Dmitry Gutov wrote:
> Hi Alan,

> On 06/21/2016 06:26 PM, Alan Mackenzie wrote:

> > my insistence that there
> > are several strategies which can be adopted for handling syntax-table
> > text properties.

> This is the kind of vague statement that I've seen a lot, and does not 
> further the discussion. Yes, there can be lots of approaches to doing 
> stuff. It's a truism.

And each of these ways is equally valid.  Some will be better in some
situations, others in other situations.  What I object to is you trying
to dictate to the Emacs community that they are only to be allowed to
handle syntax-table text properties in your favoured manner.

> > You seem to be of the opposite opinion, that there is
> > one single blessed way of doing this handling, and any other way is thus
> > the Wrong Thing.

> No. Of any statements I've made the only one that sounds close is that 
> when we're caching syntactic information in a buffer, there must be only 
> one source of truth, and not multiple. That is from the comment-cache 
> discussion.

This isn't true.  One such statement you've made is this:

> ....., and you've tried to push a new incompatible facility that would
> cause problems for syntax-ppss users in the long run. Or, at least, is
> very likely to.

There is, at the very least, an implication there that you consider
"syntax-ppss users" in some way privileged, in that other Emacs
developers must constrain their development strategies to fit in with
the desires and deficiencies of these "syntax-ppss users".

I say that it is up to the "syntax-ppss users" to keep their software
compatible with Emacs, not the other way around.  They have no right to
impose constraints on other developers, certainly not on how they will
manipulate syntax-table text properties.

[ .... ]

> Again, if you insist on continuing using the bare after-change-functions 
> approach in CC Mode, I'm fine with that, provided you deal with all the 
> performance-related consequences, ...

There are none.  The performance problems in CC Mode arise from its
insistence on highly accurate fontification.

> ..., and that you don't try to work around its problems by pushing
> solutions tailored to CC Mode to the core, ignoring the needs of the
> rest of the modes.

Other bits of Emacs are welcome to use solutions developed in CC Mode
where appropriate.  I'm not sure how you imagine that implementing
something in CC Mode somehow "ignores the needs of the rest of the
modes".  They're separate modes.

[ .... ]

> > ...
> > You've got a strategy in Ruby Mode which works, and you'll note I've
> > never tried to talk you into abandoning that strategy.

> That's not true. You've critiqued it a lot (does that not count as 
> persuading to abandon?), ....

No, it doesn't.  It's technical discussion leading to greater
understanding on both sides.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-21 14:40                                               ` Alan Mackenzie
@ 2016-06-21 21:06                                                 ` Dmitry Gutov
  2016-06-21 23:50                                                 ` Dmitry Gutov
  1 sibling, 0 replies; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-21 21:06 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Noam Postavsky, emacs-devel

On 06/21/2016 05:40 PM, Alan Mackenzie wrote:

>> Does C++ mode in master support raw strings already? Is there a limit on
>> how far you look for the end of the raw string, and if yes, how much is it?
>
> Yes, no, and N/A, respectively.  Try out this raw string support,
> sometime.

I will.

> At the risk of reigniting arguments, the current mechanism pays no
> attention to the buffer text before a change.  So, if this is relevant
> to the bounds of the region wanting syntax-table props applied/deleted,
> the s-p-extend-r-f mechanism will need to be supplemented by a
> before-change function.  The sort of situation where you'd need it is a
> buffer change that deletes an escaped EOL.  If you only look at the
> buffer in after-change-functions, you'll have no idea how far back the
> original C macro extended, for example.

This makes a certain amount of sense, but I disagree with the 
conclusion: the fact that you *can* do this additional stuff in 
before-change-function, then save and use the resulting information 
later inside syntax-propertize-extend-region-functions, means the latter 
is general enough.
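A rough sketch of that combination, under stated assumptions: the names `my-pre-change-macro-start`, `my-record-macro-start` and `my-extend-region-to-macro-start` are hypothetical, a "macro" is approximated as a run of backslash-continued lines, and narrowing is ignored. It is not how CC Mode does it.

```elisp
;;; -*- lexical-binding: t -*-
;; Hypothetical sketch; a "macro" here is just a run of lines joined
;; by trailing backslashes.

(defvar-local my-pre-change-macro-start nil
  "Earliest macro start recorded by `my-record-macro-start', or nil.")

(defun my-record-macro-start (beg _end)
  "Remember where the backslash-continued line run containing BEG starts.
Meant for `before-change-functions', so it sees the text before the edit."
  (save-excursion
    (goto-char beg)
    (beginning-of-line)
    ;; Step back while the previous line ends in a backslash.
    (while (and (not (bobp))
                (eq (char-before (line-end-position 0)) ?\\))
      (forward-line -1))
    (setq my-pre-change-macro-start
          (min (point) (or my-pre-change-macro-start (point))))))

(defun my-extend-region-to-macro-start (start end)
  "Extend START back to the position recorded before the change, if earlier.
Meant for `syntax-propertize-extend-region-functions'."
  (let ((recorded my-pre-change-macro-start))
    ;; Consume the record so the hook settles on the next round.
    (setq my-pre-change-macro-start nil)
    (when (and recorded (< recorded start))
      (cons recorded end))))

;; Buffer-local installation, e.g. from a mode function:
;; (add-hook 'before-change-functions #'my-record-macro-start nil t)
;; (add-hook 'syntax-propertize-extend-region-functions
;;           #'my-extend-region-to-macro-start nil t)
```

The before-change function captures what the buffer looked like while the escaped EOL was still there; the extend-region function then only has to hand that saved position to syntax-propertize.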

But if you were willing to change how CC Mode works further, maybe you 
wouldn't need this. Instead, you'd need two things:

- A way to get back to a "safe" position. (syntax-ppss) provides this, 
as long as the only types of contexts one has to worry about are 
comments and strings (maybe different kinds), but with C/C++ you may 
have to first (goto-char (car (nth 9 (syntax-ppss)))) and then parse 
locally from that. Failing that, you could keep your own cache of safe 
positions, similar to syntax-ppss cache.

- Know how to syntax-propertize buffer contents going forward from a 
safe position. You must know this already because you can 
syntax-propertize a newly opened buffer.

So when the user edits something inside a macro, 
cc-mode-syntax-propertize, when called on that position, will go back 
and re-propertize the whole macro, along with all buffer contents 
between the safe position and here.

There's a certain overhead associated with this approach, but it's been 
working rather well in many cases, and it's conceptually simple. Maybe 
you should try that and see how big the overhead is.
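The two ingredients above could be sketched roughly as follows. This is an assumption-laden illustration, not CC Mode code: `my-safe-position` and `my-repropertize-from-safe-position` are hypothetical names, and "safe" is taken to mean the outermost enclosing open paren reported by syntax-ppss, falling back to the start of the enclosing string or comment.

```elisp
;;; -*- lexical-binding: t -*-
;; Hypothetical sketch of "go back to a safe position, then propertize
;; forward from there".

(defun my-safe-position (pos)
  "Return a position at or before POS where a local re-parse could start.
This is the outermost open paren enclosing POS, else the start of the
string or comment containing POS, else POS itself."
  (save-excursion
    (let ((state (syntax-ppss pos)))
      (or (car (nth 9 state))   ; outermost enclosing open paren
          (nth 8 state)         ; start of enclosing string or comment
          pos))))

(defun my-repropertize-from-safe-position (pos propertize-fn)
  "Call PROPERTIZE-FN on the region from the safe position before POS to POS.
PROPERTIZE-FN takes START and END, like a `syntax-propertize-function'.
Return the safe position that was used."
  (let ((start (my-safe-position pos)))
    (save-excursion
      (with-silent-modifications
        (funcall propertize-fn start pos)))
    start))
```

A mode that needs more than parens, strings and comments to define "safe" (macros, for instance) would have to replace `my-safe-position` with its own cache lookup, as suggested above.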




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-21 21:05                                               ` Alan Mackenzie
@ 2016-06-21 21:17                                                 ` Dmitry Gutov
  0 siblings, 0 replies; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-21 21:17 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Noam Postavsky, emacs-devel

On 06/22/2016 12:05 AM, Alan Mackenzie wrote:

> What I object to is you trying
> to dictate to the Emacs community that they are only to be allowed to
> handle syntax-table text properties in your favoured manner.

I'm not doing that. Again: please go on with doing whatever you wish in 
CC Mode.

> There is, at the very least, an implication there that you consider
> "syntax-ppss users" in some way privileged, in that other Emacs
> developers must constrain their development strategies to fit in with
> the desires and deficiencies of these "syntax-ppss users".

Only when you try to change how Emacs primitives work. Then yes, you 
damn better consider the existing facilities and their users. I'm again 
referring to comment-cache here and your proposed implementation.

> I say that it is up to the "syntax-ppss users" to keep their software
> compatible with Emacs, not the other way around.

By the same reasoning, you could push for renaming `car' into `cr' for 
efficiency reasons, without an alias, and demand that all other users 
adapt to keep their code compatible. That's not how Emacs works, and you 
know it.

> They have no right to
> impose constraints on other developers, certainly not on how they will
> manipulate syntax-table text properties.

That sounds like a different discussion.

> I'm not sure how you imagine that implementing
> something in CC Mode somehow "ignores the needs of the rest of the
> modes".

It does not while it stays in CC Mode. Or, if extracted from it, remains 
fully optional.

>> That's not true. You've critiqued it a lot (does that not count as
>> persuading to abandon?), ....
>
> No, it doesn't.  It's technical discussion leading to greater
> understanding on both sides.

I really hope it does.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-21 14:40                                               ` Alan Mackenzie
  2016-06-21 21:06                                                 ` Dmitry Gutov
@ 2016-06-21 23:50                                                 ` Dmitry Gutov
  2016-06-23 16:30                                                   ` Alan Mackenzie
  1 sibling, 1 reply; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-21 23:50 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Noam Postavsky, emacs-devel

On 06/21/2016 05:40 PM, Alan Mackenzie wrote:

> Yes, no, and N/A, respectively.  Try out this raw string support,
> sometime.

Here are some results: a performance degradation example, and a way to 
break CC Mode, apparently.

First, paste this near the top of xdisp.c (I did it after the big 
comment and before the header includes, but this is probably not too 
important):

const char* s1 = R"foo(
Hello
World
)foo";

and switch to c++-mode.

1) Delete the last double-quote, then type it again. Deletion is 
noticeably slow (even though bearably so: feels like 0.5 sec), restoring 
it is faster.

2) Delete the first double-quote, then type it again. Get this backtrace:

Debugger entered--Lisp error: (wrong-type-argument integer-or-marker-p nil)
   c-after-change-re-mark-raw-strings(15101 15102 0)
   #[(fn) "\b	\n\v#\207" [fn beg end old-len] 
4](c-after-change-re-mark-raw-strings)
   mapc(#[(fn) "\b	\n\v#\207" [fn beg end old-len] 4] 
(c-extend-font-lock-region-for-macros c-after-change-re-mark-raw-strings 
c-neutralize-syntax-in-and-mark-CPP c-restore-<>-properties 
c-change-expand-fl-region))
   c-after-change(15101 15102 0)
   self-insert-command(1)
   funcall-interactively(self-insert-command 1)
   call-interactively(self-insert-command nil nil)
   command-execute(self-insert-command)

This scenario is really the first thing I've tried.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-21 23:50                                                 ` Dmitry Gutov
@ 2016-06-23 16:30                                                   ` Alan Mackenzie
  2016-06-27 11:48                                                     ` Alan Mackenzie
  2016-06-29  0:30                                                     ` Dmitry Gutov
  0 siblings, 2 replies; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-23 16:30 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Noam Postavsky, emacs-devel

Hello, Dmitry.

On Wed, Jun 22, 2016 at 02:50:58AM +0300, Dmitry Gutov wrote:
> On 06/21/2016 05:40 PM, Alan Mackenzie wrote:

> > Yes, no, and N/A, respectively.  Try out this raw string support,
> > sometime.

> Here are some results: a performance degradation example, and a way to 
> break CC Mode, apparently.

Thanks for doing this test and telling me about it.

> First, paste this near the top of xdisp.c (I did it after the big 
> comment and before the header includes, but this is probably not too 
> important):

> const char* s1 = R"foo(
> Hello
> World
> )foo";

> and switch to c++-mode.

> 1) Delete the last double-quote, then type it again. Deletion is 
> noticeably slow (even though bearably so: feels like 0.5 sec), restoring 
> it is faster.

Yes.  There were two functions in cc-fonts.el that were using
(point-max) as a limit for something, when they should have been using,
respectively, (min limit (point-max)), and limit.  A bit of playing
around suggests there is more to fix, there.

> 2) Delete the first double-quote, then type it again. Get this backtrace:

> Debugger entered--Lisp error: (wrong-type-argument integer-or-marker-p nil)
>    c-after-change-re-mark-raw-strings(15101 15102 0)
>    #[(fn) "\b	\n\v#\207" [fn beg end old-len] 
> 4](c-after-change-re-mark-raw-strings)
>    mapc(#[(fn) "\b	\n\v#\207" [fn beg end old-len] 4] 
> (c-extend-font-lock-region-for-macros c-after-change-re-mark-raw-strings 
> c-neutralize-syntax-in-and-mark-CPP c-restore-<>-properties 
> c-change-expand-fl-region))
>    c-after-change(15101 15102 0)
>    self-insert-command(1)
>    funcall-interactively(self-insert-command 1)
>    call-interactively(self-insert-command nil nil)
>    command-execute(self-insert-command)

Yes.  This was caused by a low level function failing to do
(save-match-data ...) around a (looking-at ....) with the result that
the match-data was corrupted for the higher level function.  That bug's
been there for some while.
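
A minimal sketch of that class of bug, assuming hypothetical helper names
(these are not the actual CC Mode functions): a caller sets the match data,
calls a helper that uses `looking-at', and then reads its own match groups
again.  Without the `save-match-data' wrapper in the helper, group 1 would
no longer exist and `match-beginning' would return nil, which is the sort of
value that produces a `wrong-type-argument integer-or-marker-p nil' error.

```elisp
;; Hypothetical names for illustration; not the actual CC Mode functions.
(defun my-at-raw-opener-p ()
  "Return non-nil if point is at a C++ raw string opener.
The caller's match data is preserved by `save-match-data'."
  (save-match-data
    (looking-at "R\"[^ ()\\\n\r\t]\\{0,16\\}(")))

(defun my-word-start-after-check ()
  "Match a word, call the helper, then use the caller's own match data."
  (with-temp-buffer
    (insert "int R\"foo(")
    (goto-char (point-min))
    (re-search-forward "\\(int\\) ")   ; sets match data; point is now at R
    (my-at-raw-opener-p)               ; must not disturb group 1
    (match-beginning 1)))
```

Evaluating (my-word-start-after-check) returns 1, the start of "int",
because the helper restores the match data before returning.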

> This scenario is really the first thing I've tried.

Thanks.  I've just committed a patch which should fix your exact
scenario.  As I said, there's a little more to do.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-23 16:30                                                   ` Alan Mackenzie
@ 2016-06-27 11:48                                                     ` Alan Mackenzie
  2016-06-29  0:30                                                     ` Dmitry Gutov
  1 sibling, 0 replies; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-27 11:48 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel, Noam Postavsky

Hello, Dmitry.

On Thu, Jun 23, 2016 at 04:30:21PM +0000, Alan Mackenzie wrote:
> On Wed, Jun 22, 2016 at 02:50:58AM +0300, Dmitry Gutov wrote:
> > On 06/21/2016 05:40 PM, Alan Mackenzie wrote:

> > > Yes, no, and N/A, respectively.  Try out this raw string support,
> > > sometime.

> > Here are some results: a performance degradation example, and a way to 
> > break CC Mode, apparently.

> Thanks for doing this test and telling me about it.

> > First, paste this near the top of xdisp.c (I did it after the big 
> > comment and before the header includes, but this is probably not too 
> > important):

> > const char* s1 = R"foo(
> > Hello
> > World
> > )foo";

> > and switch to c++-mode.

> > 1) Delete the last double-quote, then type it again. Deletion is 
> > noticeably slow (even though bearably so: feels like 0.5 sec), restoring 
> > it is faster.

[ .... ]

> Thanks.  I've just committed a patch which should fix your exact
> scenario.  As I said, there's a little more to do.

What was going wrong was that typing characters into a C++ raw string
which had no closing delimiter at all was very sluggish.  This was
because a lot of scanning to the "end of the literal" (i.e. here the end
of the buffer) was being done needlessly.

I've sorted this out, and typing into a raw string now doesn't cause a
delay.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-21 18:52                                                   ` Eli Zaretskii
@ 2016-06-27 11:50                                                     ` Andreas Röhler
  0 siblings, 0 replies; 95+ messages in thread
From: Andreas Röhler @ 2016-06-27 11:50 UTC (permalink / raw)
  To: Eli Zaretskii, John Wiegley; +Cc: emacs-devel



On 21.06.2016 20:52, Eli Zaretskii wrote:
>> From: John Wiegley <jwiegley@gmail.com>
>> Date: Tue, 21 Jun 2016 11:42:27 -0700
>> Cc: emacs-devel@gnu.org
>>
>> I'm willing to rip out any code that hampers us, no matter who it was written
>> by.
> There's no such code in Emacs, not by and large.

There are several such items, ranging in severity from bug to design 
flaw, the latter hampering smart use or maintenance. The new thread 
"Beyond release" lists the first one, as seen from here, WRT difficulty 
and importance.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-23 16:30                                                   ` Alan Mackenzie
  2016-06-27 11:48                                                     ` Alan Mackenzie
@ 2016-06-29  0:30                                                     ` Dmitry Gutov
  2016-06-30  9:52                                                       ` Alan Mackenzie
  1 sibling, 1 reply; 95+ messages in thread
From: Dmitry Gutov @ 2016-06-29  0:30 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Noam Postavsky, emacs-devel

On 06/23/2016 07:30 PM, Alan Mackenzie wrote:

> 1)...  There were two functions in cc-fonts.el that were using
> (point-max) as a limit for something, when they should have been using,
> respectively, (min limit (point-max)), and limit.  A bit of playing
> around suggests there is more to fix, there.

So now the raw strings are properly using limits? Does that mean there 
is a limit on the length of a raw string that CC Mode supports? (Testing 
indicates so).

Maybe it's not too terrible, but, depending on the limit's value, it 
could be a problem in certain specialized files (e.g. in a game's sources 
where the author decided to keep some art assets in the code, or in some 
test files).

Anyway, that's the performance-vs-correctness tradeoff I've mentioned 
earlier. Using syntax-propertize-function, I've never seen the necessity 
to make that choice, so far. And Ruby has several counterparts to C++'s 
raw strings, all with irregular syntax.

> 2) ... This was caused by a low level function failing to do
> (save-match-data ...) around a (looking-at ....) with the result that
> the match-data was corrupted for the higher level function.  That bug's
> been there for some while.

That works now, thanks.




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-29  0:30                                                     ` Dmitry Gutov
@ 2016-06-30  9:52                                                       ` Alan Mackenzie
  2016-07-10  2:01                                                         ` Dmitry Gutov
  0 siblings, 1 reply; 95+ messages in thread
From: Alan Mackenzie @ 2016-06-30  9:52 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Noam Postavsky, emacs-devel

Hello, Dmitry.

On Wed, Jun 29, 2016 at 03:30:27AM +0300, Dmitry Gutov wrote:
> On 06/23/2016 07:30 PM, Alan Mackenzie wrote:

> > 1)...  There were two functions in cc-fonts.el that were using
> > (point-max) as a limit for something, when they should have been using,
> > respectively, (min limit (point-max)), and limit.  A bit of playing
> > around suggests there is more to fix, there.

> So now the raw strings are properly using limits? Does that mean there 
> is a limit on the length of a raw string that CC Mode supports? (Testing 
> indicates so).

There isn't any limit on the length of a raw string that I know about,
nor should there be.  If you've got a test which shows there is such a
limit, please tell me about it!

The "limit" in my previous post was a bound supplied as an argument to
c-font-lock-declarators, which does what it says.  Up till now, that
precise bound wasn't important, since the function stopped anyway when it
reached the end of a (declaration) statement.  But with unterminated raw
strings, that didn't work, and the bound became important.

> Maybe it's not too terrible, but, depending on the limit's value, it 
> could be a problem in certain specialized files (e.g. in a game's sources 
> where the author decided to keep some art assets in the code, or in some 
> test files).

> Anyway, that's the performance-vs-correctness tradeoff I've mentioned 
> earlier. Using syntax-propertize-function, I've never seen the necessity 
> to make that choice, so far. And Ruby has several counterparts to C++'s 
> raw strings, all with irregular syntax.

> > 2) ... This was caused by a low level function failing to do
> > (save-match-data ...) around a (looking-at ....) with the result that
> > the match-data was corrupted for the higher level function.  That bug's
> > been there for some while.

> That works now, thanks.

Excellent!

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-06-30  9:52                                                       ` Alan Mackenzie
@ 2016-07-10  2:01                                                         ` Dmitry Gutov
  2016-07-10 22:11                                                           ` Alan Mackenzie
  0 siblings, 1 reply; 95+ messages in thread
From: Dmitry Gutov @ 2016-07-10  2:01 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Noam Postavsky, emacs-devel


Hi Alan,

Sorry for the late response.

On 06/30/2016 12:52 PM, Alan Mackenzie wrote:

>> So now the raw strings are properly using limits? Does that mean there
>> is a limit on the length of a raw string that CC Mode supports? (Testing
>> indicates so).
>
> There isn't any limit on the length of a raw string that I know about,
> nor should there be.  If you've got a test which shows there is such a
> limit, please tell me about it!

Hmm, maybe not a limit, but long raw strings still aren't getting 
handled right.

Example 1:

- Apply the attached patch to xdisp.c, which should make most of the 
code belong within the raw string literal.
- Visit this file. Switch to c++-mode.
- See the literal highlighted as expected.

Press `M->', to get to the end of the buffer (that happens rather 
slowly, esp. considering we're inside a string, and font-lock can get 
this information quickly).

The literal ends at )foo".

- Modify the trailing "foo" piece: delete it, or replace with "bar", etc 
=> the literal still ends at the same line.

I have to go back to the opener and fiddle with the delimiter there, for 
it to finally notice that something is wrong.

If the raw string is small, on the other hand, I don't see this problem.

Example 2:

- Visit this file. Switch to c++-mode.
- See the literal highlighted as expected.
- M->.

Kill the closing delimiter and paste it a few lines below the opening 
delimiter. See the new positions of the raw string recognized (or not, 
I'm getting different results). But if they are recognized...

- Call `undo' a few times, until the closing delimiter is back at its 
original position. The literal is broken again.

> The "limit" in my previous post was a bound supplied as an argument to
> c-font-lock-declarators, which does what it says.

I'm confused. If, as we discussed before, syntax properties are applied 
in before/after-functions, why does c-font-lock-declarations need to be 
concerned with scanning for raw string bounds?

[-- Attachment #2: c++-raw-string-example.diff --]
[-- Type: text/x-patch, Size: 458 bytes --]

diff --git a/src/xdisp.c b/src/xdisp.c
index d5ffb25..b21cae0 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -579,6 +579,8 @@ redisplay_other_windows (void)
 {
   if (!windows_or_buffers_changed)
     windows_or_buffers_changed = REDISPLAY_SOME;
+
+  const char* s1 = R"foo(
 }
 
 void
@@ -31998,6 +32000,7 @@ cancel_hourglass (void)
       cancel_atimer (hourglass_atimer);
       hourglass_atimer = NULL;
     }
+  )foo";
 
   if (hourglass_shown_p)
     {


* Re: font-lock-syntactic-keywords obsolet?
  2016-07-10  2:01                                                         ` Dmitry Gutov
@ 2016-07-10 22:11                                                           ` Alan Mackenzie
  2016-07-11  0:06                                                             ` Dmitry Gutov
  0 siblings, 1 reply; 95+ messages in thread
From: Alan Mackenzie @ 2016-07-10 22:11 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Noam Postavsky, emacs-devel

Hello, Dmitry.

On Sun, Jul 10, 2016 at 05:01:55AM +0300, Dmitry Gutov wrote:
> Hi Alan,

> Sorry for the late response.

No problems!

> On 06/30/2016 12:52 PM, Alan Mackenzie wrote:

> >> So now the raw strings are properly using limits? Does that mean there
> >> is a limit on the length of a raw string that CC Mode supports? (Testing
> >> indicates so).

> > There isn't any limit on the length of a raw string that I know about,
> > nor should there be.  If you've got a test which shows there is such a
> > limit, please tell me about it!

> Hmm, maybe not a limit, but long raw strings still aren't getting 
> handled right.

Yes.

> Example 1:

> - Apply the attached patch to xdisp.c, which should make most of the 
> code belong within the raw string literal.
> - Visit this file. Switch to c++-mode.
> - See the literal highlighted as expected.

> Press `M->', to get to the end of the buffer (that happens rather 
> slowly, esp. considering we're inside a string, and font-lock can get 
> this information quickly).

> The literal ends at )foo".

> - Modify the trailing "foo" piece: delete it, or replace with "bar", etc 
> => the literal still ends at the same line.

> I have to go back to the opener and fiddle with the delimiter there, for 
> it to finally notice that something is wrong.

> If the raw string is small, on the other hand, I don't see this problem.

> Example 2:

> - Visit this file. Switch to c++-mode.
> - See the literal highlighted as expected.
> - M->.

> Kill the closing delimiter and paste it a few lines below the opening 
> delimiter. See the new positions of the raw string recognized (or not, 
> I'm getting different results). But if they are recognized...

> - Call `undo' a few times, until the closing delimiter is back at its 
> original position. The literal is broken again.

There was some code in the mix designed to stop too much expansion of
a region when there were humongous macros.  (A few years back somebody
had complained about the speed in processing a ~5,000 line macro.)
Unfortunately, this code got caught up in raw string processing.  I hope
the following patch fixes it.  I know that the processing is currently
slow in such a large raw string.  It is probably possible to optimise
this.  Whether it is worthwhile is the question.



diff -r 2fcfc6e054b3 cc-mode.el
--- a/cc-mode.el	Sun Jul 03 17:54:20 2016 +0000
+++ b/cc-mode.el	Sun Jul 10 21:53:29 2016 +0000
@@ -906,14 +906,16 @@
   ;; before change function.
   (goto-char c-new-BEG)
   (c-beginning-of-macro)
-  (setq c-new-BEG (point))
+  (when (< (point) c-new-BEG)
+    (setq c-new-BEG (max (point) (c-determine-limit 500 c-new-BEG))))
 
   (goto-char c-new-END)
   (when (c-beginning-of-macro)
     (c-end-of-macro)
     (or (eobp) (forward-char)))	 ; Over the terminating NL which may be marked
 				 ; with a c-cpp-delimiter category property
-  (setq c-new-END (point)))
+  (when (> (point) c-new-END)
+    (setq c-new-END (min (point) (c-determine-+ve-limit 500 c-new-END)))))
 
 (defun c-depropertize-new-text (beg end old-len)
   ;; Remove from the new text in (BEG END) any and all text properties which
@@ -941,15 +943,17 @@
   ;; Point is undefined on both entry and exit to this function.  The buffer
   ;; will have been widened on entry.
   ;;
+  ;; c-new-BEG has already been extended in `c-extend-region-for-CPP' so we
+  ;; don't need to repeat the exercise here.
+  ;;
   ;; This function is in the C/C++/ObjC value of `c-before-font-lock-functions'.
   (goto-char endd)
-  (if (c-beginning-of-macro)
-      (c-end-of-macro))
-  (setq c-new-END (max endd c-new-END (point)))
-  ;; Determine the region, (c-new-BEG c-new-END), which will get font
-  ;; locked.  This restricts the region should there be long macros.
-  (setq c-new-BEG (max c-new-BEG (c-determine-limit 500 begg))
-	c-new-END (min c-new-END (c-determine-+ve-limit 500 endd))))
+  (when (c-beginning-of-macro)
+    (c-end-of-macro)
+    ;; Determine the region, (c-new-BEG c-new-END), which will get font
+    ;; locked.  This restricts the region should there be long macros.
+    (setq c-new-END (min (max c-new-END (point))
+			 (c-determine-+ve-limit 500 c-new-END)))))
 
 (defun c-neutralize-CPP-line (beg end)
   ;; BEG and END bound a region, typically a preprocessor line.  Put a



> > The "limit" in my previous post was a bound supplied as an argument to
> > c-font-lock-declarators, which does what it says.

> I'm confused. If, as we discussed before, syntax properties are applied 
> in before/after-functions, why does c-font-lock-declarations need to be 
> concerned with scanning for raw string bounds?

The raw string bounds have nothing to do with c-font-lock-declarators.
It's just that that function takes a bound which was hardly ever needed,
since the function stops when it reaches something which signalled the
end of a sequence of declarators (for example, a semicolon).  Hence the
fact that the bound given was wrong didn't get noticed.  However, with
raw strings in the game, when (point-max) was the bound, the function
actually ended up fruitlessly scanning to (point-max) rather than to the
`limit' it should have been scanning to.

The limit to c-font-lock-declarators is now `(min limit (point-max))'.
That way, when the buffer is narrowed to less than `limit', there won't
be an out of bounds error, and when there are unterminated raw strings,
there won't be useless scanning past `limit' either.
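
The effect of the bound can be sketched in isolation.  This is only an
illustration under assumptions: `my-find-terminator' is a hypothetical
helper, not a CC Mode function, and it stands in for any scan that looks
for a terminator such as a semicolon.  With the clamped bound the scan
gives up at `limit'; with (point-max) as the bound it runs through the
whole unterminated raw string.

```elisp
;; Hypothetical helper for illustration; not the actual CC Mode code.
(defun my-find-terminator (limit)
  "Search forward from point for \";\", never looking beyond LIMIT.
LIMIT is clamped to the accessible portion of the buffer.
Return the position after the semicolon, or nil if none was found."
  (save-excursion
    (search-forward ";" (min limit (point-max)) t)))

(defun my-demo-bounded-scan ()
  "Scan an unterminated raw string with a tight and a loose bound."
  (with-temp-buffer
    (insert "const char* s = R\"foo(\n" (make-string 1000 ?x) "\nint i;\n")
    (goto-char (point-min))
    (list (my-find-terminator 40)             ; stops early: nil
          (my-find-terminator (point-max))))) ; scans the whole buffer
```

The first element of the result is nil, since there is no semicolon within
the first 40 positions; the second is the position after the distant
semicolon.  The helper is also safe to call in a narrowed buffer with a
LIMIT that lies outside the accessible region.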

I'm not sure if the above will help much, but I hope it does.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-07-10 22:11                                                           ` Alan Mackenzie
@ 2016-07-11  0:06                                                             ` Dmitry Gutov
  2016-07-11 17:20                                                               ` Alan Mackenzie
  0 siblings, 1 reply; 95+ messages in thread
From: Dmitry Gutov @ 2016-07-11  0:06 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Noam Postavsky, emacs-devel

On 07/11/2016 01:11 AM, Alan Mackenzie wrote:

> There was some code in the mix designed to stop too much expansion of
> a region when there were humongous macros.  (A few years back somebody
> had complained about the speed in processing a ~5,000 line macro.)
> Unfortunately, this code got caught up in raw string processing.  I hope
> the following patch fixes it.

It works, thanks.

Here's a comparatively minor problem: normally, "foo" in the closer has 
the default face. But if I backspace over the semicolon that follows it, 
and then type it again, "foo" gets the font-lock-string-face.

Similarly if I do backspace over and retype the closing double-quote 
character instead.

> I know that the processing is currently
> slow in such a large raw string.  It is probably possible to optimise
> this.  Whether it is worthwhile is the question.

I don't know. Probably not if that will involve making the code even 
more complex.

>> I'm confused. If, as we discussed before, syntax properties are applied
>> in before/after-functions, why does c-font-lock-declarations need to be
>> concerned with scanning for raw string bounds?
>
> The raw string bounds have nothing to do with c-font-lock-declarators.
> It's just that that function takes a bound which was hardly ever needed,
> since the function stops when it reaches something which signalled the
> end of a sequence of declarators (for example, a semicolon).  Hence the
> fact that the bound given was wrong didn't get noticed.  However, with
> raw strings in the game, when (point-max) was the bound, the function
> actually ended up fruitlessly scanning to (point-max) rather than to the
> `limit' it should have been scanning to.
>
> The limit to c-font-lock-declarators is now `(min limit (point-max))'.
> That way, when the buffer is narrowed to less than `limit', there won't
> be an out of bounds error, and when there are unterminated raw strings,
> there won't be useless scanning past `limit' either.
>
> I'm not sure if the above will help much, but I hope it does.

Not sure. That description sounds like it's about performance, and not 
about correctness. Since you're only mentioning scanning forward, that 
doesn't seem to account for the problems I see when fiddling with the 
closing delimiter.




* Re: font-lock-syntactic-keywords obsolet?
  2016-07-11  0:06                                                             ` Dmitry Gutov
@ 2016-07-11 17:20                                                               ` Alan Mackenzie
  2016-07-11 22:44                                                                 ` Dmitry Gutov
  0 siblings, 1 reply; 95+ messages in thread
From: Alan Mackenzie @ 2016-07-11 17:20 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Noam Postavsky, emacs-devel

Hello, Dmitry.

On Mon, Jul 11, 2016 at 03:06:30AM +0300, Dmitry Gutov wrote:
> On 07/11/2016 01:11 AM, Alan Mackenzie wrote:

> > There was some code in the mix designed to stop too much expansion of
> > a region when there were humongous macros.  (A few years back somebody
> > had complained about the speed in processing a ~5,000 line macro.)
> > Unfortunately, this code got caught up in raw string processing.  I hope
> > the following patch fixes it.

> It works, thanks.

> Here's a comparatively minor problem: normally, "foo" in the closer has 
> the default face. But if I backspace over the semicolon that follows it, 
> and then type it again, "foo" gets the font-lock-string-face.

> Similarly if I do backspace over and retype the closing double-quote 
> character instead.

Yes.  Thank you for finding these glitches, and sorry I've not been more
careful to find them myself.  Please try out the following patch (as a
supplement to the last one, not a replacement):




diff --git a/lisp/progmodes/cc-fonts.el b/lisp/progmodes/cc-fonts.el
index dfc2c06..b45686c 100644
--- a/lisp/progmodes/cc-fonts.el
+++ b/lisp/progmodes/cc-fonts.el
@@ -1542,33 +1542,45 @@ c-font-lock-raw-strings
   ;; font-lock-keyword-face.  It always returns NIL to inhibit this and
   ;; prevent a repeat invocation.  See elisp/lispref page "Search-based
   ;; Fontification".
-  (while (search-forward-regexp
-	  "R\\(\"\\)\\([^ ()\\\n\r\t]\\{,16\\}\\)(" limit t)
-    (when
-	(or (and (eobp)
-		 (eq (c-get-char-property (1- (point)) 'face)
-		     'font-lock-warning-face))
-	    (eq (c-get-char-property (point) 'face) 'font-lock-string-face)
-	    (and (equal (c-get-char-property (match-end 2) 'syntax-table) '(1))
-		 (equal (c-get-char-property (match-beginning 1) 'syntax-table)
-			'(1))))
-      (let ((paren-prop (c-get-char-property (1- (point)) 'syntax-table)))
-	(if paren-prop
-	    (progn
-	      (c-put-font-lock-face (match-beginning 0) (match-end 0)
-				    'font-lock-warning-face)
-	      (when
-		  (and
-		   (equal paren-prop '(15))
-		   (not (c-search-forward-char-property 'syntax-table '(15) limit)))
-		(goto-char limit)))
-	  (c-put-font-lock-face (match-beginning 1) (match-end 2) 'default)
-	  (when (search-forward-regexp
-		 (concat ")\\(" (regexp-quote (match-string-no-properties 2))
-			 "\\)\"")
-		 limit t)
-	    (c-put-font-lock-face (match-beginning 1) (point)
-				  'default))))))
+  (let* ((state (c-state-semi-pp-to-literal (point)))
+	 (string-start (and (eq (cadr state) 'string)
+			    (car (cddr state))))
+	 (raw-id (and string-start
+		      (save-excursion
+			(goto-char string-start)
+			(and (eq (char-before) ?R)
+			     (looking-at "\"\\([^ ()\\\n\r\t]\\{0,16\\}\\)(")
+			     (match-string-no-properties 1))))))
+    (while (< (point) limit)
+      (if raw-id
+	  (progn
+	    (if (search-forward-regexp (concat ")\\(" (regexp-quote raw-id) "\\)\"")
+				       limit 'limit)
+		(c-put-font-lock-face (match-beginning 1) (point) 'default))
+	    (setq raw-id nil))
+
+	(when (search-forward-regexp
+	       "R\\(\"\\)\\([^ ()\\\n\r\t]\\{0,16\\}\\)(" limit 'limit)
+	  (when
+	      (or (and (eobp)
+		       (eq (c-get-char-property (1- (point)) 'face)
+			   'font-lock-warning-face))
+		  (eq (c-get-char-property (point) 'face) 'font-lock-string-face)
+		  (and (equal (c-get-char-property (match-end 2) 'syntax-table) '(1))
+		       (equal (c-get-char-property (match-beginning 1) 'syntax-table)
+			      '(1))))
+	    (let ((paren-prop (c-get-char-property (1- (point)) 'syntax-table)))
+	      (if paren-prop
+		  (progn
+		    (c-put-font-lock-face (match-beginning 0) (match-end 0)
+					  'font-lock-warning-face)
+		    (when
+			(and
+			 (equal paren-prop '(15))
+			 (not (c-search-forward-char-property 'syntax-table '(15) limit)))
+		      (goto-char limit)))
+		(c-put-font-lock-face (match-beginning 1) (match-end 2) 'default)
+		(setq raw-id (match-string-no-properties 2)))))))))
   nil)
 
 (c-lang-defconst c-simple-decl-matchers
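
The opener/closer pairing used in the patch above can be sketched on its
own.  This is a simplified illustration, not the CC Mode implementation:
`my-raw-string-end' and `my-demo-raw-string' are hypothetical names, and
the sketch ignores text properties and faces entirely.  It reuses the
opener regexp from the patch and builds the closer from the captured
identifier with `regexp-quote', so that identifier characters such as "."
are matched literally.

```elisp
;; Hypothetical helpers for illustration; not the actual CC Mode code.
(defun my-raw-string-end ()
  "Return the position just after the next complete raw string, or nil.
The search starts at point; point is not moved."
  (save-excursion
    ;; Opener: R, a double quote, up to 16 identifier characters, a paren.
    (when (re-search-forward "R\"\\([^ ()\\\n\r\t]\\{0,16\\}\\)(" nil t)
      (let ((id (match-string-no-properties 1)))
        ;; Closer: a paren, the same identifier taken literally, a quote.
        (re-search-forward (concat ")" (regexp-quote id) "\"") nil t)))))

(defun my-demo-raw-string (text)
  "Insert TEXT into a temporary buffer and return `my-raw-string-end' there."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (my-raw-string-end)))
```

For the test case from earlier in the thread,
(my-demo-raw-string "const char* s1 = R\"foo(\nHello\nWorld\n)foo\";")
returns a buffer position, whereas changing the closer to )bar" makes it
return nil.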


> > I know that the processing is currently
> > slow in such a large raw string.  It is probably possible to optimise
> > this.  Whether it is worthwhile is the question.

> I don't know. Probably not if that will involve making the code even 
> more complex.

Optimisation does make code more complicated.  My feeling is that raw
strings of length 1Mbyte are going to be quite rare, and that I should
wait for somebody to complain, first.

> >> I'm confused. If, as we discussed before, syntax properties are applied
> >> in before/after-functions, why does c-font-lock-declarations need to be
> >> concerned with scanning for raw string bounds?

> > The raw string bounds have nothing to do with c-font-lock-declarators.
> > It's just that that function takes a bound which was hardly ever needed,
> > since the function stops when it reaches something which signalled the
> > end of a sequence of declarators (for example, a semicolon).  Hence the
> > fact that the bound given was wrong didn't get noticed.  However, with
> > raw strings in the game, when (point-max) was the bound, the function
> > actually ended up fruitlessly scanning to (point-max) rather than to the
> > `limit' it should have been scanning to.

> > The limit to c-font-lock-declarators is now `(min limit (point-max))'.
> > That way, when the buffer is narrowed to less than `limit', there won't
> > be an out of bounds error, and when there are unterminated raw strings,
> > there won't be useless scanning past `limit' either.

> > I'm not sure if the above will help much, but I hope it does.

> Not sure. That description sounds like it's about performance, and not 
> about correctness.

Well, giving the correct bound improves the performance.  Let's leave it
at that.

> Since you're only mentioning scanning forward, that doesn't seem to
> account for the problems I see when fiddling with the closing
> delimiter.

No, they're two unrelated things.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: font-lock-syntactic-keywords obsolet?
  2016-07-11 17:20                                                               ` Alan Mackenzie
@ 2016-07-11 22:44                                                                 ` Dmitry Gutov
  0 siblings, 0 replies; 95+ messages in thread
From: Dmitry Gutov @ 2016-07-11 22:44 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Noam Postavsky, emacs-devel

Hi Alan,

On 07/11/2016 08:20 PM, Alan Mackenzie wrote:

> Yes.  Thank you for finding these glitches, and sorry I've not been more
> careful to find them myself.  Please try out the following patch (as a
> supplement to the last one, not a replacement):

It seems to do it, thanks. If I find any further problems, I'll send 
them to the bug tracker.

> Optimisation does make code more complicated.  My feeling is that raw
> strings of length 1Mbyte are going to be quite rare, and that I should
> wait for somebody to complain, first.

Naturally, but a code reorganization can both improve performance and 
reduce the complexity, or at least keep it on the same level.

I'm repeating myself, but in ruby-mode I can have a 1Mbyte long heredoc 
(or a percent literal, which is like a C++ raw string, but with more 
features), and edit either of its bounds with much better latency.

Anyway, this subject seems exhausted. Thanks for the discussion.



^ permalink raw reply	[flat|nested] 95+ messages in thread
