* bug#22983: syntax-ppss returns wrong result. @ 2016-03-11 15:15 Alan Mackenzie 2016-03-11 20:31 ` Dmitry Gutov ` (3 more replies) 0 siblings, 4 replies; 155+ messages in thread From: Alan Mackenzie @ 2016-03-11 15:15 UTC (permalink / raw) To: 22983 Hello, Emacs. The fundamental contract in syntax-ppss is that (syntax-ppss POS) returns the same value as (parse-partial-sexp (point-min) POS) (with the exception of elements 2 and 6). This is currently not always the case. In the master branch, emacs -Q and visit xdisp.c with C-x C-f. Follow this recipe: M-: (syntax-ppss-flush-cache 1) M-: (setq ppss-0 (syntax-ppss 40000)) M-< C-s #include " <CR> M-> C-x n n M-: (setq ppss-1 (syntax-ppss 40000)) M-: (setq parse (parse-partial-sexp (point-min) 40000)) At this point, `ppss-1' and `parse' should match (apart from elements 2 and 6). What we actually have is: ppss-1: (2 39992 nil nil nil nil 2 nil nil (39975 39992)) parse: (0 nil 15674 34 nil nil 0 nil 15675 nil) . -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 155+ messages in thread
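The comparison in the recipe above can be automated. What follows is a minimal sketch, not code from the thread: the names `my-ppss-comparable` and `my-ppss-matches-parse-p` are hypothetical. It assumes only the first ten elements of a parse state matter and blanks elements 2 and 6, as the report does.

```elisp
;; -*- lexical-binding: t -*-
(require 'cl-lib)

(defun my-ppss-comparable (state)
  "Return the first 10 elements of STATE with elements 2 and 6 set to nil.
STATE is a parse state as returned by `parse-partial-sexp'."
  ;; `cl-subseq' returns a fresh list, so STATE itself is not modified.
  (let ((s (cl-subseq state 0 (min 10 (length state)))))
    (setf (nth 2 s) nil
          (nth 6 s) nil)
    s))

(defun my-ppss-matches-parse-p (pos)
  "Non-nil if `syntax-ppss' at POS agrees with a parse from `point-min'."
  ;; Both functions can move point, hence the `save-excursion' forms.
  (equal (my-ppss-comparable (save-excursion (syntax-ppss pos)))
         (my-ppss-comparable
          (save-excursion (parse-partial-sexp (point-min) pos)))))
```

After following the recipe, evaluating (my-ppss-matches-parse-p 40000) in the narrowed buffer would show whether the contract holds there.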
* bug#22983: syntax-ppss returns wrong result. 2016-03-11 15:15 bug#22983: syntax-ppss returns wrong result Alan Mackenzie @ 2016-03-11 20:31 ` Dmitry Gutov 2016-03-11 21:24 ` Alan Mackenzie 2016-03-13 18:52 ` Andreas Röhler ` (2 subsequent siblings) 3 siblings, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2016-03-11 20:31 UTC (permalink / raw) To: Alan Mackenzie, 22983 On 03/11/2016 05:15 PM, Alan Mackenzie wrote: > At this point, `ppss-1' and `parse' should match (apart from elements 2 > and 6). What we actually have is: > > ppss-1: (2 39992 nil nil nil nil 2 nil nil (39975 39992)) > parse: (0 nil 15674 34 nil nil 0 nil 15675 nil) I think you mean that ppss-0 and ppss-1 must match independent of narrowing, and also match (parse-partial-sexp 1 40000). Considering narrowing can change point-min arbitrarily, specifying (syntax-ppss pos) as (parse-partial-sexp (point-min) pos) is a losing proposition if you want consistency. Alas, we have some code out there that implements multiple-major-mode functionality using narrowing and some hacking of syntax-ppss-last syntax-ppss-cache values. Changing syntax-ppss to be independent of narrowing will break it, and we'll need to provide some alternative first. We could introduce a syntax-ppss-dont-widen variable, though. Similar to font-lock-dont-widen. ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2016-03-11 20:31 ` Dmitry Gutov @ 2016-03-11 21:24 ` Alan Mackenzie 2016-03-11 21:35 ` Dmitry Gutov 2016-03-13 17:32 ` bug#22983: syntax-ppss returns wrong result Stefan Monnier 0 siblings, 2 replies; 155+ messages in thread From: Alan Mackenzie @ 2016-03-11 21:24 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 22983 Hello, Dmitry. On Fri, Mar 11, 2016 at 10:31:50PM +0200, Dmitry Gutov wrote: > On 03/11/2016 05:15 PM, Alan Mackenzie wrote: > > At this point, `ppss-1' and `parse' should match (apart from elements 2 > > and 6). What we actually have is: > > ppss-1: (2 39992 nil nil nil nil 2 nil nil (39975 39992)) > > parse: (0 nil 15674 34 nil nil 0 nil 15675 nil) > I think you mean that ppss-0 and ppss-1 must match independent of > narrowing, and also match (parse-partial-sexp 1 40000). Er no, I meant what I wrote: the result of (syntax-ppss pos) must match that of (parse-partial-sexp (point-min) pos). I think ppss-0 and ppss-1 did actually match (but I can't quite remember). > Considering narrowing can change point-min arbitrarily, specifying > (syntax-ppss pos) as (parse-partial-sexp (point-min) pos) is a losing > proposition if you want consistency. Indeed. But that is how syntax-ppss is specified, and (partially) how it is implemented. > Alas, we have some code out there that implements multiple-major-mode > functionality using narrowing and some hacking of syntax-ppss-last > syntax-ppss-cache values. > Changing syntax-ppss to be independent of narrowing will break it, and > we'll need to provide some alternative first. syntax-ppss is broken, and can't be fixed. The only sensible fix would be to specify that (syntax-ppss pos) is the same as (parse-partial-sexp 1 pos). But that is then a totally different function, and there are around 200 uses in the Emacs sources to check and fix, to say nothing of external code. > We could introduce a syntax-ppss-dont-widen variable, though. Similar to > font-lock-dont-widen. 
I'm trying to figure that out. Wouldn't that still leave you with problems when point-min is inside a string? -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2016-03-11 21:24 ` Alan Mackenzie @ 2016-03-11 21:35 ` Dmitry Gutov 2016-03-11 22:15 ` Alan Mackenzie 2016-03-13 17:32 ` bug#22983: syntax-ppss returns wrong result Stefan Monnier 1 sibling, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2016-03-11 21:35 UTC (permalink / raw) To: Alan Mackenzie; +Cc: 22983 On 03/11/2016 11:24 PM, Alan Mackenzie wrote: >> I think you mean that ppss-0 and ppss-1 must match independent of >> narrowing, and also match (parse-partial-sexp 1 40000). > > Er no, I meant what I wrote: the result of (syntax-ppss pos) must match > that of (parse-partial-sexp (point-min) pos). I think ppss-0 and ppss-1 > did actually match (but I can't quite remember). I imagine they didn't. I got the same value in all three cases, though, so your scenario could use some revising. >> Considering narrowing can change point-min arbitrarily, specifying >> (syntax-ppss pos) as (parse-partial-sexp (point-min) pos) is a losing >> proposition if you want consistency. > > Indeed. But that is how syntax-ppss is specified, and (partially) how > it is implemented. That part of specification can be rephrased. >> Alas, we have some code out there that implements multiple-major-mode >> functionality using narrowing and some hacking of syntax-ppss-last >> syntax-ppss-cache values. > >> Changing syntax-ppss to be independent of narrowing will break it, and >> we'll need to provide some alternative first. > > syntax-ppss is broken, and can't be fixed. It's used ubiquitously, so it must be working. > The only sensible fix would > be to specify that (syntax-ppss pos) is the same as (parse-partial-sexp > 1 pos). But that is then a totally different function, and there are > around 200 uses in the Emacs sources to check and fix, to say nothing of > external code. Not entirely different, no. AFAIK, these are the semantics the vast majority of its usages expect. 
Except the multiple-major-mode case, which we'd ideally try to accommodate, too. >> We could introduce a syntax-ppss-dont-widen variable, though. Similar to >> font-lock-dont-widen. > > I'm trying to figure that out. Wouldn't that still leave you with > problems when point-min is inside a string? syntax-ppss-dont-widen would be nil by default, it would be an escape hatch toward the current semantics, for when the caller knows how to manage narrowings, etc. ^ permalink raw reply [flat|nested] 155+ messages in thread
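The "widen by default, with an escape hatch" idea can be sketched as a wrapper around `syntax-ppss`. This is an illustration only: `my-ppss-dont-widen` and `my-syntax-ppss` are hypothetical names standing in for the proposed `syntax-ppss-dont-widen`, and the sketch does not address cache consistency when the flag changes between calls.

```elisp
;; -*- lexical-binding: t -*-
(defvar my-ppss-dont-widen nil
  "Hypothetical flag: when non-nil, `my-syntax-ppss' respects the narrowing.")

(defun my-syntax-ppss (&optional pos)
  "Return the parse state at POS (default point).
Widen first, so the parse starts at the real beginning of the buffer,
unless `my-ppss-dont-widen' is non-nil."
  (let ((pos (or pos (point))))
    (save-excursion
      (if my-ppss-dont-widen
          ;; Escape hatch: the caller manages the narrowing itself.
          (syntax-ppss pos)
        (save-restriction
          (widen)
          (syntax-ppss pos))))))
```

A caller that manages its own narrowing, such as a multiple-major-mode package, would bind the flag to t around its calls; everyone else gets results that do not depend on where point-min happens to fall.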
* bug#22983: syntax-ppss returns wrong result. 2016-03-11 21:35 ` Dmitry Gutov @ 2016-03-11 22:15 ` Alan Mackenzie 2016-03-11 22:38 ` Dmitry Gutov 0 siblings, 1 reply; 155+ messages in thread From: Alan Mackenzie @ 2016-03-11 22:15 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 22983 Hello, Dmitry. On Fri, Mar 11, 2016 at 11:35:08PM +0200, Dmitry Gutov wrote: > On 03/11/2016 11:24 PM, Alan Mackenzie wrote: > > Er no, I meant what I wrote: the result of (syntax-ppss pos) must match > > that of (parse-partial-sexp (point-min) pos). I think ppss-0 and ppss-1 > > did actually match (but I can't quite remember). > I imagine they didn't. I got the same value in all three cases, though, > so your scenario could use some revising. Sorry about that. > >> Considering narrowing can change point-min arbitrarily, specifying > >> (syntax-ppss pos) as (parse-partial-sexp (point-min) pos) is a losing > >> proposition if you want consistency. > > Indeed. But that is how syntax-ppss is specified, and (partially) how > > it is implemented. > That part of specification can be rephrased. It's more than the specification which needs redoing. The implementation (sometimes) returns the equivalent of (parse-partial-sexp (point-min) pos)), when point-min is not in a "safe place". > >> Alas, we have some code out there that implements multiple-major-mode > >> functionality using narrowing and some hacking of syntax-ppss-last > >> syntax-ppss-cache values. > >> Changing syntax-ppss to be independent of narrowing will break it, and > >> we'll need to provide some alternative first. > > syntax-ppss is broken, and can't be fixed. > It's used ubiquitously, so it must be working. It might well be ubiquitous, but it's broken. Consider this: syntax-ppss will return the result of a parse based at point-min. In general, the caller does not know whether point-min is in a string or not. 
Therefore the result is of little value, UNLESS the caller takes special action, such as widening the buffer before every call to syntax-ppss. > > The only sensible fix would be to specify that (syntax-ppss pos) is > > the same as (parse-partial-sexp 1 pos). But that is then a totally > > different function, and there are around 200 uses in the Emacs > > sources to check and fix, to say nothing of external code. > Not entirely different, no. AFAIK, these are the semantics the vast > majority of its usages expect. But it's not the semantics these .el files get. What's probably keeping them functional is the rarity with which buffers are narrowed to an "awkward" point-min. > Except the multiple-major-mode case, which we'd ideally try to > accommodate, too. How does this code handle the changeover of syntax tables at a mode boundary? > >> We could introduce a syntax-ppss-dont-widen variable, though. Similar to > >> font-lock-dont-widen. > > I'm trying to figure that out. Wouldn't that still leave you with > > problems when point-min is inside a string? > syntax-ppss-dont-widen would be nil by default, it would be an escape > hatch toward the current semantics, for when the caller knows how to > manage narrowings, etc. Ah, OK. I think I see that now. Maybe. Surely the trouble is that either ALL calls or NONE must have s-p-dont-widen set. When that flag is toggled, all the caches have to be cleared. Maybe there should be some initialisation flag in some initialisation function. Or something like that. (It's getting late!). It strikes me that the multiple major mode stuff could do with a substantially enhanced version of syntax-ppss which would smoothly handle going over a mode boundary. But I don't know how you're implementing that. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2016-03-11 22:15 ` Alan Mackenzie @ 2016-03-11 22:38 ` Dmitry Gutov 2016-03-13 17:37 ` Stefan Monnier 2016-03-14 15:16 ` Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] Alan Mackenzie 0 siblings, 2 replies; 155+ messages in thread From: Dmitry Gutov @ 2016-03-11 22:38 UTC (permalink / raw) To: Alan Mackenzie; +Cc: 22983 On 03/12/2016 12:15 AM, Alan Mackenzie wrote: >> That part of specification can be rephrased. > > It's more than the specification which needs redoing. The implementation > (sometimes) returns the equivalent of (parse-partial-sexp (point-min) > pos)), when point-min is not in a "safe place". Sure. I just meant that we shouldn't get hung up on that element of the specification. >>>> Changing syntax-ppss to be independent of narrowing will break it, and >>>> we'll need to provide some alternative first. > >>> syntax-ppss is broken, and can't be fixed. > >> It's used ubiquitously, so it must be working. > > It might well be ubiquitous, but it's broken. And yet, it can be fixed. > Consider this: syntax-ppss > will return the result of a parse based at point-min. In general, the > caller does not know whether point-min is in a string or not. Therefore > the result is of little value, UNLESS the caller takes special action, > such as widening the buffer before every call to syntax-ppss. You can say that. >> Not entirely different, no. AFAIK, these are the semantics the vast >> majority of its usages expect. > > But it's not the semantics these .el files get. What's probably keeping > them functional is the rarity with which buffers are narrowed to an > "awkward" point-min. Another thing that keeps it together, is that narrowing, as a user-level operator, is not that popular. Personally, I consider it an anti-feature. >> Except the multiple-major-mode case, which we'd ideally try to >> accommodate, too. 
> > How does this code handle the changeover of syntax tables at a mode > boundary? The "inner" regions start with an "empty", top-level state. This is actually fine, because these are usually small enough not to benefit from the syntax-ppss cache too much (and syntax-ppss-last still helps). The parts of the outer region following a subregion with different syntax table... rely on a few hacks, and a manual application of a `syntax-table' property when necessary. We need a better solution there, but it's probably out of scope for this discussion. >> syntax-ppss-dont-widen would be nil by default, it would be an escape >> hatch toward the current semantics, for when the caller knows how to >> manage narrowings, etc. > > Ah, OK. I think I see that now. Maybe. Surely the trouble is that > either ALL calls or NONE must have s-p-dont-widen set. Hmm, you're right. This variable still seems essential, but to be safe, mmm-mode and friends should probably also advise syntax-ppss, to always perform narrowing as appropriate. > When that flag is > toggled, all the caches have to be cleared. Maybe there should be some > initialisation flag in some initialisation function. Or something like > that. (It's getting late!). Is the syntax-ppss-dont-widen really relevant for your comment cache? It would be used only by certain major modes, and worst comes to worst, you could disable the cache in those buffers. > It strikes me that the multiple major mode stuff could do with a > substantially enhanced version of syntax-ppss which would smoothly handle > going over a mode boundary. But I don't know how you're implementing > that. So far, we're just wrapping the font-lock and indentation code, and otherwise hope for the best. ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2016-03-11 22:38 ` Dmitry Gutov @ 2016-03-13 17:37 ` Stefan Monnier 2016-03-13 18:57 ` Alan Mackenzie 2016-03-14 15:16 ` Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] Alan Mackenzie 1 sibling, 1 reply; 155+ messages in thread From: Stefan Monnier @ 2016-03-13 17:37 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Alan Mackenzie, 22983 >> But it's not the semantics these .el files get. What's probably keeping >> them functional is the rarity with which buffers are narrowed to an >> "awkward" point-min. > Another thing that keeps it together, is that narrowing, as a user-level > operator, is not that popular. Luckily, yes. > Personally, I consider it an anti-feature. Same here. Luckily also, as pointed out elsewhere, the semantics of it is unclear, so that in several important cases, whichever behavior we end up choosing will be both correct for some users and incorrect for others. Hence, so far, I didn't make any effort to try and "do the right thing" for user-activated narrowing, since these are just not well defined enough to even determine what is "the right thing". Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2016-03-13 17:37 ` Stefan Monnier @ 2016-03-13 18:57 ` Alan Mackenzie 2016-03-14 0:47 ` Dmitry Gutov 2016-03-14 1:49 ` Stefan Monnier 0 siblings, 2 replies; 155+ messages in thread From: Alan Mackenzie @ 2016-03-13 18:57 UTC (permalink / raw) To: Stefan Monnier; +Cc: 22983, Dmitry Gutov Hello, Stefan. On Sun, Mar 13, 2016 at 01:37:27PM -0400, Stefan Monnier wrote: > >> But it's not the semantics these .el files get. What's probably keeping > >> them functional is the rarity with which buffers are narrowed to an > >> "awkward" point-min. > > Another thing that keeps it together, is that narrowing, as a user-level > > operator, is not that popular. > Luckily, yes. I happen to use it frequently. I expect other users do, too. It's useful. > > Personally, I consider it an anti-feature. > Same here. Luckily also, as pointed out elsewhere, the semantics of it > is unclear, so that in several important cases, whichever behavior we > end up choosing will be both correct for some users and incorrect > for others. That's pure sophistry. The semantics needed are quite clear: What were strings and comments before narrowing should remain strings and comments after narrowing. Otherwise, nothing would work in such a narrowed buffer. Font-locking, for example, behaves properly in a narrowed buffer. > Hence, so far, I didn't make any effort to try and "do the right thing" > for user-activated narrowing, since these are just not well defined > enough to even determine what is "the right thing". Let's define them as I said in the previous paragraph. Or can you conceive of a use case where one would want narrowing to invert strings and non-strings, leaving comments totally random? Do you have any views on how the bug should be resolved? > Stefan -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2016-03-13 18:57 ` Alan Mackenzie @ 2016-03-14 0:47 ` Dmitry Gutov 2016-03-14 1:04 ` Drew Adams 2016-03-14 1:49 ` Stefan Monnier 1 sibling, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2016-03-14 0:47 UTC (permalink / raw) To: Alan Mackenzie, Stefan Monnier; +Cc: 22983 On 03/13/2016 08:57 PM, Alan Mackenzie wrote: > I happen to use it frequently. I expect other users do, to. It's > useful. It might be, but it's not very well-designed. If you only want to hide some parts of buffer from being displayed, changing point-min and point-max, which affect quite a lot of Lisp functions, seems unnecessary. Introducing a couple of global variables that would only be read by the display code, seems like a better approach. I don't think that narrow-to-region should be a user-level function. Introducing a new function, using a different mechanism shouldn't be too hard though, if we reuse the existing binding. > What were > strings and comments before narrowing should remain strings and comments > after narrowing. Otherwise, nothing would work in such a narrowed > buffer. font-locking, for example, behaves properly in a narrowed > buffer. It behaves like we tell it to behave. If I bind font-lock-dont-widen to t, font-lock won't look beyond the narrowing. >> Hence, so far, I didn't make any effort to try and "do the right thing" >> for user-activated narrowing, since these are just not well defined >> enough to even determine what is "the right thing". > > Lets define them as I said in the previous paragraph. Or can you > conceive of a use case where one would want narrowing to invert strings > and non-strings, leaving comments totally random? At risk of inviting further confusion, yes, mmm-mode and polymode (new example!) use narrowing to persuade font-lock and indentation code that there's nothing beyond the narrowed region. 
We might declare such usages invalid, and that's a possible choice, but I think keeping support for them wouldn't be too hard, at least for a while. Note that if your comment cache always widens the buffer before calculating the values to save, its result might conflict with syntax-ppss in mmm-mode and polymode (right?). Leading to font-lock, indentation and certain commands behaving in different, conflicting ways. That's just conjecture at this point, of course. > Do you have any views on how the bug should be resolved? Stefan probably has another opinion, but I'd either ignore the issue of narrowing, or introduce syntax-ppss-dont-widen like proposed (and thus make syntax-ppss widen by default). Together with adding a command that would replace interactive use of narrow-to-region. ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2016-03-14 0:47 ` Dmitry Gutov @ 2016-03-14 1:04 ` Drew Adams 2016-04-03 22:55 ` John Wiegley 0 siblings, 1 reply; 155+ messages in thread From: Drew Adams @ 2016-03-14 1:04 UTC (permalink / raw) To: Dmitry Gutov, Alan Mackenzie, Stefan Monnier; +Cc: 22983 > > I happen to use it frequently. I expect other users do, to. It's > > useful. > > It might be, but it's not very well-designed. If you only want to hide > some parts of buffer from being displayed, changing point-min and > point-max, which affect quite a lot of Lisp functions, seems unnecessary. Well, well, well. All of this is likely OT for this thread. But no - narrowing is in fact explicitly _about_ changing `point-min' and `point-max', so you can act on a particular section of a buffer. It is not only about "hid[ing] some parts of a buffer from being displayed". This is true for both interactive use and in code. > I don't think that narrow-to-region should be a user-level > function. Is this a joke? Maybe you think that because you think it is only about hiding text? > Introducing a new function, using a different mechanism > shouldn't be too hard though, if we reuse the existing binding. Please don't. Please don't even think about it. And if you really think you have something to say about it, then please bring it up in emacs-devel, not in a bug thread that is not especially related to it. > adding a command that would replace interactive use of > narrow-to-region. Ridiculous (IMHO). Add whatever commands you like, but please do not think about replacing `narrow-to-region' willy nilly. It is one of the most useful Emacs commands. ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2016-03-14 1:04 ` Drew Adams @ 2016-04-03 22:55 ` John Wiegley 0 siblings, 0 replies; 155+ messages in thread From: John Wiegley @ 2016-04-03 22:55 UTC (permalink / raw) To: Drew Adams; +Cc: 22983, Dmitry Gutov, Stefan Monnier, Alan Mackenzie >>>>> Drew Adams <drew.adams@oracle.com> writes: >> Introducing a new function, using a different mechanism shouldn't be too >> hard though, if we reuse the existing binding. > Please don't. Please don't even think about it. > And if you really think you have something to say about it, then please > bring it up in emacs-devel, not in a bug thread that is not especially > related to it. +1! -- John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2 ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2016-03-13 18:57 ` Alan Mackenzie 2016-03-14 0:47 ` Dmitry Gutov @ 2016-03-14 1:49 ` Stefan Monnier 1 sibling, 0 replies; 155+ messages in thread From: Stefan Monnier @ 2016-03-14 1:49 UTC (permalink / raw) To: Alan Mackenzie; +Cc: 22983, Dmitry Gutov > That's pure sophistry. The semantics needed are quite clear: For your use case, yes. It's quite clear *in your mind*. There are other use cases. Worse yet: Elisp doesn't generally know if the narrowing was set up by the user or by some Elisp caller up the stack. So even if we were to pretend that the use-case is clear when the narrowing is set by the user, we'd still have to figure out if that's the case. > Lets define them as I said in the previous paragraph. Or can you > conceive of a use case where one would want narrowing to invert strings > and non-strings, leaving comments totally random? There's the case where some Elisp code does (save-restriction (narrow-to-region beg end) (with-syntax-table ...)) to parse a sub-part of your buffer in a different way. Of course this completely breaks syntax-ppss and friends. I need to do exactly that in sm-c-mode (when parsing the C code inside CPP directives, since those directives are marked as comments), for example, and had to use (let ((syntax-propertize-function nil) (syntax-ppss-cache nil) (syntax-ppss-last nil)) ...) to deal with it. It would be easy/natural to add a binding of syntax-ppss-dont-widen in there (and/or literal-cache-dont-widen for that matter). > Do you have any views on how the bug should be resolved? Look up some past discussions of how to number lines in a narrowed buffer (same basic issue), where we discussed this. We basically need to add information about which kind of narrowing is in effect. 
IIRC one way suggested was to have 2 narrowing states at the same time: the current one, plus a new one which is a kind of "hard narrowing" (the current narrowing would have to be "narrower" than the "hard narrowing"), with corresponding new kind of "widen". Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
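The narrow-and-switch-syntax-table pattern described in this message can be sketched as follows. This is not sm-c-mode's actual code: `my-parse-region-with-table` is a hypothetical helper, and it calls `parse-partial-sexp` directly so that the `syntax-ppss` cache is never consulted. Code that does call `syntax-ppss` inside such a region would need bindings like the ones quoted in the message.

```elisp
;; -*- lexical-binding: t -*-
(defun my-parse-region-with-table (beg end table)
  "Parse BEG..END in isolation using syntax TABLE.
Return the parse state at END, computed as if BEG were at top level."
  (save-excursion
    (save-restriction
      ;; The narrowing hides everything outside the sub-part from any
      ;; code run inside; the parse below starts at the new `point-min'.
      (narrow-to-region beg end)
      (with-syntax-table table
        (parse-partial-sexp (point-min) (point-max))))))
```

Because the parse deliberately ignores the text before BEG, its result is expected to differ from what a buffer-wide `syntax-ppss` reports for the same position, which is the conflict the message points out.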
* Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-11 22:38 ` Dmitry Gutov 2016-03-13 17:37 ` Stefan Monnier @ 2016-03-14 15:16 ` Alan Mackenzie 2016-03-14 17:34 ` Andreas Röhler 2016-03-14 20:06 ` Dmitry Gutov 1 sibling, 2 replies; 155+ messages in thread From: Alan Mackenzie @ 2016-03-14 15:16 UTC (permalink / raw) To: Dmitry Gutov; +Cc: emacs-devel Hello, Dmitry. On Sat, Mar 12, 2016 at 12:38:49AM +0200, Dmitry Gutov wrote: > On 03/12/2016 12:15 AM, Alan Mackenzie wrote: > >> Except the multiple-major-mode case, which we'd ideally try to > >> accommodate, too. > > How does this code handle the changeover of syntax tables at a mode > > boundary? > The "inner" regions start with an "empty", top-level state. This is > actually fine, because these are usually small enough not to benefit > from the syntax-ppss cache too much (and syntax-ppss-last still helps). > The parts of the outer region following a subregion with different > syntax table... rely on a few hacks, and a manual application of a > `syntax-table' property when necessary. We need a better solution there, > but it's probably out of scope for this discussion. How about an extension to the parse-partial-sexp (etc.) code? For example, a feature I would call an "island", where a character could be marked with the "island start" syntax-table property, and another character lower down could be marked with the "island end" character. parse-partial-sexp, on encountering an island start syntax would somehow stack the current parse state and start a new one with a different syntax table. The parse state, instead of consisting of the 10 elements currently, would in general have 10n elements, where n is the depth of nesting. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-14 15:16 ` Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] Alan Mackenzie @ 2016-03-14 17:34 ` Andreas Röhler 2016-03-14 20:06 ` Dmitry Gutov 1 sibling, 0 replies; 155+ messages in thread From: Andreas Röhler @ 2016-03-14 17:34 UTC (permalink / raw) To: emacs-devel; +Cc: Alan Mackenzie On 14.03.2016 16:16, Alan Mackenzie wrote: > Hello, Dmitry. > > On Sat, Mar 12, 2016 at 12:38:49AM +0200, Dmitry Gutov wrote: >> On 03/12/2016 12:15 AM, Alan Mackenzie wrote: >>>> Except the multiple-major-mode case, which we'd ideally try to >>>> accommodate, too. >>> How does this code handle the changeover of syntax tables at a mode >>> boundary? >> The "inner" regions start with an "empty", top-level state. This is >> actually fine, because these are usually small enough not to benefit >> from the syntax-ppss cache too much (and syntax-ppss-last still helps). >> The parts of the outer region following a subregion with different >> syntax table... rely on a few hacks, and a manual application of a >> `syntax-table' property when necessary. We need a better solution there, >> but it's probably out of scope for this discussion. > How about an extension to the parse-partial-sexp (etc.) code? For > example, a feature I would call an "island", where a character could be > marked with the "island start" syntax-table property, and another > character lower down could be marked with the "island end" character. > parse-partial-sexp, on encountering an island start syntax would somehow > stack the current parse state and start a new one with a different syntax > table. The parse state, instead of consisting of the 10 elements > currently, would in general have 10n elements, where n is the depth of > nesting. > 0 AFAIU narrowing would provide that already WRT parse-partial-sexp, maybe combined with some markup like folding-mode. 
Remains to hand over these sheets to font-lock. ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-14 15:16 ` Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] Alan Mackenzie 2016-03-14 17:34 ` Andreas Röhler @ 2016-03-14 20:06 ` Dmitry Gutov 2016-03-19 22:51 ` Vitalie Spinu 1 sibling, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2016-03-14 20:06 UTC (permalink / raw) To: Alan Mackenzie; +Cc: emacs-devel Hi Alan, On 03/14/2016 05:16 PM, Alan Mackenzie wrote: > How about an extension to the parse-partial-sexp (etc.) code? For > example, a feature I would call an "island", where a character could be > marked with the "island start" syntax-table property, and another > character lower down could be marked with the "island end" character. Something like that might help, although I hesitate asking for that change because it's a relatively big one, and it would still solve only one multiple-mode-related issue. What it would help with, is fool the "outer" major mode into ignoring the preceding submode regions, in the return value of syntax-ppss. But we could have pretty much that already by advising syntax-ppss. That leaves out parse-partial-sexp, but it's not used that often directly in major mode code (though sgml-mode uses it). > parse-partial-sexp, on encountering an island start syntax would somehow > stack the current parse state and start a new one with a different syntax > table. The parse state, instead of consisting of the 10 elements > currently, would in general have 10n elements, where n is the depth of > nesting. To be able to parse across different regions, it would need to know the syntax table for each one (using the syntax-table text property?), as well as to be able to apply the appropriate syntax-propertize-function in each region. The latter is handled by mmm-mode, though, in a seemingly adequate fashion (it installs a composite function that knows how to dispatch to mode-specific ones). 
Maybe it's worth a try. Though I don't know how Stefan uses narrowing in sm-c-mode, and whether this proposal is appropriate to replace narrowing in syntax-ppss for this use case. ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-14 20:06 ` Dmitry Gutov @ 2016-03-19 22:51 ` Vitalie Spinu 2016-03-20 2:19 ` Dmitry Gutov 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-19 22:51 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Alan Mackenzie, emacs-devel >> On Mon, Mar 14 2016 22:06, Dmitry Gutov wrote: > Hi Alan, > On 03/14/2016 05:16 PM, Alan Mackenzie wrote: >> How about an extension to the parse-partial-sexp (etc.) code? For >> example, a feature I would call an "island", where a character could be >> marked with the "island start" syntax-table property, and another >> character lower down could be marked with the "island end" character. > Something like that might help, although I hesitate asking for that change > because it's a relatively big one, and it would still solve only one > multiple-mode-related issue. [...] > But we could have pretty much that already by advising syntax-ppss. That > leaves out parse-partial-sexp, but it's not used that often directly in major > mode code (though sgml-mode uses it). You can simulate islands by marking inner spans as comments with comment classes (11 and 12). I used those in polymode in the past, but not anymore. It's not that useful. Most of the parsing that modes do is regex based. So if a mode author decides to do a regexp search for a wiki link on a full buffer after widening it, islands won't help. As Dmitry mentioned, there is little multi-mode cannot do with advising syntax-ppss. The issue is still that parse-partial-sexp blows up with narrowed code and you cannot advise it. IMO the most useful direction for multi-modes is to add a hard narrowing that Stephen mentioned in the other thread. `syntax-ppss-dont-widen` goes in that direction, but it doesn't address the issue of distinguishing between user narrowing and "hard narrowing" in multi modes. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-19 22:51 ` Vitalie Spinu @ 2016-03-20 2:19 ` Dmitry Gutov 2016-03-20 12:15 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2016-03-20 2:19 UTC (permalink / raw) To: Vitalie Spinu; +Cc: Alan Mackenzie, emacs-devel On 03/20/2016 12:51 AM, Vitalie Spinu wrote: > You can simulate islands by marking inner spans as comments with comment classes > (11 and 12). Ooh, that's a solid idea. Should be more generic than my "propertize <> as punctuation" approach. > I used those in polymode in the past, but not anymore. It's not > that useful. Most of the parsing that modes do is regex based. I'd say it's still useful. Without the above, I've had indentation problems with sgml-mode. A good mode would use syntax-ppss to check that point is not inside a string or comment. Maybe that's not often done in font-lock, but it's at least common in syntax-propertize and indentation functions. Example: sgml-lexical-context. It performs a search at first, but in the end uses parse-partial-sexp, and returns a value based on that status. > So if a mode > author decides to regexp-search for a wiki link on a full buffer after widening it, > islands won't help. Where does widening happen in this case? First, we have font-lock-dont-widen. For indentation, we've introduced prog-indentation-context recently. And indentation functions in programming modes are supposed to call prog-widen instead of widen now. syntax-propertize-function's aren't supposed to call widen at all, I think. > IMO the most useful direction for multi-modes is to add a hard narrowing that > Stephen mentioned in the other thread. `syntax-ppss-dont-widen` goes in that > direction, but it doesn't address the issue of distinguishing between user > narrowing and "hard narrowing" in multi modes. syntax-ppss-dont-widen and prog-indentation-context will be the indicators of the "hard narrowing", I guess. 
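For readers following along, the "check that point is not inside a string or comment" idiom, and the way narrowing changes the answer when the parse starts at point-min, can be sketched roughly as below. This is only an illustration: `my-in-string-or-comment-p' and `my-demo-narrowing-effect' are hypothetical names invented for this sketch, not functions from Emacs or from any mode discussed in this thread. Only `syntax-ppss', `parse-partial-sexp', `narrow-to-region' and `emacs-lisp-mode-syntax-table' are real.

```elisp
;; Sketch only: both functions below are hypothetical helpers,
;; not part of Emacs or of any package mentioned in this thread.

(defun my-in-string-or-comment-p (&optional pos)
  "Return non-nil if POS (default: point) is inside a string or comment.
Relies on `syntax-ppss', so the answer depends on where the parse
effectively starts."
  ;; `syntax-ppss' moves point to POS, hence the `save-excursion'.
  (let ((ppss (save-excursion (syntax-ppss (or pos (point))))))
    ;; Element 3 is non-nil inside a string, element 4 inside a comment.
    (or (nth 3 ppss) (nth 4 ppss))))

(defun my-demo-narrowing-effect ()
  "Return (WIDENED . NARROWED): the string status of one position,
parsed from `point-min' before and after narrowing into the string."
  (with-temp-buffer
    (with-syntax-table emacs-lisp-mode-syntax-table
      (insert "(foo \"bar\") ; note")
      ;; Position 9 lies inside the string "bar".
      (let ((widened (nth 3 (parse-partial-sexp (point-min) 9))))
        ;; Narrow so that `point-min' now falls inside the string.
        (narrow-to-region 8 (point-max))
        (cons widened (nth 3 (parse-partial-sexp (point-min) 9)))))))
```

In the second function the widened parse reports being inside a string while the narrowed parse does not, which is the same kind of mismatch the original bug report describes between `ppss-1' and `parse'.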
^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-20 2:19 ` Dmitry Gutov @ 2016-03-20 12:15 ` Vitalie Spinu 2016-03-20 15:58 ` Dmitry Gutov 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-20 12:15 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Alan Mackenzie, emacs-devel >> On Sun, Mar 20 2016 04:19, Dmitry Gutov wrote: >> So if a mode author decides to regexp-search for a wiki link on a full buffer after >> widening it, islands won't help. > Where does widening happen in this case? First, we have font-lock-dont-widen. Well, font-lock-dont-widen is not respected even in c-mode. Look at c-before-context-fl-expand-region and semi-safe-place which are called directly or indirectly from c-font-lock-fontify-region. > For indentation, we've introduced prog-indentation-context recently. And > indentation functions in programming modes are supposed to call prog-widen > instead of widen now. I was not aware of that. Not sure if it is a step in the right direction though. `prog-indentation-context` looks fine to me but multi-modes already have their own wrappers for indentation which do just that according to their own semantics of modes/submodes/chunks/headers etc. The primary intent of `prog-indentation-context` is to be used in `prog-widen`. This part seems like a major complication. All mode authors now have to understand what is prog-widen, prog-first-column and prog-indentation-context. Why burden prog-mode authors with notions that multi-mode engines can take care of themselves? It is also not clear to me why should prog-widen be used in indentation context only? It makes perfect sense for this function to be used in font-locking and syntax-propertize-function as well. It's essentially a half-baked implementation of "hard widening" discussed earlier. Why not impose the widening restriction directly in `widen` then? Maybe bring widen to elisp and rename C widen into widen-internal. 
Then add generic `prog-hard-widen-limits` which would be checked along prog-indentation-context limits. Multi-mode engines can then impose those hard limits whenever they need to and adjust indentation accordingly. It's not that hard in my experience. Polymode has a few lines to wrap indentation and it works reasonably well in pretty much all contexts I have tried. All other problems can be solved with hard narrowing. https://github.com/vspinu/polymode/blob/master/polymode-methods.el#L715-L809 Unless I miss something essential it's really not worth imposing such complexities on mode authors. Judging from the python.el, which is the only mode using prog-first-column so far, it's not a trivial task. Each mode author will basically have to implement indentation logic that mmm-mode or polymode already implement. And even then, multi-mode engines will probably need to overwrite that because the semantics of submode spans is either emacs-mode or multi-mode-engine specific. > syntax-propertize-function's aren't supposed to call widen at all, I think. This should probably be in the docs then. Mode authors can decide to do loads of work in there. One instance is `markdown-mode` which caches all font-lock properties in syntax-propertize-function. While markdown-mode is clean and doesn't use widen anywhere, that might not be the case for other modes. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-20 12:15 ` Vitalie Spinu @ 2016-03-20 15:58 ` Dmitry Gutov 2016-03-21 1:05 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2016-03-20 15:58 UTC (permalink / raw) To: Vitalie Spinu; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel On 03/20/2016 02:15 PM, Vitalie Spinu wrote: > Well, font-lock-dont-widen is not respected even in c-mode. Look at > c-before-context-fl-expand-region and semi-safe-place which are called directly > or indirectly from c-font-lock-fontify-region. Well, yes. c-mode is special, as usual. That should be workable if CC Mode starts using prog-widen instead of widen, though. >> For indentation, we've introduced prog-indentation-context recently. And >> indentation functions in programming modes are supposed to call prog-widen >> instead of widen now. > > I was not aware of that. Not sure if it is a step in the right direction though. I'm not 100% happy with it either. > `prog-indentation-context` looks fine to me but multi-modes already have their > own wrappers for indentation which do just that according to their own semantics > of modes/submodes/chunks/headers etc. Too bad you were not around when this addition was discussed. > The primary intent of `prog-indentation-context` is to be used in > `prog-widen`. This part seems like a major complication. All mode authors now > have to understand what is prog-widen, prog-first-column and > prog-indentation-context. Why burden prog-mode authors with notions that > multi-mode engines can take care of themselves? IIRC, using first-column is fairly justified, the outer mode can't add extra indentation to the submode in a reliable, sane way (though I've also been hacking around that quite successfully). 
Here's the full discussion: http://lists.gnu.org/archive/html/emacs-devel/2015-01/msg00431.html http://lists.gnu.org/archive/html/emacs-devel/2015-02/msg00290.html with my messages further down. > It is also not clear to me why should prog-widen be used in indentation context > only? It makes perfect sense for this function to be used in font-locking and > syntax-propertize-function as well. Indeed. In js-mode's case, the offending code is called from font-lock-keywords, for example. > It's essentially a half-baked implementation of "hard widening" discussed > earlier. Why not impose the widening restriction directly in `widen` then? Maybe > bring widen to elisp and rename C widen into widen-internal. Then add generic > `prog-hard-widen-limits` which would be checked along prog-indentation-context > limits. Right! At the very least, we should extract the second element of prog-indentation-context into a separate variable, and make prog-widen more prominent. But a proper implementation of hard-widen would be even better in my book. Although someone would need to comb through all low-level functions, at least, and decide which of them need to call widen-internal, and which will be fine with just widen. Are you interested in working on a patch? Also Cc'ing Stefan. Looking back on it, it seems prog-indentation-context was merged too early: it only has one usage so far, so it's not clear if the approach is generally viable. Christoph sort of promised to add support in CC Mode, but then disappeared. Which is not so surprising, that stuff is difficult. > Unless I miss something essential it's really not worth imposing such > complexities on mode authors. Judging from the python.el, which is the only mode > using prog-first-column so far, it's not a trivial task. Each mode author will > basically have to implement indentation logic that mmm-mode or polymode already > implement. 
And even then, multi-mode engines will probably need to overwrite > that because the semantics of submode spans is either emacs-mode or > multi-mode-engine specific. This is not too different from what I was saying, I think. That discussion is fairly long, though, and it veered off to the side. AFAICT, though, the ultimate justification for having first-column is Python's indentation cycling behavior: http://lists.gnu.org/archive/html/emacs-devel/2015-02/msg01096.html Which is not that convincing, but makes some things cleaner anyway. But the last element, previous-chunks, is still not used anywhere in Emacs. I think including it turned out to be a mistake, or at least premature. >> syntax-propertize-function's aren't supposed to call widen at all, I think. > > This should probably be in the docs then. Probably. > Mode authors can decide to do loads of > work in there. One instance is `markdown-mode` which caches all font-lock > properties in syntax-propertize-function. While markdown-mode is clean and > doesn't use widen anywhere, that might not be the case for other modes. ruby-syntax-propertize also does some involved parsing, but as long as there's no `widen' there, we should be fine. ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-20 15:58 ` Dmitry Gutov @ 2016-03-21 1:05 ` Vitalie Spinu 2016-03-21 3:11 ` Stefan Monnier ` (2 more replies) 0 siblings, 3 replies; 155+ messages in thread From: Vitalie Spinu @ 2016-03-21 1:05 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel >> On Sun, Mar 20 2016 17:58, Dmitry Gutov wrote: > On 03/20/2016 02:15 PM, Vitalie Spinu wrote: > IIRC, using first-column is fairly justified, the outer mode can't add extra > indentation to the submode in a reliable, sane way The inner mode cannot often make that decision either. Same inner mode can be used in very different multi-mode contexts, each with their own semantics for chunks/headers/indentation. Reducing all that to a simple (first-column . previous-chunk) pair and letting inner mode do the job is surely not enough. The only actor to make that decision should be multi-mode engine itself. Instead of teaching modes about multi-modes, a much better idea is to introduce `calculate-indent-function` which would accept POS and optional STRING-AFTER and STRING-BEFORE. This function will return the indentation of STRING-AFTER at POS assuming there is a virtual STRING-BEFORE just before POS. This way, a multi-mode engine can call inner-mode's calculate-indent-function at the end of previous chunk with STRING-AFTER being the line at point and STRING-BEFORE being the content of current chunk. Most modes indent reliably based on one previous line, so in 99% of the cases STRING-BEFORE can be nil and multi-mode engine can call calculate-indent-function only on first line of the current chunk (and that only for continuation chunks, which are a minority out there). Then a lot of modes don't even care about what's in the current line, so STRING-AFTER will be irrelevant as well. Thus most modes will not even need an implementation of calculate-indent-function. 
This is both more general than prog-indentation-context and doesn't require teaching major-modes about multi-modes. Moreover, a lot of major-modes already have such a "calculator" in place. >> It's essentially a half-baked implementation of "hard widening" discussed >> earlier. Why not impose the widening restriction directly in `widen` then? >> Maybe bring widen to elisp and rename C widen into widen-internal. Then add >> generic `prog-hard-widen-limits` which would be checked along >> prog-indentation-context limits. > Right! At the very least, we should extract the second element of > prog-indentation-context into a separate variable, and make prog-widen more > prominent. Not sure about removing second element. Good thing about keeping all of them in one place is for the indentation engine to be concerned with a single variable. BTW, third argument should be renamed into PREVIOUS-CHUNK. The function returns one chunk. > But a proper implementation of hard-widen would be even better in my > book. Although someone would need to comb through all low-level functions, at > least, and decide which of them need to call widen-internal, and which will be > fine with just widen. No need to decide on widen-internal. All functions are free to call widen just as they do before. It's 100% backward compatible. The only reason to use `widen-internal` is to bring `widen` to elisp in order to allow for advise and better debugging. Actually, with hard-widen-limits, there will be no need for advice, so it can be kept in C. Only consumers of `hard-widen-limits` should be concerned with its side effects. But that's uniformly better than current situation when you cannot do much about restricting widen. In my experience hard-widen and parse-partial-sexp are the only hurdle in the way of proper multi-modes. I don't remember a single problem that would occur for a different reason. BTW, parse-partial-sexp must abide hard-widen-limits as well. 
This way the request aired in bug#22983 of parse-partial-sexp == syntax-ppss will be automatically satisfied. You won't need syntax-ppss-dont-widen either. > Are you interested in working on a patch? Also Cc'ing Stefan. My knowledge of emacs C internals is close to 0. Elisp side (and probably C side) of this is trivial. I will look into it but I don't think I am the best person for that. > Looking back on it, it seems prog-indentation-context was merged too early: it > only has one usage so far, so it's not clear if the approach is generally > viable. > Christoph sort of promised to add support in CC Mode, but then > disappeared. Which is not so surprising, that stuff is difficult. A patch that would require hunting every single mode out there and implementing multi-modes locally should have been more carefully considered IMO. Emacs 25 is not yet there, so it's not too late to reconsider that decision. > AFAICT, though, the ultimate justification for having first-column is Python's > indentation cycling behavior: > http://lists.gnu.org/archive/html/emacs-devel/2015-02/msg01096.html > Which is not that convincing, but makes some things cleaner anyway. It's not convincing to me either. I use Christoph's indentation-0 trick and it indeed works reliably for all modes I have tried except python. But python's issue can be fixed with a simple advice of python-indent-line-function, no need to overhaul python indentation because of that. This is how it's now done in polymode: https://github.com/vspinu/polymode/blob/master/polymode-compat.el#L189-L199 Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 1:05 ` Vitalie Spinu @ 2016-03-21 3:11 ` Stefan Monnier 2016-03-21 5:05 ` Vitalie Spinu 2016-03-21 11:56 ` Dmitry Gutov 2016-03-21 5:08 ` [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] Vitalie Spinu 2016-03-21 11:47 ` Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] Dmitry Gutov 2 siblings, 2 replies; 155+ messages in thread From: Stefan Monnier @ 2016-03-21 3:11 UTC (permalink / raw) To: Vitalie Spinu; +Cc: Alan Mackenzie, emacs-devel, Dmitry Gutov > BTW, parse-partial-sexp must abide hard-widen-limits as well. I don't understand what this means. parse-partial-sexp is passed 2 locations and it works between them. There's not much opportunity for widening. But yes, syntax-ppss should obey hard-widen-limits. > A patch that would require hunting every single mode out there and > implementing multi-modes locally should have been more carefully > considered IMO. I must say I don't understand how what we have is so very different from what you suggest. Of course, I fully agree on the need to deprecate indent-line-function and use a side-effect free replacement which returns the desired indentation (instead of performing the indentation). I think both suggestions require changes to every mode, and in both cases the changes can be reduced to a one-liner or close enough (for the simple case). Admittedly, for it to be a one-liner, we'll need to provide a standard helper function. Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 3:11 ` Stefan Monnier @ 2016-03-21 5:05 ` Vitalie Spinu 2016-03-21 7:13 ` Andreas Röhler 2016-03-21 12:26 ` Stefan Monnier 2016-03-21 11:56 ` Dmitry Gutov 1 sibling, 2 replies; 155+ messages in thread From: Vitalie Spinu @ 2016-03-21 5:05 UTC (permalink / raw) To: Stefan Monnier; +Cc: Alan Mackenzie, emacs-devel, Dmitry Gutov >> On Sun, Mar 20 2016 23:11, Stefan Monnier wrote: >> BTW, I parse-partial-sexp must abide hard-widen-limits as well. > I don't understand what this means. parse-partial-sexp is passed > 2 locations and it works between them. There's not much opportunity > for widening. parse-partial-sexp should work between hard limits (at least the lower bound). It should operate as if hard-narrowed buffer is the real buffer. So ideally it should take (max FROM (car hard-widen-limits)) as the starting position. This will give the desired consistency between parse-partial-sexp and syntax-ppss with the price of slightly modifying the semantics of parse-partial-sexp in a backward compatible way. >> A patch that would require hunting every single mode out there and >> implementing multi-modes locally should have been more carefully >> considered IMO. > I must say I don't understand how what we have is so very different from > what you suggest. It's quite a bit different: - Major mode authors won't need to know about multi-modes. That means not dealing with chunks/spans/headers etc. These concepts are not even uniformly defined between existing multi-mode engines. - Major mode authors won't need to re-implement the indentation logic already there in multi-modes. The logic is likely to be too simplistic and major mode authors will have to re-do it anyways. - Setup is more general. multi-mode engine decides where to call calculate-indent-function and with what parameters and with what narrowing. 
- Arguably calculate-indent-function is a simpler concept to grasp - calculate-indent-function is needed anyways > I think both suggestions require changes to every mode, and in both cases the > changes can be reduced to a one-liner or close enough (for the simple > case). Admittedly, for it to be a one-liner, we'll need to provide a standard > helper function. Judging from python.el it might be quite hard to provide a generic one liner to deal with all those 3 elements. For calculate-indent-function instead you can provide a straightforward one line assuming that STRING-BEFORE/AFTER do not matter. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 5:05 ` Vitalie Spinu @ 2016-03-21 7:13 ` Andreas Röhler 2016-03-21 12:26 ` Stefan Monnier 1 sibling, 0 replies; 155+ messages in thread From: Andreas Röhler @ 2016-03-21 7:13 UTC (permalink / raw) To: emacs-devel On 21.03.2016 06:05, Vitalie Spinu wrote: > >>> On Sun, Mar 20 2016 23:11, Stefan Monnier wrote: >>> BTW, I parse-partial-sexp must abide hard-widen-limits as well. >> I don't understand what this means. parse-partial-sexp is passed >> 2 locations and it works between them. There's not much opportunity >> for widening. > parse-partial-sexp should work between hard limits (at least the lower > bound). It should operate as if hard-narrowed buffer is the real buffer. > > So ideally it should take (max FROM (car hard-widen-limits)) as the starting > position. This will give the desired consistency between parse-partial-sexp and > syntax-ppss with the price of slightly modifying the semantics of > parse-partial-sexp in a backward compatible way. > >>> A patch that would require hunting every single mode out there and >>> implementing multi-modes locally should have been more carefully >>> considered IMO. >> I must say I don't understand how what we have is so very different from >> what you suggest. > It's quite a bit different: > > - Major mode authors won't need to know about multi-modes. That means not > dealing with chunks/spans/headers etc. These concepts are not even uniformly > defined between existing multi-mode engines. > > - Major mode authors won't need to re-implement the indentation logic already > there in multi-modes. The logic is likely to be too simplistic and major > mode authors will have to re-do it anyways. > > - Setup is more general. multi-mode engine decides where to call > calculate-indent-function and with what parameters and with what narrowing. 
> > - Arguably calculate-indent-function is a simpler concept to grasp > > - calculate-indent-function is needed anyways > >> I think both suggestions require changes to every mode, and in both cases the >> changes can be reduced to a one-liner or close enough (for the simple >> case). Admittedly, for it to be a one-liner, we'll need to provide a standard >> helper function. > Judging from python.el it might be quite hard to provide a generic one liner to > deal with all those 3 elements. For calculate-indent-function instead you can > provide a straightforward one line assuming that STRING-BEFORE/AFTER do not > matter. > > > Vitalie > +1 ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 5:05 ` Vitalie Spinu 2016-03-21 7:13 ` Andreas Röhler @ 2016-03-21 12:26 ` Stefan Monnier 2016-03-21 14:13 ` Vitalie Spinu 1 sibling, 1 reply; 155+ messages in thread From: Stefan Monnier @ 2016-03-21 12:26 UTC (permalink / raw) To: Vitalie Spinu; +Cc: Alan Mackenzie, emacs-devel, Dmitry Gutov >>> BTW, I parse-partial-sexp must abide hard-widen-limits as well. >> I don't understand what this means. parse-partial-sexp is passed >> 2 locations and it works between them. There's not much opportunity >> for widening. > parse-partial-sexp should work between hard limits (at least the lower > bound). It should operate as if hard-narrowed buffer is the real buffer. You mean it should ignore the current (user)narrowing? Why? I'd think that if something needs to ignore the (user)narrowing it'd be parse-partial-sexp's *caller* but not parse-partial-sexp itself. > So ideally it should take (max FROM (car hard-widen-limits)) as the starting > position. You mean: as opposed to (max FROM (point-min))? I disagree. Functions should usually not accept to talk about positions outside of the point-min/max range. Notice how syntax-ppss is different in this regard: since it doesn't receive FROM, that same rule doesn't prevent syntax-ppss from widening to (car hard-widen-limits). > This will give the desired consistency between parse-partial-sexp and > syntax-ppss with the price of slightly modifying the semantics of > parse-partial-sexp in a backward compatible way. I'd be curious to know in which circumstances (i.e. specific code in specific packages) this would make a difference. As mentioned above, I think these cases would be better fixed by changing the calling code to perform widening before calling parse-partial-sexp. >>> A patch that would require hunting every single mode out there and >>> implementing multi-modes locally should have been more carefully >>> considered IMO. 
> - Major mode authors won't need to know about multi-modes. That > means not dealing with chunks/spans/headers etc. These concepts are > not even uniformly defined between existing multi-mode engines. I understand that's your claim, but I don't understand why/how this is different between the two proposals. > - Major mode authors won't need to re-implement the indentation > logic already there in multi-modes. The logic is likely to be too > simplistic and major mode authors will have to re-do it anyways. > > - Setup is more general. multi-mode engine decides where to call > calculate-indent-function and with what parameters and with > what narrowing. Same here. > - Arguably calculate-indent-function is a simpler concept to grasp As mentioned, I fully agree with the need to replace indent-line-function with calculate-indent-function (tho I like to name it prog-indent-function). So the difference is w.r.t your STRING-BEFORE/AFTER: which code provides them, which code obeys them, and how that compares to the way prog-indentation-context is provided and obeyed. >> I think both suggestions require changes to every mode, and in both >> cases the changes can be reduced to a one-liner or close enough (for >> the simple case). Admittedly, for it to be a one-liner, we'll need to >> provide a standard helper function. > Judging from python.el it might be quite hard to provide a generic one > liner to deal with all those 3 elements. For calculate-indent-function > instead you can provide a straightforward one line assuming that > STRING-BEFORE/AFTER do not matter. My hunch is that if STRING-BEFORE/AFTER don't matter, then it should not be hard to come up with a generic function in prog.el which can be invoked with a one liner in the major mode (assuming the major mode sets (prog|calculate)-indent-function). Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 12:26 ` Stefan Monnier @ 2016-03-21 14:13 ` Vitalie Spinu 2016-03-21 14:43 ` Stefan Monnier 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-21 14:13 UTC (permalink / raw) To: Stefan Monnier; +Cc: Alan Mackenzie, Dmitry Gutov, emacs-devel >> On Mon, Mar 21 2016 08:26, Stefan Monnier wrote: >> parse-partial-sexp should work between hard limits (at least the lower >> bound). It should operate as if hard-narrowed buffer is the real buffer. > You mean it should ignore the current (user)narrowing? Why? I'd think that if > something needs to ignore the (user)narrowing it'd be parse-partial-sexp's > *caller* but not parse-partial-sexp itself. Currently it just throws out-of-range errors. So in that sense it does ignore user narrowing in a very inconvenient way. parse-partial-sexp is called from code exclusively and it just happens that in multi-modes it is called outside of narrow region quite often. That's a major inconvenience. Why on earth would one need to take user narrowing into account for syntax parsing? If parse-partial-sexp could be made to always widen to hard limits it will automatically solve a bunch of problems. bug#22983 being one of them, condition-case awkwardness in syntax-ppss being another one, and the ubiquitous out-of-range errors in font-lock in multi-modes being the most important one. >> So ideally it should take (max FROM (car hard-widen-limits)) as the starting >> position. > You mean: as opposed to (max FROM (point-min))? Yes. > I disagree. Functions should usually not accept to talk about positions > outside of the point-min/max range. Depends on the function. point-max/min is mostly user level. Why would syntax parsing need to respect that? Bug#22983 illustrates that clearly. If user narrows in the middle of a string, it creates huge problems. 
Note that with Dmitry's new syntax-ppss-dont-widen proposal syntax-ppss widens first. Can I ask you the reverse? What do you gain by respecting user narrowing in syntax parsing? > Notice how syntax-ppss is different in this regard: since it doesn't > receive FROM, that same rule doesn't prevent syntax-ppss from widening > to (car hard-widen-limits). Well, not quite different. It has POS which might be outside of user narrowed range. >> This will give the desired consistency between parse-partial-sexp and >> syntax-ppss with the price of slightly modifying the semantics of >> parse-partial-sexp in a backward compatible way. > I'd be curious to know in which circumstances (i.e. specific code in specific > packages) this would make a difference. As mentioned above, I think these > cases would be better fixed by changing the calling code to perform widening > before calling parse-partial-sexp. I think bug#22983 is illustrative enough. Multi-mode code is a nightmare because of out-of-range errors in parsing. `syntax-ppss` is protected but that condition-case is triggered in 99.99% of the times in multi-modes. In multi modes you really want to keep narrowing because most of the major-mode functionality works well on narrowed code. Pretty much all of it except syntactic parsing and font-locking. Occasional property lookup outside of narrowed region could be dealt with on case by case basis or, hopefully, with new hard-narrowed-limits at the core of it. >>>> A patch that would require hunting every single mode out there and >>>> implementing multi-modes locally should have been more carefully >>>> considered IMO. >> - Major mode authors won't need to know about multi-modes. That >> means not dealing with chunks/spans/headers etc. These concepts are >> not even uniformly defined between existing multi-mode engines. > I understand that's your claim, but I don't understand why/how this is > different between the two proposals. 
Major mode author has to deal with the span explicitly as defined in previous-chunk in prog-indentation-context. Cognitively this is a more demanding task. Ask a new person to go and read the doc of prog-indentation-context and ask how much he or she understands of it. I read it and I think I understand most of it, but looking at all the usages of prog-widen and prog-first-column in python.el my brain gives up. Previous-chunk is not even used in python.el! The prog-calculate-indent-function is more general. You can call it on any buffer position (need not be last point in the previous span). It can be called with whatever STRING-BEFORE and STRING-AFTER (these can, but need not be, actual strings in the buffer). Current prog-indentation-context allows for possibility of a string to be inserted before the beginning of the current chunk. STRING-BEFORE is more general than that because of the arbitrary POS that it can be applied to. My claim is that we can achieve much higher generality and don't bother mode authors with all those concepts like current/previous span/chunk, starting/end position etc. Only multi-mode engine can take proper care of those anyways. Here is a simple example when inner mode cannot decide by itself on the indentation. Assume for concreteness a noweb header with some code immediately following the header: <<foo, some_arg=4>>= some_call(blabla) some_other_call(blabla) ## indented by offset 2 with respect to header or prev_chunk How do you indent the some_call(blabla) after the header? The most meaningful way is to keep it untouched just as user defined it. If inner mode would indent it by itself it would give offset of 4. This is a simple example of header dependence. You can easily imagine more complex cases when not only one previous span needs to be considered but a range of previous spans of the same inner mode. Moreover there might be nested inner chunks. Which chunk/span will you include in prog-indentation-context? 
The entire previous code chunk or only the last homogeneous span after the most recent inner-inner chunk? Indentation of a span is commonly dependent on the header of the chunk (note the terminology distinction). You can imagine having a parameter in the header that would determine the indentation of the chunk's body. Header-dependence is a simple and common case of inter-span dependence. It's not hard to imagine complex cases when indentation in current span will depend not only on the previous span of the same mode but on other spans of host mode or even other inner (nested or not) modes. IMO the best way is to leave all this complexities to multi-mode authors to deal with on case by case basis. You never know what sort of complexities and chunk dependencies new multi-modes will impose. Better keep things generic. prog-calculate-indent-function seems like a multi-mode agnostic solution. I am not sure if it will solve all problems, but it's surely solves more than prog-indentation-context does in a cleaner way. Note on terminology. I put quite some effort to sort things out in polymode. Glossary of terms is here: https://github.com/vspinu/polymode/tree/master/modes#glossary-of-terms For many reasons it's important to distinguish between portions of code that include header/tails and homogeneous portions of the same mode. Former portions I call `chunks` and those can include other chunks of different sub-modes. The latter, homogeneous portions, I call `spans`. The fact that core emacs is now starting building pieces of multi-mode functionality here and there and thus entrenching a somewhat naive interpretation of a "chunk" doesn't make me happy. Not a big deal though. > My hunch is that if STRING-BEFORE/AFTER don't matter, It will actually matter for quite some modes in continuation chunks. I was too optimistic. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 14:13 ` Vitalie Spinu @ 2016-03-21 14:43 ` Stefan Monnier 2016-03-21 16:42 ` Vitalie Spinu 2016-03-21 16:45 ` Vitalie Spinu 0 siblings, 2 replies; 155+ messages in thread From: Stefan Monnier @ 2016-03-21 14:43 UTC (permalink / raw) To: Vitalie Spinu; +Cc: Alan Mackenzie, Dmitry Gutov, emacs-devel > parse-partial-sexp is called from code exclusively and it just happens > that in multi-modes it is called outside of narrow region quite often. How/why? Can you give some concrete scenario? > That's a major inconvenience. Why on earth one would need to > take account in user narrowing for syntax parsing? Because you need it for *everything* that looks at the buffer. Why should parse-partial-sexp be different from (say) scan-sexps? > If parse-partial-sexp could be made to always widen to hard limits it > will automatically solve a bunch of problems. bug#22983 being one of them, Bug#22983 should be fixed by widening, indeed, but it should be done in syntax.el. Widening in parse-partial-sexp would only address some cases but not all (e.g. the syntax-begin-function cases or the syntax-propertize-function cases). Those other cases can only be fixed in syntax.el. > the ubiquitous out-of-range errors in font-lock in multi-modes being > the most important one. I'm not familiar with those, so if you could give some examples it would help us judge if they would indeed benefit from a fix in parse-partial-sexp rather than elsewhere. >> Notice how syntax-ppss is different in this regard: since it doesn't >> receive FROM, that same rule doesn't prevent syntax-ppss from widening >> to (car hard-widen-limits). > Well, not quite different. It has POS which might be outside of user narrowed > range. No: POS should be within point-min/max. > Major mode author has to deal with the span explicitly as defined in > previous-chunk in prog-indentation-context. 
Cognitively this is a more > demanding task. Ask a new person to go and read the doc of > prog-indentation-context and ask how much he or she understands of > it. I read it and I think I understand most of it, but looking at all > the usages of prog-widen and prog-first-column in python.el my brain > gives up. Previous-chunk is not even used in python.el! Replace all your widen calls with calls to `prog-widen' and you get the same result (since (nth 1 prog-indentation-context) is basically another name for your hard-widen-limit). So I don't think prog-widen is that very complicated. As for prog-first-column the local major mode can just ignore it in which case the multi-mode can do the same that you do. It's only useful if you need/want to provide a more complex behavior than what polymode supports. So, of course, it's more complex. > The prog-calculate-indent-function is more general. You can call it on > any buffer position (need not be last point in the previous span). [ Note: In my mind, the "natural main case" for multi-mode indentation is when you call the indentation function on the *first position* of a span. But you seem to look at it from the other end, where you call the indentation function on the *last position* of the previous span. I think I'm beginning to see why. ] Note that "is more general" here means that the major mode's function has to handle more cases, so it would seem to fundamentally require more work on the major mode's side. > Current prog-indentation-context allows for possibility of a string to > be inserted before begging in of current chunk. STRING-BEFORE is more > more general than that because of the arbitrary POS that it can be > applied to. Good point. I didn't think of that. Do you make use of that possibility, and/or can you give an example where it's useful? > Here is a simple example when inner mode cannot decide by itself on > the indentation. 
Assume for concreteness a noweb header with some code > immediately following the header: > > <<foo, some_arg=4>>= some_call(blabla) > some_other_call(blabla) ## indented by offset 2 with respect to header or prev_chunk > > How do you indent the some_call(blabla) after the header? The most > meaningful way is to keep it untouched just as user defined it. If > inner mode would indent it by itself it would give offset of 4. This > is a simple example of header dependence. Maybe it's because I'm not familiar with noweb, but I didn't understand this example. It looks like a very interesting example, so could you go over it again in much more detail? Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 14:43 ` Stefan Monnier @ 2016-03-21 16:42 ` Vitalie Spinu 2016-03-21 18:31 ` Stefan Monnier 2016-03-21 20:33 ` Alan Mackenzie 2016-03-21 16:45 ` Vitalie Spinu 1 sibling, 2 replies; 155+ messages in thread From: Vitalie Spinu @ 2016-03-21 16:42 UTC (permalink / raw) To: Stefan Monnier; +Cc: Alan Mackenzie, emacs-devel, Dmitry Gutov >> On Mon, Mar 21 2016 10:43, Stefan Monnier wrote: >> parse-partial-sexp is called from code exclusively and it just happens >> that in multi-modes it is called outside of narrow region quite often. > How/why? Can you give some concrete scenario? The MM engine narrows to the span region for a lot of tasks, most importantly font-lock. If an inner mode's fontification functions misbehave (ignoring font-lock-dont-widen, for example), as c-mode's do, this leads to trouble. So to avoid those troubles you would advise individual functions and narrow them properly, or apply other tricks like overwriting the output value or input args. It all works fine till that function calls parse-partial-sexp (or some other low-level function) and blows up with an args-out-of-range error. To be frank, the issue of parse-partial-sexp is fading because modes are now using syntax-ppss more extensively. Most of the problems with parse-partial-sexp from the past are now internalized in the condition-case inside syntax-ppss. That condition-case is triggered very often (almost always) from inside polymode chunk narrowing. >> That's a major inconvenience. Why on earth one would need to >> take account in user narrowing for syntax parsing? > Because you need it for *everything* that looks at the buffer. > Why should parse-partial-sexp be different from (say) scan-sexps? I think parse-partial-sexp, syntax-ppss and maybe some others are special in the sense that in order to return a correct value they need to be aware of the whole buffer. 
I don't see this as an inconsistency but I might be too naive. >> If parse-partial-sexp could be made to always widen to hard limits it >> will automatically solve a bunch of problems. bug#22983 being one of them, > Bug#22983 should be fixed by widening, indeed, but it should be done in > syntax.el. Widening in parse-partial-sexp would only address some cases > but not all (e.g. the syntax-begin-function cases or the > syntax-propertize-function cases). Those other cases can only be fixed > in syntax.el. >> the ubiquitous out-of-range errors in font-lock in multi-modes being >> the most important one. > I'm not familiar with those, so if you could give some examples it > would help us judge if they would indeed benefit from a fix in > parse-partial-sexp rather than elsewhere. c-mode provides an example. I don't remember where exactly and how but it has to do with but c-before-context-fl-expand-region and c-state-semi-safe-place because I am advising these two functions currently. The logic is roughly like this, c-mode engine doesn't respect font-lock-dont-widen, widens stuff in some of it's functions, then calls its parsing, gets back some points outside font-lock range and blows when trying to access those points from narrowed region. I was not collecting these cases carefully but I will start doing it and will get with more concrete examples in the following weeks. Another directions of problems is syntax-propertize. It can be called with POS outside of current narrowed region. Particularly from internal--syntax-propertize. But again I don't recall how exactly that was happening now. > Replace all your widen calls with calls to `prog-widen' and you get the same > result (since (nth 1 prog-indentation-context) is basically another name for > your hard-widen-limit). So I don't think prog-widen is that very complicated. It's not but you have to enforce that in all known modes. 
> Note that "is more general" here means that the major mode's function has to > handle more cases, so it would seem to fundamentally require more work on the > major mode's side. I don't agree. Work must be done only in the generic multi-mode engines (mmm-mode, polymode etc). Other modes should re-use that generic infrastructure, or even better, do nothing, and leave to someone else to define a new polymode with host chunk being *the* mode. That every mode with basic needs for inner sub-modes tries to re-invent the wheel is a dead end. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
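A minimal sketch of the narrowing issue discussed in the message above, for readers who want to reproduce it. The names my-parse-state-widened and my-demo-narrowed-vs-widened are hypothetical and not part of Emacs; the sketch relies only on the standard parse-partial-sexp, save-restriction and widen. It assumes nothing about how syntax.el should eventually be fixed; it only shows that parsing from (point-min) changes its answer once the buffer is narrowed, and that widening first gives back the whole-buffer result.

```elisp
;; Hypothetical helper: compute the parse state at POS from the real
;; beginning of the buffer, ignoring any narrowing currently in effect.
(defun my-parse-state-widened (pos)
  "Return the `parse-partial-sexp' state at POS, parsing from position 1."
  (save-excursion            ; parse-partial-sexp moves point
    (save-restriction        ; restore the caller's narrowing afterwards
      (widen)
      (parse-partial-sexp (point-min) pos))))

;; Hypothetical demo: compare paren depth at the same position with and
;; without respecting the narrowing.
(defun my-demo-narrowed-vs-widened ()
  "Return (NARROWED-DEPTH WIDENED-DEPTH) for a small sample buffer."
  (with-temp-buffer
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (insert "(foo (bar baz))")
    ;; Narrow to the text "bar b", i.e. inside both open parens.
    (narrow-to-region 7 12)
    (list
     ;; Parsing from the narrowed point-min misses both open parens.
     (car (save-excursion (parse-partial-sexp (point-min) 8)))
     ;; Parsing after widening sees them.
     (car (my-parse-state-widened 8)))))
```

Evaluating (my-demo-narrowed-vs-widened) should give (0 2): depth 0 when the parse starts at the narrowed point-min, depth 2 when it starts at the real beginning of the buffer.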
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 16:42 ` Vitalie Spinu @ 2016-03-21 18:31 ` Stefan Monnier 2016-03-21 19:16 ` Vitalie Spinu 2016-03-21 20:33 ` Alan Mackenzie 1 sibling, 1 reply; 155+ messages in thread From: Stefan Monnier @ 2016-03-21 18:31 UTC (permalink / raw) To: Vitalie Spinu; +Cc: Alan Mackenzie, emacs-devel, Dmitry Gutov > To be frank, the issue of parse-partial-sexp is fading because modes > are now using syntax-ppss more extensively. Aha! So we agree: there's no reason to worry about parse-partial-sexp, since at this point pretty much all modes rely on syntax-ppss (other than CC-modes, obviously). > Most of the problems with parse-partial-sexp from the past are now > internalized in condition-case inside syntax-ppss. That condition-case > is triggered very often (almost always) from inside polymode > chunk narrowing. Right. But I don't see it as a problem in parse-partial-sexp, rather than a problem in syntax.el. >> Because you need it for *everything* that looks at the buffer. >> Why should parse-partial-sexp be different from (say) scan-sexps? > I think parse-partial-sexp, syntax-ppss and maybe some others, are special in > the sense that in order to return a correct value they need to be aware of the > whole buffer. I don't see this as an inconsistency but I might be too naive. scan-sexps will complain about unmatched parens and things like that if it bumps into point-min/max. Same for re-search-*. I think you've just been too often exposed to the use of (parse-partial-sexp 1 ...) where the resulting signal bites you right away, whereas many other functions won't signal an error and will instead do *something* (which may not always be incorrect, but may often enough still result in acceptable behavior). > c-mode provides an example. 
I don't remember where exactly and how but > it has to do with but c-before-context-fl-expand-region and > c-state-semi-safe-place because I am advising these two > functions currently. CC-modes is definitely a very special case here. We should aim to limit the amount of changes in most major modes, so better not pay too much attention to cc-mode from that point of view. > The logic is roughly like this, c-mode engine doesn't respect > font-lock-dont-widen, widens stuff in some of it's functions, then > calls its parsing, gets back some points outside font-lock range and > blows when trying to access those points from narrowed region. Sounds like a problem in cc-mode, which will require changes in cc-mode. The generic code shouldn't worry about that. > I was not collecting these cases carefully but I will start doing it and will > get with more concrete examples in the following weeks. Thanks. > Another directions of problems is syntax-propertize. It can be called > with POS outside of current narrowed region. Particularly from > internal--syntax-propertize. That would definitely be an error, so if you bump into such a case please report it. >> Replace all your widen calls with calls to `prog-widen' and you get >> the same result (since (nth 1 prog-indentation-context) is basically >> another name for your hard-widen-limit). So I don't think prog-widen >> is that very complicated. > It's not but you have to enforce that in all known modes. I prefer to say that "any major mode which wants to play with the new snazzy multi-mode feature needs to be adjusted (e.g. with prog-widen)". It's perfectly fine if some major modes don't play along correctly until they're fixed. "Try to get multi-mode working without touching anyone's code" (e.g. using advice) is great, but we already have packages which do that. 
>> Note that "is more general" here means that the major mode's function has to >> handle more cases, so it would seem to fundamentally require more work on the >> major mode's side. > I don't agree. Work must be done only in the generic multi-mode engines > (mmm-mode, polymode etc). The "is more general" I was quoting was talking about the ways the generic code can call the major-mode-specific code. If this is more generic, it means the major-mode-specific code needs to handle more situations (e.g. STRING-BEFORE appearing not just at the beginning of a chunk). > Other modes should re-use that generic infrastructure, or even better, > do nothing, and leave to someone else to define a new polymode with > host chunk being *the* mode. That every mode with basic needs for > inner sub-modes tries to re-invent the wheel is a dead end. I don't understand: every major mode's indentation code will have to pay attention to the STRING-BEFORE/AFTER that it receives from the generic code and will have to do something with it (it can ignore it but at the cost of sub-optimal results). And AFAIK this can only be done by the major mode's code, not the generic mode's code. [ I feel like I must be missing something. ] Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
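The "widen, but only up to a hard limit" idea that this exchange keeps returning to can be sketched roughly as below. Both my-hard-widen-limits and my-chunk-widen are hypothetical names used for illustration only; they stand in for the hard-widen-limits proposal and for (nth 1 prog-indentation-context), and the real definitions may well differ.

```elisp
;; Hypothetical variable: a (BEG . END) pair set by a multi-mode engine,
;; or nil when no chunk limits are in effect.
(defvar my-hard-widen-limits nil
  "Cons (BEG . END) beyond which `my-chunk-widen' will not widen, or nil.")

;; Hypothetical replacement for a plain `widen' call in a major mode.
(defun my-chunk-widen ()
  "Widen to `my-hard-widen-limits' if set, otherwise widen completely."
  (let ((beg (car my-hard-widen-limits))
        (end (cdr my-hard-widen-limits)))
    (if (and beg end)
        ;; `narrow-to-region' accepts any positions inside the whole
        ;; buffer, so this can also enlarge an existing narrowing.
        (narrow-to-region beg end)
      (widen))))
```

A major mode would call my-chunk-widen wherever it currently calls widen, typically inside save-restriction; with the variable left at nil the behavior is the same as before.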
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 18:31 ` Stefan Monnier @ 2016-03-21 19:16 ` Vitalie Spinu 2016-03-21 20:47 ` Stefan Monnier 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-21 19:16 UTC (permalink / raw) To: Stefan Monnier; +Cc: Alan Mackenzie, Dmitry Gutov, emacs-devel >> On Mon, Mar 21 2016 14:31, Stefan Monnier wrote: >> Other modes should re-use that generic infrastructure, or even better, >> do nothing, and leave to someone else to define a new polymode with >> host chunk being *the* mode. That every mode with basic needs for >> inner sub-modes tries to re-invent the wheel is a dead end. > I don't understand: every major mode's indentation code will have to pay > attention to the STRING-BEFORE/AFTER that it receives from the generic > code and will have to do something with it (it can ignore it but at the > cost of sub-optimal results). And AFAIK this can only be done by the > major mode's code, not the generic mode's code. > [ I feel like I must be missing something. ] The hope is that most modes will need only the default implementation. Maybe prog-indentation-function need not even know about those arguments. Along the lines of what Dmitry proposed, there could be an optional extra piece, prog-indentation-with-virtual-context-function, that modes might choose to set. I think it might be good to try this as part of a multi-mode engine first. I can try it with polymode. If it's generic enough it could be ported into Emacs at a later stage to be re-used with other multi-mode setups. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 19:16 ` Vitalie Spinu @ 2016-03-21 20:47 ` Stefan Monnier 0 siblings, 0 replies; 155+ messages in thread From: Stefan Monnier @ 2016-03-21 20:47 UTC (permalink / raw) To: Vitalie Spinu; +Cc: Alan Mackenzie, Dmitry Gutov, emacs-devel > The hope is that most modes will need the default implementation. Maybe > prog-indentation-function need not even know about those arguments. Along what > Dmitry proposed, there could be an optional extra piece > prog-indentation-with-virtual-context-function that modes might choose to set. That was the idea behind the prog-indent-context: indentation functions can choose to use it if they wish, but by default they don't have to. Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 16:42 ` Vitalie Spinu 2016-03-21 18:31 ` Stefan Monnier @ 2016-03-21 20:33 ` Alan Mackenzie 2016-03-21 20:49 ` Stefan Monnier ` (2 more replies) 1 sibling, 3 replies; 155+ messages in thread From: Alan Mackenzie @ 2016-03-21 20:33 UTC (permalink / raw) To: Vitalie Spinu; +Cc: Dmitry Gutov, Stefan Monnier, emacs-devel Hello, Vitalie. On Mon, Mar 21, 2016 at 05:42:51PM +0100, Vitalie Spinu wrote: > >> On Mon, Mar 21 2016 10:43, Stefan Monnier wrote: > >> parse-partial-sexp is called from code exclusively and it just happens > >> that in multi-modes it is called outside of narrow region quite often. > > How/why? Can you give some concrete scenario? > MM engine narrows to span region for a lot of tasks, most importantly > font-lock. If inner mode fontification functions misbehave (ignoring > font-lock-dont-widen for example) like c-mode does this leads to trouble. That's a misunderstanding of what `font-lock-dont-widen' is. It's purely a signal to font-lock. Its doc string makes clear that it's intended for use by major modes. It is for a major mode to set this flag, not to act on it. CC Mode absolutely needs to widen, to get the context necessary for correct fontification and indentation (which can be at an arbitrary depth). > So to avoid those troubles you would advise individual functions and > narrow them properly or apply other tricks like overwriting output > value or input args. It all works fine till that function calls > parse-partial-sexp (or some other low level function) and blows with > args-out-of-range error. Reading some of the posts on emacs-devel today, it strikes me that narrowing might be the wrong tool for marking the boundaries of distinct regions where different major modes are in effect. It seems to cause nothing but trouble. I don't know what the right tool is, and it may not currently exist in Emacs. 
But it might be a good use of time to work out what properties such boundary markers ought to have, and if necessary, to implement them. [ .... ] > Vitalie -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 20:33 ` Alan Mackenzie @ 2016-03-21 20:49 ` Stefan Monnier 2016-03-21 21:03 ` Drew Adams 2016-03-21 21:12 ` Dmitry Gutov 2 siblings, 0 replies; 155+ messages in thread From: Stefan Monnier @ 2016-03-21 20:49 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Vitalie Spinu, Dmitry Gutov, emacs-devel > Reading some of the posts on emacs-devel today, it strikes me that > narrowing might be the wrong tool for marking the boundaries of distinct > regions where different major modes are in effect. It seems to cause > nothing but trouble. That's why prog-indent-context just provides the boundaries of the current chunk but without narrowing. Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* RE: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 20:33 ` Alan Mackenzie 2016-03-21 20:49 ` Stefan Monnier @ 2016-03-21 21:03 ` Drew Adams 2016-03-21 21:12 ` Dmitry Gutov 2 siblings, 0 replies; 155+ messages in thread From: Drew Adams @ 2016-03-21 21:03 UTC (permalink / raw) To: Alan Mackenzie, Vitalie Spinu; +Cc: emacs-devel, Stefan Monnier, Dmitry Gutov > Reading some of the posts on emacs-devel today, it strikes me that > narrowing might be the wrong tool for marking the boundaries of distinct > regions where different major modes are in effect. It seems to cause > nothing but trouble. > > I don't know what the right tool is, and it may not currently exist > in Emacs. But it might be a good use of time to work out what > properties such boundary markers ought to have, and if necessary, > to implement them. Indeed. Just what properties do you need for "such boundary markers"? What's wrong with using _markers_ to mark area boundaries? (That's what I use in library zones.el, for example.) What's wrong with using text properties to mark areas? (That's what I use in library isearch-prop.el, for example.) I haven't been following this thread except for scanning it, so I don't really know what the need is for messing with narrowing (or for defining another, "harder" narrowing). I'm just hoping that at the end of the day our age-old narrowing feature will at least remain as it has been, for both programs and interactive use. (It doesn't give me confidence, a priori, to have seen that at least one developer involved expressed little understanding of how users and programs actually use narrowing, thinking that narrowing is only for hiding text you do not want to see.) How about a bit of a spec (description, summary) of what you really need? It's not even clear what the problem is that you are trying to solve. 
Look at the Subject line of this thread, and that of the bug thread that this one derived from, and that of the other derived thread, for the patch "hard-widen-limits". Do they really characterize what this is all about: the problem to be solved? ^ permalink raw reply [flat|nested] 155+ messages in thread
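As one possible reading of the "text properties to mark areas" suggestion in the message above, here is a small sketch that records chunk boundaries with a text property and recovers them without narrowing. The property name my-chunk-mode and the functions my-mark-chunk and my-chunk-bounds are hypothetical, invented for this illustration; this is not how zones.el, isearch-prop.el, polymode or mmm-mode actually store their regions.

```elisp
;; Hypothetical: tag the text between BEG and END as belonging to MODE.
(defun my-mark-chunk (beg end mode)
  "Mark the region BEG..END as a chunk handled by MODE."
  (put-text-property beg end 'my-chunk-mode mode))

;; Hypothetical: find the extent of the chunk around POS without narrowing.
;; POS must be the position of a character, i.e. less than `point-max'.
(defun my-chunk-bounds (pos)
  "Return (BEG . END) of the run of text around POS sharing one chunk mode."
  (cons
   ;; Scan backward from just after POS to the start of the run.
   (previous-single-property-change (1+ pos) 'my-chunk-mode nil (point-min))
   ;; Scan forward from POS to the end of the run.
   (next-single-property-change pos 'my-chunk-mode nil (point-max))))
```

For instance, after (my-mark-chunk 4 8 'ruby-mode) in a buffer containing "aaaBBBBccc", (my-chunk-bounds 5) returns (4 . 8) and (my-chunk-bounds 1) returns (1 . 4). Code that needs the boundaries can then pass them around explicitly instead of relying on point-min and point-max.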
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 20:33 ` Alan Mackenzie 2016-03-21 20:49 ` Stefan Monnier 2016-03-21 21:03 ` Drew Adams @ 2016-03-21 21:12 ` Dmitry Gutov 2 siblings, 0 replies; 155+ messages in thread From: Dmitry Gutov @ 2016-03-21 21:12 UTC (permalink / raw) To: Alan Mackenzie, Vitalie Spinu; +Cc: Stefan Monnier, emacs-devel Hi Alan, On 03/21/2016 10:33 PM, Alan Mackenzie wrote: >> MM engine narrows to span region for a lot of tasks, most importantly >> font-lock. If inner mode fontification functions misbehave (ignoring >> font-lock-dont-widen for example) like c-mode does this leads to trouble. > > That's a misunderstanding of what `font-lock-dont-widen' is. It's > purely a signal to font-lock. Its doc string makes clear that it's > intended for use by major modes. It does not. The docstring gives examples of the modes where it can be useful. It does not say that the variable can only be set by a major mode. ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 14:43 ` Stefan Monnier 2016-03-21 16:42 ` Vitalie Spinu @ 2016-03-21 16:45 ` Vitalie Spinu 2016-03-21 22:55 ` Dmitry Gutov 2016-03-22 14:51 ` Stefan Monnier 1 sibling, 2 replies; 155+ messages in thread From: Vitalie Spinu @ 2016-03-21 16:45 UTC (permalink / raw) To: Stefan Monnier; +Cc: Alan Mackenzie, emacs-devel, Dmitry Gutov I have split the answer in two separate conceptually distinct parts. This one is about indentation complexities of generic multi-modes. >> On Mon, Mar 21 2016 10:43, Stefan Monnier wrote: >> Current prog-indentation-context allows for possibility of a string to be >> inserted before begging in of current chunk. STRING-BEFORE is more more >> general than that because of the arbitrary POS that it can be applied to. > Good point. I didn't think of that. Do you make use of that > possibility, and/or can you give an example where it's useful? Please see example of erb mode below. >> Here is a simple example when inner mode cannot decide by itself on >> the indentation. Assume for concreteness a noweb header with some code >> immediately following the header: >> >> <<foo, some_arg=4>>= some_call(blabla) >> some_other_call(blabla) ## indented by offset 2 with respect to header or prev_chunk >> >> How do you indent the some_call(blabal) after the header? The most >> meaningful way is to keep it untouched just as user defined it. If >> inner mode would indent it by itself it would give offset of 4. This >> is a simple example of header dependence. > Maybe it's because I'm not familiar with noweb, but I didn't understand this > example. It looks like a very interesting example, so could you go over it > again in much more detail? Noweb is not essential here. The story will hold for pretty much all multi-modes with non-full-line headers. In noweb `<<foo, some_arg=4>>=` is a header of a chunk. 
Polymode places heads and tail in their own modes because they are not conceptually part of nor host or sub-mode. You can specify arbitrary parameters in the head which might even instruct how to indent the chunk. The first code line `some_call(blabla)` is placed on the same line with the head. This is uncommon but it's the simplest real case I can think of. There are two issues here. First one is how do you indent the head itself? Let's assume the point is after `foo`. If you follow the naive prog-indentation-context the indentation should be handled by the mode in the head chunk, right? Let's call it noweb-head-mode. This mode is the same for many noweb host-mode/inner-mode combinations and defaults to poly-head-tail-mode. Host mode is commonly LaTeX but it can be anything. One reasonable way to indent it is to use the host mode indentation engine. Note that this is in contrast of the prog-indentation-context assumption for which PREVIOUS-CHUNK is assumed to be of the same mode type as the current type. The second issue is with respect to the first line immediately after the header. If you naively call inner mode indentation engine on that line in a narrowed buffer starting after >>= you will get it indented to FIRST-COLUMN, which in the above case is the indentation of the head, plus noweb chunk offset which is a polymode specific thing and it is customizable per inner mode. Do you really want to insert that space after >>=? Probably not. So the code following the header is special. That means that you will either have to take care of that in multi-mode engine or extend prog-indentation-context. Think of markdown simple code spans `this = is(a, codes, span)` which can occur anywhere in the buffer. Indentation is not meaningful within the span at all, the whole chunk should be indented by the outer mode just before the opening `. Real trouble comes with continuation chunks. 
You might need to have a completely reversed indentation logic - in outer/host spans the MM engine needs to call the inner mode for indentation. Consider this example of erb mode taken from https://github.com/fxbois/web-mode/blob/master/tests/demo.erb. <div id='header'> <% if signed_in? -%> <%= link_to t('.sign_out'), sign_out_path, :method => :delete %> <% else -%> <%= link_to t('.sign_in'), sign_in_path %> <% end -%> </div> One meaningful approach here is to indent the if-else-end block using inner mode rules, right? This is what web-mode seems to be currently doing. Assume you are just in front of `<%= link_to`. This is the host html mode. But you need to indent according to the inner mode construct. So what do you do? You go to the end of the previous code chunk and call prog-indentation-function of the inner mode with STRING-BEFORE = "\n" and STRING-AFTER="link_to t('.sign_out'), sign_out_path, :method => :delete". Simple, isn't it? That's precisely my proposal. The message is that whatever you try you will not be able to completely leave all the work to the inner mode or capture it with naive constructs like prog-indentation-context. Quite the opposite, new complexities are likely to make multi-mode authors' lives harder. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 16:45 ` Vitalie Spinu @ 2016-03-21 22:55 ` Dmitry Gutov 2016-03-22 14:51 ` Stefan Monnier 1 sibling, 0 replies; 155+ messages in thread From: Dmitry Gutov @ 2016-03-21 22:55 UTC (permalink / raw) To: Vitalie Spinu, Stefan Monnier; +Cc: Alan Mackenzie, emacs-devel On 03/21/2016 06:45 PM, Vitalie Spinu wrote: > Real trouble comes with continuation chunks. You might need to have a completely > reversed indentation logic - in outer/host spans MM engine needs to call inner > mode for indentation. Consider this example of erb mode taken from > https://github.com/fxbois/web-mode/blob/master/tests/demo.erb. > > <div id='header'> > <% if signed_in? -%> > <%= link_to t('.sign_out'), sign_out_path, :method => :delete %> > <% else -%> > <%= link_to t('.sign_in'), sign_in_path %> > <% end -%> > </div> FWIW, I've successfully implemented indentation for ERB (and EJS) files without delegating to ruby-mode and js-mode indentation code: https://github.com/purcell/mmm-mode/blob/c9a857a638701482931ffaaee262b61ce53489f3/mmm-erb.el#L157-L225 (it indents web-mode's example almost identically) It should be easy to add support for similar types of files with almost the same code. But yes, it implies adding support for each new template format manually. > One meaningful approach here is to indent if-else-end block using inner mode > rules, right? This is what web-mode seems to be currently doing. Assume you are > just in front of `<%= link_to`. This is host hmtl mode. But you need to indent > according to inner mode construct. So what do you do? You go to the end of > previous code chunk and call prog-indentation-function of inner mode with > STRING-BEFORE = "\n" and STRING-AFTER="link_to t('.sign_out'), sign_out_path, > :method => :delete". Simple isn't it? That's precisely my proposal. 
It's simpler for the caller, but I'm having a hard time imagining how to implement it properly on a major mode's side without inserting those strings into the buffer, or using a temporary buffer. If STRING-BEFORE and STRING-AFTER are not allowed to contain newlines, yes, that becomes easier to handle, but that also loses generality. ^ permalink raw reply [flat|nested] 155+ messages in thread
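The temporary-buffer route mentioned in the message above could look roughly like this. The function my-virtual-indent and its CONTEXT argument are hypothetical; nothing of this shape was agreed on in the thread, and a real implementation would have to decide how much surrounding text to copy and what it costs. The sketch only shows that STRING-BEFORE and STRING-AFTER can be handed to an unmodified major mode by materializing them in a scratch buffer.

```elisp
;; Hypothetical: ask MODE how it would indent the line made of
;; STRING-AFTER when it follows CONTEXT and then STRING-BEFORE.
(defun my-virtual-indent (mode context string-before string-after)
  "Return the column MODE would indent STRING-AFTER to.
CONTEXT is the inner-mode text preceding the virtual insertion point."
  (with-temp-buffer
    (funcall mode)                        ; enable the inner major mode
    (insert context string-before)        ; text before the virtual point
    (save-excursion (insert string-after)) ; text after it, point stays put
    (indent-according-to-mode)            ; let the mode indent this line
    (current-indentation)))
```

For example, (my-virtual-indent #'emacs-lisp-mode "(when foo" "\n" "(bar))") returns 2, the column at which emacs-lisp-mode places the body of a when form. In the ERB case from this thread the call would use the inner Ruby mode, the preceding Ruby chunk as CONTEXT, and "\n" as STRING-BEFORE.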
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 16:45 ` Vitalie Spinu 2016-03-21 22:55 ` Dmitry Gutov @ 2016-03-22 14:51 ` Stefan Monnier 2016-03-22 18:17 ` Vitalie Spinu 2016-03-22 18:26 ` Vitalie Spinu 1 sibling, 2 replies; 155+ messages in thread From: Stefan Monnier @ 2016-03-22 14:51 UTC (permalink / raw) To: emacs-devel > Noweb is not essential here. The story will hold for pretty much all > multi-modes with non-full-line headers. In noweb `<<foo, > some_arg=4>>=` is a header of a chunk. Polymode places heads and tail > in their own modes because they are not conceptually part of nor host > or sub-mode. You can specify arbitrary parameters in the head which > might even instruct how to indent the chunk. The first code line > `some_call(blabla)` is placed on the same line with the head. This is > uncommon but it's the simplest real case I can think of. OK. > There are two issues here. Haha! > First one is how do you indent the head itself? Let's assume point is > after `foo`. If you follow the naive prog-indentation-context the > indentation should be handled by the mode in the head chunk, right? If it's got its own "chunk", then yes. > Let's call it noweb-head-mode. This mode is the same for many noweb > host-mode/inner-mode combinations and defaults to poly-head-tail-mode. OK. > Host mode is commonly LaTeX but it can be anything. OK. > One reasonable way to indent it is to use the host mode > indentation engine. Right, since the beginning of line is still in latex-mode, the "<<foo, some_arg=4>>= some_call(blabla)" line would be indented by latex-mode. I.e. the generic code would go to BOL, and call latex-mode's indentation while setting prog-indentation-context with an "end of chunk" that's at point. > Note that this is in contrast of the prog-indentation-context > assumption for which PREVIOUS-CHUNK is assumed to be of the same mode > type as the current type. 
Not sure what PREVIOUS-CHUNK has to do with it. > The second issue is with respect to the first line immediately after the > header. Since it's not on its own line, I don't see why it would be an issue for indentation. > If you naively call inner mode indentation engine on that line in a > narrowed buffer starting after >>= you will get it indented to FIRST-COLUMN, That's also one of the reasons why I didn't want to impose narrowing in prog-indentation-context. > mode for indentation. Consider this example of erb mode taken from > https://github.com/fxbois/web-mode/blob/master/tests/demo.erb. > > <div id='header'> > <% if signed_in? -%> > <%= link_to t('.sign_out'), sign_out_path, :method => :delete %> > <% else -%> > <%= link_to t('.sign_in'), sign_in_path %> > <% end -%> > </div> IIUC, we have here a tight interleaving of lots of little chunks, alternating between HTML and ..[according to duckduckgo].. Ruby. > One meaningful approach here is to indent if-else-end block using inner mode > rules, right? Another approach would be to consider it as a sequence of chunks, rather than as chunks of one mode nested in another. So each chunk controls the FIRST-COLUMN of the next chunk. In any case, this seems messy. > This is what web-mode seems to be currently doing. Assume you are > just in front of `<%= link_to`. This is host hmtl mode. But you need to indent > according to inner mode construct. So what do you do? You go to the end of > previous code chunk and call prog-indentation-function of inner mode with > STRING-BEFORE = "\n" and STRING-AFTER="link_to t('.sign_out'), sign_out_path, > :method => :delete". Simple isn't it? That's precisely my proposal. Ah, now I see a use of STRING-BEFORE and STRING-AFTER, thanks. This case of STRING-BEFORE being "\n" is very special: in SMIE, the core indentation function (smie-indent-calculate) basically behaves as if it's always called with STRING-BEFORE="\n". 
IOW, we could define prog-indent-function as always behaving "as if STRING-BEFORE was \n". In the normal case, the foo-calculate-indent function is called at the beginning of line anyway, so adding a STRING-BEFORE="\n" won't affect its behavior. As for STRING-AFTER, the example is compelling, but I don't yet understand really how it would all work out overall. I'm thinking of cases like: <% 3.times do %> <li> some text <% if signed_in? -%> <%= link_to t('.sign_out'), sign_out_path, :method => :delete %> <% else -%> <%= link_to t('.sign_in'), sign_in_path %> <% end -%> </li> <% end %> How should the "generic" code that links HTML and Ruby know when to indent using the HTML indentation code and when to use the Ruby indentation rules? Maybe my suggestion of considering it as a sequence of chunks (where each chunk controls the FIRST-COLUMN of the next chunk) could work, but it's far from obvious. Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-22 14:51 ` Stefan Monnier @ 2016-03-22 18:17 ` Vitalie Spinu 2016-03-23 1:18 ` Dmitry Gutov 2016-03-23 13:18 ` Stefan Monnier 2016-03-22 18:26 ` Vitalie Spinu 1 sibling, 2 replies; 155+ messages in thread From: Vitalie Spinu @ 2016-03-22 18:17 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel >> On Tue, Mar 22 2016 10:51, Stefan Monnier wrote: >> The second issue is with respect to the first line immediately after the >> header. > Since it's not on its own line, I don't see why it would be an issue > for indentation. It's a problem if you narrow to current span and allow inner mode to indent first line. So one way or another, multi-mode has to interfere in this case beyond FIRST-COLUMN hint. Without narrowing it's not clear what is the contract that inner mode should respect to handle previous chunk locations. It's not even clear if previous locations should be of the same modes chunk, previous head span or maybe a set of heterogeneous chunks. In any case once the inner mode gets locations of previous chunks it all becomes an very messy open question. Modes can decide to do whatever they see fit. The STRING-BEFORE/AFTER system, not ideal of course, but it keeps the mode within its own world and doesn't leave much space for "improvisation". >> mode for indentation. Consider this example of erb mode taken from >> https://github.com/fxbois/web-mode/blob/master/tests/demo.erb.> >> <div id='header'> >> <% if signed_in? -%> >> <%= link_to t('.sign_out'), sign_out_path, :method => :delete %> >> <% else -%> >> <%= link_to t('.sign_in'), sign_in_path %> >> <% end -%> >> </div> >> One meaningful approach here is to indent if-else-end block using inner mode >> rules, right? > Another approach would be to consider it as a sequence of chunks, rather > than as chunks of one mode nested in another. So each chunk controls > the FIRST-COLUMN of the next chunk. 
This will not work in above case. <%else-%> chunk needs to know about where <%if signed_in? -%> was indented which is not an immediately preceding chunk. It's hard to think of better solution than collecting all relevant previous chunks in one place and indenting according to inner mode. In order to indent "<%else-%>", STRING-BEFORE should be full "link_to ..." line. So basically STRING-BEFORE must consist of all ruby spans in between "if" and "else" chunks. > In any case, this seems messy. Yeah. Very much. > As for STRING-AFTER, the example is compelling, but I don't yet > understand really how it would all work out overall. Neither do I. Strings are hard to process in Emacs, and the mode will need to either modify the current buffer by inserting the string in a special region or use a separate buffer for that. I tend to agree with Dmitry: if you decide not to pass chunk locations to inner modes, then there is not much point in getting complicated with passing BEFORE/AFTER strings. The multi-mode engine can take care of that satisfactorily. > How should the "generic" code that links HTML and Ruby know when to indent > using the HTML indentation code and when to use the Ruby indentation rules? No idea. Dmitry should have an answer for that. He implemented mmm-erb. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
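[Editorial sketch] Collecting the earlier Ruby spans into one STRING-BEFORE, as described above, might be done along these lines. Everything here is hypothetical (`my-erb-code-spans', `my-string-before' and `my-demo-string-before' exist in no package), and the tag matching is deliberately naive: it assumes "<%" and "%>" never occur inside Ruby strings.

```elisp
;; Hypothetical helpers for illustration only.

(defun my-erb-code-spans ()
  "Return a list of (START . END) pairs, one per ERB tag in the buffer.
Each pair covers only the code inside the tag, without the delimiters."
  (save-excursion
    (goto-char (point-min))
    (let (spans)
      (while (re-search-forward "<%=?[ \t]*" nil t)
        (let ((start (point)))
          ;; The span ends where the closing delimiter (with any
          ;; whitespace in front of it) begins.
          (when (re-search-forward "[ \t]*-?%>" nil t)
            (push (cons start (match-beginning 0)) spans))))
      (nreverse spans))))

(defun my-string-before (spans)
  "Join the buffer text of SPANS, one span per line."
  (mapconcat (lambda (span)
               (buffer-substring-no-properties (car span) (cdr span)))
             spans "\n"))

(defun my-demo-string-before ()
  "Build STRING-BEFORE for the `else' chunk of a small ERB snippet."
  (with-temp-buffer
    (insert "<% if signed_in? -%>\n"
            "  <%= link_to t('.sign_out'), sign_out_path %>\n"
            "<% else -%>\n")
    ;; Every span except the last one (the `else' itself) goes in.
    (my-string-before (butlast (my-erb-code-spans)))))
```

This takes the "always include all previous chunks" shortcut; picking only the spans between a matching "if" and "else" would need knowledge of Ruby block structure, which is exactly the difficulty raised in the thread.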
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-22 18:17 ` Vitalie Spinu @ 2016-03-23 1:18 ` Dmitry Gutov 2016-03-23 13:18 ` Stefan Monnier 1 sibling, 0 replies; 155+ messages in thread From: Dmitry Gutov @ 2016-03-23 1:18 UTC (permalink / raw) To: Vitalie Spinu, Stefan Monnier; +Cc: emacs-devel On 03/22/2016 08:17 PM, Vitalie Spinu wrote: > This will not work in above case. <%else-%> chunk needs to know about where <%if > signed_in? -%> was indented which is not an immediately preceding chunk. > > It's hard to think of better solution than collecting all relevant previous > chunks in one place and indenting according to inner mode. In order to indent > "<%else-%>", STRING-BEFORE should be full "link_to ..." line. So basically > STRING-BEFORE must consist of all ruby spans in between "if" and "else" chunks. ...and the multi-mode package would have to know, somehow, that the "if" chunk is special in this regard, and know which "if" matches which "end", etc. Or simply always include all previous chunks in the given mode in STRING-BEFORE. >> How should the "generic" code that links HTML and Ruby know when to indent >> using the HTML indentation code and when to use the Ruby indentation rules? > > No idea. Dmitry should have an answer for that. He implemented mmm-erb. Again, mmm-erb is written to support a limited set of template languages (currently, two, though supporting JSP would be trivial-ish, were java-mode not a part of CC Mode, with associated pitfalls). So IME, the multi-mode package needs to hardcode, in some form or another, the knowledge of which file formats use this approach to indentation. And since we're doing that anyway, using simpler indentation code in those particular files doesn't seem like a bad idea either. (For non-continuation chunks, at least). 
BTW, web-mode doesn't seem to dispatch to inner modes' functions at all: https://github.com/fxbois/web-mode/blob/c5aacacb8f4c233844306806a102405c8e9671c9/web-mode.el#L7164-L7198. I'm not a fan of this approach in general, but that clearly means that it can work for indentation. ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-22 18:17 ` Vitalie Spinu 2016-03-23 1:18 ` Dmitry Gutov @ 2016-03-23 13:18 ` Stefan Monnier 1 sibling, 0 replies; 155+ messages in thread From: Stefan Monnier @ 2016-03-23 13:18 UTC (permalink / raw) To: emacs-devel >> Since it's not on its own line, I don't see why it would be an issue >> for indentation. > It's a problem if you narrow to current span and allow inner mode to indent > first line. I guess it's an issue if the buffer is "always narrowed", in which case the "first" line might get indented accidentally, indeed. But otherwise, there's no reason for the generic mode to go through the trouble of "narrow + indent" this partial line. > Without narrowing it's not clear what is the contract that inner mode should > respect to handle previous chunk locations. For prog-indent-context we provide (START . END), as well as PREVIOUS-CHUNKS, where the contract is that the major mode's indentation code should only look at those parts of the buffer. It's up to the mode to decide whether it does that via narrowing, or some other way. Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-22 14:51 ` Stefan Monnier 2016-03-22 18:17 ` Vitalie Spinu @ 2016-03-22 18:26 ` Vitalie Spinu 2016-03-23 2:07 ` Stefan Monnier 1 sibling, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-22 18:26 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel >> On Tue, Mar 22 2016 10:51, Stefan Monnier wrote: > As for STRING-AFTER, the example is compelling, but I don't yet > understand really how it would all work out overall. How about passing signatures to indentation-function? Assume that there is a way to represent previous indentation context with a simple data structure, akin to parse-partial-sexp but for indentation. Then you can compute indentation context of a span by passing to the indentation-function the content of the previous location. Of course the indentation context data structure should be mode specific and modes must be constructing it themselves. But some useful degree of uniformity is surely possible. For example FIRST-COLUMN is a very simple one-dimensional signature. Besides being useful for incremental indentation within a mode, it can be directly leveraged by multi-modes. Just pick the context from the previous chunk, modify it as needed and pass it to the next chunk. (Of course locations of previous chunks should not be part of the signature). Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-22 18:26 ` Vitalie Spinu @ 2016-03-23 2:07 ` Stefan Monnier 2016-03-23 10:56 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Stefan Monnier @ 2016-03-23 2:07 UTC (permalink / raw) To: Vitalie Spinu; +Cc: emacs-devel > Of course the indentation context data structure should be mode specific and > modes must be constructing it themselves. But some useful degree of uniformity > is surely possible. For example FIRST-COLUMN is a very simple one dimensional > signature. Yes, it's an attractive idea. But for example in the case of SMIE we never compute this context directly, instead we discover it as we parse the text backward from point. But I guess we could represent the context as an integer (the position from which to parse backward). Still, in the ERB case we'd need to mix the HTML context with the Ruby context, so the representation of the context can't be "internal to the major mode". Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-23 2:07 ` Stefan Monnier @ 2016-03-23 10:56 ` Vitalie Spinu 2016-03-23 11:41 ` Stefan Monnier 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-23 10:56 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel >> On Tue, Mar 22 2016 22:07, Stefan Monnier wrote: > Still, in the ERB case we'd need to mix the HTML context with the Ruby > context, so the representation of the context can't be "internal to the > major mode". The signature is internal to the inner mode but will have a generic part which multi-mode can understand and modify. Multi-mode will take care of the interleaving and mixing when needed. I think there is absolutely no way to avoid a "supervisor", but each inner mode need not know about the big picture. In the ERB case each ruby line will be asked for an indentation offset in a narrowed buffer with context from the previous ruby span. Then multi-mode will indent the whole <% ... %> according to this inner offset plus an offset derived from the parent html element. The last step is to ask the ruby mode to produce an indentation context at the end of this ruby line and cache it. > But I guess we could represent the context as an integer (the position from > which to parse backward). No, no. Context should not have absolute positions in it. That would ruin the whole thing. It should contain information about the nesting of language constructs sufficient to be able to indent first line of an inner span without any other positional knowledge. For example, if current line is directly part of IF block, most languages don't care what precedes IF head at all, only the offset of IF. So an entry in the context data structure might look like (IF . IF-OFFSET). If line number within the block is important then you can pass (IF IF-OFFSET . RELATIVE-LINE). 
My hunch is that for most languages you would be able to reduce indentation to a small number of block-continuation constructs (less than 10). Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
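[Editorial sketch] To make the (IF . IF-OFFSET) idea above concrete, here is one way such a position-free context could be consumed. The representation, the function names and the fixed block step are all assumptions made for illustration; no mode in Emacs produces or accepts this structure.

```elisp
;; Hypothetical representation: a context is a list of (CONSTRUCT . OFFSET)
;; entries, innermost first, and contains no buffer positions.

(defvar my-block-step 2
  "Assumed number of columns added for a line directly inside a block.")

(defun my-indent-from-context (context)
  "Return the inner-mode column for the first line of a span, given CONTEXT."
  (let ((innermost (car context)))
    (if innermost
        (+ (cdr innermost) my-block-step)
      0)))

(defun my-multi-mode-indent (inner-context host-offset)
  "Combine the inner mode's column with HOST-OFFSET from the host mode.
This is the step a multi-mode supervisor would perform for each chunk."
  (+ host-offset (my-indent-from-context inner-context)))

;; A Ruby line directly inside an `if' whose own offset is 0, in a chunk
;; whose enclosing HTML element contributes 4 columns:
(my-multi-mode-indent '((if . 0)) 4)    ; => 6
```

The inner mode never sees where earlier chunks are; it only sees the summary, and the supervisor adds the host offset on top.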
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-23 10:56 ` Vitalie Spinu @ 2016-03-23 11:41 ` Stefan Monnier 2016-03-23 12:39 ` Vitalie Spinu 2016-03-24 7:30 ` Andreas Röhler 0 siblings, 2 replies; 155+ messages in thread From: Stefan Monnier @ 2016-03-23 11:41 UTC (permalink / raw) To: Vitalie Spinu; +Cc: emacs-devel > No, no. Context should not have absolute positions in it. That would ruin the > whole thing. It should contain information about the nesting of language > constructs sufficient to be able to indent first line of an inner span without > any other positional knowledge. As explained, the SMIE indentation has no such "summary of context". The only context it could use is either a position or the complete text before that position. > For example, if current line is directly part of IF block, most > languages don't care what precedes IF head at all, only the offset > of IF. Yes, there are simple cases we know how to handle. The problem is the general case, along with the work to modify the existing indentation codes to be able to generate and use that data. > So an entry in the context data structure might look like (IF > . IF-OFFSET). If line number within the block is important then you > can pass (IF IF-OFFSET . RELATIVE-LINE). My hunch is that for most > languages you would be able to reduce indentation to a small number of > block-continuation constructs (less than 10). Consider x = a << b + c * d the code after this line can be indented in various different ways: * e or + e or == e or ; e And of course, in this example, I put all relevant operators, but in practice they'll generally be on different lines. And SMIE (which supports that kinds of indentation, e.g. in sm-c-mode) doesn't pre-compute that context: it's only when it sees the "+" at the beginning of line that it moves back over higher-precedence operators to find the matching alignment spot. 
In theory, SMIE could try to create the kind of context you're thinking of, but that would amount to a complete rewrite (and it would likely be very difficult if not impossible to make it work with existing smie-rules-functions, so it'd break backward compatibility). Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-23 11:41 ` Stefan Monnier @ 2016-03-23 12:39 ` Vitalie Spinu 2016-03-23 13:23 ` Stefan Monnier 2016-03-24 7:30 ` Andreas Röhler 1 sibling, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-23 12:39 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel >> On Wed, Mar 23 2016 07:41, Stefan Monnier wrote: > In theory, SMIE could try to create the kind of context you're thinking of, > but that would amount to a complete rewrite (and it would likely be very > difficult if not impossible to make it work with existing > smie-rules-functions, so it'd break backward compatibility). You are the best to judge what is possible or not. I had in mind that each mode will build such a context, but tackling it at SMIE level seems like a much better start. Will be back when I have a better understanding of it. SMIE is relatively new; 7 modes in emacs and 5 in ELPA are already using it, but it probably can be still molded here and there. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-23 12:39 ` Vitalie Spinu @ 2016-03-23 13:23 ` Stefan Monnier 2016-03-23 15:28 ` Dmitry Gutov 0 siblings, 1 reply; 155+ messages in thread From: Stefan Monnier @ 2016-03-23 13:23 UTC (permalink / raw) To: emacs-devel > SMIE is relatively new; 7 modes in emacs and 5 in ELPA are already > using it, but it probably can be still molded here and there. Sure. I'm just pointing out that it's a difficult modification to make, at least for some indentation codes (and probably for many of them). Maybe it's OK to design a multi-mode system which requires every major mode that wants to play with it well (e.g. well enough to get the kind of behavior we want for ERB) to basically rewrite its indentation code. But this won't fly unless we also make it possible to use major modes which haven't been rewritten in that way. Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-23 13:23 ` Stefan Monnier @ 2016-03-23 15:28 ` Dmitry Gutov 2016-03-23 21:51 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2016-03-23 15:28 UTC (permalink / raw) To: Stefan Monnier, emacs-devel On 03/23/2016 03:23 PM, Stefan Monnier wrote: > Maybe it's OK to design a multi-mode system which requires every major > mode that wants to play with it well (e.g. well enough to get the kind > of behavior we want for ERB) to basically rewrite its indentation code. > > But this won't fly unless we also make it possible to use major modes > which haven't been rewritten in that way. Supporting both approaches would also require some feature discovery mechanism, or hardcoding a list of modes that support the "advanced" way, somewhere. Can we agree to shelve the PREVIOUS-CHUNKS/STRING-BEFORE/etc discussion until someone comes up with a patch that shows a convincing usage of it, in multiple modes? Preferably with some performance numbers, showing a corresponding improvement when used together with some multi-mode package. ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-23 15:28 ` Dmitry Gutov @ 2016-03-23 21:51 ` Vitalie Spinu 0 siblings, 0 replies; 155+ messages in thread From: Vitalie Spinu @ 2016-03-23 21:51 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Stefan Monnier, emacs-devel >> On Wed, Mar 23 2016 17:28, Dmitry Gutov wrote: > On 03/23/2016 03:23 PM, Stefan Monnier wrote: > Can we agree to shelve the PREVIOUS-CHUNKS/STRING-BEFORE/etc discussion until > someone comes with a patch that shows a convincing usage of it, in multiple > modes? > Preferably with some performance numbers, showing a corresponding improvement > when used together with some multi-mode package. Yeps. The topic has been exhausted for now. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-23 11:41 ` Stefan Monnier 2016-03-23 12:39 ` Vitalie Spinu @ 2016-03-24 7:30 ` Andreas Röhler 1 sibling, 0 replies; 155+ messages in thread From: Andreas Röhler @ 2016-03-24 7:30 UTC (permalink / raw) To: emacs-devel; +Cc: Vitalie Spinu, Stefan Monnier On 23.03.2016 12:41, Stefan Monnier wrote: >> No, no. Context should not have absolute positions in it. That would ruin the >> whole thing. It should contain information about the nesting of language >> constructs sufficient to be able to indent first line of an inner span without >> any other positional knowledge. > As explained, the SMIE indentation has no such "summary of context". > The only context it could use is either a position or the complete text > before that position. > >> For example, if current line is directly part of IF block, most >> languages don't care what precedes IF head at all, only the offset >> of IF. > Yes, there are simple cases we know how to handle. The problem is the > general case, along with the work to modify the existing indentation > codes to be able to generate and use that data. > >> So an entry in the context data structure might look like (IF >> . IF-OFFSET). If line number within the block is important then you >> can pass (IF IF-OFFSET . RELATIVE-LINE). My hunch is that for most >> languages you would be able to reduce indentation to a small number of >> block-continuation constructs (less than 10). > Consider > > x = a << b + c * d > > the code after this line can be indented in various different ways: > > * e > or > + e > or > == e > or > ; e > > And of course, in this example, I put all relevant operators, but in > practice they'll generally be on different lines. And SMIE (which > supports that kinds of indentation, e.g. 
in sm-c-mode) doesn't > pre-compute that context: it's only when it sees the "+" at the > beginning of line that it moves back over higher-precedence operators to > find the matching alignment spot. > > In theory, SMIE could try to create the kind of context you're thinking > of, but that would amount to a complete rewrite (and it would likely be > very difficult if not impossible to make it work with existing > smie-rules-functions, so it'd break backward compatibility). > > > Stefan > Stefan, came across SMIE as it looked like an interesting abstraction. Estimate your efforts. Nonetheless think that path turned out wrong meanwhile. SMIE-based indentation will not be easier, but run into more and more complexity instead. The reasons deserve being discussed elsewhere. ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 3:11 ` Stefan Monnier 2016-03-21 5:05 ` Vitalie Spinu @ 2016-03-21 11:56 ` Dmitry Gutov 1 sibling, 0 replies; 155+ messages in thread From: Dmitry Gutov @ 2016-03-21 11:56 UTC (permalink / raw) To: Stefan Monnier, Vitalie Spinu; +Cc: Alan Mackenzie, emacs-devel On 03/21/2016 05:11 AM, Stefan Monnier wrote: > I must say I don't understand how what we have is so very different from > what you suggest. Of course, I fully agree on the need to deprecate > indent-line-function and use a side-effect free replacement which > returns the desired indentation (instead of performing the indentation). > > I think both suggestions require changes to every mode, and in both > cases the changes can be reduced to a one-liner or close enough (for > the simple case). Admittedly, for it to be a one-liner, we'll need to > provide a standard helper function. It also sounds like we should revert the changes that brought in prog-indentation-context in emacs-25, and proceed with the results of this discussion on master. Provided we reach an agreement here, of course. ^ permalink raw reply [flat|nested] 155+ messages in thread
* [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-21 1:05 ` Vitalie Spinu 2016-03-21 3:11 ` Stefan Monnier @ 2016-03-21 5:08 ` Vitalie Spinu 2016-03-21 12:39 ` Stefan Monnier 2016-03-21 11:47 ` Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] Dmitry Gutov 2 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-21 5:08 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel [-- Attachment #1: Type: text/plain, Size: 385 bytes --] >> On Mon, Mar 21 2016 02:05, Vitalie Spinu wrote: >> Are you interested in working on a patch? Also Cc'ing Stefan. > My knowledge of emacs C internals is close to 0. Elisp side (and probably C > side) of this is trivial. I will look into it but I don't think I am the best > person for that. The widen part turned to be easy. Will look at parse-partial-sexp tomorrow. Vitalie [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: Type: text/x-diff, Size: 2049 bytes --] From eafcff9e72499e3bb6dc462d406dd33a885e3d49 Mon Sep 17 00:00:00 2001 From: Vitalie Spinu <spinuvit@gmail.com> Date: Mon, 21 Mar 2016 05:41:55 +0100 Subject: [PATCH] Implement hard-narrowing `widen` now respects restrictions imposed by new variable `hard-widen-limits` --- src/buffer.c | 5 +++++ src/editfns.c | 14 +++++++++++++- 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/src/buffer.c b/src/buffer.c index f06d7e0..5232c49 100644 --- a/src/buffer.c +++ b/src/buffer.c @@ -6219,6 +6219,11 @@ and disregard a `read-only' text property if the property value is a member of the list. */); Vinhibit_read_only = Qnil; + DEFVAR_LISP ("hard-widen-limits", Vhard_widen_limits, + doc: /* When non-nil `widen` will widen to these limits. +Must be a cons of the form (MIN . MAX) where MIN and MAX are integers or markers. 
*/); + Vhard_widen_limits = Qnil; + DEFVAR_PER_BUFFER ("cursor-type", &BVAR (current_buffer, cursor_type), Qnil, doc: /* Cursor to use when this buffer is in the selected window. Values are interpreted as follows: diff --git a/src/editfns.c b/src/editfns.c index 2ac0537..fb1f652 100644 --- a/src/editfns.c +++ b/src/editfns.c @@ -3480,12 +3480,24 @@ DEFUN ("delete-and-extract-region", Fdelete_and_extract_region, return empty_unibyte_string; return del_range_1 (XINT (start), XINT (end), 1, 1); } + \f DEFUN ("widen", Fwiden, Swiden, 0, 0, "", doc: /* Remove restrictions (narrowing) from current buffer. -This allows the buffer's full text to be seen and edited. */) +This allows the buffer's full text to be seen and edited. +If `hard-widen-limits` is non-nil, widen only to those limits. */) (void) { + + if (! NILP (Vhard_widen_limits)) + { + CHECK_CONS(Vhard_widen_limits); + Lisp_Object hbeg = XCAR(Vhard_widen_limits); + Lisp_Object hend = XCDR(Vhard_widen_limits); + Fnarrow_to_region(hbeg, hend); + return Qnil; + } + if (BEG != BEGV || Z != ZV) current_buffer->clip_changed = 1; BEGV = BEG; -- 2.5.0 ^ permalink raw reply related [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-21 5:08 ` [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] Vitalie Spinu @ 2016-03-21 12:39 ` Stefan Monnier 2016-03-21 12:54 ` Vitalie Spinu 2016-03-21 14:04 ` Stefan Monnier 0 siblings, 2 replies; 155+ messages in thread From: Stefan Monnier @ 2016-03-21 12:39 UTC (permalink / raw) To: Vitalie Spinu; +Cc: Alan Mackenzie, emacs-devel, Dmitry Gutov > + DEFVAR_LISP ("hard-widen-limits", Vhard_widen_limits, > + doc: /* When non-nil `widen` will widen to these limits. > +Must be a cons of the form (MIN . MAX) where MIN and MAX are integers or markers. */); > + Vhard_widen_limits = Qnil; Sorry to nitpick, but I'm not completely happy with this API. As an implementation it might be OK, but I can imagine wanting to change the implementation in the future but being stuck by the exposed internals. So I suggest we instead expose only a new primitive "call-with-hard-narrowing" which could look like: (defun call-with-hard-narrowing (from to func) (make-local-variable 'internal--hard-widen-limits) (let ((internal--hard-widen-limits (cons from to))) (funcall func))) which could be supplemented with a corresponding macro (defmacro with-hard-narrowing (from to &rest body) `(call-with-hard-narrowing ,from ,to (lambda () ,@body))) -- Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
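[Editorial sketch] The effect of the proposal above can be tried out in plain Lisp without patching `widen' in C. The `my-' names below are hypothetical stand-ins: `my-hard-widen-limits' plays the role of `internal--hard-widen-limits', and `my-widen' mimics what the patched `widen' would do. The macro splices BODY with ,@ so that it accepts several forms.

```elisp
(defvar my-hard-widen-limits nil
  "Hypothetical stand-in for `internal--hard-widen-limits'.
Either nil or a cons (MIN . MAX) of buffer positions.")

(defun my-widen ()
  "Like `widen', but stop at `my-hard-widen-limits' when it is non-nil."
  (if my-hard-widen-limits
      (narrow-to-region (car my-hard-widen-limits)
                        (cdr my-hard-widen-limits))
    (widen)))

(defun my-call-with-hard-narrowing (from to func)
  "Call FUNC with `my-widen' limited to the region FROM..TO."
  (let ((my-hard-widen-limits (cons from to)))
    (funcall func)))

(defmacro my-with-hard-narrowing (from to &rest body)
  "Evaluate BODY with `my-widen' limited to the region FROM..TO."
  (declare (indent 2))
  `(my-call-with-hard-narrowing ,from ,to (lambda () ,@body)))

(defun my-hard-narrowing-demo ()
  "Narrow further inside the hard limits, then widen back out to them."
  (with-temp-buffer
    (insert "0123456789")
    (my-with-hard-narrowing 3 8
      (narrow-to-region 4 5)
      (my-widen)
      (buffer-string))))
```

Calling `my-hard-narrowing-demo' returns "23456": the inner narrowing is undone, but only as far as the hard limits. Because the limits are bound with `let', they vanish when the body exits, so this sketch covers only the scoped case.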
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-21 12:39 ` Stefan Monnier @ 2016-03-21 12:54 ` Vitalie Spinu 2016-03-21 14:07 ` Stefan Monnier 2016-03-21 14:04 ` Stefan Monnier 1 sibling, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-21 12:54 UTC (permalink / raw) To: Stefan Monnier; +Cc: Alan Mackenzie, Dmitry Gutov, emacs-devel Sounds reasonable. But whatever the internal implementation is, shouldn't hard-widen-limits be there anyway? Why bother with call-with-hard-narrowing and not have all the logic in with-hard-narrowing directly? It looks to me that it's better to expose hard narrowing to elisp only. If possible it should be transparent to low level code. Vitalie >> On Mon, Mar 21 2016 08:39, Stefan Monnier wrote: >> + DEFVAR_LISP ("hard-widen-limits", Vhard_widen_limits, >> + doc: /* When non-nil `widen` will widen to these limits. >> +Must be a cons of the form (MIN . MAX) where MIN and MAX are integers or markers. */); >> + Vhard_widen_limits = Qnil; > Sorry to nitpick, but I'm not completely happy with this API. As an > implementation it might be OK, but I can imagine wanting to change the > implementation in the future but being stuck by the exposed internals. > So I suggest we instead expose only a new primitive > "call-with-hard-narrowing" which could look like: > (defun call-with-hard-narrowing (from to func) > (make-local-variable 'internal--hard-widen-limits) > (let ((internal--hard-widen-limits (cons from to))) > (funcall func))) > which could be supplemented with a corresponding macro > (defmacro with-hard-narrowing (from to &rest body) > `(call-with-hard-narrowing ,from ,to (lambda () ,@body))) > -- Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-21 12:54 ` Vitalie Spinu @ 2016-03-21 14:07 ` Stefan Monnier 2016-03-21 14:14 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Stefan Monnier @ 2016-03-21 14:07 UTC (permalink / raw) To: emacs-devel > Why bother with call-with-hard-narrowing and not have all the logic > in with-hard-narrowing directly? It looks to me that it's better to > expose hard narrowing to elisp only. If possible it should be > transparent to low level code. The macro-expanded code will be written into the .elc files which people will expect will work in the future without having to recompile. So any API used by the macro-expanded code ends up being sufficiently exposed that we can't easily get rid of it. Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-21 14:07 ` Stefan Monnier @ 2016-03-21 14:14 ` Vitalie Spinu 0 siblings, 0 replies; 155+ messages in thread From: Vitalie Spinu @ 2016-03-21 14:14 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel Aha. Clear. >> On Mon, Mar 21 2016 10:07, Stefan Monnier wrote: >> Why bother with call-with-hard-narrowing and not have all the logic >> in with-hard-narrowing directly? It looks to me that it's better to >> expose hard narrowing to elisp only. If possible it should be >> transparent to low level code. > The macro-expanded code will be written into the .elc files which people > will expect will work in the future without having to recompile. > So any API used by the macro-expanded code ends up being sufficiently > exposed that we can't easily get rid of it. > Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-21 12:39 ` Stefan Monnier 2016-03-21 12:54 ` Vitalie Spinu @ 2016-03-21 14:04 ` Stefan Monnier 2016-03-21 14:33 ` Vitalie Spinu 1 sibling, 1 reply; 155+ messages in thread From: Stefan Monnier @ 2016-03-21 14:04 UTC (permalink / raw) To: emacs-devel > (defun call-with-hard-narrowing (from to func) > (make-local-variable 'internal--hard-widen-limits) > (let ((internal--hard-widen-limits (cons from to))) > (funcall func))) Hmm... I now realize that this won't handle the case of info-mode buffers (and similarly rmail buffers) where the hard-narrowing is not scoped. Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-21 14:04 ` Stefan Monnier @ 2016-03-21 14:33 ` Vitalie Spinu 2016-03-21 14:54 ` Stefan Monnier 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-21 14:33 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel >> On Mon, Mar 21 2016 10:04, Stefan Monnier wrote: >> (defun call-with-hard-narrowing (from to func) >> (make-local-variable 'internal--hard-widen-limits) >> (let ((internal--hard-widen-limits (cons from to))) >> (funcall func))) > Hmm... I now realize that this won't handle the case of info-mode > buffers (and similarly rmail buffers) where the hard-narrowing is not > scoped. What does this mean in plain English? Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-21 14:33 ` Vitalie Spinu @ 2016-03-21 14:54 ` Stefan Monnier 2016-03-21 17:16 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Stefan Monnier @ 2016-03-21 14:54 UTC (permalink / raw) To: emacs-devel >>> (defun call-with-hard-narrowing (from to func) >>> (make-local-variable 'internal--hard-widen-limits) >>> (let ((internal--hard-widen-limits (cons from to))) >>> (funcall func))) >> Hmm... I now realize that this won't handle the case of info-mode >> buffers (and similarly rmail buffers) where the hard-narrowing is not >> scoped. > What does this mean in plain English? That it sets narrowing and leaves it there. So it can't use call-with-hard-narrowing for that. Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-21 14:54 ` Stefan Monnier @ 2016-03-21 17:16 ` Vitalie Spinu 2016-03-21 18:36 ` Stefan Monnier 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-21 17:16 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel >> On Mon, Mar 21 2016 10:54, Stefan Monnier wrote: > That it sets narrowing and leaves it there. So it can't use > call-with-hard-narrowing for that. What would it need it for? This is outside of use cases that I have in mind. with-hard-narrowing should be used in limited, transient, prog-only contexts; almost exclusively for advices in multi-mode engines. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-21 17:16 ` Vitalie Spinu @ 2016-03-21 18:36 ` Stefan Monnier 2016-03-21 19:18 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Stefan Monnier @ 2016-03-21 18:36 UTC (permalink / raw) To: Vitalie Spinu; +Cc: emacs-devel >> That it sets narrowing and leaves it there. So it can't use >> call-with-hard-narrowing for that. > What would it need it for? Because an info file is made up of various nodes, and Emacs only shows one node at a time by narrowing. This narrowing should be "hard" because font-lock and such should only operate on a node at a time. The same was true for Rmail which used to show the content of each email simply by narrowing the mailbox file to the specific email. Not sure if it still does that (clearly it can't be so simple with MIME's base64 and attachments). > This is outside of use cases that I have in mind. Indeed, it's a different case, but one where the narrowing should be hard as well. Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-21 18:36 ` Stefan Monnier @ 2016-03-21 19:18 ` Vitalie Spinu 2016-03-22 3:17 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-21 19:18 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel >> On Mon, Mar 21 2016 14:36, Stefan Monnier wrote: >> This is outside of use cases that I have in mind. > Indeed, it's a different case, but one where the narrowing should be > hard as well. Ok. This part is trickier but it might not be that hard. Will keep this in mind. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-21 19:18 ` Vitalie Spinu @ 2016-03-22 3:17 ` Vitalie Spinu 2016-03-22 9:57 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-22 3:17 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 618 bytes --]

>> On Mon, Mar 21 2016 20:18, Vitalie Spinu wrote:

>>> This is outside of use cases that I have in mind.

>> Indeed, it's a different case, but one where the narrowing should be
>> hard as well.

> Ok. This part is trickier but it might not be that hard. Will keep this in mind.

I have pushed the proposed change to `widen-limits` branch. The C level
consequences are fairly innocuous. There are only 3 instances of calls to Fwiden
in the whole emacs.

Given that there seem to be use cases of permanent limiting I kept it as buffer
local variable. I also switched to milder name buffer-widen-limits.

Vitalie

[-- Attachment #2: Type: text/x-diff, Size: 5854 bytes --]

3 files changed, 48 insertions(+), 3 deletions(-)
src/buffer.c  | 19 +++++++++++++++++--
src/buffer.h  | 17 +++++++++++++++++
src/editfns.c | 15 ++++++++++++++-

modified src/buffer.c
@@ -329,6 +329,11 @@ bset_scroll_up_aggressively (struct buffer *b, Lisp_Object val)
   b->scroll_up_aggressively_ = val;
 }
 static void
+bset_widen_limits (struct buffer *b, Lisp_Object val)
+{
+  b->widen_limits_ = val;
+}
+static void
 bset_selective_display (struct buffer *b, Lisp_Object val)
 {
   b->selective_display_ = val;
@@ -847,6 +852,7 @@ CLONE nil means the indirect buffer's state is reset to default values. */)
   bset_display_count (b, make_number (0));
   bset_backed_up (b, Qnil);
   bset_auto_save_file_name (b, Qnil);
+  bset_widen_limits (b, b->base_buffer->widen_limits_);
   set_buffer_internal_1 (b);
   Fset (intern ("buffer-save-without-query"), Qnil);
   Fset (intern ("buffer-file-number"), Qnil);
@@ -961,6 +967,7 @@ reset_buffer_local_variables (struct buffer *b, bool permanent_too)
      things that depend on the major mode.
      default-major-mode is handled at a higher level.
      We ignore it here. */
+  bset_widen_limits(b, Qnil);
   bset_major_mode (b, Qfundamental_mode);
   bset_keymap (b, Qnil);
   bset_mode_name (b, QSFundamental);
@@ -2167,7 +2174,7 @@ so the buffer is truly empty after this. */)
 {
   Fwiden ();
 
-  del_range (BEG, Z);
+  del_range (BEGWL, ZWL);
 
   current_buffer->last_window_start = 1;
   /* Prevent warnings, or suspension of auto saving, that would happen
@@ -5037,6 +5044,7 @@ init_buffer_once (void)
   bset_display_count (&buffer_local_flags, make_number (-1));
   bset_display_time (&buffer_local_flags, make_number (-1));
   bset_enable_multibyte_characters (&buffer_local_flags, make_number (-1));
+  bset_widen_limits (&buffer_local_flags, make_number (-1));
 
   /* These used to be stuck at 0 by default, but now that the all-zero value
      means Qnil, we have to initialize them explicitly. */
@@ -5160,6 +5168,7 @@ init_buffer_once (void)
   bset_cursor_type (&buffer_defaults, Qt);
   bset_extra_line_spacing (&buffer_defaults, Qnil);
   bset_cursor_in_non_selected_windows (&buffer_defaults, Qt);
+  bset_widen_limits (&buffer_defaults, Qnil);
 
   bset_enable_multibyte_characters (&buffer_defaults, Qt);
   bset_buffer_file_coding_system (&buffer_defaults, Qnil);
@@ -5367,7 +5376,6 @@ defvar_per_buffer (struct Lisp_Buffer_Objfwd *bo_fwd, const char *namestring,
     emacs_abort ();
 }
 
-
 /* Initialize the buffer routines. */
 void
 syms_of_buffer (void)
@@ -5796,6 +5804,13 @@ If you set this to -2, that means don't turn off auto-saving in this buffer
 if its text size shrinks. If you use `buffer-swap-text' on a buffer,
 you probably should set this to -2 in that buffer. */);
 
+  DEFVAR_PER_BUFFER ("buffer-widen-limits", &BVAR (current_buffer, widen_limits),
+                     Qnil,
+                     doc: /* When non-nil `widen` will widen to these limits.
+Must be a cons of the form (MIN . MAX) where MIN and MAX are integers
+of hard widen limits in this buffer. This is an experimental variable
+intended primarily for multi-mode engines. */);
+
   DEFVAR_PER_BUFFER ("selective-display", &BVAR (current_buffer, selective_display),
                      Qnil,
                      doc: /* Non-nil enables selective display.

modified src/buffer.h
@@ -59,6 +59,10 @@ INLINE_HEADER_BEGIN
 #define Z (current_buffer->text->z)
 #define Z_BYTE (current_buffer->text->z_byte)
 
+/* Positions that take into account widen limits. */
+#define BEGWL (BUF_BEGWL (current_buffer))
+#define ZWL (BUF_ZWL(current_buffer))
+
 /* Macros for the addresses of places in the buffer. */
 
 /* Address of beginning of buffer. */
@@ -128,6 +132,15 @@ INLINE_HEADER_BEGIN
    : NILP (BVAR (buf, begv_marker)) ? buf->begv_byte \
    : marker_byte_position (BVAR (buf, begv_marker)))
 
+/* Hard positions in buffer. */
+#define BUF_BEGWL(buf) \
+  ((NILP (BVAR (buf, widen_limits))) ? BUF_BEG (buf) \
+   : XINT( XCAR (BVAR (buf, widen_limits))))
+
+#define BUF_ZWL(buf) \
+  ((NILP (BVAR (buf, widen_limits))) ? BUF_Z (buf) \
+   : XINT( XCDR (BVAR (buf, widen_limits))))
+
 /* Position of point in buffer. */
 #define BUF_PT(buf) \
   (buf == current_buffer ? PT \
@@ -150,6 +163,7 @@ INLINE_HEADER_BEGIN
    : NILP (BVAR (buf, zv_marker)) ? buf->zv_byte \
    : marker_byte_position (BVAR (buf, zv_marker)))
 
+
 /* Position of gap in buffer. */
 #define BUF_GPT(buf) ((buf)->text->gpt)
 #define BUF_GPT_BYTE(buf) ((buf)->text->gpt_byte)
@@ -748,6 +762,9 @@ struct buffer
      See `cursor-type' for other values. */
   Lisp_Object cursor_in_non_selected_windows_;
 
+  /* Cons of hard widen limits */
+  Lisp_Object widen_limits_;
+
   /* No more Lisp_Object beyond this point. Except undo_list,
      which is handled specially in Fgarbage_collect. */
 

modified src/editfns.c
@@ -3480,12 +3480,25 @@ DEFUN ("delete-and-extract-region", Fdelete_and_extract_region,
     return empty_unibyte_string;
   return del_range_1 (XINT (start), XINT (end), 1, 1);
 }
+
 \f
 DEFUN ("widen", Fwiden, Swiden, 0, 0, "",
        doc: /* Remove restrictions (narrowing) from current buffer.
-This allows the buffer's full text to be seen and edited. */)
+This allows the buffer's full text to be seen and edited.
+If `buffer-widen-limits` is non-nil, widen only to those limits. */)
   (void)
 {
+
+  if (!NILP (BVAR(current_buffer, widen_limits)))
+    {
+      Lisp_Object hl = BVAR(current_buffer, widen_limits);
+      CHECK_CONS(hl);
+      CHECK_NUMBER(XCAR(hl));
+      CHECK_NUMBER(XCDR(hl));
+      Fnarrow_to_region(XCAR(hl), XCDR(hl));
+      return Qnil;
+    }
+
   if (BEG != BEGV || Z != ZV)
     current_buffer->clip_changed = 1;
   BEGV = BEG;

^ permalink raw reply [flat|nested] 155+ messages in thread
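An illustrative aside, not part of the patch: the behaviour the patch gives `widen` can be modelled in a few lines of standalone C. The struct and the `toy_*` names below are hypothetical stand-ins for Emacs's buffer internals (BEG, Z, BEGV, ZV and the new widen_limits slot), and the sketch assumes plain integer limits, as in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct buffer; not Emacs's real layout.  */
struct toy_buffer
{
  long beg, z;               /* whole text, like BEG and Z */
  long begv, zv;             /* accessible portion, like BEGV and ZV */
  bool has_limits;           /* models a non-nil buffer-widen-limits */
  long limit_min, limit_max; /* models the (MIN . MAX) cons */
};

/* Force X into the closed range [LO, HI].  */
long
toy_clamp (long x, long lo, long hi)
{
  return x < lo ? lo : x > hi ? hi : x;
}

/* Restrict the accessible portion to [START, END], clipped to the text.  */
void
toy_narrow (struct toy_buffer *b, long start, long end)
{
  assert (start <= end);
  b->begv = toy_clamp (start, b->beg, b->z);
  b->zv = toy_clamp (end, b->beg, b->z);
}

/* Like the patched Fwiden: when limits are set, "widening" is really a
   narrowing to those limits; otherwise the whole text becomes accessible.  */
void
toy_widen (struct toy_buffer *b)
{
  if (b->has_limits)
    {
      toy_narrow (b, b->limit_min, b->limit_max);
      return;
    }
  b->begv = b->beg;
  b->zv = b->z;
}
```

In this model, with limits (200 . 600) set, calling toy_widen after narrowing to 400..500 leaves 200..600 accessible rather than the whole text, which is the effect the new `widen` docstring describes.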
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-22 3:17 ` Vitalie Spinu @ 2016-03-22 9:57 ` Vitalie Spinu 2016-03-22 10:05 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-22 9:57 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel I think having a local variable for this is not the best idea. In proposed implementation you first set buffer-widen-limits, but the effect of it will show only on the next invocation of widen. So the consumer will need to set this variable and always follow it by widen. If widen is not called, one can access outside regions. Particularly `narrow` doesn't know about hard limits, so you can narrow to outside those limits. I think a better implementation would be to have `set-widen-limits` function which would set the limits and narrow the region taking into account current narrowing. The supporting `with-widen-limits` macro will call `set-widen-limits` in unwind-protect. I guess this comes very close to the implementation that you suggested earlier for hiding the internals. Vitalie >> On Tue, Mar 22 2016 04:17, Vitalie Spinu wrote: >>> On Mon, Mar 21 2016 20:18, Vitalie Spinu wrote: >>>> This is outside of use cases that I have in mind. >>> Indeed, it's a different case, but one where the narrowing should be >>> hard as well. >> Ok. This part is trickier but it might not be that hard. Will keep this in mind. > I have pushed the proposed change to `widen-limits` branch. The C level > consequences are fairly innocuous. There are only 3 instances of calls to Fwiden > in the whole emacs. > Given that there seem to be use cases of permanent limiting I kept it as buffer > local variable. I also switched to milder name buffer-widen-limits. 
> Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
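An illustrative aside on the `set-widen-limits` idea from the message above: setting the limits and re-narrowing in one step removes the window in which text outside the limits is still reachable. The sketch below is a standalone C model with hypothetical `toy_*` names. How to treat an existing restriction that lies wholly outside the new limits is not settled in the thread; collapsing it onto the nearest limit is an assumption made here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the accessible region plus hard limits.  */
struct toy_buffer
{
  long begv, zv;             /* accessible portion, like BEGV and ZV */
  bool has_limits;
  long limit_min, limit_max;
};

/* Force X into the closed range [LO, HI].  */
long
toy_clamp (long x, long lo, long hi)
{
  return x < lo ? lo : x > hi ? hi : x;
}

/* Hypothetical set-widen-limits: record the limits and immediately pull
   the current restriction inside them.  A restriction already inside
   the limits is left untouched.  */
void
toy_set_widen_limits (struct toy_buffer *b, long min, long max)
{
  assert (min <= max);
  b->has_limits = true;
  b->limit_min = min;
  b->limit_max = max;
  b->begv = toy_clamp (b->begv, min, max);
  b->zv = toy_clamp (b->zv, min, max);
}

/* A narrowing that knows about the limits, so callers cannot narrow
   to text outside them.  */
void
toy_narrow (struct toy_buffer *b, long start, long end)
{
  assert (start <= end);
  if (b->has_limits)
    {
      start = toy_clamp (start, b->limit_min, b->limit_max);
      end = toy_clamp (end, b->limit_min, b->limit_max);
    }
  b->begv = start;
  b->zv = end;
}
```

A `with-widen-limits` style wrapper would then save the old limits, call the setter, run the body, and restore the old limits on exit, which is the unwind-protect shape the message describes.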
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-22 9:57 ` Vitalie Spinu @ 2016-03-22 10:05 ` Vitalie Spinu 2016-03-22 11:57 ` Stefan Monnier 2016-03-22 20:08 ` Richard Stallman 0 siblings, 2 replies; 155+ messages in thread From: Vitalie Spinu @ 2016-03-22 10:05 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel >> On Tue, Mar 22 2016 10:57, Vitalie Spinu wrote: > So the consumer will need to set this variable and always follow it by widen. Hm. This also implies that each consumer will need to take care of current narrowing and re-narrow to new limits. This doesn't sound right. I am also not sure what the behavior of save-restriction should be. Should save-restriction unwind hard limits as well? Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-22 10:05 ` Vitalie Spinu @ 2016-03-22 11:57 ` Stefan Monnier 2016-03-22 16:28 ` Vitalie Spinu 2016-04-28 13:29 ` Vitalie Spinu 2016-03-22 20:08 ` Richard Stallman 1 sibling, 2 replies; 155+ messages in thread From: Stefan Monnier @ 2016-03-22 11:57 UTC (permalink / raw) To: Vitalie Spinu; +Cc: emacs-devel >> So the consumer will need to set this variable and always follow it by widen. > Hm. This also implies that each consumer will need to take care of current > narrowing and re-narrow to new limits. This doesn't sound right. > I am also not sure what the behavior of save-restriction should be. Should > save-restriction unwind hard limits as well? IIRC past discussions on this issue, one option was to merge your set-widen-limits into narrow-to-region by adding an optional argument `hard'. And yes, I think save-restriction should unwind hard limits. Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-22 11:57 ` Stefan Monnier @ 2016-03-22 16:28 ` Vitalie Spinu 2016-03-22 16:44 ` Stefan Monnier 2016-04-28 13:29 ` Vitalie Spinu 1 sibling, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-22 16:28 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel >> On Tue, Mar 22 2016 07:57, Stefan Monnier wrote: > IIRC past discussions on this issue, one option was to merge your > set-widen-limits into narrow-to-region by adding an optional argument `hard'. If narrowing is already in place, set-widen-limits will not touch it unless the visible region expands beyond the hard limits. I think widen limits is fundamentally about widening and only indirectly about narrowing. Mixing hard limit into a common user level function is a bad marketing strategy. We don't want to encourage major modes to use it in funny ways. If users and major modes decide to use hard limits we might end up in the same situation as now when narrow/widen is not perceived as a good tool for multi-modes. Static usage of hard widening, like in Info example, is not really a problem. Multi modes need to impose hard limits transiently, in specific contexts like indentation, syntax parsing or font lock, and will restore the limits at the end. Problems will occur if major modes start using hard limits in such contexts directly. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-22 16:28 ` Vitalie Spinu @ 2016-03-22 16:44 ` Stefan Monnier 2016-03-22 19:36 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Stefan Monnier @ 2016-03-22 16:44 UTC (permalink / raw) To: emacs-devel > Mixing hard limit into a common user level function is a bad marketing > strategy. `narrow-to-region' is not only a user-level command. It's also a low-level primitive. The narrow-to-region command can't set the optional argument unless we take extra steps to let it, so the "hard narrowing" would only be available from Elisp, not interactively. > If users and major modes decide to use hard limits we might end up in > the same situation as now when narrow/widen is not perceived as a good > tool for multi-modes. Could be. Maybe there are more "kinds of narrowing" than just 2, indeed. But for me, the main consideration is whether the text before/after point-min can be taken into account as a kind of context, or whether the text between point-min/max should be treated (even if temporarily) as being the whole&sole truth. That's what "hard narrowing" means to me. I don't think I'd be able to design something that can take into account finer distinctions of narrowings right now, for lack of understanding about what those finer distinctions could be and what kind of problems they lead to. > limits at the end. Problems will occur if major modes start using hard > limits in such contexts directly. I don't see any reason why problems *will* occur in that case (tho, of course, Murphy could be that reason). So until such problems do show up, I wouldn't worry. Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-22 16:44 ` Stefan Monnier @ 2016-03-22 19:36 ` Vitalie Spinu 2016-03-23 2:22 ` Stefan Monnier 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-22 19:36 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel >> On Tue, Mar 22 2016 12:44, Stefan Monnier wrote: > "hard narrowing" would only be available from Elisp, not interactively. Interactive or not, doesn't matter. The danger is whatever elisp is used within hard-narrowed regions. >> If users and major modes decide to use hard limits we might end up in >> the same situation as now when narrow/widen is not perceived as a good >> tool for multi-modes. > Could be. Maybe there are more "kinds of narrowing" than just 2, indeed. > But for me, the main consideration is whether the text before/after > point-min can be taken into account as a kind of context, or whether the > text between point-min/max should be treated (even if temporarily) as > being the whole&sole truth. I agree completely. But I think defining a "whole&sole" universe need not involve the current implementation of narrowing. It's about inability to widen not ability to narrow. My patch didn't even touch `narrow` because that's not needed. There is no real need to invent extra type of narrowing. It's a lot of extra work with no additional benefit. It's simply enough to define hard limits that none of the standard functions can lift. In order to define a different type of narrowing you would need to introduce alternatives to BEGV, BEGV_BYTE, ZV, ZV_BYTE and then hunt them down everywhere where BEGV or BEGV_BYTE are used right now. What concrete semantics do you have in mind? If a user or elisp already narrowed the buffer, will hard narrowing re-narrow it? If the user types within the hard narrowed region, will the upper hard limit expand just as ZV does? 
My approach is simpler and leaves current narrowing functionality alone. You set the limits and allow narrowing happening inside those limits normally. Even widen cannot lift those limits. You create a small universe within the buffer with only one exit (set-widen-limits nil nil). You might end up losing text outside of the bounds if you modify the buffer and then call widen, but that's by design and this is how it's different from visual narrowing. Hard limits stay the same irrespective of what happens to the buffer. >> limits at the end. Problems will occur if major modes start using hard >> limits in such contexts directly. > I don't see any reason why problems *will* occur in that case (tho, of > course, Murphy could be that reason). So until such problems do show up, > I wouldn't worry. The problem is not hypothetical. It's occurring right now. If you impose limits in order to do font-lock and font-lock-fontify-region-function changes those limits, that screws up your multi-mode. That's what is happening with the current narrowing/widening mechanism and that's precisely the reason for extra widen limits in the first place. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-22 19:36 ` Vitalie Spinu @ 2016-03-23 2:22 ` Stefan Monnier 2016-03-23 11:41 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Stefan Monnier @ 2016-03-23 2:22 UTC (permalink / raw) To: Vitalie Spinu; +Cc: emacs-devel > There is no real need to invent extra type of narrowing. It's a lot of extra > work with no additional benefit. I don't see any extra work. (narrow-to-region BEG END 'hard) would just be the API used to set your hard limits, and that's all there is to it. > If the user types within the hard narrowed region, will the > upper hard limit expand just as ZV does? This is indispensable, yes. No matter whether the hard limits are folded into narrow-to-region or any other way: the upper limit has to be a marker, and unless we strictly enforce that the hard limits can't be circumvented at all, the lower limit would probably have to be a marker as well. > My approach is simpler and leaves current narrowing functionality > alone. You set the limits and allow narrowing happening inside those > limits normally. That's also how I imagine (narrow-to-region BEG END 'hard) working. It just won't allow widening outside of those hard limits. > You might end up losing text outside of the bounds if you modify the > buffer and then call widen, but that's by design and this is how it's > different from visual narrowing. Hard limits stay the same > irrespective of what happens to the buffer. Sounds like a wart. What's the benefit? >>> limits at the end. Problems will occur if major modes start using hard >>> limits in such contexts directly. >> I don't see any reason why problems *will* occur in that case (tho, of >> course, Murphy could be that reason). So until such problems do show up, >> I wouldn't worry. > The problem is not hypothetical. It's occurring right now. 
It can't because we don't have hard limits right now. Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-23 2:22 ` Stefan Monnier @ 2016-03-23 11:41 ` Vitalie Spinu 2016-03-23 12:34 ` Stefan Monnier 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-23 11:41 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel >> On Tue, Mar 22 2016 22:22, Stefan Monnier wrote: >> If user typed within a hard region the hard narrowed region, will the >> upper hard limit expand just as ZV does? > This is indispensable, yes. No matter whether the hard limits are > folded int narrow-to-region or any other way: the upper limit has to be > a marker, and unless we strictly enforce that the hard limits can't be > circumvented at all, the lower limit would probably have to be a marker > as well. Ok. So we agree that there is work involved of tracking an extra marker. Whenever buffer is modified by low level code, it must track new ZH marker and respect the relationship between ZH and ZV. There are 544 occurrences of ZV in emacs source. In order to add this extra marker one would need to go through all of those cases and enforce the semantics of ZH. It might be that adjusting ZV macros might do the job, but I cannot judge because I am not yet familiar with buffer modification code. >> You might end up loosing text outside of the bounds if you modify the >> buffer and then call widen, but that's by design and this is how it's >> different from visual narrowing. Hard limits stay the same >> irrespective of what happens to the buffer. > Sounds like a wart. What's the benefit? True, but it's almost a direct implementation of the restriction in prog-widen. It has same limitations and multi-modes are completely fine with those. A with-widen-limits macro will suffice for multi-mode use case. But you proposed to extend it to permanent set-widen-limits or (narrow-to-region .. 'hard). 
I see the benefit of it in info mode but I think it's pretty marginal. The proposed non-marker implementation will deter usage of widen-limits in contexts that involve buffer modification. But it will work just fine with multi-modes and with read-only info use cases. It also works fine with editing as long as it's not followed by widen. If widen is used the buffer will be re-narrowed to the old limits. I will look into the ZH marker this weekend. Maybe it's not as hard as I imagine. >>>> limits at the end. Problems will occur if major modes start using hard >>>> limits in such contexts directly. >>> I don't see any reason why problems *will* occur in that case (tho, of >>> course, Murphy could be that reason). So until such problems do show up, >>> I wouldn't worry. >> The problem is not hypothetical. It's occurring right now. > It can't because we don't have hard limits right now. Oh come on. You know I was referring to the current widen/narrow mechanism. It's one step to extrapolate to hard narrowing from there. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-23 11:41 ` Vitalie Spinu @ 2016-03-23 12:34 ` Stefan Monnier 2016-03-23 12:41 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Stefan Monnier @ 2016-03-23 12:34 UTC (permalink / raw) To: Vitalie Spinu; +Cc: emacs-devel > Ok. So we agree that there is work involved in tracking an extra > marker. Whenever the buffer is modified by low-level code, it must track the new ZH > marker and respect the relationship between ZH and ZV. There are 544 occurrences > of ZV in the Emacs source. In order to add this extra marker one would need to go > through all of those cases and enforce the semantics of ZH. I wouldn't want to touch Z* and BEG*, indeed. I'm just suggesting to keep the limits as markers rather than as integers. It's a trivial change. Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-23 12:34 ` Stefan Monnier @ 2016-03-23 12:41 ` Vitalie Spinu 2016-03-29 21:43 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-23 12:41 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel >> On Wed, Mar 23 2016 08:34, Stefan Monnier wrote: > I wouldn't want to touch Z* and BEG*, indeed. I'm just suggesting to keep the > limits as markers rather than as integers. It's a trivial change. Hm. That might work quite well actually. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
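The difference between keeping the limits as integers and keeping them as markers can be illustrated with a toy model. This is only a sketch and not Emacs code: `toy_marker`, `toy_marker_adjust` and `toy_upper_limit_after_insert` are hypothetical names, and the relocation rule is a simplified assumption about how a marker with a non-zero insertion type follows insertions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a marker (hypothetical, not the Emacs struct):
   a character position plus an insertion-type flag.  */
struct toy_marker
{
  ptrdiff_t charpos;
  bool insertion_type;  /* true: advance when text is inserted at charpos */
};

/* Relocate M after NCHARS characters have been inserted at position AT.  */
void
toy_marker_adjust (struct toy_marker *m, ptrdiff_t at, ptrdiff_t nchars)
{
  if (m->charpos > at || (m->charpos == at && m->insertion_type))
    m->charpos += nchars;
}

/* Return the upper limit as seen after inserting NCHARS characters at AT.
   With AS_MARKER false the limit is a plain integer and never moves.  */
ptrdiff_t
toy_upper_limit_after_insert (ptrdiff_t limit, ptrdiff_t at,
                              ptrdiff_t nchars, bool as_marker)
{
  if (!as_marker)
    return limit;
  struct toy_marker m = { limit, true };
  toy_marker_adjust (&m, at, nchars);
  return m.charpos;
}
```

With an upper limit of 20 and five characters inserted at position 15, the integer limit stays at 20 and so cuts off the tail of the region, while the marker limit moves to 25 and keeps covering the same text.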
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-23 12:41 ` Vitalie Spinu @ 2016-03-29 21:43 ` Vitalie Spinu 2016-04-22 14:34 ` Dmitry Gutov 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-29 21:43 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 2046 bytes --] >> On Wed, Mar 23 2016 12:41, Vitalie Spinu wrote: >> I'm just suggesting to keep the limits as markers rather than as integers. Attaching a patch for this. AFAIC it is shaping up pretty nicely. There are two types of narrowing, visual and hard. The imposition or lifting of these is done through narrow-to-region and widen depending on the value of the optional HARD argument. I haven't tested it yet because of the following build problem with loaddefs: ... Finding pointers to doc strings... Finding pointers to doc strings...done Dumping under the name emacs 91970 pure bytes used : paxctl -zex emacs mv -f emacs bootstrap-emacs make -C ../lisp compile-first EMACS="../src/bootstrap-emacs" make[3]: Entering directory '/home/vspinu/bin/emacs-test/lisp' ELC emacs-lisp/macroexp.elc ELC emacs-lisp/cconv.elc ELC emacs-lisp/byte-opt.elc ELC emacs-lisp/bytecomp.elc ELC emacs-lisp/autoload.elc make[3]: Leaving directory '/home/vspinu/bin/emacs-test/lisp' make -C ../lisp autoloads EMACS="../src/bootstrap-emacs" make[3]: Entering directory '/home/vspinu/bin/emacs-test/lisp' GEN calendar/cal-loaddefs.el Loading macroexp.elc... 
appt.el:0:0: error: wrong-type-argument: (markerp /home/vspinu/bin/emacs-test/lisp/calendar/appt.el) Makefile:402: recipe for target 'calendar/cal-loaddefs.el' failed make[3]: *** [calendar/cal-loaddefs.el] Error 255 make[3]: Leaving directory '/home/vspinu/bin/emacs-test/lisp' Makefile:727: recipe for target '../lisp/loaddefs.el' failed make[2]: *** [../lisp/loaddefs.el] Error 2 make[2]: Leaving directory '/home/vspinu/bin/emacs-test/src' Makefile:398: recipe for target 'src' failed make[1]: *** [src] Error 2 make[1]: Leaving directory '/home/vspinu/bin/emacs-test' Makefile:1091: recipe for target 'bootstrap' failed make: *** [bootstrap] Error 2 Any ideas of why this is happening? The relevant branch is scratch/hard-narrow. Vitalie [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: Type: text/x-diff, Size: 10728 bytes --] scratch/hard-narrow origin/scratch/hard-narrow 7068e4c811f7530e14d2684fea68499418642b33 Author: Vitalie Spinu <spinuvit@gmail.com> AuthorDate: Mon Mar 21 05:41:55 2016 +0100 Commit: Vitalie Spinu <spinuvit@gmail.com> CommitDate: Tue Mar 29 23:29:54 2016 +0200 Parent: f99b512 In M-%, avoid making buffer-local binding of text-property-default-nonsticky Merged: emacs-24 master scratch/hard-narrow Containing: scratch/hard-narrow Follows: emacs-25.0.92 (138) Hard narrowing Idem modified src/buffer.c @@ -571,6 +571,9 @@ even if it is dead. The return value is never nil. */) bset_begv_marker (b, Qnil); bset_zv_marker (b, Qnil); + bset_begh_marker (b, Qnil); + bset_zh_marker (b, Qnil); + name = Fcopy_sequence (buffer_or_name); set_string_intervals (name, NULL); bset_name (b, name); @@ -835,6 +838,7 @@ CLONE nil means the indirect buffer's state is reset to default values. 
*/) bset_pt_marker (b, build_marker (b, b->pt, b->pt_byte)); bset_begv_marker (b, build_marker (b, b->begv, b->begv_byte)); bset_zv_marker (b, build_marker (b, b->zv, b->zv_byte)); + XMARKER (BVAR (b, zv_marker))->insertion_type = 1; } else @@ -2165,9 +2169,9 @@ Any narrowing restriction in effect (see `narrow-to-region') is removed, so the buffer is truly empty after this. */) (void) { - Fwiden (); + Fwiden (Qnil); - del_range (BEG, Z); + del_range (BEGV, ZV); current_buffer->last_window_start = 1; /* Prevent warnings, or suspension of auto saving, that would happen @@ -2310,6 +2314,8 @@ DEFUN ("buffer-swap-text", Fbuffer_swap_text, Sbuffer_swap_text, swapfield_ (pt_marker, Lisp_Object); swapfield_ (begv_marker, Lisp_Object); swapfield_ (zv_marker, Lisp_Object); + swapfield_ (begh_marker, Lisp_Object); + swapfield_ (zh_marker, Lisp_Object); bset_point_before_scroll (current_buffer, Qnil); bset_point_before_scroll (other_buffer, Qnil); @@ -2490,7 +2496,7 @@ current buffer is cleared. */) } } if (narrowed) - Fnarrow_to_region (make_number (begv), make_number (zv)); + Fnarrow_to_region (make_number (begv), make_number (zv), Qnil); } else { @@ -2571,7 +2577,7 @@ current buffer is cleared. */) TEMP_SET_PT (pt); if (narrowed) - Fnarrow_to_region (make_number (begv), make_number (zv)); + Fnarrow_to_region (make_number (begv), make_number (zv), Qnil); /* Do this first, so that chars_in_text asks the right question. set_intervals_multibyte needs it too. 
*/ @@ -5053,6 +5059,8 @@ init_buffer_once (void) bset_pt_marker (&buffer_local_flags, make_number (0)); bset_begv_marker (&buffer_local_flags, make_number (0)); bset_zv_marker (&buffer_local_flags, make_number (0)); + bset_begh_marker (&buffer_local_flags, make_number (0)); + bset_zh_marker (&buffer_local_flags, make_number (0)); bset_last_selected_window (&buffer_local_flags, make_number (0)); idx = 1; modified src/buffer.h @@ -416,6 +416,26 @@ extern void enlarge_buffer_text (struct buffer *, ptrdiff_t); #define BUF_FETCH_BYTE(buf, n) \ *(BUF_BYTE_ADDRESS ((buf), (n))) + +\f +/* Macros for setting and accessing hard-narrow markers */ + +/* Position of beginning of hard-narrowed range of buffer. */ +#define BEGH (BUF_BEGH (current_buffer)) +#define BUF_BEGH(buf) \ + ((NILP (BVAR (buf, begh_marker))) ? BUF_BEG (buf) \ + : marker_position (BVAR (buf, begh_marker))) +#define SET_BUF_BEGH(buf, charpos) \ + (bset_begh_marker (buf, build_marker(buf, charpos, buf_charpos_to_bytepos(buf, charpos)))) + +/* Position of end of hard-narrowed range of buffer. */ +#define ZH (BUF_ZH(current_buffer)) +#define BUF_ZH(buf) \ + ((NILP (BVAR (buf, zh_marker))) ? BUF_Z (buf) \ + : marker_position (BVAR (buf, zh_marker))) +#define SET_BUF_ZH(buf, charpos) \ + (bset_zh_marker (buf, build_marker(buf, charpos, buf_charpos_to_bytepos(buf, charpos)))) + \f /* Define the actual buffer data structures. */ @@ -666,6 +686,12 @@ struct buffer ZV for this buffer when the buffer is not current. */ Lisp_Object zv_marker_; + /* Lower hard limit of the buffer.*/ + Lisp_Object begh_marker_; + + /* Upper hard limit of the buffer.*/ + Lisp_Object zh_marker_; + /* This holds the point value before the last scroll operation. Explicitly setting point sets this to nil. 
*/ Lisp_Object point_before_scroll_; @@ -984,6 +1010,16 @@ bset_width_table (struct buffer *b, Lisp_Object val) { b->width_table_ = val; } +INLINE void +bset_begh_marker (struct buffer *b, Lisp_Object val) +{ + b->begh_marker_ = val; +} +INLINE void +bset_zh_marker (struct buffer *b, Lisp_Object val) +{ + b->zh_marker_ = val; +} /* Number of Lisp_Objects at the beginning of struct buffer. If you add, remove, or reorder Lisp_Objects within buffer modified src/bytecode.c @@ -1682,17 +1682,18 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, CASE (Bnarrow_to_region): { - Lisp_Object v1; + Lisp_Object v1, v2; BEFORE_POTENTIAL_GC (); v1 = POP; - TOP = Fnarrow_to_region (TOP, v1); + v2 = POP; + TOP = Fnarrow_to_region (TOP, v2, v1); AFTER_POTENTIAL_GC (); NEXT; } CASE (Bwiden): BEFORE_POTENTIAL_GC (); - PUSH (Fwiden ()); + TOP = Fwiden (TOP); AFTER_POTENTIAL_GC (); NEXT; modified src/editfns.c @@ -3480,33 +3480,54 @@ DEFUN ("delete-and-extract-region", Fdelete_and_extract_region, return empty_unibyte_string; return del_range_1 (XINT (start), XINT (end), 1, 1); } + \f -DEFUN ("widen", Fwiden, Swiden, 0, 0, "", +DEFUN ("widen", Fwiden, Swiden, 0, 1, "", doc: /* Remove restrictions (narrowing) from current buffer. -This allows the buffer's full text to be seen and edited. */) - (void) +If HARD is non-nil, remove the hard restriction imposed by a previous +call to \\[narrow-to-region]. If HARD is nil, remove visual +restriction up to the previously imposed hard limit (if any). */) + (Lisp_Object hard) { - if (BEG != BEGV || Z != ZV) - current_buffer->clip_changed = 1; - BEGV = BEG; - BEGV_BYTE = BEG_BYTE; - SET_BUF_ZV_BOTH (current_buffer, Z, Z_BYTE); - /* Changing the buffer bounds invalidates any recorded current column. 
*/ - invalidate_current_column (); + + if(!NILP (hard)) + { + bset_begh_marker(current_buffer, Qnil); + bset_zh_marker(current_buffer, Qnil); + } + else + { + if (BEG != BEGV || Z != ZV) + current_buffer->clip_changed = 1; + BEGV = BEG; + BEGV_BYTE = BEG_BYTE; + SET_BUF_ZV_BOTH (current_buffer, Z, Z_BYTE); + /* Changing the buffer bounds invalidates any recorded current column. */ + invalidate_current_column (); + } + return Qnil; } -DEFUN ("narrow-to-region", Fnarrow_to_region, Snarrow_to_region, 2, 2, "r", - doc: /* Restrict editing in this buffer to the current region. -The rest of the text becomes temporarily invisible and untouchable -but is not deleted; if you save the buffer in a file, the invisible -text is included in the file. \\[widen] makes all visible again. -See also `save-restriction'. +DEFUN ("narrow-to-region", Fnarrow_to_region, Snarrow_to_region, 2, 3, "r", + doc: /* Restrict editing in this buffer to the current +region. START and END are positions (integers or markers) bounding the +text that should restricted. There can be two types of restrictions, +visual and hard. If HARD is nil, impose visual restriction, otherwise +a hard one. -When calling from a program, pass two arguments; positions (integers -or markers) bounding the text that should remain visible. */) - (register Lisp_Object start, Lisp_Object end) +When visual restriction is in place, the rest of the text is invisible +and untouchable but is not deleted; if you save the buffer in a file, +the invisible text is included in the file. \\[widen] with nil +optional argument makes it all visible again. + +When hard restriction is in place, invocations of (visual) \\[widen] +with nil argument removes visual narrowing up to the hard +restriction. In order to lift hard restriction, call \\[widen] with +non-nil HARD argument. 
*/) + (register Lisp_Object start, Lisp_Object end, Lisp_Object hard) { + CHECK_NUMBER_COERCE_MARKER (start); CHECK_NUMBER_COERCE_MARKER (end); @@ -3519,6 +3540,15 @@ or markers) bounding the text that should remain visible. */) if (!(BEG <= XINT (start) && XINT (start) <= XINT (end) && XINT (end) <= Z)) args_out_of_range (start, end); + if (!NILP (hard)) + { + SET_BUF_BEGH (current_buffer, XFASTINT (start)); + SET_BUF_ZH (current_buffer, XFASTINT (end)); + if (BEGV >= XFASTINT (start) && ZV <= XFASTINT (end)) + /* Visual narrowing within hard limits. */ + return Qnil; + } + if (BEGV != XFASTINT (start) || ZV != XFASTINT (end)) current_buffer->clip_changed = 1; @@ -3533,6 +3563,7 @@ or markers) bounding the text that should remain visible. */) return Qnil; } + Lisp_Object save_restriction_save (void) { modified src/fileio.c @@ -4764,7 +4764,7 @@ write_region (Lisp_Object start, Lisp_Object end, Lisp_Object filename, This is useful in tar-mode. --Stef XSETFASTINT (start, BEG); XSETFASTINT (end, Z); */ - Fwiden (); + Fwiden (Qnil); } record_unwind_protect (build_annotations_unwind, modified src/lread.c @@ -1850,7 +1850,7 @@ readevalloop (Lisp_Object readcharfun, /* Set point and ZV around stuff to be read. */ Fgoto_char (start); if (!NILP (end)) - Fnarrow_to_region (make_number (BEGV), end); + Fnarrow_to_region (make_number (BEGV), end, Qnil); /* Just for cleanliness, convert END to a marker if it is an integer. */ modified src/process.c @@ -5514,7 +5514,7 @@ Otherwise it discards the output. */) /* If the output marker is outside of the visible region, save the restriction and widen. */ if (! (BEGV <= PT && PT <= ZV)) - Fwiden (); + Fwiden (Qnil); /* Adjust the multibyteness of TEXT to that of the buffer. */ if (NILP (BVAR (current_buffer, enable_multibyte_characters)) @@ -5558,7 +5558,7 @@ Otherwise it discards the output. */) /* If the restriction isn't what it should be, set it. 
*/ if (old_begv != BEGV || old_zv != ZV) - Fnarrow_to_region (make_number (old_begv), make_number (old_zv)); + Fnarrow_to_region (make_number (old_begv), make_number (old_zv), Qnil); bset_read_only (current_buffer, old_read_only); SET_PT_BOTH (opoint, opoint_byte); [back] ^ permalink raw reply [flat|nested] 155+ messages in thread
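The two-level narrowing described in the doc strings of the patch can be modelled in a few lines. This is a sketch under stated assumptions, not the patch itself: `toy_buffer`, `toy_narrow` and `toy_widen` are hypothetical names, positions are plain integers rather than markers, and a soft widen stops at the hard limits as the doc string says (as far as can be read from the posted diff, which its author says is untested, that branch still resets to BEG and Z).

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a buffer with a visual and a hard restriction.  */
struct toy_buffer
{
  ptrdiff_t beg, z;     /* whole buffer, like BEG and Z */
  ptrdiff_t begv, zv;   /* visual restriction, like BEGV and ZV */
  bool hard;            /* is a hard restriction in place? */
  ptrdiff_t begh, zh;   /* hard limits, meaningful only if HARD */
};

/* Like the BEGH and ZH macros: fall back to the whole buffer when no
   hard restriction is set.  */
ptrdiff_t
toy_begh (const struct toy_buffer *b)
{
  return b->hard ? b->begh : b->beg;
}

ptrdiff_t
toy_zh (const struct toy_buffer *b)
{
  return b->hard ? b->zh : b->z;
}

/* Narrow to START..END.  Return false if the range is invalid.  */
bool
toy_narrow (struct toy_buffer *b, ptrdiff_t start, ptrdiff_t end, bool hard)
{
  if (!(b->beg <= start && start <= end && end <= b->z))
    return false;
  if (hard)
    {
      b->hard = true;
      b->begh = start;
      b->zh = end;
      if (b->begv >= start && b->zv <= end)
        return true;    /* visual narrowing already within hard limits */
    }
  b->begv = start;
  b->zv = end;
  return true;
}

/* Soft widen goes up to the hard limits; hard widen lifts them.  */
void
toy_widen (struct toy_buffer *b, bool hard)
{
  if (hard)
    b->hard = false;
  else
    {
      b->begv = toy_begh (b);
      b->zv = toy_zh (b);
    }
}
```

In this model, after a hard narrowing to 10..50 and a visual narrowing to 20..30, a soft widen brings the visible range back to 10..50 only; the whole buffer becomes reachable again after a hard widen followed by a soft one.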
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-29 21:43 ` Vitalie Spinu @ 2016-04-22 14:34 ` Dmitry Gutov 2016-04-24 7:22 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2016-04-22 14:34 UTC (permalink / raw) To: Vitalie Spinu, Stefan Monnier; +Cc: emacs-devel Hi Vitalie, On 03/30/2016 12:43 AM, Vitalie Spinu wrote: > I haven't tested it yet because of the following build problem with loaddefs: It actually builds fine here. Have you tried 'make bootstrap'? ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-04-22 14:34 ` Dmitry Gutov @ 2016-04-24 7:22 ` Vitalie Spinu 2016-04-24 7:28 ` Achim Gratz 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-04-24 7:22 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Stefan Monnier, emacs-devel Hm. I have reported that error with `make bootstrap`. I got stuck at that silly error and didn't have time to figure it out. Will get back at it next week after my deadlines are over. >> On Fri, Apr 22 2016 17:34, Dmitry Gutov wrote: > Hi Vitalie, > On 03/30/2016 12:43 AM, Vitalie Spinu wrote: >> I haven't tested it yet because of the following build problem with loaddefs: > It actually builds fine here. Have you tried 'make bootstrap'? ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-04-24 7:22 ` Vitalie Spinu @ 2016-04-24 7:28 ` Achim Gratz 2016-04-24 11:33 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Achim Gratz @ 2016-04-24 7:28 UTC (permalink / raw) To: emacs-devel Vitalie Spinu writes: > Hm. I have reported that error with `make bootstrap`. > > I got stuck at that silly error and didn't have time to figure it out. Will get > back at it next week after my deadlines are over. Do a 'make extraclean' and try again. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Factory and User Sound Singles for Waldorf rackAttack: http://Synth.Stromeko.net/Downloads.html#WaldorfSounds ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-04-24 7:28 ` Achim Gratz @ 2016-04-24 11:33 ` Vitalie Spinu 2016-04-24 13:20 ` Andreas Schwab 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-04-24 11:33 UTC (permalink / raw) To: Achim Gratz; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 211 bytes --] >> On Sun, Apr 24 2016 09:28, Achim Gratz wrote: > Do a 'make extraclean' and try again. Doesn't help either. I have narrowed it down to adding an extra args to primitives. This is how I do it right now. [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: Type: text/x-diff, Size: 1918 bytes --] 5 files changed, 7 insertions(+), 6 deletions(-) src/buffer.c | 2 +- src/bytecode.c | 2 +- src/editfns.c | 5 +++-- src/fileio.c | 2 +- src/process.c | 2 +- modified src/buffer.c @@ -2165,7 +2165,7 @@ Any narrowing restriction in effect (see `narrow-to-region') is removed, so the buffer is truly empty after this. */) (void) { - Fwiden (); + Fwiden (Qnil); del_range (BEG, Z); modified src/bytecode.c @@ -1692,7 +1692,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, CASE (Bwiden): BEFORE_POTENTIAL_GC (); - PUSH (Fwiden ()); + TOP = Fwiden (TOP); AFTER_POTENTIAL_GC (); NEXT; modified src/editfns.c @@ -3483,11 +3483,12 @@ DEFUN ("delete-and-extract-region", Fdelete_and_extract_region, return empty_unibyte_string; return del_range_1 (XINT (start), XINT (end), 1, 1); } + \f -DEFUN ("widen", Fwiden, Swiden, 0, 0, "", +DEFUN ("widen", Fwiden, Swiden, 0, 1, "", doc: /* Remove restrictions (narrowing) from current buffer. This allows the buffer's full text to be seen and edited. 
*/) - (void) + (Lisp_Object hard) { if (BEG != BEGV || Z != ZV) current_buffer->clip_changed = 1; modified src/fileio.c @@ -4764,7 +4764,7 @@ write_region (Lisp_Object start, Lisp_Object end, Lisp_Object filename, This is useful in tar-mode. --Stef XSETFASTINT (start, BEG); XSETFASTINT (end, Z); */ - Fwiden (); + Fwiden (Qnil); } record_unwind_protect (build_annotations_unwind, modified src/process.c @@ -5514,7 +5514,7 @@ Otherwise it discards the output. */) /* If the output marker is outside of the visible region, save the restriction and widen. */ if (! (BEGV <= PT && PT <= ZV)) - Fwiden (); + Fwiden (Qnil); /* Adjust the multibyteness of TEXT to that of the buffer. */ if (NILP (BVAR (current_buffer, enable_multibyte_characters)) [-- Attachment #3: Type: text/plain, Size: 999 bytes --] And this is the error which `make bootstrap` gives: make -C ../lisp autoloads EMACS="../src/bootstrap-emacs" make[3]: Entering directory '/home/vspinu/bin/emacs-test/lisp' GEN calendar/cal-loaddefs.el Loading macroexp.elc... appt.el:0:0: error: wrong-type-argument: (markerp /home/vspinu/bin/emacs-test/lisp/calendar/appt.el) Makefile:406: recipe for target 'calendar/cal-loaddefs.el' failed make[3]: *** [calendar/cal-loaddefs.el] Error 255 make[3]: Leaving directory '/home/vspinu/bin/emacs-test/lisp' Makefile:727: recipe for target '../lisp/loaddefs.el' failed make[2]: *** [../lisp/loaddefs.el] Error 2 make[2]: Leaving directory '/home/vspinu/bin/emacs-test/src' Makefile:398: recipe for target 'src' failed make[1]: *** [src] Error 2 make[1]: Leaving directory '/home/vspinu/bin/emacs-test' Makefile:1091: recipe for target 'bootstrap' failed make: *** [bootstrap] Error 2 What am I doing wrong here? Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-04-24 11:33 ` Vitalie Spinu @ 2016-04-24 13:20 ` Andreas Schwab 2016-04-24 16:11 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Andreas Schwab @ 2016-04-24 13:20 UTC (permalink / raw) To: Vitalie Spinu; +Cc: Achim Gratz, emacs-devel Vitalie Spinu <spinuvit@gmail.com> writes: > @@ -1692,7 +1692,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, > > CASE (Bwiden): > BEFORE_POTENTIAL_GC (); > - PUSH (Fwiden ()); > + TOP = Fwiden (TOP); You are clobbering the stack here. Instead of pushing a new value you are overwriting an unrelated value on the stack. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-04-24 13:20 ` Andreas Schwab @ 2016-04-24 16:11 ` Vitalie Spinu 2016-04-24 16:19 ` Andreas Schwab 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-04-24 16:11 UTC (permalink / raw) To: Andreas Schwab; +Cc: Achim Gratz, emacs-devel >> On Sun, Apr 24 2016 15:20, Andreas Schwab wrote: > Vitalie Spinu <spinuvit@gmail.com> writes: >> @@ -1692,7 +1692,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, >> >> CASE (Bwiden): >> BEFORE_POTENTIAL_GC (); >> - PUSH (Fwiden ()); >> + TOP = Fwiden (TOP); > You are clobbering the stack here. Instead of pushing a new value you > are overwriting an unrelated value on the stack. I don't think so. I pick an argument from the stack and put the return value in it. This is what all one-arg functions do in bytecode.c. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-04-24 16:11 ` Vitalie Spinu @ 2016-04-24 16:19 ` Andreas Schwab 2016-04-24 16:41 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Andreas Schwab @ 2016-04-24 16:19 UTC (permalink / raw) To: Vitalie Spinu; +Cc: Achim Gratz, emacs-devel Vitalie Spinu <spinuvit@gmail.com> writes: >>> On Sun, Apr 24 2016 15:20, Andreas Schwab wrote: > >> Vitalie Spinu <spinuvit@gmail.com> writes: > >>> @@ -1692,7 +1692,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, >>> >>> CASE (Bwiden): >>> BEFORE_POTENTIAL_GC (); >>> - PUSH (Fwiden ()); >>> + TOP = Fwiden (TOP); > >> You are clobbering the stack here. Instead of pushing a new value you >> are overwriting an unrelated value on the stack. > > I don't think so. I pick an argument from the stack and put the return value in > it. This is what all one-arg functions do in bytecode.c. But nobody is pushing that argument. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-04-24 16:19 ` Andreas Schwab @ 2016-04-24 16:41 ` Vitalie Spinu 2016-04-24 16:48 ` Andreas Schwab 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-04-24 16:41 UTC (permalink / raw) To: Andreas Schwab; +Cc: Achim Gratz, emacs-devel >> On Sun, Apr 24 2016 18:19, Andreas Schwab wrote: > Vitalie Spinu <spinuvit@gmail.com> writes: >>>> On Sun, Apr 24 2016 15:20, Andreas Schwab wrote: >> >>> Vitalie Spinu <spinuvit@gmail.com> writes: >> >>>> @@ -1692,7 +1692,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, >>>> >>>> CASE (Bwiden): >>>> BEFORE_POTENTIAL_GC (); >>>> - PUSH (Fwiden ()); >>>> + TOP = Fwiden (TOP); >> >>> You are clobbering the stack here. Instead of pushing a new value you >>> are overwriting an unrelated value on the stack. >> >> I don't think so. I pick an argument from the stack and put the return value in >> it. This is what all one-arg functions do in bytecode.c. > But nobody is pushing that argument. I am not pushing anything. I am just overriding. PUSH in above diff is a deleted line. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-04-24 16:41 ` Vitalie Spinu @ 2016-04-24 16:48 ` Andreas Schwab 2016-04-24 18:01 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Andreas Schwab @ 2016-04-24 16:48 UTC (permalink / raw) To: Vitalie Spinu; +Cc: Achim Gratz, emacs-devel Vitalie Spinu <spinuvit@gmail.com> writes: >>> On Sun, Apr 24 2016 18:19, Andreas Schwab wrote: > >> Vitalie Spinu <spinuvit@gmail.com> writes: > >>>>> On Sun, Apr 24 2016 15:20, Andreas Schwab wrote: >>> >>>> Vitalie Spinu <spinuvit@gmail.com> writes: >>> >>>>> @@ -1692,7 +1692,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, >>>>> >>>>> CASE (Bwiden): >>>>> BEFORE_POTENTIAL_GC (); >>>>> - PUSH (Fwiden ()); >>>>> + TOP = Fwiden (TOP); >>> >>>> You are clobbering the stack here. Instead of pushing a new value you >>>> are overwriting an unrelated value on the stack. >>> >>> I don't think so. I pick an argument from the stack and put the return value in >>> it. This is what all one-arg functions do in bytecode.c. > >> But nobody is pushing that argument. > > I am not pushing anything. Exactly. This opcode takes no argument, and you cannot change that. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-04-24 16:48 ` Andreas Schwab @ 2016-04-24 18:01 ` Vitalie Spinu 2016-04-24 19:05 ` Andreas Schwab 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-04-24 18:01 UTC (permalink / raw) To: Andreas Schwab; +Cc: Achim Gratz, emacs-devel >> On Sun, Apr 24 2016 18:48, Andreas Schwab wrote: > Vitalie Spinu <spinuvit@gmail.com> writes: >>>> On Sun, Apr 24 2016 18:19, Andreas Schwab wrote: >> >>> Vitalie Spinu <spinuvit@gmail.com> writes: >> >>>>>> On Sun, Apr 24 2016 15:20, Andreas Schwab wrote: >>>> >>>>> Vitalie Spinu <spinuvit@gmail.com> writes: >>>> >>>>>> @@ -1692,7 +1692,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, >>>>>> >>>>>> CASE (Bwiden): >>>>>> BEFORE_POTENTIAL_GC (); >>>>>> - PUSH (Fwiden ()); >>>>>> + TOP = Fwiden (TOP); >>>> >>>>> You are clobbering the stack here. Instead of pushing a new value you >>>>> are overwriting an unrelated value on the stack. >>>> >>>> I don't think so. I pick an argument from the stack and put the return value in >>>> it. This is what all one-arg functions do in bytecode.c. >> >>> But nobody is pushing that argument. >> >> I am not pushing anything. > Exactly. This opcode takes no argument, and you cannot change that. Could you please elaborate a bit on where you think the problem is? What exactly can I not change? I am changing the number of arguments of widen (from zero to one) and adjusting the byte code table. Are you saying that's not the right way to do it? How should I do it then? Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-04-24 18:01 ` Vitalie Spinu @ 2016-04-24 19:05 ` Andreas Schwab 0 siblings, 0 replies; 155+ messages in thread From: Andreas Schwab @ 2016-04-24 19:05 UTC (permalink / raw) To: Vitalie Spinu; +Cc: Achim Gratz, emacs-devel Vitalie Spinu <spinuvit@gmail.com> writes: > What exactly can I not change? I am changing the number of arguments of widen > (from zero to one) and adjusting the byte code table. Are you saying that's not the > right way to do it? How should I do it then? All bytecode ops are part of the ABI. You cannot just change them without breaking existing elc files. Either introduce a new op or leave the one-argument form as is. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 155+ messages in thread
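The stack clobbering discussed in this exchange can be reproduced with a toy stack machine. This is a minimal sketch and not the Emacs interpreter: the opcodes, `toy_exec` and `toy_old_program` are hypothetical, and the only assumption carried over from the thread is that old compiled code expects the widen opcode to take nothing from the stack and push one result.

```c
#include <stddef.h>

enum toy_op { TOY_PUSH_42, TOY_WIDEN, TOY_DISCARD };
enum toy_widen_semantics { TOY_WIDEN_PUSHES, TOY_WIDEN_REPLACES_TOP };

#define TOY_NIL 0
#define TOY_STACK_ERROR (-1)
#define TOY_STACK_SIZE 16

/* A hypothetical old program: push a value, call widen for effect,
   drop widen's result, and leave the first value on top.  */
const enum toy_op toy_old_program[] = { TOY_PUSH_42, TOY_WIDEN, TOY_DISCARD };

/* Run CODE (N opcodes) and return the value left on top of the stack,
   or TOY_STACK_ERROR on overflow, underflow, or an empty final stack.  */
int
toy_exec (const enum toy_op *code, size_t n, enum toy_widen_semantics sem)
{
  int stack[TOY_STACK_SIZE];
  size_t depth = 0;

  for (size_t i = 0; i < n; i++)
    switch (code[i])
      {
      case TOY_PUSH_42:
        if (depth == TOY_STACK_SIZE)
          return TOY_STACK_ERROR;
        stack[depth++] = 42;
        break;

      case TOY_WIDEN:
        if (sem == TOY_WIDEN_PUSHES)
          {
            /* Like PUSH (Fwiden ()): one more value on the stack.  */
            if (depth == TOY_STACK_SIZE)
              return TOY_STACK_ERROR;
            stack[depth++] = TOY_NIL;
          }
        else
          {
            /* Like TOP = Fwiden (TOP): overwrites whatever was on top.  */
            if (depth == 0)
              return TOY_STACK_ERROR;
            stack[depth - 1] = TOY_NIL;
          }
        break;

      case TOY_DISCARD:
        if (depth == 0)
          return TOY_STACK_ERROR;
        depth--;
        break;
      }

  return depth > 0 ? stack[depth - 1] : TOY_STACK_ERROR;
}
```

Run with the original semantics, the old program leaves 42 on the stack. Run with the changed semantics, the widen opcode overwrites 42, the following discard removes it, and the program ends with an empty stack: the value that the old code never meant to hand to widen is gone.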
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-22 11:57 ` Stefan Monnier 2016-03-22 16:28 ` Vitalie Spinu @ 2016-04-28 13:29 ` Vitalie Spinu 2016-04-30 14:06 ` Stefan Monnier 1 sibling, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-04-28 13:29 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 402 bytes --] >> On Tue, Mar 22 2016 06:57, Stefan Monnier wrote: > IIRC past discussions on this issue, one option was to merge your > set-widen-limits into narrow-to-region by adding an optional argument `hard'. Stefan, adding an extra argument turned out to be a train wreck. I am afraid, if I cannot get help on how to extend primitives, I am giving up at this point. Adding a dummy argument to Fbobp like this: [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: Type: text/x-diff, Size: 770 bytes --] 2 files changed, 3 insertions(+), 3 deletions(-) src/bytecode.c | 2 +- src/editfns.c | 4 ++-- modified src/bytecode.c @@ -1589,7 +1589,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, NEXT; CASE (Bbobp): - PUSH (Fbobp ()); + TOP = Fbobp (TOP); NEXT; CASE (Bcurrent_buffer): modified src/editfns.c @@ -1164,10 +1164,10 @@ At the beginning of the buffer or accessible region, return 0. */) return temp; } -DEFUN ("bobp", Fbobp, Sbobp, 0, 0, 0, +DEFUN ("bobp", Fbobp, Sbobp, 0, 1, 0, doc: /* Return t if point is at the beginning of the buffer. If the buffer is narrowed, this means the beginning of the narrowed part. 
*/) - (void) + (Lisp_Object dummy) { if (PT == BEGV) return Qt; [-- Attachment #3: Type: text/plain, Size: 1605 bytes --] then make extraclean && git clean -f && make bootstrap gives "Wrong type argument" during byte compilation: make[3]: Entering directory '/home/vspinu/bin/emacs-test/lisp' ELC ../lisp/international/eucjp-ms.elc Reloading stale loaddefs.el Loading /home/vspinu/bin/emacs-test/lisp/loaddefs.el (source)... make[3]: Leaving directory '/home/vspinu/bin/emacs-test/lisp' make -C ../admin/unidata all EMACS="../../src/bootstrap-emacs" make[3]: Entering directory '/home/vspinu/bin/emacs-test/admin/unidata' GEN ../../src/macuvs.h GEN ../../lisp/international/charprop.el Wrong type argument: char-or-string-p, #<EMACS BUG: INVALID DATATYPE (MISC 0x48ff) Save your buffers immediately and please report this bug> Makefile:87: recipe for target '../../lisp/international/charprop.el' failed make[3]: *** [../../lisp/international/charprop.el] Error 255 make[3]: Leaving directory '/home/vspinu/bin/emacs-test/admin/unidata' Makefile:498: recipe for target '../lisp/international/charprop.el' failed make[2]: *** [../lisp/international/charprop.el] Error 2 make[2]: Leaving directory '/home/vspinu/bin/emacs-test/src' Makefile:398: recipe for target 'src' failed make[1]: *** [src] Error 2 make[1]: Leaving directory '/home/vspinu/bin/emacs-test' GNUmakefile:79: recipe for target 'bootstrap' failed I am defining Bbobp, Fbobp, Sbobp just as it's done with any other primitive with an optional argument. `Fbobp` is never used at C level, so the above diff is complete and could be run as it is. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-04-28 13:29 ` Vitalie Spinu @ 2016-04-30 14:06 ` Stefan Monnier 0 siblings, 0 replies; 155+ messages in thread From: Stefan Monnier @ 2016-04-30 14:06 UTC (permalink / raw) To: emacs-devel

> Stefan, adding extra argument turned to be a train wreck. I am afraid, if I
> cannot get help on how to extend primitives, I am giving up at this point.

Indeed, I didn't consider the fact that some (most?) of those functions
are also implemented as bytecode. Hmm...

> case (Bbobp):
> - PUSH (Fbobp ());
> + TOP = Fbobp (TOP);
> NEXT;

You can't change the bytecode's behavior, since existing .elc files would
otherwise break down completely: the stack would be modified in a way they
don't expect.
[ For new .elc files you could make it work by changing bytecomp.el to
update the byte compiler's understanding of how the bytecode works. ]

So for functions that have a corresponding bytecode, you can either stop
using the bytecode when the new arg is used (this requires changing
bytecomp.el accordingly, e.g. by removing the corresponding code such as
"(byte-defop 125 -1 byte-narrow-to-region)"), or you leave the function
unchanged and introduce another function instead.

Having a bytecode for `narrow-to-region` is not very useful (the main/only
benefit is speed of executing narrow-to-region, but the difference
shouldn't be significant) so it'd be perfectly OK to stop using this
bytecode. For `bobp` I'm not completely sure of the potential performance
impact, OTOH.

        Stefan

^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-22 10:05 ` Vitalie Spinu 2016-03-22 11:57 ` Stefan Monnier @ 2016-03-22 20:08 ` Richard Stallman 2016-03-22 22:45 ` Vitalie Spinu 1 sibling, 1 reply; 155+ messages in thread From: Richard Stallman @ 2016-03-22 20:08 UTC (permalink / raw) To: Vitalie Spinu; +Cc: monnier, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > So the consumer will need to set this variable and always follow it by widen. In the context of Emacs -- or software, generally -- what does "consumer" mean? One of the nice things about installing a program in your computer is that running it does not use it up. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] 2016-03-22 20:08 ` Richard Stallman @ 2016-03-22 22:45 ` Vitalie Spinu 0 siblings, 0 replies; 155+ messages in thread From: Vitalie Spinu @ 2016-03-22 22:45 UTC (permalink / raw) To: Richard Stallman; +Cc: monnier, emacs-devel >> On Tue, Mar 22 2016 16:08, Richard Stallman wrote: > [[[ To any NSA and FBI agents reading my email: please consider ]]] > [[[ whether defending the US Constitution against all enemies, ]]] > [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > > So the consumer will need to set this variable and always follow it by widen. > In the context of Emacs -- or software, generally -- what does > "consumer" mean? One of the nice things about installing a program > in your computer is that running it does not use it up. By "consumer" of a function or variable I meant any elisp code that uses (aka consumes) that function or variable. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 1:05 ` Vitalie Spinu 2016-03-21 3:11 ` Stefan Monnier 2016-03-21 5:08 ` [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] Vitalie Spinu @ 2016-03-21 11:47 ` Dmitry Gutov 2016-03-21 12:40 ` Vitalie Spinu 2 siblings, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2016-03-21 11:47 UTC (permalink / raw) To: Vitalie Spinu; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel On 03/21/2016 03:05 AM, Vitalie Spinu wrote: > >>> On Sun, Mar 20 2016 17:58, Dmitry Gutov wrote: > >> On 03/20/2016 02:15 PM, Vitalie Spinu wrote: > >> IIRC, using first-column is fairly justified, the outer mode can't add extra >> indentation to the submode is a reliable, sane way > > The inner mode cannot often make that decision either. What decision? One case where the mode cannot return its proposed indentation at all, is when the resulting column would be negative. Using first-column can make it positive again via simple addition. Using calculate-indent-function which returns a numeric value, would solve that as well, of course, at the expense of having to update all major modes out there, and documentation. And making whatever third-party guides are out there obsolete in this regard. I'm not really against that, mind you. > Same inner mode can be > used in very different multi-mode contexts, each with their own semantics for > chunks/headers/indentation. Reducing all that to a simple (first-column > . previous-chunk) pair and letting inner mode do the job is surely not > enough. The only actor to make that decision should be multi-mode engine itself. I'm not claiming that using previous-chunk is good. > Instead of teaching modes about multi-modes, a much better idea is to introduce > `calculate-indent-function` which would accept POS and optional STRING-AFTER and > STRING-BEFORE. 
This function will return the indentation of STRING-AFTER at POS > assuming there is a virtual STRING-BEFORE just before POS. Strings? Indentation engines do not deal with strings, they deal with buffer contents. Having them handle this possibility would also amount to sharing a part of multi-mode logic. Instead, if you want to know what indentation an inner mode would return if STRING-BEFORE was before it, insert that string into the buffer (while inhibiting undo history). Call the indentation function, then remove the string. Any performance concerns with that? > Most modes indent reliably > based on one previous line, Ruby doesn't. Most modes based on SMIE will need more than the previous line in the general case, too. > Then a lot of modes don't even care about what's in the current line, so > STRING-AFTER will be irrelevant as well. Almost all of them care whether the current line contains }, or `end', or `else', and so on. >>> It's essentially a half-backed implementation of "hard widening" discussed >>> earlier. Why not impose the widening restriction directly in `widen` then? >>> Maybe bring widen to elisp and rename C widen into widen-internal. Then add >>> generic `prog-hard-widen-limits` which would be checked along >>> prog-indentation-context limits. > >> Right! At the very least, I we should extract the second element of >> prog-indentation-context into a separate variable, and make prog-widen more >> prominent. > > Not sure about removing second element. Good thing about keeping all of them in > one place is for the indentation engine to be concerned with a single variable. Didn't you mention font-lock and syntax-propertize yourself? Why would they call a function that's solely dependent on an indentation variable? In any case, your hard-narrowing proposal is very similar. Surely you don't want to keep the second element of prog-indentation-context after hard-narrowing becomes available? 
> Only consumers of `hard-widen-limits` should be concerned with its side > effects. But that's uniformly better than current situation when you cannot do > much about restricting widen. OK, so *every* consumer of widen will have to obey the hard limits. That might work, if there's no low-level code that absolutely has to always be able to widen to the whole buffer. > BTW, I parse-partial-sexp must abide hard-widen-limits as well. If you want parse-partial-sexp to obey limits, you narrow the buffer around it. > This way the > request aired in bug#22983 of parse-partial-sexp == syntax-ppss will be > automatically satisfied. You won't need syntax-ppss-dont-widen either. That doesn't seem relevant. That bug is about stale cache values between different narrowing bounds. > A patch that would require hunting every single mode out there and implementing > multi-modes locally should have been more carefully considered IMO. Emacs 25 is > not yet there, so it's not late to reconsider that decision. I concur. ^ permalink raw reply [flat|nested] 155+ messages in thread
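Dmitry's alternative in the message above (temporarily insert the text that would play the role of STRING-BEFORE, run the indentation code, then remove the text) could look roughly like the sketch below. The name `my-call-with-virtual-prefix` is made up for illustration and is not part of Emacs or of any patch in this thread. Undo recording is switched off for the whole call, so in this sketch FN should only compute a value (for example a column) rather than re-indent the line.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-call-with-virtual-prefix (pos string-before fn)
  "Call FN with STRING-BEFORE temporarily inserted at POS.
The inserted text is removed again afterwards, even if FN signals
an error.  Return whatever FN returns."
  (let ((buffer-undo-list t)              ; do not record the temporary edit
        (modified (buffer-modified-p)))   ; remember the modified flag
    (save-excursion
      (goto-char pos)
      (insert string-before)
      ;; A marker keeps tracking the end of the inserted text even if
      ;; FN changes the buffer after it.
      (let ((end (copy-marker (point))))
        (unwind-protect
            (funcall fn)
          (delete-region pos end)
          (set-marker end nil)
          (restore-buffer-modified-p modified))))))
```

A multi-mode package would call this around the inner mode's indentation calculation; the inner mode itself needs no new arguments, which is the point Dmitry makes.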
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 11:47 ` Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] Dmitry Gutov @ 2016-03-21 12:40 ` Vitalie Spinu 2016-03-21 13:07 ` Dmitry Gutov 2016-03-21 14:02 ` Stefan Monnier 0 siblings, 2 replies; 155+ messages in thread From: Vitalie Spinu @ 2016-03-21 12:40 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel >> On Mon, Mar 21 2016 13:47, Dmitry Gutov wrote: > On 03/21/2016 03:05 AM, Vitalie Spinu wrote: >> >>>> On Sun, Mar 20 2016 17:58, Dmitry Gutov wrote: >> >> >> The inner mode cannot often make that decision either. > What decision? Decision of how much to indent. Inner mode just doesn't have a complete picture of what is going on. Just having access to previous chunk is not enough. Note that I don't mind FIRST-COLUMN functionality. I think it's harmless and probably useful. I mostly mind the last two arguments of prog-indentation-context. > I'm not claiming that using previous-chunk is good. Good ;) >> Instead of teaching modes about multi-modes, a much better idea is to introduce >> `calculate-indent-function` which would accept POS and optional STRING-AFTER and >> STRING-BEFORE. This function will return the indentation of STRING-AFTER at POS >> assuming there is a virtual STRING-BEFORE just before POS. > Strings? Indentation engines do not deal with strings, they deal with buffer > contents. Having them handle this possibility would also amount to sharing a > part of multi-mode logic. Yeh. That's the sucky part. My hope is that BEFORE-STRING will be seldom used. Given that this case applies only to continuation chunks and assuming that multi-mode engine can identify those (at least at multi-mode level) this is a reasonable trade off IMO. In polymode I haven't even got down to indentation of continuation chunks yet. They are not that common in literate programming. 
Performance is not a primary concern for indentation. Correctness and conceptual cleanness is at a much higher stake here. My hope is that generic helper functions can be optimized to re-use same temp buffer for multiple invocations of calculate-indent-function. >> Then a lot of modes don't even care about what's in the current line, so >> STRING-AFTER will be irrelevant as well. > Almost all of them care whether the current line contains }, or `end', or > `else', and so on. Indeed. But this information is trivial to retrieve from STRING-AFTER. > In any case, your hard-narrowing proposal is very similar. Surely you don't want > to keep the second element of prog-indentation-context after hard-narrowing > becomes available? Indeed. I was not thinking about algorithmic complexities. AFAIK if second element is removed, the third one should go as well. That leaves only FIRST-COLUMN then, which I personally don't mind. >> Only consumers of `hard-widen-limits` should be concerned with its side >> effects. But that's uniformly better than current situation when you cannot do >> much about restricting widen. > OK, so *every* consumer of widen will have to obey the hard limits. That might > work, if there's no low-level code that absolutely has to always be able to > widen to the whole buffer. I think as long as low level code uses BEGV and ZV instead of BEG and Z everything should be fine. That is with an implicit assumption that hard limits are always wider than the current visual narrowing which is a reasonable contract IMO. Even better, as long as low level routines use BEG and Z consistently (and it looks like they do) BEG and Z can be modified to take care of hard-widen-limits. This might be the easiest solution. In any case going through all C code and checking usage of widen is not such an insurmountable task. >> BTW, I parse-partial-sexp must abide hard-widen-limits as well. > If you want parse-partial-sexp to obey limits, you narrow the buffer around it. 
>> This way the >> request aired in bug#22983 of parse-partial-sexp == syntax-ppss will be >> automatically satisfied. You won't need syntax-ppss-dont-widen either. > That doesn't seem relevant. That bug is about stale cache values between > different narrowing bounds. Right. Those stale values won't occur in multi-modes because both syntax-ppss and parse-partial-sexp will always operate on same hard-narrowed regions. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
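The idea discussed above, a `widen` that stops at hard limits, can be sketched in Lisp in a few lines. Everything here is hypothetical: `my-hard-widen-limits` and `my-widen` are illustrative names only, and the sketch wraps the existing `widen` rather than changing it, so it says nothing about how BEG/Z or other C-level code would have to behave.

```elisp
;; Hypothetical names, for illustration only.
(defvar-local my-hard-widen-limits nil
  "Either nil or a cons (START . END) beyond which `my-widen' will not go.")

(defun my-widen ()
  "Like `widen', but stay within `my-hard-widen-limits' when it is non-nil."
  (widen)
  (when my-hard-widen-limits
    (narrow-to-region (car my-hard-widen-limits)
                      (cdr my-hard-widen-limits))))
```

With the limits set to a chunk's bounds, code that calls `my-widen` before parsing would see the chunk instead of the whole buffer. Existing code that calls plain `widen` is unaffected by this sketch, which is exactly the compatibility question debated in the thread.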
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 12:40 ` Vitalie Spinu @ 2016-03-21 13:07 ` Dmitry Gutov 2016-03-21 14:20 ` Vitalie Spinu 2016-03-21 14:02 ` Stefan Monnier 1 sibling, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2016-03-21 13:07 UTC (permalink / raw) To: Vitalie Spinu; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel On 03/21/2016 02:40 PM, Vitalie Spinu wrote: >> Strings? Indentation engines do not deal with strings, they deal with buffer >> contents. Having them handle this possibility would also amount to sharing a >> part of multi-mode logic. > > Yeh. That's the sucky part. My hope is that BEFORE-STRING will be seldom > used. Then let's not add that to the API until we see a concrete need for it. > Performance is not a primary concern for indentation. Correctness and conceptual > cleanness is at a much higher stake here. My hope is that generic helper > functions can be optimized to re-use same temp buffer for multiple invocations > of calculate-indent-function. So, how about trying my alternative proposal first? >>> Then a lot of modes don't even care about what's in the current line, so >>> STRING-AFTER will be irrelevant as well. > >> Almost all of them care whether the current line contains }, or `end', or >> `else', and so on. > > Indeed. But this information is trivial to retrieve from STRING-AFTER. Feeding it to each particular indentation engine is not going to be trivial. >> In any case, your hard-narrowing proposal is very similar. Surely you don't want >> to keep the second element of prog-indentation-context after hard-narrowing >> becomes available? > > Indeed. I was not thinking about algorithmic complexities. > > AFAIK if second element is removed, the third one should go as well. That leaves > only FIRST-COLUMN then, which I personally don't mind. OK. And that one could be replaced with the introduction of prog-indentation-function. 
Though that might be getting ahead of ourselves. >>> This way the >>> request aired in bug#22983 of parse-partial-sexp == syntax-ppss will be >>> automatically satisfied. You won't need syntax-ppss-dont-widen either. > >> That doesn't seem relevant. That bug is about stale cache values between >> different narrowing bounds. > > Right. Those stale values won't occur in multi-modes because both syntax-ppss > and parse-partial-sexp will always operate on same hard-narrowed regions. We could only be sure of that for syntax-ppss calls in facilities that the multi-mode handles specially, like font-lock, syntax-propertize and indentation. Not so with any other functions the user could call. ^ permalink raw reply [flat|nested] 155+ messages in thread
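To make the "different narrowing bounds" point concrete, here is a small sketch of why a parse state computed from `point-min` depends on the narrowing in effect, and hence why a value cached under one narrowing is stale under another. It uses only `parse-partial-sexp`; the helper name `my-depth-at` is made up for illustration, and the sketch deliberately does not call `syntax-ppss`, whose caching behaviour is the subject of the bug.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-depth-at (pos)
  "Return the paren depth at POS, parsing from the current `point-min'."
  (car (parse-partial-sexp (point-min) pos)))

(defun my-depth-demo ()
  "Return (WIDE . NARROW): depth at position 6, widened vs. narrowed."
  (with-temp-buffer
    (insert "(a (b c))")
    (let ((wide (my-depth-at 6)))          ; parse starts before the outer paren
      (save-restriction
        (narrow-to-region 4 (point-max))   ; outer paren is now out of sight
        (cons wide (my-depth-at 6))))))
```

In the widened buffer the parse passes both open parens before position 6; once the buffer is narrowed to start at the inner paren, only one of them is visible, so the two depths differ.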
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 13:07 ` Dmitry Gutov @ 2016-03-21 14:20 ` Vitalie Spinu 2016-03-21 14:29 ` Dmitry Gutov 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-21 14:20 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel >> On Mon, Mar 21 2016 15:07, Dmitry Gutov wrote: > On 03/21/2016 02:40 PM, Vitalie Spinu wrote: >>> Strings? Indentation engines do not deal with strings, they deal with buffer >>> contents. Having them handle this possibility would also amount to sharing a >>> part of multi-mode logic. >> >> Yeh. That's the sucky part. My hope is that BEFORE-STRING will be seldom >> used. > Then let's not add that to the API until we see a concrete need for it. It might be good to not include these (prog-indentation-context including) in emacs 25 release. >> Performance is not a primary concern for indentation. Correctness and conceptual >> cleanness is at a much higher stake here. My hope is that generic helper >> functions can be optimized to re-use same temp buffer for multiple invocations >> of calculate-indent-function. > So, how about trying my alternative proposal first? Sorry. What proposal do you mean? >> Right. Those stale values won't occur in multi-modes because both syntax-ppss >> and parse-partial-sexp will always operate on same hard-narrowed regions. > We could only be sure of that for syntax-ppss calls in facilities that the > multi-mode handles specially, like font-lock, syntax-propertize and > indentation. Not so with any other functions the user could call. I assume that multi-mode engine is advising syntax-ppss which I think it should. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 14:20 ` Vitalie Spinu @ 2016-03-21 14:29 ` Dmitry Gutov 2016-03-21 14:42 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2016-03-21 14:29 UTC (permalink / raw) To: Vitalie Spinu; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel On 03/21/2016 04:20 PM, Vitalie Spinu wrote: > It might be good to not include these (prog-indentation-context including) in > emacs 25 release. Of course, none of them. But nor should we put BEFORE-STRING into master until we understand that we really need it, and how to use it. >>> Performance is not a primary concern for indentation. Correctness and conceptual >>> cleanness is at a much higher stake here. My hope is that generic helper >>> functions can be optimized to re-use same temp buffer for multiple invocations >>> of calculate-indent-function. > >> So, how about trying my alternative proposal first? > > Sorry. What proposal do you mean? """ Instead, if you want to know what indentation an inner mode would return if STRING-BEFORE was before it, insert that string into the buffer (while inhibiting undo history). Call the indentation function, then remove the string. """ Same with AFTER-STRING. The multi-mode package itself would do that. >>> Right. Those stale values won't occur in multi-modes because both syntax-ppss >>> and parse-partial-sexp will always operate on same hard-narrowed regions. > >> We could only be sure of that for syntax-ppss calls in facilities that the >> multi-mode handles specially, like font-lock, syntax-propertize and >> indentation. Not so with any other functions the user could call. > > I assume that multi-mode engine is advising syntax-ppss which I think it should. Very well, that's an option. Having syntax-ppss-dont-widen (or making syntax-ppss respect hard-widen-limits) should be sufficient for it. ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 14:29 ` Dmitry Gutov @ 2016-03-21 14:42 ` Vitalie Spinu 2016-03-21 14:56 ` Dmitry Gutov 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-21 14:42 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel >> On Mon, Mar 21 2016 16:29, Dmitry Gutov wrote: >> Sorry. What proposal do you mean? > """ > Instead, if you want to know what indentation an inner mode would return if > STRING-BEFORE was before it, insert that string into the buffer (while > inhibiting undo history). Call the indentation function, then remove the string. > """ Inner mode might decide to operate on string directly, or put stuff in a temp buffer, work on last line only, or simply ignore it. Why to hard-wire the usage of STRING-BEFORE so badly? My gut feeling is to avoid modifying buffer context in indentation engine at all costs. In the future, if performance with temp buffers will be a real issue, we can add more low level functions for fast operation on string to do some common parsing tasks. We can even extend parse-ppss to deal with BEFORE-STRING. Vitalie ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 14:42 ` Vitalie Spinu @ 2016-03-21 14:56 ` Dmitry Gutov 2016-03-21 16:52 ` Vitalie Spinu 0 siblings, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2016-03-21 14:56 UTC (permalink / raw) To: Vitalie Spinu; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel On 03/21/2016 04:42 PM, Vitalie Spinu wrote: >> """ >> Instead, if you want to know what indentation an inner mode would return if >> STRING-BEFORE was before it, insert that string into the buffer (while >> inhibiting undo history). Call the indentation function, then remove the string. >> """ > > Inner mode might decide to operate on string directly, or put stuff in a temp > buffer, work on last line only, or simply ignore it. Yes, each major mode would have to make all of these choices. Why burden them with that concern? Wouldn't that become a part of the same problem that you yourself mentioned, "teaching modes about multi-modes"? > Why to hard-wire the usage > of STRING-BEFORE so badly? What hard-wiring? STRING-BEFORE is not a tangible part of my proposal. There's no API change tied to it. > My gut feeling is to avoid modifying buffer context in indentation engine at all > costs. Why? That's worked out okay for me. Alternatively, you can create a temp buffer each time, compose pieces of inner mode text in it, and call the indentation function. Again, in multi-mode code. > In the future, if performance with temp buffers will be a real issue, we > can add more low level functions for fast operation on string to do some common > parsing tasks. We can even extend parse-ppss to deal with BEFORE-STRING. Performance is a distant concern, complexity is the immediate one. If modifying buffers turns out to be a problem, then we can do all the stuff you mention above. ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 14:56 ` Dmitry Gutov @ 2016-03-21 16:52 ` Vitalie Spinu 2016-03-21 21:30 ` Dmitry Gutov 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-21 16:52 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel Ok, so the alternative proposal is not to do anything. I like that. The only reason to have STRING-AFTER and STRING-BEFORE is potential mode specific optimization. If that's not a concern, no need for that. Vitalie >> On Mon, Mar 21 2016 16:56, Dmitry Gutov wrote: > On 03/21/2016 04:42 PM, Vitalie Spinu wrote: >>> """ >>> Instead, if you want to know what indentation an inner mode would return if >>> STRING-BEFORE was before it, insert that string into the buffer (while >>> inhibiting undo history). Call the indentation function, then remove the string. >>> """ >> >> Inner mode might decide to operate on string directly, or put stuff in a temp >> buffer, work on last line only, or simply ignore it. > Yes, each major mode would have to make all of these choices. > Why burden them with that concern? Wouldn't that become a part of the same > problem that you yourself mentioned, "teaching modes about multi-modes"? >> Why to hard-wire the usage >> of STRING-BEFORE so badly? > What hard-wiring? > STRING-BEFORE is not a tangible part of my proposal. There's no API change tied > to it. >> My gut feeling is to avoid modifying buffer context in indentation engine at all >> costs. > Why? That's worked out okay for me. > Alternatively, you can create a temp buffer each time, compose pieces of inner > mode text in it, and call the indentation function. Again, in multi-mode code. >> In the future, if performance with temp buffers will be a real issue, we >> can add more low level functions for fast operation on string to do some common >> parsing tasks. We can even extend parse-ppss to deal with BEFORE-STRING. 
> Performance is a distant concern, complexity is the immediate one. If modifying > buffers turns out to be a problem, then we can do all the stuff you mention > above. ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 16:52 ` Vitalie Spinu @ 2016-03-21 21:30 ` Dmitry Gutov 2016-04-03 23:34 ` John Wiegley 0 siblings, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2016-03-21 21:30 UTC (permalink / raw) To: Vitalie Spinu; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel On 03/21/2016 06:52 PM, Vitalie Spinu wrote: > > Ok, so the alternative proposal is not to do anything. I like that. Rather, wait and see, instead of hurrying to put those into the API. > The only > reason to have STRING-AFTER and STRING-BEFORE is potential mode specific > optimization. If that's not a concern, no need for that. Performance may be a concern, but we don't know that yet. As long as they're not required for correctness, let's not get ahead of ourselves. ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 21:30 ` Dmitry Gutov @ 2016-04-03 23:34 ` John Wiegley 0 siblings, 0 replies; 155+ messages in thread From: John Wiegley @ 2016-04-03 23:34 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Alan Mackenzie, Vitalie Spinu, Stefan Monnier, emacs-devel >>>>> Dmitry Gutov <dgutov@yandex.ru> writes: > Performance may be a concern, but we don't know that yet. As long as they're > not required for correctness, let's not get ahead of ourselves. Yes, much agreed. -- John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2 ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 12:40 ` Vitalie Spinu 2016-03-21 13:07 ` Dmitry Gutov @ 2016-03-21 14:02 ` Stefan Monnier 2016-03-21 14:31 ` Vitalie Spinu 1 sibling, 1 reply; 155+ messages in thread From: Stefan Monnier @ 2016-03-21 14:02 UTC (permalink / raw) To: emacs-devel > Note that I don't mind FIRST-COLUMN functionality. I think it's harmless and > probably useful. I mostly mind the last two arguments of > prog-indentation-context. OK, so you're OK with FIRST-COLUMN. The last two args are: - (START . END), which you actually do want, except you want to store it in hard-widen-limit. I'm OK with storing it elsewhere. - PREVIOUS-CHUNKS. It can be a string, in which case it's just like your STRING-BEFORE. So your main issues with it are either that you don't want to allow it to be a function, or that you want to store/pass it in a different way, right? >> Almost all of them care whether the current line contains }, or `end', or >> `else', and so on. > Indeed. But this information is trivial to retrieve from STRING-AFTER. In the case of SMIE, it would probably not be too difficult to adjust it so it can work with STRING-AFTER, tho I definitely wouldn't call it trivial to implement the case of "END END END aligns with the matching outer BEGIN" which is currently supported (and was default until 24.5 or so). But I must say that I don't understand why you need this STRING-AFTER thingy. Isn't that text already right there in the buffer? E.g. in prog-indentation-context, we do have something equivalent to hard-widen-limit and to STRING-BEFORE but we have nothing like STRING-AFTER: the indentation code is expected to get that info by looking at the buffer after point. Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 14:02 ` Stefan Monnier @ 2016-03-21 14:31 ` Vitalie Spinu 2016-03-21 15:06 ` Stefan Monnier 0 siblings, 1 reply; 155+ messages in thread From: Vitalie Spinu @ 2016-03-21 14:31 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel I have elaborated on all these in my other long reply. I would just conclude here that because both calculate-indent-function and prog-indentation-context try to solve same problem, they are bound to have overlapping parts. It's just that calculate-indent-function is more general, easier to understand for prog authors and it solves three problems at once - replacement of indent-line-function, removing extra prog-indentation-context/prog-widen and not exposing multi-mode complexities. Note also that the intention of the `hard-widen-limit` is to make it work seamlessly for all existing code that use widen. While prog-indentation-context requires to teach every mode out there to use prog-widen instead of widen. This doesn't sound right at all. Vitalie >> On Mon, Mar 21 2016 10:02, Stefan Monnier wrote: >> Note that I don't mind FIRST-COLUMN functionality. I think it's harmless and >> probably useful. I mostly mind the last two arguments of >> prog-indentation-context. > OK, so you're OK with FIRST-COLUMN. The last two args are: > - (START . END), which you actually do want, except you want to store it > in hard-widen-limit. I'm OK with storing it elsewhere. > - PREVIOUS-CHUNKS. It can be a string, in which case it's just like your > STRING-BEFORE. So your main issues with it are either that you don't > want to allow it to be a function, or that you want to store/pass it in > a different way, right? >>> Almost all of them care whether the current line contains }, or `end', or >>> `else', and so on. >> Indeed. But this information is trivial to retrieve from STRING-AFTER. 
> In the case of SMIE, it would probably not be too difficult to adjust it > so it can work with STRING-AFTER, tho I definitely wouldn't call it > trivial to implement the case of "END END END aligns with the matching > outer BEGIN" which is currently supported (and was default until 24.5 or > so). > But I must say that I don't understand why you need this > STRING-AFTER thingy. Isn't that text already right there in the buffer? > E.g. in prog-indentation-context, we do have something equivalent to > hard-widen-limit and to STRING-BEFORE but we have nothing like > STRING-AFTER: the indentation code is expected to get that info by > looking at the buffer after point. > Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 14:31 ` Vitalie Spinu @ 2016-03-21 15:06 ` Stefan Monnier 2016-03-21 17:15 ` Andreas Röhler 0 siblings, 1 reply; 155+ messages in thread From: Stefan Monnier @ 2016-03-21 15:06 UTC (permalink / raw) To: emacs-devel > I have elaborated on all these in my other long reply. I would just > conclude here that because both calculate-indent-function and > prog-indentation-context try to solve same problem, they are bound to > have overlapping parts. It's just that calculate-indent-function is > more general, easier to understand for prog authors and it solves > three problems at once - replacement of indent-line-function, removing > extra prog-indentation-context/prog-widen and not exposing > multi-mode complexities. STRING-BEFORE/STRING-AFTER/PREVIOUS-CHUNKS look like the main complexity (from smie.el's point of view, they all seem pretty painful to support). > Note also that the intention of the `hard-widen-limit` is to make it > work seamlessly for all existing code that use widen. While > prog-indentation-context requires to teach every mode out there to use > prog-widen instead of widen. This doesn't sound right at all. The reason is that your suggestion risks breaking code since it changes the behavior of `widen'. Maybe the breakage would be extremely limited or even not exist at all, in which case the tradeoff is probably worth it. My gut feeling is that it's too risky, but that's just my gut feeling. Note also that most modes don't bother using widen, and search&replace is pretty easy to do. But if my fear is unfounded, then indeed it's better to just change `widen' directly. Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] 2016-03-21 15:06 ` Stefan Monnier @ 2016-03-21 17:15 ` Andreas Röhler 0 siblings, 0 replies; 155+ messages in thread From: Andreas Röhler @ 2016-03-21 17:15 UTC (permalink / raw) To: emacs-devel On 21.03.2016 16:06, Stefan Monnier wrote: >> I have elaborated on all these in my other long reply. I would just >> conclude here that because both calculate-indent-function and >> prog-indentation-context try to solve same problem, they are bound to >> have overlapping parts. It's just that calculate-indent-function is >> more general, easier to understand for prog authors and it solves >> three problems at once - replacement of indent-line-function, removing >> extra prog-indentation-context/prog-widen and not exposing >> multi-mode complexities. > STRING-BEFORE/STRING-AFTER/PREVIOUS-CHUNKS look like the main complexity > (from smie.el's point of view, they all seem pretty painful to support). Would expect that. > >> Note also that the intention of the `hard-widen-limit` is to make it >> work seamlessly for all existing code that use widen. While >> prog-indentation-context requires to teach every mode out there to use >> prog-widen instead of widen. This doesn't sound right at all. > The reason is that your suggestion risks breaking code since it changes > the behavior of `widen'. > > Maybe the breakage would be extremely limited or even not exist at all, > in which case the tradeoff is probably worth it. My gut feeling is that > it's too risky, but that's just my gut feeling. > > Note also that most modes don't bother using widen, and search&replace > is pretty easy to do. But if my fear is unfounded, then indeed it's > better to just change `widen' directly. > > > Stefan > > What about if going back and reflect from the starting point, if a well defined result of narrowing would be the best - without changing code in core at all? 
When narrowing starts in the middle of a sexp, the beginning of that sexp is missing. Okay, but where is the problem? If that wasn't wanted, the narrowing was wrong; there is no need to fix this on the Emacs side. Sorry if I have not understood what's at stake...
* bug#22983: syntax-ppss returns wrong result. 2016-03-11 21:24 ` Alan Mackenzie 2016-03-11 21:35 ` Dmitry Gutov @ 2016-03-13 17:32 ` Stefan Monnier 1 sibling, 0 replies; 155+ messages in thread From: Stefan Monnier @ 2016-03-13 17:32 UTC (permalink / raw) To: Alan Mackenzie; +Cc: 22983, Dmitry Gutov > Er no, I meant what I wrote: the result of (syntax-ppss pos) must match > that of (parse-partial-sexp (point-min) pos). That's what the docstring says, but is it the result you're looking for? Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
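The contract being argued about here can be checked mechanically. The following is only a minimal sketch: the helper names `my-ppss-comparable' and `my-ppss-matches-parse-p' are hypothetical and not part of syntax.el. It blanks elements 2 and 6, which the docstring says cannot be relied upon, and compares what is left against a fresh parse from `point-min'.

```elisp
(defun my-ppss-comparable (state)
  "Return a copy of parse STATE with elements 2 and 6 set to nil.
Hypothetical helper: those two elements are excluded from the
contract, so they are blanked before comparing."
  (let ((copy (copy-sequence state)))
    (setf (nth 2 copy) nil
          (nth 6 copy) nil)
    copy))

(defun my-ppss-matches-parse-p (pos)
  "Return non-nil if `syntax-ppss' at POS agrees with a fresh parse.
The fresh parse runs from `point-min' to POS, as the docstring
describes.  Point is preserved, since both calls move it."
  (save-excursion
    (equal (my-ppss-comparable (syntax-ppss pos))
           (my-ppss-comparable (parse-partial-sexp (point-min) pos)))))
```

Evaluating `(my-ppss-matches-parse-p 40000)' after each step of the recipe from the original report would show at which step the two results diverge.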
* bug#22983: syntax-ppss returns wrong result. 2016-03-11 15:15 bug#22983: syntax-ppss returns wrong result Alan Mackenzie 2016-03-11 20:31 ` Dmitry Gutov @ 2016-03-13 18:52 ` Andreas Röhler 2016-03-13 18:56 ` Dmitry Gutov 2016-03-18 0:49 ` Dmitry Gutov [not found] ` <mailman.7307.1457709188.843.bug-gnu-emacs@gnu.org> 3 siblings, 1 reply; 155+ messages in thread From: Andreas Röhler @ 2016-03-13 18:52 UTC (permalink / raw) To: 22983 [-- Attachment #1: Type: text/plain, Size: 532 bytes --] On 11.03.2016 16:15, Alan Mackenzie wrote: > Hello, Emacs. > > The fundamental contract in syntax-ppss is that (syntax-ppss POS) > returns the same value as (parse-partial-sexp (point-min) POS) (with the > exception of elements 2 and 6). This is currently not always the case. > > In the master branch, emacs -Q and visit xdisp.c with C-x C-f. Follow > this recipe: > > M-: (syntax-ppss-flush-cache 1) > M-: (setq ppss-0 (syntax-ppss 40000)) (setq ppss-0 (syntax-ppss 40000) moved point - see attachment. Should it? [-- Attachment #2: moves-point.png --] [-- Type: image/png, Size: 125904 bytes --] ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2016-03-13 18:52 ` Andreas Röhler @ 2016-03-13 18:56 ` Dmitry Gutov 0 siblings, 0 replies; 155+ messages in thread From: Dmitry Gutov @ 2016-03-13 18:56 UTC (permalink / raw) To: Andreas Röhler, 22983 On 03/13/2016 08:52 PM, Andreas Röhler wrote: > (setq ppss-0 (syntax-ppss 40000) > > moved point - see attachment. Should it? Yes. See the last sentence in its docstring. ^ permalink raw reply [flat|nested] 155+ messages in thread
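The docstring quoted elsewhere in this thread says "Point is at POS when this function returns", so a caller that needs point left alone has to save it. A minimal sketch, assuming a hypothetical wrapper name `my-ppss-keeping-point':

```elisp
(defun my-ppss-keeping-point (pos)
  "Return the parse state at POS without moving point.
Hypothetical wrapper: `syntax-ppss' itself leaves point at POS,
so the call is wrapped in `save-excursion'."
  (save-excursion
    (syntax-ppss pos)))
```

This only hides the point movement from the caller; it does nothing about the narrowing question discussed in the rest of the thread.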
* bug#22983: syntax-ppss returns wrong result. 2016-03-11 15:15 bug#22983: syntax-ppss returns wrong result Alan Mackenzie 2016-03-11 20:31 ` Dmitry Gutov 2016-03-13 18:52 ` Andreas Röhler @ 2016-03-18 0:49 ` Dmitry Gutov 2016-03-19 12:27 ` Alan Mackenzie 2016-03-19 23:00 ` Vitalie Spinu [not found] ` <mailman.7307.1457709188.843.bug-gnu-emacs@gnu.org> 3 siblings, 2 replies; 155+ messages in thread From: Dmitry Gutov @ 2016-03-18 0:49 UTC (permalink / raw) To: Alan Mackenzie, 22983

On 03/11/2016 05:15 PM, Alan Mackenzie wrote:

This patch should make ppss-0 and ppss-1 match:

diff --git a/lisp/emacs-lisp/syntax.el b/lisp/emacs-lisp/syntax.el
index e20a210..c1b9d84 100644
--- a/lisp/emacs-lisp/syntax.el
+++ b/lisp/emacs-lisp/syntax.el
@@ -371,6 +371,11 @@ syntax-ppss-max-span
 We try to make sure that cache entries are at least this far apart
 from each other, to avoid keeping too much useless info.")

+(defvar syntax-ppss-dont-widen nil
+  "If non-nil, `syntax-ppss' will work on the non-widened buffer.
+The code that uses this should create local bindings for
+`syntax-ppss-cache' and `syntax-ppss-last' too.")
+
 (defvar syntax-begin-function nil
   "Function to move back outside of any comment/string/paren.
 This function should move the cursor back to some syntactically safe
@@ -423,12 +428,21 @@ syntax-ppss
 in the returned list (counting from 0) cannot be relied upon.
 Point is at POS when this function returns.

+If `syntax-ppss-dont-widen' is nil, the buffer is temporarily
+widened.
+
 It is necessary to call `syntax-ppss-flush-cache' explicitly if
 this function is called while `before-change-functions' is
 temporarily let-bound, or if the buffer is modified without
 running the hook."
   ;; Default values.
   (unless pos (setq pos (point)))
+  (save-restriction
+    (unless syntax-ppss-dont-widen
+      (widen))
+    (syntax-ppss--at pos)))
+
+(defun syntax-ppss--at (pos)
   (syntax-propertize pos)
   ;;
   (let ((old-ppss (cdr syntax-ppss-last))
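To make the intended behaviour of the patch concrete, here is a minimal sketch of the same widen-unless-told-otherwise idea written as a wrapper, so that it can be tried without patching syntax.el. The names `my-ppss-dont-widen' and `my-ppss-maybe-widen' are hypothetical stand-ins, not the variable or function from the patch.

```elisp
(defvar my-ppss-dont-widen nil
  "Hypothetical stand-in for the `syntax-ppss-dont-widen' proposed above.
When nil, `my-ppss-maybe-widen' widens temporarily; when non-nil
it respects the current narrowing.")

(defun my-ppss-maybe-widen (pos)
  "Return the parse state at POS, widening unless told not to.
The restriction in effect on entry is restored on exit.  Point is
left at POS, as with `syntax-ppss'."
  (save-restriction
    (unless my-ppss-dont-widen
      (widen))
    (syntax-ppss pos)))
```

A multiple-major-mode package would `let'-bind the variable to t around its narrowed chunk. As the patch's docstring says, such a caller should also give the cache variables their own bindings, which this sketch does not attempt.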
* bug#22983: syntax-ppss returns wrong result. 2016-03-18 0:49 ` Dmitry Gutov @ 2016-03-19 12:27 ` Alan Mackenzie 2016-03-19 18:47 ` Dmitry Gutov 2016-03-19 23:16 ` Vitalie Spinu 2016-03-19 23:00 ` Vitalie Spinu 1 sibling, 2 replies; 155+ messages in thread From: Alan Mackenzie @ 2016-03-19 12:27 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 22983 Hello, Dmitry. On Fri, Mar 18, 2016 at 02:49:34AM +0200, Dmitry Gutov wrote: > On 03/11/2016 05:15 PM, Alan Mackenzie wrote: > This patch should make ppss-0 and ppss-1 match: OK, no bad thing! But seeing that the function is a new function (its specification has changed), it will need new test cases, fresh new attempts to break it. > diff --git a/lisp/emacs-lisp/syntax.el b/lisp/emacs-lisp/syntax.el > index e20a210..c1b9d84 100644 > --- a/lisp/emacs-lisp/syntax.el > +++ b/lisp/emacs-lisp/syntax.el > @@ -371,6 +371,11 @@ syntax-ppss-max-span > We try to make sure that cache entries are at least this far apart > from each other, to avoid keeping too much useless info.") > +(defvar syntax-ppss-dont-widen nil > + "If non-nil, `syntax-ppss' will work on the non-widened buffer. > +The code that uses this should create local bindings for > +`syntax-ppss-cache' and `syntax-ppss-last' too.") > + I'm against this bit. If syntax-ppss-dont-widen is non-nil, the buffer is narrowed, and the local cache variables are correctly bound and filled, then something at a low level is going to widen the buffer (and call back_comment) without knowing to restore the global bindings for those cache variables. This could easily give the wrong result and corrupt the locally bound cache. I think the only sensible functionality for syntax-ppss is to be equivalent to (parse-partial-sexp 1 pos). Then everybody knows where they stand. Those pieces of code which actually need a ppss cache with origin other than 1 could then use a more appropriate specialized function whose cache wouldn't get mixed up with syntax-ppss's. 
(It could share a lot of code with syntax-ppss).

>  (defvar syntax-begin-function nil
>    "Function to move back outside of any comment/string/paren.
>  This function should move the cursor back to some syntactically safe
> @@ -423,12 +428,21 @@ syntax-ppss
>  in the returned list (counting from 0) cannot be relied upon.
>  Point is at POS when this function returns.
>
> +If `syntax-ppss-dont-widen' is nil, the buffer is temporarily
> +widened.
> +
>  It is necessary to call `syntax-ppss-flush-cache' explicitly if
>  this function is called while `before-change-functions' is
>  temporarily let-bound, or if the buffer is modified without
>  running the hook."
>    ;; Default values.
>    (unless pos (setq pos (point)))
> +  (save-restriction
> +    (unless syntax-ppss-dont-widen
> +      (widen))
> +    (syntax-ppss--at pos)))
> +
> +(defun syntax-ppss--at (pos)
>    (syntax-propertize pos)
>    ;;
>    (let ((old-ppss (cdr syntax-ppss-last))

-- 
Alan Mackenzie (Nuremberg, Germany).
* bug#22983: syntax-ppss returns wrong result. 2016-03-19 12:27 ` Alan Mackenzie @ 2016-03-19 18:47 ` Dmitry Gutov 2016-03-27 0:51 ` John Wiegley 2016-03-19 23:16 ` Vitalie Spinu 1 sibling, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2016-03-19 18:47 UTC (permalink / raw) To: Alan Mackenzie; +Cc: 22983 On 03/19/2016 02:27 PM, Alan Mackenzie wrote: > OK, no bad thing! Good. > But seeing that the function is a new function (its specification has > changed), it will need new test cases, fresh new attempts to break it. Sure, please go ahead. It needed new test cases even before this miraculous transformation. >> +(defvar syntax-ppss-dont-widen nil >> + "If non-nil, `syntax-ppss' will work on the non-widened buffer. >> +The code that uses this should create local bindings for >> +`syntax-ppss-cache' and `syntax-ppss-last' too.") >> + > > I'm against this bit. I'm not married to it, but at least it would provide a backward compatibility escape hatch for a while. If a new way of handling mixed modes is added and turns out to be satisfactory, we can remove this variable later. > If syntax-ppss-dont-widen is non-nil, the buffer > is narrowed, and the local cache variables are correctly bound and > filled, then something at a low level is going to widen the buffer (and > call back_comment) without knowing to restore the global bindings for > those cache variables. When and why would that happen? I do not recall that happening before. Since the "low level" is a bounded set, we should be able to make sure that the primitives do not, in fact, widen before calling syntax-ppss. I suppose some could widen afterward. > This could easily give the wrong result and > corrupt the locally bound cache. Even so, that would only affect the local cache, and as such, only the subregions, in the case of mixed-mode usage. 
In the general case, it would only affect the consumers of syntax-ppss that bound syntax-ppss-dont-widen, as long as they bound the cache variables as well, which we tell them to. That lowers the damage area considerably. > I think the only sensible functionality for syntax-ppss is to be > equivalent to (parse-partial-sexp 1 pos). Then everybody knows where > they stand. Those pieces of code which actually need a ppss cache with > origin other than 1 could then use a more appropriate specialized > function whose cache wouldn't get mixed up with syntax-ppss's. (It > could share a lot of code with syntax-ppss). They already use syntax-ppss. I imagine Emacs's backward compatibility policy has something to say about that. ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2016-03-19 18:47 ` Dmitry Gutov @ 2016-03-27 0:51 ` John Wiegley 2016-03-27 1:14 ` Dmitry Gutov 0 siblings, 1 reply; 155+ messages in thread From: John Wiegley @ 2016-03-27 0:51 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Alan Mackenzie, 22983 [-- Attachment #1: Type: text/plain, Size: 1571 bytes --] >>>>> Dmitry Gutov <dgutov@yandex.ru> writes: >> I think the only sensible functionality for syntax-ppss is to be equivalent >> to (parse-partial-sexp 1 pos). Then everybody knows where they stand. Those >> pieces of code which actually need a ppss cache with origin other than 1 >> could then use a more appropriate specialized function whose cache wouldn't >> get mixed up with syntax-ppss's. (It could share a lot of code with >> syntax-ppss). > They already use syntax-ppss. I imagine Emacs's backward compatibility > policy has something to say about that. There are times when our backward compatibility policy must bend, and even break. Specifically, we have a few existing cases where incomplete code has or will be shipped in a release. The argument for doing so has often been, "So we can see what users think." But if we *also* say that once it is released and people start using, we can't change it, then it's a Catch-22. syntax-ppss needs more work, that seems to be fairly clear based on the volume of discussion around this feature, and bugs like this one. Therefore, since it is not solid yet I'm not willing to let existing dependencies prevent us from fixing its flaws. When a feature becomes solid and true, like lexical-binding, that's when I become incredibly reticent to make any changes whatsoever -- without the convergence of all the planets and the moons. -- John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 629 bytes --] ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2016-03-27 0:51 ` John Wiegley @ 2016-03-27 1:14 ` Dmitry Gutov 2016-04-03 22:58 ` John Wiegley 0 siblings, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2016-03-27 1:14 UTC (permalink / raw) To: John Wiegley; +Cc: Alan Mackenzie, 22983

On 03/27/2016 02:51 AM, John Wiegley wrote:

> syntax-ppss needs more work, that seems to be fairly clear based on the volume
> of discussion around this feature, and bugs like this one.

Bugs, plural? Alan has filed just one so far, and I've posted the trivial patch.

> Therefore, since it
> is not solid yet I'm not willing to let existing dependencies prevent us from
> fixing its flaws.

The aforementioned patch both fixes the bug and allows syntax-ppss to continue to be used in the fashion I've mentioned previously.

The question that's holding it, as far as I'm concerned, is whether the "hard narrowing" discussion reaches a satisfying conclusion. If it does, we won't really need syntax-ppss-dont-widen.

> When a feature becomes solid and true, like lexical-binding, that's when I
> become incredibly reticent to make any changes whatsoever -- without the
> convergence of all the planets and the moons.

I've never said anything about avoiding making changes to it. But when we do that, we usually try to accommodate the existing uses (the importance of which depends on how many uses there are out there).
* bug#22983: syntax-ppss returns wrong result. 2016-03-27 1:14 ` Dmitry Gutov @ 2016-04-03 22:58 ` John Wiegley 2016-04-03 23:15 ` Dmitry Gutov 0 siblings, 1 reply; 155+ messages in thread From: John Wiegley @ 2016-04-03 22:58 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Alan Mackenzie, 22983

>>>>> Dmitry Gutov <dgutov@yandex.ru> writes:

> The aforementioned patch both fixes the bug and allows syntax-ppss to
> continue to be used in the fashion I've mentioned previously.
> The question that's holding it, as far as I'm concerned, is whether the
> "hard narrowing" discussion reaches a satisfying conclusion. If it does, we
> won't really need syntax-ppss-dont-widen.

Have things reached a satisfactory conclusion now? It was hard for me to tell by the end of this thread.

> I've never said anything about avoiding making changes to it. But when we do
> that, we usually try to accommodate the existing uses (the importance of
> which depends on how many uses there are out there).

Sure, though its experimental nature does get taken into account. If a thing is wrong, I'm not interested in accommodating existing workarounds to its wrongness.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2
* bug#22983: syntax-ppss returns wrong result. 2016-04-03 22:58 ` John Wiegley @ 2016-04-03 23:15 ` Dmitry Gutov 2017-09-02 13:12 ` Eli Zaretskii 0 siblings, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2016-04-03 23:15 UTC (permalink / raw) To: John Wiegley; +Cc: Alan Mackenzie, 22983

On 04/04/2016 01:58 AM, John Wiegley wrote:

> Have things reached a satisfactory conclusion now? It was hard for me to tell
> by the end of this thread.

It's a separate discussion, see http://lists.gnu.org/archive/html/emacs-devel/2016-03/msg01576.html

> Sure, though its experimental nature does get taken into account. If a thing
> is wrong, I'm not interested in accommodating existing workarounds to its
> wrongness.

What experimental nature?
* bug#22983: syntax-ppss returns wrong result. 2016-04-03 23:15 ` Dmitry Gutov @ 2017-09-02 13:12 ` Eli Zaretskii 2017-09-02 17:40 ` Alan Mackenzie 0 siblings, 1 reply; 155+ messages in thread From: Eli Zaretskii @ 2017-09-02 13:12 UTC (permalink / raw) To: Dmitry Gutov; +Cc: jwiegley, acm, 22983

unblock 24655 by 22983
thanks

> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Mon, 4 Apr 2016 02:15:50 +0300
> Cc: Alan Mackenzie <acm@muc.de>, 22983@debbugs.gnu.org
>
> On 04/04/2016 01:58 AM, John Wiegley wrote:
>
> > Have things reached a satisfactory conclusion now? It was hard for me to tell
> > by the end of this thread.
>
> It's a separate discussion, see
> http://lists.gnu.org/archive/html/emacs-devel/2016-03/msg01576.html
>
> > Sure, though its experimental nature does get taken into account. If a thing
> > is wrong, I'm not interested in accommodating existing workarounds to its
> > wrongness.
>
> What experimental nature?

It doesn't sound like this discussion is leading anywhere, and since almost 1.5 years have passed with no comments, I guess this bug doesn't need to block the release of Emacs 26.1, at least.

Thanks.
* bug#22983: syntax-ppss returns wrong result. 2017-09-02 13:12 ` Eli Zaretskii @ 2017-09-02 17:40 ` Alan Mackenzie 2017-09-02 17:53 ` Eli Zaretskii ` (2 more replies) 0 siblings, 3 replies; 155+ messages in thread From: Alan Mackenzie @ 2017-09-02 17:40 UTC (permalink / raw) To: Eli Zaretskii; +Cc: jwiegley, Dmitry Gutov, 22983 Hello, Eli. On Sat, Sep 02, 2017 at 16:12:48 +0300, Eli Zaretskii wrote: > unblock 24655 by 22983 > thanks > > From: Dmitry Gutov <dgutov@yandex.ru> > > Date: Mon, 4 Apr 2016 02:15:50 +0300 > > Cc: Alan Mackenzie <acm@muc.de>, 22983@debbugs.gnu.org > > On 04/04/2016 01:58 AM, John Wiegley wrote: > > > Have things reached a satisfactory conclusion now? It was hard for me to tell > > > by the end of this thread. > > It's a separate discussion, see > > http://lists.gnu.org/archive/html/emacs-devel/2016-03/msg01576.html > > > Sure, though it's experimental nature does get taken into account. If a thing > > > is wrong, I'm not interested in accommodating existing workarounds to its > > > wrongness. > > What experimental nature? > It doesn't sound like this discussion is leading anywhere, and since > almost 1.5 years has passed with no comments, I guess this bug doesn't > need to block the release of Emacs 26.1, at least. I'm not happy about this. 22983 is a serious design flaw, which has had deleterious effects deep within Emacs. One recorded example, resulting in an infinite loop, is: ######################################################################### From: Philipp Stephani <p.stephani2@gmail.com> To: emacs-devel@gnu.org Subject: [PATCH] Protect against an infloop in python-mode Date: Tue, 28 Feb 2017 22:31:49 +0100 There appears to be an edge case caused by using `syntax-ppss' in a narrowed buffer during JIT lock inside of Python triple-quote strings. Unfortunately it is impossible to reproduce without manually destroying the syntactic information in the Python buffer, but it has been observed in practice. 
In that case it can happen that the syntax caches get sufficiently out of whack so that there appear to be overlapping strings in the buffer. As Python has no nested strings, this situation is impossible and leads to an infloop in `python-nav-end-of-statement'. Protect against this by checking whether the search for the end of the current string makes progress. ######################################################################### In this case, Philipp had to apply a workaround. Seeing as how Stefan is not prepared to take responsibility for his own bugs, I suggest that I fix it, something I really don't want to spend time on. Before I do start spending time on it, I would like some assurance that my fix will not be blocked or reverted (both have happened to other things in the core I've worked on), and that I will have a reasonable amount of time to get the job done (a few weeks) before any freeze for Emacs 25.3 or 26 comes into force. > Thanks. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 155+ messages in thread
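The kind of protection Philipp describes, checking that each iteration makes progress, can be sketched generically. This is not the code from his patch to `python-nav-end-of-statement'; the function name `my-step-with-progress-check' and its arguments are hypothetical, chosen only to illustrate the guard.

```elisp
(defun my-step-with-progress-check (step limit)
  "Call STEP repeatedly until point reaches LIMIT.
STEP is a function of no arguments that is expected to move point
forward.  Signal an error as soon as one call fails to advance
point, instead of looping forever on inconsistent syntax data."
  (while (< (point) limit)
    (let ((before (point)))
      (funcall step)
      (unless (> (point) before)
        (error "No progress at position %d" before)))))
```

A guard like this turns a hang into a visible error, but it remains a workaround: the inconsistent cache that caused the lack of progress is still there.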
* bug#22983: syntax-ppss returns wrong result. 2017-09-02 17:40 ` Alan Mackenzie @ 2017-09-02 17:53 ` Eli Zaretskii 2017-09-03 20:44 ` John Wiegley 2017-09-04 23:34 ` Dmitry Gutov 2 siblings, 0 replies; 155+ messages in thread From: Eli Zaretskii @ 2017-09-02 17:53 UTC (permalink / raw) To: Alan Mackenzie; +Cc: jwiegley, dgutov, 22983 > Date: Sat, 2 Sep 2017 17:40:27 +0000 > Cc: Dmitry Gutov <dgutov@yandex.ru>, jwiegley@gmail.com, 22983@debbugs.gnu.org > From: Alan Mackenzie <acm@muc.de> > > > It doesn't sound like this discussion is leading anywhere, and since > > almost 1.5 years has passed with no comments, I guess this bug doesn't > > need to block the release of Emacs 26.1, at least. > > I'm not happy about this. 22983 is a serious design flaw, which has had > deleterious effects deep within Emacs. I didn't close the bug, mind you. I just removed it from the list of those blocking the impending release. You, or anyone else, are free to work on fixing it and/or discuss the various approaches to dealing with this issue. > Before I do start spending time on it, I would like some assurance > that my fix will not be blocked or reverted (both have happened to > other things in the core I've worked on) I doubt that anyone could give you such a promise without seeing the proposed changes. Especially since this and related issues, and solutions proposed for them, already have some history of being controversial. > and that I will have a reasonable amount of time to get the job done > (a few weeks) before any freeze for Emacs 25.3 or 26 comes into > force. That I can promise you. Feature freeze doesn't affect bugfixes, and Emacs 26.1 is not going to be released tomorrow or the next week. ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2017-09-02 17:40 ` Alan Mackenzie 2017-09-02 17:53 ` Eli Zaretskii @ 2017-09-03 20:44 ` John Wiegley 2017-09-04 23:34 ` Dmitry Gutov 2 siblings, 0 replies; 155+ messages in thread From: John Wiegley @ 2017-09-03 20:44 UTC (permalink / raw) To: Alan Mackenzie; +Cc: 22983, Dmitry Gutov >>>>> Alan Mackenzie <acm@muc.de> writes: > Seeing as how Stefan is not prepared to take responsibility for his own bugs Let's not use language like this if avoidable, please. I'm certain Stefan would do so, he just may not see this issue the same way you do (which is what I recall from the extensive discussions on syntax-ppss). To characterize it as a fault is only discouraging or frustrating; it doesn't help Emacs. Thanks, -- John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2 ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2017-09-02 17:40 ` Alan Mackenzie 2017-09-02 17:53 ` Eli Zaretskii 2017-09-03 20:44 ` John Wiegley @ 2017-09-04 23:34 ` Dmitry Gutov 2017-09-05 6:57 ` Andreas Röhler ` (2 more replies) 2 siblings, 3 replies; 155+ messages in thread From: Dmitry Gutov @ 2017-09-04 23:34 UTC (permalink / raw) To: Alan Mackenzie, Eli Zaretskii; +Cc: jwiegley, Philipp Stephani, 22983 On 9/2/17 8:40 PM, Alan Mackenzie wrote: > I'm not happy about this. 22983 is a serious design flaw, which has had > deleterious effects deep within Emacs. I'm sure we want to fix design flaws. As long as there is a solid plan that does not swap one flaw for another. > One recorded example, resulting > in an infinite loop, is: > > ######################################################################### > From: Philipp Stephani <p.stephani2@gmail.com> > To: emacs-devel@gnu.org > Subject: [PATCH] Protect against an infloop in python-mode > Date: Tue, 28 Feb 2017 22:31:49 +0100 > > There appears to be an edge case caused by using `syntax-ppss' in a > narrowed buffer during JIT lock inside of Python triple-quote strings. > Unfortunately it is impossible to reproduce without manually > destroying the syntactic information in the Python buffer, but it has > been observed in practice. In that case it can happen that the syntax > caches get sufficiently out of whack so that there appear to be > overlapping strings in the buffer. As Python has no nested strings, > this situation is impossible and leads to an infloop in > `python-nav-end-of-statement'. Protect against this by checking > whether the search for the end of the current string makes progress. > ######################################################################### > > In this case, Philipp had to apply a workaround. The problem manifested during jit-lock. Do we understand why the (widen) call inside font-lock-default-fontify-region didn't help? ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2017-09-04 23:34 ` Dmitry Gutov @ 2017-09-05 6:57 ` Andreas Röhler 2017-09-05 12:28 ` John Wiegley 2017-09-07 17:56 ` Alan Mackenzie 2 siblings, 0 replies; 155+ messages in thread From: Andreas Röhler @ 2017-09-05 6:57 UTC (permalink / raw) To: 22983 [-- Attachment #1: Type: text/plain, Size: 1740 bytes --] On 05.09.2017 01:34, Dmitry Gutov wrote: > On 9/2/17 8:40 PM, Alan Mackenzie wrote: >> I'm not happy about this. 22983 is a serious design flaw, which has had >> deleterious effects deep within Emacs. > > I'm sure we want to fix design flaws. As long as there is a solid plan > that does not swap one flaw for another. > >> One recorded example, resulting >> in an infinite loop, is: >> >> ######################################################################### >> >> From: Philipp Stephani <p.stephani2@gmail.com> >> To: emacs-devel@gnu.org >> Subject: [PATCH] Protect against an infloop in python-mode >> Date: Tue, 28 Feb 2017 22:31:49 +0100 >> >> There appears to be an edge case caused by using `syntax-ppss' in a >> narrowed buffer during JIT lock inside of Python triple-quote strings. >> Unfortunately it is impossible to reproduce without manually >> destroying the syntactic information in the Python buffer, but it has >> been observed in practice. In that case it can happen that the syntax >> caches get sufficiently out of whack so that there appear to be >> overlapping strings in the buffer. As Python has no nested strings, >> this situation is impossible and leads to an infloop in >> `python-nav-end-of-statement'. Protect against this by checking >> whether the search for the end of the current string makes progress. >> ######################################################################### >> >> >> In this case, Philipp had to apply a workaround. > > The problem manifested during jit-lock. Do we understand why the > (widen) call inside font-lock-default-fontify-region didn't help? 
IIRC it's about dissolving circular dependencies, notably between syntax-propertize-function and syntax-ppss.
* bug#22983: syntax-ppss returns wrong result. 2017-09-04 23:34 ` Dmitry Gutov 2017-09-05 6:57 ` Andreas Röhler @ 2017-09-05 12:28 ` John Wiegley 2017-09-07 20:45 ` Alan Mackenzie 2017-09-07 17:56 ` Alan Mackenzie 2 siblings, 1 reply; 155+ messages in thread From: John Wiegley @ 2017-09-05 12:28 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Alan Mackenzie, Philipp Stephani, 22983 >>>>> Dmitry Gutov <dgutov@yandex.ru> writes: > I'm sure we want to fix design flaws. As long as there is a solid plan that > does not swap one flaw for another. Can we have a summary of the current proposal(s) on the table? It would help to clarify, rather than navigating past discussions. Alan has told me that this issue is affecting people and has been outstanding for some time; I'd like to get a better idea of its seriousness/scope, and what effect the available solutions would have (as Dmitry says, we don't want to replace one flaw with another). -- John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2 ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2017-09-05 12:28 ` John Wiegley @ 2017-09-07 20:45 ` Alan Mackenzie 2017-09-08 16:04 ` Andreas Röhler 2017-09-09 9:44 ` Dmitry Gutov 0 siblings, 2 replies; 155+ messages in thread From: Alan Mackenzie @ 2017-09-07 20:45 UTC (permalink / raw) To: John Wiegley; +Cc: 22983, Philipp Stephani, Dmitry Gutov

Hello, John.

On Tue, Sep 05, 2017 at 13:28:52 +0100, John Wiegley wrote:
> >>>>> Dmitry Gutov <dgutov@yandex.ru> writes:

> > I'm sure we want to fix design flaws. As long as there is a solid plan that
> > does not swap one flaw for another.

> Can we have a summary of the current proposal(s) on the table? It would help
> to clarify, rather than navigating past discussions. Alan has told me that
> this issue is affecting people and has been outstanding for some time; I'd
> like to get a better idea of its seriousness/scope, and what effect the
> available solutions would have (as Dmitry says, we don't want to replace one
> flaw with another).

First, I think it's worthwhile emphasising what the function purports to do:

    syntax-ppss is a compiled Lisp function in `syntax.el'.

    (syntax-ppss &optional POS)

    Parse-Partial-Sexp State at POS, defaulting to point.
    The returned value is the same as that of `parse-partial-sexp'
    run from `point-min' to POS except that values at positions 2 and 6
    in the returned list (counting from 0) cannot be relied upon.
    Point is at POS when this function returns.

The solution I propose is to introduce a second cache into syntax-ppss, and this cache would be used whenever (not (eq (point-min) 1)). Whenever point-min changes, and isn't 1, this second cache would be calculated again from scratch.

This proposal has these advantages:

(i) It would make the function deliver what its unchanged doc string says. This is important, given that syntax-ppss has been very widely used within Emacs, and likely by external packages too; these will typically have assumed the advertised behaviour of the function, without having tested it in narrowed buffers.

(ii) In the case which currently works, namely a non-narrowed buffer, there would be only a minute slow-down (basically, there would be extra code to check point-min and select the cache to use).

(iii) The cache for use in a narrowed buffer might well be sufficiently fast in normal use. If it is not, it could be enhanced readily.

I think Dmitry also proposed a method of solution some months ago, though I don't remember in detail what it was. Dmitry, do you still think your solution would work? If so, please elaborate on it.

> -- 
> John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
> http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2

-- 
Alan Mackenzie (Nuremberg, Germany).
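The two-cache proposal above can be sketched outside syntax.el. This is only an illustration under stated assumptions: all `my-' names are hypothetical, the "second cache" holds just the most recent (POS . STATE) pair, and it is thrown away whenever `point-min' or the buffer text changes. Unlike `syntax-ppss', the narrowed path does not call `syntax-propertize'.

```elisp
(defvar-local my-ppss-narrow-key nil
  "Cons (POINT-MIN . TICK) for which `my-ppss-narrow-last' is valid.")

(defvar-local my-ppss-narrow-last nil
  "Cons (POS . STATE) from the most recent narrowed parse, or nil.")

(defun my-ppss-narrowed (pos)
  "Parse from the narrowed `point-min' to POS using the second cache.
Point is left at POS."
  (let ((key (cons (point-min) (buffer-chars-modified-tick))))
    ;; Start from scratch whenever point-min or the text has changed.
    (unless (equal key my-ppss-narrow-key)
      (setq my-ppss-narrow-key key
            my-ppss-narrow-last nil)))
  (let* ((last my-ppss-narrow-last)
         (state (if (and last (<= (car last) pos))
                    ;; Continue forward from the cached state.
                    (parse-partial-sexp (car last) pos nil nil (cdr last))
                  ;; No usable entry: parse from the narrowed start.
                  (parse-partial-sexp (point-min) pos))))
    (setq my-ppss-narrow-last (cons pos state))
    state))

(defun my-ppss-two-caches (pos)
  "Return the parse state at POS relative to the current `point-min'.
Use `syntax-ppss' and its own cache when the buffer starts at 1,
and the separate narrowed cache otherwise."
  (if (eq (point-min) 1)
      (syntax-ppss pos)
    (my-ppss-narrowed pos)))
```

The point of the design is that the narrowed parse never reads or writes the cache used for the whole buffer, so results computed under one `point-min' cannot leak into parses under another.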
* bug#22983: syntax-ppss returns wrong result. 2017-09-07 20:45 ` Alan Mackenzie @ 2017-09-08 16:04 ` Andreas Röhler 2017-09-10 18:26 ` Alan Mackenzie 2017-09-09 9:44 ` Dmitry Gutov 1 sibling, 1 reply; 155+ messages in thread From: Andreas Röhler @ 2017-09-08 16:04 UTC (permalink / raw) To: 22983; +Cc: Alan Mackenzie, John Wiegley [-- Attachment #1: Type: text/plain, Size: 2958 bytes --] On 07.09.2017 22:45, Alan Mackenzie wrote: > Hello, John. > > On Tue, Sep 05, 2017 at 13:28:52 +0100, John Wiegley wrote: >>>>>>> Dmitry Gutov <dgutov@yandex.ru> writes: >>> I'm sure we want to fix design flaws. As long as there is a solid plan that >>> does not swap one flaw for another. >> Can we have a summary of the current proposal(s) on the table? It would help >> to clarify, rather than navigating past discussions. Alan has told me that >> this issue is affecting people and has been outstanding for some time; I'd >> like to get a better idea of its seriousness/scope, and what effect the >> available solutions would have (as Dmitry says, we don't want to replace one >> flaw with another). > First, I think it's worthwhile emphasising what the function purports to > do: > > syntax-ppss is a compiled Lisp function in `syntax.el'. > > (syntax-ppss &optional POS) > > Parse-Partial-Sexp State at POS, defaulting to point. > The returned value is the same as that of `parse-partial-sexp' > run from `point-min' to POS except that values at positions 2 and 6 > in the returned list (counting from 0) cannot be relied upon. > Point is at POS when this function returns. > > The solution I propose is to introduce a second cache into syntax-ppss, > and this cache would be used whenever (not (eq (point-min) 1)). > Whenever point-min changes, and isn't 1, this second cached would be > calculated again from scratch. > > This proposal has these advantages: > > (i) It would make the function deliver what its unchanged doc string > says. 
This is important, given that syntax-ppss has been very widely > used within Emacs, and likely by external packages too; these will > typically have assumed the advertised behaviour of the function, without > having tested it in narrowed buffers. > > (i) In the case which currently works, namely a non-narrowed buffer, > there would be only a minute slow-down (basically, there would be extra > code to check point-min and select the cache to use). > > (ii) The cache for use in a narrowed buffer might well be sufficiently > fast in normal use. If it is not, it could be enhanced readily. > > I think Dmitry also proposed a method of solution some months ago, > though I don't remember in detail what it was. Dmitry, do you still > think your solution would work? If so, please elaborate on it. > >> -- >> John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F >> http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2 Hi Alan and all, assume a complex matter behind, a bunch of bugs resp. design issues, not a single one. Fixing this would affect syntax-propertize, parse-partial-sexp, syntax-ppss and font-lock stuff at once. http://lists.gnu.org/archive/html/emacs-devel/2016-03/msg01576.html points at some spot. There should be more. As a first step listing referential tests including benchmarks should be helpful. [-- Attachment #2: Type: text/html, Size: 4540 bytes --] ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2017-09-08 16:04 ` Andreas Röhler @ 2017-09-10 18:26 ` Alan Mackenzie 0 siblings, 0 replies; 155+ messages in thread From: Alan Mackenzie @ 2017-09-10 18:26 UTC (permalink / raw) To: Andreas Röhler; +Cc: 22983 Hello, Andreas. On Fri, Sep 08, 2017 at 18:04:37 +0200, Andreas Röhler wrote: > On 07.09.2017 22:45, Alan Mackenzie wrote: > > The solution I propose is to introduce a second cache into syntax-ppss, > > and this cache would be used whenever (not (eq (point-min) 1)). > > Whenever point-min changes, and isn't 1, this second cached would be > > calculated again from scratch. > > This proposal has these advantages: > > (i) It would make the function deliver what its unchanged doc string > > says. This is important, given that syntax-ppss has been very widely > > used within Emacs, and likely by external packages too; these will > > typically have assumed the advertised behaviour of the function, without > > having tested it in narrowed buffers. > > (i) In the case which currently works, namely a non-narrowed buffer, > > there would be only a minute slow-down (basically, there would be extra > > code to check point-min and select the cache to use). > > (ii) The cache for use in a narrowed buffer might well be sufficiently > > fast in normal use. If it is not, it could be enhanced readily. > Hi Alan and all, > assume a complex matter behind, a bunch of bugs resp. design issues, not > a single one. I don't think this bug is _that_ complex, and even if it has associated bugs, I think we can fix it on its own. > Fixing this would affect syntax-propertize, parse-partial-sexp, > syntax-ppss and font-lock stuff at once. I'll give you one out of four. ;-) syntax-ppss will definitely be affected, parse-partial-sexp definitely not, and the other two possibly in corner cases, but hopefully not. > http://lists.gnu.org/archive/html/emacs-devel/2016-03/msg01576.html > points at some spot. There should be more. 
I think, at least I hope, that is an orthogonal issue. > As a first step listing referential tests including benchmarks should be > helpful. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2017-09-07 20:45 ` Alan Mackenzie 2017-09-08 16:04 ` Andreas Röhler @ 2017-09-09 9:44 ` Dmitry Gutov 2017-09-09 10:20 ` Alan Mackenzie 2017-09-10 11:36 ` bug#22983: [ Patch ] " Alan Mackenzie 1 sibling, 2 replies; 155+ messages in thread From: Dmitry Gutov @ 2017-09-09 9:44 UTC (permalink / raw) To: Alan Mackenzie, John Wiegley; +Cc: 22983, Philipp Stephani Hi Alan, On 9/7/17 11:45 PM, Alan Mackenzie wrote: > The solution I propose is to introduce a second cache into syntax-ppss, > and this cache would be used whenever (not (eq (point-min) 1)). > Whenever point-min changes, and isn't 1, this second cached would be > calculated again from scratch. Thanks for writing this up. I think it's a good step, and since it follows the current wording of the docstring, it should be highly compatible with the existing code. > This proposal has these advantages: > > (i) It would make the function deliver what its unchanged doc string > says. This is important, given that syntax-ppss has been very widely > used within Emacs, and likely by external packages too; these will > typically have assumed the advertised behaviour of the function, without > having tested it in narrowed buffers. It will also continue to function as expected in mmm-mode, AFAICT, without the need for an "escape hatch" we discussed before. > (i) In the case which currently works, namely a non-narrowed buffer, > there would be only a minute slow-down (basically, there would be extra > code to check point-min and select the cache to use). > > (ii) The cache for use in a narrowed buffer might well be sufficiently > fast in normal use. If it is not, it could be enhanced readily. And since the API doesn't change, and the observable behavior doesn't either (in the vast majority of cases; probably all except the broken ones), we can refine this solution easily, or even swap it for something else, with little cost.
> I think Dmitry also proposed a method of solution some months ago, > though I don't remember in detail what it was. Dmitry, do you still > think your solution would work? If so, please elaborate on it. There is a simple patch at https://debbugs.gnu.org/cgi/bugreport.cgi?bug=22983#47, but after some consideration, I now prefer your proposed approach. We've also had some grander ideas about enhancing things further, but those can be added later, after we finally decide. I do want to know what Stefan thinks of this subject now, though. Caveats: - This solves the dependency on point-min, but does nothing about the dependency on the current syntax-table (which can change). I'm not necessarily suggesting we try to solve that now, though. - Before this change is pushed to master, or shortly after, I'd like to know that it actually fixed the problem Philipp experienced with python-mode, so we can revert 4fbd330. If it was caused by e.g. syntax-table changing, we've not improved much. All the best, Dmitry. ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2017-09-09 9:44 ` Dmitry Gutov @ 2017-09-09 10:20 ` Alan Mackenzie 2017-09-09 12:18 ` Dmitry Gutov 2017-09-10 11:36 ` bug#22983: [ Patch ] " Alan Mackenzie 1 sibling, 1 reply; 155+ messages in thread From: Alan Mackenzie @ 2017-09-09 10:20 UTC (permalink / raw) To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983 Hello, Dmitry. On Sat, Sep 09, 2017 at 12:44:02 +0300, Dmitry Gutov wrote: > Hi Alan, > On 9/7/17 11:45 PM, Alan Mackenzie wrote: > > The solution I propose is to introduce a second cache into syntax-ppss, > > and this cache would be used whenever (not (eq (point-min) 1)). > > Whenever point-min changes, and isn't 1, this second cached would be > > calculated again from scratch. > Thanks for writing this up. I think it's a good step, and since it > follow the current wording of the docstring, it should be highly > compatible with the existing code. Thanks. > > This proposal has these advantages: > > (i) It would make the function deliver what its unchanged doc string > > says. This is important, given that syntax-ppss has been very widely > > used within Emacs, and likely by external packages too; these will > > typically have assumed the advertised behaviour of the function, without > > having tested it in narrowed buffers. > It will also continue to function as expected in mmm-mode, AFAICT, > without the need for an "escape hatch" we discussed before. > > (i) In the case which currently works, namely a non-narrowed buffer, > > there would be only a minute slow-down (basically, there would be extra > > code to check point-min and select the cache to use). > > (ii) The cache for use in a narrowed buffer might well be sufficiently > > fast in normal use. If it is not, it could be enhanced readily. 
> And since the API doesn't change, and the observable behavior doesn't > either (in the vast majority of cases; probably all except the broken > ones), we can refine this solution easily, or even swap it for something > else, with little cost. Yes. I now have a provisional implementation of this new strategy, which works on the test case for xdisp.c with which I opened the bug. It seems to be working, generally. I need to test it more thoroughly. In the implementation, I have left the function `syntax-ppss' untouched except for adding a function call to set up the cache right at the start. I have refactored syntax-ppss-flush-cache, extracting a function which is called directly from the cache-selecting code. Other than that, there is one new function (which switches the current cache in use) and a few new variables to keep track of the caches. > > I think Dmitry also proposed a method of solution some months ago, > > though I don't remember in detail what it was. Dmitry, do you still > > think your solution would work? If so, please elaborate on it. > There is a simple patch at > https://debbugs.gnu.org/cgi/bugreport.cgi?bug=22983#47, but I after some > consideration, I now prefer your proposed approach. We've also had some > grander ideas about enhancing things further, but those can be added > later, after we finally decide. Yes, I agree. > I do want to know what Stefan thinks of this subject now, though. Yes. > Caveats: > - This solves the dependency on point-min, but does nothing about the > dependency on the current syntax-table (which can change). I'm not > necessarily suggesting we try to solve that now, though. I had some ideas on this back in the spring (about having "indirect variables") which could be used quickly to "swap out" the current syntax-table text properties, and (more importantly) quickly swap them back in. But that's for another day. 
> - Before this change is pushed to master, or shortly after, I'd like to > know that it actually fixed the problem Philipp experienced with > python-mode, so we can revert 4fbd330. If it was caused by e.g. > syntax-table changing, we've not improved much. I am naturally interested in this, too. If my patch doesn't fix this bug, at least it will have removed a layer of fog inhibiting its investigation. > All the best, > Dmitry. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2017-09-09 10:20 ` Alan Mackenzie @ 2017-09-09 12:18 ` Dmitry Gutov 2017-09-10 11:42 ` Alan Mackenzie 0 siblings, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2017-09-09 12:18 UTC (permalink / raw) To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983 On 9/9/17 1:20 PM, Alan Mackenzie wrote: > In the implementation, I have left the function `syntax-ppss' untouched > except for adding a function call to set up the cache right at the > start. I have refactored syntax-ppss-flush-cache, extracting a function > which is called directly from the cache-selecting code. Other than > that, there is one new function (which switches the current cache in > use) and a few new variables to keep track of the caches. Not sure I understand. If you call (syntax-ppss) with significantly different narrowings without flushing the cache (e.g. without modifying the buffer), sounds like it'll have to return the same results under the described implementation. If so, it doesn't sound strict enough. >> Caveats: > >> - This solves the dependency on point-min, but does nothing about the >> dependency on the current syntax-table (which can change). I'm not >> necessarily suggesting we try to solve that now, though. > > I had some ideas on this back in the spring (about having "indirect > variables") which could be used quickly to "swap out" the current > syntax-table text properties, and (more importantly) quickly swap them > back in. But that's for another day. I admit I'm not sure what all this implies. >> - Before this change is pushed to master, or shortly after, I'd like to >> know that it actually fixed the problem Philipp experienced with >> python-mode, so we can revert 4fbd330. If it was caused by e.g. >> syntax-table changing, we've not improved much. > > I am naturally interested in this, too. If my patch doesn't fix this > bug, at least it will have removed a layer of fog inhibiting its > investigation. Let's hope so. 
^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: syntax-ppss returns wrong result. 2017-09-09 12:18 ` Dmitry Gutov @ 2017-09-10 11:42 ` Alan Mackenzie 0 siblings, 0 replies; 155+ messages in thread From: Alan Mackenzie @ 2017-09-10 11:42 UTC (permalink / raw) To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983 Hello, Dmitry. On Sat, Sep 09, 2017 at 15:18:11 +0300, Dmitry Gutov wrote: > On 9/9/17 1:20 PM, Alan Mackenzie wrote: > > In the implementation, I have left the function `syntax-ppss' untouched > > except for adding a function call to set up the cache right at the > > start. I have refactored syntax-ppss-flush-cache, extracting a function > > which is called directly from the cache-selecting code. Other than > > that, there is one new function (which switches the current cache in > > use) and a few new variables to keep track of the caches. > Not sure I understand. If you call (syntax-ppss) with significantly > different narrowings without flushing the cache (e.g. without modifying > the buffer), sounds like it'll have to return the same results under the > described implementation. > If so, it doesn't sound strict enough. On changing from one narrowing to another narrowing (more precisely, when point-min is changed, neither value being 1), the cache is flushed, even though the buffer has not been modified. Anyhow, I've posted a patch elsewhere on this thread. Comments on it would be welcome. [ .... ] -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result. 2017-09-09 9:44 ` Dmitry Gutov 2017-09-09 10:20 ` Alan Mackenzie @ 2017-09-10 11:36 ` Alan Mackenzie 2017-09-10 22:53 ` Stefan Monnier ` (2 more replies) 1 sibling, 3 replies; 155+ messages in thread From: Alan Mackenzie @ 2017-09-10 11:36 UTC (permalink / raw) To: Dmitry Gutov, Philipp Stephani; +Cc: John Wiegley, 22983 Hello, Dmitry and Philipp. On Sat, Sep 09, 2017 at 12:44:02 +0300, Dmitry Gutov wrote: > Hi Alan, > On 9/7/17 11:45 PM, Alan Mackenzie wrote: > > The solution I propose is to introduce a second cache into syntax-ppss, > > and this cache would be used whenever (not (eq (point-min) 1)). > > Whenever point-min changes, and isn't 1, this second cached would be > > calculated again from scratch. Here is a patch implementing this. Comments about it would be welcome. [ .... ] > And since the API doesn't change, and the observable behavior doesn't > either (in the vast majority of cases; probably all except the broken > ones), we can refine this solution easily, or even swap it for something > else, with little cost. [ .... ] > Caveats: > - This solves the dependency on point-min, but does nothing about the > dependency on the current syntax-table (which can change). I'm not > necessarily suggesting we try to solve that now, though. > - Before this change is pushed to master, or shortly after, I'd like to > know that it actually fixed the problem Philipp experienced with > python-mode, so we can revert 4fbd330. If it was caused by e.g. > syntax-table changing, we've not improved much. Philipp, any chance of you trying out python mode with this patch but without 4fbd330?

diff --git a/lisp/emacs-lisp/syntax.el b/lisp/emacs-lisp/syntax.el
index d1d5176944..952ea8bb83 100644
--- a/lisp/emacs-lisp/syntax.el
+++ b/lisp/emacs-lisp/syntax.el
@@ -386,11 +386,103 @@ syntax-ppss-cache
 (defvar-local syntax-ppss-last nil
   "Cache of (LAST-POS . LAST-PPSS).")
 
-(defalias 'syntax-ppss-after-change-function 'syntax-ppss-flush-cache)
-(defun syntax-ppss-flush-cache (beg &rest ignored)
-  "Flush the cache of `syntax-ppss' starting at position BEG."
-  ;; Set syntax-propertize to refontify anything past beg.
-  (setq syntax-propertize--done (min beg syntax-propertize--done))
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Several caches.
+;;
+;; Because `syntax-ppss' is equivalent to (parse-partial-sexp
+;; (POINT-MIN) x), we need either to empty the cache when we narrow
+;; the buffer, which is suboptimal, or we need to use several caches.
+;;
+;; The implementation which follows uses three caches, the current one
+;; (in `syntax-ppss-cache' and `syntax-ppss-last') and two inactive
+;; ones (in `syntax-ppss-{cache,last}-{wide,narrow}'), which store the
+;; former state of the active cache as it was used in widened and
+;; narrowed buffers respectively. There are also the variables
+;; `syntax-ppss-max-valid-{wide,narrow}' which hold the maximum
+;; position where the caches are valid, due to buffer changes.
+;;
+;; At the first call to `syntax-ppss' after a widening or narrowing of
+;; the buffer, the pertinent inactive cache is swapped into the
+;; current cache by calling `syntax-ppss-set-cache'. Note that there
+;; is currently just one inactive cache for narrowed buffers, so only
+;; one inactive narrowed cache can be stored at a time.
+;;
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+(defvar-local syntax-ppss-cache-wide nil
+  "Holds the value of `syntax-ppss-cache' for a widened buffer.")
+(defvar-local syntax-ppss-last-wide nil
+  "Holds the value of `syntax-ppss-last' for a widened buffer.")
+(defvar-local syntax-ppss-max-valid-wide most-positive-fixnum
+  "The buffer position after which `syntax-ppss-cache-wide' is invalid.")
+
+(defvar-local syntax-ppss-cache-narrow nil
+  "Holds the value of `syntax-ppss-cache' for a narrowed buffer.")
+(defvar-local syntax-ppss-last-narrow nil
+  "Holds the value of `syntax-ppss-last' for a narrowed buffer.")
+(defvar-local syntax-ppss-max-valid-narrow most-positive-fixnum
+  "The buffer position after which `syntax-ppss-cache-narrow' is invalid.")
+
+(defvar-local syntax-ppss-narrow-point-min 1
+  "Value of `point-min' for which the stored \"narrow\" cache is valid.")
+
+(defvar-local syntax-ppss-supremum most-positive-fixnum
+  "Lowest change position since previous restriction change.")
+
+(defvar-local syntax-ppss-cache-point-min 1
+  "Value of `point-min' for which the current cache is valid.")
+
+(defun syntax-ppss-set-cache ()
+  "Swap in and out the cache pertinent to the new point-min."
+  (unless (eq (point-min) syntax-ppss-cache-point-min)
+    ;; Update the stored `...max-valid' values.
+    (setq syntax-ppss-max-valid-wide
+          (if (eq syntax-ppss-cache-point-min 1)
+              (or (caar syntax-ppss-cache)
+                  1)
+            (min syntax-ppss-max-valid-wide syntax-ppss-supremum)))
+    (setq syntax-ppss-max-valid-narrow
+          (if (eq syntax-ppss-cache-point-min syntax-ppss-narrow-point-min)
+              (or (caar syntax-ppss-cache)
+                  syntax-ppss-cache-point-min)
+            (min syntax-ppss-max-valid-narrow syntax-ppss-supremum)))
+    (setq syntax-ppss-supremum most-positive-fixnum)
+
+    ;; Store away the current values of the cache.
+    (cond
+     ((eq syntax-ppss-cache-point-min 1)
+      (setq syntax-ppss-cache-wide syntax-ppss-cache
+            syntax-ppss-last-wide syntax-ppss-last))
+     ((eq syntax-ppss-cache-point-min syntax-ppss-narrow-point-min)
+      (setq syntax-ppss-cache-narrow syntax-ppss-cache
+            syntax-ppss-last-narrow syntax-ppss-last))
+     (syntax-ppss-cache
+      (setq syntax-ppss-narrow-point-min syntax-ppss-cache-point-min
+            syntax-ppss-cache-narrow syntax-ppss-cache
+            syntax-ppss-last-narrow syntax-ppss-last))
+     (t nil))
+
+    ;; Restore/initialize the cache for the new point-min.
+    (cond
+     ((eq (point-min) 1)
+      (setq syntax-ppss-cache syntax-ppss-cache-wide
+            syntax-ppss-last syntax-ppss-last-wide)
+      (save-restriction
+        (widen)
+        (syntax-ppss-invalidate-cache syntax-ppss-max-valid-wide)))
+     ((eq (point-min) syntax-ppss-narrow-point-min)
+      (setq syntax-ppss-cache syntax-ppss-cache-narrow
+            syntax-ppss-last syntax-ppss-last-narrow)
+      (save-restriction
+        (widen)
+        (syntax-ppss-invalidate-cache syntax-ppss-max-valid-narrow)))
+     (t
+      (setq syntax-ppss-cache nil
+            syntax-ppss-last nil)))
+    (setq syntax-ppss-cache-point-min (point-min))))
+
+(defun syntax-ppss-invalidate-cache (beg &rest ignored)
+  "Invalidate the cache of `syntax-ppss' starting at position BEG."
   ;; Flush invalid cache entries.
   (while (and syntax-ppss-cache (> (caar syntax-ppss-cache) beg))
     (setq syntax-ppss-cache (cdr syntax-ppss-cache)))
@@ -411,6 +503,16 @@ syntax-ppss-flush-cache
     ;; (remove-hook 'before-change-functions 'syntax-ppss-flush-cache t))
     )
 
+;; Retain the following two for compatibility reasons.
+(defalias 'syntax-ppss-after-change-function 'syntax-ppss-flush-cache)
+(defun syntax-ppss-flush-cache (beg &rest ignored)
+  "Flush the `syntax-ppss' caches and set `syntax-propertize--done'."
+  (setq syntax-ppss-supremum (min beg syntax-ppss-supremum))
+  ;; Ensure the appropriate cache is active.
+  (syntax-ppss-set-cache)
+  (setq syntax-propertize--done (min beg syntax-propertize--done))
+  (syntax-ppss-invalidate-cache beg ignored))
+
 (defvar syntax-ppss-stats
   [(0 . 0.0) (0 . 0.0) (0 . 0.0) (0 . 0.0) (0 . 0.0) (1 . 2500.0)])
 (defun syntax-ppss-stats ()
@@ -434,6 +536,8 @@ syntax-ppss
 this function is called while `before-change-functions' is
 temporarily let-bound, or if the buffer is modified without
 running the hook."
+  ;; Ensure the appropriate cache is active.
+  (syntax-ppss-set-cache)
   ;; Default values.
   (unless pos (setq pos (point)))
   (syntax-propertize pos)

> All the best, > Dmitry. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply related [flat|nested] 155+ messages in thread
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result. 2017-09-10 11:36 ` bug#22983: [ Patch ] " Alan Mackenzie @ 2017-09-10 22:53 ` Stefan Monnier 2017-09-10 23:36 ` Dmitry Gutov 2017-09-11 19:42 ` Alan Mackenzie 2017-09-11 0:11 ` Dmitry Gutov 2017-09-17 11:12 ` Philipp Stephani 2 siblings, 2 replies; 155+ messages in thread From: Stefan Monnier @ 2017-09-10 22:53 UTC (permalink / raw) To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, Dmitry Gutov, 22983 > +;; Several caches. > +;; Because `syntax-ppss' is equivalent to (parse-partial-sexp > +;; (POINT-MIN) x), we need either to empty the cache when we narrow > +;; the buffer, which is suboptimal, or we need to use several caches. I think that (parse-partial-sexp 1 x) is more often what the caller wants than (parse-partial-sexp (point-min) x), but if you're happy with the behavior described by the docstring, then that's fine. > +;; The implementation which follows uses three caches, the current one > +;; (in `syntax-ppss-cache' and `syntax-ppss-last') and two inactive > +;; ones (in `syntax-ppss-{cache,last}-{wide,narrow}'), which store the > +;; former state of the active cache as it was used in widened and > +;; narrowed buffers respectively. Earlier in the thread, I suggested to use a single cache indexed by the position of point-min (or by the position and point-min and by the current syntax-table, so as to also handle changes in the syntax-table), i.e. a list of (POINT-MIN-POS . CACHE-DATA) or ((POINT-MIN-POS . SYNTAX-TABLE) . CACHE-DATA). I think it would lead to less code duplication than your patch which only handles 2 different POINT-MIN-POS (and one of the two has to be equal to 1), but existing code trumps hypothetical designs. So, I have no objections to the patch. 
But I think (parse-partial-sexp (point-min) x) is a design bug in syntax-ppss which we will need to fix sooner or later, which is why I never bothered to implement something like your patch, which only makes the code do what its doc says rather than what the caller needs. Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
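The single cache indexed by the position of point-min, which Stefan describes as a list of (POINT-MIN-POS . CACHE-DATA), might look roughly like the following. This is a sketch only: `my-ppss-cache-alist' and `my-ppss-cache-entry' are hypothetical names, nothing like them appears in the patch, and the invalidation of stale entries is left out entirely.

```elisp
;; Hypothetical names, for illustration only; not part of syntax.el.
(defvar-local my-ppss-cache-alist nil
  "Alist of (POINT-MIN-POS . CACHE-DATA) entries for this buffer.")

(defun my-ppss-cache-entry ()
  "Return the (POINT-MIN-POS . CACHE-DATA) cell for the current `point-min'.
An empty cell is created and remembered if none exists yet."
  (or (assq (point-min) my-ppss-cache-alist)
      (let ((cell (cons (point-min) nil)))
        (push cell my-ppss-cache-alist)
        cell)))
```

A caller would read and update the cdr of the returned cell in place of the `syntax-ppss-cache' and `syntax-ppss-last' pair. Keying on (POINT-MIN-POS . SYNTAX-TABLE) instead would need `assoc' rather than `assq'.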
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result. 2017-09-10 22:53 ` Stefan Monnier @ 2017-09-10 23:36 ` Dmitry Gutov 2017-09-11 11:10 ` Stefan Monnier 2017-09-11 19:42 ` Alan Mackenzie 1 sibling, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2017-09-10 23:36 UTC (permalink / raw) To: Stefan Monnier, Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983 On 9/11/17 1:53 AM, Stefan Monnier wrote: > I think that (parse-partial-sexp 1 x) is more often what the caller > wants than (parse-partial-sexp (point-min) x), but if you're happy with > the behavior described by the docstring, then that's fine. And yet, I struggle to find such callers. But those that do, can (save-restriction (widen) (syntax-ppss)) anyway. >> +;; The implementation which follows uses three caches, the current one >> +;; (in `syntax-ppss-cache' and `syntax-ppss-last') and two inactive >> +;; ones (in `syntax-ppss-{cache,last}-{wide,narrow}'), which store the >> +;; former state of the active cache as it was used in widened and >> +;; narrowed buffers respectively. > > Earlier in the thread, I suggested to use a single cache indexed by the > position of point-min That would lead to clobbering the global cache when we use syntax-ppss for some local parsing. E.g. if ruby-syntax-propertize-percent-literal didn't bind parse-sexp-lookup-properties to nil, it might clobber the cache unnecessarily. I don't have the data on whether this would be a frequent problem, though. > i.e. a list of (POINT-MIN-POS . CACHE-DATA) or > ((POINT-MIN-POS . SYNTAX-TABLE) . CACHE-DATA). I think it would lead to > less code duplication than your patch which only handles 2 different > POINT-MIN-POS (and one of the two has to be equal to 1), but existing > code trumps hypothetical designs. I also think there's a way to implement this behavior with less code and new variables, albeit with extra indirection. > So, I have no objections to the patch. 
But I think (parse-partial-sexp > (point-min) x) is a design bug in syntax-ppss which we will need to fix > sooner or later, which is why I never bothered to implement something > like your patch, which only makes the code do what its doc says rather > than what the caller needs. I'm considering the idea now that syntax-ppss should stay a caching wrapper around parse-partial-sexp, and the responsibility to widen should always be the caller's. This way, it can be used for different purposes that we've discussed before many times. ^ permalink raw reply [flat|nested] 155+ messages in thread
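The caller-side widening mentioned in this message, (save-restriction (widen) (syntax-ppss)), can be wrapped in a small helper. A minimal sketch, assuming the hypothetical name `my-syntax-ppss-widened'; unlike `syntax-ppss' itself, it restores point as well as the restriction.

```elisp
;; Hypothetical helper, for illustration only; not part of syntax.el.
(defun my-syntax-ppss-widened (&optional pos)
  "Return `syntax-ppss' at POS, computed with the buffer widened.
The caller's restriction and point are restored afterwards."
  (save-excursion
    (save-restriction
      (widen)
      (syntax-ppss pos))))
```

For the text (a (b c) d) narrowed to the inner list, (nth 0 (my-syntax-ppss-widened 6)) gives the paren depth counted from position 1, which is what Stefan calls the more commonly wanted behaviour.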
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result. 2017-09-10 23:36 ` Dmitry Gutov @ 2017-09-11 11:10 ` Stefan Monnier 2017-09-12 0:11 ` Dmitry Gutov 0 siblings, 1 reply; 155+ messages in thread From: Stefan Monnier @ 2017-09-11 11:10 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Alan Mackenzie, Philipp Stephani, John Wiegley, 22983 >> I think that (parse-partial-sexp 1 x) is more often what the caller >> wants than (parse-partial-sexp (point-min) x), but if you're happy with >> the behavior described by the docstring, then that's fine. > And yet, I struggle to find such callers. But those that do, can > (save-restriction (widen) (syntax-ppss)) anyway. Good point. >>> +;; The implementation which follows uses three caches, the current one >>> +;; (in `syntax-ppss-cache' and `syntax-ppss-last') and two inactive >>> +;; ones (in `syntax-ppss-{cache,last}-{wide,narrow}'), which store the >>> +;; former state of the active cache as it was used in widened and >>> +;; narrowed buffers respectively. >> Earlier in the thread, I suggested to use a single cache indexed by the >> position of point-min > That would lead to clobbering the global cache when we use syntax-ppss for > some local parsing. My suggestion is to have a list of N caches, instead of having exactly 2 caches. I can't see how that could lead to more clobbering. > I'm considering the idea now that syntax-ppss should stay a caching wrapper > around parse-partial-sexp, and the responsibility to widen should always be > the caller's. This way, it can be used for different purposes that we've > discussed before many times. It does have the advantage of circumventing the discussion of "up-to-where should we widen" ;-) Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result. 2017-09-11 11:10 ` Stefan Monnier @ 2017-09-12 0:11 ` Dmitry Gutov 2017-09-12 22:12 ` Richard Stallman 0 siblings, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2017-09-12 0:11 UTC (permalink / raw) To: Stefan Monnier; +Cc: Alan Mackenzie, Philipp Stephani, John Wiegley, 22983 On 9/11/17 2:10 PM, Stefan Monnier wrote: > My suggestion is to have a list of N caches, instead of having exactly > 2 caches. I can't see how that could lead to more clobbering. Um, sorry I misunderstood. I interpreted that as only keeping one pair. But here are some other issues: 1) If we maintain a cache for all narrowings that have ever been used in the buffer, we adopt the idea that all of them are "real" and e.g. correspond to chunks in different major modes in a multi-mode context. Switching to a different syntax table and parsing a segment of text like ruby-syntax-propertize-percent-literal does falls outside of this concept. But of course, we can index by syntax table as well... overall, things become much more complex than when changing the narrowing bounds implies just throwing away that cache. 2) If there are a lot of elements inside the cache alist, we have to get rid of them from time to time. Not sure what the rules will be. Again, if they correspond to multi-mode chunks, we can at least be confident that the number of items in the alist will be finite. Not necessarily so if narrowing+spss is used for arbitrary purposes. 3) As the number of elements in the alist grows, flushing each value inside syntax-ppss-flush-cache eagerly will become slower and slower, I expect. And a lazy strategy of the kind proposed by Alan will become necessary. >> I'm considering the idea now that syntax-ppss should stay a caching wrapper >> around parse-partial-sexp, and the responsibility to widen should always be >> the caller's. This way, it can be used for different purposes that we've >> discussed before many times.
> > It does have the advantage of circumventing the discussion of > "up-to-where should we widen" ;-) Indeed. :) ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result. 2017-09-12 0:11 ` Dmitry Gutov @ 2017-09-12 22:12 ` Richard Stallman 0 siblings, 0 replies; 155+ messages in thread From: Richard Stallman @ 2017-09-12 22:12 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 22983, p.stephani2, jwiegley, monnier, acm [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] There is probably some optimal number of caches to remember. If the code can handle any number of caches, it can discard all but the last N, and then we could try adjusting N to get the best performance. I expect we don't want N to be more than 4. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 155+ messages in thread
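The "list of N caches, discard all but the last N" idea discussed above could look roughly like the sketch below. It is not taken from any patch in this thread: every name (`my-ppss-max-caches`, `my-ppss-caches`, `my-ppss-cache-for`) is hypothetical, the key is only the value of point-min (not the syntax table), and the default of 4 simply mirrors the upper bound suggested in the message above.

```elisp
;; Hypothetical names throughout; nothing here exists in syntax.el.
(defvar my-ppss-max-caches 4
  "Maximum number of per-narrowing caches to remember (assumed tunable).")

(defvar-local my-ppss-caches nil
  "Alist of (POINT-MIN . DATA) entries, most recently used first.
DATA is a cons cell that a caller could fill with cache contents.")

(defun my-ppss-cache-for (pt-min)
  "Return the cache cell for PT-MIN, creating it if necessary.
The entry is moved to the front of `my-ppss-caches', and entries
beyond `my-ppss-max-caches' are dropped."
  (let ((entry (assq pt-min my-ppss-caches)))
    (if entry
        ;; Known narrowing: move it to the front (most recently used).
        (setq my-ppss-caches (cons entry (delq entry my-ppss-caches)))
      ;; New narrowing: start with an empty cache cell.
      (setq entry (cons pt-min (cons nil nil)))
      (push entry my-ppss-caches))
    ;; Discard all but the last N caches.
    (when (> (length my-ppss-caches) my-ppss-max-caches)
      (setcdr (nthcdr (1- my-ppss-max-caches) my-ppss-caches) nil))
    (cdr entry)))
```

With this layout a flush on buffer change would still have to visit every remembered cache, which is the cost raised earlier in the thread; bounding N keeps that cost bounded as well.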
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result. 2017-09-10 22:53 ` Stefan Monnier 2017-09-10 23:36 ` Dmitry Gutov @ 2017-09-11 19:42 ` Alan Mackenzie 2017-09-11 20:20 ` Stefan Monnier 1 sibling, 1 reply; 155+ messages in thread From: Alan Mackenzie @ 2017-09-11 19:42 UTC (permalink / raw) To: Stefan Monnier; +Cc: John Wiegley, Philipp Stephani, Dmitry Gutov, 22983 Hello, Stefan. On Sun, Sep 10, 2017 at 18:53:53 -0400, Stefan Monnier wrote: > > +;; Several caches. > > +;; Because `syntax-ppss' is equivalent to (parse-partial-sexp > > +;; (POINT-MIN) x), we need either to empty the cache when we narrow > > +;; the buffer, which is suboptimal, or we need to use several caches. > I think that (parse-partial-sexp 1 x) is more often what the caller > wants than (parse-partial-sexp (point-min) x), but if you're happy with > the behavior described by the docstring, then that's fine. I've never been happy with the specification, partly for that reason, but we are where we are, with lots of use of syntax-ppss, so I think it needs fixing according to that spec. > > +;; The implementation which follows uses three caches, the current one > > +;; (in `syntax-ppss-cache' and `syntax-ppss-last') and two inactive > > +;; ones (in `syntax-ppss-{cache,last}-{wide,narrow}'), which store the > > +;; former state of the active cache as it was used in widened and > > +;; narrowed buffers respectively. > Earlier in the thread, I suggested to use a single cache indexed by the > position of point-min (or by the position and point-min and by the > current syntax-table, so as to also handle changes in the syntax-table), > i.e. a list of (POINT-MIN-POS . CACHE-DATA) or > ((POINT-MIN-POS . SYNTAX-TABLE) . CACHE-DATA). I think it would lead to > less code duplication than your patch which only handles 2 different > POINT-MIN-POS (and one of the two has to be equal to 1), but existing > code trumps hypothetical designs. 
I deliberately kept the patch simple, avoiding even an alist with the point-min position as key. This would necessitate having an arbitrary maximum length of alist, and continual manipulation of this list. Not difficult, I agree, but do we need it? How often are there going to be nested or alternating narrowing with enough calls to syntax-ppss to cause the establishment of syntax-ppss-cache (as opposed to merely syntax-ppss-last, which my patch doesn't consider sufficient reason to store a new narrow-cache)? (These aren't rhetorical questions, by the way, but real ones. Which is the best way forward?) However, the patch was deliberately constructed to make the replacement of the two-cache cache by an arbitrary length alist simple. > So, I have no objections to the patch. But I think (parse-partial-sexp > (point-min) x) is a design bug in syntax-ppss which we will need to fix > sooner or later, which is why I never bothered to implement something > like your patch, which only makes the code do what its doc says rather > than what the caller needs. I couldn't agree more. However, syntax-ppss is established and there are callers that depend on its literal specification. Maybe a way forward would be to introduce a new function equivalent to (parse-partial-sexp 1 x) and deprecate syntax-ppss. However, a name would need to be found for this new function, not an easy task. ;-) (syntax-ppss is a very good name, but couldn't be reused.) > Stefan -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result. 2017-09-11 19:42 ` Alan Mackenzie @ 2017-09-11 20:20 ` Stefan Monnier 0 siblings, 0 replies; 155+ messages in thread From: Stefan Monnier @ 2017-09-11 20:20 UTC (permalink / raw) To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, Dmitry Gutov, 22983 > difficult, I agree, but do we need it? How often are there going to be > nested or alternating narrowing with enough calls to syntax-ppss to > cause the establishment of syntax-ppss-cache (as opposed to merely > syntax-ppss-last, which my patch doesn't consider sufficient reason to > store a new narrow-cache)? (These aren't rhetorical questions, by the > way, but real ones. Which is the best way forward?) I agree that it probably doesn't make much difference in practice. > I couldn't agree more. However, syntax-ppss is established and there > are callers that depend on its literal specification. Maybe a way > forward would be to introduce a new function equivalent to > (parse-partial-sexp 1 x) and deprecate syntax-ppss. However, a name > would need to be found for this new function, not an easy task. ;-) > (syntax-ppss is a very good name, but couldn't be reused.) Let's go with your patch for now, and then see whether Dmitry's impression is right that adding a call to `widen` before the call works even better. Stefan ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result. 2017-09-10 11:36 ` bug#22983: [ Patch ] " Alan Mackenzie 2017-09-10 22:53 ` Stefan Monnier @ 2017-09-11 0:11 ` Dmitry Gutov 2017-09-11 20:12 ` Alan Mackenzie 2017-09-17 11:12 ` Philipp Stephani 2 siblings, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2017-09-11 0:11 UTC (permalink / raw) To: Alan Mackenzie, Philipp Stephani; +Cc: John Wiegley, 22983 On 9/10/17 2:36 PM, Alan Mackenzie wrote: >>> The solution I propose is to introduce a second cache into syntax-ppss, >>> and this cache would be used whenever (not (eq (point-min) 1)). >>> Whenever point-min changes, and isn't 1, this second cache would be >>> calculated again from scratch. > > Here is a patch implementing this. Comments about it would be welcome. Thank you. It seems to hold up to the main test scenario I had in mind, so I don't have any complaints behavior-wise. It looks pretty big, though. With lots of new global variables. Before, we had syntax-ppss-cache and syntax-ppss-last. The patch adds 8 new ones. I propose two avenues for simplification: 1) Use a cons structure for the (PPSS-CACHE . PPSS-LAST) structure. We will have three global variables total: syntax-ppss-data-wide, syntax-ppss-data-narrow, syntax-ppss-data-narrow-point-min. syntax-ppss would bind a local variable syntax-ppss-data to one of the first two depending on the value of the third (and then modify its car and cdr during the course of execution). 2) Some extra vars serve to delay the actual clearing of the unused cache until it's used again. It's a valid idea, but what if we try without it at first? So syntax-ppss-flush-cache would always clear both caches eagerly. The advantages: - Less code, easier to reason about. - Any package that advises syntax-ppss will have to juggle fewer global variables. So Vitalie's polymode will have an easier time of it. 
It could even reuse some of the cache-while-narrowed logic by substituting the values of syntax-ppss-data-narrow and syntax-ppss-data-narrow-point-min as appropriate. The obvious downside is, of course, extra indirection, which translates to extra overhead. We don't know how significant it will be, though. Would you like to see the code? ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result. 2017-09-11 0:11 ` Dmitry Gutov @ 2017-09-11 20:12 ` Alan Mackenzie 2017-09-12 0:24 ` Dmitry Gutov 0 siblings, 1 reply; 155+ messages in thread From: Alan Mackenzie @ 2017-09-11 20:12 UTC (permalink / raw) To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983 Hello, Dmitry. On Mon, Sep 11, 2017 at 03:11:22 +0300, Dmitry Gutov wrote: > On 9/10/17 2:36 PM, Alan Mackenzie wrote: > >>> The solution I propose is to introduce a second cache into syntax-ppss, > >>> and this cache would be used whenever (not (eq (point-min) 1)). > >>> Whenever point-min changes, and isn't 1, this second cache would be > >>> calculated again from scratch. > > Here is a patch implementing this. Comments about it would be welcome. > Thank you. It seems to hold up to the main test scenario I had in mind, > so I don't have any complaints behavior-wise. Thanks. > It looks pretty big, though. With lots of new global variables. > Before, we had syntax-ppss-cache and syntax-ppss-last. The patch adds 8 > new ones. Yes. But each one has a single purpose, and there are no loops in the new code, which makes it easier to be sure it is correct. > I propose two avenues for simplification: > 1) Use a cons structure for the (PPSS-CACHE . PPSS-LAST) structure. We > will have three global variables total: syntax-ppss-data-wide, > syntax-ppss-data-narrow, syntax-ppss-data-narrow-point-min. syntax-ppss > would bind a local variable syntax-ppss-data to one of the first two > depending on the value of the third (and then modify its car and cdr > during the course of execution). I'm in favour rather of setting syntax-ppss-{cache,last} to the appropriate stored cache. This will avoid needing to change the function syntax-ppss much. A disadvantage of using such a cons is in debugging. It is more difficult to understand a cons like this when it is printed out, than the two component lists (which are difficult enough themselves). 
> 2) Some extra vars serve to delay the actual clearing of the unused > cache until it's used again. It's a valid idea, but what if we try > without it at first? So syntax-ppss-flush-cache would always clear both > caches eagerly. When there's a lot of buffer changing going on, it is an overhead having to clear both (or several) caches continually. (I'm thinking about the possible extension to using an alist of caches, which could be quite long.) Also clearing both caches at the same time would be a bigger change to syntax-ppss-flush-cache than it's suffered so far. But I'm really not sure which way is better. > The advantages: > - Less code, easier to reason about. > - Any package than advises syntax-ppss will have to juggle fewer global > variables. I was intending that the new variables be purely internal, and that no external elisp would need to access them. I suppose I really ought to have put "--" in the middle of their names. > So Vatalie's polymode will have an easier time of it. It could even > reuse some of the cache-while-narrowed logic by substituting the > values of syntax-ppss-data-narrow and > syntax-ppss-data-narrow-point-min as appropriate. That sounds a little dangerous. > The obvious downside is, of course, extra indirection, which translates > to extra overhead. We don't know how significant it will be, though. I wouldn't be keen on seeing lots of (car compound-variable) and (cdr compound-variable) throughout the syntax-ppss function. I think it would make it significantly more difficult to understand. > Would you like to see the code? Yes, why not? But just to make my position clear, I'm not particularly fixed on my patch as submitted. It was optimised for simplicity and correctness rather than elegance, though I don't think it's too bad. I'm fairly open on whether we use your suggestions or Stefan's suggestion of having an alist of caches. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result. 2017-09-11 20:12 ` Alan Mackenzie @ 2017-09-12 0:24 ` Dmitry Gutov 2017-09-17 10:29 ` Alan Mackenzie 0 siblings, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2017-09-12 0:24 UTC (permalink / raw) To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983 On 9/11/17 11:12 PM, Alan Mackenzie wrote: >> Before, we had syntax-ppss-cache and syntax-ppss-last. The patch adds 8 >> new ones. > > Yes. But each one has a very single purpose, and there are no loops in > the new code, which makes it easier to be sure it is correct. On the one hand, yes, on the other hand, the more code you have (or the more vars you have to juggle), the harder it is to keep track. > I'm in favour rather of setting syntax-ppss-{cache,last} to the > appropriate stored cache. This will avoid needing to change the > function syntax-ppss much. My proposal will change syntax-ppss, yes. So, unfortunately, the patch will be more difficult to read. But not the resulting code, hopefully. But I think I see what you mean. The disadvantage is that we'll need code that will ferry those values back to the appropriate variables as well (which we see in your patch). We can discuss that option after. > A disadvantage of using such a cons is in debugging. It is more > difficult to understand a cons like this when it is printed out, than > the two component lists (which are difficult enough themselves). You win some, you lose some. We could use structs, if you like, but overall, the values are already complex, so consing won't make that much worse. > When there's a lot of buffer changing going on, it is an overhead having > to clear both (or several) caches continually. (I'm thinking about the > possible extension to using an alist of caches, which could be quite > long.) Both caches - yes, but shouldn't be too bad. 
The "alist of caches" approach would most likely require that laziness, but I'm not sure we really want to go there (see another email). > Also clearing both caches at the same time would be a bigger change to > syntax-ppss-flush-cache than it's suffered so far. True. >> - Any package that advises syntax-ppss will have to juggle fewer global >> variables. > > I was intending that the new variables be purely internal, and that no > external elisp would need to access them. I suppose I really ought to > have put "--" in the middle of their names. Yes, but if we can make life easier for some, why not? Sometimes third-party authors can live with breakage between Emacs versions. >> So Vitalie's polymode will have an easier time of it. It could even >> reuse some of the cache-while-narrowed logic by substituting the >> values of syntax-ppss-data-narrow and >> syntax-ppss-data-narrow-point-min as appropriate. > > That sounds a little dangerous. Not much worse than what multi-mode packages already do, though. >> The obvious downside is, of course, extra indirection, which translates >> to extra overhead. We don't know how significant it will be, though. > > I wouldn't be keen on seeing lots of (car compound-variable) and (cdr > compound-variable) throughout the syntax-ppss function. I think it > would make it significantly more difficult to understand. Hopefully there will be only a few such places. But again, we can use structs. >> Would you like to see the code? > > Yes, why not? Please give me until the end of the week. > But just to make my position clear, I'm not particularly fixed on my > patch as submitted. It was optimised for simplicity and correctness > rather than elegance, though I don't think it's too bad. I'm fairly > open on whether we use your suggestions or Stefan's suggestion of having > an alist of caches. Cool. ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result. 2017-09-12 0:24 ` Dmitry Gutov @ 2017-09-17 10:29 ` Alan Mackenzie 2017-09-17 23:43 ` Dmitry Gutov 0 siblings, 1 reply; 155+ messages in thread From: Alan Mackenzie @ 2017-09-17 10:29 UTC (permalink / raw) To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983 Hello, Dmitry. On Tue, Sep 12, 2017 at 03:24:08 +0300, Dmitry Gutov wrote: > On 9/11/17 11:12 PM, Alan Mackenzie wrote: [ .... ] > > I wouldn't be keen on seeing lots of (car compound-variable) and (cdr > > compound-variable) throughout the syntax-ppss function. I think it > > would make it significantly more difficult to understand. > Hopefully there will be only several such places. But again, we can use > structs. I don't know anything about these things. But seeing as how syntax.el is preloaded, the definition of structs would need to be preloaded earlier. > >> Would you like to see the code? > > Yes, why not? > Please give me until the end of the week. The end of the week has arrived. Are you still intending to propose an alternative formulation of the new cache manipulation for syntax-ppss? > > But just to make my position clear, I'm not particularly fixed on my > > patch as submitted. It was optimised for simplicity and correctness > > rather than elegance, though I don't think it's too bad. I'm fairly > > open on whether we use your suggestions or Stefan's suggestion of having > > an alist of caches. > Cool. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result. 2017-09-17 10:29 ` Alan Mackenzie @ 2017-09-17 23:43 ` Dmitry Gutov 2017-09-18 19:08 ` Alan Mackenzie 0 siblings, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2017-09-17 23:43 UTC (permalink / raw) To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983 [-- Attachment #1: Type: text/plain, Size: 715 bytes --] Hi Alan, On 9/17/17 1:29 PM, Alan Mackenzie wrote: > I don't know anything about these things. But seeing as how syntax.el is > preloaded, the definition of structs would need to be preloaded earlier. OK, let's do without that for now. The result doesn't look too bad to my eyes, at least. >>>> Would you like to see the code? > >>> Yes, why not? > >> Please give me until the end of the week. > > The end of the week has arrived. Are you still intending to propose an > alternative formulation of the new cache manipulation for syntax-ppss? Thanks for the reminder. The patch is attached. I've tested it minimally, any feedback is welcome. (It reads much better in Emacs with diff-auto-refine-mode). [-- Attachment #2: alt-ppss-fix.diff --] [-- Type: text/x-patch, Size: 7392 bytes --] diff --git a/lisp/emacs-lisp/syntax.el b/lisp/emacs-lisp/syntax.el index d1d5176944..a77589f1b7 100644 --- a/lisp/emacs-lisp/syntax.el +++ b/lisp/emacs-lisp/syntax.el @@ -381,10 +381,26 @@ syntax-begin-function point (where the PPSS is equivalent to nil).") (make-obsolete-variable 'syntax-begin-function nil "25.1") -(defvar-local syntax-ppss-cache nil - "List of (POS . PPSS) pairs, in decreasing POS order.") -(defvar-local syntax-ppss-last nil - "Cache of (LAST-POS . LAST-PPSS).") +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Several caches. +;; +;; Because `syntax-ppss' is equivalent to (parse-partial-sexp +;; (POINT-MIN) x), we need either to empty the cache when we narrow +;; the buffer, which is suboptimal, or we need to use several caches. 
+;; We use two of them, one for widened buffer, and one for narrowing. +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +(defvar-local syntax-ppss-wide nil + "Cons of two elements (CACHE . LAST). +Where CACHE is a list of (POS . PPSS) pairs, in decreasing POS order, +and LAST is a pair (LAST-POS . LAST-PPS) caching the last invocation. +These are valid when the buffer has no restriction.") + +(defvar-local syntax-ppss-narrow nil + "Same as `syntax-ppss-wide' but for a narrowed buffer.") + +(defvar-local syntax-ppss-narrow-start nil + "Start position of the narrowing for `syntax-ppss-narrow'.") (defalias 'syntax-ppss-after-change-function 'syntax-ppss-flush-cache) (defun syntax-ppss-flush-cache (beg &rest ignored) @@ -392,24 +408,29 @@ syntax-ppss-flush-cache ;; Set syntax-propertize to refontify anything past beg. (setq syntax-propertize--done (min beg syntax-propertize--done)) ;; Flush invalid cache entries. - (while (and syntax-ppss-cache (> (caar syntax-ppss-cache) beg)) - (setq syntax-ppss-cache (cdr syntax-ppss-cache))) - ;; Throw away `last' value if made invalid. - (when (< beg (or (car syntax-ppss-last) 0)) - ;; If syntax-begin-function jumped to BEG, then the old state at BEG can - ;; depend on the text after BEG (which is presumably changed). So if - ;; BEG=(car (nth 10 syntax-ppss-last)) don't reuse that data because the - ;; assumed nil state at BEG may not be valid any more. - (if (<= beg (or (syntax-ppss-toplevel-pos (cdr syntax-ppss-last)) - (nth 3 syntax-ppss-last) - 0)) - (setq syntax-ppss-last nil) - (setcar syntax-ppss-last nil))) - ;; Unregister if there's no cache left. Sadly this doesn't work - ;; because `before-change-functions' is temporarily bound to nil here. - ;; (unless syntax-ppss-cache - ;; (remove-hook 'before-change-functions 'syntax-ppss-flush-cache t)) - ) + (dolist (cell (list syntax-ppss-wide syntax-ppss-narrow)) + (pcase cell + (`(,cache . 
,last) + (while (and cache (> (caar cache) beg)) + (setq cache (cdr cache))) + ;; Throw away `last' value if made invalid. + (when (< beg (or (car last) 0)) + ;; If syntax-begin-function jumped to BEG, then the old state at BEG can + ;; depend on the text after BEG (which is presumably changed). So if + ;; BEG=(car (nth 10 syntax-ppss-last)) don't reuse that data because the + ;; assumed nil state at BEG may not be valid any more. + (if (<= beg (or (syntax-ppss-toplevel-pos (cdr last)) + (nth 3 last) + 0)) + (setq last nil) + (setcar last nil))) + ;; Unregister if there's no cache left. Sadly this doesn't work + ;; because `before-change-functions' is temporarily bound to nil here. + ;; (unless cache + ;; (remove-hook 'before-change-functions 'syntax-ppss-flush-cache t)) + (setcar cell cache) + (setcdr cell last))) + )) (defvar syntax-ppss-stats [(0 . 0.0) (0 . 0.0) (0 . 0.0) (0 . 0.0) (0 . 0.0) (1 . 2500.0)]) @@ -423,6 +444,17 @@ syntax-ppss-stats (defvar-local syntax-ppss-table nil "Syntax-table to use during `syntax-ppss', if any.") +(defun syntax-ppss--data () + (if (eq (point-min) 1) + (progn + (unless syntax-ppss-wide + (setq syntax-ppss-wide (cons nil nil))) + syntax-ppss-wide) + (unless (eq syntax-ppss-narrow-start (point-min)) + (setq syntax-ppss-narrow-start (point-min)) + (setq syntax-ppss-narrow (cons nil nil))) + syntax-ppss-narrow)) + (defun syntax-ppss (&optional pos) "Parse-Partial-Sexp State at POS, defaulting to point. 
The returned value is the same as that of `parse-partial-sexp' @@ -439,10 +471,13 @@ syntax-ppss (syntax-propertize pos) ;; (with-syntax-table (or syntax-ppss-table (syntax-table)) - (let ((old-ppss (cdr syntax-ppss-last)) - (old-pos (car syntax-ppss-last)) - (ppss nil) - (pt-min (point-min))) + (let* ((cell (syntax-ppss--data)) + (ppss-cache (car cell)) + (ppss-last (cdr cell)) + (old-ppss (cdr ppss-last)) + (old-pos (car ppss-last)) + (ppss nil) + (pt-min (point-min))) (if (and old-pos (> old-pos pos)) (setq old-pos nil)) ;; Use the OLD-POS if usable and close. Don't update the `last' cache. (condition-case nil @@ -475,7 +510,7 @@ syntax-ppss ;; The OLD-* data can't be used. Consult the cache. (t (let ((cache-pred nil) - (cache syntax-ppss-cache) + (cache ppss-cache) (pt-min (point-min)) ;; I differentiate between PT-MIN and PT-BEST because ;; I feel like it might be important to ensure that the @@ -491,7 +526,7 @@ syntax-ppss (if cache (setq pt-min (caar cache) ppss (cdar cache))) ;; Setup the before-change function if necessary. - (unless (or syntax-ppss-cache syntax-ppss-last) + (unless (or ppss-cache ppss-last) (add-hook 'before-change-functions 'syntax-ppss-flush-cache t t)) @@ -541,7 +576,7 @@ syntax-ppss pt-min (setq pt-min (/ (+ pt-min pos) 2)) nil nil ppss)) (push (cons pt-min ppss) - (if cache-pred (cdr cache-pred) syntax-ppss-cache))) + (if cache-pred (cdr cache-pred) ppss-cache))) ;; Compute the actual return value. 
(setq ppss (parse-partial-sexp pt-min pos nil nil ppss)) @@ -562,13 +597,15 @@ syntax-ppss (if (> (- (caar cache-pred) pos) syntax-ppss-max-span) (push pair (cdr cache-pred)) (setcar cache-pred pair)) - (if (or (null syntax-ppss-cache) - (> (- (caar syntax-ppss-cache) pos) + (if (or (null ppss-cache) + (> (- (caar ppss-cache) pos) syntax-ppss-max-span)) - (push pair syntax-ppss-cache) - (setcar syntax-ppss-cache pair))))))))) + (push pair ppss-cache) + (setcar ppss-cache pair))))))))) - (setq syntax-ppss-last (cons pos ppss)) + (setq ppss-last (cons pos ppss)) + (setcar cell ppss-cache) + (setcdr cell ppss-last) ppss) (args-out-of-range ;; If the buffer is more narrowed than when we built the cache, @@ -582,7 +619,7 @@ syntax-ppss (defun syntax-ppss-debug () (let ((pt nil) (min-diffs nil)) - (dolist (x (append syntax-ppss-cache (list (cons (point-min) nil)))) + (dolist (x (append (car (syntax-ppss--data)) (list (cons (point-min) nil)))) (when pt (push (- pt (car x)) min-diffs)) (setq pt (car x))) min-diffs)) ^ permalink raw reply related [flat|nested] 155+ messages in thread
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result. 2017-09-17 23:43 ` Dmitry Gutov @ 2017-09-18 19:08 ` Alan Mackenzie 2017-09-19 0:02 ` Dmitry Gutov 0 siblings, 1 reply; 155+ messages in thread From: Alan Mackenzie @ 2017-09-18 19:08 UTC (permalink / raw) To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983 Hello, Dmitry On Mon, Sep 18, 2017 at 02:43:05 +0300, Dmitry Gutov wrote: > Hi Alan, > On 9/17/17 1:29 PM, Alan Mackenzie wrote: > > I don't know anything about these things. But seeing as how syntax.el is > > preloaded, the definition of structs would need to be preloaded earlier. > OK, let's do without that for now. The result doesn't look too bad to my > eyes, at least. > >>>> Would you like to see the code? > >>> Yes, why not? > >> Please give me until the end of the week. > > The end of the week has arrived. Are you still intending to propose an > > alternative formulation of the new cache manipulation for syntax-ppss? > Thanks for the reminder. The patch is attached. I've tested it > minimally, any feedback is welcome. Thanks for this. I'm impressed. Your syntax-ppss--data is far more elegant than my syntax-ppss-set-cache. The burden of carrying around the caches in cons cells is much less than I had feared. The amendments to syntax-ppss are also less than I had feared, amounting to little more than substituting "syntax-ppss-cache" with "ppss-cache" etc., and making a few bindings to support that. I notice you flush both caches eagerly, as you said you would. No harm in that. So, I'm willing to go with your version. I haven't tried actually running it, yet. But there's one small change I would ask you to consider making - that is, in the cache conses, to put ppss-last in the car and ppss-cache in the cdr. That way, while debugging, ppss-last will be easy to find (it's the first element of the list) and ppss-cache will also be easy to find (the second element onwards). 
> (It reads much better in Emacs with diff-auto-refine-mode). [ .... ] -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 155+ messages in thread
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result. 2017-09-18 19:08 ` Alan Mackenzie @ 2017-09-19 0:02 ` Dmitry Gutov 2017-09-19 20:47 ` Alan Mackenzie 0 siblings, 1 reply; 155+ messages in thread From: Dmitry Gutov @ 2017-09-19 0:02 UTC (permalink / raw) To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983 [-- Attachment #1: Type: text/plain, Size: 1080 bytes --] On 9/18/17 10:08 PM, Alan Mackenzie wrote: > Thanks for this. I'm impressed. Your syntax-ppss--data is far more > elegant than my syntax-ppss-set-cache. The burden of carrying around > the caches in cons cells is much less than I had feared. The amendments > to syntax-ppss are also less than I had feared, amounting to little more > than substituting "syntax-ppss-cache" with "ppss-cache" etc., and making > a few bindings to support that. Thanks! > I notice you flush both caches eagerly, as you said you would. No harm > in that. > > So, I'm willing to go with your version. I haven't tried actually > running it, yet. Please do. > But there's one small change I would ask you to consider making - that > is, in the cache conses, to put ppss-last in the car and ppss-cache in > the cdr. That way, while debugging, ppss-last will be easy to find > (it's the first element of the list) and ppss-cache will also be easy to > find (the second element onwards). Sure, that makes a lot of sense, since ppss-last is a smaller structure. The modified patch is attached. [-- Attachment #2: alt-ppss-fix-2.diff --] [-- Type: text/x-patch, Size: 7391 bytes --] diff --git a/lisp/emacs-lisp/syntax.el b/lisp/emacs-lisp/syntax.el index d1d5176944..c44e754ac0 100644 --- a/lisp/emacs-lisp/syntax.el +++ b/lisp/emacs-lisp/syntax.el @@ -381,10 +381,26 @@ syntax-begin-function point (where the PPSS is equivalent to nil).") (make-obsolete-variable 'syntax-begin-function nil "25.1") -(defvar-local syntax-ppss-cache nil - "List of (POS . 
PPSS) pairs, in decreasing POS order.")
 
-(defvar-local syntax-ppss-last nil
-  "Cache of (LAST-POS . LAST-PPSS).")
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Several caches.
+;;
+;; Because `syntax-ppss' is equivalent to (parse-partial-sexp
+;; (POINT-MIN) x), we need either to empty the cache when we narrow
+;; the buffer, which is suboptimal, or we need to use several caches.
+;; We use two of them, one for widened buffer, and one for narrowing.
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+(defvar-local syntax-ppss-wide nil
+  "Cons of two elements (LAST . CACHE).
+Where LAST is a pair (LAST-POS . LAST-PPS) caching the last invocation
+and CACHE is a list of (POS . PPSS) pairs, in decreasing POS order.
+These are valid when the buffer has no restriction.")
+
+(defvar-local syntax-ppss-narrow nil
+  "Same as `syntax-ppss-wide' but for a narrowed buffer.")
+
+(defvar-local syntax-ppss-narrow-start nil
+  "Start position of the narrowing for `syntax-ppss-narrow'.")
 
 (defalias 'syntax-ppss-after-change-function 'syntax-ppss-flush-cache)
 (defun syntax-ppss-flush-cache (beg &rest ignored)
@@ -392,24 +408,29 @@ syntax-ppss-flush-cache
   ;; Set syntax-propertize to refontify anything past beg.
   (setq syntax-propertize--done (min beg syntax-propertize--done))
   ;; Flush invalid cache entries.
-  (while (and syntax-ppss-cache (> (caar syntax-ppss-cache) beg))
-    (setq syntax-ppss-cache (cdr syntax-ppss-cache)))
-  ;; Throw away `last' value if made invalid.
-  (when (< beg (or (car syntax-ppss-last) 0))
-    ;; If syntax-begin-function jumped to BEG, then the old state at BEG can
-    ;; depend on the text after BEG (which is presumably changed).  So if
-    ;; BEG=(car (nth 10 syntax-ppss-last)) don't reuse that data because the
-    ;; assumed nil state at BEG may not be valid any more.
-    (if (<= beg (or (syntax-ppss-toplevel-pos (cdr syntax-ppss-last))
-                    (nth 3 syntax-ppss-last)
-                    0))
-        (setq syntax-ppss-last nil)
-      (setcar syntax-ppss-last nil)))
-  ;; Unregister if there's no cache left.  Sadly this doesn't work
-  ;; because `before-change-functions' is temporarily bound to nil here.
-  ;; (unless syntax-ppss-cache
-  ;;   (remove-hook 'before-change-functions 'syntax-ppss-flush-cache t))
-  )
+  (dolist (cell (list syntax-ppss-wide syntax-ppss-narrow))
+    (pcase cell
+      (`(,last . ,cache)
+       (while (and cache (> (caar cache) beg))
+         (setq cache (cdr cache)))
+       ;; Throw away `last' value if made invalid.
+       (when (< beg (or (car last) 0))
+         ;; If syntax-begin-function jumped to BEG, then the old state at BEG can
+         ;; depend on the text after BEG (which is presumably changed).  So if
+         ;; BEG=(car (nth 10 syntax-ppss-last)) don't reuse that data because the
+         ;; assumed nil state at BEG may not be valid any more.
+         (if (<= beg (or (syntax-ppss-toplevel-pos (cdr last))
+                         (nth 3 last)
+                         0))
+             (setq last nil)
+           (setcar last nil)))
+       ;; Unregister if there's no cache left.  Sadly this doesn't work
+       ;; because `before-change-functions' is temporarily bound to nil here.
+       ;; (unless cache
+       ;;   (remove-hook 'before-change-functions 'syntax-ppss-flush-cache t))
+       (setcar cell last)
+       (setcdr cell cache)))
+    ))
 
 (defvar syntax-ppss-stats
   [(0 . 0.0) (0 . 0.0) (0 . 0.0) (0 . 0.0) (0 . 0.0) (1 . 2500.0)])
@@ -423,6 +444,17 @@ syntax-ppss-stats
 (defvar-local syntax-ppss-table nil
   "Syntax-table to use during `syntax-ppss', if any.")
 
+(defun syntax-ppss--data ()
+  (if (eq (point-min) 1)
+      (progn
+        (unless syntax-ppss-wide
+          (setq syntax-ppss-wide (cons nil nil)))
+        syntax-ppss-wide)
+    (unless (eq syntax-ppss-narrow-start (point-min))
+      (setq syntax-ppss-narrow-start (point-min))
+      (setq syntax-ppss-narrow (cons nil nil)))
+    syntax-ppss-narrow))
+
 (defun syntax-ppss (&optional pos)
   "Parse-Partial-Sexp State at POS, defaulting to point.
 The returned value is the same as that of `parse-partial-sexp'
@@ -439,10 +471,13 @@ syntax-ppss
   (syntax-propertize pos)
   ;;
   (with-syntax-table (or syntax-ppss-table (syntax-table))
-  (let ((old-ppss (cdr syntax-ppss-last))
-        (old-pos (car syntax-ppss-last))
-        (ppss nil)
-        (pt-min (point-min)))
+  (let* ((cell (syntax-ppss--data))
+         (ppss-last (car cell))
+         (ppss-cache (cdr cell))
+         (old-ppss (cdr ppss-last))
+         (old-pos (car ppss-last))
+         (ppss nil)
+         (pt-min (point-min)))
     (if (and old-pos (> old-pos pos)) (setq old-pos nil))
     ;; Use the OLD-POS if usable and close.  Don't update the `last' cache.
     (condition-case nil
@@ -475,7 +510,7 @@ syntax-ppss
           ;; The OLD-* data can't be used.  Consult the cache.
           (t
            (let ((cache-pred nil)
-                 (cache syntax-ppss-cache)
+                 (cache ppss-cache)
                  (pt-min (point-min))
                  ;; I differentiate between PT-MIN and PT-BEST because
                  ;; I feel like it might be important to ensure that the
@@ -491,7 +526,7 @@ syntax-ppss
              (if cache (setq pt-min (caar cache) ppss (cdar cache)))
 
              ;; Setup the before-change function if necessary.
-             (unless (or syntax-ppss-cache syntax-ppss-last)
+             (unless (or ppss-cache ppss-last)
                (add-hook 'before-change-functions
                          'syntax-ppss-flush-cache t t))
 
@@ -541,7 +576,7 @@ syntax-ppss
                              pt-min (setq pt-min (/ (+ pt-min pos) 2))
                              nil nil ppss))
                  (push (cons pt-min ppss)
-                       (if cache-pred (cdr cache-pred) syntax-ppss-cache)))
+                       (if cache-pred (cdr cache-pred) ppss-cache)))
 
                ;; Compute the actual return value.
                (setq ppss (parse-partial-sexp pt-min pos nil nil ppss))
@@ -562,13 +597,15 @@ syntax-ppss
                      (if (> (- (caar cache-pred) pos) syntax-ppss-max-span)
                          (push pair (cdr cache-pred))
                        (setcar cache-pred pair))
-                   (if (or (null syntax-ppss-cache)
-                           (> (- (caar syntax-ppss-cache) pos)
+                   (if (or (null ppss-cache)
+                           (> (- (caar ppss-cache) pos)
                               syntax-ppss-max-span))
-                       (push pair syntax-ppss-cache)
-                     (setcar syntax-ppss-cache pair)))))))))
+                       (push pair ppss-cache)
+                     (setcar ppss-cache pair)))))))))
 
-          (setq syntax-ppss-last (cons pos ppss))
+          (setq ppss-last (cons pos ppss))
+          (setcar cell ppss-last)
+          (setcdr cell ppss-cache)
           ppss)
       (args-out-of-range
        ;; If the buffer is more narrowed than when we built the cache,
@@ -582,7 +619,7 @@ syntax-ppss
 (defun syntax-ppss-debug ()
   (let ((pt nil)
         (min-diffs nil))
-    (dolist (x (append syntax-ppss-cache (list (cons (point-min) nil))))
+    (dolist (x (append (cdr (syntax-ppss--data)) (list (cons (point-min) nil))))
       (when pt (push (- pt (car x)) min-diffs))
       (setq pt (car x)))
     min-diffs))
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
From: Alan Mackenzie @ 2017-09-19 20:47 UTC
To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983

Hello, Dmitry.

On Tue, Sep 19, 2017 at 03:02:06 +0300, Dmitry Gutov wrote:
> On 9/18/17 10:08 PM, Alan Mackenzie wrote:

[ .... ]

> > So, I'm willing to go with your version.  I haven't tried actually
> > running it, yet.

> Please do.

I have done now, without the slightest cause for concern (see below).

> > But there's one small change I would ask you to consider making - that
> > is, in the cache conses, to put ppss-last in the car and ppss-cache in
> > the cdr.  That way, while debugging, ppss-last will be easy to find
> > (it's the first element of the list) and ppss-cache will also be easy to
> > find (the second element onwards).

> Sure, that makes a lot of sense, since ppss-last is a smaller structure.
> The modified patch is attached.

Thanks.

I've done some semi-formal testing on it.  My semi-formal test log is:

(ii) Do some testing, using xdisp.c as test file.  A file.c will not have
  other calls to syntax-ppss interfering with the tests.
  o - 1. Normal working: check both caches stay empty.  They don't, because
    syntax-ppss is used, I think, by font locking.
  o - 2. Normal work in a narrowed buffer.  Seems OK.
  o - 3. Switch back to widened.  Seems OK.
  o - 4. Switch back to narrowed, same point-min.  Check the caches.  They
    look OK.
  o - 5. Switch to a different narrowing and (syntax-ppss (point-min)).  This
    does indeed empty the syntax-ppss-narrow, as it should.  s-p-wide looks
    unchanged.  Good.
  o - 6. Get well filled caches for both narrow and wide regions.  With the
    buffer wide, make a buffer change early in the buffer.  Check both caches
    are properly trimmed.  They are.
  o - 7. Repeat 6, but trim with the buffer narrow.  Both caches look OK, the
    narrow cache being (nil).

Maybe I should also try some heavy hacking in, say, Emacs Lisp mode as a
kind of soak test, since elisp mode uses syntax-ppss quite a bit, I
believe.

-- 
Alan Mackenzie (Nuremberg, Germany).
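
The cache check in step 5 of the log can be approximated in a scratch session. What follows is only a minimal sketch, assuming an Emacs that contains the patch from this thread (emacs-26 or later). `syntax-ppss-narrow-start' is an internal variable introduced by that patch and may change; the helper name `my-narrow-cache-start' and the sample text are hypothetical.

```elisp
(require 'syntax)

(defun my-narrow-cache-start (beg pos)
  "Return the recorded narrowing start after calling `syntax-ppss' at POS.
The current buffer is temporarily narrowed so that it starts at BEG."
  (save-restriction
    (widen)
    (narrow-to-region beg (point-max))
    ;; `syntax-ppss' moves point, so protect it.
    (save-excursion (syntax-ppss pos))
    ;; Internal variable from the patch: the `point-min' that the
    ;; narrowed-buffer cache was built for.
    syntax-ppss-narrow-start))

(with-temp-buffer
  (insert "foo \"bar (baz\" (qux \"zot\" (x y))")
  (list (my-narrow-cache-start 6 29)     ; first narrowing
        (my-narrow-cache-start 10 29)))  ; new point-min: narrow cache is reset
```

Each call should report the `point-min' of its own narrowing, which is the sign that a different narrowing discarded the previous narrow cache rather than reusing it.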
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
From: Dmitry Gutov @ 2017-09-22 14:09 UTC
To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983

Hi Alan,

On 9/19/17 11:47 PM, Alan Mackenzie wrote:

> I have done now, without the slightest cause for concern (see below).

Thank you. Should you commit the patch (with any documentation tweaks
you deem necessary), or should I?

> I've done some semi-formal testing on it.  My semi-formal test log is:
>
> (ii) Do some testing, using xdisp.c as test file.  A file.c will not have
>   other calls to syntax-ppss interfering with the tests.
>   o - 1. Normal working: check both caches stay empty.  They don't, because
>     syntax-ppss is used, I think, by font locking.
>   o - 2. Normal work in a narrowed buffer.  Seems OK.
>   o - 3. Switch back to widened.  Seems OK.
>   o - 4. Switch back to narrowed, same point-min.  Check the caches.  They
>     look OK.
>   o - 5. Switch to a different narrowing and (syntax-ppss (point-min)).  This
>     does indeed empty the syntax-ppss-narrow, as it should.  s-p-wide looks
>     unchanged.  Good.
>   o - 6. Get well filled caches for both narrow and wide regions.  With the
>     buffer wide, make a buffer change early in the buffer.  Check both caches
>     are properly trimmed.  They are.
>   o - 7. Repeat 6, but trim with the buffer narrow.  Both caches look OK, the
>     narrow cache being (nil).

Yes, this sounds fine. I've tried out most of those myself too, except
usually without checking the cache contents. Just the syntax-ppss results.

It would be nice to have 2 or 3 of those added as automated tests, BTW.

> Maybe I should also try some heavy hacking in, say, Emacs Lisp mode as a
> kind of soak test, since elisp mode uses syntax-ppss quite a bit, I
> believe.

Sure, except emacs-lisp-mode seems to still retain certain
indentation-related problems, even without this change.

I don't really expect to uncover problems from this patch much later.
That's been the point of making the change as simple as possible.
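
One of the automated tests suggested above could look roughly like the following. This is a sketch only, not the test that was eventually committed (if any): it assumes ERT and an Emacs containing the patch, and the names `my-ppss-equal-p' and `my-syntax-ppss-follows-narrowing' as well as the sample text are hypothetical. It checks results only, not cache contents, and it compares elements 0 to 9 of the parser state while skipping elements 2 and 6, which the thread says cannot be relied upon.

```elisp
(require 'ert)
(require 'cl-lib)

(defun my-ppss-equal-p (a b)
  "Return non-nil if parser states A and B agree.
Only elements 0 to 9 are compared, and elements 2 and 6 are
skipped because `syntax-ppss' does not promise them."
  (cl-loop for i from 0 below 10
           always (or (memq i '(2 6))
                      (equal (nth i a) (nth i b)))))

(ert-deftest my-syntax-ppss-follows-narrowing ()
  "`syntax-ppss' should match `parse-partial-sexp' from `point-min'."
  (with-temp-buffer
    ;; Position 6 is just after the first double quote, so a parse
    ;; starting there sees the string delimiters with opposite meaning.
    (insert "foo \"bar (baz\" (qux \"zot\" (x y))")
    (let* ((pos 29)
           (wide (syntax-ppss pos)))
      (should (my-ppss-equal-p wide (parse-partial-sexp (point-min) pos)))
      (narrow-to-region 6 (point-max))
      (should (my-ppss-equal-p (syntax-ppss pos)
                               (parse-partial-sexp (point-min) pos)))
      ;; Parsed from the narrowed start, POS falls inside a string.
      (should (nth 3 (syntax-ppss pos)))
      (widen)
      ;; The widened result must not be polluted by the narrowed one.
      (should (my-ppss-equal-p (syntax-ppss pos) wide)))))
```

On an Emacs without the patch, the second `should' is the one expected to fail, because the single cache filled while the buffer was wide gets reused after narrowing.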
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
From: Alan Mackenzie @ 2017-09-24 11:26 UTC
To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983

Hello, Dmitry.

On Fri, Sep 22, 2017 at 17:09:03 +0300, Dmitry Gutov wrote:
> Hi Alan,

> On 9/19/17 11:47 PM, Alan Mackenzie wrote:

> > I have done now, without the slightest cause for concern (see below).

> Thank you. Should you commit the patch (with any documentation tweaks
> you deem necessary), or should I?

Could I ask you to do it, please?  I'm somewhat exhausted from debating
another basic Emacs change.

Ah yes, the documentation.  I checked the doc in the elisp manual, and
twice the phrase "from the beginning of the buffer" was used.  I've
clarified that with "from the beginning of the visible portion of the
buffer".  I've also amended "a cache" to "caches", though this doesn't
seem too important.  What do you think:

diff --git a/doc/lispref/syntax.texi b/doc/lispref/syntax.texi
index e3ae53536f..b37f2b22b8 100644
--- a/doc/lispref/syntax.texi
+++ b/doc/lispref/syntax.texi
@@ -751,7 +751,8 @@ Position Parse
 
 @defun syntax-ppss &optional pos
 This function returns the parser state that the parser would reach at
-position @var{pos} starting from the beginning of the buffer.
+position @var{pos} starting from the beginning of the visible portion
+of the buffer.
 @iftex
 See the next section for
 @end iftex
@@ -762,11 +763,11 @@ Position Parse
 
 The return value is the same as if you call the low-level parsing
 function @code{parse-partial-sexp} to parse from the beginning of the
-buffer to @var{pos} (@pxref{Low-Level Parsing}).  However,
-@code{syntax-ppss} uses a cache to speed up the computation.  Due to
-this optimization, the second value (previous complete subexpression)
-and sixth value (minimum parenthesis depth) in the returned parser
-state are not meaningful.
+visible portion of the buffer to @var{pos} (@pxref{Low-Level
+Parsing}).  However, @code{syntax-ppss} uses caches to speed up the
+computation.  Due to this optimization, the second value (previous
+complete subexpression) and sixth value (minimum parenthesis depth) in
+the returned parser state are not meaningful.
 
 This function has a side effect: it adds a buffer-local entry to
 @code{before-change-functions} (@pxref{Change Hooks}) for

-- 
Alan Mackenzie (Nuremberg, Germany).
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
From: Dmitry Gutov @ 2017-09-25 23:53 UTC
To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983

Hi Alan,

On 9/24/17 2:26 PM, Alan Mackenzie wrote:

> Could I ask you to do it, please?  I'm somewhat exhausted from debating
> another basic Emacs change.

Pushed to emacs-26, thanks.

> Ah yes, the documentation.  I checked the doc in the elisp manual, and
> twice the phrase "from the beginning of the buffer" was used.  I've
> clarified that with "from the beginning of the visible portion of the
> buffer".  I've also amended "a cache" to "caches", though this doesn't
> seem too important.  What do you think:

LGTM. I think you can push it and finally close this bug. ;-)
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
From: Alan Mackenzie @ 2017-10-01 16:36 UTC
To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983

Hello, Dmitry.

On Tue, Sep 26, 2017 at 02:53:52 +0300, Dmitry Gutov wrote:
> Hi Alan,

> On 9/24/17 2:26 PM, Alan Mackenzie wrote:

> > Could I ask you to do it, please?  I'm somewhat exhausted from debating
> > another basic Emacs change.

> Pushed to emacs-26, thanks.

> > Ah yes, the documentation.  I checked the doc in the elisp manual, and
> > twice the phrase "from the beginning of the buffer" was used.  I've
> > clarified that with "from the beginning of the visible portion of the
> > buffer".  I've also amended "a cache" to "caches", though this doesn't
> > seem too important.  What do you think:

> LGTM. I think you can push it and finally close this bug. ;-)

Thanks.  I've just done both of these things.  Phew!

-- 
Alan Mackenzie (Nuremberg, Germany).
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
From: Johan Bockgård @ 2017-10-04 20:07 UTC
To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, Dmitry Gutov, 22983

Alan Mackenzie <acm@muc.de> writes:

> Ah yes, the documentation.  I checked the doc in the elisp manual, and
> twice the phrase "from the beginning of the buffer" was used.  I've
> clarified that with "from the beginning of the visible portion of the
> buffer".

The manual uses the term "accessible portion" for this. ("Visible"
usually refers to text in a window.)
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
From: Philipp Stephani @ 2017-09-17 11:12 UTC
To: Alan Mackenzie, Dmitry Gutov; +Cc: John Wiegley, 22983

Alan Mackenzie <acm@muc.de> wrote on Sun, 10 Sep 2017 at 13:42:

>
> > - Before this change is pushed to master, or shortly after, I'd like to
> > know that it actually fixed the problem Philipp experienced with
> > python-mode, so we can revert 4fbd330. If it was caused by e.g.
> > syntax-table changing, we've not improved much.
>
> Philipp, any chance of you trying out python mode with this patch but
> without 4fbd330?

Unfortunately the problem wasn't easily reproducible back then. The problem
would occur from time to time, but I never found a way to trigger it
reproducibly. Therefore the unit test I've added in the commit artificially
generates the symptom. The root cause is still unknown; while syntax-ppss
and narrowing might be a potential root cause (the fontification code uses
both), it might also be something else.
* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
From: Alan Mackenzie @ 2017-09-19 20:50 UTC
To: Philipp Stephani; +Cc: John Wiegley, Dmitry Gutov, 22983

Hello, Philipp.

On Sun, Sep 17, 2017 at 11:12:25 +0000, Philipp Stephani wrote:
> Alan Mackenzie <acm@muc.de> wrote on Sun, 10 Sep 2017 at 13:42:

> > > - Before this change is pushed to master, or shortly after, I'd like to
> > > know that it actually fixed the problem Philipp experienced with
> > > python-mode, so we can revert 4fbd330. If it was caused by e.g.
> > > syntax-table changing, we've not improved much.

> > Philipp, any chance of you trying out python mode with this patch but
> > without 4fbd330?

> Unfortunately the problem wasn't easily reproducible back then. The problem
> would occur from time to time, but I never found a way to trigger it
> reproducibly. Therefore the unit test I've added in the commit artificially
> generates the symptom. The root cause is still unknown; while syntax-ppss
> and narrowing might be a potential root cause (the fontification code uses
> both), it might also be something else.

OK, I understand.  Cache effects are the very devil to debug.

-- 
Alan Mackenzie (Nuremberg, Germany).
* bug#22983: syntax-ppss returns wrong result.
From: Alan Mackenzie @ 2017-09-07 17:56 UTC
To: Dmitry Gutov; +Cc: jwiegley, Philipp Stephani, 22983

Hello, Dmitry.

On Tue, Sep 05, 2017 at 02:34:15 +0300, Dmitry Gutov wrote:
> On 9/2/17 8:40 PM, Alan Mackenzie wrote:

> > I'm not happy about this.  22983 is a serious design flaw, which has had
> > deleterious effects deep within Emacs.

> I'm sure we want to fix design flaws. As long as there is a solid plan
> that does not swap one flaw for another.

Plan or not, it should be fixed.

> > One recorded example, resulting
> > in an infinite loop, is:
> >
> > #########################################################################
> > From: Philipp Stephani <p.stephani2@gmail.com>
> > To: emacs-devel@gnu.org
> > Subject: [PATCH] Protect against an infloop in python-mode
> > Date: Tue, 28 Feb 2017 22:31:49 +0100
> >
> > There appears to be an edge case caused by using `syntax-ppss' in a
> > narrowed buffer during JIT lock inside of Python triple-quote strings.
> > Unfortunately it is impossible to reproduce without manually
> > destroying the syntactic information in the Python buffer, but it has
> > been observed in practice.  In that case it can happen that the syntax
> > caches get sufficiently out of whack so that there appear to be
> > overlapping strings in the buffer.  As Python has no nested strings,
> > this situation is impossible and leads to an infloop in
> > `python-nav-end-of-statement'.  Protect against this by checking
> > whether the search for the end of the current string makes progress.
> > #########################################################################
> >
> > In this case, Philipp had to apply a workaround.

> The problem manifested during jit-lock. Do we understand why the (widen)
> call inside font-lock-default-fontify-region didn't help?

I don't, not in detail, no.  Philipp might know.  But if syntax-ppss was
used whilst the buffer was narrowed, it likely corrupted its cache, and
that corruption remained after widening the buffer.

-- 
Alan Mackenzie (Nuremberg, Germany).
* bug#22983: syntax-ppss returns wrong result.
From: Dmitry Gutov @ 2017-09-07 20:36 UTC
To: Alan Mackenzie; +Cc: jwiegley, Philipp Stephani, 22983

On 9/7/17 8:56 PM, Alan Mackenzie wrote:

>> The problem manifested during jit-lock. Do we understand why the (widen)
>> call inside font-lock-default-fontify-region didn't help?
>
> I don't, not in detail, no.  Philipp might know.  But if syntax-ppss was
> used whilst the buffer was narrowed, it likely corrupted its cache, and
> that corruption remained after widening the buffer.

Details matter.
* bug#22983: syntax-ppss returns wrong result.
From: Vitalie Spinu @ 2016-03-19 23:16 UTC
To: Alan Mackenzie; +Cc: 22983, Dmitry Gutov

>> On Sat, Mar 19 2016 12:27, Alan Mackenzie wrote:

> I think the only sensible functionality for syntax-ppss is to be
> equivalent to (parse-partial-sexp 1 pos).  Then everybody knows where
> they stand.

This would not work for multi modes. Until there is a feasible way to advise
parse-partial-sexp there will be no way to ensure the above contract is
satisfied in multi-modes.

  Vitalie
* bug#22983: syntax-ppss returns wrong result.
From: Vitalie Spinu @ 2016-03-19 23:00 UTC
To: Dmitry Gutov; +Cc: Alan Mackenzie, 22983

Thanks for this. This is a step in the right direction IMHO.

One side note. `syntax-ppss` has a condition-case for args-out-of-range which
could be easily optimized out. You already know that you are calling
parse-partial-sexp with out of range arguments if narrowing is in place. The
current error check obfuscates the logic and makes debugging harder. Would it be
possible for you to have a look once you are on it? Not a big deal though.

Thanks,

  Vitalie

>> On Fri, Mar 18 2016 02:49, Dmitry Gutov wrote:

> On 03/11/2016 05:15 PM, Alan Mackenzie wrote:

> This patch should make ppss-0 and ppss-1 match:

> diff --git a/lisp/emacs-lisp/syntax.el b/lisp/emacs-lisp/syntax.el
> index e20a210..c1b9d84 100644
> --- a/lisp/emacs-lisp/syntax.el
> +++ b/lisp/emacs-lisp/syntax.el
> @@ -371,6 +371,11 @@ syntax-ppss-max-span
>  We try to make sure that cache entries are at least this far apart
>  from each other, to avoid keeping too much useless info.")
>  
> +(defvar syntax-ppss-dont-widen nil
> +  "If non-nil, `syntax-ppss' will work on the non-widened buffer.
> +The code that uses this should create local bindings for
> +`syntax-ppss-cache' and `syntax-ppss-last' too.")
> +
>  (defvar syntax-begin-function nil
>    "Function to move back outside of any comment/string/paren.
>  This function should move the cursor back to some syntactically safe
> @@ -423,12 +428,21 @@ syntax-ppss
>  in the returned list (counting from 0) cannot be relied upon.
>  Point is at POS when this function returns.
>  
> +If `syntax-ppss-dont-widen' is nil, the buffer is temporarily
> +widened.
> +
>  It is necessary to call `syntax-ppss-flush-cache' explicitly if
>  this function is called while `before-change-functions' is
>  temporarily let-bound, or if the buffer is modified without
>  running the hook."
>    ;; Default values.
>    (unless pos (setq pos (point)))
> +  (save-restriction
> +    (unless syntax-ppss-dont-widen
> +      (widen))
> +    (syntax-ppss--at pos)))
> +
> +(defun syntax-ppss--at (pos)
>    (syntax-propertize pos)
>    ;;
>    (let ((old-ppss (cdr syntax-ppss-last))
* bug#22983: syntax-ppss returns wrong result.
From: Dmitry Gutov @ 2016-03-19 23:20 UTC
To: Vitalie Spinu; +Cc: Alan Mackenzie, 22983

On 03/20/2016 01:00 AM, Vitalie Spinu wrote:
>
> Thanks for this. This is a step in the right direction IMHO.
>
> One side note. `syntax-ppss` has a condition-case for args-out-of-range which
> could be easily optimized out. You already know that you are calling
> parse-partial-sexp with out of range arguments if narrowing is in place.

That seems like it might make the code more complex: there are several
parse-partial-sexp calls inside condition-case (for different situations
with the existing cache), and we may have to add a comparison near each
of them.

> The
> current error check obfuscates the logic and makes debugging harder. Would it be
> possible for you to have a look once you are on it? Not a big deal though.

I think you can still follow the execution flow with edebug, can't you?

If you're debugging a problem with args-out-of-range, another option is
to replace `condition-case' with `condition-case-unless-debug' and
re-evaluate the definition (but restore it when you're done, otherwise
the args-out-of-range handler won't fire, I think).
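
The `condition-case-unless-debug' suggestion can be tried out on a small stand-in before touching `syntax-ppss' itself. The sketch below is only an illustration under stated assumptions: `my-parse-or-nil' is a hypothetical wrapper, not code from syntax.el, and it merely mimics the shape of the handler being discussed.

```elisp
(defun my-parse-or-nil (from to)
  "Return the parser state from FROM to TO, or nil if out of range.
With `debug-on-error' non-nil the handler is bypassed, so the
`args-out-of-range' error reaches the debugger instead."
  (condition-case-unless-debug nil
      ;; `parse-partial-sexp' moves point, so protect it.
      (save-excursion (parse-partial-sexp from to))
    (args-out-of-range nil)))

(with-temp-buffer
  (insert "(a b)")
  ;; Bind `debug-on-error' explicitly so the handler is active here.
  (let ((debug-on-error nil))
    (list (my-parse-or-nil 1 4)       ; in range: a parser state
          (my-parse-or-nil 1 100))))  ; out of range: handled, gives nil
```

Setting `debug-on-error' to t before the second call is what makes the error surface in the debugger, which is the behaviour Dmitry describes as useful while chasing an args-out-of-range problem.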
* bug#22983: syntax-ppss returns wrong result.
From: Alan Mackenzie @ 2017-10-01 16:31 UTC
To: 22983-done

The bug has been fixed by patches to the emacs-26 branch.

-- 
Alan Mackenzie (Nuremberg, Germany).