From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: Lynn Winebarger
Newsgroups: gmane.emacs.devel
Subject: Re: Major modes using `widen' is a good, even essential, programming practice.
Date: Mon, 8 Aug 2022 07:16:44 -0400
References: <6ae35c9306ade07b4c45@heytings.org> <83fsi7wjqe.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="000000000000265dfe05e5b8f439"
Cc: Eli Zaretskii, Alan Mackenzie, gregory@heytings.org, emacs-devel
To: Stefan Monnier
X-BeenThere:
emacs-devel@gnu.org
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:293251

--000000000000265dfe05e5b8f439
Content-Type: text/plain; charset="UTF-8"

On Mon, Aug 8, 2022, 5:29 AM Stefan Monnier wrote:

> >> Eli Zaretskii [2022-08-07 17:20:52] wrote:
> >> > jit-lock calls the functions with two arguments, BEG and END, and
> >> > expects them to work only on that chunk of text.
> >>
> >> That is not the case: it expects the function to "fontify" *at least*
> >> from BEG to END, but is quite happy to let it fontify more (and the
> >> function can return a value indicating which portion was actually
> >> returned in that case).  Furthermore, it's clear that fontification of
> >> BEG..END may need to look at text before BEG (and occasionally beyond
> >> END as well).
> >
> > The intent is clearly that fontifications don't look far beyond these
> > two points, because otherwise the whole design of jit-lock and its
> > invocations during redisplay is basically thrown out the window.
>
> Usually, font-lock rules don't look before BOL or after EOL, indeed,
> *except* via `syntax-ppss` which does look at all the text from BOB
> to point.  To make up for that, `syntax-ppss` relies heavily on caching,
> so that it *usually* doesn't need to look very far at all (and if
> there's no `syntax-propertize-function`, it's usually quite fast
> because it's fully coded in C).
>
> For GB-sized buffers, even the fast C code of `syntax-ppss` incurs
> a significant delay in the "unusual" case, so have various options:
> - suck it up (potentially wait several minutes when jumping to the end
>   of the file).
> - give up providing more or less correct highlighting (either via some
>   arbitrary narrowing like we do now, or turning off font-lock).
> - try and find some clever heuristic that can find a "nearby safe spot",
>   i.e. a position for which we can guess the PPSS value (usually we
>   look for a position that is "known" to be outside of any string,
>   comment, or parenthesis).
> - display the buffer quickly without highlighting while the fontification
>   is computed in the background.

I know CC mode relies on heuristics to identify syntactic structures
rather than on a full parser (whether from Semantic or LSP), but the
issue seems to be that you don't have a parse state for the beginning
of the narrowed buffer, where an initial parse state is inappropriate.
Assuming that text outside the narrowing is not allowed to change, the
appropriate parse state should only need to be determined once, at the
time of narrowing.

So, could there be a pre-narrowing hook, run before narrowing takes
effect, that lets a major mode determine the appropriate parse state
for the beginning of the narrowed buffer?

Also, I'm not a big user of explicit narrowing; the only place I've
noticed it happening is in Info mode, where the focus is narrowed to a
particular syntactic unit.  Is there a way for a major mode to let the
user signal the syntactic unit they believe they are narrowing to,
either with command variants or with a prompt (offering a list of
options supplied by the mode) when the user narrows interactively?
The fall-back would be either to have the mode determine the correct
initial state or to turn off fontification while the narrowing is in
effect.

Lynn

--000000000000265dfe05e5b8f439
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--000000000000265dfe05e5b8f439--
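
To make the pre-narrowing idea in this message a bit more concrete, here
is a toy model in Python (not Emacs Lisp, and not how `syntax-ppss` is
actually implemented).  It assumes a drastically simplified parse state
that tracks only paren depth and whether we are inside a string, with no
escapes or comments.  The names ParseState, scan and NarrowedBuffer are
all hypothetical.  The point is only that one scan from the start of the
buffer at narrowing time yields a seed state, after which every parse
inside the narrowing can start from the narrowed beginning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParseState:
    """Toy stand-in for a parse state: paren depth and string status only."""
    depth: int = 0
    in_string: bool = False


def scan(text: str, start: int, end: int,
         state: ParseState = ParseState()) -> ParseState:
    """Scan text[start:end] beginning in STATE; return the state at END."""
    depth, in_string = state.depth, state.in_string
    for ch in text[start:end]:
        if in_string:
            if ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
    return ParseState(depth, in_string)


class NarrowedBuffer:
    """Hypothetical model of a narrowing that remembers the state at its start."""

    def __init__(self, text: str, beg: int, end: int):
        self.text, self.beg, self.end = text, beg, end
        # The "pre-narrowing hook": a single scan over the text before BEG.
        # This is valid only while that text does not change.
        self.start_state = scan(text, 0, beg)

    def state_at(self, pos: int) -> ParseState:
        """State at POS, computed without looking before the narrowing."""
        if not self.beg <= pos <= self.end:
            raise ValueError("position outside the narrowing")
        return scan(self.text, self.beg, pos, self.start_state)
```

If the narrowing starts inside a string, scanning from the narrowed
beginning with a fresh ParseState() misreads everything after it, while
state_at, seeded with start_state, agrees with a scan of the whole
buffer.  The cost of the full scan is paid once per narrowing rather
than on every fontification request.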