From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Major modes using `widen' is a good, even essential, programming practice.
Date: Mon, 08 Aug 2022 08:05:31 -0400
To: Eli Zaretskii
Cc: acm@muc.de, gregory@heytings.org, emacs-devel@gnu.org
In-Reply-To: <838rnzvup5.fsf@gnu.org> (Eli Zaretskii's message of "Mon, 08 Aug 2022 14:30:30 +0300")
References: <6ae35c9306ade07b4c45@heytings.org> <83fsi7wjqe.fsf@gnu.org> <838rnzvup5.fsf@gnu.org>

>> > The intent is clearly that fontifications don't look far beyond these
>> > two points, because otherwise the whole design of jit-lock and its
>> > invocations during redisplay is basically thrown out the window.
>> Usually, font-lock rules don't look before BOL or after EOL, indeed,
> But BOL and EOL could also be very far away, so that, too, needs some
> reasonable limit.

Indeed. But in general this is easier to address, because the use of
BOL/EOL is itself a heuristic, so we can impose some other limit (as is
done now with the forced narrowing or with `syntax-wholeline-max`) and
the harm is usually tolerable, IME.

>> *except* via `syntax-ppss` which does look at all the text from BOB
>> to point. To make up for that, `syntax-ppss` relies heavily on caching,
>> so that it *usually* doesn't need to look very far at all (and if
>> there's no `syntax-propertize-function`, it's usually quite fast
>> because it's fully coded in C).
>>
>> For GB-sized buffers, even the fast C code of `syntax-ppss` incurs
>> a significant delay in the "unusual" case, so have various options:
>
> I reported this for a 18MB single-line file, which is way below the GB
> bar.

On my own builds (which are slow on old machines), 18MB typically
results in a delay of just a few seconds. Given that it's occasional,
I find it very tolerable (similar delays can occur for other reasons,
such as swapping, or a GC when the heap is large). But this scales
linearly, so sooner or later on the way to multi-GB files the delay
becomes intolerable.

> The problem is that the initial full scan can take "forever" in those
> files, and that basically means we cannot edit such files in practice.
> So if you dislike the current solution of locked narrowing, how about
> making syntax-ppss work in chunks (perhaps from an idle timer?), after
> initially scanning only the first small portion of the file. The goal
> is to have the file displayed quickly enough, and thereafter complete
> the scan when possible.

Yes, that's probably the better long term solution.


        Stefan
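
As a rough illustration of the "work in chunks" idea discussed above,
here is a toy model in Python. It is not Emacs code and does not
reflect how `syntax-ppss` is actually implemented: the class
`ChunkedScanner`, its methods, and the use of parenthesis depth as the
"parse state" are all hypothetical stand-ins. It only shows the shape
of the technique: scan a bounded chunk per step, remember the state at
each chunk boundary, and answer queries by resuming from the nearest
earlier checkpoint instead of from the beginning of the text.

```python
from bisect import bisect_right


class ChunkedScanner:
    """Toy model: track parenthesis depth, scanning in bounded chunks."""

    def __init__(self, text, chunk_size=4096):
        self.text = text
        self.chunk_size = chunk_size
        # Checkpoints: depths[i] is the depth just before positions[i].
        self.positions = [0]
        self.depths = [0]

    @staticmethod
    def _scan(text, start, end, depth):
        # Linear scan of text[start:end], starting from a known depth.
        for ch in text[start:end]:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
        return depth

    def done(self):
        return self.positions[-1] >= len(self.text)

    def step(self):
        """Scan one more chunk; return True while work remains."""
        if self.done():
            return False
        start = self.positions[-1]
        end = min(len(self.text), start + self.chunk_size)
        depth = self._scan(self.text, start, end, self.depths[-1])
        self.positions.append(end)
        self.depths.append(depth)
        return not self.done()

    def depth_at(self, pos):
        """Depth just before POS, resuming from the nearest checkpoint."""
        i = bisect_right(self.positions, pos) - 1
        return self._scan(self.text, self.positions[i], pos, self.depths[i])


scanner = ChunkedScanner("(a (b) c) (d" * 10, chunk_size=7)
scanner.step()            # first chunk only: enough to show the start
early = scanner.depth_at(5)
while scanner.step():     # the rest would be driven by an idle hook
    pass
final = scanner.depth_at(len(scanner.text))
```

In this model each `step` costs at most `chunk_size` characters of
scanning, so the caller decides how much latency to accept per step. A
query beyond the last checkpoint still falls back to a linear scan
from that checkpoint, which is the case the idle-time scanning is
meant to make rare.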