From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#51766: 29.0.50; Return value of buffer-chars-modified-tick
 changes when buffer text is not yet changed before inserting a character
 for non-latin input methods
Date: Fri, 12 Nov 2021 17:17:29 +0200
Message-ID: <83zgq9xv1y.fsf@gnu.org>
References: <87mtmalrs1.fsf@localhost> <837dde200c.fsf@gnu.org>
 <87k0helmig.fsf@localhost> <831r3m1tpk.fsf@gnu.org>
 <8735o1r31q.fsf@localhost> <834k8hzi10.fsf@gnu.org>
 <87zgq9pmb6.fsf@localhost> <831r3lzfk4.fsf@gnu.org>
 <87wnldpk5x.fsf@localhost>
In-Reply-To: <87wnldpk5x.fsf@localhost> (message from Ihor Radchenko on
 Fri, 12 Nov 2021 21:39:54 +0800)
Cc: 51766@debbugs.gnu.org
To: Ihor Radchenko

> From: Ihor Radchenko
> Cc: 51766@debbugs.gnu.org
> Date: Fri, 12 Nov 2021 21:39:54 +0800
>
> Eli Zaretskii writes:
>
> >> quail changes the buffer after org-element--after-change-function call,
> >> but before org-element--before-change-function. So, all Org can see is
> >> that something has been changed in buffer, but there is no way to tell
> >> what it was. Org cannot distinguish between harmless buffer edits by
> >> quail (they do not change buffer text) and other kinds of "silent"
> >> changes.
> >
> > OK, but why does this invalidate what Org does? All it means, AFAIU,
> > is that in some cases Org will do unnecessary processing. Those cases
> > are probably not too frequent.
>
> Normally, buffer changes under inhibit-modification-hooks are not
> frequent indeed. But not with quail + non-latin input methods. Every
> single self-insert-command triggers such change.

"Such change" being what exactly? the situation where
buffer-chars-modified-tick changes between post-command-hook and the
following pre-command-hook? or something else?

I'm asking because I don't think I see the problem you are
describing.

With the following code:

  (defun my-before (beg end)
    (message "buf %s beg %s end %s" (current-buffer) beg end))
  (defun my-after (beg end len)
    (message "buf %s beg %s end %s len %s" (current-buffer) beg end len))
  (add-hook 'before-change-functions 'my-before)
  (add-hook 'after-change-functions 'my-after)

if I activate the chinese-py input method, then inserting any
character via the input method produces a single call to my-before and
a single call to my-after, and with the expected values, for example:

  buf *scratch* beg 440 end 440
  buf *scratch* beg 440 end 441 len 0

So what exactly is the problem with these hooks when non-latin input
methods are used? Or what am I missing?

> > IOW, why invalidating the cache unnecessarily is such a big deal?
>
> It is critical for Org to know which part of buffer was changed (i.e.
> beg, end, length that are normally passed as arguments of
> after-change-functions). org-element-cache can contain >100k elements
> for especially large buffers. Manually checking which elements are
> changed without knowing the changed region is inefficient. Clearing the
> cache is not too much of a big deal, but causes slowdown when user runs
> a query on Org buffers (i.e. in agenda or sparse trees) - the buffer has
> to be re-parsed. When every edit triggers cache invalidation (that's
> what happens when user uses non-latin input method), the slowdown is
> pretty much guaranteed. Moreover, too frequent cache resets increase the
> load on Emacs' garbage collector (cache size is typically a multiple of
> buffer size). Overloading garbage collector leads to overall Emacs
> slowdown.

Perhaps Org developers should ask for infrastructure changes that will
allow Org to maintain such a cache reliably and not too expensively?
It sounds like Org currently applies all kinds of heuristics based on
assumptions about how the internals work and using hooks and features
that were never designed to support this kind of caching.
Jumping through hoops in Lisp trying to implement something that might
be much easier or even trivial in C is not the best way of getting
such jobs done. So perhaps someone could describe on emacs-devel what
Org needs in order to maintain this cache, and we could then see how
to provide those features to Org.

> >> The hooks are called, but after quail already triggered
> >> buffer-chars-modified-tick increase. If quail called
> >> before-change-functions before buffer-chars-modified-tick increases, it
> >> would be useful for my scenario. Though I am not sure how feasible it
> >> is. Just an idea.
> >
> > Would it help if Org looked at both buffer-modified-tick and
> > buffer-chars-modified-tick?
>
> When buffer-chars-modified-tick is changed, buffer-modified-tick is also
> changed. AFAIU, buffer-chars-modified-tick registers a subset of buffer
> modifications that actually change buffer text. buffer-modified-tick
> also registers text property changes. So, I do not see how
> buffer-modified-tick can help.

If you look at the implementation, you will see that when Emacs
decides that the buffer's chars-modified-tick needs to be increased,
it simply assigns to it the value of the buffer's modified-tick at
that moment. So by tracking the value of buffer-modified-tick you
could perhaps explain why buffer-chars-modified-tick jumps by more
than you expected.
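
As a rough illustration of that last point, here is a minimal sketch
(my-tick-demo is a name invented for this example, and the absolute
tick values and the size of each increment depend on the Emacs
version, so only the comparisons between the two ticks are
meaningful):

```elisp
(defun my-tick-demo ()
  "Return (MODIFIED-TICK . CHARS-MODIFIED-TICK) pairs sampled after three edits.
The edits are: insert text, change a text property, insert text again."
  (with-temp-buffer
    (let (samples)
      ;; 1. A text insertion advances both ticks; the chars tick is
      ;;    set to the value of the modified tick.
      (insert "abc")
      (push (cons (buffer-modified-tick) (buffer-chars-modified-tick))
            samples)
      ;; 2. A text-property change advances only the modified tick,
      ;;    so the chars tick now lags behind.
      (put-text-property 1 2 'face 'bold)
      (push (cons (buffer-modified-tick) (buffer-chars-modified-tick))
            samples)
      ;; 3. The next insertion makes the chars tick catch up with the
      ;;    modified tick, which is why it can appear to "jump".
      (insert "d")
      (push (cons (buffer-modified-tick) (buffer-chars-modified-tick))
            samples)
      (nreverse samples))))
```

Evaluating (my-tick-demo) should show the two ticks equal after the
first and third samples, and buffer-modified-tick ahead of
buffer-chars-modified-tick in the second one.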