From: Ihor Radchenko <yantar92@gmail.com>
To: Nicolas Goaziou <mail@nicolasgoaziou.fr>
Cc: emacs-orgmode@gnu.org
Subject: Re: [patch suggestion] Mitigating the poor Emacs performance on
 huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers
Date: Fri, 05 Jun 2020 16:18:59 +0800
Message-ID: <87mu5iq618.fsf@localhost>
In-Reply-To: <87r1uuotw8.fsf@nicolasgoaziou.fr>

> See also `gensym'. Do we really need to use it for something else than
> `invisible'? If not, the tool doesn't need to be generic.

For now, I also use it for the buffer-local 'invisible stack. The stack
is needed to preserve the folding state of drawers/blocks inside a
folded outline. However, I am thinking about replacing the stack with
separate text properties, such as 'invisible-outline-buffer-local +
'invisible-drawer-buffer-local + 'invisible-block-buffer-local.
Maintaining the stack takes a noticeable percentage of CPU time in the
profiler.

org--get-buffer-local-text-property-symbol must take care of the
situation with indirect buffers. When an indirect buffer is created from
some org buffer, the old value of char-property-alias-alist is carried
over. We need to detect this case and create a new buffer-local symbol
that is unique to the newly created buffer (but not create it if the
buffer-local property is already there).
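A minimal sketch of the detection step described above, under stated
assumptions: the names `my/alias-counter', `my/invisible-alias', and
`my/buffer-local-invisible-symbol' are hypothetical, and this is not the
code from the patch. It records which buffer an alias symbol was created
for, so that a value cloned into an indirect buffer can be recognized
and replaced with a fresh symbol:

```elisp
(require 'cl-lib)

;; Hypothetical helper state; none of these names come from the patch.
(defvar my/alias-counter 0
  "Counter used to generate fresh alias symbols.")

(defvar-local my/invisible-alias nil
  "Cons (BUFFER . SYMBOL): the alias symbol and the buffer it was made for.")

(defun my/buffer-local-invisible-symbol ()
  "Return a symbol that aliases `invisible' in the current buffer only.
Create a fresh symbol when none exists yet, or when the recorded one
was cloned from another buffer (the indirect-buffer case)."
  (unless (eq (car my/invisible-alias) (current-buffer))
    (let* ((old (cdr my/invisible-alias))
           (new (intern (format "my-invisible-%d"
                                (cl-incf my/alias-counter))))
           ;; Build new lists instead of mutating the old ones: a cloned
           ;; value may share structure with the base buffer's value.
           (alternatives
            (remq old (cdr (assq 'invisible char-property-alias-alist))))
           (others
            (cl-remove 'invisible char-property-alias-alist :key #'car)))
      (setq-local char-property-alias-alist
                  (cons (cons 'invisible (cons new alternatives)) others))
      (setq my/invisible-alias (cons (current-buffer) new))))
  (cdr my/invisible-alias))
```

Calling the function again in the same buffer returns the existing
symbol. The sketch stops at creating the symbol; it does not copy any
folding state.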
Then, the new symbol must replace the old alias in
char-property-alias-alist, and the old folding state must be preserved
(by copying the old invisibility specs into the new buffer-local text
property). I do not see how gensym can benefit this logic.

> OK, but this may not be sufficient if we want to do slightly better than
> overlays in that area. This is not mandatory, though.

Could you elaborate on what can be "slightly better"?

> As discussed before, I don't think you need to use `modification-hooks'
> or `insert-behind-hooks' if you already use `after-change-functions'.
>
> `after-change-functions' are also triggered upon text properties
> changes. So, what is the use case for the other hooks?

The problem is that `after-change-functions' cannot be a text property.
Only `modification-hooks' and `insert-in-front/behind-hooks' are valid
text properties. If we use `after-change-functions', they will always be
triggered, regardless of whether the change was made inside or outside a
folded region.

>> :asd:
>> :drawer:
>> lksjdfksdfjl
>> sdfsdfsdf
>> :end:
>>
>> If :asd: was inserted in front of folded :drawer:, changes in :drawer:
>> line of the new folded :asd: drawer would reveal the text between
>> :drawer: and :end:.
>>
>> Let me know what you think on this.

> I have first to understand the use case for `modification-hook'. But
> I think unfolding is the right thing to do in this situation, isn't it?

That situation arises because the modification-hooks from ":drawer:"
(they are set via text properties) only have information about the
:drawer:...:end: drawer as it was before the modifications (they were
set when :drawer: was last folded). So, they will only unfold a part of
the new :asd: drawer. I do not see a simple way to unfold everything
without re-parsing the drawer around the changed text.

Actually, I am quite unhappy with the performance of modification-hooks
set via text properties (I have been using this patch in my Emacs this
week).
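To make the difference between the two kinds of hooks concrete, here is
a small sketch (the `my/...' names are hypothetical, and this is not
code from the patch). It attaches `modification-hooks' as a text
property to a ":drawer:" line, installs a buffer-local
`after-change-functions' entry, and records which hooks run for an edit
outside and inside that line:

```elisp
;; Hypothetical names throughout; this only illustrates hook scoping.
(defvar my/log nil
  "Hook kinds recorded by the functions below, most recent first.")

(defun my/on-drawer-change (_beg _end)
  "Record a change touching text that carries `modification-hooks'."
  (push 'text-property-hook my/log))

(defun my/on-any-change (_beg _end _len)
  "Record any change in the buffer."
  (push 'after-change my/log))

(defun my/hook-demo ()
  "Return (OUTSIDE INSIDE): hook kinds seen for an edit outside the
\":drawer:\" line and for an edit inside it."
  (with-temp-buffer
    (insert "outside\n:drawer:\ncontents\n:end:\n")
    (goto-char (point-min))
    (search-forward ":drawer:")
    ;; The text-property hook is attached to the ":drawer:" line only.
    (put-text-property (match-beginning 0) (match-end 0)
                       'modification-hooks (list #'my/on-drawer-change))
    ;; The after-change hook applies to the whole buffer.
    (add-hook 'after-change-functions #'my/on-any-change nil t)
    (let (outside inside)
      (setq my/log nil)
      (goto-char (point-min))
      (delete-char 1)                 ; edit outside the drawer line
      (setq outside (reverse my/log))
      (setq my/log nil)
      (search-forward ":drawer")
      (delete-char -1)                ; edit inside the drawer line
      (setq inside (reverse my/log))
      (list outside inside))))
```

In this sketch the function on `after-change-functions' is recorded for
both edits, while the text-property hook is recorded only for the edit
that touches the ":drawer:" line.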
It appears that setting the text properties costs significant CPU time
in practice, even though running the hooks is pretty fast. I will think
about a way to handle modifications using global after-change-functions.

> `org--get-element-region-at-point' is certainly faster, but it is also
> wrong, unfortunately.
>
> Org syntax is not context-free grammar. If you try to parse it locally,
> starting from anywhere, it will fail at some point. For example, your
> function would choke in the following case:
>
> [fn:1] Def1
> #+begin_something
>
> [fn:2] Def2
> #+end_something

I see.

> AFAIK, the only proper way to parse it is to start from a known position
> in the buffer. If you have no information about the buffer, the headline
> above is the position you want. With cache could help to start below.
> Anyway, in this particular case, you should not use
> `org--get-element-region-at-point'.

OK

Best,
Ihor

Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:

> Hello,
>
> Ihor Radchenko <yantar92@gmail.com> writes:
>
>> [The patch itself will be provided in the following email]
>
> Thank you.
>
>> I have found char-property-alias-alist variable that controls how Emacs
>> calculates text property value if the property is not set. This variable
>> can be buffer-local, which allows independent 'invisible states in
>> different buffers.
>
> Great. I didn't know about this variable!
>
>> All the implementation stays in
>> org--get-buffer-local-text-property-symbol, which takes care about
>> generating unique property name and mapping it to 'invisible (or any
>> other) text property.
>
> See also `gensym'. Do we really need to use it for something else than
> `invisible'? If not, the tool doesn't need to be generic.
>
>> I simplified the code as suggested, without using pairs of before- and
>> after-change-functions.
>
> Great!
>
>> Handling text inserted into folded/invisible region is handled by a
>> simple after-change function.
>> After testing, it turned out that simple
>> re-hiding text based on 'invisible property of the text before/after the
>> inserted region works pretty well.
>
> OK, but this may not be sufficient if we want to do slightly better than
> overlays in that area. This is not mandatory, though.
>
>> Modifications to BEGIN/END line of the drawers and blocks is handled via
>> 'modification-hooks + 'insert-behind-hooks text properties (there is no
>> after-change-functions analogue for text properties in Emacs). The
>> property is applied during folding and the modification-hook function is
>> made aware about the drawer/block boundaries (via apply-partially
>> passing element containing :begin :end markers for the current
>> drawer/block). Passing the element boundary is important because the
>> 'modification-hook will not directly know where it belongs to. Only the
>> modified region (which can be larger than the drawer) is passed to the
>> function. In the worst case, the region can be the whole buffer (if one
>> runs revert-buffer).
>
> As discussed before, I don't think you need to use `modification-hooks'
> or `insert-behind-hooks' if you already use `after-change-functions'.
>
> `after-change-functions' are also triggered upon text properties
> changes. So, what is the use case for the other hooks?
>
>> It turned out that adding 'modification-hook text property takes a
>> significant cpu time (partially, because we need to take care about
>> possible existing 'modification-hook value, see
>> org--add-to-list-text-property). For now, I decided to not clear the
>> modification hooks during unfolding because of poor performance.
>> However, this approach would lead to partial unfolding in the following
>> case:
>>
>> :asd:
>> :drawer:
>> lksjdfksdfjl
>> sdfsdfsdf
>> :end:
>>
>> If :asd: was inserted in front of folded :drawer:, changes in :drawer:
>> line of the new folded :asd: drawer would reveal the text between
>> :drawer: and :end:.
>>
>> Let me know what you think on this.
>
> I have first to understand the use case for `modification-hook'. But
> I think unfolding is the right thing to do in this situation, isn't it?
>
>> My simplified implementation of element boundary parser
>> (org--get-element-region-at-point) appears to be much faster and also
>> uses much less memory in comparison with org-element-at-point.
>> Moreover, not all the places where org-element-at-point is called
>> actually need the full parsed element. For example, org-hide-drawer-all,
>> org-hide-drawer-toggle, org-hide-block-toggle, and
>> org--hide-wrapper-toggle only need element type and some information
>> about the element boundaries - the information we can get from
>> org--get-element-region-at-point.
>
> [...]
>
>> What do you think about the idea of making use of
>> org--get-element-region-at-point in org code base?
>
> `org--get-element-region-at-point' is certainly faster, but it is also
> wrong, unfortunately.
>
> Org syntax is not context-free grammar. If you try to parse it locally,
> starting from anywhere, it will fail at some point. For example, your
> function would choke in the following case:
>
> [fn:1] Def1
> #+begin_something
>
> [fn:2] Def2
> #+end_something
>
> AFAIK, the only proper way to parse it is to start from a known position
> in the buffer. If you have no information about the buffer, the headline
> above is the position you want. With cache could help to start below.
> Anyway, in this particular case, you should not use
> `org--get-element-region-at-point'.
>
> Hopefully, we don't need to parse anything. In an earlier message,
> I suggested a few checks to make on the modified text in order to decide
> if something should be unfolded, or not. I suggest to start from there,
> and fix any shortcomings we might encounter. We're replacing overlays:
> low-level is good in this area.
>
> WDYT?
>
>
> Regards,
>
> --
> Nicolas Goaziou

--
Ihor Radchenko, PhD,
Center for Advancing Materials Performance from the Nanoscale (CAMP-nano)
State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong University, Xi'an, China
Email: yantar92@gmail.com, ihor_radchenko@alumni.sutd.edu.sg