emacs-orgmode@gnu.org archives
* Please document the caching and its user options
@ 2024-06-12  9:38 Eli Zaretskii
  2024-06-14 13:12 ` Ihor Radchenko
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2024-06-12  9:38 UTC (permalink / raw)
  To: emacs-orgmode

I needed to visit org.org, the Org manual, today, and to my surprise
saw Emacs writing some data files into the ~/.cache/org-persist/
directory.  What's more, Emacs popped a buffer out of the blue telling
me that it could not safely encode the data written to (I presume)
some of those files, and asked me to select a safe coding-system.

By randomly poking here and there, I've managed to figure out that
this is due to org-element's caching of data from parsing Org files.
It seems this caching is turned on by default, but is not documented
in the Org manual, and in particular there's nothing in the manual
about turning off the caching.

Please document the caching features of Org in the manual, including
how to turn that off.  (I also question the wisdom of turning this on
by default without as much as a single request for confirmation from
the user.)

Please also make sure that the code which actually writes the data to
the cache files makes a point of binding coding-system-for-write to a
proper value (probably utf-8-unix), or forces
buffer-file-coding-system of the buffer from which it writes to have
such a safe value, to avoid annoying and unexpected prompting of the
user to select a proper encoding.  Lisp programs that write files in
the background cannot fail to set a proper encoding, because the call
to select-safe-coding-system is not supposed to be triggered by Lisp
programs unless they run as a direct result of a user-invoked command.

I've seen those problems in Emacs 29.3.  If these issues are already
solved in what will become Emacs 30, then my apologies, and kudos to
whoever solved them.  (However, the latest Org manual still keeps
completely silent about these features and their control by users.)

Thanks.



* Re: Please document the caching and its user options
  2024-06-12  9:38 Please document the caching and its user options Eli Zaretskii
@ 2024-06-14 13:12 ` Ihor Radchenko
  2024-06-14 13:41   ` Eli Zaretskii
                     ` (2 more replies)
  0 siblings, 3 replies; 61+ messages in thread
From: Ihor Radchenko @ 2024-06-14 13:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-orgmode

Eli Zaretskii <eliz@gnu.org> writes:

> I needed to visit org.org, the Org manual, today, and to my surprise
> saw Emacs writing some data files into the ~/.cache/org-persist/
> directory.  What's more, Emacs popped a buffer out of the blue telling
> me that it could not safely encode the data written to (I presume)
> some of those files, and asked me to select a safe coding-system.
>
> By randomly poking here and there, I've managed to figure out that
> this is due to org-element's caching of data from parsing Org files.
> It seems this caching is turned on by default, but is not documented
> in the Org manual, and in particular there's nothing in the manual
> about turning off the caching.
>
> Please document the caching features of Org in the manual, including
> how to turn that off.  (I also question the wisdom of turning this on
> by default without as much as a single request for confirmation from
> the user.)

Hmm. What aspect of caching do you want us to document?
FYI, Org mode has been doing various forms of caching since
forever. Recently, we just employed a bit more regular API and
introduced one more kind of caching - parser cache. In addition to the
previously existing image cache, publishing cache, ID cache, clock
cache, etc.

> Please also make sure that the code which actually writes the data to
> the cache files makes a point of binding coding-system-for-write to a
> proper value (probably utf-8-unix), or forces
> buffer-file-coding-system of the buffer from which it writes to have
> such a safe value, to avoid annoying and unexpected prompting of the
> user to select a proper encoding.  Lisp programs that write files in
> the background cannot fail to set a proper encoding, because the call
> to select-safe-coding-system is not supposed to be triggered by Lisp
> programs unless they run as a direct result of a user-invoked command.

I believe that this particular problem has been solved in
https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=c8f88589c
It is a part of Org 9.7.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: Please document the caching and its user options
  2024-06-14 13:12 ` Ihor Radchenko
@ 2024-06-14 13:41   ` Eli Zaretskii
  2024-06-14 15:31     ` Ihor Radchenko
  2024-06-14 13:56   ` Jens Lechtenboerger
  2024-06-16  5:40   ` Please document the caching and its user options Daniel Clemente
  2 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2024-06-14 13:41 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: emacs-orgmode@gnu.org
> Date: Fri, 14 Jun 2024 13:12:42 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > I needed to visit org.org, the Org manual, today, and to my surprise
> > saw Emacs writing some data files into the ~/.cache/org-persist/
> > directory.  What's more, Emacs popped a buffer out of the blue telling
> > me that it could not safely encode the data written to (I presume)
> > some of those files, and asked me to select a safe coding-system.
> >
> > By randomly poking here and there, I've managed to figure out that
> > this is due to org-element's caching of data from parsing Org files.
> > It seems this caching is turned on by default, but is not documented
> > in the Org manual, and in particular there's nothing in the manual
> > about turning off the caching.
> >
> > Please document the caching features of Org in the manual, including
> > how to turn that off.  (I also question the wisdom of turning this on
> > by default without as much as a single request for confirmation from
> > the user.)
> 
> Hmm. What aspect of caching do you want us to document?

First and foremost, that it exists, and is turned on by default.  The
manual is currently completely silent about it.

Next, please document the user options that control this caching, and
especially those options which can be used to turn this caching off or
direct it to a different place.
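
A minimal sketch of the options in question, assuming Org 9.6 or later,
where `org-element-use-cache', `org-element-cache-persistent' and
`org-persist-directory' are defined.  The directory path is only an
example, and the settings are probably best placed in the init file
before Org is loaded:

```elisp
;; Keep the in-memory parser cache, but stop writing it to disk.
(setq org-element-cache-persistent nil)

;; Or disable the parser cache altogether.
;; (setq org-element-use-cache nil)

;; Or keep the on-disk cache and move it elsewhere.
;; The path below is a made-up example, not a default.
;; (setq org-persist-directory "~/.emacs.d/var/org-persist/")
```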

> FYI, Org mode has been doing various forms of caching since
> forever. Recently, we just employed a bit more regular API and
> introduced one more kind of caching - parser cache. In addition to the
> previously existing image cache, publishing cache, ID cache, clock
> cache, etc.

I'm not a heavy user of Org, but I do have several Org files that I
visit from time to time.  This was the first time I got prompted about
anything related to this caching.  Moreover, I think this was the
first time the Org file I visited was parsed by Org and the results
cached: I have a feature on my system that prominently indicates when
the machine is heavily loaded, and I was surprised to see it in action
when I visited org.org.  I never had this activated before just by
visiting an Org file.  I presumed the high load was due to the
parsing.  So either this is very new, or maybe my Org files are much
simpler than doc/misc/org.org, and so the parsing I triggered before
was much less expensive.

I hope you now understand why I wrote this report now and not before,
and why I was surprised: this caching was never so explicitly and
prominently in my face, so I could have completely missed its
existence.

> > Please also make sure that the code which actually writes the data to
> > the cache files makes a point of binding coding-system-for-write to a
> > proper value (probably utf-8-unix), or forces
> > buffer-file-coding-system of the buffer from which it writes to have
> > such a safe value, to avoid annoying and unexpected prompting of the
> > user to select a proper encoding.  Lisp programs that write files in
> > the background cannot fail to set a proper encoding, because the call
> > to select-safe-coding-system is not supposed to be triggered by Lisp
> > programs unless they run as a direct result of a user-invoked command.
> 
> I believe that this particular problem has been solved in
> https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=c8f88589c
> It is a part of Org 9.7.

Maybe.  Try visiting org.org on a system whose locale is set to, say,
Latin-1, and see if you get the warnings about a safe coding-system.

But why do you use utf-8 there and not utf-8-unix?  Come to think
about it, why not emacs-internal?  Those files are used internally by
Org, so they should be able to encode any characters supported by
Emacs, not just those which have UTF-8 encoding.  And using native EOL
convention is not needed, and will get in the way if the user shares
these files between systems.
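
The end-of-line difference between these coding systems can be
inspected with `coding-system-eol-type'.  A quick sketch, to be
evaluated in *scratch* or with M-: (the comments describe what the
function is documented to return):

```elisp
;; `utf-8' leaves the EOL convention undecided: the result is a vector
;; of the three variants, [utf-8-unix utf-8-dos utf-8-mac].
(coding-system-eol-type 'utf-8)

;; `utf-8-unix' and `emacs-internal' both pin the convention to
;; Unix-style LF, for which the function returns 0.
(coding-system-eol-type 'utf-8-unix)
(coding-system-eol-type 'emacs-internal)
```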



* Re: Please document the caching and its user options
  2024-06-14 13:12 ` Ihor Radchenko
  2024-06-14 13:41   ` Eli Zaretskii
@ 2024-06-14 13:56   ` Jens Lechtenboerger
  2024-06-14 14:31     ` Publishing cache (was: Please document the caching and its user options) Ihor Radchenko
  2024-06-16  5:40   ` Please document the caching and its user options Daniel Clemente
  2 siblings, 1 reply; 61+ messages in thread
From: Jens Lechtenboerger @ 2024-06-14 13:56 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode

On 2024-06-14, Ihor Radchenko wrote:

> Eli Zaretskii <eliz@gnu.org> writes:
>
>> Please document the caching features of Org in the manual, including
>> how to turn that off.  (I also question the wisdom of turning this on
>> by default without as much as a single request for confirmation from
>> the user.)
>
> Hmm. What aspect of caching do you want us to document?
> FYI, Org mode has been doing various forms of caching since
> forever. Recently, we just employed a bit more regular API and
> introduced one more kind of caching - parser cache. In addition to the
> previously existing image cache, publishing cache, ID cache, clock
> cache, etc.

Jumping in here, I do not understand the publishing cache.  Some of
my org documents are re-published every time, while others are only
re-published after changes.  What is checked where?

Best wishes,
Jens


* Publishing cache (was: Please document the caching and its user options)
  2024-06-14 13:56   ` Jens Lechtenboerger
@ 2024-06-14 14:31     ` Ihor Radchenko
  2024-08-12  7:55       ` Proposal: Change publication timestamps (was: Publishing cache) Jens Lechtenboerger
  0 siblings, 1 reply; 61+ messages in thread
From: Ihor Radchenko @ 2024-06-14 14:31 UTC (permalink / raw)
  To: Jens Lechtenboerger; +Cc: Eli Zaretskii, emacs-orgmode

Jens Lechtenboerger <lechten@wi.uni-muenster.de> writes:

> Jumping in here, I do not understand the publishing cache.  Some of
> my org documents are re-published every time, while others are only
> re-published after changes.  What is checked where?

See "14.4 Triggering Publication" section of Org mode manual:

       Org uses timestamps to track when a file has changed.  The above
    functions normally only publish changed files.  You can override this
    and force publishing of all files by giving a prefix argument to any of
    the commands above, or by customizing the variable
    ‘org-publish-use-timestamps-flag’.  This may be necessary in particular
    if files include other files via ‘SETUPFILE’ or ‘INCLUDE’ keywords.
    
Apart from caching "timestamps" (a combination of modification time and
file hash), ox-publish stores information about generated link anchors,
so that they remain stable upon repeated publications (by default Org
mode export generates random anchors, unless they are specified in Org
mode source).
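
As a rough sketch of the override described in the manual excerpt,
assuming a project named "my-site" (a hypothetical name) is defined in
`org-publish-project-alist':

```elisp
;; Ignore the timestamp cache for every subsequent publishing command.
(setq org-publish-use-timestamps-flag nil)

;; Or force a single run: the second argument of `org-publish' is FORCE.
(org-publish "my-site" t)
```

The timestamp files themselves are kept under
`org-publish-timestamp-directory'.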




* Re: Please document the caching and its user options
  2024-06-14 13:41   ` Eli Zaretskii
@ 2024-06-14 15:31     ` Ihor Radchenko
  2024-06-14 15:56       ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Ihor Radchenko @ 2024-06-14 15:31 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-orgmode

Eli Zaretskii <eliz@gnu.org> writes:

>> Hmm. What aspect of caching do you want us to document?
>
> First and foremost, that it exists, and is turned on by default.  The
> manual is currently completely silent about it.
>
> Next, please document the user options that control this caching, and
> especially those options which can be used to turn this caching off or
> direct it to a different place.

I am not convinced that we have to do it.

Firstly, it is not clear if you are asking to document caching parser
state specifically or all kinds of caching Org mode does.

Secondly, I am not sure if we have to document the details of caching at
all in the manual. We do not document all the custom options in the
manual; just the most important/useful.

Emacs user manual does not document `multisession-directory' - something
very close to how we implement Org caches.  So, apparently, customizing
`multisession-directory' and even the very multisession feature
existence is not deemed necessary inside Emacs manual. Why would it be
different for Org mode manual?

> I'm not a heavy user of Org, but I do have several Org files that I
> visit from time to time.  This was the first time I got prompted about
> anything related to this caching.

The prompt you saw is indeed a bug.

> ...  Moreover, I think this was the
> first time the Org file I visited was parsed by Org and the results
> cached: I have a feature on my system that prominently indicates when
> the machine is heavily loaded, and I was surprised to see it in action
> when I visited org.org.  I never had this activated before just by
> visiting an Org file.  I presumed the high load was due to the
> parsing.  So either this is very new, or maybe my Org files are much
> simpler than doc/misc/org.org, and so the parsing I triggered before
> was much less expensive.

Org mode has used the parser for a long time. Previously, the parser was
invoked without any caching, even in-memory. Since Org 9.6, we
implemented in-memory and on-disk caches for the parser. This allowed us
to utilize the parser more frequently, without relying upon
half-accurate regexp matches. Overall, it decreased CPU loads, but there
are different scenarios; sometimes CPU load is larger momentarily.

>> I believe that this particular problem has been solved in
>> https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=c8f88589c
>> It is a part of Org 9.7.
>
> Maybe.  Try visiting org.org on a system whose locale is set to, say,
> Latin-1, and see if you get the warnings about a safe coding-system.
>
> But why do you use utf-8 there and not utf-8-unix?  Come to think
> about it, why not emacs-internal?  Those files are used internally by
> Org, so they should be able to encode any characters supported by
> Emacs, not just those which have UTF-8 encoding.  And using native EOL
> convention is not needed, and will get in the way if the user shares
> these files between systems.

Mostly because we chose whatever looked reasonable. I am not 100% sure
what the practical difference between `utf-8' and `utf-8-unix' is and
why the latter should be considered better.

As for `emacs-internal', we try to make files readable if at all
possible. In particular, index.eld file is even pretty-printed for user
convenience. The idea is to keep things in plain text and not in binary
formats, following the overall spirit of how Emacs usually stores data. (I
think you may recall people raising their voice about plain text
vs. binary during the discussion of multisession feature and the use of
sqlite database).




* Re: Please document the caching and its user options
  2024-06-14 15:31     ` Ihor Radchenko
@ 2024-06-14 15:56       ` Eli Zaretskii
  2024-06-15 12:47         ` Ihor Radchenko
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2024-06-14 15:56 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: emacs-orgmode@gnu.org
> Date: Fri, 14 Jun 2024 15:31:28 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> Hmm. What aspect of caching do you want us to document?
> >
> > First and foremost, that it exists, and is turned on by default.  The
> > manual is currently completely silent about it.
> >
> > Next, please document the user options that control this caching, and
> > especially those options which can be used to turn this caching off or
> > direct it to a different place.
> 
> I am not convinced that we have to do it.

That's too bad.  When a user finds out about this caching, how do you
propose that he/she looks for the information about it?  I wanted to
know what is being cached, why, and in what file/directory.  It took
me quite some time to find the answers, since Org is a very large
package, and there's no org-cache.el file or similar to serve as the
immediate suspect.  Surely, such a basic functionality should be at
least hinted in the documentation, so that users know which options to
look at and where?

> Firstly, it is not clear if you are asking to document caching parser
> state specifically or all kinds of caching Org mode does.

All of them.

> Secondly, I am not sure if we have to document the details of caching at
> all in the manual. We do not document all the custom options in the
> manual; just the most important/useful.

I submit that at least the options which control where the cache is
and how to disable it are important enough to be in the manual.  Given
their names, users can use apropos or customize-group to find other
relevant options.

> Emacs user manual does not document `multisession-directory' - something
> very close to how we implement Org caches.  So, apparently, customizing
> `multisession-directory' and even the very multisession feature
> existence is not deemed necessary inside Emacs manual. Why would it be
> different for Org mode manual?

multisession is an optional package, it is neither preloaded nor
turned on by default in Emacs.  And even if Emacs makes a mistake of
not documenting anything it is not a valid argument to make the same
mistake elsewhere.

> > But why do you use utf-8 there and not utf-8-unix?  Come to think
> > about it, why not emacs-internal?  Those files are used internally by
> > Org, so they should be able to encode any characters supported by
> > Emacs, not just those which have UTF-8 encoding.  And using native EOL
> > convention is not needed, and will get in the way if the user shares
> > these files between systems.
> 
> Mostly because we chose whatever looked reasonable. I am not 100% sure
> what the practical difference between `utf-8' and `utf-8-unix' is and
> why the latter should be considered better.
> 
> As for `emacs-internal', we try to make files readable if at all
> possible. In particular, index.eld file is even pretty-printed for user
> convenience. The idea is to keep things in plain text and not in binary
> formats, following the overall spirit of how Emacs usually stores data. (I
> think you may recall people raising their voice about plain text
> vs. binary during the discussion of multisession feature and the use of
> sqlite database).

The emacs-internal encoding is not binary.  In almost all the cases it
is indistinguishable from utf-8-unix.  It differs where a buffer
includes characters outside of the Unicode codespace.  The usual
practice in Emacs is that files holding internal data use
emacs-internal to make sure all the characters are saved correctly and
can be later restored correctly.
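
A minimal sketch of that practice.  The helpers `my/write-cache' and
`my/read-cache' are hypothetical names used for illustration and are
not Org's actual code:

```elisp
(defun my/write-cache (file data)
  "Write Lisp object DATA to FILE as printed text."
  ;; Binding `coding-system-for-write' means Emacs has no reason to
  ;; prompt for a safe coding system, whatever the user's locale is.
  (let ((coding-system-for-write 'emacs-internal)
        (print-length nil)
        (print-level nil))
    (with-temp-file file
      (prin1 data (current-buffer)))))

(defun my/read-cache (file)
  "Read back the Lisp object stored in FILE by `my/write-cache'."
  (let ((coding-system-for-read 'emacs-internal))
    (with-temp-buffer
      (insert-file-contents file)
      (read (current-buffer)))))
```

Binding `print-length' and `print-level' to nil guards against a
user setting that would abbreviate the printed data.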



* Re: Please document the caching and its user options
  2024-06-14 15:56       ` Eli Zaretskii
@ 2024-06-15 12:47         ` Ihor Radchenko
  2024-06-15 13:01           ` Eli Zaretskii
  2024-06-15 13:47           ` Ihor Radchenko
  0 siblings, 2 replies; 61+ messages in thread
From: Ihor Radchenko @ 2024-06-15 12:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-orgmode

Eli Zaretskii <eliz@gnu.org> writes:

>> I am not convinced that we have to do it.
>
> That's too bad.  When a user finds out about this caching, how do you
> propose that he/she looks for the information about it?  I wanted to
> know what is being cached, why, and in what file/directory.  It took
> me quite some time to find the answers, since Org is a very large
> package, and there's no org-cache.el file or similar to serve as the
> immediate suspect.  Surely, such a basic functionality should be at
> least hinted in the documentation, so that users know which options to
> look at and where?

Maybe. Although it is not clear where to document such things.
Ideally, it would be nice if caches were managed by Emacs itself, with
all the cache storage locations customizable across various packages.
Then, documenting cache locations in the Emacs manual would suffice.

Would it be possible for Emacs to define a framework for cache/var/data
locations? Such a framework would not only be useful in the context of
this discussion, but also to tackle the issue with packages sprinkling
things randomly into .emacs.d or ~/ (see
https://github.com/emacscollective/no-littering/)

>> Emacs user manual does not document `multisession-directory' - something
>> very close to how we implement Org caches.  So, apparently, customizing
>> `multisession-directory' and even the very multisession feature
>> existence is not deemed necessary inside Emacs manual. Why would it be
>> different for Org mode manual?
>
> multisession is an optional package, it is neither preloaded nor
> turned on by default in Emacs.

It is used by default in emoji.el (C-x 8 e r)

> ... And even if Emacs makes a mistake of
> not documenting anything it is not a valid argument to make the same
> mistake elsewhere.

I 100% agree. But my default assumption is that things added to Emacs
are usually documented in the manual, if necessary. I assumed that the
judgment was that documenting multisession was not necessary and worked
out of that assumption.

Of course, if you say that multisession and similar things should be
documented, I will follow. Let's discuss the details.

(Also, should we open some kind of bug report to track documenting
multisession in the manual?)

> The emacs-internal encoding is not binary.  In almost all the cases it
> is indistinguishable from utf-8-unix.  It differs where a buffer
> includes characters outside of the Unicode codespace.  The usual
> practice in Emacs is that files holding internal data use
> emacs-internal to make sure all the characters are saved correctly and
> can be later restored correctly.

Then, I agree that using emacs-internal for cached data makes sense.

Note, however, that I see no indication about such convention in the
manual. The only relevant bit is

       The coding system ‘utf-8-emacs’ specifies that the data is
    represented in the internal Emacs encoding (*note Text
    Representations::).  This is like ‘raw-text’ in that no code conversion
    happens, but different in that the result is multibyte data.  The name
    ‘emacs-internal’ is an alias for ‘utf-8-emacs-unix’ (so it forces no
    conversion of end-of-line, unlike ‘utf-8-emacs’, which can decode all 3
    kinds of end-of-line conventions).

However, I cannot come to the conclusion you pointed from reading that
paragraph.

Would it make sense to add the tip about storing Elisp data somewhere in
the Elisp manual?




* Re: Please document the caching and its user options
  2024-06-15 12:47         ` Ihor Radchenko
@ 2024-06-15 13:01           ` Eli Zaretskii
  2024-06-15 14:13             ` Ihor Radchenko
  2024-06-15 13:47           ` Ihor Radchenko
  1 sibling, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2024-06-15 13:01 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: emacs-orgmode@gnu.org
> Date: Sat, 15 Jun 2024 12:47:29 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> I am not convinced that we have to do it.
> >
> > That's too bad.  When a user finds out about this caching, how do you
> > propose that he/she looks for the information about it?  I wanted to
> > know what is being cached, why, and in what file/directory.  It took
> > me quite some time to find the answers, since Org is a very large
> > package, and there's no org-cache.el file or similar to serve as the
> > immediate suspect.  Surely, such a basic functionality should be at
> > least hinted in the documentation, so that users know which options to
> > look at and where?
> 
> Maybe. Although it is not clear where to document such things.
> Ideally, it would be nice if caches were managed by Emacs itself, with
> all the cache storage locations customizable across various packages.
> Then, documenting cache locations in the Emacs manual would suffice.
> 
> Would it be possible for Emacs to define a framework for cache/var/data
> locations? Such a framework would not only be useful in the context of
> this discussion, but also to tackle the issue with packages sprinkling
> things randomly into .emacs.d or ~/ (see
> https://github.com/emacscollective/no-littering/)

I think Emacs already provides all the framework for caching that is
needed.  Caching simply means you write some data to file, and all the
building blocks of that already exist, for quite some time, actually.
The only thing that is application dependent is the data to be cached
and how to serialize that, but that cannot be usefully generalized.

> >> Emacs user manual does not document `multisession-directory' - something
> >> very close to how we implement Org caches.  So, apparently, customizing
> >> `multisession-directory' and even the very multisession feature
> >> existence is not deemed necessary inside Emacs manual. Why would it be
> >> different for Org mode manual?
> >
> > multisession is an optional package, it is neither preloaded nor
> > turned on by default in Emacs.
> 
> It is used by default in emoji.el (C-x 8 e r)

Which is also optional.  And a minor feature at that.

> Of course, if you say that multisession and similar things should be
> documented, I will follow. Let's discuss the details.

I think at least emoji.el should say somewhere in its doc strings that
it caches the previously used emoji sequences, yes.

> (Also, should we open some kind of bug report to track documenting
> multisession in the manual?)

I don't mind, but it sounds like an exaggeration to me.

> > The emacs-internal encoding is not binary.  In almost all the cases it
> > is indistinguishable from utf-8-unix.  It differs where a buffer
> > includes characters outside of the Unicode codespace.  The usual
> > practice in Emacs is that files holding internal data use
> > emacs-internal to make sure all the characters are saved correctly and
> > can be later restored correctly.
> 
> Then, I agree that using emacs-internal for cached data makes sense.
> 
> Note, however, that I see no indication about such convention in the
> manual.

The opportunities for using it are rare enough.  But I added that now,
in the hope that someone will actually read all those recommendations.



* Re: Please document the caching and its user options
  2024-06-15 12:47         ` Ihor Radchenko
  2024-06-15 13:01           ` Eli Zaretskii
@ 2024-06-15 13:47           ` Ihor Radchenko
  1 sibling, 0 replies; 61+ messages in thread
From: Ihor Radchenko @ 2024-06-15 13:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-orgmode

Ihor Radchenko <yantar92@posteo.net> writes:

>> The emacs-internal encoding is not binary.  In almost all the cases it
>> is indistinguishable from utf-8-unix.  It differs where a buffer
>> includes characters outside of the Unicode codespace.  The usual
>> practice in Emacs is that files holding internal data use
>> emacs-internal to make sure all the characters are saved correctly and
>> can be later restored correctly.
>
> Then, I agree that using emacs-internal for cached data makes sense.

Done in
https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?h=bugfix&id=be39e61c4efa5027536809c89b90bfe66b76b712 (bugfix)




* Re: Please document the caching and its user options
  2024-06-15 13:01           ` Eli Zaretskii
@ 2024-06-15 14:13             ` Ihor Radchenko
  2024-06-15 14:37               ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Ihor Radchenko @ 2024-06-15 14:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-orgmode, emacs-devel, Michael Albinus

CCing emacs-devel as I'd like to upgrade this discussion to Emacs-wide
context.

Eli Zaretskii <eliz@gnu.org> writes:

>> ... I wanted to know what is being cached, why, and in what file/directory.
>> 
> >  ...
>> Would it be possible for Emacs to define a framework for cache/var/data
>> locations? Such a framework would not only be useful in the context of
>> this discussion, but also to tackle the issue with packages sprinkling
>> things randomly into .emacs.d or ~/ (see
>> https://github.com/emacscollective/no-littering/)
>
> I think Emacs already provides all the framework for caching that is
> needed.  Caching simply means you write some data to file, and all the
> building blocks of that already exist, for quite some time, actually.
> The only thing that is application dependent is the data to be cached
> and how to serialize that, but that cannot be usefully generalized.

I was referring to some kind of global option that defines cache
directory, data directory, etc. Something akin to XDG.

Then, Org can place cache inside that directory rather than trying to
cook up something independently.

Also, caching is not that simple, because caches may contain sensitive
data. (see
https://list.orgmode.org/orgmode/CAM9ALR8fuSu0YWS1SehRw7sYxprJFX-r2juXd_DgvCYVKQc95Q@mail.gmail.com/)
Some users may want to move caches to a read-restricted location
or even to a location dependent on where the cache is originating from
(separate caches depending on whether default-directory is from
encrypted volume, remote mount, etc)

Finally, we got several requests to have caches cleared up upon exiting
Emacs, which is also something that should be better managed centrally,
by Emacs, for all possible kinds of cache/history data.

>> > multisession is an optional package, it is neither preloaded nor
>> > turned on by default in Emacs.
>> 
>> It is used by default in emoji.el (C-x 8 e r)
>
> Which is also optional.  And a minor feature at that.

It is just for now.
TRAMP (by no means a minor feature) has the following TODO item in
tramp-cache.el:

;;; TODO:
;;
;; * Use multisession.el, starting with Emacs 29.1.

>> (Also, should we open some kind of bug report to track documenting
>> multisession in the manual?)
>
> I don't mind, but it sounds like an exaggeration to me.

I kind of agree, if we talk about the current state of affairs. But, I'd
like to discuss this in the context I elaborated on above - more
centralized cache management.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Please document the caching and its user options
  2024-06-15 14:13             ` Ihor Radchenko
@ 2024-06-15 14:37               ` Eli Zaretskii
  2024-06-16  9:05                 ` Ihor Radchenko
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2024-06-15 14:37 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode, emacs-devel, michael.albinus

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: emacs-orgmode@gnu.org, emacs-devel@gnu.org, Michael Albinus
>  <michael.albinus@gmx.de>
> Date: Sat, 15 Jun 2024 14:13:03 +0000
> 
> CCing emacs-devel as I'd like to upgrade this discussion to Emacs-wide
> context.
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> ... I wanted to know what is being cached, why, and in what file/directory.
> >> 
> > >  ...
> >> Would it be possible for Emacs to define a framework for cache/var/data
> >> locations? Such framework would not only be useful in the context of
> >> this discussion, but also to tackle the issue with packages sprinkling
> >> things randomly into .emacs.d or ~/ (see
> >> https://github.com/emacscollective/no-littering/)
> >
> > I think Emacs already provides all the framework for caching that is
> > needed.  Caching simply means you write some data to file, and all the
> > building blocks of that already exist, for quite some time, actually.
> > The only thing that is application dependent is the data to be cached
> > and how to serialize that, but that cannot be usefully generalized.
> 
> I was referring to some kind of global option that defines cache
> directory, data directory, etc. Something akin to XDG.

We already have xdg-cache-home (and a few others in xdg.el).  Is that
what you meant?

> Also, caching is not as simple, because caches may contain sensitive
> data. (see
> https://list.orgmode.org/orgmode/CAM9ALR8fuSu0YWS1SehRw7sYxprJFX-r2juXd_DgvCYVKQc95Q@mail.gmail.com/)
> Some users may want to move caches to read-restricted location
> or even to location dependent on where the cache is originating from
> (separate caches depending on whether default-directory is from
> encrypted volume, remote mount, etc)

AFAIK, Emacs has APIs for at least some of that, but whether to use
them is up to the application, I think.

> Finally, we got several requests to have caches cleared up upon exiting
> Emacs, which is also something that should be better managed centrally,
> by Emacs, for all possible kinds of cache/history data.

Deleting files in a directory, recursively if needed, is already
available.  Is that what you meant?

> >> > multisession is an optional package, it is neither preloaded nor
> >> > turned on by default in Emacs.
> >> 
> >> It is used by default in emoji.el (C-x 8 e r)
> >
> > Which is also optional.  And a minor feature at that.
> 
> It is just for now.
> TRAMP (by no means a minor feature) has the following TODO item in
> tramp-cache.el:
> 
> ;;; TODO:
> ;;
> ;; * Use multisession.el, starting with Emacs 29.1.

How far are you prepared to go just to make a point?

> >> (Also, should we open some kind of bug report to track documenting
> >> multisession in the manual?)
> >
> > I don't mind, but it sounds like an exaggeration to me.
> 
> I kind of agree, if we talk about the current state of affairs. But, I'd
> like to discuss this in the context I elaborated on above - more
> centralized cache management.

Can we first fix the problems for which I started this thread?  The
more general issues should be subjects of separate discussions, IMO.


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Please document the caching and its user options
  2024-06-14 13:12 ` Ihor Radchenko
  2024-06-14 13:41   ` Eli Zaretskii
  2024-06-14 13:56   ` Jens Lechtenboerger
@ 2024-06-16  5:40   ` Daniel Clemente
  2024-06-16 12:36     ` Ihor Radchenko
  2 siblings, 1 reply; 61+ messages in thread
From: Daniel Clemente @ 2024-06-16  5:40 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode

> > Please document the caching features of Org in the manual, including
> > how to turn that off.  (I also question the wisdom of turning this on
> > by default without as much as a single request for confirmation from
> > the user.)
> Hmm. What aspect of caching do you want us to document?
> FYI, Org mode has been doing various forms of caching since
> forever. Recently, we just employed a bit more regular API and
> introduced one more kind of caching - parser cache. In addition to the
> previously existing image cache, publishing cache, ID cache, clock
> cache, etc.

One of the discussion points is specifically org-persist, which is
what creates files on disk.
There have been reports, like
https://lists.gnu.org/archive/html/emacs-devel/2024-06/msg00203.html,
or Eli's message here, mentioning that ~/.cache/org-persist is created
when the user doesn't want it or expect it.

In particular, when setting (setq org-element-cache-persistent nil)
org-mode *should not* create an org-persist directory anywhere. And I
think it shouldn't activate org-persist timers (it does now) or hooks.
The user's preference should be respected.

That's a code change.
If you just want to update documentation, a starting point can be
org-element-cache-persistent's documentation, which is just "Non-nil
when cache should persist between Emacs sessions.", and doesn't
mention that some files will always be created even if it's nil. It
also doesn't explicitly mention that it will create files (better be
explicit about this), or where (or how to control where), or which
content (i.e. just statistics, or parts of possible private org
files).

I suggest making an explicit difference between "caching in memory"
and "caching by storing files on disk".
For instance:
(defvar org-element-use-cache t
  "Non-nil when Org parser should cache its results.")
From that description, it's not clear to a new user whether they're
creating files on disk (as caches often do) or not.


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Please document the caching and its user options
  2024-06-15 14:37               ` Eli Zaretskii
@ 2024-06-16  9:05                 ` Ihor Radchenko
  2024-06-16 10:41                   ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Ihor Radchenko @ 2024-06-16  9:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-orgmode, emacs-devel, michael.albinus

Eli Zaretskii <eliz@gnu.org> writes:

>> I was referring to some kind of global option that defines cache
>> directory, data directory, etc. Something akin to XDG.
>
> We already have xdg-cache-home (and a few others in xdg.el).  Is that
> what you meant?

Yes, except that `xdg-cache-home' is limited:

1. It cannot be customized by users
2. It may sometimes return nil
3. It is limited to XDG - not all the Emacs platforms

What I had in mind is a new custom option for cache dir (defaulting to
OS-specific cache like XDG on Linux or something equivalent on Windows)
+ a new API function like `system-cache-home' that will be guaranteed to
return some kind of meaningful dir.

>> Also, caching is not as simple, because caches may contain sensitive
>> data. (see
>> https://list.orgmode.org/orgmode/CAM9ALR8fuSu0YWS1SehRw7sYxprJFX-r2juXd_DgvCYVKQc95Q@mail.gmail.com/)
>> Some users may want to move caches to read-restricted location
>> or even to location dependent on where the cache is originating from
>> (separate caches depending on whether default-directory is from
>> encrypted volume, remote mount, etc)
>
> AFAIK, Emacs has APIs for at least some of that, but whether to use
> them is up to the application, I think.

What are those APIs?

>> Finally, we got several requests to have caches cleared up upon exiting
>> Emacs, which is also something that should be better managed centrally,
>> by Emacs, for all possible kinds of cache/history data.
>
> Deleting files in a directory, recursively if needed, is already
> available.  Is that what you meant?

No. I mean a new user option like `clear-caches-on-exit' that will work
across all the packages. Then, concerned users may set it to non-nil to
delete *all* the caches upon exiting Emacs.

Having to set this for each specific package (with some packages not
documenting that they use cache, or users not expecting that cache may
be used and not reading _all_ the docs carefully enough) is not ideal,
IMHO.
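
To make the idea concrete, here is a minimal sketch of what such a
global switch might look like.  Everything in it is an assumption for
illustration: neither `my-clear-caches-on-exit' nor
`my-cache-directories' exists in Emacs, and packages would have to
register their cache directories for this to work.

```elisp
;; Everything below is hypothetical: neither the option nor the
;; registry variable exists in Emacs.
(defcustom my-clear-caches-on-exit nil
  "Non-nil means delete all registered cache directories when Emacs exits."
  :type 'boolean
  :group 'files)

(defvar my-cache-directories nil
  "List of cache directories that packages have registered.")

(defun my-clear-caches ()
  "Delete every directory in `my-cache-directories' if the user asked for it."
  (when my-clear-caches-on-exit
    (dolist (dir my-cache-directories)
      ;; Skip directories that were never created or are already gone.
      (when (file-directory-p dir)
        (delete-directory dir 'recursive)))))

;; Run the cleanup whenever Emacs exits.
(add-hook 'kill-emacs-hook #'my-clear-caches)
```

A package would then push its cache directory onto the registry, and a
concerned user would only need to set the single option to non-nil.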

> Can we first fix the problems for which I started this thread?  The
> more general issues should be subjects of separate discussions, IMO.

If there is a global Emacs-wide customization for how to handle caches,
there will be no need to document it in the Org mode manual. So, I would
like to see if introducing such a global customization is feasible before
making non-trivial changes to the Org manual. (I am not even sure where to
document these things in the manual yet; they seem way too generic wrt
Org mode's scope)



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Please document the caching and its user options
  2024-06-16  9:05                 ` Ihor Radchenko
@ 2024-06-16 10:41                   ` Eli Zaretskii
  2024-06-23  9:12                     ` Björn Bidar
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2024-06-16 10:41 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode, emacs-devel, michael.albinus

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: emacs-orgmode@gnu.org, emacs-devel@gnu.org, michael.albinus@gmx.de
> Date: Sun, 16 Jun 2024 09:05:02 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> I was referring to some kind of global option that defines cache
> >> directory, data directory, etc. Something akin to XDG.
> >
> > We already have xdg-cache-home (and a few others in xdg.el).  Is that
> > what you meant?
> 
> Yes, except that `xdg-cache-home' is limited:
> 
> 1. It cannot be customized by users

Of course it can: just make the default value of a defcustom be
derived by xdg-cache-home, and users can then customize the option to
a different value if they want.
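
A minimal sketch of that pattern, assuming a package called
"my-package"; the option name `my-package-cache-directory' is
hypothetical and is not an existing Emacs or Org variable:

```elisp
(require 'xdg)

;; Hypothetical option for illustration only; not an existing Emacs or
;; Org variable.
(defcustom my-package-cache-directory
  (expand-file-name "my-package/" (xdg-cache-home))
  "Directory where my-package keeps its cache files.
The default is derived from `xdg-cache-home'; customize this option
to store the cache elsewhere."
  :type 'directory
  :group 'files)
```

The default follows the XDG cache location, while users remain free to
point the option at any other directory.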

> 2. It may sometimes return nil

The fallback is well-known.

> 3. It is limited to XDG - not all the Emacs platforms

No, it's supported on all platforms, even if XDG isn't.

> What I had in mind is a new custom option for cache dir (defaulting to
> OS-specific cache like XDG on Linux or something equivalent on Windows)
> + a new API function like `system-cache-home' that will be guaranteed to
> return some kind of meaningful dir.

Using xdg-cache-home and its fallbacks is a de facto standard way of
solving this in Emacs, and it supports all the platforms.  Even
startup.el uses it (albeit via customized code, to avoid interfering
with user customizations) when looking for init files and suchlike.

So I think you raise a problem that is already solved in Emacs.

> >> Also, caching is not as simple, because caches may contain sensitive
> >> data. (see
> >> https://list.orgmode.org/orgmode/CAM9ALR8fuSu0YWS1SehRw7sYxprJFX-r2juXd_DgvCYVKQc95Q@mail.gmail.com/)
> >> Some users may want to move caches to read-restricted location
> >> or even to location dependent on where the cache is originating from
> >> (separate caches depending on whether default-directory is from
> >> encrypted volume, remote mount, etc)
> >
> > AFAIK, Emacs has APIs for at least some of that, but whether to use
> > them is up to the application, I think.
> 
> What are those APIs?

Making files and directories readable only by the owner, for example:
set-file-modes and with-file-modes.  All the other Lisp programs in
Emacs use that, so why would Org need something special?
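
For illustration, a minimal sketch of writing a cache file that only
its owner can read, using `with-file-modes'.  The function name
`my-write-private-cache' is made up for this example, and the
permission bits only have this effect on POSIX-like systems.  The
sketch also binds coding-system-for-write, as requested at the start
of this thread, so that a background write never prompts for an
encoding.

```elisp
;; `my-write-private-cache' is a hypothetical helper, not part of
;; Emacs or Org.
(defun my-write-private-cache (dir name data)
  "Write string DATA to file NAME under DIR, readable only by the owner.
Return the full file name."
  (let ((file (expand-file-name name dir))
        ;; Fix the encoding so a background write never prompts the user.
        (coding-system-for-write 'utf-8-unix))
    ;; Newly created directories: rwx for the owner only.
    (with-file-modes #o700
      (make-directory dir t))
    ;; Newly created file: rw for the owner only.
    (with-file-modes #o600
      (write-region data nil file nil 'silent))
    file))
```

Note that `with-file-modes' only affects files and directories created
inside its body; a file that already exists keeps its current modes
unless `set-file-modes' is called on it explicitly.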

> >> Finally, we got several requests to have caches cleared up upon exiting
> >> Emacs, which is also something that should be better managed centrally,
> >> by Emacs, for all possible kinds of cache/history data.
> >
> > Deleting files in a directory, recursively if needed, is already
> > available.  Is that what you meant?
> 
> No. I mean a new user option like `clear-caches-on-exit' that will work
> across all the packages.

Having a single option for all the caches makes little sense to me.
This must be a per-cache setting.

However, users on XDG platforms can have that via XDG system-wide
settings.

> Having to set this for each specific package (with some packages not
> documenting that they use cache, or users not expecting that cache may
> be used and not reading _all_ the docs carefully enough) is not ideal,
> IMHO.

I cannot disagree more.  Each cache has its own logic for when it is a
good time to empty the cache.

> > Can we first fix the problems for which I started this thread?  The
> > more general issues should be subjects of separate discussions, IMO.
> 
> If there is a global Emacs-wide customization for how to handle caches,
> there will be no need to document it in the Org mode manual.

I respectfully ask the Org developers to solve this particular issue
first, without waiting for some hypothetical general Emacs feature,
which may or may not materialize.

> like to see if introducing such a global customization is feasible before
> making non-trivial changes to the Org manual. (I am not even sure where to
> document these things in the manual yet; they seem way too generic wrt
> Org mode's scope)

A new chapter should be fine, if no existing chapter is relevant.

TIA


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Please document the caching and its user options
  2024-06-16  5:40   ` Please document the caching and its user options Daniel Clemente
@ 2024-06-16 12:36     ` Ihor Radchenko
  2024-06-17 12:41       ` Daniel Clemente
  0 siblings, 1 reply; 61+ messages in thread
From: Ihor Radchenko @ 2024-06-16 12:36 UTC (permalink / raw)
  To: Daniel Clemente; +Cc: Eli Zaretskii, emacs-orgmode

Daniel Clemente <n142857@gmail.com> writes:

> In particular, when setting (setq org-element-cache-persistent nil)
> org-mode *should not* create an org-persist directory anywhere. And I
> think it shouldn't activate org-persist timers (it does now) or hooks.
> The user's preference should be respected.

Nope. "org-persist" directory is not only used by org-element. If some
other parts of Org need to cache something, they can also store cache
there.

> That's a code change.
> If you just want to update documentation, a starting point can be
> org-element-cache-persistent's documentation, which is just "Non-nil
> when cache should persist between Emacs sessions.", and doesn't
> mention that some files will always be created even if it's nil. It
> also doesn't explicitly mention that it will create files (better be
> explicit about this), or where (or how to control where), or which
> content (i.e. just statistics, or parts of possible private org
> files).

Could you suggest an alternative docstring?

> I suggest making an explicit difference between "caching in memory"
> and "caching by storing files on disk".
> For instance:
> (defvar org-element-use-cache t
>   "Non-nil when Org parser should cache its results.")
> From that description, it's not clear to a new user whether they're
> creating files on disk (as caches often do) or not.

Do you mean something like

"Non-nil when Org parser should cache its results.

The cache is stored in-memory and may also be stored on disk if
`org-element-cache-persistent' is non-nil (the default)."

?



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Please document the caching and its user options
  2024-06-16 12:36     ` Ihor Radchenko
@ 2024-06-17 12:41       ` Daniel Clemente
  2024-06-18 15:53         ` Ihor Radchenko
  0 siblings, 1 reply; 61+ messages in thread
From: Daniel Clemente @ 2024-06-17 12:41 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode

> > In particular, when setting (setq org-element-cache-persistent nil)
> > org-mode *should not* create an org-persist directory anywhere. And I
> > think it shouldn't activate org-persist timers (it does now) or hooks.
> > The user's preference should be respected.
>
> Nope. "org-persist" directory is not only used by org-element. If some
> other parts of Org need to cache something, they can also store cache
> there.
>

What's the setting, then, to disable org-persist? I.e., to disable the
creation of files like ~/.cache/org-persist/gc-lock.eld
Many people seem to want to disable all creation of org-mode related files.


> > That's a code change.
> > If you just want to update documentation, a starting point can be
> > org-element-cache-persistent's documentation, which is just "Non-nil
> > when cache should persist between Emacs sessions.", and doesn't
> > mention that some files will always be created even if it's nil. It
> > also doesn't explicitly mention that it will create files (better be
> > explicit about this), or where (or how to control where), or which
> > content (i.e. just statistics, or parts of possible private org
> > files).
>
> Could you suggest an alternative docstring?
>

I don't know org-persist or org-element-cache-persistent so this needs
your input. I can start with a template, and you can fine-tune it,
expand it or rewrite it:

(defvar org-element-cache-persistent t
  "Non-nil when Org element cache should persist between Emacs sessions.
Cache files are written to disk at `org-persist-directory'.
The cache will be updated regularly (as controlled by
`org-element-cache-sync-idle-time') and when Emacs is closed.

Persisting the cache to disk can speed up ................(startup?
file opening time?, agendas? ...)...... especially if you open
.......(large files? mostly unmodified files? multiple emacs
instances?).
It is not recommended if ........(you edit the same files from
different emacs instances? if the Org files include sensitive
data?).... If you use `org-crypt', note that the persisted cache may
temporarily store unencrypted data after decrypting a header.

Use `org-element-use-cache' instead to use a memory-only cache.")




I mentioned I don't know org-element-cache-persistent; I mean that as a user.
It's explained in developer terms ("make the cache persistent").
But as a user I don't know: is it good? will things be faster? are
there risks involved? can it corrupt my files? will it leave traces of
my files in other places? who should enable it? what's the downside?
etc.
My own experience, very subjective and it may be an edge case, is that
enabling org-element-cache-persistent didn't make loading my org files
faster; on the contrary, it made some things slower (including closing
Emacs).


> > I suggest making an explicit difference between "caching in memory"
> > and "caching by storing files on disk".
> > For instance:
> > (defvar org-element-use-cache t
> >   "Non-nil when Org parser should cache its results.")
> > From that description, it's not clear to a new user whether they're
> > creating files on disk (as caches often do) or not.
>
> Do you mean something like
>
> "Non-nil when Org parser should cache its results.
>
> The cache is stored in-memory and may also be stored on disk if
> `org-element-cache-persistent' is non-nil (the default)."
>
> ?

This seems better.


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Please document the caching and its user options
  2024-06-17 12:41       ` Daniel Clemente
@ 2024-06-18 15:53         ` Ihor Radchenko
  2024-06-18 16:15           ` Eli Zaretskii
  2024-06-23 11:45           ` Daniel Clemente
  0 siblings, 2 replies; 61+ messages in thread
From: Ihor Radchenko @ 2024-06-18 15:53 UTC (permalink / raw)
  To: Daniel Clemente; +Cc: Eli Zaretskii, emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 977 bytes --]

Daniel Clemente <n142857@gmail.com> writes:

>> Nope. "org-persist" directory is not only used by org-element. If some
>> other parts of Org need to cache something, they can also store cache
>> there.
>>
> What's the setting, then, to disable org-persist? I.e., to disable the
> creation of files like ~/.cache/org-persist/gc-lock.eld
> Many people seem to want to disable all creation of org-mode related files.

It is impossible. We need to store files like latex previews
somewhere. This somewhere is org-persist-directory now.

That said, gc-lock.eld should not be created when nothing else is
actually stored in the cache. It will be fixed.

>> Could you suggest an alternative docstring?
>>
>
> I don't know org-persist or org-element-cache-persistent so this needs
> your input. I can start with a template, and you can fine-tune it,
> expand it or rewrite it:...

Thanks!
I am attaching a tentative patch that improves the documentation. I hope
that it clarifies things for you.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-org-element-cache-Improve-docstrings.patch --]
[-- Type: text/x-patch, Size: 1759 bytes --]

From 8a64e83303566bad608c386fbdafe34aa9065a2b Mon Sep 17 00:00:00 2001
Message-ID: <8a64e83303566bad608c386fbdafe34aa9065a2b.1718725818.git.yantar92@posteo.net>
From: Ihor Radchenko <yantar92@posteo.net>
Date: Tue, 18 Jun 2024 17:49:43 +0200
Subject: [PATCH] org-element-cache: Improve docstrings

* lisp/org-element.el (org-element-use-cache):
(org-element-cache-persistent): Add more details to the docstrings.

Reported-by: Daniel Clemente <n142857@gmail.com>
Link: https://orgmode.org/list/CAJKAhPBUAS2bDT5k+xB2E-vu+d==yoNAfKjdKu2HC4qmB_XUnw@mail.gmail.com
---
 lisp/org-element.el | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/lisp/org-element.el b/lisp/org-element.el
index 191bb5698..631cdf20c 100644
--- a/lisp/org-element.el
+++ b/lisp/org-element.el
@@ -5744,10 +5744,19 @@ ;;; Cache
 
 ;;;###autoload
 (defvar org-element-use-cache t
-  "Non-nil when Org parser should cache its results.")
+  "Non-nil when Org parser should cache its results.
+The results are cached in memory and may also be cached between Emacs
+sessions if `org-element-cache-persistent' is set to non-nil.")
 
 (defvar org-element-cache-persistent t
-  "Non-nil when cache should persist between Emacs sessions.")
+  "Non-nil when Org element cache should persist between Emacs sessions.
+Cache files are written to disk at `org-persist-directory'.
+The cache will be updated when Emacs is closed or when an Org buffer
+is closed.
+
+Persisting the cache to disk can speed up opening Org files
+\\(especially large Org files).  It is not recommended if the Org files
+include sensitive data, unless the data is encrypted via `org-crypt'.")
 
 (defconst org-element-cache-version "2.3"
   "Version number for Org AST structure.
-- 
2.45.1


[-- Attachment #3: Type: text/plain, Size: 631 bytes --]


> My own experience, very subjective and it may be an edge case, is that
> enabling org-element-cache-persistent didn't make loading my org files
> faster; on the contrary, it made some things slower (including closing
> Emacs).

What happens if you set `org-persist--report-time' to t in your config
and examine *Messages* buffer after opening/closing some Org files?
Look for "org-persist:..." messages.
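
In an init file, that would be a single line.  Note that the double
dash in the name marks this as an internal Org variable, so its
behaviour may change between Org versions:

```elisp
;; Ask org-persist to report how long its reads and writes take.
;; Internal variable; may change without notice.
(setq org-persist--report-time t)
```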


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: Please document the caching and its user options
  2024-06-18 15:53         ` Ihor Radchenko
@ 2024-06-18 16:15           ` Eli Zaretskii
  2024-06-18 16:25             ` Ihor Radchenko
  2024-06-23 11:45           ` Daniel Clemente
  1 sibling, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2024-06-18 16:15 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: n142857, emacs-orgmode

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-orgmode@gnu.org
> Date: Tue, 18 Jun 2024 15:53:18 +0000
> 
> Daniel Clemente <n142857@gmail.com> writes:
> 
> > What's the setting, then, to disable org-persist? I.e., to disable the
> > creation of files like ~/.cache/org-persist/gc-lock.eld
> > Many people seem to want to disable all creation of org-mode related files.
> 
> It is impossible. We need to store files like latex previews
> somewhere. This somewhere is org-persist-directory now.

Sorry, I don't understand: why do you need to store them as files?
Why not keep the previews in buffer(s)?


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Please document the caching and its user options
  2024-06-18 16:15           ` Eli Zaretskii
@ 2024-06-18 16:25             ` Ihor Radchenko
  2024-06-18 16:33               ` Eli Zaretskii
  2024-06-18 22:06               ` Rudolf Adamkovič
  0 siblings, 2 replies; 61+ messages in thread
From: Ihor Radchenko @ 2024-06-18 16:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: n142857, emacs-orgmode

Eli Zaretskii <eliz@gnu.org> writes:

>> It is impossible. We need to store files like latex previews
>> somewhere. This somewhere is org-persist-directory now.
>
> Sorry, I don't understand: why do you need to store them as files?
> Why not keep the previews in buffer(s)?

In Org mode, in order to create latex previews, we
(1) run latex to generate the preview image
(2) that image is stored in some directory
(3) we display that image over the corresponding latex fragment in an
    overlay
(4) we retain the image on disk, so that we do not need to run latex
    many times if the user toggles displaying the previews (this is
    very important, because running latex is costly)

Can we instead store them in memory? Yes, but (1) it will make Emacs RAM
consumption grow constantly as more and more previews are generated;
(2) it will require significant changes in the Org mode codebase.
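
The on-disk reuse in step (4) can be sketched roughly as follows.
This is not Org's actual implementation: the function names and the
idea of naming the image after a hash of the LaTeX source are
assumptions made for the example, and the rendering step is passed in
as a function so that the sketch does not depend on a LaTeX
installation.

```elisp
;; Hypothetical helpers; Org's real preview code is organised differently.
(defun my-preview-cache-file (latex-fragment cache-dir)
  "Return the cache file name for LATEX-FRAGMENT inside CACHE-DIR."
  ;; Identical fragments hash to the same file name, so they share
  ;; one cached image.
  (expand-file-name (concat (sha1 latex-fragment) ".png") cache-dir))

(defun my-preview-image (latex-fragment cache-dir render-fn)
  "Return the image file for LATEX-FRAGMENT, rendering only on a cache miss.
RENDER-FN is called with the fragment and the target file name and
must create that file."
  (let ((file (my-preview-cache-file latex-fragment cache-dir)))
    (unless (file-exists-p file)
      (make-directory cache-dir t)
      ;; The costly step: runs only when no cached image exists yet.
      (funcall render-fn latex-fragment file))
    file))
```

Toggling previews off and on again then costs only a file lookup
rather than another LaTeX run.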



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Please document the caching and its user options
  2024-06-18 16:25             ` Ihor Radchenko
@ 2024-06-18 16:33               ` Eli Zaretskii
  2024-06-18 16:55                 ` Ihor Radchenko
  2024-06-18 22:06               ` Rudolf Adamkovič
  1 sibling, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2024-06-18 16:33 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: n142857, emacs-orgmode

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: n142857@gmail.com, emacs-orgmode@gnu.org
> Date: Tue, 18 Jun 2024 16:25:10 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> It is impossible. We need to store files like latex previews
> >> somewhere. This somewhere is org-persist-directory now.
> >
> > Sorry, I don't understand: why do you need to store them as files?
> > Why not keep the previews in buffer(s)?
> 
> In Org mode, in order to create latex previews, we
> (1) run latex to generate the preview image
> (2) that image is stored in some directory
> (3) we display that image over the corresponding latex fragment in an
>     overlay
> (4) we retain the image on disk, so that we do not need to run latex
>     many times if the user toggles displaying the previews (this is
>     very important, because running latex is costly)
> 
> Can we instead store them in memory? Yes, but (1) it will make Emacs RAM
> consumption grow constantly as more and more previews are generated;
> (2) it will require significant changes in the Org mode codebase.

I understand all that, but if the user wants it, and insists on not
caching any data, let them have what they want.  My surprise was
caused by your "it is impossible"; I now understand that you meant
"not reasonable" or perhaps "users will not like that" instead.


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Please document the caching and its user options
  2024-06-18 16:33               ` Eli Zaretskii
@ 2024-06-18 16:55                 ` Ihor Radchenko
  2024-06-19  9:27                   ` Colin Baxter
  0 siblings, 1 reply; 61+ messages in thread
From: Ihor Radchenko @ 2024-06-18 16:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: n142857, emacs-orgmode

Eli Zaretskii <eliz@gnu.org> writes:

>> Can we instead store them in memory? Yes, but (1) it will make Emacs RAM
>> consumption grow constantly as more and more previews are generated;
>> (2) it will require significant changes in the Org mode codebase.
>
> I understand all that, but if the user wants it, and insists on not
> caching any data, let them have what they want.

It is not about letting or not letting them. I would have to implement
it. (I am ok with it, but I am not going to prioritize my time for
nice-to-haves; though I would not mind patches submitted by interested
users).

> ... My surprise was
> caused by your "it is impossible"; I now understand that you meant
> "not reasonable" or perhaps "users will not like that" instead.

I meant:

1. not reasonable in a sense that it has downsides compared to what we
   do now - save latex previews on disk
2. impossible in a sense that we do not have an existing toggle to store
   cached previews in memory. Such functionality would have to be added;
   and it is not necessarily trivial to add it.



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Please document the caching and its user options
  2024-06-18 16:25             ` Ihor Radchenko
  2024-06-18 16:33               ` Eli Zaretskii
@ 2024-06-18 22:06               ` Rudolf Adamkovič
  2024-06-19  4:29                 ` tomas
  1 sibling, 1 reply; 61+ messages in thread
From: Rudolf Adamkovič @ 2024-06-18 22:06 UTC (permalink / raw)
  To: Ihor Radchenko, Eli Zaretskii; +Cc: n142857, emacs-orgmode

Ihor Radchenko <yantar92@posteo.net> writes:

> Can we instead store them in memory? Yes, but (1) it will make Emacs RAM
> consumption grow constantly as more and more previews are generated;
> (2) it will require significant changes in the Org mode codebase.

And, (3) all previews would be lost every time one shuts down their
computer, say for the night, or even restarts Emacs, which would be
a terrible experience.

Rudy
-- 
"It is no paradox to say that in our most theoretical moods we may be
nearest to our most practical applications."
--- Alfred North Whitehead, 1861-1947

Rudolf Adamkovič <rudolf@adamkovic.org> [he/him]
http://adamkovic.org



* Re: Please document the caching and its user options
  2024-06-18 22:06               ` Rudolf Adamkovič
@ 2024-06-19  4:29                 ` tomas
  0 siblings, 0 replies; 61+ messages in thread
From: tomas @ 2024-06-19  4:29 UTC (permalink / raw)
  To: Rudolf Adamkovič
  Cc: Ihor Radchenko, Eli Zaretskii, n142857, emacs-orgmode

On Wed, Jun 19, 2024 at 12:06:42AM +0200, Rudolf Adamkovič wrote:
> Ihor Radchenko <yantar92@posteo.net> writes:
> 
> > Can we instead store them in memory? Yes, but (1) it will make Emacs RAM
> > consumption grow constantly as more and more previews are generated;
> > (2) it will require significant changes in the Org mode codebase.
> 
> And, (3) all previews would be lost every time one shuts down their
> computer, say for the night, or even restarts Emacs, which would be
> a terrible experience.

I was one of those clamouring for a "master switch". I'm aware of all
of that. I can live with that (not everyone will, that's why it should
be optional).

I arrived at the impression that the discussion was becoming unproductive,
that's why I gave up and went with the non-existing directory trick.

Cheers
-- 
t


* Re: Please document the caching and its user options
  2024-06-18 16:55                 ` Ihor Radchenko
@ 2024-06-19  9:27                   ` Colin Baxter
  2024-06-19 10:35                     ` Ihor Radchenko
  0 siblings, 1 reply; 61+ messages in thread
From: Colin Baxter @ 2024-06-19  9:27 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, n142857, emacs-orgmode

>>>>> Ihor Radchenko <yantar92@posteo.net> writes:

    > Eli Zaretskii <eliz@gnu.org> writes:
    >>> Can we instead store them in memory? Yes, but (1) it will make
    >>> Emacs RAM consumption grow constantly as more and more previews
    >>> are generated; (2) it will require significant changes in the
    >>> Org mode codebase.
    >> 
    >> I understand all that, but if the user wants it, and insist on
    >> not caching any data, let them have what they want.

    > It is not about letting or not letting them. I would have to
    > implement it. (I am ok with it, but I am not going to prioritize
    > my time for nice-to-haves; though I would not mind patches
    > submitted by interested users).

    >> ... My surprise was caused by your "it is impossible"; I now
    >> understand that you meant "not reasonable" or perhaps "users will
    >> not like that" instead.

    > I meant:

    > 1. not reasonable in a sense that it has downsides compared to
    >    what we do now - save latex previews on disk
    > 2. impossible in a sense that we do not have an existing toggle
    >    to store cached previews in memory. Such functionality would
    >    have to be added; and it is not necessarily trivial to add it.

I too was one of those complainers who wanted to be able to disable
org-persist completely. The argument about latex preview is really a
non-starter in my opinion. I never use latex-preview and I'm sure I'm
not alone in this. I also would not class the disabling of org-persist
to be a 'nice-to-have'.

Best wishes,

Colin Baxter.



* Re: Please document the caching and its user options
  2024-06-19  9:27                   ` Colin Baxter
@ 2024-06-19 10:35                     ` Ihor Radchenko
  2024-06-19 13:04                       ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Ihor Radchenko @ 2024-06-19 10:35 UTC (permalink / raw)
  To: m43cap; +Cc: Eli Zaretskii, n142857, emacs-orgmode

Colin Baxter <m43cap@yandex.com> writes:

>     > 1. not reasonable in a sense that it has downsides compared to
>     >    what we do now - save latex previews on disk
>     > 2. impossible in a sense that we do not have an existing toggle
>     >    to store cached previews in memory. Such functionality would
>     >    have to be added; and it is not necessarily trivial to add it.
>
> I too was one of those complainers who wanted to be able to disable
> org-persist completely. The argument about latex preview is really a
> non-starter in my opinion. I never use latex-preview and I'm sure I'm
> not alone in this. I also would not class the disabling of org-persist
> to be a 'nice-to-have'.

Let me clarify.

If you do not use latex-preview or other features that cache their
results, org-persist should not create any files or directories.
(It currently does create gc-lock.eld, but I will fix this)

However, if you do use it, Org mode has no option to disable creating
the cache.  In fact, Org mode never had such an option. For example,
`org-preview-latex-image-directory' has been a part of Org mode since at
least Org 9.0, and there was never an option to disable it. org-persist
did not introduce anything drastically new in this regard.

So, this discussion and people insisting on completely disabling the
cache is a bit strange to me. I suspect that the problem may be not the
cache itself, but either (1) that it is created when cache features are
not really used; (2) that it is created in .emacs.d for some users.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: Please document the caching and its user options
  2024-06-19 10:35                     ` Ihor Radchenko
@ 2024-06-19 13:04                       ` Eli Zaretskii
  2024-06-19 13:30                         ` Ihor Radchenko
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2024-06-19 13:04 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: m43cap, n142857, emacs-orgmode

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: Eli Zaretskii <eliz@gnu.org>, n142857@gmail.com, emacs-orgmode@gnu.org
> Date: Wed, 19 Jun 2024 10:35:39 +0000
> 
> If you do not use latex-preview or other features that cache their
> results, org-persist should not create any files or directories.
> (It currently does create gc-lock.eld, but I will fix this)
> 
> However, if you do use it, Org mode has no option to disable creating
> the cache.  In fact, Org mode never had such an option. For example,
> `org-preview-latex-image-directory' has been a part of Org mode since at
> least Org 9.0, and there was never an option to disable it. org-persist
> did not introduce anything drastically new in this regard.
> 
> So, this discussion and people insisting on completely disabling the
> cache is a bit strange to me. I suspect that the problem may be not the
> cache itself, but either (1) that it is created when cache features are
> not really used; (2) that it is created in .emacs.d for some users.

Let me clarify.  In the scenario in which I found out about Org
caching, I didn't use latex-preview, not at all.  All I did was visit
the org.org file that we have now on the master branch, and look
around for a while (specifically, I looked for the constructs that
produce the Texinfo @dircategory and @direntry directives).  Perhaps
the caching I saw was a different kind of caching, I don't know (hence
the request to document that, and if there's more than one kind of
caching, I hope they will all be documented), but evidently the
caching by Org happens (by default!) even if the user doesn't come
anywhere near latex-preview.



* Re: Please document the caching and its user options
  2024-06-19 13:04                       ` Eli Zaretskii
@ 2024-06-19 13:30                         ` Ihor Radchenko
  2024-06-19 16:07                           ` Colin Baxter
  0 siblings, 1 reply; 61+ messages in thread
From: Ihor Radchenko @ 2024-06-19 13:30 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: m43cap, n142857, emacs-orgmode

Eli Zaretskii <eliz@gnu.org> writes:

> Let me clarify.  In the scenario in which I found out about Org
> caching, I didn't use latex-preview, not at all....

Sure. Org uses multiple caches.
You encountered the one created by the parser. The parser cache in
particular can be disabled. But not the latex preview cache.

When replying to Colin, I was clarifying about why some parts of the
cache cannot be disabled. That's all.
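
For reference, a minimal sketch of turning the parser cache off from an
init file. `org-element-cache-persistent' is the option named elsewhere in
this thread; `org-element-use-cache' is assumed here to be the option that
disables the parser cache entirely. Option names may differ between Org
versions, so check them with C-h v before relying on this.

```elisp
;; Keep the in-memory parser cache, but do not write it to disk.
(setq org-element-cache-persistent nil)

;; Alternatively, disable the parser cache altogether.
;; Assumption: this option exists in your Org version; parsing large
;; files may become slower without the cache.
(setq org-element-use-cache nil)
```

Neither setting affects the latex preview cache, which has no such toggle.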

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: Please document the caching and its user options
  2024-06-19 13:30                         ` Ihor Radchenko
@ 2024-06-19 16:07                           ` Colin Baxter
  2024-06-19 16:15                             ` Ihor Radchenko
  0 siblings, 1 reply; 61+ messages in thread
From: Colin Baxter @ 2024-06-19 16:07 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, n142857, emacs-orgmode

>>>>> Ihor Radchenko <yantar92@posteo.net> writes:

    > Eli Zaretskii <eliz@gnu.org> writes:
    >> Let me clarify.  In the scenario in which I found out about Org
    >> caching, I didn't use latex-preview, not at all....

    > Sure. Org uses multiple caches.  You encountered the one created
    > by the parser. The parser cache in particular can be disabled. But
    > not the latex preview cache.

This is what I cannot understand. If the user never uses latex preview, why
can the latex preview cache not be disabled? I don't want to go on and on
and become a bore - I've said my piece and I will be silent from now
on.

Best wishes,

Colin Baxter.




* Re: Please document the caching and its user options
  2024-06-19 16:07                           ` Colin Baxter
@ 2024-06-19 16:15                             ` Ihor Radchenko
  0 siblings, 0 replies; 61+ messages in thread
From: Ihor Radchenko @ 2024-06-19 16:15 UTC (permalink / raw)
  To: m43cap; +Cc: Eli Zaretskii, n142857, emacs-orgmode

Colin Baxter <m43cap@yandex.com> writes:

> This is what I cannot understand. If the user never uses latex preview, why
> can the latex preview cache not be disabled? I don't want to go on and on
> and become a bore - I've said my piece and I will be silent from now
> on.

I believe that we have some kind of misunderstanding.
Disabling cache only makes sense when it is used.
When it is unused, no cache files will be created.

So, your ask to allow disabling preview cache means that you want latex
previews to work without creating cache files, which is currently not an
option.

If you do not use latex previews, no cache files will be created due to
latex previews. Other cache files may be created though. One of them is
parser cache, but you can disable this one.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: Please document the caching and its user options
  2024-06-16 10:41                   ` Eli Zaretskii
@ 2024-06-23  9:12                     ` Björn Bidar
  0 siblings, 0 replies; 61+ messages in thread
From: Björn Bidar @ 2024-06-23  9:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Ihor Radchenko, emacs-orgmode, emacs-devel, michael.albinus

Eli Zaretskii <eliz@gnu.org> writes:

>> Eli Zaretskii <eliz@gnu.org> writes:
>> 
>> >> I was referring to some kind of global option that defines cache
>> >> directory, data directory, etc. Something akin XDG.
>> >
>> > We already have xdg-cache-home (and a few others in xdg.el).  Is that
>> > what you meant?
>> 
>> Yes, except that `xdg-cache-home' is limited:
>> 
>> 1. It cannot be customized by users
>
> Of course it can: just make the default value of a defcustom be
> derived by xdg-cache-home, and users can then customize the option to
> a different value if they want.

That and it can be overridden just like any XDG Directory variable using
environment variables.
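
A minimal sketch of the pattern described above, assuming a hypothetical
package called `my-package' (the group and option names are invented for
illustration). The default comes from `xdg-cache-home', so it follows the
XDG_CACHE_HOME environment variable, and users can still customize it.

```elisp
(require 'xdg)

(defgroup my-package nil
  "Hypothetical package used only for this illustration."
  :group 'applications)

;; The default value is computed when the package is loaded, so an
;; XDG_CACHE_HOME set in the environment is honored automatically.
(defcustom my-package-cache-directory
  (expand-file-name "my-package/" (xdg-cache-home))
  "Directory where my-package stores its cache files."
  :type 'directory
  :group 'my-package)
```

With this in place, M-x customize-option or a plain `setq' in the init file
overrides the XDG-derived default.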



* Re: Please document the caching and its user options
  2024-06-18 15:53         ` Ihor Radchenko
  2024-06-18 16:15           ` Eli Zaretskii
@ 2024-06-23 11:45           ` Daniel Clemente
  2024-06-24 10:36             ` Ihor Radchenko
  1 sibling, 1 reply; 61+ messages in thread
From: Daniel Clemente @ 2024-06-23 11:45 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode

>
> Thanks!
> I am attaching a tentative patch that improves the documentation. I hope
> that it clarifies things for you.
>
>


Thanks. I'm not sure about the "unless" part here:

> Persisting the cache to disk […]
> It is not recommended if the Org files
> include sensitive data, unless the data is encrypted via `org-crypt'.")

I first mentioned org-crypt because users of org-crypt may be
surprised if they see encrypted data stored unencrypted on disk, due
to this cache.
A user has somefile.org which contains some headers marked with the
"crypt" tag. Only those headers are encrypted. The org-element cache
may now cache the whole file, including the encrypted headers (this is
ok). Now the user temporarily decrypts the encrypted header, works on
it some time (including closing the file and opening it again) then
encrypts the section again. During the time that the header was
unencrypted, the org-element cache was storing information about
unencrypted data in ~/.cache/org-persist, which could even be a remote
server (NFS, SMB etc), not as private as the org file itself.

Apparently the data stored in the cache doesn't contain the actual
paragraphs of text but it still contains plain text (like: names of
tags, properties, files, macros, scheduling information), which I
would call private if I'm using org-crypt.


I saw some code related to the org-element cache to avoid putting
encrypted files in the cache, but if I remember correctly that would
be just for whole encrypted files.
The part about how org-crypt works with caching could also be
documented in org-crypt instead, or in the manual.


The rest of the documentation change seems good, it improves things.
I would just mention the shortcomings or disclaimers, if there are any.
For instance I worry about what may happen when different Emacs
processes load the same Org files at the same time (e.g. I run several
automated batch export jobs). And I guess that having a disk cache
creates new problems, like when in a web browser a simple F5 won't
refresh and you need S-F5.
But if there are no shortcomings (i.e. all operations will always use
up to date information and everything will keep working as usual when
you enable on-disk cache), it's ok like it is. It's also good if it's
explicitly mentioned. It could also be mentioned somewhere else, like
in a cache section in the manual, if it gets one.


> > My own experience, very subjective and it may be an edge case, is that
> > enabling org-element-cache-persistent didn't make loading my org files
> > faster; on the contrary, it made some things slower (including closing
> > Emacs).
>
> What happens if you set `org-persist--report-time' to t in your config
> and examine *Messages* buffer after opening/closing some Org files?
> Look for "org-persist:..." messages.

Thanks, I'll try that for some time and learn about org-persist and if
there are problems I'll continue in another thread.
For now I can say I see that each operation takes 0.00 or 0.01
seconds, but I have ~150 files so it amounts to a short delay (shorter
than the last time I tried it).
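
For anyone wanting to repeat the measurement, the setting suggested above
could look like the following sketch. `org-persist--report-time' is an
internal variable (note the double dash), so its meaning is an assumption
that may not hold across Org versions.

```elisp
;; Assumption: a non-nil, non-number value makes org-persist report the
;; time of every read/write in the *Messages* buffer.
(with-eval-after-load 'org-persist
  (setq org-persist--report-time t))
```

After opening and closing a few Org files, look for lines starting with
"org-persist:" in *Messages*.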



* Re: Please document the caching and its user options
  2024-06-23 11:45           ` Daniel Clemente
@ 2024-06-24 10:36             ` Ihor Radchenko
  2024-06-26 12:59               ` Daniel Clemente
  0 siblings, 1 reply; 61+ messages in thread
From: Ihor Radchenko @ 2024-06-24 10:36 UTC (permalink / raw)
  To: Daniel Clemente; +Cc: Eli Zaretskii, emacs-orgmode

Daniel Clemente <n142857@gmail.com> writes:

> Thanks. I'm not sure about the "unless" part here:
>
>> Persisting the cache to disk […]
>> It is not recommended if the Org files
>> include sensitive data, unless the data is encrypted via `org-crypt'.")
>
> I first mentioned org-crypt because users of org-crypt may be
> surprised if they see encrypted data stored unencrypted on disk, due
> to this cache.

No unencrypted data should be stored in the cache _on fs_.
If it does get stored, it is a bug that should be reported.

> A user has somefile.org which contains some headers marked with the
> "crypt" tag. Only those headers are encrypted. The org-element cache
> may now cache the whole file, including the encrypted headers (this is
> ok). Now the user temporarily decrypts the encrypted header, works on
> it some time (including closing the file and opening it again) then
> encrypts the section again. During the time that the header was
> unencrypted, the org-element cache was storing information about
> unencrypted data in ~/.cache/org-persist, which could even be a remote
> server (NFS, SMB etc), not as private as the org file itself.

Nope. Storing to disk only happens when you kill the buffer and before
exiting Emacs. At that point, org-crypt must take care of
re-encrypting everything.

> The rest of the documentation change seems good, it improves things.
> I would just mention the shortcomings or disclaimers, if there are.
> For instance I worry about what may happen when different Emacs
> processes load the same Org files at the same time (e.g. I run several
> automated batch export jobs). And I guess that having a disk cache
> creates new problems, like when in a web browser a simple F5 won't
> refresh and you need S-F5.
> But if there are no shortcomings (i.e. all operations will always use
> up to date information and everything will keep working as usual when
> you enable on-disk cache), it's ok like it is. It's also good if it's
> explicitly mentioned. It could also be mentioned somewhere else, like
> in a cache section in the manual, if it gets one.

Multiple Emacs instances are handled correctly. I do not see much
point documenting that things are working as expected.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: Please document the caching and its user options
  2024-06-24 10:36             ` Ihor Radchenko
@ 2024-06-26 12:59               ` Daniel Clemente
  2024-06-26 13:21                 ` org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options) Ihor Radchenko
  2024-06-27  9:27                 ` Please document the caching and its user options Eli Zaretskii
  0 siblings, 2 replies; 61+ messages in thread
From: Daniel Clemente @ 2024-06-26 12:59 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode

> > A user has somefile.org which contains some headers marked with the
> > "crypt" tag. Only those headers are encrypted. The org-element cache
> > may now cache the whole file, including the encrypted headers (this is
> > ok). Now the user temporarily decrypts the encrypted header, works on
> > it some time (including closing the file and opening it again) then
> > encrypts the section again. During the time that the header was
> > unencrypted, the org-element cache was storing information about
> > unencrypted data in ~/.cache/org-persist, which could even be a remote
> > server (NFS, SMB etc), not as private as the org file itself.
> Nope. Storing to disk only happens when you kill the buffer and before
> exiting Emacs. At that point, org-crypt must take care of
> re-encrypting everything.

Sometimes org-crypt fails to reencrypt the data. E.g. if Emacs
crashes, or if you fail to type the same password twice, or of course
if you don't use (org-crypt-use-before-save-magic), etc.
At the end of the day when I do "git diff" + "git commit" sometimes I
realize there's unencrypted data and then I have to reencrypt it. In
the meantime I might have killed and reopened the buffer, thus
updating the file cache.
That may be a problem in org-crypt and something to document in
org-crypt itself. The point is that users of org-crypt should take
extra precautions when enabling org-element-cache-persistent. Like:
not closing buffers while the sections are unencrypted.
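
For context, the automatic re-encryption mentioned above is usually set up
along these lines. This is a sketch of a typical org-crypt configuration,
not a description of my own setup.

```elisp
(require 'org-crypt)

;; Re-encrypt all entries tagged "crypt" whenever an Org buffer is saved.
(org-crypt-use-before-save-magic)

;; Prevent subheadings from inheriting the "crypt" tag, so already
;; encrypted text is not encrypted a second time.
;; Note: this overwrites any existing value of the variable.
(setq org-tags-exclude-from-inheritance '("crypt"))
```

Without the call to `org-crypt-use-before-save-magic', decrypted entries
stay in plain text until they are re-encrypted by hand with
M-x org-encrypt-entry or M-x org-encrypt-entries.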

> Multiple Emacs instances are handled correctly. I do not see much
> point documenting that things are working as expected.

Ok, thanks, it's good to read this guarantee here. I'm used to
org-element cache inconsistency errors, so I didn't know the state of
things.
I agree it doesn't need to be in the docstring.
If there's some chapter about caches in the manual (which is one of
the topics in the original post of this thread) it can describe these
minor things. But the major ones like what it does and how to turn it
on/off are more interesting.



* org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options)
  2024-06-26 12:59               ` Daniel Clemente
@ 2024-06-26 13:21                 ` Ihor Radchenko
  2024-06-27  8:55                   ` Daniel Clemente
  2024-06-27  9:27                 ` Please document the caching and its user options Eli Zaretskii
  1 sibling, 1 reply; 61+ messages in thread
From: Ihor Radchenko @ 2024-06-26 13:21 UTC (permalink / raw)
  To: Daniel Clemente; +Cc: Eli Zaretskii, emacs-orgmode

Daniel Clemente <n142857@gmail.com> writes:

> Sometimes org-crypt fails to reencrypt the data. E.g. if Emacs
> crashes, or if you fail to type the same password twice, or of course
> if you don't use (org-crypt-use-before-save-magic), etc.

I do not think that there is anything left on disk if Emacs crashes.

As for not typing the same password twice and not using
org-crypt-use-before-save-magic, we should somehow fix this.
(I am starting a new thread branch.)

One simple idea is to disable backups if encryption fails.
Or use `write-contents-functions' instead of `before-save-hook' - that
way, Emacs will not ignore errors thrown by org-crypt and will not
actually save anything if encryption fails.
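
A rough sketch of the second idea, as it could be tried from a user's init
file. The helper name is hypothetical and this is not the eventual
org-crypt change; that an error raised here reliably prevents the write is
the assumption stated above.

```elisp
(require 'org-crypt)

(defun my/org-encrypt-before-write ()
  "Encrypt all tagged entries, then let Emacs write the file.
Hypothetical helper, not part of org-crypt.  It returns nil because
a non-nil value would tell Emacs the file was already written."
  (org-encrypt-entries)
  nil)

;; Install the helper buffer-locally in every Org buffer.
(add-hook 'org-mode-hook
          (lambda ()
            (add-hook 'write-contents-functions
                      #'my/org-encrypt-before-write nil t)))
```

If `org-encrypt-entries' signals an error, it propagates out of the save
command before the file contents are written.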

> At the end of the day when I do "git diff" + "git commit" sometimes I
> realize there's unencrypted data and then I have to reencrypt it. In
> the meantime I might have killed and reopened the buffer, thus
> updating the file cache.
> That may be a problem in org-crypt and something to document in
> org-crypt itself. The point is that users of org-crypt should take
> extra precautions when enabling org-element-cache-persistent. Like:
> not closing buffers while the sections are unencrypted.

These things should be considered bugs. And we should fix them. Cache and
other libraries should not be responsible for special treatment of
optional org-crypt library.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options)
  2024-06-26 13:21                 ` org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options) Ihor Radchenko
@ 2024-06-27  8:55                   ` Daniel Clemente
  2024-06-27 10:15                     ` org-encrypt-entries is slow (was: org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options)) Ihor Radchenko
  2024-06-27 10:34                     ` org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options) Ihor Radchenko
  0 siblings, 2 replies; 61+ messages in thread
From: Daniel Clemente @ 2024-06-27  8:55 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode

>
> As for not typing the same password twice and not using
> org-crypt-use-before-save-magic, we should somehow fix this.
> (I am starting a new thread branch.)
>

„Not using org-crypt-use-before-save-magic“ is currently a user
decision, not a bug.
For instance, I don't use it because it adds around 5 seconds to each
saving of a large file. If it were instantaneous I would enable it.
With it disabled, this explains why I often find unencrypted sections
at the end of the day… I have to rely on myself to reencrypt
them.



> One simple idea is to disable backups if encryption fails.
> Or use `write-contents-functions' instead of `before-save-hook' - that
> way, Emacs will not ignore errors thrown by org-crypt and will not
> actually save anything if encryption fails.
>

Disabling backups makes sense too, if we decide that unencrypted
private data shouldn't end up in backups.
I don't have an absolute opinion. Some people may prefer having
backups of all data (including private unencrypted data).

If it's possible to detect whether encryption failed in this buffer,
there could be a warning saying „Last encryption failed. Really
save?“.
Or just a message in the style of „Encryption failed. Saving the file
may store unencrypted data on disk, and in backups and cache if
enabled“.

Totally preventing the user from saving a file seems harsh but it also
seems safer. Since users have different safety preferences, Emacs can
let the user decide what to do, through a question or optional
setting.


> > At the end of the day when I do "git diff" + "git commit" sometimes I
> > realize there's unencrypted data and then I have to reencrypt it. In
> > the meantime I might have killed and reopened the buffer, thus
> > updating the file cache.
> > That may be a problem in org-crypt and something to document in
> > org-crypt itself. The point is that users of org-crypt should take
> > extra precautions when enabling org-element-cache-persistent. Like:
> > not closing buffers while the sections are unencrypted.
>
> These things should be considered bugs. And we should fix them. Cache and
> other libraries should not be responsible for special treatment of
> optional org-crypt library.
>

You can't fix all bugs all the time, so you can't base security on „we
strongly believe there are no more bugs“. If doing an extra
verification (to avoid storing private data on disk in unencrypted
form) is fast, it's better with the verification.

In addition, „leaving some encrypted sections unencrypted for a short
amount of time, and closing and reopening the buffer during that time“
isn't a bug, it's a possible user behaviour that we can't control. But
org-crypt can mention that that behaviour is unsafe when using on-disk
cache. Or detect it (if it's fast) and warn the user.

> Cache and
> other libraries should not be responsible for special treatment of
> optional org-crypt library.

That's arbitrary. Both persistent cache and org-crypt are optional,
but any of them can check whether the other is enabled and try to do
what the user wants.
I know they both have separate responsibilities, but if there are only
these 2 parts, one of them must be the one caring about „unencrypted
data leaking into disk caches“.

It would be different if we had a third component… E.g. imagine we had
a component/overlay/text property/… in Emacs that could tell whether a
buffer's region contains very private information or not; then all
other components could just obey that setting (that section won't be
backed up, it won't end up in disk cache, … It can even be displayed
in a different face). Then org-crypt just needs to set that flag when
encryption fails. Does something like that exist? Anyway this is a bit
utopian or overengineered. Simpler ways of improving things are with
documentation (e.g. „Don't do this, it's unsafe“), with messages
(„You're doing this, which may be unsafe“), or with questions („Really
do this unsafe thing?“)



* Re: Please document the caching and its user options
  2024-06-26 12:59               ` Daniel Clemente
  2024-06-26 13:21                 ` org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options) Ihor Radchenko
@ 2024-06-27  9:27                 ` Eli Zaretskii
  2024-06-27 10:11                   ` Ihor Radchenko
  1 sibling, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2024-06-27  9:27 UTC (permalink / raw)
  To: yantar92; +Cc: emacs-orgmode

Here's one example where org-persist (I think) triggers an annoying
select-safe-coding-system popup because it is trying to cache
something behind the user's back:

  . set your locale's codeset to something other than UTF-8
  . type into *scratch*:

(insert "* heading\n(")
(org-mode)
(hs-minor-mode)
(hs-hide-all)

  . mark the region around these 4 sexps and type "M-x eval-region"
  . observe the popup *Warning* window asking you to select a coding
    system

This is with Emacs 30.0.60, from the emacs-30 release branch.

Can this be fixed on the emacs-30 release branch, please?



* Re: Please document the caching and its user options
  2024-06-27  9:27                 ` Please document the caching and its user options Eli Zaretskii
@ 2024-06-27 10:11                   ` Ihor Radchenko
  2024-06-27 10:30                     ` Eli Zaretskii
  2024-06-28 12:54                     ` Rudolf Adamkovič
  0 siblings, 2 replies; 61+ messages in thread
From: Ihor Radchenko @ 2024-06-27 10:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-orgmode

Eli Zaretskii <eliz@gnu.org> writes:

>   . set your locale's codeset to something other than UTF-8
>   . type into *scratch*:
>
> (insert "* heading\n(")
> (org-mode)
> (hs-minor-mode)
> (hs-hide-all)
>
>   . mark the region around these 4 sexps and type "M-x eval-region"
>   . observe the popup *Warning* window asking you to select a coding
>     system

Fixed, on bugfix.
https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=5ffb2675f
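
For readers who write files from background Lisp code, the general
technique requested at the start of this thread can be sketched as below.
This illustrates binding `coding-system-for-write' and is not the content
of the commit linked above; the function name is hypothetical.

```elisp
(defun my/write-cache-file (file data)
  "Write the Lisp object DATA to FILE without prompting for a coding system.
Hypothetical helper for illustration only."
  (let ((coding-system-for-write 'utf-8-unix)
        ;; Make sure long or deeply nested data is not abbreviated.
        (print-length nil)
        (print-level nil))
    (with-temp-file file
      (prin1 data (current-buffer)))))
```

Because the encoding is fixed by the caller, `select-safe-coding-system'
has no reason to ask the user, whatever the locale's codeset is.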

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* org-encrypt-entries is slow (was: org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options))
  2024-06-27  8:55                   ` Daniel Clemente
@ 2024-06-27 10:15                     ` Ihor Radchenko
  2024-07-02 16:54                       ` Daniel Clemente
  2024-06-27 10:34                     ` org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options) Ihor Radchenko
  1 sibling, 1 reply; 61+ messages in thread
From: Ihor Radchenko @ 2024-06-27 10:15 UTC (permalink / raw)
  To: Daniel Clemente; +Cc: Eli Zaretskii, emacs-orgmode

Daniel Clemente <n142857@gmail.com> writes:

> For instance, I don't use it because it adds around 5 seconds to each
> saving of a large file. If it were instantaneous I would enable it.
> With it disabled, this explains why I often find unencrypted sections
> at the end of the day… I have to rely on myself to reencrypt them
> again.

Does it also happen when you use the latest Org mode version?

* Re: Please document the caching and its user options
  2024-06-27 10:11                   ` Ihor Radchenko
@ 2024-06-27 10:30                     ` Eli Zaretskii
  2024-06-28 12:54                     ` Rudolf Adamkovič
  1 sibling, 0 replies; 61+ messages in thread
From: Eli Zaretskii @ 2024-06-27 10:30 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: emacs-orgmode@gnu.org
> Date: Thu, 27 Jun 2024 10:11:46 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >   . set your locale's codeset to something other than UTF-8
> >   . type into *scratch*:
> >
> > (insert "* heading\n(")
> > (org-mode)
> > (hs-minor-mode)
> > (hs-hide-all)
> >
> >   . mark the region around these 4 sexps and type "M-x eval-region"
> >   . observe the popup *Warning* window asking you to select a coding
> >     system
> 
> Fixed, on bugfix.
> https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=5ffb2675f

Thanks, I hope to see this soon on the emacs-30 release branch of the
Emacs Git repository.


* Re: org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options)
  2024-06-27  8:55                   ` Daniel Clemente
  2024-06-27 10:15                     ` org-encrypt-entries is slow (was: org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options)) Ihor Radchenko
@ 2024-06-27 10:34                     ` Ihor Radchenko
  2024-07-02 16:53                       ` Daniel Clemente
  1 sibling, 1 reply; 61+ messages in thread
From: Ihor Radchenko @ 2024-06-27 10:34 UTC (permalink / raw)
  To: Daniel Clemente; +Cc: Eli Zaretskii, emacs-orgmode

Daniel Clemente <n142857@gmail.com> writes:

>> One simple idea is to disable backups if encryption fails.
>> Or use `write-contents-functions' instead of `before-save-hook' - that
>> way, Emacs will not ignore errors thrown by org-crypt and will not
>> actually save anything if encryption fails.
>>
>
> Disabling backups makes sense too, if we decide that unencrypted
> private data shouldn't end up in backups.
> I don't have an absolute opinion. Some people may prefer having
> backups of all data (including private unencrypted data).

Actually, thinking about it more, I realize that backups may never
contain unencrypted data as long as we never write this unencrypted data
when saving normally. That's because backup is always taken from disk
and never from the buffer contents.

So, the real problem to solve is how to _reliably_ prevent the
unencrypted data from being saved to disk.

> If it's possible to detect whether encryption failed in this buffer,
> there could be a warning saying „Last encryption failed. Really
> save?“.

Yes. In fact, `org-encrypt-entries' throws an error when encryption
fails. However, this error is displayed as a simple message, which is
immediately hidden by "Wrote ..." message emitted a bit later.

This is because `basic-save-buffer' has

;; Don't let errors prevent saving the buffer.
(with-demoted-errors "Before-save hook error: %S"
  (run-hooks 'before-save-hook))

If we use `write-contents-functions' instead of `before-save-hook',
there should be no such problem.
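
A minimal sketch of the difference, assuming a hypothetical stand-in function (`my/encrypt-or-abort' is not org-crypt code, and the failure is simulated with a variable):

```elisp
;; Hypothetical names throughout; this is not org-crypt's implementation.
(defvar my/simulate-encryption-failure t
  "When non-nil, `my/encrypt-or-abort' signals an error.")

(defun my/encrypt-or-abort ()
  "Stand-in for a save-time encryption step.
Return nil on success so that Emacs goes on to write the file."
  (when my/simulate-encryption-failure
    (error "Encryption failed.  Not saving the buffer"))
  nil)

;; With `before-save-hook', `basic-save-buffer' demotes the error to a
;; message (unless `debug-on-error' is non-nil) and still writes the file:
;;   (add-hook 'before-save-hook #'my/encrypt-or-abort nil t)

;; With `write-contents-functions', the error propagates out of
;; `basic-save-buffer', so nothing is written to disk:
(add-hook 'write-contents-functions #'my/encrypt-or-abort nil t)
```

Functions on `write-contents-functions' must return nil when they have not written the file themselves; a non-nil return value tells Emacs that the buffer has already been saved.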

> Or just a message in the style of „Encryption failed. Saving the file
> may store unencrypted data in disk, and in backups and cache if
> enabled“.
>
> Totally preventing the user from saving a file seems harsh but it also
> seems safer. Since users have different safety preferences, Emacs can
> let the user decide what to do, through a question or optional
> setting.

I agree that "saving prevention" must be a user option.

>> These things should be considered bugs. And we should fix them. Cache and
>> other libraries should not be responsible for special treatment of
>> optional org-crypt library.
>>
>
> You can't fix all bugs all the time, so you can't base security on „we
> strongly believe there are no more bugs“.

I did not suggest that.
What I am saying is that "we might have bugs, so be careful" is not
something we need to write in the documentation. The only exception is
when there is a known, long-living bug, that we cannot fix quickly and
must warn users about.

> ... If doing an extra
> verification (to avoid storing private data on disk in unencrypted
> form) is fast, it's better with the verification.

>> Cache and other libraries should not be responsible for special
>> treatment of optional org-crypt library.
>
> That's arbitrary. Both persistent cache and org-crypt are optional,
> but any of them can check whether the other is enabled and try to do
> what the user wants.
> I know they both have separate responsibilities, but if there are only
> these 2 parts, one of them must be the one caring about „unencrypted
> data leaking into disk caches“.

Sure. But I meant that we should still write this code in org-crypt
library, not inside org-persist. This is more of a technical detail and
code style.

> In addition, „leaving some encrypted sections unencrypted for a short
> amount of time, and closing and reopening the buffer during that time“
> isn't a bug, it's a possible user behaviour that we can't control. But
> org-crypt can mention that that behaviour is unsafe when using on-disk
> cache. Or detect it (if it's fast) and warn the user.

I did not mean that opening/closing buffer is a bug.
And I do not see why this behavior is unsafe, sorry.

> It would be different if we had a third component… E.g. imagine we had
> a component/overlay/text property/… in Emacs that could tell whether a
> buffer's region contains very private information or not; then all
> other components could just obey that setting (that section won't be
> backed up, it won't end up in disk cache, … It can even be displayed
> in a different face). Then org-crypt just needs to set that flag when
> encryption fails. Does something like that exist? Anyway this is a bit
> utopian or overengineered. Simpler ways of improving things are with
> documentation (e.g. „Don't do this, it's unsafe“), with messages
> („You're doing this, which may be unsafe“), or with questions („Really
> do this unsafe thing?“)

Sounds interesting, but I am afraid that this idea is too abstract. It
is not clear what Emacs is supposed to do with such regions. Maybe Eli
has something better to say.

* Re: Please document the caching and its user options
  2024-06-27 10:11                   ` Ihor Radchenko
  2024-06-27 10:30                     ` Eli Zaretskii
@ 2024-06-28 12:54                     ` Rudolf Adamkovič
  2024-06-28 15:31                       ` Ihor Radchenko
  1 sibling, 1 reply; 61+ messages in thread
From: Rudolf Adamkovič @ 2024-06-28 12:54 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode

Ihor Radchenko <yantar92@posteo.net> writes:

> Fixed, on bugfix.
> https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=5ffb2675f

FYI: A typo, 's/has/hash/'.

(Optionally, also 's/anyway //'.)

Rudy
-- 
"The whole science is nothing more than a refinement of everyday
thinking."  --- Albert Einstein, 1879-1955

Rudolf Adamkovič <rudolf@adamkovic.org> [he/him]
http://adamkovic.org


* Re: Please document the caching and its user options
  2024-06-28 12:54                     ` Rudolf Adamkovič
@ 2024-06-28 15:31                       ` Ihor Radchenko
  0 siblings, 0 replies; 61+ messages in thread
From: Ihor Radchenko @ 2024-06-28 15:31 UTC (permalink / raw)
  To: Rudolf Adamkovič; +Cc: emacs-orgmode

Rudolf Adamkovič <rudolf@adamkovic.org> writes:

> Ihor Radchenko <yantar92@posteo.net> writes:
>
>> Fixed, on bugfix.
>> https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=5ffb2675f
>
> FYI: A typo, 's/has/hash/'.
>
> (Optionally, also 's/anyway //'.)

Thanks!
Fixed
https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?h=bugfix&id=e377f3da513ee5ccd0022a447b13dddeb2d95068

* Re: org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options)
  2024-06-27 10:34                     ` org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options) Ihor Radchenko
@ 2024-07-02 16:53                       ` Daniel Clemente
  0 siblings, 0 replies; 61+ messages in thread
From: Daniel Clemente @ 2024-07-02 16:53 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode

> > In addition, „leaving some encrypted sections unencrypted for a short
> > amount of time, and closing and reopening the buffer during that time“
> > isn't a bug, it's a possible user behaviour that we can't control. But
> > org-crypt can mention that that behaviour is unsafe when using on-disk
> > cache. Or detect it (if it's fast) and warn the user.
>
> I did not mean that opening/closing buffer is a bug.
> And I do not see why this behavior is unsafe, sorry.
>

Closing a buffer is unsafe when all of the following hold at the same time:
1. you have org-element-cache-persistent set to t (the default)
2. you're using org-crypt and have temporarily decrypted a region but
have not re-encrypted it yet (maybe you plan to re-encrypt it
some hours later today)
3. the cache directory is in a place which is not as private as the
org file. E.g. a network disk, an unencrypted hard drive etc.

It's unsafe because closing the buffer triggers saving the org-element
cache to disk.
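
Until that is addressed, a cautious setup could look like the following sketch.  `org-element-cache-persistent' and `org-persist-directory' are existing user options; the directory shown is only a placeholder, and I assume both are best set before Org is loaded.

```elisp
;; Option 1: keep the parser cache in memory only, so that closing a
;; buffer does not write parsed (possibly decrypted) content to disk.
(setq org-element-cache-persistent nil)

;; Option 2: keep persistence, but store the cache somewhere as private
;; as the Org files themselves.  The path below is a placeholder.
(setq org-persist-directory
      (expand-file-name "~/private/org-persist/"))
```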

> >
> > Disabling backups makes sense too, if we decide that unencrypted
> > private data shouldn't end up in backups.
> > I don't have an absolute opinion. Some people may prefer having
> > backups of all data (including private unencrypted data).
>
> Actually, thinking about it more, I realize that backups may never
> contain unencrypted data as long as we never write this unencrypted data
> when saving normally. That's because backup is always taken from disk
> and never from the buffer contents.
>
> So, the real problem to solve is how to _reliably_ prevent the
> unencrypted data from being saved to disk.
>

If that works, that's good.


> > If it's possible to detect whether encryption failed in this buffer,
> > there could be a warning saying „Last encryption failed. Really
> > save?“.
>
> Yes. In fact, `org-encrypt-entries' throws an error when encryption
> fails. However, this error is displayed as a simple message, which is
> immediately hidden by "Wrote ..." message emitted a bit later.
>
> This is because `basic-save-buffer' has
>
> ;; Don't let errors prevent saving the buffer.
> (with-demoted-errors "Before-save hook error: %S"
>   (run-hooks 'before-save-hook))
>
> If we use `write-contents-functions' instead of `before-save-hook',
> there should be no such problem.
>

It seems so.
But can you also prevent auto-save from saving it?
I randomly found in tar-mode.el line 860 that auto-save doesn't call
the write-contents-functions hook.
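
For auto-save specifically, org-crypt already has the user option `org-crypt-disable-auto-save'.  A sketch of its use follows; the meaning of each value is summarized from its docstring as I remember it, so please check it in your Org version.

```elisp
(require 'org-crypt)

;; Values, as I understand the docstring:
;;   t         disable `auto-save-mode' in the buffer when an entry is decrypted
;;   nil       leave auto-save alone (decrypted text may reach the auto-save file)
;;   'ask      ask what to do each time
;;   'encrypt  keep auto-save, but re-encrypt decrypted entries before auto-saving
(setq org-crypt-disable-auto-save t)
```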


> What I am saying is that "we might have bugs, so be careful" is not
> something we need to write in the documentation. The only exception is
> when there is a known, long-living bug, that we cannot fix quickly and
> must warn users about.
>

The needed message isn't „we might have bugs, so be careful“
… but „if for some reason you keep org-crypt sections unencrypted for
long periods of time, be careful“.
That situation may come from user behaviour (e.g. not having enabled
org-crypt-use-before-save-magic), not from bugs.



> > It would be different if we had a third component… E.g. imagine we had
> > a component/overlay/text property/… in Emacs that could tell whether a
> > buffer's region contains very private information or not; then all
> > other components could just obey that setting (that section won't be
> > backed up, it won't end up in disk cache, … It can even be displayed
> > in a different face). Then org-crypt just needs to set that flag when
> > encryption fails. Does something like that exist? Anyway this is a bit
> > utopian or overengineered. Simpler ways of improving things are with
> > documentation (e.g. „Don't do this, it's unsafe“), with messages
> > („You're doing this, which may be unsafe“), or with questions („Really
> > do this unsafe thing?“)
>
> Sounds interesting, but I am afraid that this idea is too abstract. It
> is not clear what Emacs is supposed to do with such regions. Maybe Eli
> has something better to say.
>

By the way, there's already reveal-mode which marks some private
sections and replaces them with asterisks. For instance if you edit
~/.authinfo and write this
machine some_pc  login my_user port su password my_password
…then you'll see
machine some_pc  login my_user port su password ********
unless you position the cursor there.
It seems it's a display thing only, using overlays. I thought that
maybe unencrypted org-crypt sections could be marked or displayed like
that (with * when unfocused), and then org-element could detect them
and avoid caching those parts. But I'm not sure if org-persist sees
overlays.
Anyway this is beyond my Emacs knowledge and I'm speculating.
I also don't know whether „unfocused unencrypted org-crypt sections“
replaced by asterisks would look nice.


* Re: org-encrypt-entries is slow (was: org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options))
  2024-06-27 10:15                     ` org-encrypt-entries is slow (was: org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options)) Ihor Radchenko
@ 2024-07-02 16:54                       ` Daniel Clemente
  2024-07-02 19:16                         ` Ihor Radchenko
  0 siblings, 1 reply; 61+ messages in thread
From: Daniel Clemente @ 2024-07-02 16:54 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode

> > For instance, I don't use it because it adds around 5 seconds to each
> > saving of a large file. If it were instantaneous I would enable it.
> > With it disabled, this explains why I often find unencrypted sections
> > at the end of the day… I have to rely on myself to reencrypt them
> > again.
>
> Does it also happen when you use the latest Org mode version?
>

Yes, with today's build. It happens with an 11 MB Org file which has
19721 headings (some of them reach level 13).
Here I enabled the profiler, added a space, saved (1 time only), and
reported CPU. It took around 5 seconds.

        4616  89% - command-execute
        4349  84%  - funcall-interactively
        4127  80%   - save-buffer
        4127  80%    - basic-save-buffer
        3931  76%     - run-hooks
        3931  76%      - org-encrypt-entries
        3931  76%       - org-scan-tags
        3931  76%        - org-element-cache-map
        1764  34%         - org-element--parse-to
         868  16%          - org-element--cache-put
         848  16%           - avl-tree-enter
         840  16%            - avl-tree--do-enter
         792  15%             - avl-tree--do-enter
         748  14%              - avl-tree--do-enter
         688  13%               - avl-tree--do-enter
         632  12%                - avl-tree--do-enter
         576  11%                 + avl-tree--do-enter
          52   1%                 + org-element--cache-compare
          56   1%                + org-element--cache-compare
          44   0%               + org-element--cache-compare
           4   0%                 avl-tree--node-branch
          44   0%              + org-element--cache-compare
          48   0%             + org-element--cache-compare
         580  11%          - #<native-comp-function F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_38>
         448   8%           + org-element-headline-parser
          20   0%           + org-element-section-parser
           4   0%             org-element-type
         136   2%          + org-element--cache-find
          20   0%            #<native-comp-function F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_12>
          16   0%            org-element-type
          12   0%            throw
           8   0%          + #<native-comp-function F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_15>
           8   0%          + #<native-comp-function F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_18>
           4   0%            org-element--cache-active-p
        1161  22%         - #<byte-code-function FF1>
         857  16%          - org-entry-get-with-inheritance
         797  15%           + org-element-lineage-map
           8   0%             org-element-at-point
           8   0%             make-closure
           8   0%             org--property-get-separator
           4   0%             mapconcat
         184   3%          + org-element--property
          16   0%          + org-get-tags
           8   0%          + org-encrypt-entry
           8   0%            org-element-begin
           4   0%            #<byte-code-function 99A>
           4   0%            functionp
         446   8%         + org-element-at-point
         200   3%           re-search-forward
         124   2%         + #<native-comp-function F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_66>
          44   0%         + #<native-comp-function F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_9>
          28   0%           match-data
          12   0%           make-closure
           8   0%         + org-element--property
           4   0%           #<native-comp-function F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_15>
           4   0%           org-element-type-p
           4   0%           #<native-comp-function F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_18>
           4   0%           throw
         196   3%     + basic-save-buffer-1
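
The measurement above can be repeated with something like this sketch.  `my/profile-save' is a hypothetical helper; `profiler-start', `profiler-report' and `profiler-stop' are the standard commands from profiler.el.

```elisp
(require 'profiler)

(defun my/profile-save ()
  "Profile one `save-buffer' call in the current buffer.
Hypothetical helper, for measurement only."
  (interactive)
  (profiler-start 'cpu)
  (unwind-protect
      (progn
        ;; Mark the buffer modified so that the save hooks really run.
        (set-buffer-modified-p t)
        (save-buffer)
        ;; Show the report while the profiler is still running.
        (profiler-report))
    (profiler-stop)))
```

Run it with M-x my/profile-save in the large Org buffer; expanding the report entries with TAB gives a tree like the one shown.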


* Re: org-encrypt-entries is slow (was: org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options))
  2024-07-02 16:54                       ` Daniel Clemente
@ 2024-07-02 19:16                         ` Ihor Radchenko
  2024-07-04 10:36                           ` Daniel Clemente
  0 siblings, 1 reply; 61+ messages in thread
From: Ihor Radchenko @ 2024-07-02 19:16 UTC (permalink / raw)
  To: Daniel Clemente; +Cc: Eli Zaretskii, emacs-orgmode

Daniel Clemente <n142857@gmail.com> writes:

>> Does it also happen when you use the latest Org mode version?
>>
>
> Yes, with today's build. It happens with an 11 MB Org file which has
> 19721 headings (some of them reach level 13).
> Here I enabled the profiler, added a space, saved (1 time only), and
> reported CPU. It took around 5 seconds.
>
> ...
>         3931  76%      - org-encrypt-entries
>         3931  76%       - org-scan-tags

Could you try the
https://git.sr.ht/~yantar92/org-mode/log/feature/org-crypt-refactor branch?
Is encryption speed satisfactory then?

* Re: org-encrypt-entries is slow (was: org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options))
  2024-07-02 19:16                         ` Ihor Radchenko
@ 2024-07-04 10:36                           ` Daniel Clemente
  2024-07-06 13:02                             ` Ihor Radchenko
  0 siblings, 1 reply; 61+ messages in thread
From: Daniel Clemente @ 2024-07-04 10:36 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode

> Could you try the
> https://git.sr.ht/~yantar92/org-mode/log/feature/org-crypt-refactor branch?
> Is encryption speed satisfactory then?

With that code I see something strange: I opened a file which had
encrypted :crypt: sections (never unencrypted), and after adding a
space somewhere else and saving, it asked me for an encryption
password. It shouldn't, since all sections are encrypted.
I also see „org-crypt: Re-encrypting all decrypted entries due to
auto-save“ asking me for the encryption password.

But I tried removing all :crypt: tags (I renamed them to something
else), and saving a large file seems as slow as before. A few seconds
(often 5 seconds; sometimes it's just 2 or 3; this was the case before
too). Here's when it's 5, for 1 save:

        4669  82% - command-execute
        4076  72%  - funcall-interactively
        4055  72%   - save-buffer
        4051  71%    - basic-save-buffer
        3831  68%     - run-hook-with-args-until-success
        3831  68%      - org-crypt--encrypt-and-mark-entries
        3831  68%       - org-encrypt-entries
        3831  68%        - org-scan-tags
        3831  68%         - org-element-cache-map
        1859  33%          - org-element--parse-to
         848  15%           + org-element--cache-put
         655  11%           + #<native-comp-function F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_38>
         120   2%           + org-element--cache-find
          28   0%             org-element--cache-active-p
          20   0%             #<native-comp-function F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_12>
          16   0%             org-element-type
           4   0%             #<native-comp-function F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_18>
           4   0%             #<native-comp-function F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_15>
        1100  19%          - #<byte-code-function A8A>
         804  14%           - org-entry-get-with-inheritance
         780  13%            - org-element-lineage-map
         720  12%             - #<byte-code-function A27>
         668  11%              + org--property-local-values
          12   0%                org-element-begin
           4   0%                delq
          36   0%             + org-element--property
           8   0%             + org-element-type-p
           4   0%               functionp
           8   0%              org--property-get-separator
           8   0%              make-closure
         208   3%           + org-element--property
          32   0%           + org-get-tags
           8   0%           + org-element-begin
         384   6%          + org-element-at-point
         164   2%            re-search-forward
         132   2%          + #<native-comp-function F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_66>
          44   0%          + #<native-comp-function F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_9>
          16   0%            match-data
          12   0%          + org-element--property
           8   0%          + #<byte-code-function A02>
           8   0%            make-closure
           4   0%            buffer-base-buffer
           4   0%          + org-element-type-p
         220   3%     + basic-save-buffer-1
           8   0%     execute-extended-command
           6   0%   + org-delete-backward-char
           3   0%   + org-self-insert-command
           3   0%   + previous-line
           1   0%   + next-line
         593  10%  - byte-code


I also see new problems (which would take me a long time to explain
since I don't understand the code or the settings), where:
- Org asks me for an encryption password even if there are no :crypt:
tags. I just changed the only :crypt: tag to a :nocrypt: tag and saved
- Org spends around 20 seconds trying to save the file, in a loop,
reporting:  (error "org-crypt: Encryption failed.  Not saving the
buffer. Error: GPG error: \"Encrypt failed\", \"Canceled; Exit\"")


* Re: org-encrypt-entries is slow (was: org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options))
  2024-07-04 10:36                           ` Daniel Clemente
@ 2024-07-06 13:02                             ` Ihor Radchenko
  2024-07-10 13:09                               ` Daniel Clemente
  0 siblings, 1 reply; 61+ messages in thread
From: Ihor Radchenko @ 2024-07-06 13:02 UTC (permalink / raw)
  To: Daniel Clemente; +Cc: Eli Zaretskii, emacs-orgmode

Daniel Clemente <n142857@gmail.com> writes:

>> Could you try the
>> https://git.sr.ht/~yantar92/org-mode/log/feature/org-crypt-refactor branch?
>> Is encryption speed satisfactory then?
>
> With that code I see something strange: I opened a file which had
> encrypted :crypt: sections (never unencrypted), and after adding a
> space somewhere else and saving, it asked me for an encryption
> password. It shouldn't, since all sections are encrypted.
> I also see „org-crypt: Re-encrypting all decrypted entries due to
> auto-save“ asking me for the encryption password.

Note that it may be asking about _different_ buffers, not just current.
That's because auto-save-mode saves all the buffers, not just current :)

> But I tried removing all :crypt: tags (I renamed them to something
> else), and saving a large file seems as slow as before. A few seconds
> (often 5 seconds; sometimes it's just 2 or 3; this was the case before
> too). Here's when it's 5, for 1 save:

I think I fixed this now.
Could you try the latest version of the same branch?

> I also see new problems (which would take me a long time to explain
> since I don't understand the code or the settings), where:
> - Org asks me for an encryption password even if there are no :crypt:
> tags. I just changed the only :crypt: tag to a :nocrypt: tag and saved

I cannot reproduce. Could you create a small example file and explain how
to trigger the problem you are seeing?

> - Org spends around 20 seconds trying to save the file, in a loop,
> reporting:  (error "org-crypt: Encryption failed.  Not saving the
> buffer. Error: GPG error: \"Encrypt failed\", \"Canceled; Exit\"")

This is curious, but I again have no clue. Maybe the new version of the
branch works a bit better. 

* Re: org-encrypt-entries is slow (was: org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options))
  2024-07-06 13:02                             ` Ihor Radchenko
@ 2024-07-10 13:09                               ` Daniel Clemente
  2024-07-11 10:40                                 ` Ihor Radchenko
  0 siblings, 1 reply; 61+ messages in thread
From: Daniel Clemente @ 2024-07-10 13:09 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode

> > With that code I see something strange: I opened a file which had
> > encrypted :crypt: sections (never unencrypted), and after adding a
> > space somewhere else and saving, it asked me for an encryption
> > password. It shouldn't, since all sections are encrypted.
> > I also see „org-crypt: Re-encrypting all decrypted entries due to
> > auto-save“ asking me for the encryption password.
>
> Note that it may be asking about _different_ buffers, not just current.
> That's because auto-save-mode saves all the buffers, not just current :)
>

It would be good to know which buffer it is asking about; see the
suggestion further below about mentioning its name.
But I think it's already telling me which buffer it is, by moving the
point to the affected section, before asking the question.
It's the first section in the current file. And I can see that it's
actually encrypted.

I see it's trying to decrypt things (therefore it asks for the
password). It shouldn't, since I didn't modify any encrypted section.
I said „it asked me for an encryption password“ because the GPG prompt
confusingly uses the word „encryption“ („Passphrase for symmetric
encryption“), though it's actually asking for a decryption password.
It calls:

  org-decrypt-entry()
  (progn (org-decrypt-entry))
  (if (get-text-property (point) 'org-crypt-auto-encrypted) (progn
(org-decrypt-entry)))
  (while (not (eobp)) (if (get-text-property (point)
'org-crypt-auto-encrypted) (progn (org-decrypt-entry))) (goto-char
(next-single-char-property-change (point) 'org-crypt-auto-encrypted)))
  (save-restriction (widen) (goto-char (point-min)) (while (not
(eobp)) (if (get-text-property (point) 'org-crypt-auto-encrypted)
(progn (org-decrypt-entry))) (goto-char
(next-single-char-property-change (point)
'org-crypt-auto-encrypted))))
  (save-excursion (save-restriction (widen) (goto-char (point-min))
(while (not (eobp)) (if (get-text-property (point)
'org-crypt-auto-encrypted) (progn (org-decrypt-entry))) (goto-char
(next-single-char-property-change (point)
'org-crypt-auto-encrypted)))))
  (let ((modified-flag (buffer-modified-p))) (save-excursion
(save-restriction (widen) (goto-char (point-min)) (while (not (eobp))
(if (get-text-property (point) 'org-crypt-auto-encrypted) (progn
(org-decrypt-entry))) (goto-char (next-single-char-property-change
(point) 'org-crypt-auto-encrypted))))) (set-buffer-modified-p
modified-flag))
  org-crypt--decrypt-marked-entries()
  run-hooks(after-save-hook)
  basic-save-buffer(t)



> > But I tried removing all :crypt: tags (I renamed them to something
> > else), and saving a large file seems as slow as before. A few seconds
> > (often 5 seconds; sometimes it's just 2 or 3; this was the case before
> > too). Here's when it's 5, for 1 save:
>
> I think I fixed this now.
> Could you try the latest version of the same branch?
>

This particular case in which there are no :crypt: tags is fast now,
thanks. In the same large file as before, saving is instantaneous
(well, the usual 100 to 200 ms).

          22   2%     - run-hook-with-args-until-success
          22   2%      - org-crypt--encrypt-and-mark-entries
          22   2%       - let
          22   2%        - condition-case
          22   2%         - unwind-protect
          22   2%          - org-encrypt-entries
          22   2%           - org-encrypt--map-items
          22   2%            - let*
          22   2%             - if
          22   2%              - or
          22   2%               - save-excursion
          22   2%                - save-restriction
          22   2%                   re-search-forward


> > I also see new problems (which would take me a long time to explain
> > since I don't understand the code or the settings), where:
> > - Org asks me for an encryption password even if there are no :crypt:
> > tags. I just changed the only :crypt: tag to a :nocrypt: tag and saved
>
> I cannot reproduce. Could you create a small example file and explain how
> to trigger the problem you are seeing?

This is the text "abc" encrypted with password "abc". Use this file:

* hi                                                                  :nocrypt:
-----BEGIN PGP MESSAGE-----

jA0ECQMCVpS/qSoed5f/0joBYoIRWdgt/+PVQCsZh9sg176SdnvP2Wc8tH/CV1Rk
l2MjAh3Rk19Q2aP2EffpZ5CFeGELTMXCnCYv
=FNtI
-----END PGP MESSAGE-----

Open the file, add a space to the title and save it. The first time it
works (no questions asked) because there's no tag called :crypt:
Now change the :nocrypt: to :crypt: and save.  It asks for the
password. Press C-g to cancel.
Change again the tag to :nocrypt:. Save. It asks for the encryption
password; it shouldn't.
Add a space to the title, save, it keeps asking for the encryption
password, though there's no :crypt: section.


>
> > - Org spends around 20 seconds trying to save the file, in a loop,
> > reporting:  (error "org-crypt: Encryption failed.  Not saving the
> > buffer. Error: GPG error: \"Encrypt failed\", \"Canceled; Exit\"")
>
> This is curious, but I again have no clue. Maybe the new version of the
> branch works a bit better.
>

Since this error can happen because of a problem in a different buffer
(not the current one), would it be good to mention the file name in
that error message?

I didn't see this particular problem again. But I see others, which
are hard to report and reproduce. For instance I had an encrypted
section under a :crypt: header (I see „BEGIN PGP“ and hex codes), I
save, and saving *UNencrypts* the header before saving, without
asking. It should never decrypt when saving, but it does. This happens
with the same small example I posted above (but using the :crypt:
tag).


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: org-encrypt-entries is slow (was: org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options))
  2024-07-10 13:09                               ` Daniel Clemente
@ 2024-07-11 10:40                                 ` Ihor Radchenko
  2024-07-15 17:00                                   ` Daniel Clemente
  0 siblings, 1 reply; 61+ messages in thread
From: Ihor Radchenko @ 2024-07-11 10:40 UTC (permalink / raw)
  To: Daniel Clemente; +Cc: Eli Zaretskii, emacs-orgmode

Daniel Clemente <n142857@gmail.com> writes:

> I see it's trying to decrypt things (therefore it asks for the
> password). It shouldn't, since I didn't modify any encrypted section.
> I said „it asked me for an encryption password“ because the GPG prompt
> confusingly uses the word „encryption“ („Passphrase for symmetric
> encryption“), though it's actually asking for a decryption password.
> ...
> This is the text "abc" encrypted with password "abc". Use this file:
>
> * hi                                                                  :nocrypt:
> -----BEGIN PGP MESSAGE-----
>
> jA0ECQMCVpS/qSoed5f/0joBYoIRWdgt/+PVQCsZh9sg176SdnvP2Wc8tH/CV1Rk
> l2MjAh3Rk19Q2aP2EffpZ5CFeGELTMXCnCYv
> =FNtI
> -----END PGP MESSAGE-----
>
> Open the file, add a space to the title and save it. The first time it
> works (no questions asked) because there's no tag called :crypt:
> Now change the :nocrypt: to :crypt: and save.  It asks for the
> password. Press C-g to cancel.
> Change again the tag to :nocrypt:. Save. It asks for the encryption
> password; it shouldn't.
> Add a space to the title, save, it keeps asking for the encryption
> password, though there's no :crypt: section.

This should be fixed now.
Could you try again?

>> > - Org spends around 20 seconds trying to save the file, in a loop,
>> > reporting:  (error "org-crypt: Encryption failed.  Not saving the
>> > buffer. Error: GPG error: \"Encrypt failed\", \"Canceled; Exit\"")
>>
>> This is curious, but I again have no clue. Maybe the new version of the
>> branch works a bit better.
>>
>
> Since this error can happen because of a problem in a different buffer
> (not the current one), would it be good to mention the file name in
> that error message?

Yes. Done now on the branch.

> I didn't see this particular problem again. But I see others, which
> are hard to report and reproduce. For instance I had an encrypted
> section under a :crypt: header (I see „BEGIN PGP“ and hex codes), I
> save, and saving *UNencrypts* the header before saving, without
> asking. It should never decrypt when saving, but it does. This happens
> with the same small example I posted above (but using the :crypt:
> tag).

The other problem you reported had something to do with incorrectly
cycling encryption state during save. I hope that fixing one also fixed
another.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: org-encrypt-entries is slow (was: org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options))
  2024-07-11 10:40                                 ` Ihor Radchenko
@ 2024-07-15 17:00                                   ` Daniel Clemente
  2024-07-20 14:14                                     ` Ihor Radchenko
  0 siblings, 1 reply; 61+ messages in thread
From: Daniel Clemente @ 2024-07-15 17:00 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode

In that branch, I don't see the previously mentioned bugs; thanks.

But org-crypt still feels strange. For instance, I decrypt a header,
add a space somewhere else and save. It's saved, but the header is
still visibly unencrypted in Emacs; that's unexpected, because
org-crypt-use-before-save-magic promised to „automatically encrypt
entries before a file is saved to disk“.
I checked the file from outside Emacs and I see that the header is
actually encrypted, so technically it did what it promised to do
though I don't see it in Emacs.
So there's a discordance between what I see and what is saved. Maybe
it's a feature, not a bug: „you still see the decrypted contents but you
can trust that when they're saved they'll be saved encrypted“. This
may be clarified in the docstring. If it's a feature, I think it may
be useful; I just don't like having to trust that the silent
background-auto-encryption is working (I'll often want to verify the
file from outside Emacs). But users may have different preferences.
This may be material for another thread.
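
For reference, a minimal sketch of the kind of setup under discussion,
assuming symmetric encryption as described in the org-crypt section of
the Org manual.  The values below are assumptions for illustration; the
configuration actually in use here may differ.

```elisp
(require 'org-crypt)

;; Arrange for entries tagged :crypt: to be encrypted whenever an Org
;; buffer is saved.
(org-crypt-use-before-save-magic)

;; Keep the tag from being inherited, so that sub-entries of an
;; encrypted entry are not encrypted a second time.
(setq org-tags-exclude-from-inheritance '("crypt"))

;; nil means: always use symmetric (passphrase) encryption, which is
;; what produces the "Passphrase for symmetric encryption" prompt.
(setq org-crypt-key nil)
```

With such a setup, the encrypted state on disk can be checked from
outside Emacs by looking for the "BEGIN PGP MESSAGE" line in the saved
file.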

The part about the slowness has improved to acceptable levels, thanks.

Minor thing, not important now: the cursor jumps to the end of the
header after a C-x C-s when in the middle of a currently-decrypted
block without changes.

Another minor thing: I use a key that calls
(org-save-all-org-buffers), and if I press it e.g. from the *scratch*
buffer it may ask me the „Passphrase for symmetric encryption“
question (because I edited some encrypted section) but I don't know
which buffer it's asking about. But it's not a problem because if I
press C-g then I'll see it.

I see a new problem: with (org-crypt-use-before-save-magic) enabled, I
edit a decrypted section, press C-x C-s to save and it asks me for the
encryption password. Here, if I press C-g, org-crypt would catch it
and then tell me that it won't be able to encrypt due to the C-g.
However I'm not pressing C-g, what I'm doing is opening another TTY
frame (I'm running TTY emacsclient, with no X support, but under
urxvt); this makes the minibuffer disappear, and I see „Back to top
level“, and the whole contents of the section being encrypted are
lost.





^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: org-encrypt-entries is slow (was: org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options))
  2024-07-15 17:00                                   ` Daniel Clemente
@ 2024-07-20 14:14                                     ` Ihor Radchenko
  2024-07-24 13:47                                       ` Daniel Clemente
  0 siblings, 1 reply; 61+ messages in thread
From: Ihor Radchenko @ 2024-07-20 14:14 UTC (permalink / raw)
  To: Daniel Clemente; +Cc: Eli Zaretskii, emacs-orgmode

Daniel Clemente <n142857@gmail.com> writes:

> But org-crypt still feels strange. For instance, I decrypt a header,
> add a space somewhere else and save. It's saved, but the header is
> still visibly unencrypted in Emacs; that's unexpected, because
> org-crypt-use-before-save-magic promised to „automatically encrypt
> entries before a file is saved to disk“.
> I checked the file from outside Emacs and I see that the header is
> actually encrypted, so technically it did what it promised to do
> though I don't see it in Emacs.
> So there's a discordance between what I see and what is saved. Maybe
> it's a feature, not a bug: „you still see the decrypted contents but you
> can trust that when they're saved they'll be saved encrypted“. This
> may be clarified in the docstring. If it's a feature, I think it may
> be useful; I just don't like having to trust that the silent
> background-auto-encryption is working (I'll often want to verify the
> file from outside Emacs). But users may have different preferences.
> This may be material for another thread.

Yup, I consider this a feature, especially for people using
auto-save-visited-mode and similar. If saving is triggered on a timer
while you are editing an encrypted heading, having everything encrypted
in the middle of typing is not fun.

> Minor thing, not important now: the cursor jumps to the end of the
> header after a C-x C-s when in the middle of a currently-decrypted
> block without changes.

Should be better now on the latest version of the branch.

> Another minor thing: I use a key that calls
> (org-save-all-org-buffers), and if I press it e.g. from the *scratch*
> buffer it may ask me the „Passphrase for symmetric encryption“
> question (because I edited some encrypted section) but I don't know
> which buffer it's asking about. But it's not a problem because if I
> press C-g then I'll see it.

Should also be better now.

> I see a new problem: with (org-crypt-use-before-save-magic) enabled, I
> edit a decrypted section, press C-x C-s to save and it asks me for the
> encryption password. Here, if I press C-g, org-crypt would catch it
> and then tell me that it won't be able to encrypt due to the C-g.
> However I'm not pressing C-g, what I'm doing is opening another TTY
> frame (I'm running TTY emacsclient, with no X support, but under
> urxvt); this makes the minibuffer disappear, and I see „Back to top
> level“, and the whole contents of the section being encrypted are
> lost.

I tried to reproduce with the latest version of the branch. Seems to
work fine. Could you test?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: org-encrypt-entries is slow (was: org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options))
  2024-07-20 14:14                                     ` Ihor Radchenko
@ 2024-07-24 13:47                                       ` Daniel Clemente
  2024-07-25  7:31                                         ` Ihor Radchenko
  0 siblings, 1 reply; 61+ messages in thread
From: Daniel Clemente @ 2024-07-24 13:47 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode

The 3 small problems mentioned above are fixed, thanks.
Encryption is faster, and safer now. The part about communicating the
encryption status (communicating „this is actually encrypted on disk
even when you're seeing it unencrypted“) can be improved later if
others find the current behaviour confusing.

In the „Back to top level“ situation I mentioned, the encryption
prompt is canceled, and I'm happy that now I don't lose the section I
was encrypting.
Maybe you wanted to have org-crypt also catch this case (encryption
prompt canceled) to be able to show a „could not encrypt“ message. But
I think it's good enough as it is: it stays unencrypted, and on the
next attempt to save it, there will be another encryption password
prompt.

I found minor but unrelated issues, e.g. if you have an empty section like this:

************* abc2                                                    :crypt:
************* def

… if you rename the abc2 header, e.g. to abc, it will ask the
encryption password again, even when the contents (an empty header)
didn't change.

Another minor and weird bug: inline blocks. The part about showing the
unencrypted contents while keeping the disk contents encrypted doesn't
seem to work with encrypted inline blocks: they're saved encrypted,
but they're also displayed encrypted. In fact they can't be displayed
unencrypted even if you call org-decrypt-contents. Maybe inline
encrypted blocks aren't supported.
To test this:
***** section
********************** this is an inline block
                 :crypt:
Content.

If you want you can split this to other threads or just ignore these
edge cases for now.



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: org-encrypt-entries is slow (was: org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options))
  2024-07-24 13:47                                       ` Daniel Clemente
@ 2024-07-25  7:31                                         ` Ihor Radchenko
  2024-07-25 14:08                                           ` Daniel Clemente
  0 siblings, 1 reply; 61+ messages in thread
From: Ihor Radchenko @ 2024-07-25  7:31 UTC (permalink / raw)
  To: Daniel Clemente; +Cc: Eli Zaretskii, emacs-orgmode

Daniel Clemente <n142857@gmail.com> writes:

> I found minor but unrelated issues, e.g. if you have an empty section like this:
>
> ************* abc2                                                    :crypt:
> ************* def
>
> … if you rename the abc2 header, e.g. to abc, it will ask the
> encryption password again, even when the contents (an empty header)
> didn't change.
>
> Another minor and weird bug: inline blocks. The part about showing the
> unencrypted contents while keeping the disk contents encrypted doesn't
> seem to work with encrypted inline blocks: they're saved encrypted,
> but they're also displayed encrypted. In fact they can't be displayed
> unencrypted even if you call org-decrypt-contents. Maybe inline
> encrypted blocks aren't supported.
> To test this:
> ***** section
> ********************** this is an inline block
>                  :crypt:
> Content.
>
> If you want you can split this to other threads or just ignore these
> edge cases for now.

There is no such thing as "inline block" in Org syntax.
The current behavior of org-crypt arises because it mishandles
inlinetasks in a specific way.

I can add support for proper inlinetasks delimited by an END line, but
not for what you call "blocks" - that one is actually a bug where
org-crypt encrypts everything from a one-line inlinetask down to the
next heading.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: org-encrypt-entries is slow (was: org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options))
  2024-07-25  7:31                                         ` Ihor Radchenko
@ 2024-07-25 14:08                                           ` Daniel Clemente
  2024-07-25 14:15                                             ` Ihor Radchenko
  0 siblings, 1 reply; 61+ messages in thread
From: Daniel Clemente @ 2024-07-25 14:08 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode

> There is no such thing as "inline block" in Org syntax.

I meant "inline task", sorry. I remembered "display: inline-block" from CSS…
I don't think we need support for encrypted #+BEGIN_…/#+END… blocks.

Fixing the org-crypt + inline task bugs seems low priority since it's
an uncommon case and there are probably workarounds.




^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: org-encrypt-entries is slow (was: org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options))
  2024-07-25 14:08                                           ` Daniel Clemente
@ 2024-07-25 14:15                                             ` Ihor Radchenko
  0 siblings, 0 replies; 61+ messages in thread
From: Ihor Radchenko @ 2024-07-25 14:15 UTC (permalink / raw)
  To: Daniel Clemente; +Cc: Eli Zaretskii, emacs-orgmode

Daniel Clemente <n142857@gmail.com> writes:

>> There is no such thing as "inline block" in Org syntax.
>
> I meant "inline task", sorry. I remembered "display: inline-block" from CSS…
> I don't think we need support for encrypted #+BEGIN_…/#+END… blocks.

I did not mean this.

>> > ********************** this is an inline block  :crypt:
>> > Content.

In the above, "Content." does not belong to the inlinetask.
Only inlinetasks with ******** END can have non-empty contents.

The fact that it is encrypted anyway is simply a bug in org-crypt.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Proposal: Change publication timestamps (was: Publishing cache)
  2024-06-14 14:31     ` Publishing cache (was: Please document the caching and its user options) Ihor Radchenko
@ 2024-08-12  7:55       ` Jens Lechtenboerger
  2024-08-15 18:29         ` Ihor Radchenko
  0 siblings, 1 reply; 61+ messages in thread
From: Jens Lechtenboerger @ 2024-08-12  7:55 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode


[-- Attachment #1.1: Type: text/plain, Size: 1879 bytes --]

On 2024-06-14, Ihor Radchenko wrote:

> Jens Lechtenboerger <lechten@wi.uni-muenster.de> writes:
>
>> Jumping in here, I do not understand the publishing cache.  Some of
>> my org documents are re-published every time, while others are only
>> re-published after changes.  What is checked where?
>
> See "14.4 Triggering Publication" section of Org mode manual:
>
>        Org uses timestamps to track when a file has changed.  The above
>     functions normally only publish changed files.  You can override this
> [...]

I propose to change caching and checking of timestamps as in the
attached patch.

Currently, org-publish-cache-file-needs-publishing checks whether
the source file was modified after the cached modification time,
which is fine.  However, for each included file B, it checks whether
that was modified more recently than the source file A.  If so, the
source is file A is considered to need publishing.  This does not
make sense.  File A will be published again and again, even if
neither A nor B changed since the last publishing.

In the patch, I store the current time in the publish cache, not the
source file’s modification time.  Also, for included files, I do not
check their timestamp against the one of the source file but against
the cached publish timestamp.
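
The difference between the two checks can be sketched as follows.
This is only an illustration with made-up times: the two my/...
functions are hypothetical helpers, not part of ox-publish.el, and
they only mirror the comparisons described above.

```elisp
(require 'cl-lib)

;; Hypothetical helpers for illustration only; they are not part of
;; ox-publish.el.  All arguments are Emacs time values.

(defun my/needs-publishing-old-p (cached-mtime src-mtime included-mtimes)
  "Old check.  CACHED-MTIME is the source mtime stored at last publishing."
  (or (null cached-mtime)
      (time-less-p cached-mtime src-mtime)
      ;; Included files are compared against the source file's mtime.
      (cl-some (lambda (ct) (time-less-p src-mtime ct)) included-mtimes)))

(defun my/needs-publishing-new-p (publish-time src-mtime included-mtimes)
  "New check.  PUBLISH-TIME is the time of the last publishing."
  (or (null publish-time)
      (time-less-p publish-time src-mtime)
      ;; Included files are compared against the publish time.
      (cl-some (lambda (ct) (time-less-p publish-time ct)) included-mtimes)))

;; A was modified at t=100, B (included by A) at t=200, and A was
;; published at t=300.
(let ((a   (seconds-to-time 100))
      (b   (seconds-to-time 200))
      (pub (seconds-to-time 300)))
  (list
   ;; Old: the cache holds A's mtime, B is newer than A: always republish.
   (my/needs-publishing-old-p a a (list b))
   ;; New: neither A nor B changed after publishing: nothing to do.
   (my/needs-publishing-new-p pub a (list b))
   ;; New: B changed at t=400, after publishing: republish.
   (my/needs-publishing-new-p pub a (list (seconds-to-time 400)))))
;; evaluates to (t nil t)
```

The first element shows the repeated publishing described above; the
second and third show that, with the publish time in the cache, A is
republished only after A or B actually changed.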

What do you think?

As an aside, recursive inclusions are currently not checked.
Maybe code to collect all included files is available elsewhere in
the code base?

Also, in my case, it would be useful to have a new keyword like
“#+PUBLISH_DEPENDENCY: some-file” to record timestamps for
additional dependencies, whose changes should also trigger
publishing.  Currently, that could be added as alternative to the
regexp for INCLUDE.  However, maybe you prefer to refactor that
code to deal with recursive inclusions?

Best wishes,
Jens


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.2: 0001-lisp-ox-publish.el-Use-publish-time-in-publish-cache.patch --]
[-- Type: text/x-diff, Size: 2493 bytes --]

From 5f1c5d4e56afd91db85884df6018960f3639230b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jens=20Lechtenb=C3=B6rger?= <lechten@wi.uni-muenster.de>
Date: Mon, 12 Aug 2024 09:40:13 +0200
Subject: [PATCH] lisp/ox-publish.el: Use publish time in publish cache

* lisp/ox-publish.el (org-publish-update-timestamp): Store current
time in publish cache, instead of modification time of source file.
(org-publish-cache-file-needs-publishing): Return t, if source or an
included file was modified more recently.

Previously, a source file including a newer file was published again
and again.  Now, publishing only happens if either source or an
included file changed since the last publishing.
---
 lisp/ox-publish.el | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/lisp/ox-publish.el b/lisp/ox-publish.el
index 9f943cedc..e26e2ca37 100644
--- a/lisp/ox-publish.el
+++ b/lisp/ox-publish.el
@@ -389,7 +389,7 @@ still decide about that independently."
   "Update publishing timestamp for file FILENAME.
 If there is no timestamp, create one."
   (let ((key (org-publish-timestamp-filename filename pub-dir pub-func))
-	(stamp (org-publish-cache-mtime-of-src filename)))
+	(stamp (current-time)))
     (org-publish-cache-set key stamp)))
 
 (defun org-publish-remove-all-timestamps ()
@@ -1286,9 +1286,10 @@ If FREE-CACHE, empty the cache."
 (defun org-publish-cache-file-needs-publishing
     (filename &optional pub-dir pub-func _base-dir)
   "Check the timestamp of the last publishing of FILENAME.
-Return non-nil if the file needs publishing.  Also check if
-any included files have been more recently published, so that
-the file including them will be republished as well."
+Return non-nil if the file needs publishing.  This is the case
+if either FILENAME does not occur in the publication cache or
+if FILENAME, or any file included by it, was modified after the
+most recent publication."
   (unless org-publish-cache
     (error
      "`org-publish-cache-file-needs-publishing' called, but no cache present"))
@@ -1322,7 +1323,7 @@ the file including them will be republished as well."
     (or (null pstamp)
 	(let ((mtime (org-publish-cache-mtime-of-src filename)))
 	  (or (time-less-p pstamp mtime)
-	      (cl-some (lambda (ct) (time-less-p mtime ct))
+	      (cl-some (lambda (ct) (time-less-p pstamp ct))
 		       included-files-mtime))))))
 
 (defun org-publish-cache-set-file-property
-- 
2.25.1


[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 6187 bytes --]

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: Proposal: Change publication timestamps (was: Publishing cache)
  2024-08-12  7:55       ` Proposal: Change publication timestamps (was: Publishing cache) Jens Lechtenboerger
@ 2024-08-15 18:29         ` Ihor Radchenko
  2024-08-25 17:00           ` Proposal: Change publication timestamps Jens Lechtenboerger
  0 siblings, 1 reply; 61+ messages in thread
From: Ihor Radchenko @ 2024-08-15 18:29 UTC (permalink / raw)
  To: Jens Lechtenboerger; +Cc: Eli Zaretskii, emacs-orgmode

Jens Lechtenboerger <lechten@wi.uni-muenster.de> writes:

>> See "14.4 Triggering Publication" section of Org mode manual:
>>
>>        Org uses timestamps to track when a file has changed.  The above
>>     functions normally only publish changed files.  You can override this
>> [...]
>
> I propose to change caching and checking of timestamps as in the
> attached patch.
>
> Currently, org-publish-cache-file-needs-publishing checks whether
> the source file was modified after the cached modification time,
> which is fine.  However, for each included file B, it checks whether
> that was modified more recently than the source file A.  If so, the
> source is file A is considered to need publishing.  This does not
> make sense.  File A will be published again and again, even if
> neither A nor B changed since the last publishing.
>
> In the patch, I store the current time in the publish cache, not the
> source file’s modification time.  Also, for included files, I do not
> check their timestamp against the one of the source file but against
> the cached publish timestamp.
>
> What do you think?
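
To make the quoted reasoning concrete, the two checks can be sketched
as below.  This is a simplified sketch, not the code from
ox-publish.el: the function names `sketch-needs-publishing-old' and
`sketch-needs-publishing-new' and the numeric times are hypothetical,
chosen only for illustration.  It assumes file A was last modified at
t=100, included file B at t=200, and publishing happened at t=300,
with nothing changed afterwards.

```emacs-lisp
;;; -*- lexical-binding: t; -*-
(require 'cl-lib)

(defun sketch-needs-publishing-old (pstamp mtime included-mtimes)
  "Old check (hypothetical sketch).
PSTAMP is the cached modification time of the source file, MTIME
its current modification time.  Each included file's time in
INCLUDED-MTIMES is compared against MTIME."
  (or (null pstamp)
      (time-less-p pstamp mtime)
      (cl-some (lambda (ct) (time-less-p mtime ct)) included-mtimes)))

(defun sketch-needs-publishing-new (pstamp mtime included-mtimes)
  "New check (hypothetical sketch).
PSTAMP is the time of the last publish.  Both MTIME and each time
in INCLUDED-MTIMES are compared against PSTAMP."
  (or (null pstamp)
      (time-less-p pstamp mtime)
      (cl-some (lambda (ct) (time-less-p pstamp ct)) included-mtimes)))

(let ((a-mtime (seconds-to-time 100))   ; source file A
      (b-mtime (seconds-to-time 200))   ; included file B
      (published (seconds-to-time 300)))
  ;; Old scheme: the cache holds A's mtime, and B is newer than A,
  ;; so A is republished on every run.
  (message "old: %S"
           (sketch-needs-publishing-old a-mtime a-mtime (list b-mtime)))
  ;; => t
  ;; New scheme: the cache holds the publish time, and neither A
  ;; nor B changed after it.
  (message "new: %S"
           (sketch-needs-publishing-new published a-mtime (list b-mtime))))
  ;; => nil
```

With the old check, any included file that is newer than the source
file triggers republishing forever; with the new check, republishing
happens only when the source or an included file changes after the
last publish.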

The patch looks reasonable.
Applied, onto main.
https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=38d1bc67b2

Ideally, it would be nice to have tests as well.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: Proposal: Change publication timestamps
  2024-08-15 18:29         ` Ihor Radchenko
@ 2024-08-25 17:00           ` Jens Lechtenboerger
  2024-09-15 12:02             ` Jens Lechtenboerger
  0 siblings, 1 reply; 61+ messages in thread
From: Jens Lechtenboerger @ 2024-08-25 17:00 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode


On 2024-08-15, Ihor Radchenko wrote:

> Jens Lechtenboerger <lechten@wi.uni-muenster.de> writes:
>
>>> See the "14.4 Triggering Publication" section of the Org mode manual:
>>>
>>>        Org uses timestamps to track when a file has changed.  The above
>>>     functions normally only publish changed files.  You can override this
>>> [...]
>>
>> I propose to change caching and checking of timestamps as in the
>> attached patch.
>
> The patch looks reasonable.
> Applied, onto main.
> https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=38d1bc67b2

Great, thanks!

> Ideally, it would be nice to have tests as well.

I added test cases in the attached patch.

Best wishes,
Jens


[-- Attachment #1.2: 0001-Add-test-case-for-publish-cache.patch --]
[-- Type: text/x-diff, Size: 6923 bytes --]

From bc20910287edad8b6740acdaa065e93100355405 Mon Sep 17 00:00:00 2001
From: Jens Lechtenbörger <lechten@wi.uni-muenster.de>
Date: Sun, 25 Aug 2024 18:52:59 +0200
Subject: [PATCH] Add test case for publish cache

---
 testing/examples/pub-cache/config.org |  3 +
 testing/examples/pub-cache/source.org |  4 ++
 testing/lisp/test-ox-publish.el       | 98 +++++++++++++++++++++++++--
 3 files changed, 98 insertions(+), 7 deletions(-)
 create mode 100644 testing/examples/pub-cache/config.org
 create mode 100644 testing/examples/pub-cache/source.org

diff --git a/testing/examples/pub-cache/config.org b/testing/examples/pub-cache/config.org
new file mode 100644
index 000000000..70628d3af
--- /dev/null
+++ b/testing/examples/pub-cache/config.org
@@ -0,0 +1,3 @@
+#+OPTIONS: author:nil
+
+This is included.
diff --git a/testing/examples/pub-cache/source.org b/testing/examples/pub-cache/source.org
new file mode 100644
index 000000000..7072f560e
--- /dev/null
+++ b/testing/examples/pub-cache/source.org
@@ -0,0 +1,4 @@
+#+TITLE: Test
+#+INCLUDE: config.org
+
+Nothing special
diff --git a/testing/lisp/test-ox-publish.el b/testing/lisp/test-ox-publish.el
index 6419b8f6c..a8127ccf2 100644
--- a/testing/lisp/test-ox-publish.el
+++ b/testing/lisp/test-ox-publish.el
@@ -24,7 +24,9 @@
 \f
 ;;; Helper functions
 
-(defun org-test-publish (properties handler &optional remove-prop)
+(defun org-test-publish
+    (properties handler
+                &optional remove-prop timestamp-flag pubdir keep-pubdir-p)
   "Publish a project defined by PROPERTIES.
 Call HANDLER with the publishing directory as its sole argument.
 Unless set otherwise in PROPERTIES, `:base-directory' is set to
@@ -33,12 +35,17 @@ Unless set otherwise in PROPERTIES, `:base-directory' is set to
 Because `org-publish-property' uses `plist-member' to check the
 existence of a property, a property with a value nil is different
 from a non-existing property.  Properties in REMOVE-PROP will be
-removed from the final plist."
+removed from the final plist.
+Assign optional TIMESTAMP-FLAG to `org-publish-use-timestamps-flag'.
+Optional PUBDIR specifies the `:publishing-directory', which
+overrides the default of a randomly generated temporary directory.
+If optional KEEP-PUBDIR-P is non-nil, keep publishing directory,
+including timestamp directory; otherwise, delete it."
   (declare (indent 1))
-  (let* ((org-publish-use-timestamps-flag nil)
+  (let* ((org-publish-use-timestamps-flag timestamp-flag)
 	 (org-publish-cache nil)
 	 (base-dir (expand-file-name "examples/pub/" org-test-dir))
-	 (pub-dir (make-temp-file "org-test" t))
+	 (pub-dir (or pubdir (make-temp-file "org-test" t)))
 	 (org-publish-timestamp-directory
 	  (expand-file-name ".org-timestamps/" pub-dir))
          (props (org-plist-delete-all
@@ -54,8 +61,9 @@ removed from the final plist."
 	(progn
 	  (org-publish-projects (list project))
 	  (funcall handler pub-dir))
-      ;; Clear published data.
-      (delete-directory pub-dir t)
+      (unless keep-pubdir-p
+        ;; Clear published data.
+        (delete-directory pub-dir t))
       ;; Delete auto-generated site-map file, if applicable.
       (let ((site-map (and (plist-get properties :auto-sitemap)
 			   (expand-file-name
@@ -69,7 +77,7 @@ removed from the final plist."
 ;;; Mandatory properties
 
 (ert-deftest test-org-publish/base-extension ()
-  "Test `:base-extension' specifications"
+  "Test `:base-extension' specifications."
   ;; Regular tests.
   (should
    (equal '("a.org" "b.org")
@@ -114,6 +122,82 @@ removed from the final plist."
     (equal (org-test-publish nil func '(:publishing-function))
            (org-test-publish '(:publishing-function org-html-publish-to-html) func)))))
 
+\f
+;;; Publish cache
+
+(defun org-test-publish-touch (file)
+  "Change the modification time of FILE."
+  (let ((buf (get-file-buffer file)))
+    (when buf
+      (kill-buffer buf)))
+  (find-file file)
+  (set-buffer-modified-p t)
+  (save-buffer 0))
+
+(ert-deftest test-org-publish/publish-cache ()
+  "Test publish cache based on timestamps.
+Publish a source file, which includes a config file, to HTML.
+Test updates of source and config file."
+  (let* ((base (expand-file-name "examples/pub-cache/" org-test-dir))
+         (source (expand-file-name "source.org" base))
+         (config (expand-file-name "config.org" base))
+         (pub-dir (make-temp-file "org-test" t))
+         (html (expand-file-name "source.html" pub-dir))
+         (plist `(:publishing-function org-html-publish-to-html
+                                       :base-extension nil
+                                       :base-directory ,base
+                                       :exclude "."
+			               :include ("source.org")))
+         (handler (lambda (dir)
+	            (remove ".org-timestamps"
+		            (cl-remove-if #'file-directory-p
+				          (directory-files dir))))))
+    (should
+     ;; Publish HTML from source.org for the first time.
+     (equal '("source.html")
+	    (org-test-publish plist handler nil t pub-dir t)))
+    (let ((hmtime (org-publish-cache-mtime-of-src html)))
+      (sleep-for 0.1)
+      ;; Publish again, without source changes.
+      ;; Should not publish, but keep the HTML file unchanged.
+      (org-test-publish plist handler nil t pub-dir t)
+      (should
+       (equal hmtime
+              (org-publish-cache-mtime-of-src html)))
+
+      ;; Pretend the source has changed.
+      ;; Publish again, with a new mtime.
+      (org-test-publish-touch source)
+      (org-test-publish plist handler nil t pub-dir t)
+      (let ((hmtime2 (org-publish-cache-mtime-of-src html)))
+        (should
+         (time-less-p hmtime hmtime2))
+
+        (sleep-for 0.1)
+        ;; Publish again, without source changes.
+        ;; Should not publish, but keep the HTML file unchanged.
+        (org-test-publish plist handler nil t pub-dir t)
+        (should
+         (equal hmtime2
+                (org-publish-cache-mtime-of-src html)))
+
+        ;; Pretend file config.org has changed.
+        ;; Publish again, with a new mtime.
+        (org-test-publish-touch config)
+        (org-test-publish plist handler nil t pub-dir t)
+        (let ((hmtime3 (org-publish-cache-mtime-of-src html)))
+          (should
+           (time-less-p hmtime2 hmtime3))
+
+          (sleep-for 0.1)
+          ;; Publish again, without source changes.
+          ;; Should not publish, but keep the HTML file unchanged.
+          (org-test-publish plist handler nil t pub-dir t)
+          (should
+           (equal hmtime3
+                  (org-publish-cache-mtime-of-src html))))))
+    (delete-directory pub-dir t)))
+
 \f
 ;;; Site-map
 
-- 
2.25.1



* Re: Proposal: Change publication timestamps
  2024-08-25 17:00           ` Proposal: Change publication timestamps Jens Lechtenboerger
@ 2024-09-15 12:02             ` Jens Lechtenboerger
  2024-09-17 18:33               ` Ihor Radchenko
  0 siblings, 1 reply; 61+ messages in thread
From: Jens Lechtenboerger @ 2024-09-15 12:02 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, emacs-orgmode


On 2024-08-25, Jens Lechtenboerger wrote:

> On 2024-08-15, Ihor Radchenko wrote:
>
>> The patch looks reasonable.
>> Applied, onto main.
>> https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=38d1bc67b2
>
> Great, thanks!
>
>> Ideally, it would be nice to have tests as well.
>
> I added test cases in the attached patch.

I amended the commit message.  I hope this is ready for inclusion.

Best wishes,
Jens


[-- Attachment #1.2: 0001-test-ox-publish.el-Add-tests-for-publish-cache.patch --]
[-- Type: text/x-diff, Size: 7447 bytes --]

From 8914a3c6aa3821917ae10c965f4b1d9aadb4b554 Mon Sep 17 00:00:00 2001
From: Jens Lechtenbörger <lechten@wi.uni-muenster.de>
Date: Sun, 25 Aug 2024 18:52:59 +0200
Subject: [PATCH] test-ox-publish.el: Add tests for publish cache

* testing/lisp/test-ox-publish.el (org-test-publish): Add optional
arguments `timestamp-flag' and `pubdir' for control over existing
local variables, add optional argument `keep-pubdir-p' for conditional
deletion of publication directory.
(org-test-publish-touch): New function to change modification time of
file.
(test-org-publish/publish-cache): New function with tests.

* testing/examples/pub-cache/source.org: New source test file.

* testing/examples/pub-cache/config.org: New config file.
---
 testing/examples/pub-cache/config.org |  3 +
 testing/examples/pub-cache/source.org |  4 ++
 testing/lisp/test-ox-publish.el       | 98 +++++++++++++++++++++++++--
 3 files changed, 98 insertions(+), 7 deletions(-)
 create mode 100644 testing/examples/pub-cache/config.org
 create mode 100644 testing/examples/pub-cache/source.org

diff --git a/testing/examples/pub-cache/config.org b/testing/examples/pub-cache/config.org
new file mode 100644
index 000000000..70628d3af
--- /dev/null
+++ b/testing/examples/pub-cache/config.org
@@ -0,0 +1,3 @@
+#+OPTIONS: author:nil
+
+This is included.
diff --git a/testing/examples/pub-cache/source.org b/testing/examples/pub-cache/source.org
new file mode 100644
index 000000000..7072f560e
--- /dev/null
+++ b/testing/examples/pub-cache/source.org
@@ -0,0 +1,4 @@
+#+TITLE: Test
+#+INCLUDE: config.org
+
+Nothing special
diff --git a/testing/lisp/test-ox-publish.el b/testing/lisp/test-ox-publish.el
index 6419b8f6c..a8127ccf2 100644
--- a/testing/lisp/test-ox-publish.el
+++ b/testing/lisp/test-ox-publish.el
@@ -24,7 +24,9 @@
 \f
 ;;; Helper functions
 
-(defun org-test-publish (properties handler &optional remove-prop)
+(defun org-test-publish
+    (properties handler
+                &optional remove-prop timestamp-flag pubdir keep-pubdir-p)
   "Publish a project defined by PROPERTIES.
 Call HANDLER with the publishing directory as its sole argument.
 Unless set otherwise in PROPERTIES, `:base-directory' is set to
@@ -33,12 +35,17 @@ Unless set otherwise in PROPERTIES, `:base-directory' is set to
 Because `org-publish-property' uses `plist-member' to check the
 existence of a property, a property with a value nil is different
 from a non-existing property.  Properties in REMOVE-PROP will be
-removed from the final plist."
+removed from the final plist.
+Assign optional TIMESTAMP-FLAG to `org-publish-use-timestamps-flag'.
+Optional PUBDIR specifies the `:publishing-directory', which
+overrides the default of a randomly generated temporary directory.
+If optional KEEP-PUBDIR-P is non-nil, keep publishing directory,
+including timestamp directory; otherwise, delete it."
   (declare (indent 1))
-  (let* ((org-publish-use-timestamps-flag nil)
+  (let* ((org-publish-use-timestamps-flag timestamp-flag)
 	 (org-publish-cache nil)
 	 (base-dir (expand-file-name "examples/pub/" org-test-dir))
-	 (pub-dir (make-temp-file "org-test" t))
+	 (pub-dir (or pubdir (make-temp-file "org-test" t)))
 	 (org-publish-timestamp-directory
 	  (expand-file-name ".org-timestamps/" pub-dir))
          (props (org-plist-delete-all
@@ -54,8 +61,9 @@ removed from the final plist."
 	(progn
 	  (org-publish-projects (list project))
 	  (funcall handler pub-dir))
-      ;; Clear published data.
-      (delete-directory pub-dir t)
+      (unless keep-pubdir-p
+        ;; Clear published data.
+        (delete-directory pub-dir t))
       ;; Delete auto-generated site-map file, if applicable.
       (let ((site-map (and (plist-get properties :auto-sitemap)
 			   (expand-file-name
@@ -69,7 +77,7 @@ removed from the final plist."
 ;;; Mandatory properties
 
 (ert-deftest test-org-publish/base-extension ()
-  "Test `:base-extension' specifications"
+  "Test `:base-extension' specifications."
   ;; Regular tests.
   (should
    (equal '("a.org" "b.org")
@@ -114,6 +122,82 @@ removed from the final plist."
     (equal (org-test-publish nil func '(:publishing-function))
            (org-test-publish '(:publishing-function org-html-publish-to-html) func)))))
 
+\f
+;;; Publish cache
+
+(defun org-test-publish-touch (file)
+  "Change the modification time of FILE."
+  (let ((buf (get-file-buffer file)))
+    (when buf
+      (kill-buffer buf)))
+  (find-file file)
+  (set-buffer-modified-p t)
+  (save-buffer 0))
+
+(ert-deftest test-org-publish/publish-cache ()
+  "Test publish cache based on timestamps.
+Publish a source file, which includes a config file, to HTML.
+Test updates of source and config file."
+  (let* ((base (expand-file-name "examples/pub-cache/" org-test-dir))
+         (source (expand-file-name "source.org" base))
+         (config (expand-file-name "config.org" base))
+         (pub-dir (make-temp-file "org-test" t))
+         (html (expand-file-name "source.html" pub-dir))
+         (plist `(:publishing-function org-html-publish-to-html
+                                       :base-extension nil
+                                       :base-directory ,base
+                                       :exclude "."
+			               :include ("source.org")))
+         (handler (lambda (dir)
+	            (remove ".org-timestamps"
+		            (cl-remove-if #'file-directory-p
+				          (directory-files dir))))))
+    (should
+     ;; Publish HTML from source.org for the first time.
+     (equal '("source.html")
+	    (org-test-publish plist handler nil t pub-dir t)))
+    (let ((hmtime (org-publish-cache-mtime-of-src html)))
+      (sleep-for 0.1)
+      ;; Publish again, without source changes.
+      ;; Should not publish, but keep the HTML file unchanged.
+      (org-test-publish plist handler nil t pub-dir t)
+      (should
+       (equal hmtime
+              (org-publish-cache-mtime-of-src html)))
+
+      ;; Pretend the source has changed.
+      ;; Publish again, with a new mtime.
+      (org-test-publish-touch source)
+      (org-test-publish plist handler nil t pub-dir t)
+      (let ((hmtime2 (org-publish-cache-mtime-of-src html)))
+        (should
+         (time-less-p hmtime hmtime2))
+
+        (sleep-for 0.1)
+        ;; Publish again, without source changes.
+        ;; Should not publish, but keep the HTML file unchanged.
+        (org-test-publish plist handler nil t pub-dir t)
+        (should
+         (equal hmtime2
+                (org-publish-cache-mtime-of-src html)))
+
+        ;; Pretend file config.org has changed.
+        ;; Publish again, with a new mtime.
+        (org-test-publish-touch config)
+        (org-test-publish plist handler nil t pub-dir t)
+        (let ((hmtime3 (org-publish-cache-mtime-of-src html)))
+          (should
+           (time-less-p hmtime2 hmtime3))
+
+          (sleep-for 0.1)
+          ;; Publish again, without source changes.
+          ;; Should not publish, but keep the HTML file unchanged.
+          (org-test-publish plist handler nil t pub-dir t)
+          (should
+           (equal hmtime3
+                  (org-publish-cache-mtime-of-src html))))))
+    (delete-directory pub-dir t)))
+
 \f
 ;;; Site-map
 
-- 
2.25.1



* Re: Proposal: Change publication timestamps
  2024-09-15 12:02             ` Jens Lechtenboerger
@ 2024-09-17 18:33               ` Ihor Radchenko
  0 siblings, 0 replies; 61+ messages in thread
From: Ihor Radchenko @ 2024-09-17 18:33 UTC (permalink / raw)
  To: Jens Lechtenboerger; +Cc: Eli Zaretskii, emacs-orgmode

Jens Lechtenboerger <lechten@wi.uni-muenster.de> writes:

>>> Ideally, it would be nice to have tests as well.
>>
>> I added test cases in the attached patch.
>
> I amended the commit message.  I hope this is ready for inclusion.

Thanks!
Applied, onto main, with a small amendment to the commit message.
https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=9cbf0c99c3

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



Thread overview: 61+ messages
2024-06-12  9:38 Please document the caching and its user options Eli Zaretskii
2024-06-14 13:12 ` Ihor Radchenko
2024-06-14 13:41   ` Eli Zaretskii
2024-06-14 15:31     ` Ihor Radchenko
2024-06-14 15:56       ` Eli Zaretskii
2024-06-15 12:47         ` Ihor Radchenko
2024-06-15 13:01           ` Eli Zaretskii
2024-06-15 14:13             ` Ihor Radchenko
2024-06-15 14:37               ` Eli Zaretskii
2024-06-16  9:05                 ` Ihor Radchenko
2024-06-16 10:41                   ` Eli Zaretskii
2024-06-23  9:12                     ` Björn Bidar
2024-06-15 13:47           ` Ihor Radchenko
2024-06-14 13:56   ` Jens Lechtenboerger
2024-06-14 14:31     ` Publishing cache (was: Please document the caching and its user options) Ihor Radchenko
2024-08-12  7:55       ` Proposal: Change publication timestamps (was: Publishing cache) Jens Lechtenboerger
2024-08-15 18:29         ` Ihor Radchenko
2024-08-25 17:00           ` Proposal: Change publication timestamps Jens Lechtenboerger
2024-09-15 12:02             ` Jens Lechtenboerger
2024-09-17 18:33               ` Ihor Radchenko
2024-06-16  5:40   ` Please document the caching and its user options Daniel Clemente
2024-06-16 12:36     ` Ihor Radchenko
2024-06-17 12:41       ` Daniel Clemente
2024-06-18 15:53         ` Ihor Radchenko
2024-06-18 16:15           ` Eli Zaretskii
2024-06-18 16:25             ` Ihor Radchenko
2024-06-18 16:33               ` Eli Zaretskii
2024-06-18 16:55                 ` Ihor Radchenko
2024-06-19  9:27                   ` Colin Baxter
2024-06-19 10:35                     ` Ihor Radchenko
2024-06-19 13:04                       ` Eli Zaretskii
2024-06-19 13:30                         ` Ihor Radchenko
2024-06-19 16:07                           ` Colin Baxter
2024-06-19 16:15                             ` Ihor Radchenko
2024-06-18 22:06               ` Rudolf Adamkovič
2024-06-19  4:29                 ` tomas
2024-06-23 11:45           ` Daniel Clemente
2024-06-24 10:36             ` Ihor Radchenko
2024-06-26 12:59               ` Daniel Clemente
2024-06-26 13:21                 ` org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options) Ihor Radchenko
2024-06-27  8:55                   ` Daniel Clemente
2024-06-27 10:15                     ` org-encrypt-entries is slow (was: org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options)) Ihor Radchenko
2024-07-02 16:54                       ` Daniel Clemente
2024-07-02 19:16                         ` Ihor Radchenko
2024-07-04 10:36                           ` Daniel Clemente
2024-07-06 13:02                             ` Ihor Radchenko
2024-07-10 13:09                               ` Daniel Clemente
2024-07-11 10:40                                 ` Ihor Radchenko
2024-07-15 17:00                                   ` Daniel Clemente
2024-07-20 14:14                                     ` Ihor Radchenko
2024-07-24 13:47                                       ` Daniel Clemente
2024-07-25  7:31                                         ` Ihor Radchenko
2024-07-25 14:08                                           ` Daniel Clemente
2024-07-25 14:15                                             ` Ihor Radchenko
2024-06-27 10:34                     ` org-crypt leaking data when encryption password is not entered twice (was: Please document the caching and its user options) Ihor Radchenko
2024-07-02 16:53                       ` Daniel Clemente
2024-06-27  9:27                 ` Please document the caching and its user options Eli Zaretskii
2024-06-27 10:11                   ` Ihor Radchenko
2024-06-27 10:30                     ` Eli Zaretskii
2024-06-28 12:54                     ` Rudolf Adamkovič
2024-06-28 15:31                       ` Ihor Radchenko

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git
