Ludovic Courtès <ludo@gnu.org> writes:

>> +        (call-with-retrying-transaction db
>> +            (lambda ()
>              ^^
> Too much indentation (maybe we miss a rule in .dir-locals.el?).

My bad, my understanding of our .dir-locals.el was more-or-less cargo
culting until I read the lisp-indent-function documentation just now.
The 'scheme-indent-function should be 1 for both call-with-transaction
and call-with-retrying-transaction. Patch attached.

> Two questions:
>
>   1. Can another process come and fiddle with TO-REGISTER while we’re
>      still in ‘reset-timestamps’?  Or can GC happen while we’re in
>      ‘reset-timestamps’ and delete TO-REGISTER and remove it from the
>      database?

>
> I think none of these scenarios can happen, as long as we’ve taken the
> .lock file for TO-REGISTER before, like ‘finalize-store-file’ does.

The lock file has no bearing on liveness of the path it locks, though
the liveness of the path it locks *does* count as liveness for the lock
file and for the .chroot file; see tryToDelete() in nix/libstore/gc.cc.

(Well, semi-liveness; .lock and .chroot files won't show up in a list of
live paths, but they will still be protected from deletion if their
associated store item is a temp root)

So general "fiddling" can't happen, but GC can. The responsibility for
both locking and registering as temporary roots falls on the caller,
currently, as I believe it should. We may want to document this
responsibility in the docstring for register-items, though.

So currently finalize-store-file is safe from clobbering by another
process attempting to finalize to the same path, but not safe from
garbage collection between when the temporary store file is renamed and
when it is registered. It needs an add-temp-root prior to renaming.

>
>   2. After the transaction, TO-REGISTER is considered valid.  But are
>      the effects of the on-going deduplication observable, due to
>      non-atomicity of some operation?

Subdirectories of store items need to be made writable prior to renaming
the temp link, so there will necessarily be a window of time during
which various subdirectories will appear writable. I don't think this
should cause problems; we already assume that the .lock file is held, so
we should be the only process attempting to deduplicate it. On a related
note, we should probably use dynamic-wind for setting and restoring the
permissions in replace-with-link.

Also, replace-with-link doesn't check whether the "containing directory"
is the store like optimisePath_() does, so in theory if another process
tried to make a permanent change to the store's permissions it could be
clobbered when replace-with-link restores the permissions. I don't know
of any instance where this could happen currently, but it's something to
keep in mind.

And, of course, there's the inherent visible change of deduplication -
possible modification of inode number, but I don't see how that could
cause problems with any reasonable programs.

> I think the ‘replace-with-link’ dance is atomic, so we should be fine.
>
> Thoughts?

replace-with-link is atomic, but it can fail if the "canonical" link in
.links is deleted by the garbage collector between when its existence is
checked in 'deduplicate' and when temp link creation in
replace-with-link happens. Then ENOENT would be returned, and
'deduplicate' wouldn't catch that exception, nor would optimisePath_()
in nix/libstore/optimise-store.cc.

The proper behavior there, in my opinion, would be to retry the
deduplication. Attached is a patch that makes 'deduplicate' do that.

Also, while I'm perusing optimisePath_(), there's a minor bug in which
EMLINK generated by renaming the temp link may still be treated as a
fatal error. This is because errno may be clobbered by unlink() prior to
being checked (according to the errno man page it can still be modified
even if the call succeeds).

In summary, there are potential issues, but nothing that should be
affected by reducing the transaction scope AFAIK.

- reepca