unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* enabling atomic-update more often, i.e., rewrite via rename
@ 2011-12-09 10:26 Jim Meyering
  2011-12-13 17:13 ` save-buffer: avoid data loss on interrupt Jim Meyering
  0 siblings, 1 reply; 11+ messages in thread
From: Jim Meyering @ 2011-12-09 10:26 UTC (permalink / raw)
  To: Emacs development discussions

TL;DR, for a regular file with no other hard links in a writable directory,
why is the default to rewrite in place (non-atomically), rather than to write
to temporary-in-same-dir and then to rename, thus updating atomically?
--------------------------------------

A few days ago I was editing my .procmailrc file with emacs, made a
small change and saved it.  I was dismayed to see that change provoke a
flurry of syntax errors from procmail as two or three incoming messages
made it parse the partially-rewritten .procmailrc file.  Emacs was rewriting
that file in place, and procmail was reading it at the same time.

This made me think there must be a way to configure emacs so that when
it writes a file, it updates that file atomically.

Sure enough, there is: the buffer-local file-precious-flag.
I've added these two lines to the top of .procmailrc, and
confirmed via strace that emacs now does indeed update via rename:

  # -*- file-precious-flag: t -*-
  # The above tells emacs to save by write-then-rename, i.e., atomically.

That solved my immediate problem, but why isn't this the default, I wondered?
Reading files.el, it seems that one reason is to avoid breaking hard links.
If you set break-hardlink-on-save to `t' and the file happens to have
two or more links, then it does happen by default.

Here's the relevant code from files.el:

    (let* ((dir (file-name-directory buffer-file-name))
           (dir-writable (file-writable-p dir)))
      (if (or (and file-precious-flag dir-writable)
              (and break-hardlink-on-save
                   (file-exists-p buffer-file-name)
                   (> (file-nlinks buffer-file-name) 1)
                   (or dir-writable
                       (error (concat (format
                                       "Directory %s write-protected; " dir)
                                      "cannot break hardlink when saving")))))
	  ;; Write temp name, then rename it.
	  ;; This requires write access to the containing dir,
	  ;; which is why we don't try it if we don't have that access.

But my .procmailrc has only one link and is in a writable directory,
so an atomic update would not cause any harm.

Has anyone considered making update-via-rename the default in that case?



^ permalink raw reply	[flat|nested] 11+ messages in thread

* save-buffer: avoid data loss on interrupt
  2011-12-09 10:26 enabling atomic-update more often, i.e., rewrite via rename Jim Meyering
@ 2011-12-13 17:13 ` Jim Meyering
  2011-12-13 20:03   ` Paul Eggert
                     ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Jim Meyering @ 2011-12-13 17:13 UTC (permalink / raw)
  To: Emacs development discussions

Jim Meyering wrote:
> TL;DR, for a regular file with no other hard links in a writable directory,
> why is the default to rewrite in place (non-atomically), rather than to write
> to temporary-in-same-dir and then to rename, thus updating atomically?
> --------------------------------------
...

Note that this change helps avoid data loss when emacs is killed
while rewriting an input file.

So far, no one responded to the above, so here's a proof-of-concept patch.
This is a significant enough change that explanation like what's in the
commit log below belongs at least in etc/NEWS or in documentation.
This change is obviously not appropriate before the release.

(Obviously, before pushing it, I would also make the commit log
contents into a ChangeLog entry)

From 3fc1719d0365707f25f512a87aa1c84e1b4fece5 Mon Sep 17 00:00:00 2001
From: Jim Meyering <meyering@redhat.com>
Date: Tue, 13 Dec 2011 17:47:28 +0100
Subject: [PATCH] make save-buffer mostly immune to interrupts: avoid data loss

When you type C-xC-s, emacs usually rewrites a file in-place.  Thus,
if it is interrupted, you may be left with a corrupt file.  Even when
emacs is not interrupted, if some process is reading that file as you
write it, it can easily process an incomplete version of it, which can
lead to errors.  You can avoid all of that by setting the buffer-local
file-precious-flag for files you care about, but it's often best not to
do that for a file with multiple hard links, because when you set that
flag, emacs' save mechanism writes a file by first writing a temporary
file in the same directory, and then renaming that (an atomic process)
to the actual file you wanted to save.  Renaming that way effectively
unlinks a file with multiple hard links.
* files.el (basic-save-buffer-2): Save via write-temp-and-rename
not just when file-precious-flag, but also when the number of hard
links is 1.
---
 lisp/files.el |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/lisp/files.el b/lisp/files.el
index 40b6e7d..535715c 100644
--- a/lisp/files.el
+++ b/lisp/files.el
@@ -4467,7 +4467,9 @@ Before and after saving the buffer, this function runs
 	(setq setmodes (backup-buffer)))
     (let* ((dir (file-name-directory buffer-file-name))
            (dir-writable (file-writable-p dir)))
-      (if (or (and file-precious-flag dir-writable)
+      (if (or (and dir-writable
+		   (or file-precious-flag
+		       (= (file-nlinks buffer-file-name) 1)))
               (and break-hardlink-on-save
                    (file-exists-p buffer-file-name)
                    (> (file-nlinks buffer-file-name) 1)
--
1.7.8.163.g9859a



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: save-buffer: avoid data loss on interrupt
  2011-12-13 17:13 ` save-buffer: avoid data loss on interrupt Jim Meyering
@ 2011-12-13 20:03   ` Paul Eggert
  2011-12-13 20:52     ` Jim Meyering
  2011-12-14  2:40   ` Stefan Monnier
  2011-12-15 12:58   ` Jim Meyering
  2 siblings, 1 reply; 11+ messages in thread
From: Paul Eggert @ 2011-12-13 20:03 UTC (permalink / raw)
  To: Jim Meyering; +Cc: Emacs development discussions

On 12/13/11 09:13, Jim Meyering wrote:

> -      (if (or (and file-precious-flag dir-writable)
> +      (if (or (and dir-writable
> +		   (or file-precious-flag
> +		       (= (file-nlinks buffer-file-name) 1)))

I like the general idea, but this solution seems a bit aggressive,
as it will cause the file's ownership to change
if the file's link count is 1, and often that isn't what is wanted.
Instead, how about extending the semantics of break-hardlink-on-save,
with something like the following (untested) patch?

--- lisp/files.el	2011-12-04 08:02:42 +0000
+++ lisp/files.el	2011-12-13 19:46:33 +0000
@@ -4469,8 +4469,11 @@ Before and after saving the buffer, this
            (dir-writable (file-writable-p dir)))
       (if (or (and file-precious-flag dir-writable)
               (and break-hardlink-on-save
-                   (file-exists-p buffer-file-name)
-                   (> (file-nlinks buffer-file-name) 1)
+		   (let ((nlinks (file-nlinks buffer-file-name)))
+		     (and nlinks
+			  (> nlinks (if (numberp break-hardlink-on-save)
+					break-hardlink-on-save
+				      1))))
                    (or dir-writable
                        (error (concat (format
                                        "Directory %s write-protected; " dir)

That way, you can set break-hardlink-on-save to -1 to get
the behavior that you want.

While we're on the subject, break-hardlink-on-save is not
documented; I wonder why not?  Also, the current code
does not work if the directory is sticky and writable
and the file is owned by someone else (a common situation
in /tmp); this should get fixed too.



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: save-buffer: avoid data loss on interrupt
  2011-12-13 20:03   ` Paul Eggert
@ 2011-12-13 20:52     ` Jim Meyering
  2011-12-13 21:47       ` Paul Eggert
  0 siblings, 1 reply; 11+ messages in thread
From: Jim Meyering @ 2011-12-13 20:52 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Emacs development discussions

Paul Eggert wrote:
> On 12/13/11 09:13, Jim Meyering wrote:
>
>> -      (if (or (and file-precious-flag dir-writable)
>> +      (if (or (and dir-writable
>> +		   (or file-precious-flag
>> +		       (= (file-nlinks buffer-file-name) 1)))
>
> I like the general idea, but this solution seems a bit aggressive,
> as it will cause the file's ownership to change
> if the file's link count is 1, and often that isn't what is wanted.
> Instead, how about extending the semantics of break-hardlink-on-save,
> with something like the following (untested) patch?
>
> --- lisp/files.el	2011-12-04 08:02:42 +0000
> +++ lisp/files.el	2011-12-13 19:46:33 +0000
> @@ -4469,8 +4469,11 @@ Before and after saving the buffer, this
>             (dir-writable (file-writable-p dir)))
>        (if (or (and file-precious-flag dir-writable)
>                (and break-hardlink-on-save
> -                   (file-exists-p buffer-file-name)
> -                   (> (file-nlinks buffer-file-name) 1)
> +		   (let ((nlinks (file-nlinks buffer-file-name)))
> +		     (and nlinks
> +			  (> nlinks (if (numberp break-hardlink-on-save)
> +					break-hardlink-on-save
> +				      1))))
>                     (or dir-writable
>                         (error (concat (format
>                                         "Directory %s write-protected; " dir)
>
> That way, you can set break-hardlink-on-save to -1 to get
> the behavior that you want.
>
> While we're on the subject, break-hardlink-on-save is not
> documented; I wonder why not?  Also, the current code
> does not work if the directory is sticky and writable
> and the file is owned by someone else (a common situation
> in /tmp); this should get fixed too.

That'd be ok, but doesn't this deserve to be enabled more often
than when someone tweaks the break-hardlink-on-save variable?

How about this instead, assuming a file-owner-uid function?
(or if the two users of file-attributes is an issue,
we could combine file-nlinks and file-owner-uid into
a function that calls file-attributes just once)

Sure, this might still change the group, but if that's an issue
we could compare it to the default group.

diff --git a/lisp/files.el b/lisp/files.el
index 535715c..b0f01f2 100644
--- a/lisp/files.el
+++ b/lisp/files.el
@@ -4469,7 +4469,8 @@ Before and after saving the buffer, this function runs
            (dir-writable (file-writable-p dir)))
       (if (or (and dir-writable
 		   (or file-precious-flag
-		       (= (file-nlinks buffer-file-name) 1)))
+		       (and (= (file-nlinks buffer-file-name) 1)
+			    (= (file-owner-uid buffer-file-name) (user-uid)))))
               (and break-hardlink-on-save
                    (file-exists-p buffer-file-name)
                    (> (file-nlinks buffer-file-name) 1)



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: save-buffer: avoid data loss on interrupt
  2011-12-13 20:52     ` Jim Meyering
@ 2011-12-13 21:47       ` Paul Eggert
  2011-12-13 22:27         ` chad
  2011-12-14 14:40         ` Jim Meyering
  0 siblings, 2 replies; 11+ messages in thread
From: Paul Eggert @ 2011-12-13 21:47 UTC (permalink / raw)
  To: Jim Meyering; +Cc: Emacs development discussions

On 12/13/11 12:52, Jim Meyering wrote:
> doesn't this deserve to be enabled more often
> than when someone tweaks the break-hardlink-on-save variable?

I'd like to do that too, but things are a bit tricky here.
For example, what if the file has a special ACL?
Won't that get lost?

> How about this instead, assuming a file-owner-uid function?
> (or if the two users of file-attributes is an issue,
> we could combine file-nlinks and file-owner-uid into
> a function that calls file-attributes just once)
> 
> Sure, this might still change the group, but if that's an issue
> we could compare it to the default group.

Unfortunately one can't predict the ownership of the new file
so easily.  It might be a setuid directory; on some hosts,
that causes new files in the directory to have the same
owner as the directory.

A more reliable way to deal with it might be to create the
temporary file, and ensure that its owner and group
and any other special attributes are correct, before copying to it.
Normally the check should succeed, so this shouldn't cost much
on the average.  The permissions should be conservative
(e.g., original permissions sans executable bits) until the
copying is done.



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: save-buffer: avoid data loss on interrupt
  2011-12-13 21:47       ` Paul Eggert
@ 2011-12-13 22:27         ` chad
  2011-12-14 14:40         ` Jim Meyering
  1 sibling, 0 replies; 11+ messages in thread
From: chad @ 2011-12-13 22:27 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Jim Meyering, Emacs development discussions

Since you guys are looking at owner/group issues, I'll mention that it will also clobber ctime on the file and possibly break posix acls.  In a former life, I used to care about file ctimes, so maybe someone else does now.  I grep'd through emacs/src/*.c and convinced myself that neither emacs nor I care much about posix acls at the moment, but I suppose that could change (or maybe I missed something).

Hope that helps,
*Chad




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: save-buffer: avoid data loss on interrupt
  2011-12-13 17:13 ` save-buffer: avoid data loss on interrupt Jim Meyering
  2011-12-13 20:03   ` Paul Eggert
@ 2011-12-14  2:40   ` Stefan Monnier
  2011-12-15 12:58   ` Jim Meyering
  2 siblings, 0 replies; 11+ messages in thread
From: Stefan Monnier @ 2011-12-14  2:40 UTC (permalink / raw)
  To: Jim Meyering; +Cc: Emacs development discussions

I generally like the idea of making buffer-save atomic, but there are
two problems:
- non-trivial issues w.r.t breaking hard links and preserving access rights.
- we can't change any part of this while we're in feature freeze anyway.
The second problem means we have time to work through the first problem.


        Stefan



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: save-buffer: avoid data loss on interrupt
  2011-12-13 21:47       ` Paul Eggert
  2011-12-13 22:27         ` chad
@ 2011-12-14 14:40         ` Jim Meyering
  2011-12-14 18:51           ` Paul Eggert
  1 sibling, 1 reply; 11+ messages in thread
From: Jim Meyering @ 2011-12-14 14:40 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Emacs development discussions

Paul Eggert wrote:
> On 12/13/11 12:52, Jim Meyering wrote:
>> doesn't this deserve to be enabled more often
>> than when someone tweaks the break-hardlink-on-save variable?
>
> I'd like to do that too, but things are a bit tricky here.
> For example, what if the file has a special ACL?
> Won't that get lost?

I can add a test for the presence of an ACL,
and handle that just like a hard link count of 2 or greater:
resort to the standard, non-atomic code path.

>> How about this instead, assuming a file-owner-uid function?
>> (or if the two users of file-attributes is an issue,
>> we could combine file-nlinks and file-owner-uid into
>> a function that calls file-attributes just once)
>>
>> Sure, this might still change the group, but if that's an issue
>> we could compare it to the default group.
>
> Unfortunately one can't predict the ownership of the new file
> so easily.  It might be a setuid directory; on some hosts,
> that causes new files in the directory to have the same
> owner as the directory.
>
> A more reliable way to deal with it might be to create the
> temporary file, and ensure that its owner and group
> and any other special attributes are correct, before copying to it.
> Normally the check should succeed, so this shouldn't cost much
> on the average.  The permissions should be conservative
> (e.g., original permissions sans executable bits) until the
> copying is done.

That sounds good, but complicates the already hairy logic, since
upon mismatch we'd have to clean up and then take the other path.

If the cost of an extra stat is not prohibitive, I'm
tempted to keep this change simple, check for a set-UID
parent directory, and handle that like an ACL.



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: save-buffer: avoid data loss on interrupt
  2011-12-14 14:40         ` Jim Meyering
@ 2011-12-14 18:51           ` Paul Eggert
  2011-12-15 23:53             ` Stefan Monnier
  0 siblings, 1 reply; 11+ messages in thread
From: Paul Eggert @ 2011-12-14 18:51 UTC (permalink / raw)
  To: Jim Meyering; +Cc: Emacs development discussions

On 12/14/11 06:40, Jim Meyering wrote:

> I can add a test for the presence of an ACL,
> and handle that just like a hard link count of 2 or greater:
> resort to the standard, non-atomic code path.

Yes, but then my reward for trying to be safer and using an ACL
is that my file's contents are more likely to get lost.  Ouch.

Plus, it's not as simple as looking for an ACL on the original
file.  One must also look for ACLs that might be inherited by
the newly created copy.

> If the cost of an extra stat is not prohibitive, I'm
> tempted to keep this change simple, check for a set-UID
> parent directory

I'm afraid it's not as simple as that.  Not only must
the parent directory be setuid, but the file system needs
to be mounted with the suiddir option.  Not all file systems
support that option; some silently ignore it.

And won't we have a problem there with network file systems?
It's not clear to me whether the setuid semantics are
specified by the client OS or the server OS.  (Likewise
for ACLs inherited by the new file.)

All in all, I think that in the long run it'll be simpler and
more reliable if we create the temp file and then check
that it has (or make it have) the correct ownership and permissions.



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: save-buffer: avoid data loss on interrupt
  2011-12-13 17:13 ` save-buffer: avoid data loss on interrupt Jim Meyering
  2011-12-13 20:03   ` Paul Eggert
  2011-12-14  2:40   ` Stefan Monnier
@ 2011-12-15 12:58   ` Jim Meyering
  2 siblings, 0 replies; 11+ messages in thread
From: Jim Meyering @ 2011-12-15 12:58 UTC (permalink / raw)
  To: Emacs development discussions

Jim Meyering wrote:
> Jim Meyering wrote:
>> TL;DR, for a regular file with no other hard links in a writable directory,
>> why is the default to rewrite in place (non-atomically), rather than to write
>> to temporary-in-same-dir and then to rename, thus updating atomically?
>> --------------------------------------
> ...
>
> Note that this change helps avoid data loss when emacs is killed
> while rewriting an input file.
>
> So far, no one responded to the above, so here's a proof-of-concept patch.
> This is a significant enough change that explanation like what's in the
> commit log below belongs at least in etc/NEWS or in documentation.
> This change is obviously not appropriate before the release.
>
> (Obviously, before pushing it, I would also make the commit log
> contents into a ChangeLog entry)
...
> Date: Tue, 13 Dec 2011 17:47:28 +0100
> Subject: [PATCH] make save-buffer mostly immune to interrupts: avoid data loss

FYI, here's a patch that works also when
the file in question does not exist.
The preceding patch used file-nlinks without first
ensuring that the file actually exists.

From e7f560e9dcb4c2128a33fa825c1809064e40f56f Mon Sep 17 00:00:00 2001
From: Jim Meyering <meyering@redhat.com>
Date: Thu, 15 Dec 2011 13:53:58 +0100
Subject: [PATCH] make save-buffer mostly immune to interrupts: avoid data loss

When you type C-xC-s, emacs usually rewrites a file in-place.  Thus,
if it is interrupted, you may be left with a corrupt file.  Even when
emacs is not interrupted, if some process is reading that file as you
write it, it can easily process an incomplete version of it, which can
lead to errors.  You can avoid all of that by setting the buffer-local
file-precious-flag for files you care about, but it's often best not to
do that for a file with multiple hard links, because when you set that
flag, emacs' save mechanism writes a file by first writing a temporary
file in the same directory, and then renaming that (an atomic process)
to the actual file you wanted to save.  Renaming that way effectively
unlinks a file with multiple hard links.
* files.el (basic-save-buffer-2): Save via write-temp-and-rename
not just when file-precious-flag, but also when the number of hard
links is 1.
---
 lisp/ChangeLog |   18 ++++++++++++++++++
 lisp/files.el  |    5 ++++-
 2 files changed, 22 insertions(+), 1 deletions(-)

diff --git a/lisp/ChangeLog b/lisp/ChangeLog
index 74eb98c..c419cd7 100644
--- a/lisp/ChangeLog
+++ b/lisp/ChangeLog
@@ -7,6 +7,24 @@
 	(makefile-makepp-font-lock-keywords): Add new patterns.
 	(makefile-match-function-end): Match new [...] and [[...]].

+2011-12-13  Jim Meyering  <meyering@redhat.com>
+
+	make save-buffer mostly immune to interrupts: avoid data loss
+	When you type C-xC-s, emacs usually rewrites a file in-place.  Thus,
+	if it is interrupted, you may be left with a corrupt file.  Even when
+	emacs is not interrupted, if some process is reading that file as you
+	write it, it can easily process an incomplete version of it, which can
+	lead to errors.  You can avoid all of that by setting the buffer-local
+	file-precious-flag for files you care about, but it's often best not to
+	do that for a file with multiple hard links, because when you set that
+	flag, emacs' save mechanism writes a file by first writing a temporary
+	file in the same directory, and then renaming that (an atomic process)
+	to the actual file you wanted to save.  Renaming that way effectively
+	unlinks a file with multiple hard links.
+	* files.el (basic-save-buffer-2): Save via write-temp-and-rename
+	not just when file-precious-flag, but also when the number of hard
+	links is 1.
+
 2011-12-11  Juanma Barranquero  <lekktu@gmail.com>

 	* ses.el (ses-call-printer-return, ses-cell-property-get)
diff --git a/lisp/files.el b/lisp/files.el
index 40b6e7d..e374f69 100644
--- a/lisp/files.el
+++ b/lisp/files.el
@@ -4467,7 +4467,10 @@ Before and after saving the buffer, this function runs
 	(setq setmodes (backup-buffer)))
     (let* ((dir (file-name-directory buffer-file-name))
            (dir-writable (file-writable-p dir)))
-      (if (or (and file-precious-flag dir-writable)
+      (if (or (and dir-writable
+		   (or file-precious-flag
+		       (not (file-exists-p buffer-file-name))
+		       (= (file-nlinks buffer-file-name) 1)))
               (and break-hardlink-on-save
                    (file-exists-p buffer-file-name)
                    (> (file-nlinks buffer-file-name) 1)
--
1.7.8.163.g9859a



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: save-buffer: avoid data loss on interrupt
  2011-12-14 18:51           ` Paul Eggert
@ 2011-12-15 23:53             ` Stefan Monnier
  0 siblings, 0 replies; 11+ messages in thread
From: Stefan Monnier @ 2011-12-15 23:53 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Jim Meyering, Emacs development discussions

> All in all, I think that in the long run it'll be simpler and
> more reliable if we create the temp file and then check
> that it has (or make it have) the correct ownership and permissions.

Agreed,


        Stefan



^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2011-12-15 23:53 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-12-09 10:26 enabling atomic-update more often, i.e., rewrite via rename Jim Meyering
2011-12-13 17:13 ` save-buffer: avoid data loss on interrupt Jim Meyering
2011-12-13 20:03   ` Paul Eggert
2011-12-13 20:52     ` Jim Meyering
2011-12-13 21:47       ` Paul Eggert
2011-12-13 22:27         ` chad
2011-12-14 14:40         ` Jim Meyering
2011-12-14 18:51           ` Paul Eggert
2011-12-15 23:53             ` Stefan Monnier
2011-12-14  2:40   ` Stefan Monnier
2011-12-15 12:58   ` Jim Meyering

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).