unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
From: Paul Eggert <eggert@cs.ucla.edu>
To: Michal Nazarewicz <mina86@mina86.com>, Eli Zaretskii <eliz@gnu.org>
Cc: 72641@debbugs.gnu.org
Subject: bug#72641: 31.0.50; "Unlocking file: Invalid argument" when deleting lock file on network file system
Date: Thu, 15 Aug 2024 14:43:21 -0700	[thread overview]
Message-ID: <bf7fca1b-14b0-4500-b3a0-641302774804@cs.ucla.edu> (raw)
In-Reply-To: <+b8wcnteufggsda3gtf7ioja@mina86.com>

[-- Attachment #1: Type: text/plain, Size: 901 bytes --]

On 2024-08-15 10:44, Michal Nazarewicz wrote:
> The*Warnings* buffer constantly popping up is annoying so I’d love this
> to be addressed though to be honest I’m not sure what would be a good
> solution.  Though I guess you can also argue this is CIFS bug.

It's definitely a file system bug. The symlink syscall should never 
create a regular file. I suggest reporting the bug to whoever maintains 
your file system code.

I don't see any good way to prevent Emacs from creating these 
zero-length files on buggy file systems.

That being said, I think Emacs can ignore and remove bad lock files 
without introducing more races. I installed the attached into the master 
branch and it works for me on your test case (which I introduced 
artificially on GNU/Linux). Please give it a try.

The first patches in this series are just setup. The last patch is the 
real workaround.

[-- Attachment #2: 0001-Fix-unlikely-lock-file-integer-overflow.patch --]
[-- Type: text/x-patch, Size: 1556 bytes --]

From cbacdca9e3f6dcf9b88704391f06daf7301608b0 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Thu, 15 Aug 2024 11:29:16 -0700
Subject: [PATCH 1/4] Fix unlikely lock file integer overflow

* src/filelock.c (within_one_second): Accept intmax_t first arg.
Avoid undefined behavior on integer overflow.
(current_lock_owner): Simplify based on within_one_second change.
---
 src/filelock.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/filelock.c b/src/filelock.c
index 69bd0322d4c..55ab15feb8d 100644
--- a/src/filelock.c
+++ b/src/filelock.c
@@ -298,9 +298,10 @@ lock_file_1 (Lisp_Object lfname, bool force)
 /* Return true if times A and B are no more than one second apart.  */
 
 static bool
-within_one_second (time_t a, time_t b)
+within_one_second (intmax_t a, time_t b)
 {
-  return (a - b >= -1 && a - b <= 1);
+  intmax_t diff;
+  return !ckd_sub (&diff, a, b) && -1 <= diff && diff <= 1;
 }
 \f
 /* On systems lacking ELOOP, test for an errno value that shouldn't occur.  */
@@ -469,8 +470,7 @@ current_lock_owner (lock_info_type *owner, Lisp_Object lfname)
       else if (VALID_PROCESS_ID (pid)
                && (kill (pid, 0) >= 0 || errno == EPERM)
 	       && (boot_time == 0
-		   || (boot_time <= TYPE_MAXIMUM (time_t)
-		       && within_one_second (boot_time, get_boot_sec ()))))
+		   || within_one_second (boot_time, get_boot_sec ())))
         return ANOTHER_OWNS_IT;
       /* The owner process is dead or has a strange pid, so try to
          zap the lockfile.  */
-- 
2.43.0


[-- Attachment #3: 0002-Avoid-some-GC-when-locking-unlocking-files.patch --]
[-- Type: text/x-patch, Size: 4048 bytes --]

From 4b6b9a7acdc4f7d0594caaaa382e2e633f8f1225 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Thu, 15 Aug 2024 12:58:19 -0700
Subject: [PATCH 2/4] Avoid some GC when locking/unlocking files
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* src/filelock.c (lock_file_1, current_lock_owner):
Don’t possibly invoke the garbage collector when
comparing lock file contents to host names.
---
 src/filelock.c | 62 ++++++++++++++++++++++++++++++--------------------
 1 file changed, 37 insertions(+), 25 deletions(-)

diff --git a/src/filelock.c b/src/filelock.c
index 55ab15feb8d..cdf9e6f0ffc 100644
--- a/src/filelock.c
+++ b/src/filelock.c
@@ -271,27 +271,29 @@ lock_file_1 (Lisp_Object lfname, bool force)
   intmax_t boot = get_boot_sec ();
   Lisp_Object luser_name = Fuser_login_name (Qnil);
   Lisp_Object lhost_name = Fsystem_name ();
-
-  /* Protect against the extremely unlikely case of the host name
-     containing an @ character.  */
-  if (!NILP (lhost_name) && strchr (SSDATA (lhost_name), '@'))
-    lhost_name = CALLN (Ffuncall, Qstring_replace,
-			build_string ("@"), build_string ("-"),
-			lhost_name);
-
   char const *user_name = STRINGP (luser_name) ? SSDATA (luser_name) : "";
   char const *host_name = STRINGP (lhost_name) ? SSDATA (lhost_name) : "";
   char lock_info_str[MAX_LFINFO + 1];
   intmax_t pid = getpid ();
 
-  char const *lock_info_fmt = (boot
-			       ? "%s@%s.%"PRIdMAX":%"PRIdMAX
-			       : "%s@%s.%"PRIdMAX);
-  int len = snprintf (lock_info_str, sizeof lock_info_str,
-		      lock_info_fmt, user_name, host_name, pid, boot);
+  int room = sizeof lock_info_str;
+  int len = snprintf (lock_info_str, room, "%s@", user_name);
   if (! (0 <= len && len < sizeof lock_info_str))
     return ENAMETOOLONG;
-
+  /* Protect against the extremely unlikely case of the host name
+     containing an @ character.  */
+  for (; *host_name; len++, host_name++)
+    {
+      if (! (len < sizeof lock_info_str - 1))
+	return ENAMETOOLONG;
+      lock_info_str[len] = *host_name == '@' ? '-' : *host_name;
+    }
+  char const *lock_info_fmt = boot ? ".%"PRIdMAX":%"PRIdMAX : ".%"PRIdMAX;
+  room = sizeof lock_info_str - len;
+  int suffixlen = snprintf (lock_info_str + len, room,
+			    lock_info_fmt, pid, boot);
+  if (! (0 <= suffixlen && suffixlen < room))
+    return ENAMETOOLONG;
   return create_lock_file (SSDATA (lfname), lock_info_str, force);
 }
 
@@ -448,22 +450,32 @@ current_lock_owner (lock_info_type *owner, Lisp_Object lfname)
   if (lfinfo_end != owner->user + lfinfolen)
     return EINVAL;
 
+  char *linkhost = at + 1;
+  ptrdiff_t linkhostlen = dot - linkhost;
   Lisp_Object system_name = Fsystem_name ();
   /* If `system-name' returns nil, that means we're in a
      --no-build-details Emacs, and the name part of the link (e.g.,
      .#test.txt -> larsi@.118961:1646577954) is an empty string.  */
+  bool on_current_host;
   if (NILP (system_name))
-    system_name = build_string ("");
-  /* Protect against the extremely unlikely case of the host name
-     containing an @ character.  */
-  else if (strchr (SSDATA (system_name), '@'))
-    system_name = CALLN (Ffuncall, intern ("string-replace"),
-			 build_string ("@"), build_string ("-"),
-			 system_name);
-  /* On current host?  */
-  if (STRINGP (system_name)
-      && dot - (at + 1) == SBYTES (system_name)
-      && memcmp (at + 1, SSDATA (system_name), SBYTES (system_name)) == 0)
+    on_current_host = linkhostlen == 0;
+  else
+    {
+      on_current_host = linkhostlen == SBYTES (system_name);
+      if (on_current_host)
+	{
+	  /* Protect against the extremely unlikely case of the host
+	     name containing '@'.  */
+	  char *sysname = SSDATA (system_name);
+	  for (ptrdiff_t i = 0; i < linkhostlen; i++)
+	    if (linkhost[i] != (sysname[i] == '@' ? '-' : sysname[i]))
+	      {
+		on_current_host = false;
+		break;
+	      }
+	}
+    }
+  if (on_current_host)
     {
       if (pid == getpid ())
         return I_OWN_IT;
-- 
2.43.0


[-- Attachment #4: 0003-Refactor-current_lock_owner.patch --]
[-- Type: text/x-patch, Size: 7388 bytes --]

From 775fa8443faa3d7f5ce7f7d0aa6e6fb53321715a Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Thu, 15 Aug 2024 13:17:24 -0700
Subject: [PATCH 3/4] Refactor current_lock_owner

* src/filelock.c (current_lock_owner): Refactor to make further
changes easier.  This should not affect behavior.
---
 src/filelock.c | 185 +++++++++++++++++++++++++------------------------
 1 file changed, 95 insertions(+), 90 deletions(-)

diff --git a/src/filelock.c b/src/filelock.c
index cdf9e6f0ffc..c68aacc46fb 100644
--- a/src/filelock.c
+++ b/src/filelock.c
@@ -386,9 +386,6 @@ integer_prefixed (char const *s)
 current_lock_owner (lock_info_type *owner, Lisp_Object lfname)
 {
   lock_info_type local_owner;
-  ptrdiff_t lfinfolen;
-  intmax_t pid, boot_time;
-  char *at, *dot, *lfinfo_end;
 
   /* Even if the caller doesn't want the owner info, we still have to
      read it to determine return value.  */
@@ -396,104 +393,112 @@ current_lock_owner (lock_info_type *owner, Lisp_Object lfname)
     owner = &local_owner;
 
   /* If nonexistent lock file, all is well; otherwise, got strange error. */
-  lfinfolen = read_lock_data (SSDATA (lfname), owner->user);
+  ptrdiff_t lfinfolen = read_lock_data (SSDATA (lfname), owner->user);
   if (lfinfolen < 0)
     return errno == ENOENT || errno == ENOTDIR ? 0 : errno;
-  if (MAX_LFINFO < lfinfolen)
-    return ENAMETOOLONG;
-  owner->user[lfinfolen] = 0;
-
-  /* Parse USER@HOST.PID:BOOT_TIME.  If can't parse, return EINVAL.  */
-  /* The USER is everything before the last @.  */
-  owner->at = at = memrchr (owner->user, '@', lfinfolen);
-  if (!at)
-    return EINVAL;
-  owner->dot = dot = strrchr (at, '.');
-  if (!dot)
-    return EINVAL;
-
-  /* The PID is everything from the last '.' to the ':' or equivalent.  */
-  if (! integer_prefixed (dot + 1))
-    return EINVAL;
-  errno = 0;
-  pid = strtoimax (dot + 1, &owner->colon, 10);
-  if (errno == ERANGE)
-    pid = -1;
-
-  /* After the ':' or equivalent, if there is one, comes the boot time.  */
-  char *boot = owner->colon + 1;
-  switch (owner->colon[0])
+
+  /* Examine lock file contents.  */
+  if (true)
     {
-    case 0:
-      boot_time = 0;
-      lfinfo_end = owner->colon;
-      break;
+      if (MAX_LFINFO < lfinfolen)
+	return ENAMETOOLONG;
+      owner->user[lfinfolen] = 0;
 
-    case '\357':
-      /* Treat "\357\200\242" (U+F022 in UTF-8) as if it were ":" (Bug#24656).
-	 This works around a bug in the Linux CIFS kernel client, which can
-	 mistakenly transliterate ':' to U+F022 in symlink contents.
-	 See <https://bugzilla.redhat.com/show_bug.cgi?id=1384153>.  */
-      if (! (boot[0] == '\200' && boot[1] == '\242'))
+      /* Parse USER@HOST.PID:BOOT_TIME.  If can't parse, return EINVAL.  */
+      /* The USER is everything before the last @.  */
+      char *at = memrchr (owner->user, '@', lfinfolen);
+      if (!at)
 	return EINVAL;
-      boot += 2;
-      FALLTHROUGH;
-    case ':':
-      if (! integer_prefixed (boot))
+      owner->at = at;
+      char *dot = strrchr (at, '.');
+      if (!dot)
 	return EINVAL;
-      boot_time = strtoimax (boot, &lfinfo_end, 10);
-      break;
+      owner->dot = dot;
 
-    default:
-      return EINVAL;
-    }
-  if (lfinfo_end != owner->user + lfinfolen)
-    return EINVAL;
-
-  char *linkhost = at + 1;
-  ptrdiff_t linkhostlen = dot - linkhost;
-  Lisp_Object system_name = Fsystem_name ();
-  /* If `system-name' returns nil, that means we're in a
-     --no-build-details Emacs, and the name part of the link (e.g.,
-     .#test.txt -> larsi@.118961:1646577954) is an empty string.  */
-  bool on_current_host;
-  if (NILP (system_name))
-    on_current_host = linkhostlen == 0;
-  else
-    {
-      on_current_host = linkhostlen == SBYTES (system_name);
-      if (on_current_host)
+      /* The PID is everything from the last '.' to the ':' or equivalent.  */
+      if (! integer_prefixed (dot + 1))
+	return EINVAL;
+      errno = 0;
+      intmax_t pid = strtoimax (dot + 1, &owner->colon, 10);
+      if (errno == ERANGE)
+	pid = -1;
+
+      /* After the ':' or equivalent, if there is one, comes the boot time.  */
+      intmax_t boot_time;
+      char *boot = owner->colon + 1, *lfinfo_end;
+      switch (owner->colon[0])
 	{
-	  /* Protect against the extremely unlikely case of the host
-	     name containing '@'.  */
-	  char *sysname = SSDATA (system_name);
-	  for (ptrdiff_t i = 0; i < linkhostlen; i++)
-	    if (linkhost[i] != (sysname[i] == '@' ? '-' : sysname[i]))
-	      {
-		on_current_host = false;
-		break;
-	      }
+	case 0:
+	  boot_time = 0;
+	  lfinfo_end = owner->colon;
+	  break;
+
+	case '\357':
+	  /* Treat "\357\200\242" (U+F022 in UTF-8) like ":" (Bug#24656).
+	     This works around a bug in the Linux CIFS kernel client, which can
+	     mistakenly transliterate ':' to U+F022 in symlink contents.
+	     See <https://bugzilla.redhat.com/show_bug.cgi?id=1384153>.  */
+	  if (! (boot[0] == '\200' && boot[1] == '\242'))
+	    return EINVAL;
+	  boot += 2;
+	  FALLTHROUGH;
+	case ':':
+	  if (! integer_prefixed (boot))
+	    return EINVAL;
+	  boot_time = strtoimax (boot, &lfinfo_end, 10);
+	  break;
+
+	default:
+	  return EINVAL;
 	}
-    }
-  if (on_current_host)
-    {
-      if (pid == getpid ())
-        return I_OWN_IT;
-      else if (VALID_PROCESS_ID (pid)
-               && (kill (pid, 0) >= 0 || errno == EPERM)
-	       && (boot_time == 0
-		   || within_one_second (boot_time, get_boot_sec ())))
-        return ANOTHER_OWNS_IT;
-      /* The owner process is dead or has a strange pid, so try to
-         zap the lockfile.  */
+      if (lfinfo_end != owner->user + lfinfolen)
+	return EINVAL;
+
+      char *linkhost = at + 1;
+      ptrdiff_t linkhostlen = dot - linkhost;
+      Lisp_Object system_name = Fsystem_name ();
+      /* If `system-name' returns nil, that means we're in a
+	 --no-build-details Emacs, and the name part of the link (e.g.,
+	 .#test.txt -> larsi@.118961:1646577954) is an empty string.  */
+      bool on_current_host;
+      if (NILP (system_name))
+	on_current_host = linkhostlen == 0;
       else
-        return emacs_unlink (SSDATA (lfname)) < 0 ? errno : 0;
-    }
-  else
-    { /* If we wanted to support the check for stale locks on remote machines,
-         here's where we'd do it.  */
-      return ANOTHER_OWNS_IT;
+	{
+	  on_current_host = linkhostlen == SBYTES (system_name);
+	  if (on_current_host)
+	    {
+	      /* Protect against the extremely unlikely case of the host
+		 name containing '@'.  */
+	      char *sysname = SSDATA (system_name);
+	      for (ptrdiff_t i = 0; i < linkhostlen; i++)
+		if (linkhost[i] != (sysname[i] == '@' ? '-' : sysname[i]))
+		  {
+		    on_current_host = false;
+		    break;
+		  }
+	    }
+	}
+      if (!on_current_host)
+	{
+	  /* Not on current host.  If we wanted to support the check for
+	     stale locks on remote machines, here's where we'd do it.  */
+	  return ANOTHER_OWNS_IT;
+	}
+
+      if (pid == getpid ())
+	return I_OWN_IT;
+
+      if (VALID_PROCESS_ID (pid)
+	  && ! (kill (pid, 0) < 0 && errno != EPERM)
+	  && (boot_time == 0
+	      || within_one_second (boot_time, get_boot_sec ())))
+	return ANOTHER_OWNS_IT;
     }
+
+  /* The owner process is dead or has a strange pid.
+     Try to zap the lockfile.  */
+  return emacs_unlink (SSDATA (lfname)) < 0 ? errno : 0;
 }
 
 \f
-- 
2.43.0


[-- Attachment #5: 0004-Remove-empty-invalid-lock-files.patch --]
[-- Type: text/x-patch, Size: 1612 bytes --]

From 8b36bfc553b97cf435bdfe1b84abe21c3a605b9f Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Thu, 15 Aug 2024 13:30:23 -0700
Subject: [PATCH 4/4] Remove empty (& invalid) lock files

* src/filelock.c (current_lock_owner):
Remove empty lock files, as they are necessarily invalid
and can be caused by buggy file systems.
Problem reported by Michal Nazarewicz (bug#72641).
---
 src/filelock.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/src/filelock.c b/src/filelock.c
index c68aacc46fb..1ae57dc7344 100644
--- a/src/filelock.c
+++ b/src/filelock.c
@@ -397,8 +397,8 @@ current_lock_owner (lock_info_type *owner, Lisp_Object lfname)
   if (lfinfolen < 0)
     return errno == ENOENT || errno == ENOTDIR ? 0 : errno;
 
-  /* Examine lock file contents.  */
-  if (true)
+  /* If the lock file seems valid, return a value based on its contents.  */
+  if (lfinfolen)
     {
       if (MAX_LFINFO < lfinfolen)
 	return ENAMETOOLONG;
@@ -496,8 +496,11 @@ current_lock_owner (lock_info_type *owner, Lisp_Object lfname)
 	return ANOTHER_OWNS_IT;
     }
 
-  /* The owner process is dead or has a strange pid.
-     Try to zap the lockfile.  */
+  /* The owner process is dead or has a strange pid, or the lock file is empty.
+     Try to zap the lockfile.  If the lock file is empty, this assumes
+     the file system is buggy, e.g., <https://bugs.gnu.org/72641>.
+     Emacs never creates empty lock files even temporarily, so removing
+     an empty lock file should be harmless.  */
   return emacs_unlink (SSDATA (lfname)) < 0 ? errno : 0;
 }
 
-- 
2.43.0


  reply	other threads:[~2024-08-15 21:43 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-16  0:53 bug#70973: 29.1; "Unlocking file: Invalid argument" Warning saving via a softlink with stale file lock Duncan Greatwood
2024-05-16  8:43 ` Eli Zaretskii
2024-05-16 14:17   ` Duncan Greatwood
2024-05-16 15:46     ` Eli Zaretskii
2024-05-16 15:55       ` Duncan Greatwood
2024-05-16 16:09         ` Eli Zaretskii
2024-05-16 16:20           ` Duncan Greatwood
2024-05-16 18:18             ` Eli Zaretskii
2024-05-16 19:27               ` Duncan Greatwood
2024-05-16 19:51                 ` Eli Zaretskii
2024-05-16 21:36                   ` Duncan Greatwood
2024-06-01 14:04                     ` Eli Zaretskii
2024-08-15 15:59                   ` bug#72641: 31.0.50; "Unlocking file: Invalid argument" when deleting lock file on network file system Michal Nazarewicz
     [not found]                   ` <2+lcnmedng9le3pwfn0gc79m@mina86.com>
     [not found]                     ` <86a5hd7o4t.fsf@gnu.org>
2024-08-15 17:44                       ` Michal Nazarewicz
2024-08-15 21:43                         ` Paul Eggert [this message]
2024-08-16  0:59                           ` Michal Nazarewicz
2024-08-16  3:20                             ` Paul Eggert
2024-08-17 20:03                               ` Michal Nazarewicz
2024-08-17 22:38                                 ` Paul Eggert
2024-08-17 22:55                                   ` Mike Kupfer
2024-08-17 23:08                                     ` Mike Kupfer
2024-08-18 21:15                                   ` Michal Nazarewicz
2024-08-18 21:25                                     ` Paul Eggert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bf7fca1b-14b0-4500-b3a0-641302774804@cs.ucla.edu \
    --to=eggert@cs.ucla.edu \
    --cc=72641@debbugs.gnu.org \
    --cc=eliz@gnu.org \
    --cc=mina86@mina86.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).