unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#58472: [PATCH] Make `message-unique-id' less prone to collisions
@ 2022-10-12 16:07 Stefan Kangas
  2022-10-12 18:08 ` Paul Eggert
  2022-10-13 11:45 ` Lars Ingebrigtsen
  0 siblings, 2 replies; 32+ messages in thread
From: Stefan Kangas @ 2022-10-12 16:07 UTC (permalink / raw)
  To: 58472; +Cc: Paul Eggert

[-- Attachment #1: Type: text/plain, Size: 1527 bytes --]

This is a proposal to make `message-unique-id' less prone to collisions.

For the first 1-2 characters, it uses the return value `user-uid', which
on most single-user systems is a fixed value (for example, this gives us
"87" on almost any single-user Debian machine).[1]  It's also
unnecessarily leaky of potentially privacy sensitive information.

The next 8 characters are the current time, with some gymnastics to
emulate support for a fractional part to seconds.  This seems
unnecessary now that, AFAIU, `time-convert' can do that for us portably
(please correct me if I'm wrong, Paul).

I suggest that we instead base the left-hand side of the Message-ID on:

  1. (time-convert nil (expt 10 9))
  2. 2^N bits of pseudo random data (e.g. N=32)

We can then ignore the effective user id, while significantly decreasing
the risk of a Message-ID collision.[2]

Currently, we get values like:

    (message-unique-id)
    => "87o7uhi3at.fsf"            ; length 10

With the attached patch, we have instead:

    (message-unique-id)
    => "cnk29wgg1a4nvrpqcy.fsf"   ; length ~22

Note also that `message-number-base36' uses a Caesar cipher for some
reason:

    (message-number-base36 5 -1)
    => "u"    ; expect "5"
    (message-number-base36 (expt 36 3) -1)
    => "yzzz" ; expect "1000"

The patch fixes this also.

I don't know if this change should be in NEWS or not.

Footnotes:
[1]  Just for fun, you can search for 87 on
     https://en.wikipedia.org/wiki/Message-ID

[2]  See also: https://www.jwz.org/doc/mid.html

[-- Attachment #2: 0001-Make-message-unique-id-less-prone-to-collisions.patch --]
[-- Type: text/x-diff, Size: 3772 bytes --]

From 5eedf88bb85e5b4a8a1bb556d856324aa9333042 Mon Sep 17 00:00:00 2001
From: Stefan Kangas <stefankangas@gmail.com>
Date: Wed, 12 Oct 2022 17:29:04 +0200
Subject: [PATCH] Make `message-unique-id' less prone to collisions

* lisp/gnus/message.el (message-unique-id): Make less prone to
collisions.
(message-unique-id-char): Make unused variable obsolete.
* test/lisp/gnus/message-tests.el (message-unique-id-test)
(message-number-base36-test): New tests.
---
 lisp/gnus/message.el            | 42 +++++++--------------------------
 test/lisp/gnus/message-tests.el |  9 +++++++
 2 files changed, 18 insertions(+), 33 deletions(-)

diff --git a/lisp/gnus/message.el b/lisp/gnus/message.el
index 67ec0531fa..bdc9a5b2ad 100644
--- a/lisp/gnus/message.el
+++ b/lisp/gnus/message.el
@@ -5889,41 +5889,17 @@ message-make-message-id
 		"_-_" ""))
 	  "@" (message-make-fqdn) ">"))
 
+(make-obsolete-variable 'message-unique-id-char nil "29.1")
 (defvar message-unique-id-char nil)
 
-;; If you ever change this function, make sure the new version
-;; cannot generate IDs that the old version could.
-;; You might for example insert a "." somewhere (not next to another dot
-;; or string boundary), or modify the "fsf" string.
 (defun message-unique-id ()
-  ;; Don't use fractional seconds from timestamp; they may be unsupported.
-  ;; Instead we use this randomly inited counter.
-  (setq message-unique-id-char
-	;; 2^16 * 25 just fits into 4 digits i base 36.
-	(let ((base (* 25 25)))
-	  (if message-unique-id-char
-	      (% (1+ message-unique-id-char) base)
-	    (random base))))
-  (let ((tm (time-convert nil 'integer)))
-    (concat
-     (if (or (eq system-type 'ms-dos)
-	     ;; message-number-base36 doesn't handle bigints.
-	     (floatp (user-uid)))
-	 (let ((user (downcase (user-login-name))))
-	   (while (string-match "[^a-z0-9_]" user)
-	     (aset user (match-beginning 0) ?_))
-	   user)
-       (message-number-base36 (user-uid) -1))
-     (message-number-base36 (+ (ash tm -16)
-			       (ash (% message-unique-id-char 25) 16))
-			    4)
-     (message-number-base36 (+ (logand tm #xffff)
-			       (ash (/ message-unique-id-char 25) 16))
-			    4)
-     ;; Append a given name, because while the generated ID is unique
-     ;; to this newsreader, other newsreaders might otherwise generate
-     ;; the same ID via another algorithm.
-     ".fsf")))
+  (concat
+   (message-number-base36 (car (time-convert nil (expt 10 9))) -1)
+   (message-number-base36 (random (expt 2 32)) -1)
+   ;; Append a given name, because while the generated ID is unique
+   ;; to this newsreader, other newsreaders might otherwise generate
+   ;; the same ID via another algorithm.
+   ".fsf"))
 
 (defun message-number-base36 (num len)
   (if (if (< len 0)
@@ -5931,7 +5907,7 @@ message-number-base36
 	(= len 0))
       ""
     (concat (message-number-base36 (/ num 36) (1- len))
-	    (char-to-string (aref "zyxwvutsrqponmlkjihgfedcba9876543210"
+            (char-to-string (aref "0123456789abcdefghijklmnopqrstuvwxyz"
 				  (% num 36))))))
 
 (defun message-make-organization ()
diff --git a/test/lisp/gnus/message-tests.el b/test/lisp/gnus/message-tests.el
index a724428ecb..e43fb1c1c4 100644
--- a/test/lisp/gnus/message-tests.el
+++ b/test/lisp/gnus/message-tests.el
@@ -31,6 +31,15 @@
 
 (require 'cl-lib)
 
+(ert-deftest message-unique-id-test ()
+  (should (stringp (message-unique-id)))
+  (should (< 10 (length (message-unique-id)))))
+
+(ert-deftest message-number-base36-test ()
+  (should (equal (message-number-base36 10 -1) "a"))
+  (should (equal (message-number-base36 1 -1) "1"))
+  (should (equal (message-number-base36 (expt 36 5) -1) "100000")))
+
 (ert-deftest message-mode-propertize ()
   (with-temp-buffer
     (unwind-protect
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2022-11-25  1:26 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-10-12 16:07 bug#58472: [PATCH] Make `message-unique-id' less prone to collisions Stefan Kangas
2022-10-12 18:08 ` Paul Eggert
2022-10-13  2:46   ` Stefan Kangas
2022-10-13  4:53     ` Matt Armstrong
2022-10-13 12:10       ` Stefan Kangas
2022-10-13 16:35         ` Matt Armstrong
2022-10-13 16:38           ` Paul Eggert
2022-10-14  9:22           ` Stefan Kangas
2022-10-13 16:21     ` Paul Eggert
2022-10-14  9:22       ` Stefan Kangas
2022-10-16  7:32         ` Stefan Kangas
2022-10-16 17:05           ` Stefan Kangas
2022-10-16 15:19       ` Matt Armstrong
2022-10-16 16:49         ` Stefan Kangas
2022-10-17  6:17           ` Matt Armstrong
2022-10-17  7:30           ` Paul Eggert
2022-10-17  8:14             ` Stefan Kangas
2022-10-17  8:23               ` Eli Zaretskii
2022-10-17 18:47               ` Matt Armstrong
2022-10-17  8:16             ` Eli Zaretskii
2022-10-17  8:29               ` Lars Ingebrigtsen
2022-10-17  8:34                 ` Eli Zaretskii
2022-10-17  9:30                 ` Stefan Kangas
2022-10-17 11:22                   ` Lars Ingebrigtsen
2022-10-17 15:40                     ` Stefan Kangas
2022-11-25  1:26                       ` Stefan Kangas
2022-10-17 18:40             ` Matt Armstrong
2022-10-18  1:38               ` Paul Eggert
2022-10-18 14:05                 ` Eli Zaretskii
2022-10-13 11:45 ` Lars Ingebrigtsen
2022-10-13 12:10   ` Stefan Kangas
2022-10-13 19:15     ` Lars Ingebrigtsen

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).