unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
From: Stefan Kangas <stefankangas@gmail.com>
To: Paul Eggert <eggert@cs.ucla.edu>
Cc: 58472@debbugs.gnu.org
Subject: bug#58472: [PATCH] Make `message-unique-id' less prone to collisions
Date: Wed, 12 Oct 2022 19:46:14 -0700	[thread overview]
Message-ID: <CADwFkmm5A6zbx4=ZSrhcmPjcOR+CT=tRU2fTac-L6XVLLXQdBQ@mail.gmail.com> (raw)
In-Reply-To: <e5c16d15-dc23-0d9b-86ed-3a4b68a5a6cc@cs.ucla.edu>

[-- Attachment #1: Type: text/plain, Size: 1505 bytes --]

Paul Eggert <eggert@cs.ucla.edu> writes:

> If the goal is to avoid collisions, though, wouldn't it be better to
> avoid the time entirely? time is not very random in the high order bits.

The main goal is to avoid collisions, but using the time also gives an
idea of when the message was sent, which is kind of nice.  Time also
guarantees a somewhat unique value even if the user has happened to set
the random seed.

If we limit the length of the time string to 12 characters, and the
total length to 25 characters (including the ".gnu" part), we still have
a guaranteed 9 characters of random data, or 46 bits of entropy.

    ⌊log₂ 36⁹⌋ = 46

If we want to guarantee 62 bits of entropy, we could do a total length
of 28 characters instead, but that might be overkill.

> Also, the comment about ".fsf" and other newsreaders is wrong [...]
> And if we're changing the algorithm perhaps we should change ".fsf" to
> something else.

Changing it to ".gnu" makes sense to me.

> Something like this, perhaps, where you can choose LEN as you like:
>
>    (concat
>       (let ((len 18))
>         ;; Pass LEN, not -1, to message-number-base36 so that it never
>         ;; returns "" which would make the message-ID nonconforming.
>         (message-number-base36 (random (expt 36 len)) len))
>       ;; Append ".gnu" to advertise that we're GNU Emacs.
>       ".gnu")

I think making sure it is always the same length is a good idea.

How about the attached?

[-- Attachment #2: 0001-Make-message-unique-id-less-prone-to-collisions.patch --]
[-- Type: text/x-diff, Size: 4030 bytes --]

From 889667e332fe76fc670cb1d4620c47525d9a70bf Mon Sep 17 00:00:00 2001
From: Stefan Kangas <stefankangas@gmail.com>
Date: Wed, 12 Oct 2022 17:29:04 +0200
Subject: [PATCH] Make `message-unique-id' less prone to collisions

* lisp/gnus/message.el (message-unique-id): Make less prone to
collisions.  (Bug#58472)
(message-unique-id-char): Make unused variable obsolete.
* test/lisp/gnus/message-tests.el (message-unique-id-test)
(message-number-base36-test): New tests.
---
 lisp/gnus/message.el            | 43 ++++++++-------------------------
 test/lisp/gnus/message-tests.el |  9 +++++++
 2 files changed, 19 insertions(+), 33 deletions(-)

diff --git a/lisp/gnus/message.el b/lisp/gnus/message.el
index 67ec0531fa..9b85f1c5fd 100644
--- a/lisp/gnus/message.el
+++ b/lisp/gnus/message.el
@@ -47,7 +47,7 @@
 (require 'rfc2047)
 (require 'puny)
 (require 'rmc)                          ; read-multiple-choice
-(require 'subr-x)
+(require 'subr-x)                       ; string-limit
 (require 'yank-media)
 (require 'mailcap)
 (require 'sendmail)
@@ -5889,41 +5889,18 @@ message-make-message-id
 		"_-_" ""))
 	  "@" (message-make-fqdn) ">"))
 
+(make-obsolete-variable 'message-unique-id-char nil "29.1")
 (defvar message-unique-id-char nil)
 
-;; If you ever change this function, make sure the new version
-;; cannot generate IDs that the old version could.
-;; You might for example insert a "." somewhere (not next to another dot
-;; or string boundary), or modify the "fsf" string.
 (defun message-unique-id ()
-  ;; Don't use fractional seconds from timestamp; they may be unsupported.
-  ;; Instead we use this randomly inited counter.
-  (setq message-unique-id-char
-	;; 2^16 * 25 just fits into 4 digits i base 36.
-	(let ((base (* 25 25)))
-	  (if message-unique-id-char
-	      (% (1+ message-unique-id-char) base)
-	    (random base))))
-  (let ((tm (time-convert nil 'integer)))
+  (let* ((time (car (time-convert nil t)))
+         (time-string (string-limit (message-number-base36 time -1) 12))
+         (len (- 21 (length time-string))))
     (concat
-     (if (or (eq system-type 'ms-dos)
-	     ;; message-number-base36 doesn't handle bigints.
-	     (floatp (user-uid)))
-	 (let ((user (downcase (user-login-name))))
-	   (while (string-match "[^a-z0-9_]" user)
-	     (aset user (match-beginning 0) ?_))
-	   user)
-       (message-number-base36 (user-uid) -1))
-     (message-number-base36 (+ (ash tm -16)
-			       (ash (% message-unique-id-char 25) 16))
-			    4)
-     (message-number-base36 (+ (logand tm #xffff)
-			       (ash (/ message-unique-id-char 25) 16))
-			    4)
-     ;; Append a given name, because while the generated ID is unique
-     ;; to this newsreader, other newsreaders might otherwise generate
-     ;; the same ID via another algorithm.
-     ".fsf")))
+     time-string
+     (message-number-base36 (random (expt 36 len)) len)
+     ;; Append ".gnu" to advertise that we're GNU Emacs.
+     ".gnu")))
 
 (defun message-number-base36 (num len)
   (if (if (< len 0)
@@ -5931,7 +5908,7 @@ message-number-base36
 	(= len 0))
       ""
     (concat (message-number-base36 (/ num 36) (1- len))
-	    (char-to-string (aref "zyxwvutsrqponmlkjihgfedcba9876543210"
+            (char-to-string (aref "0123456789abcdefghijklmnopqrstuvwxyz"
 				  (% num 36))))))
 
 (defun message-make-organization ()
diff --git a/test/lisp/gnus/message-tests.el b/test/lisp/gnus/message-tests.el
index a724428ecb..e8c0655543 100644
--- a/test/lisp/gnus/message-tests.el
+++ b/test/lisp/gnus/message-tests.el
@@ -31,6 +31,15 @@
 
 (require 'cl-lib)
 
+(ert-deftest message-unique-id-test ()
+  (should (stringp (message-unique-id)))
+  (should (= (length (message-unique-id)) 25)))
+
+(ert-deftest message-number-base36-test ()
+  (should (equal (message-number-base36 10 -1) "a"))
+  (should (equal (message-number-base36 1 -1) "1"))
+  (should (equal (message-number-base36 (expt 36 5) -1) "100000")))
+
 (ert-deftest message-mode-propertize ()
   (with-temp-buffer
     (unwind-protect
-- 
2.35.1


  reply	other threads:[~2022-10-13  2:46 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-12 16:07 bug#58472: [PATCH] Make `message-unique-id' less prone to collisions Stefan Kangas
2022-10-12 18:08 ` Paul Eggert
2022-10-13  2:46   ` Stefan Kangas [this message]
2022-10-13  4:53     ` Matt Armstrong
2022-10-13 12:10       ` Stefan Kangas
2022-10-13 16:35         ` Matt Armstrong
2022-10-13 16:38           ` Paul Eggert
2022-10-14  9:22           ` Stefan Kangas
2022-10-13 16:21     ` Paul Eggert
2022-10-14  9:22       ` Stefan Kangas
2022-10-16  7:32         ` Stefan Kangas
2022-10-16 17:05           ` Stefan Kangas
2022-10-16 15:19       ` Matt Armstrong
2022-10-16 16:49         ` Stefan Kangas
2022-10-17  6:17           ` Matt Armstrong
2022-10-17  7:30           ` Paul Eggert
2022-10-17  8:14             ` Stefan Kangas
2022-10-17  8:23               ` Eli Zaretskii
2022-10-17 18:47               ` Matt Armstrong
2022-10-17  8:16             ` Eli Zaretskii
2022-10-17  8:29               ` Lars Ingebrigtsen
2022-10-17  8:34                 ` Eli Zaretskii
2022-10-17  9:30                 ` Stefan Kangas
2022-10-17 11:22                   ` Lars Ingebrigtsen
2022-10-17 15:40                     ` Stefan Kangas
2022-11-25  1:26                       ` Stefan Kangas
2022-10-17 18:40             ` Matt Armstrong
2022-10-18  1:38               ` Paul Eggert
2022-10-18 14:05                 ` Eli Zaretskii
2022-10-13 11:45 ` Lars Ingebrigtsen
2022-10-13 12:10   ` Stefan Kangas
2022-10-13 19:15     ` Lars Ingebrigtsen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CADwFkmm5A6zbx4=ZSrhcmPjcOR+CT=tRU2fTac-L6XVLLXQdBQ@mail.gmail.com' \
    --to=stefankangas@gmail.com \
    --cc=58472@debbugs.gnu.org \
    --cc=eggert@cs.ucla.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).