From: Mark H Weaver <mhw@netris.org>
To: ludo@gnu.org (Ludovic Courtès)
Cc: guile-devel@gnu.org
Subject: Re: [PATCH] Universally-unique gensyms
Date: Wed, 18 Jan 2012 19:58:46 -0500 [thread overview]
Message-ID: <8762g8y0c9.fsf@netris.org> (raw)
In-Reply-To: <87d3agu1rt.fsf@gnu.org> ("Ludovic \=\?utf-8\?Q\?Court\=C3\=A8s\=22'\?\= \=\?utf-8\?Q\?s\?\= message of "Wed, 18 Jan 2012 22:41:26 +0100")
[-- Attachment #1: Type: text/plain, Size: 1013 bytes --]
Hi Ludovic,
Thanks for the review! Attached below are updated patches, which I
believe incorporate all of your suggestions.
> This looks cool, though I must confess I don’t see why sizeof (int) is
> not enough for everyone, nor whether this warrants adding all this code.
For some applications, it is important for gensyms to be unique across
multiple sessions. `local-eval' is one such application, but there are
many others.
> Apparently this doesn’t break the ABI, right?
Good question! I investigated this, and AFAICT there should be no
problem with changing the scm_i_thread structure. I see no public
macros to access the structure members, no public APIs to retrieve a
pointer to this structure, and nothing for a user to do with such a
structure if he created one himself.
Indeed, this structure definition should probably be put within:
#ifdef BUILDING_LIBGUILE
Anyway, here's the new patch. What do you think? Okay to push? :)
Thanks!
Mark
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: [PATCH 1/2] Add `random-state-from-platform' and `scm_i_random_bytes_from_platform' --]
[-- Type: text/x-patch, Size: 5998 bytes --]
From 55714abaf6c55f380bb68f456fdc6af923bae6f0 Mon Sep 17 00:00:00 2001
From: Mark H Weaver <mhw@netris.org>
Date: Wed, 18 Jan 2012 02:36:17 -0500
Subject: [PATCH 1/2] Add `random-state-from-platform' and
`scm_i_random_bytes_from_platform'
* libguile/random.c (scm_random_state_from_platform): New procedure.
(scm_i_random_bytes_from_platform): New internal function.
* libguile/random.h (scm_random_state_from_platform,
scm_i_random_bytes_from_platform): Add prototypes.
* doc/ref/api-data.texi (Random): Add documentation.
---
doc/ref/api-data.texi | 30 ++++++++--------------
libguile/random.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++
libguile/random.h | 3 ++
3 files changed, 80 insertions(+), 19 deletions(-)
diff --git a/doc/ref/api-data.texi b/doc/ref/api-data.texi
index f2450ce..e165559 100644
--- a/doc/ref/api-data.texi
+++ b/doc/ref/api-data.texi
@@ -1865,6 +1865,14 @@ Return a datum representation of @var{state} that may be written out and
read back with the Scheme reader.
@end deffn
+@deffn {Scheme Procedure} random-state-from-platform
+@deffnx {C Function} scm_random_state_from_platform ()
+Construct a new random state seeded from a platform-specific source of
+entropy, appropriate for use in non-security-critical applications.
+Currently @file{/dev/urandom} is tried first, or else the seed is based
+on the time, date, process ID, and a high-resolution timer if available.
+@end deffn
+
@defvar *random-state*
The global random state used by the above functions when the
@var{state} parameter is not given.
@@ -1887,29 +1895,13 @@ Guile started up, will always give:
(0 1 1 2 2 2 1 2 6 7 10 0 5 3 12 5 5 12)
@end lisp
-To use the time of day as the random seed, you can use code like this:
-
-@lisp
-(let ((time (gettimeofday)))
- (set! *random-state*
- (seed->random-state (+ (car time)
- (cdr time)))))
-@end lisp
-
-@noindent
-And then (depending on the time of day, of course):
+To seed the random state in a sensible way for non-security-critical
+applications, do this during initialization of your program:
@lisp
-(map random (cdr (iota 19)))
-@result{}
-(0 0 1 0 2 4 5 4 5 5 9 3 10 1 8 3 14 17)
+(set! *random-state* (random-state-from-platform))
@end lisp
-For security applications, such as password generation, you should use
-more bits of seed. Otherwise an open source password generator could
-be attacked by guessing the seed@dots{} but that's a subject for
-another manual.
-
@node Characters
@subsection Characters
diff --git a/libguile/random.c b/libguile/random.c
index 8bc0d87..5c7983a 100644
--- a/libguile/random.c
+++ b/libguile/random.c
@@ -653,6 +653,72 @@ SCM_DEFINE (scm_random_exp, "random:exp", 0, 1, 0,
}
#undef FUNC_NAME
+/* Return a new random-state seeded from the time, date, process ID,
+ and a high-resolution timer if available. This is only to be used as
+ a last resort, when no better source of entropy is available. */
+static SCM
+random_state_of_last_resort (void)
+{
+ SCM time_of_day = scm_gettimeofday ();
+ return scm_seed_to_random_state
+ (scm_sum
+ (scm_get_internal_real_time (),
+ scm_ash (scm_sum (scm_getpid (),
+ scm_ash (scm_sum (scm_cdr (time_of_day),
+ scm_ash (scm_car (time_of_day),
+ SCM_I_MAKINUM (20))),
+ SCM_I_MAKINUM (20))),
+ SCM_I_MAKINUM (64))));
+}
+
+/* Attempt to fill buffer with random bytes from /dev/urandom.
+ Return 1 if successful, else return 0. */
+static int
+read_dev_urandom (unsigned char *buf, size_t len)
+{
+ size_t res = 0;
+ FILE *f = fopen ("/dev/urandom", "r");
+ if (f)
+ {
+ res = fread(buf, 1, len, f);
+ fclose (f);
+ }
+ return (res == len);
+}
+
+/* Fill a buffer with random bytes seeded from a platform-specific
+ source of entropy. /dev/urandom is used if available. Note that
+ this function provides no guarantees about the amount of entropy
+ present in the returned bytes. */
+void
+scm_i_random_bytes_from_platform (unsigned char *buf, size_t len)
+{
+ if (read_dev_urandom (buf, len))
+ return;
+ else /* FIXME: support other platform sources */
+ {
+ /* When all else fails, use this (rather weak) fallback */
+ SCM random_state = random_state_of_last_resort ();
+ int i;
+ for (i = len-1; i >= 0; --i)
+ buf[i] = scm_to_int (scm_random (SCM_I_MAKINUM (256), random_state));
+ }
+}
+
+SCM_DEFINE (scm_random_state_from_platform, "random-state-from-platform", 0, 0, 0,
+ (void),
+ "Construct a new random state seeded from a platform-specific\n\
+source of entropy, appropriate for use in non-security-critical applications.")
+#define FUNC_NAME s_scm_random_state_from_platform
+{
+ unsigned char buf[32];
+ if (read_dev_urandom (buf, sizeof(buf)))
+ return make_rstate (scm_c_make_rstate ((char *) buf, sizeof(buf)));
+ else
+ return random_state_of_last_resort ();
+}
+#undef FUNC_NAME
+
void
scm_init_random ()
{
diff --git a/libguile/random.h b/libguile/random.h
index 2f1f0f6..109969e 100644
--- a/libguile/random.h
+++ b/libguile/random.h
@@ -86,6 +86,7 @@ SCM_API SCM scm_copy_random_state (SCM state);
SCM_API SCM scm_seed_to_random_state (SCM seed);
SCM_API SCM scm_datum_to_random_state (SCM datum);
SCM_API SCM scm_random_state_to_datum (SCM state);
+SCM_API SCM scm_random_state_from_platform (void);
SCM_API SCM scm_random_uniform (SCM state);
SCM_API SCM scm_random_solid_sphere_x (SCM v, SCM state);
SCM_API SCM scm_random_hollow_sphere_x (SCM v, SCM state);
@@ -94,6 +95,8 @@ SCM_API SCM scm_random_normal_vector_x (SCM v, SCM state);
SCM_API SCM scm_random_exp (SCM state);
SCM_INTERNAL void scm_init_random (void);
+SCM_INTERNAL void scm_i_random_bytes_from_platform (unsigned char *buf, size_t len);
+
#endif /* SCM_RANDOM_H */
/*
--
1.7.5.4
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: [PATCH 2/2] Universally-unique gensyms --]
[-- Type: text/x-patch, Size: 4495 bytes --]
From 0b7603c90d17d5296a94ef7d22849c536e73b49a Mon Sep 17 00:00:00 2001
From: Mark H Weaver <mhw@netris.org>
Date: Wed, 18 Jan 2012 02:53:05 -0500
Subject: [PATCH 2/2] Universally-unique gensyms
* libguile/symbols.c (scm_gensym): Make the gensym counter a 128-bit
thread-local, initialized to a random number upon the first call to
`gensym' within a given thread. This counter is rendered as a 22 byte
suffix of mostly base64 digits.
* libguile/threads.h (scm_i_thread): Add a thread-local gensym_counter.
* libguile/threads.c (guilify_self_1): Initialize gensym_counter to NULL.
---
libguile/symbols.c | 50 +++++++++++++++++++++++++++++++++++++++-----------
libguile/threads.c | 1 +
libguile/threads.h | 4 ++++
3 files changed, 44 insertions(+), 11 deletions(-)
diff --git a/libguile/symbols.c b/libguile/symbols.c
index 59aca00..840a458 100644
--- a/libguile/symbols.c
+++ b/libguile/symbols.c
@@ -31,6 +31,7 @@
#include "libguile/variable.h"
#include "libguile/alist.h"
#include "libguile/fluids.h"
+#include "libguile/threads.h"
#include "libguile/strings.h"
#include "libguile/vectors.h"
#include "libguile/hashtab.h"
@@ -340,7 +341,9 @@ SCM_DEFINE (scm_string_ci_to_symbol, "string-ci->symbol", 1, 0, 0,
/* The default prefix for `gensym'd symbols. */
static SCM default_gensym_prefix;
-#define MAX_PREFIX_LENGTH 30
+#define GENSYM_LENGTH 22 /* bytes */
+#define GENSYM_RADIX_BITS 6
+#define GENSYM_RADIX (1 << (GENSYM_RADIX_BITS))
SCM_DEFINE (scm_gensym, "gensym", 0, 1, 0,
(SCM prefix),
@@ -351,22 +354,47 @@ SCM_DEFINE (scm_gensym, "gensym", 0, 1, 0,
"resetting the counter.")
#define FUNC_NAME s_scm_gensym
{
- static int gensym_counter = 0;
-
+ static const char base64[GENSYM_RADIX] =
+ "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789$@";
+ static const char base4[4] = "_.-~";
+
+ unsigned char *digit_buf = SCM_I_CURRENT_THREAD->gensym_counter;
+ char char_buf[GENSYM_LENGTH];
SCM suffix, name;
- int n, n_digits;
- char buf[SCM_INTBUFLEN];
+ int i;
if (SCM_UNBNDP (prefix))
prefix = default_gensym_prefix;
- /* mutex in case another thread looks and incs at the exact same moment */
- scm_i_scm_pthread_mutex_lock (&scm_i_misc_mutex);
- n = gensym_counter++;
- scm_i_pthread_mutex_unlock (&scm_i_misc_mutex);
+ if (SCM_UNLIKELY (digit_buf == NULL))
+ {
+ /* This is the first time gensym has been called in this thread.
+ Allocate and randomize our new thread-local gensym counter */
+ digit_buf = (unsigned char *)
+ scm_gc_malloc_pointerless (GENSYM_LENGTH, "gensym-counter");
+ scm_i_random_bytes_from_platform (digit_buf, GENSYM_LENGTH);
+ for (i = (GENSYM_LENGTH - 1); i >= 0; --i)
+ digit_buf[i] &= (GENSYM_RADIX - 1);
+ SCM_I_CURRENT_THREAD->gensym_counter = digit_buf;
+ }
+
+ /* Increment our thread-local gensym_counter. */
+ for (i = (GENSYM_LENGTH - 1); i >= 0; --i)
+ {
+ if (SCM_LIKELY (++(digit_buf[i]) < GENSYM_RADIX))
+ break;
+ else
+ digit_buf[i] = 0;
+ }
+
+ /* Encode digit_buf as base64, except for the first character where we
+ use the sparse glyphs "_.-~" (base 4) to provide some visual
+ separation between the prefix and the dense base64 block. */
+ for (i = (GENSYM_LENGTH - 1); i > 0; --i)
+ char_buf[i] = base64[digit_buf[i]];
+ char_buf[0] = base4[digit_buf[0] & 3];
- n_digits = scm_iint2str (n, 10, buf);
- suffix = scm_from_latin1_stringn (buf, n_digits);
+ suffix = scm_from_latin1_stringn (char_buf, GENSYM_LENGTH);
name = scm_string_append (scm_list_2 (prefix, suffix));
return scm_string_to_symbol (name);
}
diff --git a/libguile/threads.c b/libguile/threads.c
index 5a13e5c..67834ff 100644
--- a/libguile/threads.c
+++ b/libguile/threads.c
@@ -545,6 +545,7 @@ guilify_self_1 (struct GC_stack_base *base)
t.join_queue = SCM_EOL;
t.dynamic_state = SCM_BOOL_F;
t.dynwinds = SCM_EOL;
+ t.gensym_counter = NULL;
t.active_asyncs = SCM_EOL;
t.block_asyncs = 1;
t.pending_asyncs = 1;
diff --git a/libguile/threads.h b/libguile/threads.h
index ec129bc..3660a58 100644
--- a/libguile/threads.h
+++ b/libguile/threads.h
@@ -81,6 +81,10 @@ typedef struct scm_i_thread {
SCM dynamic_state;
SCM dynwinds;
+ /* Thread-local gensym counter.
+ */
+ unsigned char *gensym_counter;
+
/* For system asyncs.
*/
SCM active_asyncs; /* The thunks to be run at the next
--
1.7.5.4
next prev parent reply other threads:[~2012-01-19 0:58 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-01-17 13:27 [PATCH] Universally-unique gensyms Mark H Weaver
2012-01-17 13:57 ` Andy Wingo
2012-01-17 14:06 ` Andy Wingo
2012-01-18 8:04 ` Mark H Weaver
2012-01-18 9:35 ` Mark H Weaver
2012-01-18 21:41 ` Ludovic Courtès
2012-01-18 22:23 ` Andy Wingo
2012-01-18 22:43 ` Ludovic Courtès
2012-01-18 23:30 ` Mark H Weaver
2012-01-19 1:21 ` David Kastrup
2012-01-19 4:50 ` Mark H Weaver
2012-01-19 6:25 ` David Kastrup
2012-01-19 10:23 ` Mark H Weaver
2012-01-20 0:50 ` Mark H Weaver
2012-01-19 0:58 ` Mark H Weaver [this message]
2012-01-19 23:26 ` Ludovic Courtès
2012-01-18 21:42 ` Ludovic Courtès
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/guile/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8762g8y0c9.fsf@netris.org \
--to=mhw@netris.org \
--cc=guile-devel@gnu.org \
--cc=ludo@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).