* bug#22689: 25.1.50; implement logcount @ 2016-02-15 23:18 Mark Oteiza 2017-09-29 17:41 ` bug#22689: [PATCH] Add logcount Mark Oteiza 2017-09-30 22:48 ` Paul Eggert 0 siblings, 2 replies; 16+ messages in thread From: Mark Oteiza @ 2016-02-15 23:18 UTC (permalink / raw) To: 22689 Wishlist: Logcount, AKA popcount, Hamming weight, sideways sum. Basically, counting the number of 1s in a number's binary representation. The concept is similar to that in `bool-vector-count-population', except the argument would be a number, presumably a non-negative integer. There are a number of ways to go about it, and beyond that I'm not sure how to write it into data.c: - Some compilers have a __builtin_popcount (gcc since 2004 and llvm since 2005) - gmp has a popcount function - lots of algorithms implementing it Many are given here: https://en.wikipedia.org/wiki/Hamming_weight another resource: http://rosettacode.org/wiki/Population_count Guile's implementation uses table lookup: http://git.savannah.gnu.org/cgit/guile.git/tree/libguile/numbers.c#n5234 When I needed it, I just naïvely ported an algorithm to elisp (in particular ignoring the case of negative numbers): (let ((m1 #x555555555555555) (m2 #x333333333333333) (m4 #x0f0f0f0f0f0f0f0f) (h01 #x0101010101010101)) (setq x (- x (logand (lsh x -1) m1))) (setq x (+ (logand x m2) (logand (lsh x -2) m2))) (setq x (logand (+ x (lsh x -4)) m4)) (lsh (* x h01) -56)) The table lookup isn't so different (defvar logtab [0 1 1 2 1 2 2 3 1 2 2 3 2 3 3 4]) (let ((count 0)) (while (not (zerop x)) (setq count (+ count (aref logtab (logand 15 x)))) (setq x (lsh x -4))) count) I couldn't find any implementation of this in calc either, but such a function/operation would fit in calc-bin. ^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount 2016-02-15 23:18 bug#22689: 25.1.50; implement logcount Mark Oteiza @ 2017-09-29 17:41 ` Mark Oteiza 2017-09-30 11:42 ` Eli Zaretskii 2017-09-30 22:48 ` Paul Eggert 1 sibling, 1 reply; 16+ messages in thread From: Mark Oteiza @ 2017-09-29 17:41 UTC (permalink / raw) To: 22689 Hi, I made an attempt implementing this: diff --git a/src/data.c b/src/data.c index e070be6c20..1332173dcd 100644 --- a/src/data.c +++ b/src/data.c @@ -3069,6 +3069,57 @@ usage: (logxor &rest INTS-OR-MARKERS) */) return arith_driver (Alogxor, nargs, args); } +#if defined (__POPCNT__) && defined (__GNUC__) && (__GNUC__> 4 || (__GNUC__== 4 && __GNUC_MINOR__> 1)) +#define HAVE_BUILTIN_POPCOUNTLL +#endif + +static uint32_t +bitcount32 (uint32_t b) +{ + b -= (b >> 1) & 0x55555555; + b = (b & 0x33333333) + ((b >> 2) & 0x33333333); + b = (b + (b >> 4)) & 0x0f0f0f0f; + return (b * 0x01010101) >> 24; +} + +static uint64_t +bitcount64 (uint64_t b) +{ + b -= (b >> 1) & 0x5555555555555555; + b = (b & 0x3333333333333333) + ((b >> 2) & 0x3333333333333333); + b = (b + (b >> 4)) & 0x0f0f0f0f0f0f0f0f; + return (b * 0x0101010101010101) >> 56; +} + +DEFUN ("logcount", Flogcount, Slogcount, 1, 1, 0, + doc: /* Return population count of VALUE. */) + (register Lisp_Object value) +{ + Lisp_Object res; + + CHECK_NUMBER (value); + +#ifdef HAVE_BUILTIN_POPCOUNTLL + XSETINT (res, __builtin_popcount (XUINT (value))); +#else /* HAVE_BUILTIN_POPCOUNTLL */ + if (XINT (value) <= UINT_MAX) + XSETINT (res, bitcount32 (XUINT (value))); + else if (XINT (value) <= ULONG_MAX) + XSETINT (res, bitcount64 (XUINT (value))); + else + { + int count; + EMACS_UINT v = XUINT (value); + for (count = 0; v; count++) + { + v &= v - 1; + } + XSETINT (res, v); + } +#endif /* HAVE_BUILTIN_POPCOUNTLL */ + return res; +} + static Lisp_Object ash_lsh_impl (Lisp_Object value, Lisp_Object count, bool lsh) { @@ -3856,6 +3907,7 @@ syms_of_data (void) defsubr (&Slogand); defsubr (&Slogior); defsubr (&Slogxor); + defsubr (&Slogcount); defsubr (&Slsh); defsubr (&Sash); defsubr (&Sadd1); ^ permalink raw reply related [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount 2017-09-29 17:41 ` bug#22689: [PATCH] Add logcount Mark Oteiza @ 2017-09-30 11:42 ` Eli Zaretskii 2017-09-30 13:16 ` Mark Oteiza 0 siblings, 1 reply; 16+ messages in thread From: Eli Zaretskii @ 2017-09-30 11:42 UTC (permalink / raw) To: Mark Oteiza; +Cc: 22689 > Date: Fri, 29 Sep 2017 13:41:34 -0400 > From: Mark Oteiza <mvoteiza@udel.edu> > > I made an attempt implementing this: Thanks. I'm not an expert on these ops, but otherwise the code looks OK to me, modulo the minor comments below. This will eventually need a NEWS entry and some docs in the ELisp manual. > +#if defined (__POPCNT__) && defined (__GNUC__) && (__GNUC__> 4 || (__GNUC__== 4 && __GNUC_MINOR__> 1)) > +#define HAVE_BUILTIN_POPCOUNTLL > +#endif Where does __POPCNT__ definition come from? I don't see it in my GCC's "gcc -dM" output. As for the rest of the condition, I think you should use GNUC_PREREQ, like this: #if GNUC_PREREQ (4, 1, 0) > +static uint64_t > +bitcount64 (uint64_t b) > +{ > + b -= (b >> 1) & 0x5555555555555555; > + b = (b & 0x3333333333333333) + ((b >> 2) & 0x3333333333333333); > + b = (b + (b >> 4)) & 0x0f0f0f0f0f0f0f0f; > + return (b * 0x0101010101010101) >> 56; Shouldn't these constants have a ULL suffix, to make sure they are not truncated to a 32-bit int? > +#else /* HAVE_BUILTIN_POPCOUNTLL */ > + if (XINT (value) <= UINT_MAX) > + XSETINT (res, bitcount32 (XUINT (value))); > + else if (XINT (value) <= ULONG_MAX) > + XSETINT (res, bitcount64 (XUINT (value))); The comparisons against Uxxx_MAX seem to assume that VALUE is unsigned, but that's not guaranteed, right? ^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount 2017-09-30 11:42 ` Eli Zaretskii @ 2017-09-30 13:16 ` Mark Oteiza 2017-09-30 13:59 ` Eli Zaretskii 0 siblings, 1 reply; 16+ messages in thread From: Mark Oteiza @ 2017-09-30 13:16 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 22689 On 30/09/17 at 02:42pm, Eli Zaretskii wrote: > > +#if defined (__POPCNT__) && defined (__GNUC__) && (__GNUC__> 4 || (__GNUC__== 4 && __GNUC_MINOR__> 1)) > > +#define HAVE_BUILTIN_POPCOUNTLL > > +#endif > > Where does __POPCNT__ definition come from? I don't see it in my > GCC's "gcc -dM" output. > > As for the rest of the condition, I think you should use GNUC_PREREQ, > like this: > > #if GNUC_PREREQ (4, 1, 0) I guess it comes from nowhere, that condition and the two helper functions come from here https://rosettacode.org/wiki/Population_count#C > > +static uint64_t > > +bitcount64 (uint64_t b) > > +{ > > + b -= (b >> 1) & 0x5555555555555555; > > + b = (b & 0x3333333333333333) + ((b >> 2) & 0x3333333333333333); > > + b = (b + (b >> 4)) & 0x0f0f0f0f0f0f0f0f; > > + return (b * 0x0101010101010101) >> 56; > > Shouldn't these constants have a ULL suffix, to make sure they are not > truncated to a 32-bit int? Probably, I don't know--I'm out of my comfort zone here. > > +#else /* HAVE_BUILTIN_POPCOUNTLL */ > > + if (XINT (value) <= UINT_MAX) > > + XSETINT (res, bitcount32 (XUINT (value))); > > + else if (XINT (value) <= ULONG_MAX) > > + XSETINT (res, bitcount64 (XUINT (value))); > > The comparisons against Uxxx_MAX seem to assume that VALUE is > unsigned, but that's not guaranteed, right? True. Should I instead be doing XINT (value) <= xxx_MAX && XINT (value) >= xxx_MIN or might there be a better check? For negative values popcount typically counts ones of the two's complement ^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount 2017-09-30 13:16 ` Mark Oteiza @ 2017-09-30 13:59 ` Eli Zaretskii 2017-09-30 14:55 ` Mark Oteiza 0 siblings, 1 reply; 16+ messages in thread From: Eli Zaretskii @ 2017-09-30 13:59 UTC (permalink / raw) To: Mark Oteiza; +Cc: 22689 > Date: Sat, 30 Sep 2017 09:16:36 -0400 > From: Mark Oteiza <mvoteiza@udel.edu> > Cc: 22689@debbugs.gnu.org > > On 30/09/17 at 02:42pm, Eli Zaretskii wrote: > > > +#if defined (__POPCNT__) && defined (__GNUC__) && (__GNUC__> 4 || (__GNUC__== 4 && __GNUC_MINOR__> 1)) > > > +#define HAVE_BUILTIN_POPCOUNTLL > > > +#endif > > > > Where does __POPCNT__ definition come from? I don't see it in my > > GCC's "gcc -dM" output. > > > > As for the rest of the condition, I think you should use GNUC_PREREQ, > > like this: > > > > #if GNUC_PREREQ (4, 1, 0) > > I guess it comes from nowhere, that condition and the two helper > functions come from here > https://rosettacode.org/wiki/Population_count#C I only see that symbol in the GCC sources, which I don't think are relevant here. If you drop the __POPCNT__ part, does the code still work for you? > > > +static uint64_t > > > +bitcount64 (uint64_t b) > > > +{ > > > + b -= (b >> 1) & 0x5555555555555555; > > > + b = (b & 0x3333333333333333) + ((b >> 2) & 0x3333333333333333); > > > + b = (b + (b >> 4)) & 0x0f0f0f0f0f0f0f0f; > > > + return (b * 0x0101010101010101) >> 56; > > > > Shouldn't these constants have a ULL suffix, to make sure they are not > > truncated to a 32-bit int? > > Probably, I don't know--I'm out of my comfort zone here. I'd use the suffixes. > > > +#else /* HAVE_BUILTIN_POPCOUNTLL */ > > > + if (XINT (value) <= UINT_MAX) > > > + XSETINT (res, bitcount32 (XUINT (value))); > > > + else if (XINT (value) <= ULONG_MAX) > > > + XSETINT (res, bitcount64 (XUINT (value))); > > > > The comparisons against Uxxx_MAX seem to assume that VALUE is > > unsigned, but that's not guaranteed, right? > > True. Should I instead be doing > > XINT (value) <= xxx_MAX && > XINT (value) >= xxx_MIN > > or might there be a better check? For negative values popcount > typically counts ones of the two's complement I'd just assign to an unsigned temporary, IIUC the semantics. Btw, on Windows, a long is a 32-bit type, so I think we need to check against ULONG_LONG_MAX as well. ^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount 2017-09-30 13:59 ` Eli Zaretskii @ 2017-09-30 14:55 ` Mark Oteiza 2017-09-30 15:38 ` Eli Zaretskii 2017-09-30 16:50 ` Benjamin Riefenstahl 0 siblings, 2 replies; 16+ messages in thread From: Mark Oteiza @ 2017-09-30 14:55 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 22689 On 30/09/17 at 04:59pm, Eli Zaretskii wrote: > I only see that symbol in the GCC sources, which I don't think are > relevant here. > > If you drop the __POPCNT__ part, does the code still work for you? Yes. > > > > +#else /* HAVE_BUILTIN_POPCOUNTLL */ > > > > + if (XINT (value) <= UINT_MAX) > > > > + XSETINT (res, bitcount32 (XUINT (value))); > > > > + else if (XINT (value) <= ULONG_MAX) > > > > + XSETINT (res, bitcount64 (XUINT (value))); > > > > > > The comparisons against Uxxx_MAX seem to assume that VALUE is > > > unsigned, but that's not guaranteed, right? > > > > True. Should I instead be doing > > > > XINT (value) <= xxx_MAX && > > XINT (value) >= xxx_MIN > > > > or might there be a better check? For negative values popcount > > typically counts ones of the two's complement > > I'd just assign to an unsigned temporary, IIUC the semantics. > > Btw, on Windows, a long is a 32-bit type, so I think we need > to check against ULONG_LONG_MAX as well. I tried implementing your comments (and added a test). I don't know what to do in the #else for ULONG_LONG_MAX. diff --git a/src/data.c b/src/data.c index e070be6c20..8b3866151d 100644 --- a/src/data.c +++ b/src/data.c @@ -3069,6 +3069,66 @@ usage: (logxor &rest INTS-OR-MARKERS) */) return arith_driver (Alogxor, nargs, args); } +#if GNUC_PREREQ (4, 1, 0) +#define HAVE_BUILTIN_POPCOUNTLL +#endif + +#ifndef HAVE_BUILTIN_POPCOUNTLL +static uint32_t +logcount32 (uint32_t b) +{ + b -= (b >> 1) & 0x55555555; + b = (b & 0x33333333) + ((b >> 2) & 0x33333333); + b = (b + (b >> 4)) & 0x0f0f0f0f; + return (b * 0x01010101) >> 24; +} + +static uint64_t +logcount64 (uint64_t b) +{ + b -= (b >> 1) & 0x5555555555555555ULL; + b = (b & 0x3333333333333333ULL) + ((b >> 2) & 0x3333333333333333ULL); + b = (b + (b >> 4)) & 0x0f0f0f0f0f0f0f0fULL; + return (b * 0x0101010101010101ULL) >> 56; +} +#endif /* HAVE_BUILTIN_POPCOUNTLL */ + +DEFUN ("logcount", Flogcount, Slogcount, 1, 1, 0, + doc: /* Return population count of VALUE. +If VALUE is negative, the count is of its two's complement representation. */) + (register Lisp_Object value) +{ + Lisp_Object res; + EMACS_UINT v; + + CHECK_NUMBER (value); + + v = XUINT (value); +#ifdef HAVE_BUILTIN_POPCOUNTLL + if (v <= UINT_MAX) + XSETINT (res, __builtin_popcount (v)); + else if (v <= ULONG_MAX) + XSETINT (res, __builtin_popcountl (v)); + else if (v <= ULONG_LONG_MAX) + XSETINT (res, __builtin_popcountll (v)); +#else /* HAVE_BUILTIN_POPCOUNTLL */ + if (v <= UINT_MAX) + XSETINT (res, logcount32 (v)); + else if (v <= ULONG_MAX) + XSETINT (res, logcount64 (v)); +#endif /* HAVE_BUILTIN_POPCOUNTLL */ + else + { + unsigned int count; + for (count = 0; v; count++) + { + v &= v - 1; + } + XSETINT (res, v); + } + return res; +} + static Lisp_Object ash_lsh_impl (Lisp_Object value, Lisp_Object count, bool lsh) { @@ -3856,6 +3916,7 @@ syms_of_data (void) defsubr (&Slogand); defsubr (&Slogior); defsubr (&Slogxor); + defsubr (&Slogcount); defsubr (&Slsh); defsubr (&Sash); defsubr (&Sadd1); diff --git a/test/src/data-tests.el b/test/src/data-tests.el index 8de8c145d4..d1154cc5c4 100644 --- a/test/src/data-tests.el +++ b/test/src/data-tests.el @@ -107,6 +107,19 @@ (should (isnan (min 1.0 0.0e+NaN))) (should (isnan (min 1.0 0.0e+NaN 1.1)))) +(defun data-tests-popcnt (byte) + "Calculate the Hamming weight of BYTE." + (setq byte (- byte (logand (lsh byte -1) #x55555555))) + (setq byte (+ (logand byte #x33333333) (logand (lsh byte -2) #x33333333))) + (lsh (* (logand (+ byte (lsh byte -4)) #x0f0f0f0f) #x01010101) -24)) + +(ert-deftest data-tests-logcount () + (should (cl-loop for n in (number-sequence 0 255) + always (= (logcount n) (data-tests-popcnt n)))) + ;; https://oeis.org/A000120 + (should (= 11 (logcount 9727))) + (should (= 8 (logcount 9999)))) + ;; Bool vector tests. Compactly represent bool vectors as hex ;; strings. ^ permalink raw reply related [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount 2017-09-30 14:55 ` Mark Oteiza @ 2017-09-30 15:38 ` Eli Zaretskii 2017-09-30 16:03 ` Mark Oteiza 2017-09-30 16:50 ` Benjamin Riefenstahl 1 sibling, 1 reply; 16+ messages in thread From: Eli Zaretskii @ 2017-09-30 15:38 UTC (permalink / raw) To: Mark Oteiza; +Cc: 22689 > Date: Sat, 30 Sep 2017 10:55:25 -0400 > From: Mark Oteiza <mvoteiza@udel.edu> > Cc: 22689@debbugs.gnu.org > > > Btw, on Windows, a long is a 32-bit type, so I think we need > > to check against ULONG_LONG_MAX as well. > > I tried implementing your comments (and added a test). I don't know > what to do in the #else for ULONG_LONG_MAX. Same as you do for ULONG_MAX, I think. ^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount 2017-09-30 15:38 ` Eli Zaretskii @ 2017-09-30 16:03 ` Mark Oteiza 2017-09-30 16:17 ` Eli Zaretskii 0 siblings, 1 reply; 16+ messages in thread From: Mark Oteiza @ 2017-09-30 16:03 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 22689 On 30/09/17 at 06:38pm, Eli Zaretskii wrote: > > Date: Sat, 30 Sep 2017 10:55:25 -0400 > > From: Mark Oteiza <mvoteiza@udel.edu> > > Cc: 22689@debbugs.gnu.org > > > > > Btw, on Windows, a long is a 32-bit type, so I think we need > > > to check against ULONG_LONG_MAX as well. > > > > I tried implementing your comments (and added a test). I don't know > > what to do in the #else for ULONG_LONG_MAX. > > Same as you do for ULONG_MAX, I think. so the following? else if (v <= ULONG_MAX || v <= ULONG_LONG_MAX) XSETINT (res, logcount64 (v)); ^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount 2017-09-30 16:03 ` Mark Oteiza @ 2017-09-30 16:17 ` Eli Zaretskii 2017-09-30 17:07 ` Mark Oteiza 0 siblings, 1 reply; 16+ messages in thread From: Eli Zaretskii @ 2017-09-30 16:17 UTC (permalink / raw) To: Mark Oteiza; +Cc: 22689 > Date: Sat, 30 Sep 2017 12:03:31 -0400 > From: Mark Oteiza <mvoteiza@udel.edu> > Cc: 22689@debbugs.gnu.org > > > > I tried implementing your comments (and added a test). I don't know > > > what to do in the #else for ULONG_LONG_MAX. > > > > Same as you do for ULONG_MAX, I think. > > so the following? > > else if (v <= ULONG_MAX || v <= ULONG_LONG_MAX) > XSETINT (res, logcount64 (v)); Yes. ^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount 2017-09-30 16:17 ` Eli Zaretskii @ 2017-09-30 17:07 ` Mark Oteiza 2017-09-30 17:53 ` Eli Zaretskii 0 siblings, 1 reply; 16+ messages in thread From: Mark Oteiza @ 2017-09-30 17:07 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 22689 On 30/09/17 at 07:17pm, Eli Zaretskii wrote: > > Date: Sat, 30 Sep 2017 12:03:31 -0400 > > From: Mark Oteiza <mvoteiza@udel.edu> > > Cc: 22689@debbugs.gnu.org > > > > > > I tried implementing your comments (and added a test). I don't know > > > > what to do in the #else for ULONG_LONG_MAX. > > > > > > Same as you do for ULONG_MAX, I think. > > > > so the following? > > > > else if (v <= ULONG_MAX || v <= ULONG_LONG_MAX) > > XSETINT (res, logcount64 (v)); > > Yes. Ok. Patch including this, Benjamin's catch, and documentation. doc/lispref/numbers.texi | 10 ++++++++ etc/NEWS | 3 +++ src/data.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++ test/src/data-tests.el | 13 +++++++++++ 4 files changed, 87 insertions(+) diff --git a/doc/lispref/numbers.texi b/doc/lispref/numbers.texi index 3fdc94169b..29c6429eda 100644 --- a/doc/lispref/numbers.texi +++ b/doc/lispref/numbers.texi @@ -1107,6 +1107,16 @@ Bitwise Operations @end example @end defun +@defun logcount integer +This function returns the population count of @var{integer}: the +number of ones in the binary representation of @var{integer}. + +@example +(logcount 42) ; 42 = #b101010 + @result{} 3 +@end example +@end defun + @node Math Functions @section Standard Mathematical Functions @cindex transcendental functions diff --git a/etc/NEWS b/etc/NEWS index 238c7b7ea4..8fbc354fc0 100644 --- a/etc/NEWS +++ b/etc/NEWS @@ -31,6 +31,9 @@ When you add a new item, use the appropriate mark if you are sure it applies, \f * Changes in Emacs 27.1 ++++ +** New function 'logcount' calculates an integer's Hamming weight. + \f * Editing Changes in Emacs 27.1 diff --git a/src/data.c b/src/data.c index e070be6c20..f470d6ffaa 100644 --- a/src/data.c +++ b/src/data.c @@ -3069,6 +3069,66 @@ usage: (logxor &rest INTS-OR-MARKERS) */) return arith_driver (Alogxor, nargs, args); } +#if GNUC_PREREQ (4, 1, 0) +#define HAVE_BUILTIN_POPCOUNTLL +#endif + +#ifndef HAVE_BUILTIN_POPCOUNTLL +static uint32_t +logcount32 (uint32_t b) +{ + b -= (b >> 1) & 0x55555555; + b = (b & 0x33333333) + ((b >> 2) & 0x33333333); + b = (b + (b >> 4)) & 0x0f0f0f0f; + return (b * 0x01010101) >> 24; +} + +static uint64_t +logcount64 (uint64_t b) +{ + b -= (b >> 1) & 0x5555555555555555ULL; + b = (b & 0x3333333333333333ULL) + ((b >> 2) & 0x3333333333333333ULL); + b = (b + (b >> 4)) & 0x0f0f0f0f0f0f0f0fULL; + return (b * 0x0101010101010101ULL) >> 56; +} +#endif /* HAVE_BUILTIN_POPCOUNTLL */ + +DEFUN ("logcount", Flogcount, Slogcount, 1, 1, 0, + doc: /* Return population count of VALUE. +If VALUE is negative, the count is of its two's complement representation. */) + (register Lisp_Object value) +{ + Lisp_Object res; + EMACS_UINT v; + + CHECK_NUMBER (value); + + v = XUINT (value); +#ifdef HAVE_BUILTIN_POPCOUNTLL + if (v <= UINT_MAX) + XSETINT (res, __builtin_popcount (v)); + else if (v <= ULONG_MAX) + XSETINT (res, __builtin_popcountl (v)); + else if (v <= ULONG_LONG_MAX) + XSETINT (res, __builtin_popcountll (v)); +#else /* HAVE_BUILTIN_POPCOUNTLL */ + if (v <= UINT_MAX) + XSETINT (res, logcount32 (v)); + else if (v <= ULONG_MAX || v <= ULONG_LONG_MAX) + XSETINT (res, logcount64 (v)); +#endif /* HAVE_BUILTIN_POPCOUNTLL */ + else + { + unsigned int count; + for (count = 0; v; count++) + { + v &= v - 1; + } + XSETINT (res, count); + } + return res; +} + static Lisp_Object ash_lsh_impl (Lisp_Object value, Lisp_Object count, bool lsh) { @@ -3856,6 +3916,7 @@ syms_of_data (void) defsubr (&Slogand); defsubr (&Slogior); defsubr (&Slogxor); + defsubr (&Slogcount); defsubr (&Slsh); defsubr (&Sash); defsubr (&Sadd1); diff --git a/test/src/data-tests.el b/test/src/data-tests.el index 8de8c145d4..d1154cc5c4 100644 --- a/test/src/data-tests.el +++ b/test/src/data-tests.el @@ -107,6 +107,19 @@ (should (isnan (min 1.0 0.0e+NaN))) (should (isnan (min 1.0 0.0e+NaN 1.1)))) +(defun data-tests-popcnt (byte) + "Calculate the Hamming weight of BYTE." + (setq byte (- byte (logand (lsh byte -1) #x55555555))) + (setq byte (+ (logand byte #x33333333) (logand (lsh byte -2) #x33333333))) + (lsh (* (logand (+ byte (lsh byte -4)) #x0f0f0f0f) #x01010101) -24)) + +(ert-deftest data-tests-logcount () + (should (cl-loop for n in (number-sequence 0 255) + always (= (logcount n) (data-tests-popcnt n)))) + ;; https://oeis.org/A000120 + (should (= 11 (logcount 9727))) + (should (= 8 (logcount 9999)))) + ;; Bool vector tests. Compactly represent bool vectors as hex ;; strings. ^ permalink raw reply related [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount 2017-09-30 17:07 ` Mark Oteiza @ 2017-09-30 17:53 ` Eli Zaretskii 2017-09-30 18:18 ` Mark Oteiza 0 siblings, 1 reply; 16+ messages in thread From: Eli Zaretskii @ 2017-09-30 17:53 UTC (permalink / raw) To: Mark Oteiza; +Cc: 22689 > Date: Sat, 30 Sep 2017 13:07:58 -0400 > From: Mark Oteiza <mvoteiza@udel.edu> > Cc: 22689@debbugs.gnu.org > > Ok. Patch including this, Benjamin's catch, and documentation. Thanks. A minor comment on the docs: > +@defun logcount integer > +This function returns the population count of @var{integer}: the > +number of ones in the binary representation of @var{integer}. I would mention here the "Hamming weight" term (in @dfn) that is referenced in NEWS, and I would also add a few index entries: @cindex popcount @cindex Hamming weight @cindex counting set bits ^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount 2017-09-30 17:53 ` Eli Zaretskii @ 2017-09-30 18:18 ` Mark Oteiza 0 siblings, 0 replies; 16+ messages in thread From: Mark Oteiza @ 2017-09-30 18:18 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 22689-done On 30/09/17 at 08:53pm, Eli Zaretskii wrote: > > Date: Sat, 30 Sep 2017 13:07:58 -0400 > > From: Mark Oteiza <mvoteiza@udel.edu> > > Cc: 22689@debbugs.gnu.org > > > > Ok. Patch including this, Benjamin's catch, and documentation. > > Thanks. A minor comment on the docs: > > > +@defun logcount integer > > +This function returns the population count of @var{integer}: the > > +number of ones in the binary representation of @var{integer}. > > I would mention here the "Hamming weight" term (in @dfn) that is > referenced in NEWS, and I would also add a few index entries: > > @cindex popcount > @cindex Hamming weight > @cindex counting set bits Done, pushed as 185f3334. Thank you. ^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount 2017-09-30 14:55 ` Mark Oteiza 2017-09-30 15:38 ` Eli Zaretskii @ 2017-09-30 16:50 ` Benjamin Riefenstahl 2017-09-30 16:59 ` Mark Oteiza 1 sibling, 1 reply; 16+ messages in thread From: Benjamin Riefenstahl @ 2017-09-30 16:50 UTC (permalink / raw) To: Mark Oteiza; +Cc: 22689 Hi Mark, Just a drive-by note: Mark Oteiza writes: > + { > + unsigned int count; > + for (count = 0; v; count++) > + { > + v &= v - 1; > + } > + XSETINT (res, v); > + } Isn't this a fancy way of just saying "XSETINT (res, 0)"? The variable "count" is incremented but never used, so without it the loop degenerates to while (v) { v &= v - 1; } I other words, this just loops until "v == 0". Maybe that assignment should be "XSETINT (res, count)"? That would actually use the variable "count". benny ^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount 2017-09-30 16:50 ` Benjamin Riefenstahl @ 2017-09-30 16:59 ` Mark Oteiza 0 siblings, 0 replies; 16+ messages in thread From: Mark Oteiza @ 2017-09-30 16:59 UTC (permalink / raw) To: Benjamin Riefenstahl; +Cc: 22689 On 30/09/17 at 06:50pm, Benjamin Riefenstahl wrote: > Hi Mark, > > Just a drive-by note: > > Mark Oteiza writes: > > + { > > + unsigned int count; > > + for (count = 0; v; count++) > > + { > > + v &= v - 1; > > + } > > + XSETINT (res, v); > > + } > > Isn't this a fancy way of just saying "XSETINT (res, 0)"? The variable > "count" is incremented but never used, so without it the loop > degenerates to > > while (v) > { > v &= v - 1; > } > > I other words, this just loops until "v == 0". > > Maybe that assignment should be "XSETINT (res, count)"? That would > actually use the variable "count". I think you're right, thank you. ^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount 2016-02-15 23:18 bug#22689: 25.1.50; implement logcount Mark Oteiza 2017-09-29 17:41 ` bug#22689: [PATCH] Add logcount Mark Oteiza @ 2017-09-30 22:48 ` Paul Eggert 2017-09-30 23:22 ` Mark Oteiza 1 sibling, 1 reply; 16+ messages in thread From: Paul Eggert @ 2017-09-30 22:48 UTC (permalink / raw) To: Mark Oteiza; +Cc: 22689 [-- Attachment #1: Type: text/plain, Size: 509 bytes --] I looked at this patch after it landed in 'master'. Thanks for contributing it. A couple of minor improvements. First, the implementation of logcount can use the count_one_bits functions already used by Emacs, rather than reinventing that wheel; these functions should be just as fast as __builtin_popcount etc. when using GCC. Second, logcount should count zero bits in negative numbers, for compatibility with Common Lisp. I installed the attached two patches to implement these improvements. [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-Simplify-logcount-implementation.patch --] [-- Type: text/x-patch; name="0001-Simplify-logcount-implementation.patch", Size: 2608 bytes --] From 6fe5f5db496f5962e1504991bf77e2b7d23a681f Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert@cs.ucla.edu> Date: Sat, 30 Sep 2017 13:12:33 -0700 Subject: [PATCH 1/2] Simplify logcount implementation * src/data.c (HAVE_BUILTIN_POPCOUNTLL, logcount32, logcount64): Remove. (Flogcount): Simplify by using count_one_bits. --- src/data.c | 60 +++++++----------------------------------------------------- 1 file changed, 7 insertions(+), 53 deletions(-) diff --git a/src/data.c b/src/data.c index b595e3fb1a..fd8cdd19aa 100644 --- a/src/data.c +++ b/src/data.c @@ -3069,64 +3069,18 @@ usage: (logxor &rest INTS-OR-MARKERS) */) return arith_driver (Alogxor, nargs, args); } -#if GNUC_PREREQ (4, 1, 0) -#define HAVE_BUILTIN_POPCOUNTLL -#endif - -#ifndef HAVE_BUILTIN_POPCOUNTLL -static uint32_t -logcount32 (uint32_t b) -{ - b -= (b >> 1) & 0x55555555; - b = (b & 0x33333333) + ((b >> 2) & 0x33333333); - b = (b + (b >> 4)) & 0x0f0f0f0f; - return (b * 0x01010101) >> 24; -} - -static uint64_t -logcount64 (uint64_t b) -{ - b -= (b >> 1) & 0x5555555555555555ULL; - b = (b & 0x3333333333333333ULL) + ((b >> 2) & 0x3333333333333333ULL); - b = (b + (b >> 4)) & 0x0f0f0f0f0f0f0f0fULL; - return (b * 0x0101010101010101ULL) >> 56; -} -#endif /* HAVE_BUILTIN_POPCOUNTLL */ - DEFUN ("logcount", Flogcount, Slogcount, 1, 1, 0, doc: /* Return population count of VALUE. If VALUE is negative, the count is of its two's complement representation. */) - (register Lisp_Object value) + (Lisp_Object value) { - Lisp_Object res; - EMACS_UINT v; - CHECK_NUMBER (value); - - v = XUINT (value); -#ifdef HAVE_BUILTIN_POPCOUNTLL - if (v <= UINT_MAX) - XSETINT (res, __builtin_popcount (v)); - else if (v <= ULONG_MAX) - XSETINT (res, __builtin_popcountl (v)); - else if (v <= ULONG_LONG_MAX) - XSETINT (res, __builtin_popcountll (v)); -#else /* HAVE_BUILTIN_POPCOUNTLL */ - if (v <= UINT_MAX) - XSETINT (res, logcount32 (v)); - else if (v <= ULONG_MAX || v <= ULONG_LONG_MAX) - XSETINT (res, logcount64 (v)); -#endif /* HAVE_BUILTIN_POPCOUNTLL */ - else - { - unsigned int count; - for (count = 0; v; count++) - { - v &= v - 1; - } - XSETINT (res, count); - } - return res; + EMACS_UINT v = XUINT (value); + return make_number (EMACS_UINT_WIDTH <= UINT_WIDTH + ? count_one_bits (v) + : EMACS_UINT_WIDTH <= ULONG_WIDTH + ? count_one_bits_l (v) + : count_one_bits_ll (v)); } static Lisp_Object -- 2.13.5 [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #3: 0002-Make-logcount-act-like-CL-on-negative-arg.patch --] [-- Type: text/x-patch; name="0002-Make-logcount-act-like-CL-on-negative-arg.patch", Size: 3148 bytes --] From 3b6d6b03383ac413fd0f25c4ae4fed1c780a3c29 Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert@cs.ucla.edu> Date: Sat, 30 Sep 2017 15:36:52 -0700 Subject: [PATCH 2/2] Make logcount act like CL on negative arg Behaving like Common Lisp is less likely to lead to surprises, as it yields the same answers on 32- vs 64-bit machines. * doc/lispref/numbers.texi (Bitwise Operations): Document behavior on negative integers. * src/data.c (Flogcount): Behave like Common Lisp for negative arguments. * test/src/data-tests.el (data-tests-popcnt) (data-tests-logcount): Test negative args too. --- doc/lispref/numbers.texi | 7 ++++++- src/data.c | 6 ++++-- test/src/data-tests.el | 4 +++- 3 files changed, 13 insertions(+), 4 deletions(-) diff --git a/doc/lispref/numbers.texi b/doc/lispref/numbers.texi index 5058063af4..be74b0c611 100644 --- a/doc/lispref/numbers.texi +++ b/doc/lispref/numbers.texi @@ -1113,9 +1113,14 @@ Bitwise Operations @defun logcount integer This function returns the @dfn{Hamming weight} of @var{integer}: the number of ones in the binary representation of @var{integer}. +If @var{integer} is negative, it returns the number of zero bits in +its two's complement binary representation. The result is always +nonnegative. @example -(logcount 42) ; 42 = #b101010 +(logcount 43) ; 43 = #b101011 + @result{} 4 +(logcount -43) ; -43 = #b111...1010101 @result{} 3 @end example @end defun diff --git a/src/data.c b/src/data.c index fd8cdd19aa..2e7f3e017b 100644 --- a/src/data.c +++ b/src/data.c @@ -3071,11 +3071,13 @@ usage: (logxor &rest INTS-OR-MARKERS) */) DEFUN ("logcount", Flogcount, Slogcount, 1, 1, 0, doc: /* Return population count of VALUE. -If VALUE is negative, the count is of its two's complement representation. */) +This is the number of one bits in the two's complement representation +of VALUE. If VALUE is negative, return the number of zero bits in the +representation. */) (Lisp_Object value) { CHECK_NUMBER (value); - EMACS_UINT v = XUINT (value); + EMACS_INT v = XINT (value) < 0 ? -1 - XINT (value) : XINT (value); return make_number (EMACS_UINT_WIDTH <= UINT_WIDTH ? count_one_bits (v) : EMACS_UINT_WIDTH <= ULONG_WIDTH diff --git a/test/src/data-tests.el b/test/src/data-tests.el index d1154cc5c4..374d1689b9 100644 --- a/test/src/data-tests.el +++ b/test/src/data-tests.el @@ -109,12 +109,14 @@ (defun data-tests-popcnt (byte) "Calculate the Hamming weight of BYTE." + (if (< byte 0) + (setq byte (lognot byte))) (setq byte (- byte (logand (lsh byte -1) #x55555555))) (setq byte (+ (logand byte #x33333333) (logand (lsh byte -2) #x33333333))) (lsh (* (logand (+ byte (lsh byte -4)) #x0f0f0f0f) #x01010101) -24)) (ert-deftest data-tests-logcount () - (should (cl-loop for n in (number-sequence 0 255) + (should (cl-loop for n in (number-sequence -255 255) always (= (logcount n) (data-tests-popcnt n)))) ;; https://oeis.org/A000120 (should (= 11 (logcount 9727))) -- 2.13.5 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount 2017-09-30 22:48 ` Paul Eggert @ 2017-09-30 23:22 ` Mark Oteiza 0 siblings, 0 replies; 16+ messages in thread From: Mark Oteiza @ 2017-09-30 23:22 UTC (permalink / raw) To: Paul Eggert; +Cc: 22689 On 30/09/17 at 03:48pm, Paul Eggert wrote: > I looked at this patch after it landed in 'master'. Thanks for contributing it. > > A couple of minor improvements. First, the implementation of logcount can > use the count_one_bits functions already used by Emacs, rather than > reinventing that wheel; these functions should be just as fast as > __builtin_popcount etc. when using GCC. Second, logcount should count zero > bits in negative numbers, for compatibility with Common Lisp. I installed > the attached two patches to implement these improvements. Thank you for the improvements--I was not aware of count-one-bits.h but am pleasantly surprised. ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2017-09-30 23:22 UTC | newest] Thread overview: 16+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-02-15 23:18 bug#22689: 25.1.50; implement logcount Mark Oteiza 2017-09-29 17:41 ` bug#22689: [PATCH] Add logcount Mark Oteiza 2017-09-30 11:42 ` Eli Zaretskii 2017-09-30 13:16 ` Mark Oteiza 2017-09-30 13:59 ` Eli Zaretskii 2017-09-30 14:55 ` Mark Oteiza 2017-09-30 15:38 ` Eli Zaretskii 2017-09-30 16:03 ` Mark Oteiza 2017-09-30 16:17 ` Eli Zaretskii 2017-09-30 17:07 ` Mark Oteiza 2017-09-30 17:53 ` Eli Zaretskii 2017-09-30 18:18 ` Mark Oteiza 2017-09-30 16:50 ` Benjamin Riefenstahl 2017-09-30 16:59 ` Mark Oteiza 2017-09-30 22:48 ` Paul Eggert 2017-09-30 23:22 ` Mark Oteiza
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).