* bug#22689: 25.1.50; implement logcount
@ 2016-02-15 23:18 Mark Oteiza
2017-09-29 17:41 ` bug#22689: [PATCH] Add logcount Mark Oteiza
2017-09-30 22:48 ` Paul Eggert
0 siblings, 2 replies; 16+ messages in thread
From: Mark Oteiza @ 2016-02-15 23:18 UTC (permalink / raw)
To: 22689
Wishlist:
Logcount, AKA popcount, Hamming weight, sideways sum. Basically,
counting the number of 1s in a number's binary representation. The
concept is similar to that in `bool-vector-count-population', except the
argument would be a number, presumably a non-negative integer.
There are a number of ways to go about it, and beyond that I'm not sure
how to write it into data.c:
- Some compilers have a __builtin_popcount (gcc since 2004 and llvm
since 2005)
- gmp has a popcount function
- lots of algorithms implementing it
Many are given here:
https://en.wikipedia.org/wiki/Hamming_weight
another resource:
http://rosettacode.org/wiki/Population_count
Guile's implementation uses table lookup:
http://git.savannah.gnu.org/cgit/guile.git/tree/libguile/numbers.c#n5234
When I needed it, I just naïvely ported an algorithm to elisp (in
particular ignoring the case of negative numbers):
(let ((m1 #x555555555555555)
(m2 #x333333333333333)
(m4 #x0f0f0f0f0f0f0f0f)
(h01 #x0101010101010101))
(setq x (- x (logand (lsh x -1) m1)))
(setq x (+ (logand x m2) (logand (lsh x -2) m2)))
(setq x (logand (+ x (lsh x -4)) m4))
(lsh (* x h01) -56))
The table lookup isn't so different
(defvar logtab [0 1 1 2 1 2 2 3 1 2 2 3 2 3 3 4])
(let ((count 0))
(while (not (zerop x))
(setq count (+ count (aref logtab (logand 15 x))))
(setq x (lsh x -4)))
count)
I couldn't find any implementation of this in calc either, but such a
function/operation would fit in calc-bin.
^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount
2016-02-15 23:18 bug#22689: 25.1.50; implement logcount Mark Oteiza
@ 2017-09-29 17:41 ` Mark Oteiza
2017-09-30 11:42 ` Eli Zaretskii
2017-09-30 22:48 ` Paul Eggert
1 sibling, 1 reply; 16+ messages in thread
From: Mark Oteiza @ 2017-09-29 17:41 UTC (permalink / raw)
To: 22689
Hi,
I made an attempt implementing this:
diff --git a/src/data.c b/src/data.c
index e070be6c20..1332173dcd 100644
--- a/src/data.c
+++ b/src/data.c
@@ -3069,6 +3069,57 @@ usage: (logxor &rest INTS-OR-MARKERS) */)
return arith_driver (Alogxor, nargs, args);
}
+#if defined (__POPCNT__) && defined (__GNUC__) && (__GNUC__> 4 || (__GNUC__== 4 && __GNUC_MINOR__> 1))
+#define HAVE_BUILTIN_POPCOUNTLL
+#endif
+
+static uint32_t
+bitcount32 (uint32_t b)
+{
+ b -= (b >> 1) & 0x55555555;
+ b = (b & 0x33333333) + ((b >> 2) & 0x33333333);
+ b = (b + (b >> 4)) & 0x0f0f0f0f;
+ return (b * 0x01010101) >> 24;
+}
+
+static uint64_t
+bitcount64 (uint64_t b)
+{
+ b -= (b >> 1) & 0x5555555555555555;
+ b = (b & 0x3333333333333333) + ((b >> 2) & 0x3333333333333333);
+ b = (b + (b >> 4)) & 0x0f0f0f0f0f0f0f0f;
+ return (b * 0x0101010101010101) >> 56;
+}
+
+DEFUN ("logcount", Flogcount, Slogcount, 1, 1, 0,
+ doc: /* Return population count of VALUE. */)
+ (register Lisp_Object value)
+{
+ Lisp_Object res;
+
+ CHECK_NUMBER (value);
+
+#ifdef HAVE_BUILTIN_POPCOUNTLL
+ XSETINT (res, __builtin_popcount (XUINT (value)));
+#else /* HAVE_BUILTIN_POPCOUNTLL */
+ if (XINT (value) <= UINT_MAX)
+ XSETINT (res, bitcount32 (XUINT (value)));
+ else if (XINT (value) <= ULONG_MAX)
+ XSETINT (res, bitcount64 (XUINT (value)));
+ else
+ {
+ int count;
+ EMACS_UINT v = XUINT (value);
+ for (count = 0; v; count++)
+ {
+ v &= v - 1;
+ }
+ XSETINT (res, v);
+ }
+#endif /* HAVE_BUILTIN_POPCOUNTLL */
+ return res;
+}
+
static Lisp_Object
ash_lsh_impl (Lisp_Object value, Lisp_Object count, bool lsh)
{
@@ -3856,6 +3907,7 @@ syms_of_data (void)
defsubr (&Slogand);
defsubr (&Slogior);
defsubr (&Slogxor);
+ defsubr (&Slogcount);
defsubr (&Slsh);
defsubr (&Sash);
defsubr (&Sadd1);
^ permalink raw reply related [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount
2017-09-29 17:41 ` bug#22689: [PATCH] Add logcount Mark Oteiza
@ 2017-09-30 11:42 ` Eli Zaretskii
2017-09-30 13:16 ` Mark Oteiza
0 siblings, 1 reply; 16+ messages in thread
From: Eli Zaretskii @ 2017-09-30 11:42 UTC (permalink / raw)
To: Mark Oteiza; +Cc: 22689
> Date: Fri, 29 Sep 2017 13:41:34 -0400
> From: Mark Oteiza <mvoteiza@udel.edu>
>
> I made an attempt implementing this:
Thanks. I'm not an expert on these ops, but otherwise the code looks
OK to me, modulo the minor comments below.
This will eventually need a NEWS entry and some docs in the ELisp
manual.
> +#if defined (__POPCNT__) && defined (__GNUC__) && (__GNUC__> 4 || (__GNUC__== 4 && __GNUC_MINOR__> 1))
> +#define HAVE_BUILTIN_POPCOUNTLL
> +#endif
Where does __POPCNT__ definition come from? I don't see it in my
GCC's "gcc -dM" output.
As for the rest of the condition, I think you should use GNUC_PREREQ,
like this:
#if GNUC_PREREQ (4, 1, 0)
> +static uint64_t
> +bitcount64 (uint64_t b)
> +{
> + b -= (b >> 1) & 0x5555555555555555;
> + b = (b & 0x3333333333333333) + ((b >> 2) & 0x3333333333333333);
> + b = (b + (b >> 4)) & 0x0f0f0f0f0f0f0f0f;
> + return (b * 0x0101010101010101) >> 56;
Shouldn't these constants have a ULL suffix, to make sure they are not
truncated to a 32-bit int?
> +#else /* HAVE_BUILTIN_POPCOUNTLL */
> + if (XINT (value) <= UINT_MAX)
> + XSETINT (res, bitcount32 (XUINT (value)));
> + else if (XINT (value) <= ULONG_MAX)
> + XSETINT (res, bitcount64 (XUINT (value)));
The comparisons against Uxxx_MAX seem to assume that VALUE is
unsigned, but that's not guaranteed, right?
^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount
2017-09-30 11:42 ` Eli Zaretskii
@ 2017-09-30 13:16 ` Mark Oteiza
2017-09-30 13:59 ` Eli Zaretskii
0 siblings, 1 reply; 16+ messages in thread
From: Mark Oteiza @ 2017-09-30 13:16 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 22689
On 30/09/17 at 02:42pm, Eli Zaretskii wrote:
> > +#if defined (__POPCNT__) && defined (__GNUC__) && (__GNUC__> 4 || (__GNUC__== 4 && __GNUC_MINOR__> 1))
> > +#define HAVE_BUILTIN_POPCOUNTLL
> > +#endif
>
> Where does __POPCNT__ definition come from? I don't see it in my
> GCC's "gcc -dM" output.
>
> As for the rest of the condition, I think you should use GNUC_PREREQ,
> like this:
>
> #if GNUC_PREREQ (4, 1, 0)
I guess it comes from nowhere, that condition and the two helper
functions come from here
https://rosettacode.org/wiki/Population_count#C
> > +static uint64_t
> > +bitcount64 (uint64_t b)
> > +{
> > + b -= (b >> 1) & 0x5555555555555555;
> > + b = (b & 0x3333333333333333) + ((b >> 2) & 0x3333333333333333);
> > + b = (b + (b >> 4)) & 0x0f0f0f0f0f0f0f0f;
> > + return (b * 0x0101010101010101) >> 56;
>
> Shouldn't these constants have a ULL suffix, to make sure they are not
> truncated to a 32-bit int?
Probably, I don't know--I'm out of my comfort zone here.
> > +#else /* HAVE_BUILTIN_POPCOUNTLL */
> > + if (XINT (value) <= UINT_MAX)
> > + XSETINT (res, bitcount32 (XUINT (value)));
> > + else if (XINT (value) <= ULONG_MAX)
> > + XSETINT (res, bitcount64 (XUINT (value)));
>
> The comparisons against Uxxx_MAX seem to assume that VALUE is
> unsigned, but that's not guaranteed, right?
True. Should I instead be doing
XINT (value) <= xxx_MAX &&
XINT (value) >= xxx_MIN
or might there be a better check? For negative values popcount
typically counts ones of the two's complement
^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount
2017-09-30 13:16 ` Mark Oteiza
@ 2017-09-30 13:59 ` Eli Zaretskii
2017-09-30 14:55 ` Mark Oteiza
0 siblings, 1 reply; 16+ messages in thread
From: Eli Zaretskii @ 2017-09-30 13:59 UTC (permalink / raw)
To: Mark Oteiza; +Cc: 22689
> Date: Sat, 30 Sep 2017 09:16:36 -0400
> From: Mark Oteiza <mvoteiza@udel.edu>
> Cc: 22689@debbugs.gnu.org
>
> On 30/09/17 at 02:42pm, Eli Zaretskii wrote:
> > > +#if defined (__POPCNT__) && defined (__GNUC__) && (__GNUC__> 4 || (__GNUC__== 4 && __GNUC_MINOR__> 1))
> > > +#define HAVE_BUILTIN_POPCOUNTLL
> > > +#endif
> >
> > Where does __POPCNT__ definition come from? I don't see it in my
> > GCC's "gcc -dM" output.
> >
> > As for the rest of the condition, I think you should use GNUC_PREREQ,
> > like this:
> >
> > #if GNUC_PREREQ (4, 1, 0)
>
> I guess it comes from nowhere, that condition and the two helper
> functions come from here
> https://rosettacode.org/wiki/Population_count#C
I only see that symbol in the GCC sources, which I don't think are
relevant here.
If you drop the __POPCNT__ part, does the code still work for you?
> > > +static uint64_t
> > > +bitcount64 (uint64_t b)
> > > +{
> > > + b -= (b >> 1) & 0x5555555555555555;
> > > + b = (b & 0x3333333333333333) + ((b >> 2) & 0x3333333333333333);
> > > + b = (b + (b >> 4)) & 0x0f0f0f0f0f0f0f0f;
> > > + return (b * 0x0101010101010101) >> 56;
> >
> > Shouldn't these constants have a ULL suffix, to make sure they are not
> > truncated to a 32-bit int?
>
> Probably, I don't know--I'm out of my comfort zone here.
I'd use the suffixes.
> > > +#else /* HAVE_BUILTIN_POPCOUNTLL */
> > > + if (XINT (value) <= UINT_MAX)
> > > + XSETINT (res, bitcount32 (XUINT (value)));
> > > + else if (XINT (value) <= ULONG_MAX)
> > > + XSETINT (res, bitcount64 (XUINT (value)));
> >
> > The comparisons against Uxxx_MAX seem to assume that VALUE is
> > unsigned, but that's not guaranteed, right?
>
> True. Should I instead be doing
>
> XINT (value) <= xxx_MAX &&
> XINT (value) >= xxx_MIN
>
> or might there be a better check? For negative values popcount
> typically counts ones of the two's complement
I'd just assign to an unsigned temporary, IIUC the semantics.
Btw, on Windows, a long is a 32-bit type, so I think we need
to check against ULONG_LONG_MAX as well.
^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount
2017-09-30 13:59 ` Eli Zaretskii
@ 2017-09-30 14:55 ` Mark Oteiza
2017-09-30 15:38 ` Eli Zaretskii
2017-09-30 16:50 ` Benjamin Riefenstahl
0 siblings, 2 replies; 16+ messages in thread
From: Mark Oteiza @ 2017-09-30 14:55 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 22689
On 30/09/17 at 04:59pm, Eli Zaretskii wrote:
> I only see that symbol in the GCC sources, which I don't think are
> relevant here.
>
> If you drop the __POPCNT__ part, does the code still work for you?
Yes.
> > > > +#else /* HAVE_BUILTIN_POPCOUNTLL */
> > > > + if (XINT (value) <= UINT_MAX)
> > > > + XSETINT (res, bitcount32 (XUINT (value)));
> > > > + else if (XINT (value) <= ULONG_MAX)
> > > > + XSETINT (res, bitcount64 (XUINT (value)));
> > >
> > > The comparisons against Uxxx_MAX seem to assume that VALUE is
> > > unsigned, but that's not guaranteed, right?
> >
> > True. Should I instead be doing
> >
> > XINT (value) <= xxx_MAX &&
> > XINT (value) >= xxx_MIN
> >
> > or might there be a better check? For negative values popcount
> > typically counts ones of the two's complement
>
> I'd just assign to an unsigned temporary, IIUC the semantics.
>
> Btw, on Windows, a long is a 32-bit type, so I think we need
> to check against ULONG_LONG_MAX as well.
I tried implementing your comments (and added a test). I don't know
what to do in the #else for ULONG_LONG_MAX.
diff --git a/src/data.c b/src/data.c
index e070be6c20..8b3866151d 100644
--- a/src/data.c
+++ b/src/data.c
@@ -3069,6 +3069,66 @@ usage: (logxor &rest INTS-OR-MARKERS) */)
return arith_driver (Alogxor, nargs, args);
}
+#if GNUC_PREREQ (4, 1, 0)
+#define HAVE_BUILTIN_POPCOUNTLL
+#endif
+
+#ifndef HAVE_BUILTIN_POPCOUNTLL
+static uint32_t
+logcount32 (uint32_t b)
+{
+ b -= (b >> 1) & 0x55555555;
+ b = (b & 0x33333333) + ((b >> 2) & 0x33333333);
+ b = (b + (b >> 4)) & 0x0f0f0f0f;
+ return (b * 0x01010101) >> 24;
+}
+
+static uint64_t
+logcount64 (uint64_t b)
+{
+ b -= (b >> 1) & 0x5555555555555555ULL;
+ b = (b & 0x3333333333333333ULL) + ((b >> 2) & 0x3333333333333333ULL);
+ b = (b + (b >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
+ return (b * 0x0101010101010101ULL) >> 56;
+}
+#endif /* HAVE_BUILTIN_POPCOUNTLL */
+
+DEFUN ("logcount", Flogcount, Slogcount, 1, 1, 0,
+ doc: /* Return population count of VALUE.
+If VALUE is negative, the count is of its two's complement representation. */)
+ (register Lisp_Object value)
+{
+ Lisp_Object res;
+ EMACS_UINT v;
+
+ CHECK_NUMBER (value);
+
+ v = XUINT (value);
+#ifdef HAVE_BUILTIN_POPCOUNTLL
+ if (v <= UINT_MAX)
+ XSETINT (res, __builtin_popcount (v));
+ else if (v <= ULONG_MAX)
+ XSETINT (res, __builtin_popcountl (v));
+ else if (v <= ULONG_LONG_MAX)
+ XSETINT (res, __builtin_popcountll (v));
+#else /* HAVE_BUILTIN_POPCOUNTLL */
+ if (v <= UINT_MAX)
+ XSETINT (res, logcount32 (v));
+ else if (v <= ULONG_MAX)
+ XSETINT (res, logcount64 (v));
+#endif /* HAVE_BUILTIN_POPCOUNTLL */
+ else
+ {
+ unsigned int count;
+ for (count = 0; v; count++)
+ {
+ v &= v - 1;
+ }
+ XSETINT (res, v);
+ }
+ return res;
+}
+
static Lisp_Object
ash_lsh_impl (Lisp_Object value, Lisp_Object count, bool lsh)
{
@@ -3856,6 +3916,7 @@ syms_of_data (void)
defsubr (&Slogand);
defsubr (&Slogior);
defsubr (&Slogxor);
+ defsubr (&Slogcount);
defsubr (&Slsh);
defsubr (&Sash);
defsubr (&Sadd1);
diff --git a/test/src/data-tests.el b/test/src/data-tests.el
index 8de8c145d4..d1154cc5c4 100644
--- a/test/src/data-tests.el
+++ b/test/src/data-tests.el
@@ -107,6 +107,19 @@
(should (isnan (min 1.0 0.0e+NaN)))
(should (isnan (min 1.0 0.0e+NaN 1.1))))
+(defun data-tests-popcnt (byte)
+ "Calculate the Hamming weight of BYTE."
+ (setq byte (- byte (logand (lsh byte -1) #x55555555)))
+ (setq byte (+ (logand byte #x33333333) (logand (lsh byte -2) #x33333333)))
+ (lsh (* (logand (+ byte (lsh byte -4)) #x0f0f0f0f) #x01010101) -24))
+
+(ert-deftest data-tests-logcount ()
+ (should (cl-loop for n in (number-sequence 0 255)
+ always (= (logcount n) (data-tests-popcnt n))))
+ ;; https://oeis.org/A000120
+ (should (= 11 (logcount 9727)))
+ (should (= 8 (logcount 9999))))
+
;; Bool vector tests. Compactly represent bool vectors as hex
;; strings.
^ permalink raw reply related [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount
2017-09-30 14:55 ` Mark Oteiza
@ 2017-09-30 15:38 ` Eli Zaretskii
2017-09-30 16:03 ` Mark Oteiza
2017-09-30 16:50 ` Benjamin Riefenstahl
1 sibling, 1 reply; 16+ messages in thread
From: Eli Zaretskii @ 2017-09-30 15:38 UTC (permalink / raw)
To: Mark Oteiza; +Cc: 22689
> Date: Sat, 30 Sep 2017 10:55:25 -0400
> From: Mark Oteiza <mvoteiza@udel.edu>
> Cc: 22689@debbugs.gnu.org
>
> > Btw, on Windows, a long is a 32-bit type, so I think we need
> > to check against ULONG_LONG_MAX as well.
>
> I tried implementing your comments (and added a test). I don't know
> what to do in the #else for ULONG_LONG_MAX.
Same as you do for ULONG_MAX, I think.
^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount
2017-09-30 15:38 ` Eli Zaretskii
@ 2017-09-30 16:03 ` Mark Oteiza
2017-09-30 16:17 ` Eli Zaretskii
0 siblings, 1 reply; 16+ messages in thread
From: Mark Oteiza @ 2017-09-30 16:03 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 22689
On 30/09/17 at 06:38pm, Eli Zaretskii wrote:
> > Date: Sat, 30 Sep 2017 10:55:25 -0400
> > From: Mark Oteiza <mvoteiza@udel.edu>
> > Cc: 22689@debbugs.gnu.org
> >
> > > Btw, on Windows, a long is a 32-bit type, so I think we need
> > > to check against ULONG_LONG_MAX as well.
> >
> > I tried implementing your comments (and added a test). I don't know
> > what to do in the #else for ULONG_LONG_MAX.
>
> Same as you do for ULONG_MAX, I think.
so the following?
else if (v <= ULONG_MAX || v <= ULONG_LONG_MAX)
XSETINT (res, logcount64 (v));
^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount
2017-09-30 16:03 ` Mark Oteiza
@ 2017-09-30 16:17 ` Eli Zaretskii
2017-09-30 17:07 ` Mark Oteiza
0 siblings, 1 reply; 16+ messages in thread
From: Eli Zaretskii @ 2017-09-30 16:17 UTC (permalink / raw)
To: Mark Oteiza; +Cc: 22689
> Date: Sat, 30 Sep 2017 12:03:31 -0400
> From: Mark Oteiza <mvoteiza@udel.edu>
> Cc: 22689@debbugs.gnu.org
>
> > > I tried implementing your comments (and added a test). I don't know
> > > what to do in the #else for ULONG_LONG_MAX.
> >
> > Same as you do for ULONG_MAX, I think.
>
> so the following?
>
> else if (v <= ULONG_MAX || v <= ULONG_LONG_MAX)
> XSETINT (res, logcount64 (v));
Yes.
^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount
2017-09-30 14:55 ` Mark Oteiza
2017-09-30 15:38 ` Eli Zaretskii
@ 2017-09-30 16:50 ` Benjamin Riefenstahl
2017-09-30 16:59 ` Mark Oteiza
1 sibling, 1 reply; 16+ messages in thread
From: Benjamin Riefenstahl @ 2017-09-30 16:50 UTC (permalink / raw)
To: Mark Oteiza; +Cc: 22689
Hi Mark,
Just a drive-by note:
Mark Oteiza writes:
> + {
> + unsigned int count;
> + for (count = 0; v; count++)
> + {
> + v &= v - 1;
> + }
> + XSETINT (res, v);
> + }
Isn't this a fancy way of just saying "XSETINT (res, 0)"? The variable
"count" is incremented but never used, so without it the loop
degenerates to
while (v)
{
v &= v - 1;
}
I other words, this just loops until "v == 0".
Maybe that assignment should be "XSETINT (res, count)"? That would
actually use the variable "count".
benny
^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount
2017-09-30 16:50 ` Benjamin Riefenstahl
@ 2017-09-30 16:59 ` Mark Oteiza
0 siblings, 0 replies; 16+ messages in thread
From: Mark Oteiza @ 2017-09-30 16:59 UTC (permalink / raw)
To: Benjamin Riefenstahl; +Cc: 22689
On 30/09/17 at 06:50pm, Benjamin Riefenstahl wrote:
> Hi Mark,
>
> Just a drive-by note:
>
> Mark Oteiza writes:
> > + {
> > + unsigned int count;
> > + for (count = 0; v; count++)
> > + {
> > + v &= v - 1;
> > + }
> > + XSETINT (res, v);
> > + }
>
> Isn't this a fancy way of just saying "XSETINT (res, 0)"? The variable
> "count" is incremented but never used, so without it the loop
> degenerates to
>
> while (v)
> {
> v &= v - 1;
> }
>
> I other words, this just loops until "v == 0".
>
> Maybe that assignment should be "XSETINT (res, count)"? That would
> actually use the variable "count".
I think you're right, thank you.
^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount
2017-09-30 16:17 ` Eli Zaretskii
@ 2017-09-30 17:07 ` Mark Oteiza
2017-09-30 17:53 ` Eli Zaretskii
0 siblings, 1 reply; 16+ messages in thread
From: Mark Oteiza @ 2017-09-30 17:07 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 22689
On 30/09/17 at 07:17pm, Eli Zaretskii wrote:
> > Date: Sat, 30 Sep 2017 12:03:31 -0400
> > From: Mark Oteiza <mvoteiza@udel.edu>
> > Cc: 22689@debbugs.gnu.org
> >
> > > > I tried implementing your comments (and added a test). I don't know
> > > > what to do in the #else for ULONG_LONG_MAX.
> > >
> > > Same as you do for ULONG_MAX, I think.
> >
> > so the following?
> >
> > else if (v <= ULONG_MAX || v <= ULONG_LONG_MAX)
> > XSETINT (res, logcount64 (v));
>
> Yes.
Ok. Patch including this, Benjamin's catch, and documentation.
doc/lispref/numbers.texi | 10 ++++++++
etc/NEWS | 3 +++
src/data.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++
test/src/data-tests.el | 13 +++++++++++
4 files changed, 87 insertions(+)
diff --git a/doc/lispref/numbers.texi b/doc/lispref/numbers.texi
index 3fdc94169b..29c6429eda 100644
--- a/doc/lispref/numbers.texi
+++ b/doc/lispref/numbers.texi
@@ -1107,6 +1107,16 @@ Bitwise Operations
@end example
@end defun
+@defun logcount integer
+This function returns the population count of @var{integer}: the
+number of ones in the binary representation of @var{integer}.
+
+@example
+(logcount 42) ; 42 = #b101010
+ @result{} 3
+@end example
+@end defun
+
@node Math Functions
@section Standard Mathematical Functions
@cindex transcendental functions
diff --git a/etc/NEWS b/etc/NEWS
index 238c7b7ea4..8fbc354fc0 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -31,6 +31,9 @@ When you add a new item, use the appropriate mark if you are sure it applies,
\f
* Changes in Emacs 27.1
++++
+** New function 'logcount' calculates an integer's Hamming weight.
+
\f
* Editing Changes in Emacs 27.1
diff --git a/src/data.c b/src/data.c
index e070be6c20..f470d6ffaa 100644
--- a/src/data.c
+++ b/src/data.c
@@ -3069,6 +3069,66 @@ usage: (logxor &rest INTS-OR-MARKERS) */)
return arith_driver (Alogxor, nargs, args);
}
+#if GNUC_PREREQ (4, 1, 0)
+#define HAVE_BUILTIN_POPCOUNTLL
+#endif
+
+#ifndef HAVE_BUILTIN_POPCOUNTLL
+static uint32_t
+logcount32 (uint32_t b)
+{
+ b -= (b >> 1) & 0x55555555;
+ b = (b & 0x33333333) + ((b >> 2) & 0x33333333);
+ b = (b + (b >> 4)) & 0x0f0f0f0f;
+ return (b * 0x01010101) >> 24;
+}
+
+static uint64_t
+logcount64 (uint64_t b)
+{
+ b -= (b >> 1) & 0x5555555555555555ULL;
+ b = (b & 0x3333333333333333ULL) + ((b >> 2) & 0x3333333333333333ULL);
+ b = (b + (b >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
+ return (b * 0x0101010101010101ULL) >> 56;
+}
+#endif /* HAVE_BUILTIN_POPCOUNTLL */
+
+DEFUN ("logcount", Flogcount, Slogcount, 1, 1, 0,
+ doc: /* Return population count of VALUE.
+If VALUE is negative, the count is of its two's complement representation. */)
+ (register Lisp_Object value)
+{
+ Lisp_Object res;
+ EMACS_UINT v;
+
+ CHECK_NUMBER (value);
+
+ v = XUINT (value);
+#ifdef HAVE_BUILTIN_POPCOUNTLL
+ if (v <= UINT_MAX)
+ XSETINT (res, __builtin_popcount (v));
+ else if (v <= ULONG_MAX)
+ XSETINT (res, __builtin_popcountl (v));
+ else if (v <= ULONG_LONG_MAX)
+ XSETINT (res, __builtin_popcountll (v));
+#else /* HAVE_BUILTIN_POPCOUNTLL */
+ if (v <= UINT_MAX)
+ XSETINT (res, logcount32 (v));
+ else if (v <= ULONG_MAX || v <= ULONG_LONG_MAX)
+ XSETINT (res, logcount64 (v));
+#endif /* HAVE_BUILTIN_POPCOUNTLL */
+ else
+ {
+ unsigned int count;
+ for (count = 0; v; count++)
+ {
+ v &= v - 1;
+ }
+ XSETINT (res, count);
+ }
+ return res;
+}
+
static Lisp_Object
ash_lsh_impl (Lisp_Object value, Lisp_Object count, bool lsh)
{
@@ -3856,6 +3916,7 @@ syms_of_data (void)
defsubr (&Slogand);
defsubr (&Slogior);
defsubr (&Slogxor);
+ defsubr (&Slogcount);
defsubr (&Slsh);
defsubr (&Sash);
defsubr (&Sadd1);
diff --git a/test/src/data-tests.el b/test/src/data-tests.el
index 8de8c145d4..d1154cc5c4 100644
--- a/test/src/data-tests.el
+++ b/test/src/data-tests.el
@@ -107,6 +107,19 @@
(should (isnan (min 1.0 0.0e+NaN)))
(should (isnan (min 1.0 0.0e+NaN 1.1))))
+(defun data-tests-popcnt (byte)
+ "Calculate the Hamming weight of BYTE."
+ (setq byte (- byte (logand (lsh byte -1) #x55555555)))
+ (setq byte (+ (logand byte #x33333333) (logand (lsh byte -2) #x33333333)))
+ (lsh (* (logand (+ byte (lsh byte -4)) #x0f0f0f0f) #x01010101) -24))
+
+(ert-deftest data-tests-logcount ()
+ (should (cl-loop for n in (number-sequence 0 255)
+ always (= (logcount n) (data-tests-popcnt n))))
+ ;; https://oeis.org/A000120
+ (should (= 11 (logcount 9727)))
+ (should (= 8 (logcount 9999))))
+
;; Bool vector tests. Compactly represent bool vectors as hex
;; strings.
^ permalink raw reply related [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount
2017-09-30 17:07 ` Mark Oteiza
@ 2017-09-30 17:53 ` Eli Zaretskii
2017-09-30 18:18 ` Mark Oteiza
0 siblings, 1 reply; 16+ messages in thread
From: Eli Zaretskii @ 2017-09-30 17:53 UTC (permalink / raw)
To: Mark Oteiza; +Cc: 22689
> Date: Sat, 30 Sep 2017 13:07:58 -0400
> From: Mark Oteiza <mvoteiza@udel.edu>
> Cc: 22689@debbugs.gnu.org
>
> Ok. Patch including this, Benjamin's catch, and documentation.
Thanks. A minor comment on the docs:
> +@defun logcount integer
> +This function returns the population count of @var{integer}: the
> +number of ones in the binary representation of @var{integer}.
I would mention here the "Hamming weight" term (in @dfn) that is
referenced in NEWS, and I would also add a few index entries:
@cindex popcount
@cindex Hamming weight
@cindex counting set bits
^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount
2017-09-30 17:53 ` Eli Zaretskii
@ 2017-09-30 18:18 ` Mark Oteiza
0 siblings, 0 replies; 16+ messages in thread
From: Mark Oteiza @ 2017-09-30 18:18 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 22689-done
On 30/09/17 at 08:53pm, Eli Zaretskii wrote:
> > Date: Sat, 30 Sep 2017 13:07:58 -0400
> > From: Mark Oteiza <mvoteiza@udel.edu>
> > Cc: 22689@debbugs.gnu.org
> >
> > Ok. Patch including this, Benjamin's catch, and documentation.
>
> Thanks. A minor comment on the docs:
>
> > +@defun logcount integer
> > +This function returns the population count of @var{integer}: the
> > +number of ones in the binary representation of @var{integer}.
>
> I would mention here the "Hamming weight" term (in @dfn) that is
> referenced in NEWS, and I would also add a few index entries:
>
> @cindex popcount
> @cindex Hamming weight
> @cindex counting set bits
Done, pushed as 185f3334. Thank you.
^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount
2016-02-15 23:18 bug#22689: 25.1.50; implement logcount Mark Oteiza
2017-09-29 17:41 ` bug#22689: [PATCH] Add logcount Mark Oteiza
@ 2017-09-30 22:48 ` Paul Eggert
2017-09-30 23:22 ` Mark Oteiza
1 sibling, 1 reply; 16+ messages in thread
From: Paul Eggert @ 2017-09-30 22:48 UTC (permalink / raw)
To: Mark Oteiza; +Cc: 22689
[-- Attachment #1: Type: text/plain, Size: 509 bytes --]
I looked at this patch after it landed in 'master'. Thanks for contributing it.
A couple of minor improvements. First, the implementation of logcount can use
the count_one_bits functions already used by Emacs, rather than reinventing that
wheel; these functions should be just as fast as __builtin_popcount etc. when
using GCC. Second, logcount should count zero bits in negative numbers, for
compatibility with Common Lisp. I installed the attached two patches to
implement these improvements.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Simplify-logcount-implementation.patch --]
[-- Type: text/x-patch; name="0001-Simplify-logcount-implementation.patch", Size: 2608 bytes --]
From 6fe5f5db496f5962e1504991bf77e2b7d23a681f Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 30 Sep 2017 13:12:33 -0700
Subject: [PATCH 1/2] Simplify logcount implementation
* src/data.c (HAVE_BUILTIN_POPCOUNTLL, logcount32, logcount64):
Remove.
(Flogcount): Simplify by using count_one_bits.
---
src/data.c | 60 +++++++-----------------------------------------------------
1 file changed, 7 insertions(+), 53 deletions(-)
diff --git a/src/data.c b/src/data.c
index b595e3fb1a..fd8cdd19aa 100644
--- a/src/data.c
+++ b/src/data.c
@@ -3069,64 +3069,18 @@ usage: (logxor &rest INTS-OR-MARKERS) */)
return arith_driver (Alogxor, nargs, args);
}
-#if GNUC_PREREQ (4, 1, 0)
-#define HAVE_BUILTIN_POPCOUNTLL
-#endif
-
-#ifndef HAVE_BUILTIN_POPCOUNTLL
-static uint32_t
-logcount32 (uint32_t b)
-{
- b -= (b >> 1) & 0x55555555;
- b = (b & 0x33333333) + ((b >> 2) & 0x33333333);
- b = (b + (b >> 4)) & 0x0f0f0f0f;
- return (b * 0x01010101) >> 24;
-}
-
-static uint64_t
-logcount64 (uint64_t b)
-{
- b -= (b >> 1) & 0x5555555555555555ULL;
- b = (b & 0x3333333333333333ULL) + ((b >> 2) & 0x3333333333333333ULL);
- b = (b + (b >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
- return (b * 0x0101010101010101ULL) >> 56;
-}
-#endif /* HAVE_BUILTIN_POPCOUNTLL */
-
DEFUN ("logcount", Flogcount, Slogcount, 1, 1, 0,
doc: /* Return population count of VALUE.
If VALUE is negative, the count is of its two's complement representation. */)
- (register Lisp_Object value)
+ (Lisp_Object value)
{
- Lisp_Object res;
- EMACS_UINT v;
-
CHECK_NUMBER (value);
-
- v = XUINT (value);
-#ifdef HAVE_BUILTIN_POPCOUNTLL
- if (v <= UINT_MAX)
- XSETINT (res, __builtin_popcount (v));
- else if (v <= ULONG_MAX)
- XSETINT (res, __builtin_popcountl (v));
- else if (v <= ULONG_LONG_MAX)
- XSETINT (res, __builtin_popcountll (v));
-#else /* HAVE_BUILTIN_POPCOUNTLL */
- if (v <= UINT_MAX)
- XSETINT (res, logcount32 (v));
- else if (v <= ULONG_MAX || v <= ULONG_LONG_MAX)
- XSETINT (res, logcount64 (v));
-#endif /* HAVE_BUILTIN_POPCOUNTLL */
- else
- {
- unsigned int count;
- for (count = 0; v; count++)
- {
- v &= v - 1;
- }
- XSETINT (res, count);
- }
- return res;
+ EMACS_UINT v = XUINT (value);
+ return make_number (EMACS_UINT_WIDTH <= UINT_WIDTH
+ ? count_one_bits (v)
+ : EMACS_UINT_WIDTH <= ULONG_WIDTH
+ ? count_one_bits_l (v)
+ : count_one_bits_ll (v));
}
static Lisp_Object
--
2.13.5
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0002-Make-logcount-act-like-CL-on-negative-arg.patch --]
[-- Type: text/x-patch; name="0002-Make-logcount-act-like-CL-on-negative-arg.patch", Size: 3148 bytes --]
From 3b6d6b03383ac413fd0f25c4ae4fed1c780a3c29 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 30 Sep 2017 15:36:52 -0700
Subject: [PATCH 2/2] Make logcount act like CL on negative arg
Behaving like Common Lisp is less likely to lead to surprises,
as it yields the same answers on 32- vs 64-bit machines.
* doc/lispref/numbers.texi (Bitwise Operations):
Document behavior on negative integers.
* src/data.c (Flogcount):
Behave like Common Lisp for negative arguments.
* test/src/data-tests.el (data-tests-popcnt)
(data-tests-logcount): Test negative args too.
---
doc/lispref/numbers.texi | 7 ++++++-
src/data.c | 6 ++++--
test/src/data-tests.el | 4 +++-
3 files changed, 13 insertions(+), 4 deletions(-)
diff --git a/doc/lispref/numbers.texi b/doc/lispref/numbers.texi
index 5058063af4..be74b0c611 100644
--- a/doc/lispref/numbers.texi
+++ b/doc/lispref/numbers.texi
@@ -1113,9 +1113,14 @@ Bitwise Operations
@defun logcount integer
This function returns the @dfn{Hamming weight} of @var{integer}: the
number of ones in the binary representation of @var{integer}.
+If @var{integer} is negative, it returns the number of zero bits in
+its two's complement binary representation. The result is always
+nonnegative.
@example
-(logcount 42) ; 42 = #b101010
+(logcount 43) ; 43 = #b101011
+ @result{} 4
+(logcount -43) ; -43 = #b111...1010101
@result{} 3
@end example
@end defun
diff --git a/src/data.c b/src/data.c
index fd8cdd19aa..2e7f3e017b 100644
--- a/src/data.c
+++ b/src/data.c
@@ -3071,11 +3071,13 @@ usage: (logxor &rest INTS-OR-MARKERS) */)
DEFUN ("logcount", Flogcount, Slogcount, 1, 1, 0,
doc: /* Return population count of VALUE.
-If VALUE is negative, the count is of its two's complement representation. */)
+This is the number of one bits in the two's complement representation
+of VALUE. If VALUE is negative, return the number of zero bits in the
+representation. */)
(Lisp_Object value)
{
CHECK_NUMBER (value);
- EMACS_UINT v = XUINT (value);
+ EMACS_INT v = XINT (value) < 0 ? -1 - XINT (value) : XINT (value);
return make_number (EMACS_UINT_WIDTH <= UINT_WIDTH
? count_one_bits (v)
: EMACS_UINT_WIDTH <= ULONG_WIDTH
diff --git a/test/src/data-tests.el b/test/src/data-tests.el
index d1154cc5c4..374d1689b9 100644
--- a/test/src/data-tests.el
+++ b/test/src/data-tests.el
@@ -109,12 +109,14 @@
(defun data-tests-popcnt (byte)
"Calculate the Hamming weight of BYTE."
+ (if (< byte 0)
+ (setq byte (lognot byte)))
(setq byte (- byte (logand (lsh byte -1) #x55555555)))
(setq byte (+ (logand byte #x33333333) (logand (lsh byte -2) #x33333333)))
(lsh (* (logand (+ byte (lsh byte -4)) #x0f0f0f0f) #x01010101) -24))
(ert-deftest data-tests-logcount ()
- (should (cl-loop for n in (number-sequence 0 255)
+ (should (cl-loop for n in (number-sequence -255 255)
always (= (logcount n) (data-tests-popcnt n))))
;; https://oeis.org/A000120
(should (= 11 (logcount 9727)))
--
2.13.5
^ permalink raw reply related [flat|nested] 16+ messages in thread
* bug#22689: [PATCH] Add logcount
2017-09-30 22:48 ` Paul Eggert
@ 2017-09-30 23:22 ` Mark Oteiza
0 siblings, 0 replies; 16+ messages in thread
From: Mark Oteiza @ 2017-09-30 23:22 UTC (permalink / raw)
To: Paul Eggert; +Cc: 22689
On 30/09/17 at 03:48pm, Paul Eggert wrote:
> I looked at this patch after it landed in 'master'. Thanks for contributing it.
>
> A couple of minor improvements. First, the implementation of logcount can
> use the count_one_bits functions already used by Emacs, rather than
> reinventing that wheel; these functions should be just as fast as
> __builtin_popcount etc. when using GCC. Second, logcount should count zero
> bits in negative numbers, for compatibility with Common Lisp. I installed
> the attached two patches to implement these improvements.
Thank you for the improvements--I was not aware of count-one-bits.h but
am pleasantly surprised.
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2017-09-30 23:22 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-02-15 23:18 bug#22689: 25.1.50; implement logcount Mark Oteiza
2017-09-29 17:41 ` bug#22689: [PATCH] Add logcount Mark Oteiza
2017-09-30 11:42 ` Eli Zaretskii
2017-09-30 13:16 ` Mark Oteiza
2017-09-30 13:59 ` Eli Zaretskii
2017-09-30 14:55 ` Mark Oteiza
2017-09-30 15:38 ` Eli Zaretskii
2017-09-30 16:03 ` Mark Oteiza
2017-09-30 16:17 ` Eli Zaretskii
2017-09-30 17:07 ` Mark Oteiza
2017-09-30 17:53 ` Eli Zaretskii
2017-09-30 18:18 ` Mark Oteiza
2017-09-30 16:50 ` Benjamin Riefenstahl
2017-09-30 16:59 ` Mark Oteiza
2017-09-30 22:48 ` Paul Eggert
2017-09-30 23:22 ` Mark Oteiza
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).