On 9/21/13 12:16 AM, Eli Zaretskii wrote:
>> Date: Fri, 20 Sep 2013 19:49:00 -0700
>> From: Daniel Colascione
>> Cc: Emacs development discussions
>>
>>>> +static inline
>>>> +EMACS_INT
>>>> +popcount_size_t(size_t val)
>>>> +{
>>>> +  EMACS_INT count;
>>>> +
>>>> +#if defined __GNUC__ && BITS_PER_SIZE_T == 64
>>>> +  count = __builtin_popcountll (val);
>>>> +#elif defined __GNUC__ && BITS_PER_SIZE_T == 32
>>>> +  count = __builtin_popcount (val);
>>>> +#elif defined __MSC_VER && BITS_PER_SIZE_T == 64
>>>> +# pragma intrinsic __popcnt64
>>>> +  count = __popcnt64 (val);
>>>> +#elif defined __MSC_VER && BITS_PER_SIZE_T == 32
>>>> +# pragma intrinsic __popcnt
>>>> +  count = __popcnt (val);
>>>> +#else
>>>> +  {
>>>> +    EMACS_INT j;
>>>> +    count = 0;
>>>> +    for (j = 0; j < BITS_PER_SIZE_T; ++j)
>>>> +      count += !!((((size_t) 1) << j) & val);
>>>> +  }
>>>> +#endif
>>>
>>> Why loop? See http://en.wikipedia.org/wiki/Hamming_weight.
>>
>> I didn't want to put a lot of effort into a code path we'll probably
>> never use. Recall that if we're using icc or gcc or Visual C++ or
>> Clang, we'll be using a compiler intrinsic, which will probably compile
>> down to a single machine instruction.
>>
>> By the way: can someone test that the Visual C++ alternate actually
>> works? I don't have access to a Windows machine at the moment.
>
> I don't see why it won't work, per documentation on this page:
>
>   http://msdn.microsoft.com/en-us/library/bb385231%28v=vs.90%29.aspx
>
> However, I think you will need to make usage of these intrinsics
> compiler version dependent. GCC supports them starting from 3.4,
> whereas MSVC seems to support them since Studio 2008, i.e. _MSC_VER =
> 1500 or higher.

Fair enough.

> It is also not clear to me what will the MSVC intrinsic do if the
> binary ever runs on a CPU that doesn't support SSE4, the MSDN
> documentation seems to say that the results are unpredictable,
> i.e. that there's no fallback, like GCC has in libgcc. So perhaps we
> should also guard that with a Windows version (assuming that old
> machines will only ever run Windows 9x).

Good point. SSE4 is much too recent to require. Making the cpuid check
shouldn't be too hard. I have no way to actually test the fallback,
though. (It's easy to test the fallback code, but not that easy to test
falling back to it.)
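
For illustration only, here is a rough sketch of how the three points
raised in this exchange might fit together: a loop-free fallback along
the lines of the Hamming weight article, compiler-version guards (GCC
3.4 or later, _MSC_VER 1500 or later), and a cpuid check before the
Visual C++ intrinsic is used. This is not the patch under discussion.
The names popcount_portable, popcount_naive, cpu_has_popcnt,
popcount_size_t_sketch and popcount_check are hypothetical, the sketch
tests _MSC_VER (the spelling used in the review comment) rather than
the __MSC_VER of the quoted patch, and it checks the dedicated POPCNT
bit in CPUID leaf 1 rather than SSE4 as a whole. The Visual C++ branch
is written from the documentation and assumes an x86 or x64 target; it
has not been exercised here.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#if defined _MSC_VER && _MSC_VER >= 1500 \
    && (defined _M_IX86 || defined _M_X64)
# include <intrin.h>
# define SKETCH_USE_MSVC_POPCNT 1   /* hypothetical macro name */
#endif

/* Loop-free fallback: the parallel bit count described in the
   Hamming weight article.  Narrower values are zero-extended.  */
static int
popcount_portable (uint64_t x)
{
  x = x - ((x >> 1) & UINT64_C (0x5555555555555555));
  x = (x & UINT64_C (0x3333333333333333))
      + ((x >> 2) & UINT64_C (0x3333333333333333));
  x = (x + (x >> 4)) & UINT64_C (0x0f0f0f0f0f0f0f0f);
  return (int) ((x * UINT64_C (0x0101010101010101)) >> 56);
}

/* The bit-by-bit loop from the quoted patch, kept as a reference.  */
static int
popcount_naive (size_t val)
{
  int count = 0;
  size_t j;
  for (j = 0; j < sizeof val * CHAR_BIT; ++j)
    count += !!((((size_t) 1) << j) & val);
  return count;
}

#ifdef SKETCH_USE_MSVC_POPCNT
/* CPUID leaf 1 reports POPCNT support in bit 23 of ECX.  */
static int
cpu_has_popcnt (void)
{
  int regs[4];
  __cpuid (regs, 1);
  return (regs[2] >> 23) & 1;
}
#endif

static int
popcount_size_t_sketch (size_t val)
{
#if defined __GNUC__ \
    && (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 4))
  /* GCC supplies its own fallback (via libgcc) when the target
     has no popcount instruction.  */
  return __builtin_popcountll (val);
#elif defined SKETCH_USE_MSVC_POPCNT
  static int has_popcnt = -1;   /* -1 means "not probed yet" */
  if (has_popcnt < 0)
    has_popcnt = cpu_has_popcnt ();
  if (has_popcnt)
    {
# ifdef _M_X64
      return (int) __popcnt64 (val);
# else
      return (int) __popcnt ((unsigned int) val);
# endif
    }
  return popcount_portable (val);
#else
  return popcount_portable (val);
#endif
}

/* Aborts if the implementations disagree on VAL.  */
static void
popcount_check (size_t val)
{
  assert (popcount_portable (val) == popcount_naive (val));
  assert (popcount_size_t_sketch (val) == popcount_naive (val));
}
```

Calling popcount_check on a handful of values exercises the fallback
code itself. It does not exercise the act of falling back on a CPU
without the instruction, which is the part noted above as hard to
test.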