* Re: [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches [not found] ` <20180821204439.62390209A6@vcs0.savannah.gnu.org> @ 2018-08-22 12:36 ` Pip Cet 2018-08-22 13:35 ` Ken Brown 0 siblings, 1 reply; 24+ messages in thread From: Pip Cet @ 2018-08-22 12:36 UTC (permalink / raw) To: emacs-devel, eggert; +Cc: emacs-diffs On Tue, Aug 21, 2018 at 9:01 PM Paul Eggert <eggert@cs.ucla.edu> wrote: > * lisp/vc/vc-hg.el (vc-hg-state-fast): When testing fixnum width, > prefer (zerop (ash most-positive-fixnum -32)) to (zerop (lsh -1 > 32)) (Bug#32485#11). I think the previous code was correct: now we have bignums, we should no longer return 'unsupported on machines with 30/32-bit fixnums, because 32 bits fit just fine into a bignum. Another `most-positive-fixnum' we can eliminate! All appearances of "fixnum" in that file should be replaced by "integer". ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches 2018-08-22 12:36 ` [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches Pip Cet @ 2018-08-22 13:35 ` Ken Brown 2018-08-22 13:44 ` Paul Eggert 2018-08-22 13:49 ` [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches Pip Cet 0 siblings, 2 replies; 24+ messages in thread From: Ken Brown @ 2018-08-22 13:35 UTC (permalink / raw) To: Pip Cet, emacs-devel, eggert On 8/22/2018 8:36 AM, Pip Cet wrote: > On Tue, Aug 21, 2018 at 9:01 PM Paul Eggert <eggert@cs.ucla.edu> wrote: >> * lisp/vc/vc-hg.el (vc-hg-state-fast): When testing fixnum width, >> prefer (zerop (ash most-positive-fixnum -32)) to (zerop (lsh -1 >> 32)) (Bug#32485#11). > > I think the previous code was correct: now we have bignums, we should > no longer return 'unsupported on machines with 30/32-bit fixnums, I think you're missing the point of that code. The function is vc-hg-state-fast. If it returns 'unsupported, that simply means that the slower function will be used. Ken ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches 2018-08-22 13:35 ` Ken Brown @ 2018-08-22 13:44 ` Paul Eggert 2018-08-22 15:50 ` Pip Cet 2018-08-22 13:49 ` [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches Pip Cet 1 sibling, 1 reply; 24+ messages in thread From: Paul Eggert @ 2018-08-22 13:44 UTC (permalink / raw) To: Ken Brown, Pip Cet, emacs-devel Ken Brown wrote: >> I think the previous code was correct: now we have bignums, we should >> no longer return 'unsupported on machines with 30/32-bit fixnums, > > I think you're missing the point of that code. The function is > vc-hg-state-fast. If it returns 'unsupported, that simply means that the slower > function will be used. True, but even bignums should be faster than the slow function, so the code really shouldn't be worrying about integer width. Fixing this, though, will be a bit trickier than simply removing the width check, as I expect the code uses eq on integers. Until that assumption is fixed (or until we start hashing bignums) the integer width check still makes sense. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches 2018-08-22 13:44 ` Paul Eggert @ 2018-08-22 15:50 ` Pip Cet 2018-08-22 17:27 ` Paul Eggert 2018-08-22 20:45 ` Stefan Monnier 0 siblings, 2 replies; 24+ messages in thread From: Pip Cet @ 2018-08-22 15:50 UTC (permalink / raw) To: eggert; +Cc: kbrown, emacs-devel On Wed, Aug 22, 2018 at 1:44 PM Paul Eggert <eggert@cs.ucla.edu> wrote: > True, but even bignums should be faster than the slow function, so the code > really shouldn't be worrying about integer width. Fixing this, though, will be a > bit trickier than simply removing the width check, as I expect the code uses eq > on integers. Until that assumption is fixed (or until we start hashing bignums) > the integer width check still makes sense. Maybe we ought to warn (in debug builds) about eq being used to compare several integers, at least one of which is outside a narrow interval of guaranteed fixnums (and ditto for memq and hashtest_eq). I wonder whether Emacs would even build with MOST_POSITIVE_FIXNUM = 0x100000, but it might be worth it to try and run it that way... That said, vc-hg.el doesn't contain any suspicious-looking instances of `eq', `memq', or `hash'; I think all the author was worried about was reading an u32 into an Emacs integer, which now works. Again, I'm not suggesting to remove `most-positive-fixnum', just not to use it in this particular file. It's "dangerous", but so are many useful Emacs functions (the worst of which is probably kill-emacs, which can lead to a user using a different editor, but minor catastrophes such as segfaults or data loss are also possible). As for `eq' and `eql', what I thought Stefan proposed was to keep a hash value in struct Lisp_Bignum, which is calculated lazily the first time the bignum is compared to another bignum with `eq', at which point we might as well point out to the user that portable code would require `eql'. That sounds to me like it would be much cheaper than a hash table, but if there are objections to it we can still implement it without guaranteeing it for the future in the documentation. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches 2018-08-22 15:50 ` Pip Cet @ 2018-08-22 17:27 ` Paul Eggert 2018-08-22 20:05 ` Pip Cet 2018-08-22 20:45 ` Stefan Monnier 1 sibling, 1 reply; 24+ messages in thread From: Paul Eggert @ 2018-08-22 17:27 UTC (permalink / raw) To: Pip Cet; +Cc: kbrown, emacs-devel [-- Attachment #1: Type: text/plain, Size: 1686 bytes --] Pip Cet wrote: > Maybe we ought to warn (in debug builds) about eq being used to > compare several integers, at least one of which is outside a narrow > interval of guaranteed fixnums (and ditto for memq and hashtest_eq). I > wonder whether Emacs would even build with MOST_POSITIVE_FIXNUM = > 0x100000, but it might be worth it to try and run it that way... We get some of that effect already in 32- vs 64-bit builds. Perhaps you're right and we would benefit from artificially small fixnums, or from debug builds that warn about eq misuse. > That said, vc-hg.el doesn't contain any suspicious-looking instances > of `eq', `memq', or `hash' Unfortunately that doesn't seem to be quite right. I looked and found one incorrect (though quite unlikely to fail) use of eq. I changed it to eql as part of the attached patch, which I installed. > As for `eq' and `eql', what I thought Stefan proposed was to keep a > hash value in struct Lisp_Bignum, which is calculated lazily the first > time the bignum is compared to another bignum with `eq', at which > point we might as well point out to the user that portable code would > require `eql'. That sounds to me like it would be much cheaper than a > hash table, but if there are objections to it we can still implement > it without guaranteeing it for the future in the documentation. It might be helpful to have that in a debugging build. I am still leaning more towards hash tables, though; at least, I want to time them to see how much they actually cost in typical practice. I suspect the cost is rather small, and if so it may well be worth it to avoid hassles like the eq-vs-eql glitch that I just now fixed in vc-hg.el. [-- Attachment #2: 0001-Make-vc-hg-safe-for-bignums.patch --] [-- Type: text/x-patch, Size: 3411 bytes --] From ae38a3b8208a71c32f723776297290ee5096d8d4 Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert@cs.ucla.edu> Date: Wed, 22 Aug 2018 10:18:34 -0700 Subject: [PATCH] Make vc-hg safe for bignums * lisp/vc/vc-hg.el (vc-hg--raw-dirstate-search): Use eql, not eq, on integers that could be bignums. (vc-hg--time-to-integer): Rename from vc-hg--time-to-fixnum. All uses changed. (vc-hg-state-fast): Remove test that 32-bit unsigned values must be fixnums. --- lisp/vc/vc-hg.el | 19 +++++++------------ 1 file changed, 7 insertions(+), 12 deletions(-) diff --git a/lisp/vc/vc-hg.el b/lisp/vc/vc-hg.el index da4fc2bdf7..d11dc4c5f4 100644 --- a/lisp/vc/vc-hg.el +++ b/lisp/vc/vc-hg.el @@ -583,15 +583,14 @@ vc-hg-parse-hg-data-structures (defsubst vc-hg--read-u8 () "Read and advance over an unsigned byte. -Return a fixnum." +Return the byte's value as an integer." (prog1 (char-after) (forward-char))) (defsubst vc-hg--read-u32-be () - "Read and advance over a big-endian unsigned 32-bit integer. -Return a fixnum; on overflow, result is undefined." + "Read and advance over a big-endian unsigned 32-bit integer." ;; Because elisp bytecode has an instruction for multiply and - ;; doesn't have one for lsh, it's somewhat counter-intuitively + ;; doesn't have one for shift, it's somewhat counter-intuitively ;; faster to multiply than to shift. (+ (* (vc-hg--read-u8) (* 256 256 256)) (* (vc-hg--read-u8) (* 256 256)) @@ -627,12 +626,10 @@ vc-hg--raw-dirstate-search ;; hundreds of thousands of times, so performance is important ;; here (while (< (point) search-limit) - ;; 1+4*4 is the length of the dirstate item header, which we - ;; spell as a literal for performance, since the elisp - ;; compiler lacks constant propagation + ;; 1+4*4 is the length of the dirstate item header. (forward-char (1+ (* 3 4))) (let ((this-flen (vc-hg--read-u32-be))) - (if (and (or (eq this-flen flen) + (if (and (or (eql this-flen flen) (and (> this-flen flen) (eq (char-after (+ (point) flen)) 0))) (search-forward fname (+ (point) flen) t)) @@ -917,7 +914,7 @@ vc-hg--ignore-patterns-ignored-p (setf ignored (string-match (pop patterns) filename))) ignored)) -(defun vc-hg--time-to-fixnum (ts) +(defun vc-hg--time-to-integer (ts) (+ (* 65536 (car ts)) (cadr ts))) (defvar vc-hg--cached-ignore-patterns nil @@ -1016,8 +1013,6 @@ vc-hg-state-fast (not (vc-hg--requirements-understood-p repo)) ;; Dirstate too small to be valid (< (nth 7 dirstate-attr) 40) - ;; We want to store 32-bit unsigned values in fixnums. - (zerop (ash most-positive-fixnum -32)) (progn (setf repo-relative-filename (file-relative-name truename repo)) @@ -1042,7 +1037,7 @@ vc-hg-state-fast (let ((vc-hg-size (nth 2 dirstate-entry)) (vc-hg-mtime (nth 3 dirstate-entry)) (fs-size (nth 7 stat)) - (fs-mtime (vc-hg--time-to-fixnum (nth 5 stat)))) + (fs-mtime (vc-hg--time-to-integer (nth 5 stat)))) (if (and (eql vc-hg-size fs-size) (eql vc-hg-mtime fs-mtime)) 'up-to-date 'edited))) -- 2.17.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches 2018-08-22 17:27 ` Paul Eggert @ 2018-08-22 20:05 ` Pip Cet 2018-08-22 21:53 ` Paul Eggert 0 siblings, 1 reply; 24+ messages in thread From: Pip Cet @ 2018-08-22 20:05 UTC (permalink / raw) To: eggert; +Cc: kbrown, emacs-devel On Wed, Aug 22, 2018 at 5:27 PM Paul Eggert <eggert@cs.ucla.edu> wrote: > Pip Cet wrote: > We get some of that effect already in 32- vs 64-bit builds. Perhaps you're right > and we would benefit from artificially small fixnums, or from debug builds that > warn about eq misuse. Well, it's harder to implement this than I thought at first, since we want to change `eq' and `most-positive-fixnum' but not EQ and MOST_POSITIVE_FIXNUM. > > That said, vc-hg.el doesn't contain any suspicious-looking instances > > of `eq', `memq', or `hash' > Unfortunately that doesn't seem to be quite right. I looked and found one > incorrect (though quite unlikely to fail) use of eq. I changed it to eql as part > of the attached patch, which I installed. No objections to that patch, but `eq' is correct in that code: string length is limited by STRING_BYTES_BOUND, which is <= MOST_POSITIVE_FIXNUM, so flen will always be a fixnum. It's also handed to fixnum-only functions immediately afterwards, so even if we allowed non-fixnum-length strings we'd get, at worst, a different signal. > > As for `eq' and `eql', what I thought Stefan proposed was to keep a > > hash value in struct Lisp_Bignum, which is calculated lazily the first > > time the bignum is compared to another bignum with `eq', at which > > point we might as well point out to the user that portable code would > > require `eql'. That sounds to me like it would be much cheaper than a > > hash table, but if there are objections to it we can still implement > > it without guaranteeing it for the future in the documentation. > > It might be helpful to have that in a debugging build. > I am still leaning more towards hash tables, though; It's precisely the opposite for me: I'm leaning towards hash tables in debugging builds, where "which bignums currently exist" is a useful question to ask, and lazy hashing in production builds, where make_number is called often and (soon) shouldn't actually walk through the bignum again to make a copy of it or calculate its hash value. >at least, I want to time > them to see how much they actually cost in typical practice. I suspect the cost > is rather small, and if so it may well be worth it to avoid hassles like the > eq-vs-eql glitch that I just now fixed in vc-hg.el. I suspect the answer is they're cheap in "typical" practice because bignums don't appear there, and expensive enough in atypical situations (cryptography? number theory? Gödel coding?) to make some applications resort to unnecessary and danger-prone C libraries. And we're comparing two proposals which would make eq-vs-eql a non-issue; one would slow down make_number (significantly, I believe, unless we keep the current behavior which includes an unnecessary memcpy), the other would slow down `eq' and possibly `EQ' to require a tag bit check to catch bignum-on-bignum comparisons. That's an xor (unless we change some other things) and a test insn if we give bignums ("vectors with overloaded EQ behavior") their own tag value (yay, now I've used up the tag bit you've freed for the second time). [In fact, I like the idea of extending this to reserving one tag value for "it's complicated" objects, which may but don't have to overload primitives. All the primitives would check OVERLOADEDP(arg), a simple and cheap tag bit check, before doing the right thing for primitive objects. Think of all the number systems we could still implement.] ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches 2018-08-22 20:05 ` Pip Cet @ 2018-08-22 21:53 ` Paul Eggert 0 siblings, 0 replies; 24+ messages in thread From: Paul Eggert @ 2018-08-22 21:53 UTC (permalink / raw) To: Pip Cet; +Cc: kbrown, emacs-devel Pip Cet wrote: > Well, it's harder to implement this than I thought at first, since we > want to change `eq' and `most-positive-fixnum' but not EQ and > MOST_POSITIVE_FIXNUM. If we go this route, we should rename EQ (say, to just E) and write a new EQ function in C that acts like Emacs eq, and similarly rename MOST_POSITIVE_FIXNUM. It'd be pretty confusing otherwise. Not sure it's worth the effort. > No objections to that patch, but `eq' is correct in that code: string > length is limited by STRING_BYTES_BOUND, Ah, good catch. I got confused because one argument of the 'eq' call can be a bignum; trouble can't occur unless both are bignums. I changed that eql back to eq. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches 2018-08-22 15:50 ` Pip Cet 2018-08-22 17:27 ` Paul Eggert @ 2018-08-22 20:45 ` Stefan Monnier 2018-08-23 14:55 ` Pip Cet 1 sibling, 1 reply; 24+ messages in thread From: Stefan Monnier @ 2018-08-22 20:45 UTC (permalink / raw) To: emacs-devel > As for `eq' and `eql', what I thought Stefan proposed was to keep a > hash value in struct Lisp_Bignum, which is calculated lazily the first > time the bignum is compared to another bignum with `eq', at which > point we might as well point out to the user that portable code would > require `eql'. That sounds to me like it would be much cheaper than a > hash table, but if there are objections to it we can still implement > it without guaranteeing it for the future in the documentation. No, that's definitely not what I was proposing. I was proposing an actual hash-table. What you're suggesting is problematic in that it assumes `eq` will do something special for bignums, which implies `eq` is not just `==` in C. Stefan ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches 2018-08-22 20:45 ` Stefan Monnier @ 2018-08-23 14:55 ` Pip Cet 2018-08-23 15:56 ` Stefan Monnier 0 siblings, 1 reply; 24+ messages in thread From: Pip Cet @ 2018-08-23 14:55 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel Hello, Stefan. On Wed, Aug 22, 2018 at 8:46 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote: > > > As for `eq' and `eql', what I thought Stefan proposed was to keep a > > hash value in struct Lisp_Bignum, which is calculated lazily the first > > time the bignum is compared to another bignum with `eq', at which > > point we might as well point out to the user that portable code would > > require `eql'. That sounds to me like it would be much cheaper than a > > hash table, but if there are objections to it we can still implement > > it without guaranteeing it for the future in the documentation. > > No, that's definitely not what I was proposing. I was proposing an > actual hash-table. I'm sorry, I totally misread what you wrote. No, you weren't proposing anything like that. > What you're suggesting is problematic in that it > assumes `eq` will do something special for bignums, which implies `eq` > is not just `==` in C. Well, it isn't `==' if we're building with --enable-check-lisp-object-type, so the problems would be memcmp or hashing memory that might contain Lisp_Objects. I'm not aware of any places that do either of those. I've got something running and things appear not to fail catastrophically right away, at least, with EQ changed to include an extra condition for "overloaded" objects. Assuming some compiler smarts, NILP wouldn't suffer, and EQ wouldn't suffer if one side is known to have a normal type. I doubt the performance hit will be measurable; if anything, the if-fixnum-do-this-if-bignum-do-that code we have right now might be worse for caches than making the bignum code a cold function. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches 2018-08-23 14:55 ` Pip Cet @ 2018-08-23 15:56 ` Stefan Monnier 2018-08-24 18:00 ` Pip Cet 2018-08-28 14:08 ` hash-consing bignums and eq==eql Stefan Monnier 0 siblings, 2 replies; 24+ messages in thread From: Stefan Monnier @ 2018-08-23 15:56 UTC (permalink / raw) To: Pip Cet; +Cc: emacs-devel > Well, it isn't `==' if we're building with > --enable-check-lisp-object-type, Currently, the resulting assembly code should be pretty close even in that case, tho. > extra condition for "overloaded" objects. Assuming some compiler > smarts, NILP wouldn't suffer, and EQ wouldn't suffer if one side is > known to have a normal type. I doubt the performance hit will be > measurable; if anything, the if-fixnum-do-this-if-bignum-do-that code > we have right now might be worse for caches than making the bignum > code a cold function. The whole purpose of hash-consing (for me) is to avoid turning EQ into something like: if (BIGNUMP (x)) return slow_eq (x, y); else return x == y; What we do in slow_eq is largely irrelevant: the problem is the cost of `if (BIGNUMP (x))`, both in terms of code size and processing time. Stefan ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches 2018-08-23 15:56 ` Stefan Monnier @ 2018-08-24 18:00 ` Pip Cet 2018-08-24 20:55 ` Paul Eggert 2018-08-28 14:08 ` hash-consing bignums and eq==eql Stefan Monnier 1 sibling, 1 reply; 24+ messages in thread From: Pip Cet @ 2018-08-24 18:00 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel On Thu, Aug 23, 2018 at 3:56 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote: > > Well, it isn't `==' if we're building with > > --enable-check-lisp-object-type, > > Currently, the resulting assembly code should be pretty close even in > that case, tho. Okay; I wasn't sure whether we were talking about a literal "people use == on Lisp_Objects" problem or not. The resulting assembly code is indeed equivalent to a `==', it seems. > The whole purpose of hash-consing (for me) is to avoid turning EQ into > something like: > > if (BIGNUMP (x)) > return slow_eq (x, y); > else > return x == y; > > What we do in slow_eq is largely irrelevant: the problem is the cost of > `if (BIGNUMP (x))`, both in terms of code size and processing time. I understand that. In fact, the cost is fairly (and, to me, surprisingly) high, about 1%. That is, indeed, way more than I thought, and I'll have to look at the assembler code to figure out why it's so expensive. But what I want is only about one tenth as bad (in terms of code size) as what you describe: the code you don't want: 13 .text 00227f8e 0000000000419a00 0000000000419a00 00019a00 2**4 the code you do want (i.e. vanilla): 13 .text 001d945e 0000000000419a00 0000000000419a00 00019a00 2**4 the code I want: 13 .text 001e0b3e 0000000000419a00 0000000000419a00 00019a00 2**4 (I got it down to 24816 bytes of code size difference with another compiler, but still using the standard make flags on an x86_64 pc-linux-gnu system.) The performance penalty is a quarter of a clock cycle per problematic EQ, on this machine, though obviously anything in that range depends on your precise CPU architecture and surrounding insns that affect superscalar scheduling. We call EQ a lot, there are between 15 and 32 billion problematic calls in the temacs/emacs invocations to rebuild all .elc files in the Emacs distribution. ("problematic" means that gcc wasn't able to prove that one argument to EQ couldn't possibly be a bignum, and had to emit a conditional branch insn). So that also works out to something in the 1% range. (However, upon inspection it turns out that adding the debugging code makes gcc fail to optimize away many of the problematic calls, so the actual number may be much less than 1%). The size of the emacs binary, stripped, is also about 1% more with my code. The difference between the code you suggested and mine is mostly NILP(), though my test code optimizes away all tag bit checks if the compiler proves either argument is definitely not a bignum, or that the arguments have different tag bits. I'm also using __builtin_expect, something I think should be okay in this exceptionally hot single location. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches 2018-08-24 18:00 ` Pip Cet @ 2018-08-24 20:55 ` Paul Eggert 2018-08-25 15:02 ` Pip Cet 0 siblings, 1 reply; 24+ messages in thread From: Paul Eggert @ 2018-08-24 20:55 UTC (permalink / raw) To: Pip Cet, Stefan Monnier; +Cc: emacs-devel On my to-do list of things to try is to use 4 adjacent type codes for fixnums, bignums, immediate floats and indirect floats, so that INTEGERP, BIGNUMP, FIXNUMP, FLOATP, and NUMBERP can each be done by a single mask-and-compare. This may improve your timing results, if you're trying to insert a lot of BIGNUMPs into your code, or if you're trying to have a single mask-and-compare test for "this object is a bignum or an indirect float and so needs more work for eq". ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches 2018-08-24 20:55 ` Paul Eggert @ 2018-08-25 15:02 ` Pip Cet 2018-08-25 18:02 ` Stefan Monnier 2018-08-25 21:24 ` Paul Eggert 0 siblings, 2 replies; 24+ messages in thread From: Pip Cet @ 2018-08-25 15:02 UTC (permalink / raw) To: eggert; +Cc: Stefan Monnier, emacs-devel On Fri, Aug 24, 2018 at 8:55 PM Paul Eggert <eggert@cs.ucla.edu> wrote: > On my to-do list of things to try is to use 4 adjacent type codes for fixnums, > bignums, immediate floats and indirect floats, so that INTEGERP, BIGNUMP, > FIXNUMP, FLOATP, and NUMBERP can each be done by a single mask-and-compare. I'm already using a separate tag value for bignums, though it might help to use tag value 0 instead of a nonzero one (but then Qnil couldn't be all-zeroes in memory). I've discovered that the sequence gcc (trunk) emits for a mask-and-compare is: mov %rax, %rbx and $7, %rbx cmp $7, %rbx where I would use lea 1(%rax), %ebx test $7, %bl I can force the latter by testing ((char)((int)(XLI (x) + 8 - Lisp_String)) & 7) == 0 rather than the (equivalent, AFAICS) (XLI (x) & 7) == Lisp_String (sometimes gcc uses a longer sequence, though). At this point, I think gcc should be emitting the shorter two-insn sequence for both C expressions. With this "improvement", the code size difference is 13184 bytes. I'm unable to measure a performance difference, since it is lost in the noise. I've benchmarked the hash-cons approach, though the implementation I used was rather trivial, and simple bignum calculations appear to be about 20-50% slower. Again, I expect that number to increase once the extra copy in make_number is removed. (For floats, the slowdown is worse.) I still think there are problems with collisions, accidental or intentional, and with shrinking the hash table again after a calculation is finished. And I think those problems would be worse if strings ever become immutable but eq-when-equal. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches 2018-08-25 15:02 ` Pip Cet @ 2018-08-25 18:02 ` Stefan Monnier 2018-08-25 21:24 ` Paul Eggert 1 sibling, 0 replies; 24+ messages in thread From: Stefan Monnier @ 2018-08-25 18:02 UTC (permalink / raw) To: emacs-devel > I still think there are problems with collisions, accidental or > intentional, and with shrinking the hash table again after a > calculation is finished. And I think those problems would be worse if > strings ever become immutable but eq-when-equal. Whatever implementation technique we decide to use now doesn't mean we'd be stuck with it forever. As you're seeing right now, it's actually pretty easy to try various different techniques. The thing that *does* matter, OTOH, is whether we define `eq` to behave like `eql` for bignums. Stefan ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches 2018-08-25 15:02 ` Pip Cet 2018-08-25 18:02 ` Stefan Monnier @ 2018-08-25 21:24 ` Paul Eggert 1 sibling, 0 replies; 24+ messages in thread From: Paul Eggert @ 2018-08-25 21:24 UTC (permalink / raw) To: Pip Cet; +Cc: Stefan Monnier, emacs-devel [-- Attachment #1: Type: text/plain, Size: 1949 bytes --] Pip Cet wrote: > I'm already using a separate tag value for bignums, though it might > help to use tag value 0 instead of a nonzero one (but then Qnil > couldn't be all-zeroes in memory). We'd have to measure it, as there is also a benefit when NILP and initialization to Qnil are fast. > I've discovered that the sequence gcc (trunk) emits for a mask-and-compare is: > mov %rax, %rbx > and $7, %rbx > cmp $7, %rbx > where I would use > lea 1(%rax), %ebx > test $7, %bl > > I can force the latter by testing ((char)((int)(XLI (x) + 8 - > Lisp_String)) & 7) == 0 rather than the (equivalent, AFAICS) (XLI (x) > & 7) == Lisp_String (sometimes gcc uses a longer sequence, though). At > this point, I think gcc should be emitting the shorter two-insn > sequence for both C expressions. I agree, and filed a GCC bug report about it, here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87104 > With this "improvement", the code size difference is 13184 bytes. I'm > unable to measure a performance difference, since it is lost in the > noise. I tried a similar microoptimization on the Emacs trunk, and it made Emacs about 0.4% faster on my platform. This seems like a clear win pretty much everywhere, regardless of what we do about eq vs eql, so I installed the attached patch into master. > I've benchmarked the hash-cons approach, though the implementation I > used was rather trivial, and simple bignum calculations appear to be > about 20-50% slower. Again, I expect that number to increase once the > extra copy in make_number is removed. (For floats, the slowdown is > worse.) Ouch. These figures are making me more inclined to stick with what we have in master, namely the Common Lisp and Scheme tradition that eq is a finer comparison operation than eql is, when these comparisons are applied to bignums and/or to floats. Of course we can continue to guarantee the traditional eq semantics on fixnums. [-- Attachment #2: 0001-Improve-performance-of-CONSP-FIXNUMP-etc.txt --] [-- Type: text/plain, Size: 3840 bytes --] From ccdb08ef4ed8f96e79aa06cf5e806c9c487d58ad Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert@cs.ucla.edu> Date: Sat, 25 Aug 2018 13:39:18 -0700 Subject: [PATCH] Improve performance of CONSP, FIXNUMP, etc. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Optimization opportunity noted by Pip Cet in: https://lists.gnu.org/r/emacs-devel/2018-08/msg00828.html On my platform (Fedora 28 x86-64, AMD Phenom II X4 910e, user+system time), this improved ‘make compile-always’ performance by 0.4% and shrank text size by a similar amount. * src/lisp.h (TAGGEDP, lisp_h_TAGGEDP): New macros and function. (lisp_h_CONSP, lisp_h_FLOATP, lisp_h_SYMBOLP) (lisp_h_VECTORLIKEP, make_lisp_ptr, STRINGP): Use them. (lisp_h_FIXNUMP): Use the same idea that lisp_h_TAGGEDP uses. --- src/lisp.h | 31 ++++++++++++++++++++++++------- 1 file changed, 24 insertions(+), 7 deletions(-) diff --git a/src/lisp.h b/src/lisp.h index bca4dfbb60..fb11a11fda 100644 --- a/src/lisp.h +++ b/src/lisp.h @@ -362,10 +362,13 @@ typedef EMACS_INT Lisp_Word; #define lisp_h_CHECK_SYMBOL(x) CHECK_TYPE (SYMBOLP (x), Qsymbolp, x) #define lisp_h_CHECK_TYPE(ok, predicate, x) \ ((ok) ? (void) 0 : wrong_type_argument (predicate, x)) -#define lisp_h_CONSP(x) (XTYPE (x) == Lisp_Cons) +#define lisp_h_CONSP(x) TAGGEDP (x, Lisp_Cons) #define lisp_h_EQ(x, y) (XLI (x) == XLI (y)) -#define lisp_h_FLOATP(x) (XTYPE (x) == Lisp_Float) -#define lisp_h_FIXNUMP(x) ((XTYPE (x) & (Lisp_Int0 | ~Lisp_Int1)) == Lisp_Int0) +#define lisp_h_FIXNUMP(x) \ + (! (((unsigned) (XLI (x) >> (USE_LSB_TAG ? 0 : FIXNUM_BITS)) \ + - (unsigned) (Lisp_Int0 >> !USE_LSB_TAG)) \ + & ((1 << INTTYPEBITS) - 1))) +#define lisp_h_FLOATP(x) TAGGEDP (x, Lisp_Float) #define lisp_h_NILP(x) EQ (x, Qnil) #define lisp_h_SET_SYMBOL_VAL(sym, v) \ (eassert ((sym)->u.s.redirect == SYMBOL_PLAINVAL), \ @@ -375,8 +378,12 @@ typedef EMACS_INT Lisp_Word; #define lisp_h_SYMBOL_TRAPPED_WRITE_P(sym) (XSYMBOL (sym)->u.s.trapped_write) #define lisp_h_SYMBOL_VAL(sym) \ (eassert ((sym)->u.s.redirect == SYMBOL_PLAINVAL), (sym)->u.s.val.value) -#define lisp_h_SYMBOLP(x) (XTYPE (x) == Lisp_Symbol) -#define lisp_h_VECTORLIKEP(x) (XTYPE (x) == Lisp_Vectorlike) +#define lisp_h_SYMBOLP(x) TAGGEDP (x, Lisp_Symbol) +#define lisp_h_TAGGEDP(a, tag) \ + (! (((unsigned) (XLI (a) >> (USE_LSB_TAG ? 0 : VALBITS)) \ + - (unsigned) (tag)) \ + & ((1 << GCTYPEBITS) - 1))) +#define lisp_h_VECTORLIKEP(x) TAGGEDP (x, Lisp_Vectorlike) #define lisp_h_XCAR(c) XCONS (c)->u.s.car #define lisp_h_XCDR(c) XCONS (c)->u.s.u.cdr #define lisp_h_XCONS(a) \ @@ -435,6 +442,7 @@ typedef EMACS_INT Lisp_Word; # define SYMBOL_TRAPPED_WRITE_P(sym) lisp_h_SYMBOL_TRAPPED_WRITE_P (sym) # define SYMBOL_VAL(sym) lisp_h_SYMBOL_VAL (sym) # define SYMBOLP(x) lisp_h_SYMBOLP (x) +# define TAGGEDP(a, tag) lisp_h_TAGGEDP (a, tag) # define VECTORLIKEP(x) lisp_h_VECTORLIKEP (x) # define XCAR(c) lisp_h_XCAR (c) # define XCDR(c) lisp_h_XCDR (c) @@ -647,6 +655,15 @@ INLINE enum Lisp_Type #endif } +/* True if A has type tag TAG. + Equivalent to XTYPE (a) == TAG, but often faster. */ + +INLINE bool +(TAGGEDP) (Lisp_Object a, enum Lisp_Type tag) +{ + return lisp_h_TAGGEDP (a, tag); +} + INLINE void (CHECK_TYPE) (int ok, Lisp_Object predicate, Lisp_Object x) { @@ -1131,7 +1148,7 @@ INLINE Lisp_Object make_lisp_ptr (void *ptr, enum Lisp_Type type) { Lisp_Object a = TAG_PTR (type, ptr); - eassert (XTYPE (a) == type && XUNTAG (a, type, char) == ptr); + eassert (TAGGEDP (a, type) && XUNTAG (a, type, char) == ptr); return a; } @@ -1364,7 +1381,7 @@ verify (alignof (struct Lisp_String) % GCALIGNMENT == 0); INLINE bool STRINGP (Lisp_Object x) { - return XTYPE (x) == Lisp_String; + return TAGGEDP (x, Lisp_String); } INLINE void -- 2.17.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* hash-consing bignums and eq==eql 2018-08-23 15:56 ` Stefan Monnier 2018-08-24 18:00 ` Pip Cet @ 2018-08-28 14:08 ` Stefan Monnier 2018-08-29 13:32 ` Pip Cet 1 sibling, 1 reply; 24+ messages in thread From: Stefan Monnier @ 2018-08-28 14:08 UTC (permalink / raw) To: Pip Cet; +Cc: emacs-devel > The whole purpose of hash-consing (for me) is to avoid turning EQ into > something like: > > if (BIGNUMP (x)) > return slow_eq (x, y); > else > return x == y; BTW, hash-consing bignums would have the following advantages: - EQ is just as fast as before. - EQ is equivalent to EQL for all integers. - While bignum operations would be somewhat slower (30-40% according to a message I can't find any more), they'd still be faster than what we had before (i.e. which varies between "signals an error" and "uses a list of cons cells"). Admittedly, if we used a dedicated tag for bignums, eql could turn into BIGNUM_OR_FLOAT_P (x) ? slow_eql (x, y) : (x == y); where BIGNUM_OR_FLOAT_P can be just as efficient as FLOATP, so it would be cheap enough (IMO) to make `eq` an alias for `eql`. Stefan ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: hash-consing bignums and eq==eql 2018-08-28 14:08 ` hash-consing bignums and eq==eql Stefan Monnier @ 2018-08-29 13:32 ` Pip Cet 2018-08-29 19:23 ` Stefan Monnier 2018-08-29 19:31 ` Paul Eggert 0 siblings, 2 replies; 24+ messages in thread From: Pip Cet @ 2018-08-29 13:32 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel On Tue, Aug 28, 2018 at 2:08 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote: > BTW, hash-consing bignums would have the following advantages: > - EQ is equivalent to EQL for all integers. I think that's the one to focus on, instead of splitting the vote between the hash-consers and the slow-EQers. Most users, even Lisp programmers, won't care how it's done under the hood, but the distinction between eq and eql for integers leads to subtle bugs that are best excluded by design, IMHO. > Admittedly, if we used a dedicated tag for bignums, eql could turn into > > BIGNUM_OR_FLOAT_P (x) ? slow_eql (x, y) : (x == y); > > where BIGNUM_OR_FLOAT_P can be just as efficient as FLOATP, so it > would be cheap enough (IMO) to make `eq` an alias for `eql`. I rather think it hinges on the question of whether eq = eql should hold for floats. As you point out, bignums are already much faster than they used to be, and your theory and my benchmark established an upper bound for the price of hash-consing that is acceptable on 64-bit machines. However, I'd like to point out that if floating-point infinities are used along with integers, things could get confusing again; I still have a slight preference for slow EQ compared to hash-consing only bignums: I think hash-consing floats is problematic unless we have float immediates (which would include infinities). Okay, that's complicated. Five options, in my order of preference: 1. Slow EQ 2. float immediates, hash-cons the remaining floats, hash-cons bignums 3. hash-cons bignums, leave floats alone 4. hash-cons bignums and all floats 5. the status quo ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: hash-consing bignums and eq==eql 2018-08-29 13:32 ` Pip Cet @ 2018-08-29 19:23 ` Stefan Monnier 2018-08-29 19:31 ` Paul Eggert 1 sibling, 0 replies; 24+ messages in thread From: Stefan Monnier @ 2018-08-29 19:23 UTC (permalink / raw) To: Pip Cet; +Cc: emacs-devel >> BTW, hash-consing bignums would have the following advantages: >> - EQ is equivalent to EQL for all integers. > > I think that's the one to focus on, instead of splitting the vote > between the hash-consers and the slow-EQers. For me the only question is *how* to get EQ==EQL in general. I think making it true for bignums is the first step. And using hash-consing on bignums is the only way to do that without any negative impact on existing code (i.e. the path of least resistance). When we get to making EQ==EQL for floats as well, we may indeed decide to drop hash-consing, but AFAICT the mood here is not in favor of making it happen now. Stefan ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: hash-consing bignums and eq==eql 2018-08-29 13:32 ` Pip Cet 2018-08-29 19:23 ` Stefan Monnier @ 2018-08-29 19:31 ` Paul Eggert 2018-08-29 20:50 ` Stefan Monnier 2018-09-09 3:09 ` Stefan Monnier 1 sibling, 2 replies; 24+ messages in thread From: Paul Eggert @ 2018-08-29 19:31 UTC (permalink / raw) To: Pip Cet, Stefan Monnier; +Cc: emacs-devel Pip Cet wrote: > Okay, that's complicated. Five options, in my order of preference: > > 1. Slow EQ > 2. float immediates, hash-cons the remaining floats, hash-cons bignums > 3. hash-cons bignums, leave floats alone > 4. hash-cons bignums and all floats > 5. the status quo Another possibility: * float immediates, no hash-consing This keeps 'eq' fast, and causes eq to start to "work" on floats whose bottom few bits are zero. > the distinction between eq and eql for integers leads to subtle bugs that > are best excluded by design, IMHO. Common Lisp and Scheme both have this distinction, and they seem to be doing OK. In this particular tradeoff between performance and nicer behavior, most Lisp users seem to prefer performance. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: hash-consing bignums and eq==eql 2018-08-29 19:31 ` Paul Eggert @ 2018-08-29 20:50 ` Stefan Monnier 2018-09-09 3:09 ` Stefan Monnier 1 sibling, 0 replies; 24+ messages in thread From: Stefan Monnier @ 2018-08-29 20:50 UTC (permalink / raw) To: emacs-devel > This keeps 'eq' fast, and causes eq to start to "work" on floats whose > bottom few bits are zero. >> the distinction between eq and eql for integers leads to subtle bugs that >> are best excluded by design, IMHO. > Common Lisp and Scheme both have this distinction, and they seem to be doing > OK. In this particular tradeoff between performance and nicer behavior, most > Lisp users seem to prefer performance. FWIW, I think this is a historical mistake and makes for many programs being fundamentally yet unnecessarily not platform-independent (although of course in practice the problem rarely manifests itself). If you take EQ==EQL as a starting point, I think it's always reasonably easy to get performance close enough to what you can get with a "faster EQ". Actually, I'm not opposed to having a low-level "fast equality test". What I dislike is that this fast equality test be so prominent that we encourage its use as the *standard* test, instead of keeping its use for particular cases where the speed actually matters. IOW, I'd be OK with: (defalias 'fast-eq (symbol-function 'eq)) (defalias 'eq (symbol-function 'eql)) [ and its equivalent in C. ] Stefan ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: hash-consing bignums and eq==eql 2018-08-29 19:31 ` Paul Eggert 2018-08-29 20:50 ` Stefan Monnier @ 2018-09-09 3:09 ` Stefan Monnier 2018-09-09 6:26 ` Paul Eggert 1 sibling, 1 reply; 24+ messages in thread From: Stefan Monnier @ 2018-09-09 3:09 UTC (permalink / raw) To: emacs-devel > Common Lisp and Scheme both have this distinction, and they seem to be doing > OK. Don't forget that Common Lisp and Scheme have a very different user-base and goals. These are languages whose target users are real programmers, and where its considered normal to let a program core-dump if the programmer has given incorrect compiler hints in the form of type annotations. Most users of Elisp on the other hand would not describe themselves as programmers. > In this particular tradeoff between performance and nicer behavior, most > Lisp users seem to prefer performance. If they prefer performance, they won't be coding in Elisp, I'm afraid. Stefan ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: hash-consing bignums and eq==eql 2018-09-09 3:09 ` Stefan Monnier @ 2018-09-09 6:26 ` Paul Eggert 0 siblings, 0 replies; 24+ messages in thread From: Paul Eggert @ 2018-09-09 6:26 UTC (permalink / raw) To: Stefan Monnier, emacs-devel Stefan Monnier wrote: > Common Lisp and Scheme have a very different user-base > and goals. These are languages whose target users are real programmers, > and where its considered normal to let a program core-dump Not sure I agree with this, as it's a pretty big world out there. In my corner of the world, almost all the Scheme users I deal with are students, they're not big fans of core dumps, and the Scheme they use (Racket, typically) is even less likely to dump core than Elisp is. Although other corners of the Scheme world are like what you describe, that's more the R5RS tradition, as R6RS moved decisively in the "let's not let programs dump core" direction. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches 2018-08-22 13:35 ` Ken Brown 2018-08-22 13:44 ` Paul Eggert @ 2018-08-22 13:49 ` Pip Cet 2018-08-22 14:52 ` Eli Zaretskii 1 sibling, 1 reply; 24+ messages in thread From: Pip Cet @ 2018-08-22 13:49 UTC (permalink / raw) To: kbrown; +Cc: eggert, emacs-devel On Wed, Aug 22, 2018 at 1:36 PM Ken Brown <kbrown@cornell.edu> wrote: > On 8/22/2018 8:36 AM, Pip Cet wrote: > > On Tue, Aug 21, 2018 at 9:01 PM Paul Eggert <eggert@cs.ucla.edu> wrote: > >> * lisp/vc/vc-hg.el (vc-hg-state-fast): When testing fixnum width, > >> prefer (zerop (ash most-positive-fixnum -32)) to (zerop (lsh -1 > >> 32)) (Bug#32485#11). > > > > I think the previous code was correct: now we have bignums, we should > > no longer return 'unsupported on machines with 30/32-bit fixnums, > > I think you're missing the point of that code. The function is > vc-hg-state-fast. If it returns 'unsupported, that simply means that > the slower function will be used. Thanks for pointing that out. For those who haven't read the code, the "slower function" actually runs the hg binary, an operation that can easily take 100 ms or more, instead of sacrificing the few extra microseconds operating on bignums would cost. That's precisely why `most-positive-fixnum' is dangerous to rely on: if you use it for performance tuning, you're likely to make the wrong decision, by going through an expensive alternative bignum implementation. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches 2018-08-22 13:49 ` [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches Pip Cet @ 2018-08-22 14:52 ` Eli Zaretskii 0 siblings, 0 replies; 24+ messages in thread From: Eli Zaretskii @ 2018-08-22 14:52 UTC (permalink / raw) To: Pip Cet; +Cc: eggert, kbrown, emacs-devel > From: Pip Cet <pipcet@gmail.com> > Date: Wed, 22 Aug 2018 13:49:13 +0000 > Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org > > That's precisely why `most-positive-fixnum' is dangerous to rely on: > if you use it for performance tuning, you're likely to make the wrong > decision, by going through an expensive alternative bignum > implementation. We want most-positive-fixnum so that we could make the right decisions. ^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2018-09-09 6:26 UTC | newest] Thread overview: 24+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <20180821204437.16880.99611@vcs0.savannah.gnu.org> [not found] ` <20180821204439.62390209A6@vcs0.savannah.gnu.org> 2018-08-22 12:36 ` [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches Pip Cet 2018-08-22 13:35 ` Ken Brown 2018-08-22 13:44 ` Paul Eggert 2018-08-22 15:50 ` Pip Cet 2018-08-22 17:27 ` Paul Eggert 2018-08-22 20:05 ` Pip Cet 2018-08-22 21:53 ` Paul Eggert 2018-08-22 20:45 ` Stefan Monnier 2018-08-23 14:55 ` Pip Cet 2018-08-23 15:56 ` Stefan Monnier 2018-08-24 18:00 ` Pip Cet 2018-08-24 20:55 ` Paul Eggert 2018-08-25 15:02 ` Pip Cet 2018-08-25 18:02 ` Stefan Monnier 2018-08-25 21:24 ` Paul Eggert 2018-08-28 14:08 ` hash-consing bignums and eq==eql Stefan Monnier 2018-08-29 13:32 ` Pip Cet 2018-08-29 19:23 ` Stefan Monnier 2018-08-29 19:31 ` Paul Eggert 2018-08-29 20:50 ` Stefan Monnier 2018-09-09 3:09 ` Stefan Monnier 2018-09-09 6:26 ` Paul Eggert 2018-08-22 13:49 ` [Emacs-diffs] master f18af6c: Audit use of lsh and fix glitches Pip Cet 2018-08-22 14:52 ` Eli Zaretskii
Code repositories for project(s) associated with this external index https://git.savannah.gnu.org/cgit/emacs.git https://git.savannah.gnu.org/cgit/emacs/org-mode.git This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.