* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
       [not found] <1583748933.1069307.1593556032592.ref@mail.yahoo.com>
@ 2020-06-30 22:27 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2020-06-30 23:14   ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2020-07-01 12:44   ` Mattias Engdegård
  0 siblings, 2 replies; 98+ messages in thread
From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-06-30 22:27 UTC
To: 42147

Hi all,

I was looking in byte-opt.el at how a number of functions are
classified.

My understanding is that pure functions should technically be the
subset of side-effect-and-error-free functions for which the
environment has no influence on the value the function evaluates to.
For this reason they can be constant-folded at compile time when
possible.

Now, among the pure functions I see that we are missing many functions
that (to my understanding) would qualify. I'm thinking of simple
predicates acting on immutable objects, such as consp or fixnump, to
give an example.

Shouldn't we move these into the pure class? At the moment, for
instance, this does not get optimized:

(defun foo ()
  (fixnump 3))

If this makes sense I'll be happy to work on it and prepare a patch.

Thanks

  Andrea
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-06-30 23:14 UTC
To: 42147

OK, I realized that fixnump is a very bad example because its result
is strictly platform-dependent. Please replace it with numberp or some
other simple predicate.

  Andrea
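The platform dependence mentioned here can be observed by comparing
fixnump with a predicate such as numberp. The following is only a
minimal sketch, assuming an Emacs with bignum support (27 or later);
`my-classify-integer' is a hypothetical helper, not something defined
in byte-opt.el.

```elisp
;; Hypothetical helper: report how this particular build classifies N.
(defun my-classify-integer (n)
  "Return a list (N NUMBERP FIXNUMP BIGNUMP) describing integer N."
  (list n (numberp n) (fixnump n) (bignump n)))

;; `numberp' answers the same on every build, so folding it is safe.
(my-classify-integer 3)

;; Whether 2^40 is a fixnum depends on `most-positive-fixnum', which
;; differs between builds, so (fixnump (ash 1 40)) has no single
;; answer that could be baked into a portable .elc file.
(my-classify-integer (ash 1 40))
(> (ash 1 40) most-positive-fixnum)
```

Evaluating the last two forms on builds with different fixnum widths
is expected to give different answers, which is the reason the
predicate was withdrawn as an example.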
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-01 12:46 UTC
To: 42147

The attached patch marks the following as pure:

arrayp, bool-vector-p, consp, char-or-string-p, floatp, hash-table-p,
integerp, listp, natnump, nlistp, not, null, string-lessp, stringp,
symbolp, vectorp.

Feedback welcome.

Thanks

  Andrea

[-- Attachment #2: 0001-Add-some-function-to-pure-fns.patch --]
[-- Type: text/x-patch, Size: 1289 bytes --]

From 10c115da725581b3969dab7e453bf4b4fac185d3 Mon Sep 17 00:00:00 2001
From: Andrea Corallo <akrl@sdf.org>
Date: Wed, 1 Jul 2020 10:07:57 +0200
Subject: [PATCH] * Add some function to pure-fns

	* lisp/emacs-lisp/byte-opt.el (pure-fns): Add: arrayp,
	bool-vector-p, consp, char-or-string-p, floatp, hash-table-p,
	integerp, listp, natnump, nlistp, not, null, string-lessp,
	stringp, symbolp, vectorp.
---
 lisp/emacs-lisp/byte-opt.el | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/lisp/emacs-lisp/byte-opt.el b/lisp/emacs-lisp/byte-opt.el
index 12bde8faf3..f3f1acbd65 100644
--- a/lisp/emacs-lisp/byte-opt.el
+++ b/lisp/emacs-lisp/byte-opt.el
@@ -1307,9 +1307,11 @@
 ;; values if a marker is moved.
 
 (let ((pure-fns
-       '(% concat logand logcount logior lognot logxor
-         regexp-opt regexp-quote
-         string-to-char string-to-syntax symbol-name)))
+       '(% arrayp bool-vector-p char-or-string-p concat consp floatp
+         hash-table-p integerp listp logand logcount logior lognot
+         logxor natnump nlistp not null regexp-opt regexp-quote
+         string-lessp string-to-char string-to-syntax stringp
+         symbol-name symbolp vectorp)))
   (while pure-fns
     (put (car pure-fns) 'pure t)
     (setq pure-fns (cdr pure-fns)))
-- 
2.20.1
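The patch works purely through symbol properties: each listed symbol
gets a `pure' property with value t. A small sketch of the same
mechanism on made-up functions follows; `my-square', `my-cube',
`my-pure-candidates' and `my-marked-pure-p' are hypothetical names used
only for illustration.

```elisp
;; Hypothetical functions, defined only so there is something to mark.
(defun my-square (x) (* x x))
(defun my-cube (x) (* x x x))

(defvar my-pure-candidates '(my-square my-cube))

;; Same effect as the `while' loop in the patch: set the `pure'
;; property on each symbol in the list.
(dolist (fn my-pure-candidates)
  (put fn 'pure t))

(defun my-marked-pure-p (fn)
  "Return non-nil if symbol FN carries the `pure' property."
  (and (symbolp fn) (get fn 'pure)))

(my-marked-pure-p 'my-square)   ; non-nil after the loop above
(my-marked-pure-p 'my-unmarked) ; nil: this symbol was never marked
```

Whether the byte compiler then folds a given call still depends on the
arguments being compile-time constants and on the function being
available when compiling.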
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Mattias Engdegård @ 2020-07-01 12:44 UTC
To: Andrea Corallo; +Cc: Paul Eggert, Stefan Monnier, 42147

Actually, pure functions are side-effect-free but not necessarily
error-free. But in essence, you are right.

The following functions look like they could be marked pure. Given
that they currently aren't, there may be a good reason for their
omission -- do correct me!

 integerp natnump floatp characterp numberp arrayp vectorp
 bool-vector-p char-or-string-p integer-or-marker-p keywordp
 number-or-marker-p sequencep
 length safe-length

... and some more (symbolp, stringp etc) that are already explicitly
optimised.

Regarding fixnump, we could add an optimiser since this predicate can
be constant-folded for certain arguments, but it's unclear whether it's
worth the trouble since this predicate (and bignump) is less commonly
used today. Most uses of fixnump (in Emacs) are in Calc, and those are
probably relics that should be replaced.

More useful would be the ability to constant-fold ash, expt, %, mod
and abs for a subset of each respective domain. I can write a patch.

We could also decide that it's not a problem if an operation returns
either a fixnum or bignum depending on the platform, on the grounds
that

(1) the distinction is not directly carried over through numeric
    constants (bignums and fixnums look the same in .elc files)
(2) any attempt to discriminate between fixnums and bignums is
    non-portable anyway (and we can punt the discrimination to runtime)

and thus constant-fold all integer operations (+, -, * etc), not just
those resulting in a portable fixnum.
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Mattias Engdegård @ 2020-07-01 16:08 UTC
To: Andrea Corallo; +Cc: Paul Eggert, Stefan Monnier, 42147

Andrea, I see nothing directly wrong with your patch, but perhaps our
messages went past one another since our lists of proposed pure
functions differ.

> More useful would be the ability to constant-fold ash, expt, %, mod
> and abs for a subset of each respective domain. I can write a patch.

Here is that patch.

[-- Attachment #2: 0001-Constant-fold-mod-ash-expt-and-abs-with-constant-int.patch --]
[-- Type: application/octet-stream, Size: 3537 bytes --]

From aa9ce87268365f766a0d70e6a86bf44067e86b78 Mon Sep 17 00:00:00 2001
From: Mattias Engdegård <mattiase@acm.org>
Date: Wed, 1 Jul 2020 17:44:54 +0200
Subject: [PATCH] Constant-fold %, mod, ash, expt and abs with constant
 integer args

To ensure portability, the optimisation is confined to calls where
the result is a portable fixnum.  (Bug#42147)

* lisp/emacs-lisp/byte-opt.el (byte-opt--integer-arith)
(byte-optimize-binary-integer-arith, byte-optimize-unary-integer-arith)
(byte-optimize-mod): New functions.
(%, mod, ash, expt, abs): Set byte-optimizer property.
* test/lisp/emacs-lisp/bytecomp-tests.el (byte-opt-testsuite-arith-data):
Add test cases.
---
 lisp/emacs-lisp/byte-opt.el            | 35 ++++++++++++++++++++++++++
 test/lisp/emacs-lisp/bytecomp-tests.el | 12 +++++++++
 2 files changed, 47 insertions(+)

diff --git a/lisp/emacs-lisp/byte-opt.el b/lisp/emacs-lisp/byte-opt.el
index 12bde8faf3..72c68d64b2 100644
--- a/lisp/emacs-lisp/byte-opt.el
+++ b/lisp/emacs-lisp/byte-opt.el
@@ -801,6 +801,34 @@ byte-optimize-divide
           form
         (cons '/ args)))))
 
+(defun byte-opt--integer-arith (form)
+  "Constant-fold FORM when args are integers and the result a portable fixnum."
+  (let ((args (cdr form)))
+    (if (memq nil (mapcar #'integerp args))
+        form
+      (let ((res (apply (car form) args)))
+        (if (byte-opt--portable-numberp res)
+            res
+          form)))))
+
+(defun byte-optimize-binary-integer-arith (form)
+  "Constant-fold the binary integer arithmetic call FORM."
+  (if (= (length form) 3)
+      (byte-opt--integer-arith form)
+    form))
+
+(defun byte-optimize-unary-integer-arith (form)
+  "Constant-fold the unary integer arithmetic call FORM."
+  (if (= (length form) 2)
+      (byte-opt--integer-arith form)
+    form))
+
+(defun byte-optimize-mod (form)
+  "Constant-fold the mod-like function call FORM."
+  (if (eql (nth 2 form) 0)
+      form
+    (byte-optimize-binary-integer-arith form)))
+
 (defun byte-optimize-binary-predicate (form)
   (cond
    ((or (not (macroexp-const-p (nth 1 form)))
@@ -918,6 +946,13 @@ byte-optimize-concat
 (put 'max 'byte-optimizer 'byte-optimize-associative-math)
 (put 'min 'byte-optimizer 'byte-optimize-associative-math)
 
+(put '% 'byte-optimizer 'byte-optimize-mod)
+(put 'mod 'byte-optimizer 'byte-optimize-mod)
+
+(put 'ash 'byte-optimizer 'byte-optimize-binary-integer-arith)
+(put 'expt 'byte-optimizer 'byte-optimize-binary-integer-arith)
+(put 'abs 'byte-optimizer 'byte-optimize-unary-integer-arith)
+
 (put '= 'byte-optimizer 'byte-optimize-binary-predicate)
 (put 'eq 'byte-optimizer 'byte-optimize-binary-predicate)
 (put 'eql 'byte-optimizer 'byte-optimize-equal)
diff --git a/test/lisp/emacs-lisp/bytecomp-tests.el b/test/lisp/emacs-lisp/bytecomp-tests.el
index bfe2d06a61..a96a7c8368 100644
--- a/test/lisp/emacs-lisp/bytecomp-tests.el
+++ b/test/lisp/emacs-lisp/bytecomp-tests.el
@@ -69,6 +69,18 @@ byte-opt-testsuite-arith-data
     (let ((a 3) (b 2)) (/ a b 1))
     (let ((a 3) (b 2)) (/ (+ a b) 1))
 
+    ;; More arithmetic constant-folding (bug#42147).
+    (ash 3 10)
+    (ash 3 25)
+    (abs -20)
+    (abs -2305843009213693952)
+    (expt 10 3)
+    (expt 10 20)
+    (% 20 3)
+    (% -20 3)
+    (mod 20 3)
+    (mod -20 3)
+
     ;; coverage test
     (let ((a 3) (b 2) (c 1.0)) (+))
     (let ((a 3) (b 2) (c 1.0)) (+ 2))
-- 
2.21.1 (Apple Git-122.3)
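The folding rule in the patch can be tried outside the compiler. What
follows is a rough standalone sketch, not the code from the patch: the
`my-' names are hypothetical, the arity checks are left out, and the
bounds are assumed to be the #x1fffffff portable fixnum range that
byte-opt.el used at the time.

```elisp
;; Assumed bounds: the portable fixnum range of a 32-bit build.
(defconst my-portable-max #x1fffffff)
(defconst my-portable-min (- -1 my-portable-max))

(defun my-fold-integer-call (form)
  "Return the value of FORM if it folds to a portable fixnum, else FORM."
  (let ((args (cdr form)))
    (if (memq nil (mapcar #'integerp args))
        form                        ; some argument is not a constant integer
      (let ((res (apply (car form) args)))
        (if (and (integerp res)
                 (<= my-portable-min res my-portable-max))
            res
          form)))))                 ; result out of range: leave for run time

(defun my-fold-mod (form)
  "Like `my-fold-integer-call', but never divide by a constant zero."
  (if (eql (nth 2 form) 0)
      form
    (my-fold-integer-call form)))

(my-fold-integer-call '(ash 3 10))   ; → 3072
(my-fold-integer-call '(expt 10 20)) ; → (expt 10 20), exceeds the bound
(my-fold-integer-call '(abs x))      ; → (abs x), argument is not constant
(my-fold-mod '(% -20 3))             ; → -2
(my-fold-mod '(mod -20 3))           ; → 1
(my-fold-mod '(% 20 0))              ; → (% 20 0), left to signal at run time
```

Returning the original form unchanged is how an optimiser of this kind
declines to act, so the call is simply compiled as usual.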
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-01 21:31 UTC
To: Mattias Engdegård; +Cc: Paul Eggert, Stefan Monnier, 42147

Mattias Engdegård <mattiase@acm.org> writes:

> More useful would be the ability to constant-fold ash, expt, %, mod
> and abs for a subset of each respective domain. I can write a patch.

Hi Mattias,

I'm not sure which would be more useful; I guess both are good things
to have. Another reason why I'm interested is that I reuse these
definitions in the native compiler.

> Andrea, I see nothing directly wrong with your patch, but perhaps our
> messages went past one another since our lists of proposed pure
> functions differ.

Yes, that is exactly what happened; thanks for looking at it.

I diffed our two lists and these are the results.

Functions included in mine but not in yours:

consp, hash-table-p, listp, nlistp, not, null, string-lessp, stringp,
symbolp.

Functions included in yours but not in mine:

characterp, integer-or-marker-p, keywordp, length, number-or-marker-p,
numberp, safe-length, sequencep.

I guess for the most part I can just include the ones I missed. I'm
only unsure about the ones operating on lists, like `length', given
that the list may be modified at runtime (?).

I'll update the patch once this point is clarified.

Thanks

  Andrea
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Mattias Engdegård @ 2020-07-02 10:26 UTC
To: Andrea Corallo; +Cc: Paul Eggert, Stefan Monnier, 42147

On 1 July 2020 at 23:31, Andrea Corallo <andrea_corallo@yahoo.it> wrote:

> Another reason why I'm interested is that I reuse these
> definitions in the native compiler.

In that case there are probably more functions you may want to
consider for purity -- what about:

 < > <= >= = /=
 string< string= string-equal
 eq eql equal
 proper-list-p
 identity
 memq memql member
 assq assql assoc

> I guess for the most part I can just include the one I've missed.

By all means, but do not take my word for the correctness of my list
-- think it through yourself. We mustn't err here.

> I'm not only sure about the one operating on lists like `length' given the
> list may be modified in the runtime (?).

Not sure why this would be an obstacle, but I could have overlooked
something important! Could you explain your thinking in greater
detail, and if possible provide an example of code that you think
might be miscompiled if 'length' or 'safe-length' were marked pure?

I still wonder if there is any reason to limit arithmetic constant
folding to the portable fixnum range. Given that we don't evaluate
fixnump or bignump at compile-time, what observable effects would
constant-folding, say, (ash 1 32) have? Advice from deeper thinkers
solicited!
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-02 10:59 UTC
To: Mattias Engdegård; +Cc: Paul Eggert, Stefan Monnier, 42147

Mattias Engdegård <mattiase@acm.org> writes:

> On 1 July 2020 at 23:31, Andrea Corallo <andrea_corallo@yahoo.it> wrote:
>
>> Another reason why I'm interested is that I reuse these
>> definitions in the native compiler.
>
> In that case there are probably more functions you may want to
> consider for purity -- what about:
>
>  < > <= >= = /=
>  string< string= string-equal
>  eq eql equal
>  proper-list-p
>  identity
>  memq memql member
>  assq assql assoc

Good point.

>> I guess for the most part I can just include the one I've missed.
>
> By all means, but do not take my word for the correctness of my list
> -- think it through yourself. We mustn't err here.
>
>> I'm not only sure about the one operating on lists like `length' given the
>> list may be modified in the runtime (?).
>
> Not sure why this would be an obstacle, but I could have overlooked
> something important! Could you explain your thinking in greater
> detail, and if possible provide an example of code that you think
> might be miscompiled if 'length' or 'safe-length' were marked pure?

No, thinking about it, I believe you are correct.

I had conflated this with the fact that the native compiler must now
correctly handle, in the CFG, pure functions taking mutable objects
(given that we are adding them), but that is unrelated. This is not a
problem for the byte compiler, given that constant folding is done
only locally.

So yes, these are pure functions and they should be marked as such.

> I still wonder if there is any reason to limit arithmetic constant
> folding to the portable fixnum range. Given that we don't evaluate
> fixnump or bignump at compile-time, what observable effects would
> constant-folding, say, (ash 1 32) have? Advice from deeper thinkers
> solicited!

I always thought the general idea was to respect the allocation side
effect we have when creating a bignum. I'm not sure whether the class
of example you have in mind here fits this case.

Thanks

  Andrea
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Mattias Engdegård @ 2020-07-02 12:46 UTC
To: Andrea Corallo; +Cc: Paul Eggert, Stefan Monnier, 42147

On 2 July 2020 at 12:59, Andrea Corallo <andrea_corallo@yahoo.it> wrote:

>> I still wonder if there is any reason to limit arithmetic constant
>> folding to the portable fixnum range. Given that we don't evaluate
>> fixnump or bignump at compile-time, what observable effects would
>> constant-folding, say, (ash 1 32) have? Advice from deeper thinkers
>> solicited!
>
> I always thought the general idea is to respect the allocation side
> effect we have creating a bignum. Not sure if the class of example you
> have in mind here fits this case.

Number allocation isn't a semantically visible effect and we probably
don't want to change that. As far as I can tell, only fixnump and
bignump can discriminate fixnums from bignums. There may be functions
that only accept fixnums as arguments and thus fail with a different
error, but I don't think we constant-fold any of them, and they would
be easy to fix if we did.

It may be preferable to defer generation of very big numbers to
run-time, to avoid evaluation of (ash 1 1000) at compile-time, but
such a limit should, if implemented, be independent of the fixnum
limit (and likely higher).
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-02 13:56 UTC
To: Mattias Engdegård; +Cc: Paul Eggert, Stefan Monnier, 42147

Mattias Engdegård <mattiase@acm.org> writes:

> On 2 July 2020 at 12:59, Andrea Corallo <andrea_corallo@yahoo.it> wrote:
>
>>> I still wonder if there is any reason to limit arithmetic constant
>>> folding to the portable fixnum range. Given that we don't evaluate
>>> fixnump or bignump at compile-time, what observable effects would
>>> constant-folding, say, (ash 1 32) have? Advice from deeper thinkers
>>> solicited!
>>
>> I always thought the general idea is to respect the allocation side
>> effect we have creating a bignum. Not sure if the class of example you
>> have in mind here fits this case.
>
> Number allocation isn't a semantically visible effect and we probably
> don't want to change that.

Well, is cons allocation a semantically visible effect then? How is it
different?

I thought the reason why cons is not constant-folded is to respect the
allocation side effect; at least that's what I convinced myself of :)

  Andrea
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Mattias Engdegård @ 2020-07-02 14:51 UTC
To: Andrea Corallo; +Cc: Paul Eggert, Stefan Monnier, 42147

On 2 July 2020 at 15:56, Andrea Corallo <andrea_corallo@yahoo.it> wrote:

> Well is cons allocation a semantically visible effect then? How is it
> different?

Conses are mutable and thus each have their own identity. Numbers are
immutable and have none; there is no defined way to distinguish two
numbers that have the same value ('eq' does not give well-defined
results).

The compiler is free to, and does, deduplicate equal bignums. For
instance, try

 (disassemble (lambda () (list 18723645817263338474859
                               18723645817263338474859)))

and you will see that the resulting code will only contain one
instance of the number.

> I thought the reason why cons is not constant folded is to respect the
> allocation side effect, at least that's what I convinced my-self of :)

Yes, in the sense that

 (defun f () (cons 'a 'b))

produces a fresh (a . b) each time (f) is called, because the returned
values can be distinguished both explicitly by 'eq' and by mutating it
and observing whether the change affects previously returned values or
not. Neither works for numbers.
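Both ways of telling two conses apart can be shown in a few lines. This
is just an illustrative sketch; `my-fresh-pair' is a hypothetical name
standing in for the `f' in the message above.

```elisp
(defun my-fresh-pair ()
  "Return a newly allocated cons (a . b) on every call."
  (cons 'a 'b))

(let ((p (my-fresh-pair))
      (q (my-fresh-pair)))
  (setcar p 'z)                            ; mutate only the first pair
  (list (eq p q)                           ; nil: two distinct objects
        (car q)                            ; a: q did not see the mutation
        (eql (expt 2 100) (expt 2 100))))  ; t: bignums compare by value
;; → (nil a t)
```

The third element uses `eql' rather than `eq' because, as noted above,
`eq' on equal bignums has no well-defined result.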
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-02 15:32 UTC
To: Mattias Engdegård; +Cc: Paul Eggert, Stefan Monnier, 42147

Mattias Engdegård <mattiase@acm.org> writes:

>> I thought the reason why cons is not constant folded is to respect the
>> allocation side effect, at least that's what I convinced my-self of :)
>
> Yes, in the sense that
>
>  (defun f () (cons 'a 'b))
>
> produces a fresh (a . b) each time (f) is called, because the returned
> values can be distinguished both explicitly by 'eq' and by mutating it
> and observing whether the change affects previously returned values or
> not. Neither works for numbers.

Understood, that makes perfect sense. Cons allocation is something
very visible in Lisp.

  Andrea
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Stefan Monnier @ 2020-07-02 15:49 UTC
To: Mattias Engdegård; +Cc: Paul Eggert, Andrea Corallo, 42147

> Conses are mutable and thus each have their own identity. Numbers are
> immutable and have none; there is no defined way to distinguish two numbers
> that have the same value ('eq' does not give well-defined results).

Better yet, there's still hope that we change things such that `eq`
behaves like `eql` on bignums (and maybe also on floats).

        Stefan
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Mattias Engdegård @ 2020-07-02 18:01 UTC
To: Stefan Monnier; +Cc: Paul Eggert, Andrea Corallo, 42147

On 2 July 2020 at 17:49, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> Better yet, there's still hope that we change things such that `eq`
> behaves like `eql` on bignums (and maybe also on floats).

Speaking of which, Andrea may be in a good position to provide us with
performance data about such a change, since making 'eq' more expensive
is likely to be more visible in native code (assuming the operation is
open-coded) than in bytecode or interpreted lisp. On the other hand,
perhaps his compiler thingamajig is able to eliminate some checks
statically by type propagation?

Anyway, since we now have bignums and have standardised on IEEE 754
binary64 floats, is there a reason to keep byte-opt--portable-numberp?

If we want to make allowance for capricious x87 rounding, what about
rewriting it to accept integral floats in the ±2^53 range, as well as
any integer? This is what it might look like:

[-- Attachment #2: 0001-Relax-portable-number-predicate-in-byte-compiler.patch --]
[-- Type: application/octet-stream, Size: 2788 bytes --]

From 745752478c7ab895390a671ee4f7712eedf46bcc Mon Sep 17 00:00:00 2001
From: Mattias Engdegård <mattiase@acm.org>
Date: Thu, 2 Jul 2020 19:54:56 +0200
Subject: [PATCH] Relax portable number predicate in byte-compiler

Since Emacs has standardised on IEEE754 binary64, and the range of
integers is no longer platform-dependent, all numbers are portable.
Retain a restriction of floats to integers in [-2^53,2^53] to be
safe against x87 rounding errors (bug#42147).

* lisp/emacs-lisp/byte-opt.el (byte-opt--max-integral-float):
(byte-opt--min-integral-float):
(byte-opt--portable-max):
(byte-opt--portable-min):
(byte-opt--portable-numberp):
---
 lisp/emacs-lisp/byte-opt.el | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/lisp/emacs-lisp/byte-opt.el b/lisp/emacs-lisp/byte-opt.el
index 72c68d64b2..2c7a23ebed 100644
--- a/lisp/emacs-lisp/byte-opt.el
+++ b/lisp/emacs-lisp/byte-opt.el
@@ -672,23 +672,22 @@ byte-optimize-associative-math
              (apply (car form) constants))
        form)))
 
-;; Portable Emacs integers fall in this range.
-(defconst byte-opt--portable-max #x1fffffff)
-(defconst byte-opt--portable-min (- -1 byte-opt--portable-max))
+;; Bounds of consecutive integers representable as floats.
+;; These assume IEEE 754 binary64 floats.
+(defconst byte-opt--max-integral-float (float (ash 1 53)))
+(defconst byte-opt--min-integral-float (- byte-opt--max-integral-float))
 
 ;; True if N is a number that works the same on all Emacs platforms.
-;; Portable Emacs fixnums are exactly representable as floats on all
-;; Emacs platforms, and (except for -0.0) any floating-point number
-;; that equals one of these integers must be the same on all
-;; platforms.  Although other floating-point numbers such as 0.5 are
-;; also portable, it can be tricky to characterize them portably so
-;; they are not optimized.
+;; As long as we need to support 32-bit x86 (using x87 floats), we cannot
+;; be certain that the result is portable to the last bit for all values;
+;; therefore, we only consider integral floats to be 'portable'.
 (defun byte-opt--portable-numberp (n)
-  (and (numberp n)
-       (<= byte-opt--portable-min n byte-opt--portable-max)
-       (= n (floor n))
-       (not (and (floatp n) (zerop n)
-                 (condition-case () (< (/ n) 0) (error))))))
+  (or (integerp n)
+      (and (floatp n)
+           (ignore-errors (= n (floor n)))
+           (<= byte-opt--min-integral-float
+               n
+               byte-opt--max-integral-float))))
 
 ;; Use OP to reduce any leading prefix of portable numbers in the list
 ;; (cons ACCUM ARGS) down to a single portable number, and return the
-- 
2.21.1 (Apple Git-122.3)
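To see which values the relaxed predicate would accept, here is a
standalone copy under hypothetical `my-' names, so that it can be
evaluated without touching byte-opt.el. It is a sketch that assumes
IEEE 754 binary64 floats, as the patch does.

```elisp
;; Largest range in which every integer has an exact float.
(defconst my-max-integral-float (float (ash 1 53)))
(defconst my-min-integral-float (- my-max-integral-float))

(defun my-portable-numberp (n)
  "Return non-nil if N is an integer or an integral float within ±2^53."
  (or (integerp n)
      (and (floatp n)
           ;; `floor' signals an error for NaN and infinities.
           (ignore-errors (= n (floor n)))
           (<= my-min-integral-float n my-max-integral-float))))

(my-portable-numberp (expt 10 20)) ; t: any integer, however large
(my-portable-numberp 3.0)          ; t: integral float in range
(my-portable-numberp 0.5)          ; nil: not integral
(my-portable-numberp 1e20)         ; nil: integral but outside ±2^53
(my-portable-numberp 0.0e+NaN)     ; nil: `floor' fails, error is ignored
```

Compared with the old predicate, the integer branch no longer has a
range check at all, which is what would let results such as
(expt 10 20) be folded.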
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-02 18:55 UTC
To: Stefan Monnier, Mattias Engdegård; +Cc: Paul Eggert, 42147

Mattias Engdegård <mattiase@acm.org> writes:

> On 2 July 2020 at 17:49, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>
>> Better yet, there's still hope that we change things such that `eq`
>> behaves like `eql` on bignums (and maybe also on floats).
>
> Speaking of which, Andrea may be in a good position to provide us with
> performance data about such a change, since making 'eq' more expensive
> is likely to be more visible in native code (assuming the operation is
> open-coded) than in bytecode or interpreted lisp. On the other hand,
> perhaps his compiler thingamajig is able to eliminate some checks
> statically by type propagation?

Correct; in that case we would certainly open-code it and use the
thingamajig to try to eliminate whatever type checks we can.
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Stefan Monnier @ 2020-07-02 19:38 UTC
To: Mattias Engdegård; +Cc: Paul Eggert, Andrea Corallo, 42147

>> Better yet, there's still hope that we change things such that `eq`
>> behaves like `eql` on bignums (and maybe also on floats).
> Speaking of which, Andrea may be in a good position to provide us with
> performance data about such a change, since making 'eq' more expensive is
> likely to be more visible in native code (assuming the operation is
> open-coded) than in bytecode or interpreted lisp. On the other hand, perhaps
> his compiler thingamajig is able to eliminate some checks statically by
> type propagation?

Note that it can also be done without slowing down `eq` (by
hash-consing the bignums).

        Stefan
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Paul Eggert @ 2020-07-02 20:09 UTC
To: Stefan Monnier, Mattias Engdegård; +Cc: Andrea Corallo, 42147

On 7/2/20 12:38 PM, Stefan Monnier wrote:
> Note that it can also be done without slowing down `eq` (by
> hash-consing the bignums).

Yes, that's a better way to go. I wrote a patch to do that a while ago
but never got around to the laborious part, which was benchmarking.
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Mattias Engdegård @ 2020-07-03 9:32 UTC
To: Paul Eggert; +Cc: Stefan Monnier, Andrea Corallo, 42147

On 2 July 2020 at 22:09, Paul Eggert <eggert@cs.ucla.edu> wrote:

>> Note that it can also be done without slowing down `eq` (by
>> hash-consing the bignums).
>
> Yes, that's a better way to go. I wrote a patch to do that a while ago but never
> got around to the laborious part, which was benchmarking.

Hash-consing bignums may be a good idea (I'm neutral on the idea), but
there could be a reason why it isn't very commonly seen in other
runtimes; perhaps they have more efficient GCs (generational and/or
incremental), but Emacs would benefit (a lot) from that, too.

In any case, it's a one-way decision: once we guarantee eq to provide
numerical equality (whether by hash-consing or otherwise), there is no
way back.
* bug#42147: Hash-consing bignums (was: bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?) 2020-07-03 9:32 ` Mattias Engdegård @ 2020-07-03 13:39 ` Stefan Monnier 0 siblings, 0 replies; 98+ messages in thread From: Stefan Monnier @ 2020-07-03 13:39 UTC (permalink / raw) To: Mattias Engdegård; +Cc: Paul Eggert, Andrea Corallo, 42147 > Hash-consing bignums may be a good idea (I'm neutral on the idea), but there > could be a reason why it isn't very commonly seen in other runtimes; perhaps > they have more efficient GCs (generational and/or incremental), but Emacs > would benefit (a lot) from that, too. I don't think the GC performance has very much to do with it. I think it's simply that hash-consing bignums has a negative effect on the performance of bignum operations (by a constant factor which I guesstimate to be around 50%). For a general purpose language this can be significant. In my opinion for Emacs Lisp this is irrelevant (we've lived without real bignums (using slow emulations instead) for more than 30 years, so even half-speed bignums are much better than what we really need). It could become relevant if/when we replace wide-int with bignums, in which case performance of "small bignums" could be fairly important. But I'm not sure if we'll ever want to replace wide-ints with bignums (tho I do hope we will). > In any case, it's a one-way decision: once we guarantee eq to provide > numerical equality (whether by hash-consing or otherwise), there is no > way back. Yes, once you've had the taste of clean semantics it's hard to go back ;-) Stefan "in favor of redefining `eq` to `eql`" ^ permalink raw reply [flat|nested] 98+ messages in thread
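For readers unfamiliar with the term, hash-consing can be sketched in Lisp itself. This is only an illustration, assuming an Emacs with native bignums (27 or later). The names `my-bignum-table` and `my-intern-bignum` are hypothetical, and the patch mentioned in the thread presumably works inside the C runtime rather than in Lisp.

```elisp
;; Hypothetical sketch of hash-consing: keep one canonical object per
;; integer value, so that `eq' on canonical objects agrees with `eql'.

(defvar my-bignum-table (make-hash-table :test 'eql)
  "Maps each integer value seen so far to its canonical object.")

(defun my-intern-bignum (n)
  "Return the canonical object for the integer N."
  (or (gethash n my-bignum-table)
      (puthash n n my-bignum-table)))

(let ((a (my-intern-bignum (expt 2 100)))
      (b (my-intern-bignum (expt 2 100))))
  ;; Two separately computed bignums are `eql' but need not be `eq';
  ;; after interning, both names refer to the same object.
  (list (eql a b) (eq a b)))
```

The table lookup on every freshly created bignum is presumably where the constant-factor slowdown guesstimated above would come from. This sketch also never releases entries; a real implementation would need some form of weak table.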
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-02 18:01 ` Mattias Engdegård 2020-07-02 18:55 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 2020-07-02 19:38 ` Stefan Monnier @ 2020-07-02 20:31 ` Paul Eggert 2020-07-02 21:41 ` Stefan Monnier 3 siblings, 0 replies; 98+ messages in thread From: Paul Eggert @ 2020-07-02 20:31 UTC (permalink / raw) To: Mattias Engdegård, Stefan Monnier; +Cc: Andrea Corallo, 42147 On 7/2/20 11:01 AM, Mattias Engdegård wrote: > If we want to make allowance for capricious x87 rounding, what about rewriting it to accept integral floats in the ±2^53 range, as well as any integer? This is what it might look like: Another plausible option would be for Emacs to drop support for x87 rounding, in the interest of portability. All Apple Intel machines have had SSE2, Microsoft has been requiring SSE2 by default since MSVC 2012, and it's easy enough to do the same with GCC and clang. We'd be in good company as lots of other software packages have dropped support for non-SSE2 machines, including Cygwin, Java, OpenOffice, PostgreSQL, Python, QEMU, Thunderbird, VirtualBox; see <http://matejhorvat.si/en/unfiled/nosse2.htm>. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-02 18:01 ` Mattias Engdegård ` (2 preceding siblings ...) 2020-07-02 20:31 ` bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? Paul Eggert @ 2020-07-02 21:41 ` Stefan Monnier 2020-07-02 23:16 ` Paul Eggert 3 siblings, 1 reply; 98+ messages in thread From: Stefan Monnier @ 2020-07-02 21:41 UTC (permalink / raw) To: Mattias Engdegård; +Cc: Paul Eggert, Andrea Corallo, 42147 > Anyway, since we now have bignums and have standardised on IEEE 754 binary64 > floats, is there a reason to keep byte-opt-portable-numberp? Indeed, it seems like it might not be needed any more. > If we want to make allowance for capricious x87 rounding, what about > rewriting it to accept integral floats in the ±2^53 range, as well as any > integer? This is what it might look like: I must say I don't know what x87 rounding has to do with it. I'd tend to assume that x87 rounding can virtually never be seen from Elisp because it's hard to imagine how the C compiler will manage to keep our Elisp floats long enough in the x87 stack to avoid rounding back to 64bit floats between every Elisp-level operation. Or are we worried about the double-rounding of x87? Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-02 21:41 ` Stefan Monnier @ 2020-07-02 23:16 ` Paul Eggert 2020-07-03 8:32 ` Mattias Engdegård 0 siblings, 1 reply; 98+ messages in thread From: Paul Eggert @ 2020-07-02 23:16 UTC (permalink / raw) To: Stefan Monnier, Mattias Engdegård; +Cc: Andrea Corallo, 42147 On 7/2/20 2:41 PM, Stefan Monnier wrote: > I'd tend to assume that x87 rounding can virtually never be seen from > Elisp because it's hard to imagine how the C compiler will manage to > keep our Elisp floats long enough in the x87 stack to avoid rounding > back to 64bit floats between every Elisp-level operation. It can happen in floatop_arith_driver, which has an accumulator of type 'double' that on x87 is put into an 80-bit register. For example, on x86+x87 compiled with the usual gcc -O2, (+ 1e16 2.9999 2.9999) returns 10000000000000008.0 (the exact answer rounded to 'double'), whereas with SSE2 the same expression yields 10000000000000004.0 (which is a bit off, because the accumulator is only 64 bits and suffered from rounding intermediate results). > Or are we worried about the double-rounding of x87? That too. For example, with x87, (+ 1e16 2.9999) yields 10000000000000004.0 due to double-rounding, whereas with SSE2 the same expression yields 10000000000000002.0 which is the correctly-rounded answer. As you can see, sometimes SSE2 is closer to the mathematically-correct answer and sometimes x87 is. In typical C math code, x87 is better; in Emacs I imagine the reverse is true (for the reasons you mentioned), though I have not attempted to measure this. ^ permalink raw reply [flat|nested] 98+ messages in thread
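The single-addition example above can be checked on a given build with a few lines of Lisp. This is a sketch only: `my-float-rounding-style` is a hypothetical helper, and the two expected values are taken from the message above rather than measured here.

```elisp
(defun my-float-rounding-style ()
  "Guess how this Emacs build rounds (+ 1e16 2.9999).
Return `sse2' for the correctly rounded result, `x87' for the
double-rounded one, and `unknown' for anything else."
  (let ((sum (+ 1e16 2.9999)))
    (cond ((= sum 10000000000000002.0) 'sse2)
          ((= sum 10000000000000004.0) 'x87)
          (t 'unknown))))

(my-float-rounding-style)
```

Evaluate this interpreted rather than byte-compiled: once `+` is constant-folded at compile time, the sum would reflect the machine that compiled the file, not the one running it.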
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-02 23:16 ` Paul Eggert @ 2020-07-03 8:32 ` Mattias Engdegård 2020-07-03 13:11 ` Stefan Monnier 2020-07-03 18:31 ` Paul Eggert 0 siblings, 2 replies; 98+ messages in thread From: Mattias Engdegård @ 2020-07-03 8:32 UTC (permalink / raw) To: Paul Eggert; +Cc: Stefan Monnier, Andrea Corallo, 42147 On 3 July 2020, at 01:16, Paul Eggert <eggert@cs.ucla.edu> wrote: > As you can see, sometimes SSE2 is closer to the mathematically-correct answer > and sometimes x87 is. In typical C math code, x87 is better; in Emacs I imagine > the reverse is true (for the reasons you mentioned), though I have not attempted > to measure this. Thanks for the examples; these were indeed what I had in mind (there's also the effect from having a greater exponent range in the intermediate result); Monniaux [1] is a good reference. In practice, the extra precision of x87 code is so unreliable and fickle (unless the 80-bit long double is used throughout) that it's almost never worth it. (Being much slower doesn't help either.) Fortunately modern compilers generate SSE code by default, only passing return values on the x87 stack as per the x86 ABI (which causes no harm). This reduces an already tiny risk to nil. We could add an elaborate configure or run-time test and admonishments to the installation instructions but frankly we have better use of our time. I suggest we replace byte-opt--portable-numberp with numberp (or nothing at all, depending on where it occurs) and be done with it. --- [1] https://hal.archives-ouvertes.fr/hal-00128124/file/floating-point-article.pdf ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-03 8:32 ` Mattias Engdegård @ 2020-07-03 13:11 ` Stefan Monnier 2020-07-03 18:35 ` Mattias Engdegård 2020-07-03 18:31 ` Paul Eggert 1 sibling, 1 reply; 98+ messages in thread From: Stefan Monnier @ 2020-07-03 13:11 UTC (permalink / raw) To: Mattias Engdegård; +Cc: Paul Eggert, Andrea Corallo, 42147 > Fortunately modern compilers generate SSE code by default, only passing > return values on the x87 stack as per the x86 ABI (which causes no > harm). This reduces an already tiny risk to nil. We could add an elaborate > configure or run-time test and admonishments to the installation > instructions but frankly we have better use of our time. I suggest we > replace byte-opt--portable-numberp with numberp (or nothing at all, > depending on where it occurs) and be done with it. Agreed, Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-03 13:11 ` Stefan Monnier @ 2020-07-03 18:35 ` Mattias Engdegård 2020-07-03 18:43 ` Mattias Engdegård ` (2 more replies) 0 siblings, 3 replies; 98+ messages in thread From: Mattias Engdegård @ 2020-07-03 18:35 UTC (permalink / raw) To: Stefan Monnier; +Cc: Paul Eggert, Andrea Corallo, 42147 On 3 July 2020, at 15:11, Stefan Monnier <monnier@iro.umontreal.ca> wrote: > >> Fortunately modern compilers generate SSE code by default, only passing >> return values on the x87 stack as per the x86 ABI (which causes no >> harm). This reduces an already tiny risk to nil. We could add an elaborate >> configure or run-time test and admonishments to the installation >> instructions but frankly we have better use of our time. I suggest we >> replace byte-opt--portable-numberp with numberp (or nothing at all, >> depending on where it occurs) and be done with it. > > Agreed, Thanks -- patch attached. Some expressions will still not be constant-folded entirely; for example (byte-optimize-form '(+ #x100000000000000 1 1)) => (+ 72057594037927936 1 1) This will be fixed automatically by marking + as pure; the same should be done for the other arithmetic functions. By the way, is it a bug or a feature that calls to pure functions with constant but invalid arguments raise an error at compile-time? For example: (disassemble (lambda () (if nil (regexp-quote nil)))) will raise an error even though none would be raised at run time if this function were interpreted. It's easy to suppress those errors, but I see how they can be useful in practice. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-03 18:35 ` Mattias Engdegård @ 2020-07-03 18:43 ` Mattias Engdegård 2020-07-03 19:05 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 2020-07-04 15:06 ` Stefan Monnier 2 siblings, 0 replies; 98+ messages in thread From: Mattias Engdegård @ 2020-07-03 18:43 UTC (permalink / raw) To: Stefan Monnier; +Cc: Paul Eggert, Andrea Corallo, 42147 [-- Attachment #1: Type: text/plain, Size: 25 bytes --] > patch attached Now. [-- Attachment #2: 0001-Relax-portable-number-check-in-byte-compiler-bug-421.patch --] [-- Type: application/octet-stream, Size: 4540 bytes --] From 7b0a5329706a6c73e49b5e3b464543f8ba94f21d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org> Date: Fri, 3 Jul 2020 20:13:50 +0200 Subject: [PATCH] Relax portable number check in byte compiler (bug#42147) With bignums, the set of representable integers is no longer platform-dependent, and since we use nothing but IEEE754 64-bit floats, all numbers are now portable. Take advantage of this fact to simplify constant-folding in the byte compiler, allowing it to be applied more widely. * lisp/emacs-lisp/byte-opt.el (byte-opt--portable-max) (byte-opt--portable-min, byte-opt--portable-numberp): Remove. (byte-opt--arith-reduce, byte-optimize-minus, byte-optimize-1+) (byte-optimize-1-): Simplify: any number will do, and if N is a number, then so are -N, N+1 and N-1. --- lisp/emacs-lisp/byte-opt.el | 39 +++++++++---------------------------- 1 file changed, 9 insertions(+), 30 deletions(-) diff --git a/lisp/emacs-lisp/byte-opt.el b/lisp/emacs-lisp/byte-opt.el index 12bde8faf3..bf9e6a728a 100644 --- a/lisp/emacs-lisp/byte-opt.el +++ b/lisp/emacs-lisp/byte-opt.el @@ -672,36 +672,18 @@ byte-optimize-associative-math (apply (car form) constants)) form))) -;; Portable Emacs integers fall in this range. 
-(defconst byte-opt--portable-max #x1fffffff) -(defconst byte-opt--portable-min (- -1 byte-opt--portable-max)) - -;; True if N is a number that works the same on all Emacs platforms. -;; Portable Emacs fixnums are exactly representable as floats on all -;; Emacs platforms, and (except for -0.0) any floating-point number -;; that equals one of these integers must be the same on all -;; platforms. Although other floating-point numbers such as 0.5 are -;; also portable, it can be tricky to characterize them portably so -;; they are not optimized. -(defun byte-opt--portable-numberp (n) - (and (numberp n) - (<= byte-opt--portable-min n byte-opt--portable-max) - (= n (floor n)) - (not (and (floatp n) (zerop n) - (condition-case () (< (/ n) 0) (error)))))) - -;; Use OP to reduce any leading prefix of portable numbers in the list -;; (cons ACCUM ARGS) down to a single portable number, and return the +;; Use OP to reduce any leading prefix of constant numbers in the list +;; (cons ACCUM ARGS) down to a single number, and return the ;; resulting list A of arguments. The idea is that applying OP to A ;; is equivalent to (but likely more efficient than) applying OP to ;; (cons ACCUM ARGS), on any Emacs platform. Do not make any special ;; provision for (- X) or (/ X); for example, it is the caller’s ;; responsibility that (- 1 0) should not be "optimized" to (- 1). (defun byte-opt--arith-reduce (op accum args) - (when (byte-opt--portable-numberp accum) + (when (numberp accum) (let (accum1) - (while (and (byte-opt--portable-numberp (car args)) - (byte-opt--portable-numberp + (while (and (numberp (car args)) + (numberp (setq accum1 (condition-case () (funcall op accum (car args)) (error)))) @@ -746,12 +728,11 @@ byte-optimize-minus ;; (- x -1) --> (1+ x) ((equal (cdr args) '(-1)) (list '1+ (car args))) - ;; (- n) -> -n, where n and -n are portable numbers. + ;; (- n) -> -n, where n and -n are constant numbers. 
;; This must be done separately since byte-opt--arith-reduce ;; is not applied to (- n). ((and (null (cdr args)) - (byte-opt--portable-numberp (car args)) - (byte-opt--portable-numberp (- (car args)))) + (numberp (car args))) (- (car args))) ;; not further optimized ((equal args (cdr form)) form) @@ -761,8 +742,7 @@ byte-optimize-1+ (let ((args (cdr form))) (when (null (cdr args)) (let ((n (car args))) - (when (and (byte-opt--portable-numberp n) - (byte-opt--portable-numberp (1+ n))) + (when (numberp n) (setq form (1+ n)))))) form) @@ -770,8 +750,7 @@ byte-optimize-1- (let ((args (cdr form))) (when (null (cdr args)) (let ((n (car args))) - (when (and (byte-opt--portable-numberp n) - (byte-opt--portable-numberp (1- n))) + (when (numberp n) (setq form (1- n)))))) form) -- 2.21.1 (Apple Git-122.3) ^ permalink raw reply related [flat|nested] 98+ messages in thread
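The simplification in this patch relies on integer arithmetic no longer overflowing. A quick check of that premise, assuming Emacs 27 or later where bignums are available; `my-number-closure-check` is a hypothetical name used only for this illustration:

```elisp
(defun my-number-closure-check ()
  "Return a list of checks that stepping past the fixnum range stays exact."
  (let* ((n most-positive-fixnum)
         (next (1+ n))
         (neg (- most-negative-fixnum)))
    (list (integerp next)     ; N+1 is still an integer
          (bignump next)      ; but no longer a fixnum
          (= (1- next) n)     ; and N+1-1 gives back N exactly
          (integerp neg))))   ; negating the smallest fixnum also works

(my-number-closure-check)
```

Every element of the returned list should be t on such a build, which is why the extra range checks on -N, N+1 and N-1 can be dropped.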
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-03 18:35 ` Mattias Engdegård 2020-07-03 18:43 ` Mattias Engdegård @ 2020-07-03 19:05 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 2020-07-04 14:58 ` Mattias Engdegård 2020-07-04 15:06 ` Stefan Monnier 2 siblings, 1 reply; 98+ messages in thread From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-03 19:05 UTC (permalink / raw) To: Stefan Monnier, Mattias Engdegård; +Cc: Paul Eggert, 42147 [-- Attachment #1: Type: text/plain, Size: 309 bytes --] Mattias Engdegård <mattiase@acm.org> writes: > This will be fixed automatically by marking + as pure; the same should be done for the other arithmetic functions. Hi, attached the updated version of the patch updating the pure function classification. Please have a look. Thanks Andrea [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-Add-a-number-of-functions-to-pure-fns-bug-42147.patch --] [-- Type: text/x-patch, Size: 1853 bytes --] From f6b7794ef72a788ecb9e6731b10fa849559f20a2 Mon Sep 17 00:00:00 2001 From: Andrea Corallo <akrl@sdf.org> Date: Wed, 1 Jul 2020 10:07:57 +0200 Subject: [PATCH] * Add a number of functions to pure-fns (bug#42147) * lisp/emacs-lisp/byte-opt.el (pure-fns): Add: /=, <, <=, =, >, >=, abs, arrayp, ash, assoc, assq, bool-vector-p, char-or-string-p, characterp, consp, eq, eql, equal, expt, floatp, hash-table-p, identity, integer-or-marker-p, integerp, keywordp, length, listp, member, memq, memql, mod, natnump, nlistp, not, null, number-or-marker-p, numberp, proper-list-p, rassq, safe-length, sequencep, string-equal, string-lessp, string<, string=, stringp, symbolp, vectorp. 
--- lisp/emacs-lisp/byte-opt.el | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/lisp/emacs-lisp/byte-opt.el b/lisp/emacs-lisp/byte-opt.el index 12bde8faf3..c191f438a4 100644 --- a/lisp/emacs-lisp/byte-opt.el +++ b/lisp/emacs-lisp/byte-opt.el @@ -1307,9 +1307,14 @@ byte-optimize-set ;; values if a marker is moved. (let ((pure-fns - '(% concat logand logcount logior lognot logxor - regexp-opt regexp-quote - string-to-char string-to-syntax symbol-name))) + '(% /= < <= = > >= abs arrayp ash assoc assq bool-vector-p + char-or-string-p characterp concat consp eq eql equal expt floatp + hash-table-p identity integer-or-marker-p integerp keywordp length + listp logand logcount logior lognot logxor member memq memql mod + natnump nlistp not null number-or-marker-p numberp proper-list-p + rassq regexp-opt regexp-quote safe-length sequencep string-equal + string-lessp string-to-char string-to-syntax string< string= stringp + symbol-name symbolp vectorp))) (while pure-fns (put (car pure-fns) 'pure t) (setq pure-fns (cdr pure-fns))) -- 2.17.1 ^ permalink raw reply related [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-03 19:05 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-04 14:58 ` Mattias Engdegård 0 siblings, 0 replies; 98+ messages in thread From: Mattias Engdegård @ 2020-07-04 14:58 UTC (permalink / raw) To: Andrea Corallo, Philipp Stephani; +Cc: Paul Eggert, Stefan Monnier, 42147 On 3 July 2020, at 21:05, Andrea Corallo <andrea_corallo@yahoo.it> wrote: > attached the updated version of the patch updating the pure function > classification. Thanks Andrea! Philipp Stephani raised the interesting question of (essentially) whether 'car' is pure. For the purposes of the current constant folding in the byte compiler the answer is yes, but perhaps you have wider ambitions in your work? Clearly, (car X) cannot be moved past some operations with side-effects if X is aliased: (let* ((x (list 'a)) (y (car x))) (f x) y) Here, (car x) cannot be sunk past the call to f despite x remaining unchanged (assuming lexical binding). It would be useful to know more exactly what notion of purity you require. ^ permalink raw reply [flat|nested] 98+ messages in thread
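A runnable variant of the example above makes the aliasing problem concrete. The helper `my-clobber` stands in for the unknown `f`, and all three names are hypothetical:

```elisp
(defun my-clobber (cell)
  "Destructively replace the car of CELL; stands in for an impure `f'."
  (setcar cell 'b))

(defun my-read-before ()
  "Read (car x) before the call, as the original program does."
  (let* ((x (list 'a))
         (y (car x)))
    (my-clobber x)
    y))

(defun my-read-after ()
  "Read (car x) after the call, as an incorrect code motion would."
  (let ((x (list 'a)))
    (my-clobber x)
    (car x)))

;; The variable x itself is never reassigned in either function, yet
;; the two functions return different values.
(list (my-read-before) (my-read-after))
```

The variable still holds the same cons cell, but the cell's contents changed, so sinking the read past the call alters the program's result.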
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-03 18:35 ` Mattias Engdegård 2020-07-03 18:43 ` Mattias Engdegård 2020-07-03 19:05 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-04 15:06 ` Stefan Monnier 2020-07-04 16:13 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 2020-07-05 15:26 ` Mattias Engdegård 2 siblings, 2 replies; 98+ messages in thread From: Stefan Monnier @ 2020-07-04 15:06 UTC (permalink / raw) To: Mattias Engdegård; +Cc: Paul Eggert, Andrea Corallo, 42147 > Thanks -- patch attached. Some expressions will still not be constant-folded > entirely; for example LGTM, Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-04 15:06 ` Stefan Monnier @ 2020-07-04 16:13 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 2020-07-05 13:00 ` Mattias Engdegård 2020-07-05 15:26 ` Mattias Engdegård 1 sibling, 1 reply; 98+ messages in thread From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-04 16:13 UTC (permalink / raw) To: Mattias Engdegård, Stefan Monnier; +Cc: Paul Eggert, 42147 [-- Attachment #1: Type: text/plain, Size: 2411 bytes --] Mattias Engdegård <mattiase@acm.org> writes: > On 3 July 2020, at 21:05, Andrea Corallo <andrea_corallo@yahoo.it> wrote: > >> attached the updated version of the patch updating the pure function >> classification. > > Thanks Andrea! Philipp Stephani raised the interesting question of (essentially) whether 'car' is pure. For the purposes of the current constant folding in the byte compiler the answer is yes, but perhaps you have wider ambitions in your work? > > Clearly, (car X) cannot be moved past some operations with side-effects if X is aliased: > > (let* ((x (list 'a)) > (y (car x))) > (f x) > y) > > Here, (car x) cannot be sunk past the call to f despite x remaining unchanged (assuming lexical binding). > It would be useful to know more exactly what notion of purity you require. Thanks for the observation; I was studying the situation today. I think the notion of purity has to be the same one we use in the byte compiler. The tricky part is whether the object under consideration is immutable or not. The optimizer must stay within the bounds of what is allowed in this regard. 
To put what I currently think in Elisp:

(defun aaa ()
  (let ((x (list 1 2)))
    (1+ (car x)) ; <= legally optimizable
    ))

(defun bbb ()
  (let ((x (list 1 2)))
    (f x) ; f is not pure
    (1+ (car x)) ; <= cannot optimize
    ))

(defun ccc ()
  (let ((x '(1 2)))
    (f x) ; f is not pure
    (1+ (car x)) ; <= legally optimizable because immutable
    ))

(defun ddd ()
  (let ((x (list 1 2)))
    (f x) ; f is pure
    (1+ (car x)) ; <= legally optimizable
    ))

Now, given that we are not constant-folding `cons', we are not materializing conses; as a consequence, of these examples we would optimize only `ccc' if we include `car' as pure. AFAIU this is correct, given that modifying an immutable object in `f' would be undefined. So yes, for me `car' is pure and I think we should add it to the list. BTW, reading the code of the native compiler I realized I am already deriving for its use a list of optimizable functions very similar to the one proposed. I still think it would be cleaner to classify these in byte-opt.el. Attached the updated patch where I'm adding car, car-safe, cdr, cdr-safe, max, min. Feedback welcome Thanks Andrea [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-Add-a-number-of-functions-to-pure-fns-bug-42147.patch --] [-- Type: text/x-patch, Size: 1928 bytes --] From 15a7dc91a3ef278e1beaa834fdf961844f7b80c8 Mon Sep 17 00:00:00 2001 From: Andrea Corallo <akrl@sdf.org> Date: Wed, 1 Jul 2020 10:07:57 +0200 Subject: [PATCH] * Add a number of functions to pure-fns (bug#42147) * lisp/emacs-lisp/byte-opt.el (pure-fns): Add: /=, <, <=, =, >, >=, abs, arrayp, ash, assoc, assq, bool-vector-p, car, car-safe, cdr, cdr-safe, char-or-string-p, characterp, consp, eq, eql, equal, expt, floatp, hash-table-p, identity, integer-or-marker-p, integerp, keywordp, length, listp, max, member, memq, memql, min, mod, natnump, nlistp, not, null, number-or-marker-p, numberp, proper-list-p, rassq, safe-length, sequencep, string-equal, string-lessp, string<, string=, stringp, symbolp, vectorp. 
--- lisp/emacs-lisp/byte-opt.el | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/lisp/emacs-lisp/byte-opt.el b/lisp/emacs-lisp/byte-opt.el index 12bde8faf3..6ad8f11cd7 100644 --- a/lisp/emacs-lisp/byte-opt.el +++ b/lisp/emacs-lisp/byte-opt.el @@ -1307,9 +1307,14 @@ byte-optimize-set ;; values if a marker is moved. (let ((pure-fns - '(% concat logand logcount logior lognot logxor - regexp-opt regexp-quote - string-to-char string-to-syntax symbol-name))) + '(% /= < <= = > >= abs arrayp ash assoc assq bool-vector-p car car-safe + cdr cdr-safe char-or-string-p characterp concat consp eq eql equal + expt floatp hash-table-p identity integer-or-marker-p integerp + keywordp length listp logand logcount logior lognot logxor max member + memq memql min mod natnump nlistp not null number-or-marker-p numberp + proper-list-p rassq regexp-opt regexp-quote safe-length sequencep + string-equal string-lessp string-to-char string-to-syntax string< + string= stringp symbol-name symbolp vectorp))) (while pure-fns (put (car pure-fns) 'pure t) (setq pure-fns (cdr pure-fns))) -- 2.17.1 ^ permalink raw reply related [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-04 16:13 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-05 13:00 ` Mattias Engdegård 2020-07-05 13:16 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 1 reply; 98+ messages in thread From: Mattias Engdegård @ 2020-07-05 13:00 UTC (permalink / raw) To: Andrea Corallo; +Cc: Paul Eggert, Stefan Monnier, 42147 [-- Attachment #1: Type: text/plain, Size: 1940 bytes --] On 4 July 2020, at 18:13, Andrea Corallo <andrea_corallo@yahoo.it> wrote:

> (defun bbb ()
>   (let ((x (list 1 2)))
>     (f x) ; f is not pure
>     (1+ (car x)) ; <= cannot optimize
>     ))

(A more precise property for 'f' in your examples would be 'side-effect-free' rather than 'pure'.) > BTW, reading the code of the native compiler I realized I am already > deriving for its use a list of optimizable functions very similar to the one > proposed. I still think it would be cleaner to classify these in > byte-opt.el. Certainly, but be sure to state your criteria with clarity. There must be no doubt whether or not a function should have a certain property! > Attached the updated patch where I'm adding car, car-safe, cdr, > cdr-safe, max, min. Thank you Andrea! Attached is an update with the following modifications:

* I tried to segregate pure functions that operate on mutable objects, such as car, length and equal, from the rest. This way we can more easily separate them entirely (using different properties) later on if desired.
* The list of pure functions was expanded further. Related functions were grouped rather than ordered alphabetically, because I found it easier to read this way -- you may disagree.
* 'expt' was prudently removed because it doesn't necessarily give portable results for arbitrary floating-point arguments. (exp, sin etc were not included either for the same reason.)
* It turned out that in order to bootstrap, we have to prevent the constant evaluation in the byte compiler from raising errors on invalid input. For example, the macro dired-map-over-marks expands to (essentially) (if (integerp ARG) (< ARG 0) where ARG is a macro argument that can be nil. Since < is now pure, compilation would fail despite the offending code being unreachable. As this style of code exists and is not unreasonable, the error has to be suppressed. [-- Attachment #2: 0001-Mark-more-functions-pure-bug-42147.patch --] [-- Type: application/octet-stream, Size: 4245 bytes --] From bc387b56233798fb5a2a821c58632ca9c8a9044b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org> Date: Sun, 5 Jul 2020 13:47:34 +0200 Subject: [PATCH] Mark more functions pure (bug#42147) Extend the list of 'pure' functions to many predicates and numerical functions that we are reasonably confident will give portable results. Also include various list and array accessors, because our use of purity in the byte compiler isn't affected by the mutability of arguments. * lisp/emacs-lisp/byte-opt.el: Update example in comment. (pure-fns): Add many functions. (byte-optimize-form-code-walker) Don't signal errors during evaluation of calls to pure functions with constant arguments at compile time, since such calls are not necessarily reachable. 
--- lisp/emacs-lisp/byte-opt.el | 49 +++++++++++++++++++++++++++++++------ 1 file changed, 42 insertions(+), 7 deletions(-) diff --git a/lisp/emacs-lisp/byte-opt.el b/lisp/emacs-lisp/byte-opt.el index bf9e6a728a..92c2374a07 100644 --- a/lisp/emacs-lisp/byte-opt.el +++ b/lisp/emacs-lisp/byte-opt.el @@ -557,7 +557,10 @@ byte-optimize-form-code-walker (let ((args (mapcar #'byte-optimize-form (cdr form)))) (if (and (get fn 'pure) (byte-optimize-all-constp args)) - (list 'quote (apply fn (mapcar #'eval args))) + (let ((arg-values (mapcar #'eval args))) + (condition-case nil + (list 'quote (apply fn arg-values)) + (error (cons fn args)))) (cons fn args))))))) (defun byte-optimize-all-constp (list) @@ -1274,9 +1277,9 @@ byte-optimize-set ;; Pure functions are side-effect free functions whose values depend ;; only on their arguments, not on the platform. For these functions, ;; calls with constant arguments can be evaluated at compile time. -;; This may shift runtime errors to compile time. For example, logand -;; is pure since its results are machine-independent, whereas ash is -;; not pure because (ash 1 29)'s value depends on machine word size. +;; For example, ash is pure since its results are machine-independent, +;; whereas lsh is not pure because (lsh -1 -1)'s value depends on the +;; fixnum range. ;; ;; When deciding whether a function is pure, do not worry about ;; mutable strings or markers, as they are so unlikely in real code @@ -1286,9 +1289,41 @@ byte-optimize-set ;; values if a marker is moved. 
 (let ((pure-fns - '(% concat logand logcount logior lognot logxor - regexp-opt regexp-quote - string-to-char string-to-syntax symbol-name))) + '(concat regexp-opt regexp-quote + string-to-char string-to-syntax symbol-name + eq eql + = /= < <= >= > min max + + - * / % mod abs ash 1+ 1- sqrt + logand logior lognot logxor logcount + copysign isnan ldexp float logb + floor ceiling round truncate + ffloor fceiling fround ftruncate + string= string-equal string< string-lessp + consp atom listp nlistp proper-list-p + sequencep arrayp vectorp stringp bool-vector-p hash-table-p + null not + numberp integerp floatp natnump characterp + integer-or-marker-p number-or-marker-p char-or-string-p + symbolp keywordp + type-of + identity ignore + + ;; The following functions are pure up to mutation of their + ;; arguments. This is pure enough for the purposes of + ;; constant folding, but not necessarily for all kinds of + ;; code motion. + car cdr car-safe cdr-safe nth nthcdr last + equal + length safe-length + memq memql member + ;; `assoc' and `assoc-default' are excluded since they are + ;; impure if the test function is (consider `string-match'). + assq assql rassq rassoc + plist-get lax-plist-get plist-member + aref elt + bool-vector-subsetp + bool-vector-count-population bool-vector-count-consecutive + ))) (while pure-fns (put (car pure-fns) 'pure t) (setq pure-fns (cdr pure-fns))) -- 2.21.1 (Apple Git-122.3) ^ permalink raw reply related [flat|nested] 98+ messages in thread
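The error suppression added to byte-optimize-form-code-walker can be illustrated outside the compiler. The following is a simplified sketch, not the code from the patch: `my-pure-functions`, `my-constant-form-p` and `my-fold-pure-call` are hypothetical names, and a plain list stands in for the `pure' symbol property.

```elisp
(defvar my-pure-functions '(+ < regexp-quote)
  "Hypothetical whitelist standing in for the `pure' symbol property.")

(defun my-constant-form-p (form)
  "Return non-nil if FORM is a self-evaluating constant or a quoted datum."
  (or (null form) (eq form t) (keywordp form)
      (numberp form) (stringp form)
      (eq (car-safe form) 'quote)))

(defun my-fold-pure-call (form)
  "Fold FORM to a quoted constant if possible, else return FORM unchanged."
  (let ((fn (car-safe form))
        (args (cdr-safe form)))
    (if (and (memq fn my-pure-functions)
             (not (memq nil (mapcar #'my-constant-form-p args))))
        ;; An error here only shows that the call would fail if it were
        ;; ever reached, so keep the call instead of aborting.
        (condition-case nil
            (list 'quote (apply fn (mapcar #'eval args)))
          (error form))
      form)))

(list (my-fold-pure-call '(+ 1 2))      ; folded to a quoted constant
      (my-fold-pure-call '(< nil 0))    ; would signal; left alone
      (my-fold-pure-call '(+ x 1)))     ; not constant; left alone
```

Returning the original call on error keeps unreachable code such as the dired-map-over-marks expansion compilable, at the price of no longer reporting such calls at compile time.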
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-05 13:00 ` Mattias Engdegård @ 2020-07-05 13:16 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 2020-07-06 17:20 ` Mattias Engdegård 0 siblings, 1 reply; 98+ messages in thread From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-05 13:16 UTC (permalink / raw) To: Mattias Engdegård; +Cc: Paul Eggert, Stefan Monnier, Andrea Corallo, 42147 Mattias Engdegård <mattiase@acm.org> writes: > On 4 July 2020, at 18:13, Andrea Corallo <andrea_corallo@yahoo.it> wrote:

>> (defun bbb ()
>>   (let ((x (list 1 2)))
>>     (f x) ; f is not pure
>>     (1+ (car x)) ; <= cannot optimize
>>     ))

> (A more precise property for 'f' in your examples would be 'side-effect-free' rather than 'pure'.) Right, good point! >> BTW, reading the code of the native compiler I realized I am already >> deriving for its use a list of optimizable functions very similar to the one >> proposed. I still think it would be cleaner to classify these in >> byte-opt.el. > > Certainly, but be sure to state your criteria with clarity. There must be no doubt whether or not a function should have a certain property! Absolutely; actually, part of the scope of the patch under discussion (for me) is exactly this, given that I'll just fetch this as a definition. Thanks for improving the patch! Andrea -- akrl@sdf.org ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-05 13:16 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-06 17:20 ` Mattias Engdegård 2020-07-06 21:23 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 1 reply; 98+ messages in thread From: Mattias Engdegård @ 2020-07-06 17:20 UTC (permalink / raw) To: Andrea Corallo; +Cc: Paul Eggert, Stefan Monnier, Andrea Corallo, 42147 On 5 July 2020 at 15:16, Andrea Corallo <akrl@sdf.org> wrote: > Absolutely. In fact, part of the purpose of the patch under discussion (for > me) is exactly this, since I'll simply adopt it as the definition. Very good, it looks like we are all set then. I've pushed that patch and a follow-up that removes explicit optimisation functions that are no longer useful with the added pure functions. Are you happy with this resolution, or is there anything else that needs our attention? ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-06 17:20 ` Mattias Engdegård @ 2020-07-06 21:23 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 2020-07-07 15:54 ` Mattias Engdegård 0 siblings, 1 reply; 98+ messages in thread From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-06 21:23 UTC (permalink / raw) To: Andrea Corallo, Mattias Engdegård; +Cc: Paul Eggert, Stefan Monnier, 42147 Mattias Engdegård <mattiase@acm.org> writes: > On 5 July 2020 at 15:16, Andrea Corallo <akrl@sdf.org> wrote: > >> Absolutely. In fact, part of the purpose of the patch under discussion (for >> me) is exactly this, since I'll simply adopt it as the definition. > > Very good, it looks like we are all set then. I've pushed that patch > and a follow-up that removes explicit optimisation functions that are no > longer useful with the added pure functions. Great, the branch I'm working on is already based on your commit, so I was looking forward to seeing it on master. > Are you happy with this resolution, or is there anything else that needs our attention? Yes, I am happy :) I'm not sure about the floating-point discussion that originated from this, but on my side this bug can be closed. Thanks Andrea ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-06 21:23 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-07 15:54 ` Mattias Engdegård 2020-07-07 16:24 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 1 reply; 98+ messages in thread From: Mattias Engdegård @ 2020-07-07 15:54 UTC (permalink / raw) To: Andrea Corallo; +Cc: 42147-done, Paul Eggert, Stefan Monnier, Andrea Corallo On 6 July 2020 at 23:23, Andrea Corallo <andrea_corallo@yahoo.it> wrote: > Yes, I am happy :) I'm not sure about the floating-point discussion > that originated from this, but on my side this bug can be closed. Then closed it is. I would happily write something in NEWS but -- as Eli noted -- for any noticeable change in behaviour to occur, many conditions need to be met, several of which are quite unlikely. More improvements to the constant-folding are possible and desirable. For example, I have a patch that deals with constant expressions in let-bindings, so that (let ((x (+ 1 2))) (f x)) simplifies to (f 3), with the variable x removed. This in turn generates more opportunities for further simplification and dead-code elimination. Tell me if you are interested. ^ permalink raw reply [flat|nested] 98+ messages in thread
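The simplification described in the message above, from (let ((x (+ 1 2))) (f x)) to (f 3), can be illustrated with a toy source-level rewriter. This is only a sketch under strong assumptions and not the approach of any actual patch: all `my-constprop-' names are hypothetical, only `+' on literal numbers is folded, exactly one binding and one body form are accepted, and the body must not contain binding or assignment forms.

```elisp
;;; -*- lexical-binding: t -*-

(defun my-constprop-constant-p (expr)
  "Non-nil if EXPR is nil, t, a number, a string or a keyword."
  (or (booleanp expr) (numberp expr) (stringp expr) (keywordp expr)))

(defun my-constprop-subst (var val form)
  "Return FORM with variable VAR replaced by VAL in argument positions.
Quoted data is left alone.  Nested binding forms are not understood."
  (cond ((eq form var) val)
        ((atom form) form)
        ((eq (car form) 'quote) form)
        (t (cons (car form)
                 (mapcar (lambda (arg) (my-constprop-subst var val arg))
                         (cdr form))))))

(defun my-constprop-fold (expr)
  "Fold EXPR if it is a `+' call on literal numbers, else return EXPR."
  (if (and (eq (car-safe expr) '+)
           (not (memq nil (mapcar #'numberp (cdr expr)))))
      (apply #'+ (cdr expr))
    expr))

(defun my-constprop-let (form)
  "Simplify (let ((VAR EXPR)) BODY) when EXPR folds to a constant.
Any other FORM is returned unchanged."
  (pcase form
    (`(let ((,var ,expr)) ,body)
     (let ((val (my-constprop-fold expr)))
       (if (and (symbolp var) (my-constprop-constant-p val))
           (my-constprop-subst var val body)
         form)))
    (_ form)))

(my-constprop-let '(let ((x (+ 1 2))) (f x)))   ; => (f 3)
```

A binding whose value is not a compile-time constant, such as (let ((x (g))) (f x)), is returned unchanged by this sketch.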
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-07 15:54 ` Mattias Engdegård @ 2020-07-07 16:24 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 2020-07-07 16:55 ` Mattias Engdegård 0 siblings, 1 reply; 98+ messages in thread From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-07 16:24 UTC (permalink / raw) To: Mattias Engdegård Cc: 42147-done, Paul Eggert, Stefan Monnier, Andrea Corallo Mattias Engdegård <mattiase@acm.org> writes: > On 6 July 2020 at 23:23, Andrea Corallo <andrea_corallo@yahoo.it> wrote: > >> Yes, I am happy :) I'm not sure about the floating-point discussion >> that originated from this, but on my side this bug can be closed. > > Then closed it is. I would happily write something in NEWS but -- as > Eli noted -- for any noticeable change in behaviour to occur, many > conditions need to be met, several of which are quite unlikely. > > More improvements to the constant-folding are possible and desirable. For example, I have a patch that deals with constant expressions in let-bindings, so that > > (let ((x (+ 1 2))) > (f x)) > > simplifies to (f 3), with the variable x removed. This in turn > generates more opportunities for further simplification and dead-code > elimination. Tell me if you are interested. Sure, I am. The native compiler does it already, but I'm curious to see how you do it at source level and how generic it is. Thanks Andrea ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-07 16:24 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-07 16:55 ` Mattias Engdegård 2020-07-07 17:42 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 2020-07-08 19:14 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 2 replies; 98+ messages in thread From: Mattias Engdegård @ 2020-07-07 16:55 UTC (permalink / raw) To: Andrea Corallo; +Cc: 42147-done, Paul Eggert, Stefan Monnier, Andrea Corallo [-- Attachment #1: Type: text/plain, Size: 811 bytes --] On 7 July 2020 at 18:24, Andrea Corallo <andrea_corallo@yahoo.it> wrote: > Sure, I am. The native compiler does it already, but I'm curious to see > how you do it at source level and how generic it is. Not very generic, but doing it at source level has some advantages since it can enable other source-level transformations. It's mainly a proof of concept -- for simplicity, it doesn't attempt to be overly clever in the face of loops or setq. One snag is that because Emacs inline functions (defsubst) are inlined as bytecode, they are usually not amenable to source optimisations. It is only when a defsubst is imported from a different .el file that has not yet been byte-compiled that it is integrated as source, and then the machinery in this patch will nicely propagate constant arguments into the body. [-- Attachment #2: 0001-Constprop-of-lexical-variables.patch --] [-- Type: application/octet-stream, Size: 20231 bytes --] From 6ba3ec19de923250394f86fbbdb843f43f2f519a Mon Sep 17 00:00:00 2001 From: Mattias Engdegård <mattiase@acm.org> Date: Sat, 28 Dec 2019 21:11:56 +0100 Subject: [PATCH] Constprop of lexical variables Lexical variables bound to a constant value (symbol, number or string) are substituted at their point of use and the variable then eliminated if possible. 
Example: (let ((x (+ 2 3))) (f x)) => (f 5) This reduces code size, eliminates stack operations, and enables further optimisations. The constprop mechanism is generally afraid of variable mutation, conditions and loops. * lisp/emacs-lisp/byte-opt.el (byte-optimize-enable-constprop): Master switch for constprop optimisations. (byte-optimize--lexvars, byte-optimize--vars-outside-condition) (byte-optimize--vars-outside-loop, byte-optimize--constprop-mode) (byte-optimize--dynamic-vars): New dynamic variables. (byte-optimize--substitutable-p, byte-optimize-let-form): New. (byte-optimize-form-code-walker): Adapt several clauses for constprop, and add clauses for 'setq' and 'defvar'. * test/lisp/emacs-lisp/bytecomp-tests.el (bytecomp-test-var) (bytecomp-test-get-var, bytecomp-test-identity) (byte-opt-testsuite-arith-data): Add constprop test cases. --- lisp/emacs-lisp/byte-opt.el | 299 +++++++++++++++++++------ test/lisp/emacs-lisp/bytecomp-tests.el | 59 +++++ 2 files changed, 290 insertions(+), 68 deletions(-) diff --git a/lisp/emacs-lisp/byte-opt.el b/lisp/emacs-lisp/byte-opt.el index 194ceee176..a4a14410b9 100644 --- a/lisp/emacs-lisp/byte-opt.el +++ b/lisp/emacs-lisp/byte-opt.el @@ -366,6 +366,48 @@ byte-compile-unfold-lambda \f ;;; implementing source-level optimizers +(defconst byte-optimize-enable-constprop t + "If non-nil, enable constant propagation optimisations.") + +(defvar byte-optimize--lexvars nil + "Lexical variables in scope, in reverse order of declaration. +Each element is on the form (NAME CHANGED [VALUE]), where: + NAME is the variable name, + CHANGED is a boolean indicating whether it's been changed (with setq), + VALUE, if present, is a substitutable expression. +Earlier variables shadow later ones with the same name. 
+Only lexical variables are included.") + +(defvar byte-optimize--vars-outside-condition nil + "Alist of variables lexically bound outside conditionally executed code.") + +(defvar byte-optimize--vars-outside-loop nil + "Alist of variables lexically bound outside the innermost `while' loop.") + +(defvar byte-optimize--constprop-mode t + "Constant-propagation mode: When t, constprop is always +enabled; when nil, disabled for variables bound outside the +innermost loop; when `loop', disabled for changed variables bound +outside the innermost loop.") + +(defvar byte-optimize--dynamic-vars nil + "List of variables declared as dynamic during optimisation.") + +(defun byte-optimize--substitutable-p (expr) + "Whether EXPR is a constant that can be propagated." + ;; Only consider numbers, symbols and strings to be values for substitution + ;; purposes. Numbers and symbols are immutable, and mutating string + ;; literals (or results from constant-evaluated string-returning functions) + ;; can be considered undefined. + ;; (What about other quoted values, like conses?) + (or (booleanp expr) + (numberp expr) + (stringp expr) + (and (consp expr) + (eq (car expr) 'quote) + (symbolp (cadr expr))) + (keywordp expr))) + (defun byte-optimize-form-code-walker (form for-effect) ;; ;; For normal function calls, We can just mapcar the optimizer the cdr. But @@ -377,11 +419,22 @@ byte-optimize-form-code-walker (let ((fn (car-safe form)) tmp) (cond ((not (consp form)) - (if (not (and for-effect - (or byte-compile-delete-errors - (not (symbolp form)) - (eq form t)))) - form)) + (cond + ((and for-effect + (or byte-compile-delete-errors + (not (symbolp form)) + (eq form t))) + nil) + ((symbolp form) + (let ((lexvar (assq form byte-optimize--lexvars))) + (if (and (cddr lexvar) ; Value available? + (or (eq byte-optimize--constprop-mode t) + (not (assq form byte-optimize--vars-outside-loop)) + (and (eq byte-optimize--constprop-mode 'loop) + (not (cadr lexvar))))) ; Variable unchanged? 
+ (caddr lexvar) + form))) + (t form))) ((eq fn 'quote) (if (cdr (cdr form)) (byte-compile-warn "malformed quote form: `%s'" @@ -402,29 +455,19 @@ byte-optimize-form-code-walker ;; recursively enter the optimizer for the bindings and body ;; of a let or let*. This for depth-firstness: forms that ;; are more deeply nested are optimized first. - (cons fn - (cons - (mapcar (lambda (binding) - (if (symbolp binding) - binding - (if (cdr (cdr binding)) - (byte-compile-warn "malformed let binding: `%s'" - (prin1-to-string binding))) - (list (car binding) - (byte-optimize-form (nth 1 binding) nil)))) - (nth 1 form)) - (byte-optimize-body (cdr (cdr form)) for-effect)))) + (cons fn (byte-optimize-let-form fn (cdr form) for-effect))) ((eq fn 'cond) - (cons fn - (mapcar (lambda (clause) - (if (consp clause) - (cons - (byte-optimize-form (car clause) nil) - (byte-optimize-body (cdr clause) for-effect)) - (byte-compile-warn "malformed cond form: `%s'" - (prin1-to-string clause)) - clause)) - (cdr form)))) + (let ((byte-optimize--vars-outside-condition byte-optimize--lexvars)) + (cons fn + (mapcar (lambda (clause) + (if (consp clause) + (cons + (byte-optimize-form (car clause) nil) + (byte-optimize-body (cdr clause) for-effect)) + (byte-compile-warn "malformed cond form: `%s'" + (prin1-to-string clause)) + clause)) + (cdr form))))) ((eq fn 'progn) ;; As an extra added bonus, this simplifies (progn <x>) --> <x>. (if (cdr (cdr form)) @@ -456,36 +499,54 @@ byte-optimize-form-code-walker (byte-compile-warn "too few arguments for `if'")) (cons fn (cons (byte-optimize-form (nth 1 form) nil) - (cons - (byte-optimize-form (nth 2 form) for-effect) - (byte-optimize-body (nthcdr 3 form) for-effect))))) + (let ((byte-optimize--vars-outside-condition + byte-optimize--lexvars)) + (cons + (byte-optimize-form (nth 2 form) for-effect) + (byte-optimize-body (nthcdr 3 form) for-effect)))))) ((memq fn '(and or)) ; Remember, and/or are control structures. 
- ;; Take forms off the back until we can't any more. - ;; In the future it could conceivably be a problem that the - ;; subexpressions of these forms are optimized in the reverse - ;; order, but it's ok for now. - (if for-effect - (let ((backwards (reverse (cdr form)))) - (while (and backwards - (null (setcar backwards - (byte-optimize-form (car backwards) - for-effect)))) - (setq backwards (cdr backwards))) - (if (and (cdr form) (null backwards)) - (byte-compile-log - " all subforms of %s called for effect; deleted" form)) - (and backwards - (cons fn (nreverse (mapcar 'byte-optimize-form - backwards))))) - (cons fn (mapcar 'byte-optimize-form (cdr form))))) - - ((eq fn 'while) - (unless (consp (cdr form)) - (byte-compile-warn "too few arguments for `while'")) - (cons fn - (cons (byte-optimize-form (cadr form) nil) - (byte-optimize-body (cddr form) t)))) + ;; We have to optimise in left-to-right order, but doing so + ;; we miss some optimisation opportunities: consider + ;; (and A B) in a for-effect context, where B => nil. + ;; Then A could be optimised in a for-effect context too. + (let ((tail (cdr form)) + (args nil)) + (when tail + ;; The first argument is always unconditional. + (push (byte-optimize-form + (car tail) (and for-effect (null (cdr tail)))) + args) + (setq tail (cdr tail)) + ;; Remaining arguments are conditional. + (let ((byte-optimize--vars-outside-condition + byte-optimize--lexvars)) + (while tail + (push (byte-optimize-form + (car tail) (and for-effect (null (cdr tail)))) + args) + (setq tail (cdr tail))))) + (cons fn (nreverse args)))) + + ((eq fn 'while) + (let ((byte-optimize--vars-outside-condition + byte-optimize--lexvars) + (byte-optimize--vars-outside-loop + byte-optimize--lexvars)) + ;; Traverse the loop twice: first without any + ;; constprop, to detect setq forms... + (let ((opt + (let ((byte-optimize--constprop-mode nil)) + (cons (byte-optimize-form (nth 1 form) nil) + (byte-optimize-body (nthcdr 2 form) t))))) + ;; ... 
then in loop mode, allowing substitution of variables + ;; bound inside all loops or not changed anywhere. + ;; This is a bit slow (exponential in the number of nested + ;; loops). + (let ((byte-optimize--constprop-mode 'loop)) + (cons fn + (cons (byte-optimize-form (car opt) nil) + (byte-optimize-body (cdr opt) t))))))) ((eq fn 'interactive) (byte-compile-warn "misplaced interactive spec: `%s'" @@ -498,12 +559,14 @@ byte-optimize-form-code-walker form) ((eq fn 'condition-case) - `(condition-case ,(nth 1 form) ;Not evaluated. - ,(byte-optimize-form (nth 2 form) for-effect) - ,@(mapcar (lambda (clause) - `(,(car clause) - ,@(byte-optimize-body (cdr clause) for-effect))) - (nthcdr 3 form)))) + (let ((byte-optimize--vars-outside-condition + byte-optimize--lexvars)) + `(condition-case ,(nth 1 form) ;Not evaluated. + ,(byte-optimize-form (nth 2 form) for-effect) + ,@(mapcar (lambda (clause) + `(,(car clause) + ,@(byte-optimize-body (cdr clause) for-effect))) + (nthcdr 3 form))))) ((eq fn 'unwind-protect) ;; the "protected" part of an unwind-protect is compiled (and thus @@ -511,14 +574,21 @@ byte-optimize-form-code-walker ;; non-protected part has the same for-effect status as the ;; unwind-protect itself. (The protected part is always for effect, ;; but that isn't handled properly yet.) 
- (cons fn - (cons (byte-optimize-form (nth 1 form) for-effect) - (cdr (cdr form))))) + (let* ((byte-optimize--vars-outside-condition byte-optimize--lexvars) + (bodyform (byte-optimize-form (nth 1 form) for-effect))) + (cons fn + (cons bodyform + (pcase (cddr form) + (`(:fun-body ,f) + (list :fun-body (byte-optimize-form f nil))) + (unwindforms unwindforms)))))) ((eq fn 'catch) - (cons fn - (cons (byte-optimize-form (nth 1 form) nil) - (byte-optimize-body (cdr form) for-effect)))) + (let ((byte-optimize--vars-outside-condition + byte-optimize--lexvars)) + (cons fn + (cons (byte-optimize-form (nth 1 form) nil) + (byte-optimize-body (cdr form) for-effect))))) ((eq fn 'ignore) ;; Don't treat the args to `ignore' as being @@ -528,7 +598,45 @@ byte-optimize-form-code-walker `(prog1 nil . ,(mapcar 'byte-optimize-form (cdr form)))) ;; Needed as long as we run byte-optimize-form after cconv. - ((eq fn 'internal-make-closure) form) + ((eq fn 'internal-make-closure) + ;; Look up free vars and mark them as changed, so that they + ;; won't be optimised away. + (dolist (var (caddr form)) + (let ((lexvar (assq var byte-optimize--lexvars))) + (when lexvar + (setcar (cdr lexvar) t)))) + form) + + ((eq fn 'setq) + (let ((args (cdr form)) + (var-expr-list nil)) + (while args + (unless (and (consp args) + (symbolp (car args)) (consp (cdr args))) + (byte-compile-warn "malformed setq form: %S" form)) + (let* ((var (car args)) + (expr (cadr args)) + (lexvar (assq var byte-optimize--lexvars)) + (value (byte-optimize-form expr nil))) + (when lexvar + ;; If it's bound outside conditional, invalidate. + (if (assq var byte-optimize--vars-outside-condition) + ;; We are in conditional code and the variable was + ;; bound outside: cancel substitutions. + (setcdr (cdr lexvar) nil) + (setcdr (cdr lexvar) + (and (byte-optimize--substitutable-p value) + (list value)))) + (setcar (cdr lexvar) t)) ; Mark variable as changed. 
+ (push var var-expr-list) + (push value var-expr-list)) + (setq args (cddr args))) + (cons fn (nreverse var-expr-list)))) + + ((eq fn 'defvar) + (when (and (>= (length form) 2) (symbolp (cadr form))) + (push (cadr form) byte-optimize--dynamic-vars)) + form) ((byte-code-function-p fn) (cons fn (mapcar #'byte-optimize-form (cdr form)))) @@ -563,6 +671,60 @@ byte-optimize-form-code-walker (error (cons fn args)))) (cons fn args))))))) +(defun byte-optimize-let-form (head form for-effect) + (if (and lexical-binding byte-optimize-enable-constprop) + (let* ((byte-optimize--lexvars byte-optimize--lexvars) + (new-lexvars nil) + (let-vars nil)) + (dolist (binding (car form)) + (let* ((name (cond ((consp binding) (car binding)) + ((symbolp binding) binding) + (t + (byte-compile-warn "malformed let binding: `%S'" + binding)))) + (expr (and (consp binding) (consp (cdr binding)) + (byte-optimize-form (cadr binding) nil))) + (value (and (byte-optimize--substitutable-p expr) + (list expr))) + (lexical (not (or (special-variable-p name) + (memq name byte-compile-bound-variables) + (memq name byte-optimize--dynamic-vars)))) + (lexinfo (and lexical (cons name (cons nil value))))) + (push (cons name (cons expr (cdr lexinfo))) let-vars) + (when lexinfo + (push lexinfo (if (eq head 'let*) + byte-optimize--lexvars + new-lexvars))))) + (setq byte-optimize--lexvars + (append new-lexvars byte-optimize--lexvars)) + ;; Walk the body expressions, which may mutate some of the records, + ;; and generate new bindings that exlude unused variables. + (let* ((opt-body (byte-optimize-body (cdr form) for-effect)) + (bindings nil)) + (dolist (var let-vars) + ;; VAR is (NAME EXPR [CHANGED [VALUE]]) + (if (and (nthcdr 3 var) (not (nth 2 var)) + byte-optimize--constprop-mode) + (when nil + ;; This warning makes the compiler very chatty, but + ;; it does find the occasional mistake. 
+ (byte-compile-warn "eliminating local variable %S" (car var))) + (push (list (nth 0 var) (nth 1 var)) bindings))) + (cons bindings opt-body))) + + ;; With dynamic binding, no substitutions are in effect. + (let ((byte-optimize--lexvars nil)) + (cons + (mapcar (lambda (binding) + (if (symbolp binding) + binding + (when (or (atom binding) (cddr binding)) + (byte-compile-warn "malformed let binding: `%S'" binding)) + (list (car binding) + (byte-optimize-form (nth 1 binding) nil)))) + (car form)) + (byte-optimize-body (cdr form) for-effect))))) + (defun byte-optimize-all-constp (list) "Non-nil if all elements of LIST satisfy `macroexp-const-p'." (let ((constant t)) @@ -607,6 +769,7 @@ byte-optimize-body ;; all-for-effect is true. returns a new list of forms. (let ((rest forms) (result nil) + (byte-optimize--dynamic-vars byte-optimize--dynamic-vars) fe new) (while rest (setq fe (or all-for-effect (cdr rest))) diff --git a/test/lisp/emacs-lisp/bytecomp-tests.el b/test/lisp/emacs-lisp/bytecomp-tests.el index c235dd43fc..07320cf921 100644 --- a/test/lisp/emacs-lisp/bytecomp-tests.el +++ b/test/lisp/emacs-lisp/bytecomp-tests.el @@ -31,6 +31,15 @@ (require 'bytecomp) ;;; Code: +(defvar bytecomp-test-var nil) + +(defun bytecomp-test-get-var () + bytecomp-test-var) + +(defun bytecomp-test-identity (x) + "Identity, but hidden from some optimisations." 
+ x) + (defconst byte-opt-testsuite-arith-data '( ;; some functional tests @@ -349,6 +358,56 @@ byte-opt-testsuite-arith-data '((a c) (b c) (7 c) (-3 c) (nil nil) (t c) (q c) (r c) (s c) (t c) (x "a") (x "c") (x c) (x d) (x e))) + ;; Constprop test cases + (let ((a 'alpha) (b (concat "be" "ta")) (c nil) (d t) (e :gamma) + (f '(delta epsilon))) + (list a b c d e f)) + + (let ((x 1) (y (+ 3 4))) + (list + (let (q (y x) (z y)) + (if q x (list x y z))))) + + (let* ((x 3) (y (* x 2)) (x (1+ y))) + x) + + (let ((x 1) (bytecomp-test-var 2) (y 3)) + (list x bytecomp-test-var (bytecomp-test-get-var) y)) + + (progn + (defvar d) + (let ((x 'a) (y 'b)) (list x y))) + + (let ((x 2)) + (list x (setq x 13) (setq x (* x 2)) x)) + + (let ((x 'a) (y 'b)) + (setq y x + x (cons 'c y) + y x) + (list x y)) + + (let ((x 3)) + (let ((y x) z) + (setq x 5) + (setq y (+ y 8)) + (setq z (if (bytecomp-test-identity t) + (progn + (setq x (+ x 1)) + (list x y)) + (setq x (+ x 2)) + (list x y))) + (list x y z))) + + (let ((i 1) (s 0) (x 13)) + (while (< i 5) + (setq s (+ s i)) + (setq i (1+ i))) + (list s x i)) + + (let ((x 2)) + (list (or (bytecomp-test-identity 'a) (setq x 3)) x)) + ;; `substring' bytecode generation (bug#39709). (substring "abcdef") (substring "abcdef" 2) -- 2.21.1 (Apple Git-122.3) ^ permalink raw reply related [flat|nested] 98+ messages in thread
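The commit message of the patch above says the mechanism is "generally afraid of variable mutation, conditions and loops". The loop test case from the patch illustrates why: `i' and `s' are assigned inside the `while', so only `x' is a candidate for substitution. The second function below is a hand-written illustration of propagating `x' alone; it is not output of the patch, and both function names are hypothetical.

```elisp
;;; -*- lexical-binding: t -*-

(defun my-loop-original ()
  "The loop test case from the patch, wrapped in a function."
  (let ((i 1) (s 0) (x 13))
    (while (< i 5)
      (setq s (+ s i))
      (setq i (1+ i)))
    (list s x i)))

(defun my-loop-propagated ()
  "Hand-written variant with the unchanged variable X replaced by 13.
I and S are mutated in the loop and therefore stay as variables."
  (let ((i 1) (s 0))
    (while (< i 5)
      (setq s (+ s i))
      (setq i (1+ i)))
    (list s 13 i)))

(my-loop-original)     ; => (10 13 5)
(my-loop-propagated)   ; => (10 13 5)
```

Substituting the initial values of `i' or `s' in the same way would change the result, which is why a variable assigned inside a loop has to be treated conservatively.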
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-07 16:55 ` Mattias Engdegård @ 2020-07-07 17:42 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 2020-07-08 19:14 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 1 sibling, 0 replies; 98+ messages in thread From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-07 17:42 UTC (permalink / raw) To: Mattias Engdegård Cc: 42147-done, Paul Eggert, Stefan Monnier, Andrea Corallo Mattias Engdegård <mattiase@acm.org> writes: > On 7 July 2020 at 18:24, Andrea Corallo <andrea_corallo@yahoo.it> wrote: > >> Sure, I am. The native compiler does it already, but I'm curious to see >> how you do it at source level and how generic it is. > > Not very generic, but doing it at source level has some advantages since it can enable other source-level transformations. > It's mainly a proof of concept -- for simplicity, it doesn't attempt to be overly clever in the face of loops or setq. > > One snag is that because Emacs inline functions (defsubst) are inlined > as bytecode, they are usually not amenable to source optimisations. It > is only when a defsubst is imported from a different .el file that has > not yet been byte-compiled that it is integrated as source, and then > the machinery in this patch will nicely propagate constant arguments > into the body. Well, without a control-flow graph (CFG), loops and setq are IMO likely to be difficult, if not impossible, to optimize in a generic way. But I agree that having a simple optimization done earlier is always better, given that it can benefit other ones. Andrea ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-07 16:55 ` Mattias Engdegård 2020-07-07 17:42 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-08 19:14 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 2020-07-08 21:25 ` Mattias Engdegård 1 sibling, 1 reply; 98+ messages in thread From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-08 19:14 UTC (permalink / raw) To: Mattias Engdegård Cc: 42147-done, Paul Eggert, Stefan Monnier, Andrea Corallo Mattias Engdegård <mattiase@acm.org> writes: > +(defun byte-optimize--substitutable-p (expr) > + "Whether EXPR is a constant that can be propagated." > + ;; Only consider numbers, symbols and strings to be values for substitution > + ;; purposes. Numbers and symbols are immutable, and mutating string > + ;; literals (or results from constant-evaluated string-returning functions) > + ;; can be considered undefined. > + ;; (What about other quoted values, like conses?) > + (or (booleanp expr) > + (numberp expr) > + (stringp expr) > + (and (consp expr) > + (eq (car expr) 'quote) > + (symbolp (cadr expr))) > + (keywordp expr))) Hi Mattias, in the branch I'm working on I have a function similar in scope to this one, so this caught my attention while reading your patch. In my version I assumed (after a look at the manual) that strings are immutable only at speed 3. Is it safe to assume this always instead? I also wanted to ask why only keywords are included and not symbols: is this to respect the side effect of interning them, or something else? Thanks! Andrea -- akrl@sdf.org ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-08 19:14 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-08 21:25 ` Mattias Engdegård 2020-07-08 22:19 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 1 reply; 98+ messages in thread From: Mattias Engdegård @ 2020-07-08 21:25 UTC (permalink / raw) To: Andrea Corallo; +Cc: 42147-done, Paul Eggert, Stefan Monnier, Andrea Corallo Hello Andrea, > In my version I assumed (after a look at the manual) that strings are > immutable only at speed 3. Is it safe to assume this always instead? Ultimately it depends on the transformations you do, but yes: this patch substitutes let-bound names for their values, and since the behaviour of mutating string literals is undefined, it's safe. Consider: (let ((s "abc")) (f s) s) It doesn't matter what 'f' does; since it isn't permitted to mutate its argument string, the transformation to (progn (f "abc") "abc") is safe (assuming lexical binding, since f could otherwise set s to something else). > I also wanted to ask why only keywords are included and not symbols: is > this to respect the side effect of interning them, or something else? Symbols are included, but since this is (normalised) Lisp source, plain symbols are variables; constants of symbol type are represented as (quote SYM), matched by the and-expression. Keywords are just symbols whose names begin with a colon, like :chocolate, and need no quoting. ^ permalink raw reply [flat|nested] 98+ messages in thread
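The distinction drawn above between plain symbols, quoted symbols and keywords can be checked directly. The sketch below copies the body of `byte-optimize--substitutable-p' from the patch quoted in this thread under the hypothetical name `my-substitutable-p', so that it can be evaluated without applying the patch, and feeds it pieces of Lisp source as data.

```elisp
;;; -*- lexical-binding: t -*-

(defun my-substitutable-p (expr)
  "Copy of the predicate from the patch, renamed for experimentation."
  (or (booleanp expr)
      (numberp expr)
      (stringp expr)
      (and (consp expr)
           (eq (car expr) 'quote)
           (symbolp (cadr expr)))
      (keywordp expr)))

;; Each argument below is a piece of source code, passed as data.
(my-substitutable-p 'x)           ; => nil, a bare symbol is a variable reference
(my-substitutable-p ''x)          ; => t, the source form (quote x)
(my-substitutable-p :chocolate)   ; => t, keywords need no quoting
(my-substitutable-p "abc")        ; => t
(my-substitutable-p 42)           ; => t
(my-substitutable-p ''(1 2))      ; => nil, quoted conses are not accepted
```

The last case corresponds to the open question left in the patch comment about other quoted values.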
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-08 21:25 ` Mattias Engdegård @ 2020-07-08 22:19 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 2020-07-09 10:20 ` Mattias Engdegård 0 siblings, 1 reply; 98+ messages in thread From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-08 22:19 UTC (permalink / raw) To: Mattias Engdegård Cc: 42147-done, Paul Eggert, Stefan Monnier, Andrea Corallo Mattias Engdegård <mattiase@acm.org> writes: > Hello Andrea, > >> In my version I assumed (after a look at the manual) that strings are >> immutable only at speed 3. Is it safe to assume this always instead? > > Ultimately it depends on the transformations you do, but yes: this > patch substitutes let-bound names for their values, and since the > behaviour of mutating string literals is undefined, it's > safe. Consider: > > (let ((s "abc")) > (f s) > s) > > It doesn't matter what 'f' does; since it isn't permitted to mutate its argument string, the transformation to > > (progn (f "abc") "abc") > > is safe (assuming lexical binding, since f could otherwise set s to something else). Understood: in your code this predicate is used for substitution, so it is simply correct to substitute the original object. In mine it governs the objects we can materialize through constant folding, so I made it a little stricter. But this raises another doubt for me on the topic: why don't we have the same restriction in the byte-compiler? That is: (defun foo (x) (concat "bar" "foo")) is byte-compiled with `concat' folded because it is pure. But in doing that, the resulting string "barfoo" becomes immutable. I'd expect "barfoo" to be mutable, because without optimizations the allocation would be done at run time. 
In general, the way I imagined it is that we can optimize only pure functions returning immutable objects, to avoid the risk of unexpectedly creating objects that should not be changed. Or is it perhaps that the only way to be certain of having a mutable string is with `copy-sequence' or `make-string'? >> I also wanted to ask why only keywords are included and not symbols: is >> this to respect the side effect of interning them, or something else? > > Symbols are included, but since this is (normalised) Lisp source, > plain symbols are variables; constants of symbol type are represented > as (quote SYM), matched by the and-expression. Keywords are just > symbols whose names begin with a colon, like :chocolate, and need no > quoting. Understood. It's a little late, so I hope I'm not talking nonsense; thanks for your patience. Andrea -- akrl@sdf.org ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-08 22:19 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-09 10:20 ` Mattias Engdegård 2020-07-09 12:47 ` Stefan Monnier 0 siblings, 1 reply; 98+ messages in thread From: Mattias Engdegård @ 2020-07-09 10:20 UTC (permalink / raw) To: Andrea Corallo; +Cc: 42147-done, Paul Eggert, Stefan Monnier, Andrea Corallo On 9 July 2020 at 00:19, Andrea Corallo <akrl@sdf.org> wrote: > Or is it perhaps that the only way to be certain of having a mutable string is > with `copy-sequence' or `make-string'? Yes; since the need for a mutable value is very unlikely, creating a fresh copy every time just in case would be a big waste of resources. Otherwise, we could only 'optimise' (concat "abc" "def") to (copy-sequence "abcdef"), which would be a very modest improvement. The two functions you mentioned are indeed guaranteed to return fresh, mutable strings and thus cannot be marked 'pure'. String mutation is something of a special case: it is rare, yet its mere possibility incurs costs even for code that doesn't use it. Better to try to keep that cost to a minimum. ^ permalink raw reply [flat|nested] 98+ messages in thread
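The point above about `copy-sequence' and `make-string' returning fresh, mutable strings can be sketched as follows. The function name `my-fresh-copy-demo' is hypothetical; the literal itself is never mutated, only its copy.

```elisp
;;; -*- lexical-binding: t -*-

(defun my-fresh-copy-demo ()
  "Mutate a fresh copy of a string literal, leaving the literal alone."
  (let* ((shared "barfoo")                ; literal, treat as read-only
         (fresh (copy-sequence shared)))  ; new string object, safe to change
    (aset fresh 0 ?B)
    (list shared fresh (eq shared fresh))))

(my-fresh-copy-demo)   ; => ("barfoo" "Barfoo" nil)

;; `make-string' likewise allocates a new string on every call.
(make-string 3 ?x)     ; => "xxx"
```

Code that needs to modify the result of a foldable call such as (concat "bar" "foo") can take a copy first, in the same way.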
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-09 10:20 ` Mattias Engdegård @ 2020-07-09 12:47 ` Stefan Monnier 2020-07-09 12:57 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 1 reply; 98+ messages in thread From: Stefan Monnier @ 2020-07-09 12:47 UTC (permalink / raw) To: Mattias Engdegård Cc: 42147-done, Paul Eggert, Andrea Corallo, Andrea Corallo > String mutation is something of a special case: it is rare, yet its mere > possibility incurs costs even for code that doesn't use it. Yes and no: it's *very* rare to change the sequence of characters which compose a string, yes, but until the addition of `propertize` (in Emacs-21) mutation was the only way to add text-properties to a string, so it's still quite common. Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
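The two ways of attaching text properties mentioned above can be compared in a short sketch. The function names are hypothetical: `propertize' returns a new string and leaves its argument untouched, whereas `put-text-property' modifies the string it is given, which is why it is applied to a fresh copy here.

```elisp
;;; -*- lexical-binding: t -*-

(defun my-propertize-demo ()
  "Return the `face' of a propertized copy and of the original string."
  (let* ((plain (copy-sequence "abc"))
         (fancy (propertize plain 'face 'bold)))
    (list (get-text-property 0 'face fancy)
          (get-text-property 0 'face plain))))

(defun my-put-property-demo ()
  "Add a `face' property in place to a fresh string and read it back."
  (let ((s (copy-sequence "abc")))
    (put-text-property 0 (length s) 'face 'bold s)
    (get-text-property 0 'face s)))

(my-propertize-demo)     ; => (bold nil)
(my-put-property-demo)   ; => bold
```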
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-09 12:47 ` Stefan Monnier @ 2020-07-09 12:57 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 2020-07-09 14:35 ` Stefan Monnier 0 siblings, 1 reply; 98+ messages in thread From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-09 12:57 UTC (permalink / raw) To: Stefan Monnier Cc: Mattias Engdegård, Paul Eggert, Andrea Corallo, 42147-done Stefan Monnier <monnier@iro.umontreal.ca> writes: >> String mutation is something of a special case: it is rare, yet its mere >> possibility incurs costs even for code that doesn't use it. > > Yes and no: it's *very* rare to change the sequence of characters which > compose a string, yes, but until the addition of `propertize` (in > Emacs-21) mutation was the only way to add text-properties to a string, > so it's still quite common. Hi Stefan, What's your suggestion on this optimization? Do you think it can be on with default settings, or is it dangerous? Andrea -- akrl@sdf.org ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-09 12:57 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-09 14:35 ` Stefan Monnier 2020-07-09 15:19 ` Paul Eggert 2020-07-09 15:37 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 2 replies; 98+ messages in thread From: Stefan Monnier @ 2020-07-09 14:35 UTC (permalink / raw) To: Andrea Corallo Cc: Mattias Engdegård, Paul Eggert, Andrea Corallo, 42147-done > What's your suggestion on this optimization? Do you think it can be on with > default settings, or is it dangerous? AFAICT it's been on since Emacs-21, which seems to argue for "not too dangerous". Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-09 14:35 ` Stefan Monnier @ 2020-07-09 15:19 ` Paul Eggert 2020-07-09 15:37 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 1 sibling, 0 replies; 98+ messages in thread From: Paul Eggert @ 2020-07-09 15:19 UTC (permalink / raw) To: Stefan Monnier, Andrea Corallo Cc: Mattias Engdegård, Andrea Corallo, 42147-done On 7/9/20 7:35 AM, Stefan Monnier wrote: > AFAICT it's been on since Emacs-21, which seems to argue for "not too dangerous". I feel the same way. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-09 14:35 ` Stefan Monnier 2020-07-09 15:19 ` Paul Eggert @ 2020-07-09 15:37 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 1 sibling, 0 replies; 98+ messages in thread From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-09 15:37 UTC (permalink / raw) To: Stefan Monnier Cc: Mattias Engdegård, Paul Eggert, Andrea Corallo, 42147-done Stefan Monnier <monnier@iro.umontreal.ca> writes: >> What's your suggestion on this optimization? Do you think it can be on with >> default settings, or is it dangerous? > > AFAICT it's been on since Emacs-21, which seems to argue for "not too dangerous". Great, thanks both for your suggestions. BTW, IMO, if the only way to *certainly* get a mutable string is with `copy-sequence' or `make-string', we could document it. It is a simple rule that is easy to understand, and it clarifies the situation. Andrea -- akrl@sdf.org ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-04 15:06 ` Stefan Monnier 2020-07-04 16:13 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-05 15:26 ` Mattias Engdegård 1 sibling, 0 replies; 98+ messages in thread From: Mattias Engdegård @ 2020-07-05 15:26 UTC (permalink / raw) To: Stefan Monnier; +Cc: Paul Eggert, Andrea Corallo, 42147 4 juli 2020 kl. 17.06 skrev Stefan Monnier <monnier@iro.umontreal.ca>: > >> Thanks -- patch attached. Some expressions will still not be constant-folded >> entirely; for example > > LGTM, Thank you, pushed to master. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-03 8:32 ` Mattias Engdegård 2020-07-03 13:11 ` Stefan Monnier @ 2020-07-03 18:31 ` Paul Eggert 2020-07-03 18:47 ` Mattias Engdegård 1 sibling, 1 reply; 98+ messages in thread From: Paul Eggert @ 2020-07-03 18:31 UTC (permalink / raw) To: Mattias Engdegård; +Cc: Stefan Monnier, Andrea Corallo, 42147 On 7/3/20 1:32 AM, Mattias Engdegård wrote: > In practice, the extra precision of x87 code is so unreliable and fickle (unless the 80-bit long double is used throughout) that it's almost never worth it. Depends on whether you want reproducibility or accuracy. We prefer the former, it seems. > Fortunately modern compilers generate SSE code by default No, GCC generates x87 code by default. You need to specify -mfpmath=sse to convince it to not generate x87 code. (Or, when you build GCC, you need to pass --with-mfpmath=sse to 'configure'; but I think this is uncommon, at least in the GNU/Linux world.) Having Emacs use --with-mfpmath=sse should improve performance a bit on x86. But more important, it should make floating point more reproducible. If I get the time I'll look into having 'configure' add it automatically. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-03 18:31 ` Paul Eggert @ 2020-07-03 18:47 ` Mattias Engdegård 2020-07-04 15:57 ` Paul Eggert 0 siblings, 1 reply; 98+ messages in thread From: Mattias Engdegård @ 2020-07-03 18:47 UTC (permalink / raw) To: Paul Eggert; +Cc: Stefan Monnier, Andrea Corallo, 42147 3 juli 2020 kl. 20.31 skrev Paul Eggert <eggert@cs.ucla.edu>: > No, GCC generates x87 code by default. Sorry, my mistake -- I was testing various compiler options with godbolt and must have looked at LLVM instead. > Having Emacs use --with-mfpmath=sse should improve performance a bit on x86. But > more important, it should make floating point more reproducible. If I get the > time I'll look into having 'configure' add it automatically. Thank you, this would be a prerequisite for further improvements. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-03 18:47 ` Mattias Engdegård @ 2020-07-04 15:57 ` Paul Eggert 2020-07-04 16:15 ` Eli Zaretskii 0 siblings, 1 reply; 98+ messages in thread From: Paul Eggert @ 2020-07-04 15:57 UTC (permalink / raw) To: Mattias Engdegård; +Cc: Stefan Monnier, Andrea Corallo, 42147 [-- Attachment #1: Type: text/plain, Size: 578 bytes --] On 7/3/20 11:47 AM, Mattias Engdegård wrote: >> Having Emacs use --with-mfpmath=sse should improve performance a bit on x86. But >> more important, it should make floating point more reproducible. If I get the >> time I'll look into having 'configure' add it automatically. > Thank you, this would be a prerequisite for further improvements. Attached is a patch to do that. I looked at the GCC source code, and x86 is the only platform where this sort of thing should be necessary. I'll mention this patch on emacs-devel, to give people a heads-up before installing.

[-- Attachment #2: 0001-Prefer-standard-rounding-on-x86.patch --]
[-- Type: text/x-patch, Size: 2121 bytes --]

From df1cd25cefbbbd908efef3023598a6c72d34a176 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 4 Jul 2020 08:43:12 -0700
Subject: [PATCH] Prefer standard rounding on x86

This makes Emacs behavior more consistent across platforms,
and should make some byte-code optimizations safer (Bug#42147).
We can assume SSE2 by now, since it has been out for 20 years.
* configure.ac: On x86 with GCC, default to -msse2 -mfpmath=sse.
* etc/NEWS: Mention this.
---
 configure.ac | 11 +++++++++++
 etc/NEWS     |  6 ++++++
 2 files changed, 17 insertions(+)

diff --git a/configure.ac b/configure.ac
index 6ede6104d3..9af691078f 100644
--- a/configure.ac
+++ b/configure.ac
@@ -943,6 +943,17 @@ AC_DEFUN
   esac
 fi
 
+# Unless configured otherwise, prefer standard rounding on x86 if available.
+# This makes x86 floating-point more consistent with other platforms.
+# To build for older x86 processors not supporting SSE2 (introduced in
+# 2000), invoke 'configure' with -mno-sse2 in CFLAGS.
+case "$host_cpu, $CC $CFLAGS " in
+  *' -mfpmath='* | *' -mno-sse2 '*) ;;
+  i?86,* | x86_64,*' -m32 '*)
+    gl_COMPILER_OPTION_IF([-msse2 -mfpmath=sse],
+      [CFLAGS="$CFLAGS -msse2 -mfpmath=sse"]);;
+esac
+
 # gl_GCC_VERSION_IFELSE([major], [minor], [run-if-found], [run-if-not-found])
 # ---------------------------------------------------------------------------
 # If $CPP is gcc-MAJOR.MINOR or newer, then run RUN-IF-FOUND.
diff --git a/etc/NEWS b/etc/NEWS
index fc5c215d2a..090ebd6312 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -58,6 +58,12 @@ shaping, so 'configure' now recommends that combination.
 ** The ftx font backend driver has been removed.
 It was declared obsolete in Emacs 27.1.
 
+---
+** On x86 with GCC-like compilers, Emacs builds now assume SSE2 by default.
+This makes x86 floating-point more consistent with other platforms.
+To build for older x86 processors not supporting SSE2 (introduced in
+2000), invoke 'configure' with -mno-sse2 in CFLAGS.
+
 ---
 ** Emacs no longer supports old OpenBSD systems.
 OpenBSD 5.3 and older releases are no longer supported, as they lack
-- 
2.25.4

^ permalink raw reply related [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-04 15:57 ` Paul Eggert @ 2020-07-04 16:15 ` Eli Zaretskii 2020-07-04 16:27 ` Paul Eggert 0 siblings, 1 reply; 98+ messages in thread From: Eli Zaretskii @ 2020-07-04 16:15 UTC (permalink / raw) To: Paul Eggert; +Cc: mattiase, monnier, andrea_corallo, 42147 > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Sat, 4 Jul 2020 08:57:08 -0700 > Cc: Stefan Monnier <monnier@iro.umontreal.ca>, > Andrea Corallo <andrea_corallo@yahoo.it>, 42147@debbugs.gnu.org > > This makes Emacs behavior more consistent across platforms, > and should make some byte-code optimizations safer (Bug#42147). > We can assume SSE2 by now, since it has been out for 20 years. > * configure.ac: On x86 with GCC, default to -msse2 -mfpmath=sse. This assumes that the system on which Emacs runs is the same as where it was built, doesn't it? Is that assumption valid enough to do this by default? ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-04 16:15 ` Eli Zaretskii @ 2020-07-04 16:27 ` Paul Eggert 2020-07-04 16:33 ` Stefan Monnier 2020-07-04 17:10 ` Eli Zaretskii 0 siblings, 2 replies; 98+ messages in thread From: Paul Eggert @ 2020-07-04 16:27 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mattiase, monnier, andrea_corallo, 42147 On 7/4/20 9:15 AM, Eli Zaretskii wrote: >> * configure.ac: On x86 with GCC, default to -msse2 -mfpmath=sse. > This assumes that the system on which Emacs runs is the same as where > it was built, doesn't it? It doesn't assume they're the same system. It merely assumes that build and host platforms both support SSE2, which is a safe assumption nowadays. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-04 16:27 ` Paul Eggert @ 2020-07-04 16:33 ` Stefan Monnier 2020-07-04 16:44 ` Mattias Engdegård 2020-07-04 17:00 ` Paul Eggert 2020-07-04 17:10 ` Eli Zaretskii 1 sibling, 2 replies; 98+ messages in thread From: Stefan Monnier @ 2020-07-04 16:33 UTC (permalink / raw) To: Paul Eggert; +Cc: mattiase, andrea_corallo, 42147 >>> * configure.ac: On x86 with GCC, default to -msse2 -mfpmath=sse. >> This assumes that the system on which Emacs runs is the same as where >> it was built, doesn't it? > It doesn't assume they're the same system. It merely assumes that build and > host platforms both support SSE2, which is a safe assumption nowadays. Is there a way to tell gcc to try and avoid x87's idiosyncrasies without being platform-dependent (or at least without imposing SSE2, since I still use Emacs on my old Thinkpad X30)? Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-04 16:33 ` Stefan Monnier @ 2020-07-04 16:44 ` Mattias Engdegård 2020-07-04 17:00 ` Paul Eggert 1 sibling, 0 replies; 98+ messages in thread From: Mattias Engdegård @ 2020-07-04 16:44 UTC (permalink / raw) To: Stefan Monnier; +Cc: Paul Eggert, andrea_corallo, 42147 4 juli 2020 kl. 18.33 skrev Stefan Monnier <monnier@iro.umontreal.ca>: > Is there a way to tell gcc to try and avoid x87's idiosyncrasies without > being platform-dependent (or at least without imposing SSE2, since > I still use Emacs on my old Thinkpad X30)? There is -ffloat-store, which cures the 80 bit intermediary result problem while retaining the double-rounding one (which tends to be less common in practice). Maybe it is good enough? ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-04 16:33 ` Stefan Monnier 2020-07-04 16:44 ` Mattias Engdegård @ 2020-07-04 17:00 ` Paul Eggert 2020-07-04 18:37 ` Pip Cet 2020-07-04 19:01 ` Mattias Engdegård 1 sibling, 2 replies; 98+ messages in thread From: Paul Eggert @ 2020-07-04 17:00 UTC (permalink / raw) To: Stefan Monnier; +Cc: mattiase, andrea_corallo, 42147 On 7/4/20 9:33 AM, Stefan Monnier wrote: > Is there a way to tell gcc to try and avoid x87's idiosyncrasies without > being platform-dependent (or at least without imposing SSE2, since > I still use Emacs on my old Thinkpad X30)? Not as far as I know. GCC's -fexcess-precision=standard option tries to do that, by causing GCC to convert 80-bit results to 64-bit results after every 80-bit operation. However, this still suffers from double-rounding on the x86 unless you also specify -msse2 -mfpmath=sse. (-fexcess-precision=standard supports the C standard better than the older -ffloat-store option, which generates code that is faster but has more double-rounding problems than -fexcess-precision=standard does.) Several GNU/Linux distributions have already dropped support for x86-only hardware like the circa-2001 Intel Mobile Pentium III-M in your laptop. On the distributions that still support i686, you can still build and run Emacs on your laptop (which has SSE but not SSE2) by configuring with CFLAGS='-msse -mfpmath=sse -fexcess-precision=standard'; this should avoid some (but not all) of the rounding problems. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-04 17:00 ` Paul Eggert @ 2020-07-04 18:37 ` Pip Cet 2020-07-04 21:05 ` Stefan Monnier 2020-07-05 9:56 ` Paul Eggert 2020-07-04 19:01 ` Mattias Engdegård 1 sibling, 2 replies; 98+ messages in thread From: Pip Cet @ 2020-07-04 18:37 UTC (permalink / raw) To: Paul Eggert; +Cc: mattiase, Stefan Monnier, andrea_corallo, 42147 On Sat, Jul 4, 2020 at 5:01 PM Paul Eggert <eggert@cs.ucla.edu> wrote: > On 7/4/20 9:33 AM, Stefan Monnier wrote: > > Is there a way to tell gcc to try and avoid x87's idiosyncrasies without > > being platform-dependent (or at least without imposing SSE2, since > > I still use Emacs on my old Thinkpad X30)? > > Not as far as I know. GCC's -fexcess-precision=standard option tries to do that, > by causing GCC to convert 80-bit results to 64-bit results after every 80-bit > operation. However, this still suffers from double-rounding on the x86 unless > you also specify -msse2 -mfpmath=sse. (-fexcess-precision=standard supports the > C standard better than the older -ffloat-store option, which generates code that > is faster but has more double-rounding problems than -fexcess-precision=standard > does.) So it's a GCC bug? Wouldn't it be better to fix that? (The double-rounding issues you mention don't appear to be documented; but -fexcess-precision=standard is documented to be a nop if -mfpmath=sse is specified...) > Several GNU/Linux distributions have already dropped support for x86-only > hardware like the circa-2001 Intel Mobile Pentium III-M in your laptop. On the > distributions that still support i686, you can still build and run Emacs on your > laptop (which has SSE but not SSE2) by configuring with CFLAGS='-msse > -mfpmath=sse -fexcess-precision=standard'; this should avoid some (but not all) > of the rounding problems. I think we should either drop x86 entirely or support it, and the former is not yet an option. 
The minor issue fixed by dropping support for pre-SSE x86 can, if I understand the GCC docs correctly, be evaded by making sure the C code does the appropriate amount of casting. FWIW, SSE instructions require special OS support, too. We still have things like the (broken) snprintf implementation in sysdep.c, so people may be forgiven for getting the impression we care about non-typical hardware/OS combinations. I also think it's unwise to establish that Emacs uses 64-bit IEEE-754 floating point numbers exclusively. I'm inclined to believe that the current consensus on that particular format is likely to be temporary only. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-04 18:37 ` Pip Cet @ 2020-07-04 21:05 ` Stefan Monnier 2020-07-04 22:25 ` Pip Cet 2020-07-05 9:56 ` Paul Eggert 1 sibling, 1 reply; 98+ messages in thread From: Stefan Monnier @ 2020-07-04 21:05 UTC (permalink / raw) To: Pip Cet; +Cc: mattiase, Paul Eggert, andrea_corallo, 42147 > So it's a GCC bug? Wouldn't it be better to fix that? Not really, no: the x87 hardware makes it difficult to get the exact behavior of 64bit floats, so either you live with "almost" the right behavior or you don't use the x87 hardware (which on pre-SSE2 hardware implies a massive slow down on floating point operations). >> Several GNU/Linux distributions have already dropped support for x86-only >> hardware like the circa-2001 Intel Mobile Pentium III-M in your laptop. On the >> distributions that still support i686, you can still build and run Emacs on your >> laptop (which has SSE but not SSE2) by configuring with CFLAGS='-msse >> -mfpmath=sse -fexcess-precision=standard'; this should avoid some (but not all) >> of the rounding problems. If I need to change the flags, then I'm much more likely to just use the "normal" compilation option: I actually couldn't care less about the potential differences in rounding. I'm actually not completely sure why we care about those minor rounding differences. After all, even if we use SSE2, there's no guarantee that the output of `sin` in one version of Emacs will be exactly the same as in another version of Emacs. And while I know that there's some hope that those differences will disappear in some distant future (since someone finally figured out how to implement `sin` efficiently with perfect rounding), I don't think we'll be able to rely on that any time soon, and I don't see why `+` should not be allowed to vary while `sin` is. Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-04 21:05 ` Stefan Monnier @ 2020-07-04 22:25 ` Pip Cet 2020-07-05 2:38 ` Eli Zaretskii 0 siblings, 1 reply; 98+ messages in thread From: Pip Cet @ 2020-07-04 22:25 UTC (permalink / raw) To: Stefan Monnier; +Cc: mattiase, Paul Eggert, andrea_corallo, 42147 On Sat, Jul 4, 2020 at 9:05 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote: > > So it's a GCC bug? Wouldn't it be better to fix that? > > Not really, no: the x87 hardware makes it difficult to get the exact > behavior of 64bit floats, so either you live with "almost" the right > behavior or you don't use the x87 hardware (which on pre-SSE2 hardware > implies a massive slow down on floating point operations). Precisely. Those are the two options GCC should offer, instead of (or in addition to) three degrees of wrongness. This is obsolete hardware which does not, apparently, provide any support for adding two IEEE 754 64-bit floats. Pretending it does is not helpful and, at the very least, a bug in the documentation. > >> Several GNU/Linux distributions have already dropped support for x86-only > >> hardware like the circa-2001 Intel Mobile Pentium III-M in your laptop. On the > >> distributions that still support i686, you can still build and run Emacs on your > >> laptop (which has SSE but not SSE2) by configuring with CFLAGS='-msse > >> -mfpmath=sse -fexcess-precision=standard'; this should avoid some (but not all) > >> of the rounding problems. FWIW, that doesn't seem to have any effect here, it just uses x87 instructions. > If I need to change the flags, then I'm much more likely to just use the > "normal" compilation option: I actually couldn't care less about the > potential differences in rounding. > > I'm actually not completely sure why we care about those minor > rounding differences. Neither am I. 
If the idea is to standardize Emacs on a single floating-point representation, let's at least use the 61-bit floats Paul suggested a while back? (Incidentally, I believe those can be implemented somewhat more efficiently on x87 hardware). Or we could go with bignum ratios or GMP floats. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-04 22:25 ` Pip Cet @ 2020-07-05 2:38 ` Eli Zaretskii 2020-07-05 8:28 ` Paul Eggert 0 siblings, 1 reply; 98+ messages in thread From: Eli Zaretskii @ 2020-07-05 2:38 UTC (permalink / raw) To: Pip Cet; +Cc: mattiase, eggert, monnier, andrea_corallo, 42147 > From: Pip Cet <pipcet@gmail.com> > Date: Sat, 4 Jul 2020 22:25:55 +0000 > Cc: mattiase@acm.org, Paul Eggert <eggert@cs.ucla.edu>, andrea_corallo@yahoo.it, > 42147@debbugs.gnu.org > > > I'm actually not completely sure why we care about those minor > > rounding differences. > > Neither am I. If the idea is to standardize Emacs on a single > floating-point representation, let's at least use the 61-bit floats > Paul suggested a while back? (Incidentally, I believe those can be > implemented somewhat more efficiently on x87 hardware). Or we could go > with bignum ratios or GMP floats. Some other relevant questions: . why didn't GCC folks make SSE2 the default output? shouldn't Emacs follow the defaults, and leave it to the experts to decide which instruction set should be the default? . which other projects use this non-standard instruction set? GNU Guile, for example, doesn't, so why should we? ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-05 2:38 ` Eli Zaretskii @ 2020-07-05 8:28 ` Paul Eggert 2020-07-05 8:39 ` Andreas Schwab ` (2 more replies) 0 siblings, 3 replies; 98+ messages in thread From: Paul Eggert @ 2020-07-05 8:28 UTC (permalink / raw) To: Eli Zaretskii, Pip Cet; +Cc: mattiase, monnier, andrea_corallo, 42147 On 7/4/20 7:38 PM, Eli Zaretskii wrote: > . which other projects use this non-standard instruction set? Wait a minute, SSE2 is not "non-standard". It was introduced in 2000 as part of the evolving Intel/AMD x86 instruction set, and is supported by pretty much every x86-compatible chip that first shipped after 2003-or-so and that is a plausible candidate for Emacs. (Stefan's old laptop uses a circa-2001 chip.) To get back to your question, many software projects already require SSE2 on x86. This includes Chromium since build 35 (2014), Firefox since Firefox 53 (2017), and WebKit2GTK since version 2.24.0 (2019). Since Emacs links to WebKit2GTK by default, recent Emacs by default already requires SSE2 on x86. Similarly, many Gnome applications require SSE2 on x86, as they also link to WebKit2GTK. Among proprietary systems, Mac OS X on x86 (2006-2011) always required SSE2, and MS-Windows development platforms have been requiring SSE2 by default since 2012, which means pretty much every application built for 32-bit MS-Windows requires SSE2 nowadays. Also, all supported MS-Windows operating systems have required SSE2 since 2018, which is when Microsoft pushed out an update for MS-Windows 7 that required SSE2. So it would not at all be outlandish for Emacs to default to requiring SSE2 on GNU/Linux x86 (particularly since it does so already :-). > . why didn't GCC folks make SSE2 the default output? I suspect they think x86 is on its way out and not worth worrying about. Which is a valid point of view. 
After all, Microsoft dropped distributing 32-bit builds for MS-Windows starting with the May 2020 update, and they're pretty conservative about this sort of thing. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-05 8:28 ` Paul Eggert @ 2020-07-05 8:39 ` Andreas Schwab 2020-07-05 14:47 ` Eli Zaretskii 2020-07-05 15:11 ` Stefan Monnier 2 siblings, 0 replies; 98+ messages in thread From: Andreas Schwab @ 2020-07-05 8:39 UTC (permalink / raw) To: Paul Eggert; +Cc: Pip Cet, mattiase, 42147, monnier, andrea_corallo On Jul 05 2020, Paul Eggert wrote: >> . why didn't GCC folks make SSE2 the default output? > > I suspect they think x86 is on its way out and not worth worrying > about. It all depends on how you configure it. You can make -mfpmath=sse the default if you like, but gcc still supports pre-SSE2 configurations. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different." ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-05 8:28 ` Paul Eggert 2020-07-05 8:39 ` Andreas Schwab @ 2020-07-05 14:47 ` Eli Zaretskii 2020-07-05 15:30 ` Stefan Monnier 2020-07-05 15:11 ` Stefan Monnier 2 siblings, 1 reply; 98+ messages in thread From: Eli Zaretskii @ 2020-07-05 14:47 UTC (permalink / raw) To: Paul Eggert; +Cc: pipcet, mattiase, monnier, andrea_corallo, 42147 > Cc: monnier@iro.umontreal.ca, mattiase@acm.org, andrea_corallo@yahoo.it, > 42147@debbugs.gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Sun, 5 Jul 2020 01:28:46 -0700 > > On 7/4/20 7:38 PM, Eli Zaretskii wrote: > > > . which other projects use this non-standard instruction set? > > Wait a minute, SSE2 is not "non-standard". It was introduced in 2000 as part of > the evolving Intel/AMD x86 instruction set, and is supported by pretty much > every x86-compatible chip that first shipped after 2003-or-so and that is a > plausible candidate for Emacs. (Stefan's old laptop uses a circa-2001 chip.) That's not what I meant: I meant "non-standard" in the sense that the compiler doesn't generate these instructions by default when dealing with FP calculations. > To get back to your question, many software projects already require SSE2 on > x86. This includes Chromium since build 35 (2014), Firefox since Firefox 53 > (2017), and WebKit2GTK since version 2.24.0 (2019). Since Emacs links to > WebKit2GTK by default, recent Emacs by default already requires SSE2 on x86. > Similarly, many Gnome applications require SSE2 on x86, as they also link to > WebKit2GTK. > > Among proprietary systems, Mac OS X on x86 (2006-2011) always required SSE2, and > MS-Windows development platforms have been requiring SSE2 by default since 2012, > which means pretty much every application built for 32-bit MS-Windows requires > SSE2 nowadays. 
Also, all supported MS-Windows operating systems have required > SSE2 since 2018, which is when Microsoft pushed out an update for MS-Windows 7 > that required SSE2. Thanks, I knew all that. Emacs is not an OS, it's an application, and the issue here is about using these instructions for FP calculations, not for anything else. That was what I was asking about: applications doing FP math. > > . why didn't GCC folks make SSE2 the default output? > > I suspect they think x86 is on its way out and not worth worrying about. Then why won't we do the same, and simply ignore the issue? We have enough real development and maintenance work on our hands. Making a non-trivial change in the code we produce for Emacs, for the benefit of (at best) an insignificant speed improvement on a dying platform, while risking breakage for at least some systems, is not a good use of our scarce resources, IMO. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-05 14:47 ` Eli Zaretskii @ 2020-07-05 15:30 ` Stefan Monnier 2020-07-06 0:14 ` Paul Eggert 0 siblings, 1 reply; 98+ messages in thread From: Stefan Monnier @ 2020-07-05 15:30 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mattiase, Paul Eggert, pipcet, andrea_corallo, 42147 > That's not what I meant: I meant "non-standard" in the sense that the > compiler doesn't generate these instructions by default when dealing > with FP calculations. It depends on the architecture you generate for. It's clear that when that architecture is too old to include SSE2 instructions, GCC won't use SSE2 instructions (tho it still could at the cost of runtime tests for the presence of the feature). I don't know what GCC does if the target architecture is recent enough to include SSE2, but I'd expect it to then use SSE2 for most/all floating point operations since it generally leads to more efficient code (and it's easier for GCC to generate efficient code with it). >> I suspect they think x86 is on its way out and not worth worrying about. > Then why won't we do the same, and simply ignore the issue? Sounds good to me, Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-05 15:30 ` Stefan Monnier @ 2020-07-06 0:14 ` Paul Eggert 0 siblings, 0 replies; 98+ messages in thread From: Paul Eggert @ 2020-07-06 0:14 UTC (permalink / raw) To: Stefan Monnier, Eli Zaretskii; +Cc: mattiase, pipcet, andrea_corallo, 42147 On 7/5/20 8:30 AM, Stefan Monnier wrote: >>> I suspect they think x86 is on its way out and not worth worrying about. >> Then why won't we do the same, and simply ignore the issue? > Sounds good to me, Me too. I assume this means we needn't worry about the x86 rounding glitches, except perhaps to document them. So we can fold constant expressions involving floating point without worrying about these glitches. For sanity's sake, we should generate Emacs tarballs on a platform other than GNU/Linux x86, which is a pretty easy bit of advice to follow as hardly anybody runs that platform for software development nowadays. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-05 8:28 ` Paul Eggert 2020-07-05 8:39 ` Andreas Schwab 2020-07-05 14:47 ` Eli Zaretskii @ 2020-07-05 15:11 ` Stefan Monnier 2020-07-06 0:10 ` Paul Eggert 2 siblings, 1 reply; 98+ messages in thread From: Stefan Monnier @ 2020-07-05 15:11 UTC (permalink / raw) To: Paul Eggert; +Cc: mattiase, Pip Cet, andrea_corallo, 42147 > Since Emacs links to WebKit2GTK by default, I don't believe this is the case. Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-05 15:11 ` Stefan Monnier @ 2020-07-06 0:10 ` Paul Eggert 0 siblings, 0 replies; 98+ messages in thread From: Paul Eggert @ 2020-07-06 0:10 UTC (permalink / raw) To: Stefan Monnier; +Cc: mattiase, Pip Cet, andrea_corallo, 42147 On 7/5/20 8:11 AM, Stefan Monnier wrote: >> Since Emacs links to WebKit2GTK by default, > I don't believe this is the case. Oh, you're right. I mistakenly assumed Fedora used a default configuration, ran "ldd /usr/bin/emacs", and saw libwebkit2gtk-4.0.so.37, so I thought WebKit2GTK was in by default. To get WebKit2GTK linked, one must configure Emacs with --with-xwidgets. Fedora does that, so their x86 Emacs would require SSE2 if they still supported x86. Which they do not - they dropped support for x86 in Fedora 31 (2019). ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-04 18:37 ` Pip Cet 2020-07-04 21:05 ` Stefan Monnier @ 2020-07-05 9:56 ` Paul Eggert 2020-07-05 10:03 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 1 sibling, 1 reply; 98+ messages in thread From: Paul Eggert @ 2020-07-05 9:56 UTC (permalink / raw) To: Pip Cet; +Cc: mattiase, Stefan Monnier, andrea_corallo, 42147 On 7/4/20 11:37 AM, Pip Cet wrote: > So it's a GCC bug? Wouldn't it be better to fix that? Not without such a performance hit that nobody has wanted to do it. I expect it would need software emulation for most floating-point operations. Also, in theory the Linux kernel could emulate missing SSE2 instructions. However, this is another workaround that nobody has wanted to do because it's not worth the hassle to support these old CPUs. > (The double-rounding issues you mention don't appear to be documented; It's obvious from the generated code. -fexcess-precision doesn't mean floating-point operations conform to IEEE-754 double; all it means is that floating-point values are representable as IEEE-754 double. This is why one cannot rely on -fexcess-precision=standard to get reproducible results. > The minor issue fixed by dropping support > for pre-SSE x86 can, if I understand the GCC docs correctly, be evaded > by making sure the C code does the appropriate amount of casting. That won't suffice. On x87+387, GCC preserves excess precision through casts, even if -ffloat-store is specified. > SSE instructions require special OS support, too. The Linux kernel has supported SSE2 since Linux 2.4 (2001). Any Linux kernel that doesn't support SSE2 is so old that I suspect it won't run current Emacs anyway, for other reasons. > I also think it's unwise to establish that Emacs uses 64-bit IEEE-754 > floating point numbers exclusively. What, will we switch to IBM mainframe floating-point format? 
:-) Part of the problem we're trying to solve here is rounding discrepancies between build-time and run-time computation. The idea is that an Elisp floating-point operation should yield the same value regardless of whether it's byte-compiled (or even machine-compiled, if we want to do that). Even if Emacs changed floating-point format (by stealing a few bits for tags, say), we'd still run into the same problem. PS. I finally got around to benchmarking, and on my usual benchmark (make compile-always, Fedora 31 x86), using SSE2 sped up Emacs by 4.6%. So there is a performance argument for the change as well as a fix-the-rounding-glitches argument. ^ permalink raw reply [flat|nested] 98+ messages in thread
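A rough illustration of the build-time versus run-time concern described above: the sketch below evaluates one floating-point form twice, once through the byte compiler (which may fold it at compile time) and once through the interpreter, and compares the results. The helper name `my-float-fold-agrees-p` is made up for this example and is not part of Emacs. When both evaluations happen on the same machine the values should agree; the thread's worry is about the case where compilation and execution happen on hosts that round differently.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-float-fold-agrees-p (form)
  "Return non-nil if FORM yields the same float compiled and interpreted."
  (let ((compiled (funcall (byte-compile `(lambda () ,form))))
        (interpreted (eval form t)))
    ;; `=' compares the numeric values of the two results.
    (= compiled interpreted)))

(my-float-fold-agrees-p '(* 0.1 3.0))
```

This only checks a single expression on a single host, so it cannot demonstrate the x87 glitch by itself; it just shows what "same value regardless of whether it's byte-compiled" means in practice.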
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-05 9:56 ` Paul Eggert @ 2020-07-05 10:03 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors 2020-07-05 23:57 ` Paul Eggert 0 siblings, 1 reply; 98+ messages in thread From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-05 10:03 UTC (permalink / raw) To: Paul Eggert; +Cc: mattiase, andrea_corallo, Stefan Monnier, Pip Cet, 42147 Paul Eggert <eggert@cs.ucla.edu> writes: > PS. I finally got around to benchmarking, and on my usual benchmark (make > compile-always, Fedora 31 x86), using SSE2 sped up Emacs by 4.6%. So there is a > performance argument for the change as well as a fix-the-rounding-glitches argument. Hi Paul, Wow I'm really surprised by this result! Are we doing so much floating point while compiling Emacs? I expected it to be very marginal, possibly something else is boiling here we are not considering? Regards Andrea -- akrl@sdf.org ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-05 10:03 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-05 23:57 ` Paul Eggert 0 siblings, 0 replies; 98+ messages in thread From: Paul Eggert @ 2020-07-05 23:57 UTC (permalink / raw) To: Andrea Corallo; +Cc: Pip Cet, mattiase, Stefan Monnier, andrea_corallo, 42147 On 7/5/20 3:03 AM, Andrea Corallo wrote: > Wow I'm really surprised by this result! Are we doing so much floating > point while compiling Emacs? No, and you are right to be skeptical. I checked again, and unfortunately I originally measured the performance incorrectly, as I used a script that I had forgotten adds '-march=native' which makes the benchmark results not reflect what distros normally do (unless one is assuming something like Gentoo which is far less common). When I removed -march=native and stuck with just CC='gcc -m32' versus CC='gcc -m32 -msse2 -mfpmath=sse', the performance difference went the other way on my AMD Phenom II X4 910e (2010): the version generated with -msse2 -mfpmath=sse was 3.7% slower on the 'make compile-always' benchmark. I suppose it's possible both of these numbers are artifacts, though I don't know what would cause the artifacts. At any rate obviously I was wrong to suggest that -msse2 improves performance on this benchmark. Sorry about the noise. On 7/5/20 8:07 AM, Stefan Monnier wrote: > Could it be that your measurements also included the time spent > compiling C files and that GCC is significantly faster at generating > code for SSE2? I do a plain 'make' before running 'cd lisp; make compile-always' so GCC is not involved; it's mostly an Emacs byte-compiler benchmark. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-04 17:00 ` Paul Eggert 2020-07-04 18:37 ` Pip Cet @ 2020-07-04 19:01 ` Mattias Engdegård 1 sibling, 0 replies; 98+ messages in thread From: Mattias Engdegård @ 2020-07-04 19:01 UTC (permalink / raw) To: Paul Eggert; +Cc: Stefan Monnier, andrea_corallo, 42147 4 juli 2020 kl. 19.00 skrev Paul Eggert <eggert@cs.ucla.edu>: > Not as far as I know. GCC's -fexcess-precision=standard option tries to do that, > by causing GCC to convert 80-bit results to 64-bit results after every 80-bit > operation. Not quite after every operation: y = x * a + b; With -fexcess-precision=standard, the 80-bit result of the multiplication will be used in the addition. With -ffloat-store, a 64-bit store and reload will take place between the two operations, forcing excess fraction bits to be discarded. Another option is to set the Precision Control field in the x87 control word to 53-bit precision; that pretty much eliminates double rounding, but still uses the extended exponent range (only matters in overflow and underflow cases). And it may confuse libc's carefully crafted floating-point functions. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-04 16:27 ` Paul Eggert 2020-07-04 16:33 ` Stefan Monnier @ 2020-07-04 17:10 ` Eli Zaretskii 2020-07-04 19:26 ` Paul Eggert 1 sibling, 1 reply; 98+ messages in thread From: Eli Zaretskii @ 2020-07-04 17:10 UTC (permalink / raw) To: Paul Eggert; +Cc: mattiase, monnier, andrea_corallo, 42147 > Cc: mattiase@acm.org, monnier@iro.umontreal.ca, andrea_corallo@yahoo.it, > 42147@debbugs.gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Sat, 4 Jul 2020 09:27:23 -0700 > > On 7/4/20 9:15 AM, Eli Zaretskii wrote: > >> * configure.ac: On x86 with GCC, default to -msse2 -mfpmath=sse. > > This assumes that the system on which Emacs runs is the same as where > > it was built, doesn't it? > It doesn't assume they're the same system. It merely assumes that build and > host platforms both support SSE2, which is a safe assumption nowadays. What about the effect on the ABI? If Emacs compiled with these switches is linked against libraries compiled without them, could there be problems in the produced binary? ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-04 17:10 ` Eli Zaretskii @ 2020-07-04 19:26 ` Paul Eggert 0 siblings, 0 replies; 98+ messages in thread From: Paul Eggert @ 2020-07-04 19:26 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mattiase, monnier, andrea_corallo, 42147 On 7/4/20 10:10 AM, Eli Zaretskii wrote: > What about the effect on the ABI? If Emacs compiled with these > switches is linked against libraries compiled without them, could > there be problems in the produced binary? No. These compiler options do not change the ABI. All that changes is the use of registers and temporaries within each function, not how the function deals with callers, callees, or storage visible to callers or callees. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-02 10:26 ` Mattias Engdegård 2020-07-02 10:59 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-07-02 19:09 ` Philipp Stephani 2020-07-03 9:25 ` Mattias Engdegård 1 sibling, 1 reply; 98+ messages in thread From: Philipp Stephani @ 2020-07-02 19:09 UTC (permalink / raw) To: Mattias Engdegård; +Cc: Paul Eggert, Stefan Monnier, Andrea Corallo, 42147 Am Do., 2. Juli 2020 um 12:28 Uhr schrieb Mattias Engdegård <mattiase@acm.org>: > > 1 juli 2020 kl. 23.31 skrev Andrea Corallo <andrea_corallo@yahoo.it>: > > > Another reason why I'm interested is that I reuse these > > definitions in the native compiler. > > In that case there are probably more functions you may want to consider for purity -- what about: > > < > <= >= = /= > string< string= string-equal > eq eql equal > proper-list-p > identity > memq memql member > assq assql assoc I don't think most of those are pure, as they have to "look into" an object. For example, the result of "equal" does not only depend on the argument objects, but also the objects they refer to. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-02 19:09 ` Philipp Stephani @ 2020-07-03 9:25 ` Mattias Engdegård 2020-07-25 17:09 ` Philipp Stephani 0 siblings, 1 reply; 98+ messages in thread From: Mattias Engdegård @ 2020-07-03 9:25 UTC (permalink / raw) To: Philipp Stephani; +Cc: Paul Eggert, Stefan Monnier, Andrea Corallo, 42147 2 juli 2020 kl. 21.09 skrev Philipp Stephani <p.stephani2@gmail.com>: > I don't think most of those are pure, as they have to "look into" an > object. For example, the result of "equal" does not only depend on the > argument objects, but also the objects they refer to. Unless I'm mistaken, they are pure enough for the purpose of constant folding, where the arguments are already known (constant) at compile-time; do come with a counter-example if you disagree. Were you thinking about other uses of pure functions? Perhaps our notion of purity is underspecified. ^ permalink raw reply [flat|nested] 98+ messages in thread
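To make the two positions concrete, here is a minimal sketch (the function name is invented for illustration). It shows why `equal` is not pure in the strict sense, since mutating an argument changes the answer, while also showing why that does not matter when both arguments are compile-time constants that nobody mutates.

```elisp
;; Hypothetical demo function, not part of Emacs.
(defun my-equal-before-and-after-mutation ()
  "Return the results of one `equal' call before and after a `setcar'."
  (let* ((a (list 1 2))
         (b (list 1 2))
         (before (equal a b)))
    (setcar a 0)                       ; mutate one argument in place
    (list before (equal a b))))

(my-equal-before-and-after-mutation)   ; => (t nil)
```

A call such as `(equal '(1 2) '(1 2))` has no such window for mutation, which is the case constant folding cares about.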
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-03 9:25 ` Mattias Engdegård @ 2020-07-25 17:09 ` Philipp Stephani 2020-07-25 18:10 ` Stefan Monnier 0 siblings, 1 reply; 98+ messages in thread From: Philipp Stephani @ 2020-07-25 17:09 UTC (permalink / raw) To: Mattias Engdegård; +Cc: Paul Eggert, Stefan Monnier, Andrea Corallo, 42147 Am Fr., 3. Juli 2020 um 11:25 Uhr schrieb Mattias Engdegård <mattiase@acm.org>: > > 2 juli 2020 kl. 21.09 skrev Philipp Stephani <p.stephani2@gmail.com>: > > > I don't think most of those are pure, as they have to "look into" an > > object. For example, the result of "equal" does not only depend on the > > argument objects, but also the objects they refer to. > > Unless I'm mistaken, they are pure enough for the purpose of constant folding, where the arguments are already known (constant) at compile-time; do come with a counter-example if you disagree. > > Were you thinking about other uses of pure functions? Perhaps our notion of purity is underspecified. > Yes, I think so. The term is mentioned in the Lisp manual, but I've always had trouble deciding whether a given function was pure or not. To be useful, the definition shouldn't depend on what the byte compiler happens to do; rather, we need to formally define purity (and side effect-freedom) based on the observable behavior of the function in question alone and then adapt the behavior of the byte compiler to the definition, if necessary. My working hypothesis is that "pure" is like GCC's "const" (https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html) and "side-effect free" is like GCC's "pure." That is, a side-effect free function can dereference pointers stored in Lisp_Objects, but a pure function can't. So functions like consp or eq are pure, while car or equal are merely side-effect free. 
eql is a bit of an exception, as floating-point objects and big integers are truly immutable, so it probably also qualifies as pure using this definition. ^ permalink raw reply [flat|nested] 98+ messages in thread
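The distinction in this working hypothesis can be sketched in Lisp terms as follows (the helper name is hypothetical). `consp` only inspects the object's type, so its answer survives mutation of the cons; `car` reads the contents, so its answer does not.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-observe-around-mutation (cell)
  "Return (BEFORE AFTER), each a list of (consp CELL) and (car CELL).
CELL is modified with `setcar' between the two observations."
  (let ((before (list (consp cell) (car cell))))
    (setcar cell 'changed)
    (list before (list (consp cell) (car cell)))))

(my-observe-around-mutation (list 'a 'b))   ; => ((t a) (t changed))
```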
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-25 17:09 ` Philipp Stephani @ 2020-07-25 18:10 ` Stefan Monnier 2020-07-25 20:03 ` Philipp Stephani 0 siblings, 1 reply; 98+ messages in thread From: Stefan Monnier @ 2020-07-25 18:10 UTC (permalink / raw) To: Philipp Stephani Cc: Mattias Engdegård, Paul Eggert, Andrea Corallo, 42147 > and "side-effect free" is like GCC's "pure." That is, a side-effect > free function can dereference pointers stored in Lisp_Objects, but a > pure function can't. I think it's still not very satisfactory since it's written in terms of low-level operations in the C code. I think the current intention of our "pure" goes something along the lines of: the function will always return the "same" (the sense of `eql`) value (or signal the same error) when called with `eql` arguments. IOW "the function preserves `eql`ity". Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
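One way to read the "preserves `eql`ity" idea is as a spot check: call the function twice on the same arguments and compare the results with `eql`. The helper below is a hypothetical sketch of that check. A single sample can refute the property but cannot prove it.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-preserves-eql-p (fn &rest args)
  "Call FN on ARGS twice; return non-nil if the two results are `eql'."
  (eql (apply fn args) (apply fn args)))

(list (my-preserves-eql-p #'+ 1 2)            ; same integer both times
      (my-preserves-eql-p #'concat "a" "b")   ; two freshly allocated strings
      (my-preserves-eql-p #'car '(x y)))      ; same symbol both times
;; => (t nil t)
```

Under this reading `concat` fails the check because each call allocates a new string, even though the two strings are `equal`.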
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-25 18:10 ` Stefan Monnier @ 2020-07-25 20:03 ` Philipp Stephani 2020-07-25 20:07 ` Stefan Monnier 0 siblings, 1 reply; 98+ messages in thread From: Philipp Stephani @ 2020-07-25 20:03 UTC (permalink / raw) To: Stefan Monnier; +Cc: Mattias Engdegård, Paul Eggert, Andrea Corallo, 42147 Am Sa., 25. Juli 2020 um 20:10 Uhr schrieb Stefan Monnier <monnier@iro.umontreal.ca>: > > > and "side-effect free" is like GCC's "pure." That is, a side-effect > > free function can dereference pointers stored in Lisp_Objects, but a > > pure function can't. > > I think it's still not very satisfactory since it's written in terms of > low-level operations in the C code. Agreed, I'd rather think of this hypothesis as a first step towards a definition that we can put into the manual. > > I think the current intention of our "pure" goes something along the > lines of: the function will always return the "same" (the sense of `eql`) > value (or signal the same error) when called with `eql` arguments. > IOW "the function preserves `eql`ity". > That sounds like a very reasonable definition. Do you think it's equivalent to my hypothesis and/or to the current behavior of the byte optimizer? Is it a complete definition in the sense that it gives an unambiguous yes/no answer for every current and future function? ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-25 20:03 ` Philipp Stephani @ 2020-07-25 20:07 ` Stefan Monnier 2020-07-25 20:11 ` Philipp Stephani 0 siblings, 1 reply; 98+ messages in thread From: Stefan Monnier @ 2020-07-25 20:07 UTC (permalink / raw) To: Philipp Stephani Cc: Mattias Engdegård, Paul Eggert, Andrea Corallo, 42147 > That sounds like a very reasonable definition. Do you think it's > equivalent to my hypothesis and/or to the current behavior of the byte > optimizer? Probably not exactly: there might be functions which don't always "preserve `eql`" but for which we decide nevertheless that it's OK to precompute them at compile time for pragmatic reasons. E.g. `concat`. Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-25 20:07 ` Stefan Monnier @ 2020-07-25 20:11 ` Philipp Stephani 2020-07-25 21:00 ` Mattias Engdegård 2020-07-25 21:09 ` Stefan Monnier 0 siblings, 2 replies; 98+ messages in thread From: Philipp Stephani @ 2020-07-25 20:11 UTC (permalink / raw) To: Stefan Monnier; +Cc: Mattias Engdegård, Paul Eggert, Andrea Corallo, 42147 Am Sa., 25. Juli 2020 um 22:07 Uhr schrieb Stefan Monnier <monnier@iro.umontreal.ca>: > > > That sounds like a very reasonable definition. Do you think it's > > equivalent to my hypothesis and/or to the current behavior of the byte > > optimizer? > > Probably not exactly: there might be functions which don't always > "preserve `eql`" but for which we decide nevertheless that it's OK to > precompute them at compile time for pragmatic reasons. > > E.g. `concat`. I don't think we can really do that, as that would allow the byte compiler to introduce bugs in the code, right? The manual states that "This function [concat] always constructs a new string that is not ‘eq’ to any existing string" so I don't see how it could ever be pure. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-25 20:11 ` Philipp Stephani @ 2020-07-25 21:00 ` Mattias Engdegård 2020-07-25 21:29 ` Stefan Monnier 2020-07-29 13:10 ` Philipp Stephani 2020-07-25 21:09 ` Stefan Monnier 1 sibling, 2 replies; 98+ messages in thread From: Mattias Engdegård @ 2020-07-25 21:00 UTC (permalink / raw) To: Philipp Stephani; +Cc: Paul Eggert, Stefan Monnier, Andrea Corallo, 42147 25 juli 2020 kl. 22.11 skrev Philipp Stephani <p.stephani2@gmail.com>: > The manual states that > "This function [concat] always constructs a new string that is not > ‘eq’ to any existing string" so I don't see how it could ever be pure. Actually that part of the manual was corrected fairly recently, as that statement hasn't been true for decades. More to the point, the current set of functions marked as 'pure' are really the superset 'pure-absent-mutation': functions that are pure when it can be assumed that the arguments are not modified. This assumption can be based on physical immutability (integers), by convention (string constants), or anything else the compiler can prove such as control flow. There is also the question of what equality to use and here the answer is probably 'equal' since we are only dealing with immutables. (The return values of these functions cannot be considered mutable for obvious reasons, so make-string is out.) If you want to rename the property accordingly then I won't object. In any case, it is certainly a good idea to be precise about what the various sets really mean. There are also some functions declared 'pure' that appear to have side effects: kbd, package-get-version ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-25 21:00 ` Mattias Engdegård @ 2020-07-25 21:29 ` Stefan Monnier 2020-07-25 21:39 ` Philipp Stephani 2020-07-25 21:54 ` Mattias Engdegård 2020-07-29 13:10 ` Philipp Stephani 1 sibling, 2 replies; 98+ messages in thread From: Stefan Monnier @ 2020-07-25 21:29 UTC (permalink / raw) To: Mattias Engdegård Cc: Philipp Stephani, Paul Eggert, Andrea Corallo, 42147 > There are also some functions declared 'pure' that appear to have side > effects: kbd, package-get-version Which side-effects are you thinking of? Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-25 21:29 ` Stefan Monnier @ 2020-07-25 21:39 ` Philipp Stephani 2020-07-25 22:27 ` Stefan Monnier 2020-07-25 21:54 ` Mattias Engdegård 1 sibling, 1 reply; 98+ messages in thread From: Philipp Stephani @ 2020-07-25 21:39 UTC (permalink / raw) To: Stefan Monnier; +Cc: Mattias Engdegård, Paul Eggert, Andrea Corallo, 42147 Am Sa., 25. Juli 2020 um 23:29 Uhr schrieb Stefan Monnier <monnier@iro.umontreal.ca>: > > > There are also some functions declared 'pure' that appear to have side > > effects: kbd, package-get-version > > Which side-effects are you thinking of? > I wouldn't know about side effects, but `kbd' is definitely not pure by the "homomorphism w.r.t. eql" definition as it takes a string argument. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-25 21:39 ` Philipp Stephani @ 2020-07-25 22:27 ` Stefan Monnier 2020-07-29 12:53 ` Philipp Stephani 0 siblings, 1 reply; 98+ messages in thread From: Stefan Monnier @ 2020-07-25 22:27 UTC (permalink / raw) To: Philipp Stephani Cc: Mattias Engdegård, Paul Eggert, Andrea Corallo, 42147 >> > There are also some functions declared 'pure' that appear to have side >> > effects: kbd, package-get-version >> >> Which side-effects are you thinking of? >> > > I wouldn't know about side effects, but `kbd' is definitely not pure > by the "homomorphism w.r.t. eql" definition as it takes a string > argument. Taking string arguments is not a problem (`eql` strings are also `equal`). It's returning a fresh new string/vector that is a problem (which also affects `kbd`, indeed). Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-25 22:27 ` Stefan Monnier @ 2020-07-29 12:53 ` Philipp Stephani 2020-07-29 14:28 ` Stefan Monnier 0 siblings, 1 reply; 98+ messages in thread From: Philipp Stephani @ 2020-07-29 12:53 UTC (permalink / raw) To: Stefan Monnier; +Cc: Mattias Engdegård, Paul Eggert, Andrea Corallo, 42147 Am So., 26. Juli 2020 um 00:27 Uhr schrieb Stefan Monnier <monnier@iro.umontreal.ca>: > > >> > There are also some functions declared 'pure' that appear to have side > >> > effects: kbd, package-get-version > >> > >> Which side-effects are you thinking of? > >> > > > > I wouldn't know about side effects, but `kbd' is definitely not pure > > by the "homomorphism w.r.t. eql" definition as it takes a string > > argument. > > Taking string arguments is not a problem (`eql` strings are also > `equal`). Only if the strings aren't modified between two invocations. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-29 12:53 ` Philipp Stephani @ 2020-07-29 14:28 ` Stefan Monnier 0 siblings, 0 replies; 98+ messages in thread From: Stefan Monnier @ 2020-07-29 14:28 UTC (permalink / raw) To: Philipp Stephani Cc: Mattias Engdegård, Paul Eggert, Andrea Corallo, 42147 >> > I wouldn't know about side effects, but `kbd' is definitely not pure >> > by the "homomorphism w.r.t. eql" definition as it takes a string >> > argument. >> Taking string arguments is not a problem (`eql` strings are also `equal`). > Only if the strings aren't modified between two invocations. Indeed, you're right. In practice this is not a problem, for all kinds of circumstantial reasons, but in theory you're quite right. Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-25 21:29 ` Stefan Monnier 2020-07-25 21:39 ` Philipp Stephani @ 2020-07-25 21:54 ` Mattias Engdegård 2020-07-25 22:30 ` Stefan Monnier 1 sibling, 1 reply; 98+ messages in thread From: Mattias Engdegård @ 2020-07-25 21:54 UTC (permalink / raw) To: Stefan Monnier; +Cc: Philipp Stephani, Paul Eggert, Andrea Corallo, 42147 25 juli 2020 kl. 23.29 skrev Stefan Monnier <monnier@iro.umontreal.ca>: > >> There are also some functions declared 'pure' that appear to have side >> effects: kbd, package-get-version > > Which side-effects are you thinking of? They both clobber the match data. Not that it matters for the purpose of compile-time evaluation, but we were discussing exact definitions. ('package-get-version' does a lot more but at least it admits to lying in a comment, so I suppose that's all right then.) ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-25 21:54 ` Mattias Engdegård @ 2020-07-25 22:30 ` Stefan Monnier 2020-07-26 9:05 ` Mattias Engdegård 0 siblings, 1 reply; 98+ messages in thread From: Stefan Monnier @ 2020-07-25 22:30 UTC (permalink / raw) To: Mattias Engdegård Cc: Philipp Stephani, Paul Eggert, Andrea Corallo, 42147 >>> There are also some functions declared 'pure' that appear to have side >>> effects: kbd, package-get-version >> Which side-effects are you thinking of? > They both clobber the match data. It's not a problem w.r.t optimizing pure functions, tho: those functions don't promise to modify the match data, so it's OK to replace calls to them with their return value. But yes, it's a problem because `C-h o kbd RET` says explicitly that it doesn't change the match data :-( Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-25 22:30 ` Stefan Monnier @ 2020-07-26 9:05 ` Mattias Engdegård 2020-07-29 16:03 ` Mattias Engdegård 0 siblings, 1 reply; 98+ messages in thread From: Mattias Engdegård @ 2020-07-26 9:05 UTC (permalink / raw) To: Stefan Monnier; +Cc: Philipp Stephani, Paul Eggert, Andrea Corallo, 42147 [-- Attachment #1: Type: text/plain, Size: 272 bytes --] 26 juli 2020 kl. 00.30 skrev Stefan Monnier <monnier@iro.umontreal.ca>: > But yes, it's a problem because `C-h o kbd RET` says explicitly that it > doesn't change the match data :-( Motivation enough for them to save the match data then. What about the attached patch? [-- Attachment #2: 0001-Preserve-match-data-in-kbd-and-package-get-version.patch --] [-- Type: application/octet-stream, Size: 3660 bytes --] From ba851b70be4211695937fa7fbac7ee38bbbfa4aa Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org> Date: Sun, 26 Jul 2020 10:51:42 +0200 Subject: [PATCH] Preserve match data in 'kbd' and 'package-get-version' * lisp/emacs-lisp/package.el (package-get-version): * lisp/subr.el (kbd): Preserve match data, since these functions are declared pure (see discussion in bug#42147). --- lisp/emacs-lisp/package.el | 39 +++++++++++++++++++------------------- lisp/subr.el | 7 +++---- 2 files changed, 23 insertions(+), 23 deletions(-) diff --git a/lisp/emacs-lisp/package.el b/lisp/emacs-lisp/package.el index e6f54d206d..b87f568650 100644 --- a/lisp/emacs-lisp/package.el +++ b/lisp/emacs-lisp/package.el @@ -3936,25 +3936,26 @@ package-get-version (or (if (boundp 'byte-compile-current-file) byte-compile-current-file) load-file-name buffer-file-name))) - (cond - ((null file) nil) - ;; Packages are normally installed into directories named "<pkg>-<vers>", - ;; so get the version number from there. 
- ((string-match "/[^/]+-\\([0-9]\\(?:[0-9.]\\|pre\\|beta\\|alpha\\|snapshot\\)+\\)/[^/]+\\'" file) - (match-string 1 file)) - ;; For packages run straight from the an elpa.git clone, there's no - ;; "-<vers>" in the directory name, so we have to fetch the version - ;; the hard way. - (t - (let* ((pkgdir (file-name-directory file)) - (pkgname (file-name-nondirectory (directory-file-name pkgdir))) - (mainfile (expand-file-name (concat pkgname ".el") pkgdir))) - (when (file-readable-p mainfile) - (require 'lisp-mnt) - (with-temp-buffer - (insert-file-contents mainfile) - (or (lm-header "package-version") - (lm-header "version"))))))))) + (save-match-data ; Since the function is declared pure. + (cond + ((null file) nil) + ;; Packages are normally installed into directories named "<pkg>-<vers>", + ;; so get the version number from there. + ((string-match "/[^/]+-\\([0-9]\\(?:[0-9.]\\|pre\\|beta\\|alpha\\|snapshot\\)+\\)/[^/]+\\'" file) + (match-string 1 file)) + ;; For packages run straight from the an elpa.git clone, there's no + ;; "-<vers>" in the directory name, so we have to fetch the version + ;; the hard way. + (t + (let* ((pkgdir (file-name-directory file)) + (pkgname (file-name-nondirectory (directory-file-name pkgdir))) + (mainfile (expand-file-name (concat pkgname ".el") pkgdir))) + (when (file-readable-p mainfile) + (require 'lisp-mnt) + (with-temp-buffer + (insert-file-contents mainfile) + (or (lm-header "package-version") + (lm-header "version")))))))))) \f ;;;; Quickstart: precompute activation actions for faster start up. diff --git a/lisp/subr.el b/lisp/subr.el index 10c37e9413..70a6ec7ab2 100644 --- a/lisp/subr.el +++ b/lisp/subr.el @@ -891,10 +891,9 @@ kbd `edmacro-mode'). For an approximate inverse of this, see `key-description'." - ;; Don't use a defalias, since the `pure' property is true only for - ;; the calling convention of `kbd'. 
- (read-kbd-macro keys)) -(put 'kbd 'pure t) + (declare (pure t)) + ;; A pure function is expected to preserve the match data. + (save-match-data (read-kbd-macro keys))) (defun undefined () "Beep to tell the user this binding is undefined." -- 2.21.1 (Apple Git-122.3) ^ permalink raw reply related [flat|nested] 98+ messages in thread
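The effect the patch guards against can be sketched with a small experiment (the helper name is invented for this example). A call to `string-match` inside a function overwrites the caller's match data unless it is wrapped in `save-match-data`.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-match-beginning-after (thunk)
  "Match \"b\" in \"abc\", call THUNK, then return (match-beginning 0)."
  (string-match "b" "abc")             ; match starts at index 1
  (funcall thunk)
  (match-beginning 0))

;; Without protection the inner match (at index 2) leaks out.
(my-match-beginning-after
 (lambda () (string-match "z+" "xyzzy")))                    ; => 2

;; With `save-match-data' the caller still sees its own match.
(my-match-beginning-after
 (lambda () (save-match-data (string-match "z+" "xyzzy"))))  ; => 1
```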
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-26 9:05 ` Mattias Engdegård @ 2020-07-29 16:03 ` Mattias Engdegård 2020-07-29 20:39 ` Stefan Monnier 0 siblings, 1 reply; 98+ messages in thread From: Mattias Engdegård @ 2020-07-29 16:03 UTC (permalink / raw) To: Stefan Monnier; +Cc: Philipp Stephani, Paul Eggert, Andrea Corallo, 42147 >> But yes, it's a problem because `C-h o kbd RET` says explicitly that it >> doesn't change the match data :-( > > Motivation enough for them to save the match data then. I pushed the save-match-data around 'kbd' since that one looked rather obvious. Regarding package-get-version: perhaps we should drop the 'pure' property and just let callers wrap it in eval-when-compile? ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-29 16:03 ` Mattias Engdegård @ 2020-07-29 20:39 ` Stefan Monnier 2020-08-03 15:07 ` Mattias Engdegård 2020-08-10 13:42 ` Philipp Stephani 0 siblings, 2 replies; 98+ messages in thread From: Stefan Monnier @ 2020-07-29 20:39 UTC (permalink / raw) To: Mattias Engdegård Cc: Philipp Stephani, Paul Eggert, Andrea Corallo, 42147 > Regarding package-get-version: perhaps we should drop the 'pure' property > and just let callers wrap it in eval-when-compile? I'd rather not: the benefit is too subtle, I'd expect most users won't know/bother to use `eval-when-compile` around it even though I'd expect a vast majority of the uses can benefit from compile-time evaluation. In contrast the cases where the impurity will get in the way should be rare. Feel free to add a `save-match-data` if you think it's worth the trouble (but please include a comment explaining it's only there in order to satisfy the `pure` annotation). Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
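For comparison, this is roughly what the explicit alternative looks like on the caller's side. The function below is a made-up stand-in, not `package-get-version` itself: the wrapped form is evaluated once when the file is byte-compiled and the result is embedded as a constant, whereas in interpreted code `eval-when-compile` simply evaluates its body.

```elisp
;; Hypothetical example function, not part of Emacs.
(defun my-build-banner ()
  "Return a banner string computed at compile time when byte-compiled."
  (eval-when-compile
    (format "built with Emacs %d" emacs-major-version)))

(my-build-banner)
```

The point made in the message above is that callers have to remember to write this wrapper themselves, while a `pure` declaration lets the compiler do the equivalent automatically for constant arguments.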
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations? 2020-07-29 20:39 ` Stefan Monnier @ 2020-08-03 15:07 ` Mattias Engdegård 2020-08-10 13:39 ` Philipp Stephani 2020-08-10 13:42 ` Philipp Stephani 1 sibling, 1 reply; 98+ messages in thread From: Mattias Engdegård @ 2020-08-03 15:07 UTC (permalink / raw) To: Stefan Monnier; +Cc: Philipp Stephani, Paul Eggert, Andrea Corallo, 42147 29 juli 2020 kl. 22.39 skrev Stefan Monnier <monnier@iro.umontreal.ca>: > >> Regarding package-get-version: perhaps we should drop the 'pure' property >> and just let callers wrap it in eval-when-compile? > > I'd rather not: the benefit is too subtle, I'd expect most users won't > know/bother to use `eval-when-compile` around it even though I'd expect > a vast majority of the uses can benefit from compile-time evaluation. > In contrast the cases where the impurity will get in the way > should be rare. Feel free to add a `save-match-data` if you think it's > worth the trouble (but please include a comment explaining it's only > there in order to satisfy the `pure` annotation). Thank you, I'll let you make the decision since you introduced the function. Another possibility would be to turn it into a macro again (or more likely a macro that calls a function at compile-time), or use define-inline. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Philipp Stephani @ 2020-08-10 13:39 UTC
To: Mattias Engdegård
Cc: Paul Eggert, Stefan Monnier, Andrea Corallo, 42147

On Mon, 3 Aug 2020 at 17:07, Mattias Engdegård <mattiase@acm.org> wrote:
>
> On 29 July 2020 at 22:39, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> >
> >> Regarding package-get-version: perhaps we should drop the 'pure' property
> >> and just let callers wrap it in eval-when-compile?
> >
> > I'd rather not: the benefit is too subtle, I'd expect most users won't
> > know/bother to use `eval-when-compile` around it even though I'd expect
> > a vast majority of the uses can benefit from compile-time evaluation.
> > In contrast the cases where the impurity will get in the way
> > should be rare.  Feel free to add a `save-match-data` if you think it's
> > worth the trouble (but please include a comment explaining it's only
> > there in order to satisfy the `pure` annotation).
>
> Thank you, I'll let you make the decision since you introduced the function. Another possibility would be to turn it into a macro again (or more likely a macro that calls a function at compile-time), or use define-inline.
>

Another option would be to create a separate function property (such
as `eval-when-compile' or `force-inline') that would force the byte
compiler to inline each call, as if surrounded by `eval-when-compile'.
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Stefan Monnier @ 2020-08-10 22:07 UTC
To: Philipp Stephani
Cc: Mattias Engdegård, Paul Eggert, Andrea Corallo, 42147

> Another option would be to create a separate function property (such
> as `eval-when-compile' or `force-inline') that would force the byte
> compiler to inline each call, as if surrounded by `eval-when-compile'.

It can't always be done, so I think from that point of view what we have
is fine; the only problem is that we don't have a way to say "this can
be considered as pure by the compiler" without also saying "this doesn't
change the match-data".  So maybe a new value of `pure` is in order?

        Stefan
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Philipp Stephani @ 2020-08-10 13:42 UTC
To: Stefan Monnier
Cc: Mattias Engdegård, Paul Eggert, Andrea Corallo, 42147

On Wed, 29 Jul 2020 at 22:39, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>
> > Regarding package-get-version: perhaps we should drop the 'pure' property
> > and just let callers wrap it in eval-when-compile?
>
> I'd rather not: the benefit is too subtle, I'd expect most users won't
> know/bother to use `eval-when-compile` around it even though I'd expect
> a vast majority of the uses can benefit from compile-time evaluation.
> In contrast the cases where the impurity will get in the way
> should be rare.

Arguments relating to probabilities ("most users", "vast majority",
"rare") don't really apply here, as the discussion is about
correctness. A pure function must be pure 100% of the time, no
exceptions allowed. `package-get-version' can't possibly be pure: it
doesn't take any argument, so if it were pure, it would have to return
a constant in the mathematical sense.
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Stefan Monnier @ 2020-08-10 22:10 UTC
To: Philipp Stephani
Cc: Mattias Engdegård, Paul Eggert, Andrea Corallo, 42147

> Arguments relating to probabilities ("most users", "vast majority",
> "rare") don't really apply here, as the discussion is about
> correctness. A pure function must be pure 100% of the time, no
> exceptions allowed. `package-get-version' can't possibly be pure: it
> doesn't take any argument, so if it were pure, it would have to return
> a constant in the mathematical sense.

It's a borderline case: it's designed such that if the compiler
pre-evaluates it, the result will indeed be the same as if it were
evaluated later at run-time.  So it's correct w.r.t "allow
constant-folding", even tho it's not "pure".

        Stefan
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Philipp Stephani @ 2020-07-29 13:10 UTC
To: Mattias Engdegård
Cc: Paul Eggert, Stefan Monnier, Andrea Corallo, 42147

On Sat, 25 Jul 2020 at 23:00, Mattias Engdegård <mattiase@acm.org> wrote:
>
> On 25 July 2020 at 22:11, Philipp Stephani <p.stephani2@gmail.com> wrote:
>
> > The manual states that
> > "This function [concat] always constructs a new string that is not
> > ‘eq’ to any existing string" so I don't see how it could ever be pure.
>
> Actually that part of the manual was corrected fairly recently, as that statement hasn't been true for decades.
>
> More to the point, the current set of functions marked as 'pure' are really the superset 'pure-absent-mutation': functions that are pure when it can be assumed that the arguments are not modified. This assumption can be based on physical immutability (integers), by convention (string constants), or anything else the compiler can prove such as control flow.
>
> There is also the question of what equality to use and here the answer is probably 'equal' since we are only dealing with immutables. (The return values of these functions cannot be considered mutable for obvious reasons, so make-string is out.)
>
> If you want to rename the property accordingly then I won't object. In any case, it is certainly a good idea to be precise about what the various sets really mean.

It looks like these are really two separate concepts, which should be documented separately and should get separate function properties:

- Fully-pure functions are guaranteed to either return results that are "eql" or signal error data that is "equal-including-properties" when given arguments that are pairwise "eql", for all possible arguments, mutable or not.
- Pure-unless-mutable functions are a strict superset whose restrictions to immutable arguments are guaranteed to either return results or signal error data that are "equal-including-properties" when given arguments that are immutable and pairwise "equal-including-properties".

Both categories must be free of observable side effects as well.

>
> There are also some functions declared 'pure' that appear to have side effects: kbd, package-get-version
>

kbd can and should be made pure-unless-mutable at least. package-get-version should be neither, but should be marked using a third to-be-introduced property (e.g. "eval-when-compile") to state the intention without resorting to hacks.
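The difference between the two proposed categories can be illustrated with a small sketch. The function names `my-consp-demo` and `my-concat-demo` are hypothetical, and treating `consp` as an example of the first category and `concat` as an example of the second is an assumption made here for illustration, not a classification taken from byte-opt.el.

```elisp
(defun my-consp-demo ()
  "Call `consp' on the same cons cell before and after mutating it."
  ;; `consp' looks only at the type of its argument, so mutating the
  ;; cell between calls cannot change the answer.
  (let ((cell (list 1 2)))
    (list (consp cell)
          (progn (setcar cell 'changed) (consp cell)))))

(defun my-concat-demo ()
  "Call `concat' on the same string before and after mutating it."
  ;; `concat' gives `equal' results for `equal' arguments only as long
  ;; as the arguments are not mutated in between.
  (let* ((s (copy-sequence "abc"))     ; fresh, mutable string
         (before (concat s "d")))
    (aset s 0 ?x)                      ; mutate the argument
    (list before (concat s "d"))))

(my-consp-demo)   ; => (t t)
(my-concat-demo)  ; => ("abcd" "xbcd")
```

Folding a call like the second one at compile time is therefore only safe when the compiler can assume that the argument is not modified.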
* bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?
From: Stefan Monnier @ 2020-07-25 21:09 UTC
To: Philipp Stephani
Cc: Mattias Engdegård, Paul Eggert, Andrea Corallo, 42147

> I don't think we can really do that, as that would allow the byte
> compiler to introduce bugs in the code, right? The manual states that
> "This function [concat] always constructs a new string that is not
> ‘eq’ to any existing string" so I don't see how it could ever be pure.

And yet, `concat` has been marked as "pure" even before we introduced
the notion of pure.  More specifically, the code in byte-opt.el which
optimizes calls to pure functions was originally written exclusively
for `concat`:

    commit 79d137ffe7dac5fe3041b4916c715f4ce91143af
    Author: Karl Heuer <kwzh@gnu.org>
    Date:   Mon Nov 3 03:58:23 1997 +0000

        (byte-optimize-concat): New function.

followed by:

    commit e856a453a1c1ce1907b3b582841bce3e9cff8cec
    Author: Stefan Monnier <monnier@iro.umontreal.ca>
    Date:   Mon Mar 22 15:21:08 2004 +0000

        (byte-compile-log-lap, byte-compile-inline-expand): Use backquote.
        (byte-optimize-pure-func): Rename from byte-optimize-concat.
        (symbol-name, regexp-opt, regexp-quote): Mark as pure.

-- Stefan
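For readers unfamiliar with how a function gets "marked as pure", here is a minimal sketch. It assumes an Emacs version in which `declare` accepts the `pure` and `side-effect-free` specifiers; `my-double` is a hypothetical function, not one from byte-opt.el.

```elisp
(defun my-double (n)
  "Return twice N.  Hypothetical example function."
  ;; Tell the byte compiler that the result depends only on N and that
  ;; calling the function has no side effects.
  (declare (pure t) (side-effect-free t))
  (* 2 n))

;; The declaration is recorded as a property on the function symbol,
;; which can be read back with `function-get'.
(function-get 'my-double 'pure)
```

A compiler that honours this property may replace a call with constant arguments, such as `(my-double 21)`, by its value at compile time. Whether that happens in a given case depends on the Emacs version and on whether the function is already defined when the call is compiled.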