* bug#31688: 26.1.50; Byte compiler confuses two string variables
From: Gemini Lasswell @ 2018-06-02 17:51 UTC
To: 31688
Here is a test which succeeds when interpreted and fails when
byte-compiled. The byte compiler is apparently confusing two string
variables, or optimizing away one of them. I've tried it both
with and without lexical-binding with the same results.
To reproduce, save this to bug.el:
(require 'ert)
(ert-deftest test-strings-props ()
  (let* ((str1 "abcdefghij")
         (obj '(a b))
         (str2 "abcdefghij"))
    (put-text-property 0 5 'test obj str2)
    (should (equal "\"abcdefghij\"" (prin1-to-string str1)))))
Then:
C-u M-x byte-compile-file RET bug.el RET
M-x ert RET t RET
Result:
(ert-test-failed
 ((should
   (equal "\"abcdefghij\""
          (prin1-to-string str1)))
  :form
  (equal "\"abcdefghij\"" "#(\"abcdefghij\" 0 5 (test (a b)))")
  :value nil :explanation
  (arrays-of-different-length 12 32 "\"abcdefghij\"" "#(\"abcdefghij\" 0 5 (test (a b)))" first-mismatch-at 0)))
From: Noam Postavsky @ 2018-06-02 18:02 UTC
To: Gemini Lasswell; +Cc: 31688
Gemini Lasswell <gazally@runbox.com> writes:
> Here is a test which succeeds when interpreted and fails when
> byte-compiled. The byte compiler is apparently confusing two string
> variables, or optimizing away one of them. I've tried it both
> with and without lexical-binding with the same results.
>
> To reproduce, save this to bug.el:
>
> (require 'ert)
> (ert-deftest test-strings-props ()
>   (let* ((str1 "abcdefghij")
>          (obj '(a b))
>          (str2 "abcdefghij"))
>     (put-text-property 0 5 'test obj str2)
>     (should (equal "\"abcdefghij\"" (prin1-to-string str1)))))
I don't think this is a bug, the compiler coalesces equal string
literals. `put-text-property' modifies the string destructively, so you
shouldn't use it on literals, for the same reason you shouldn't use
destructive operations on quoted list literals. Another example, not
dependent on compilation:
(defun foo (prop val)
  (let ((s "xyz"))
    (put-text-property 0 3 prop val s)
    s))
(foo 'x 1) ;=> #("xyz" 0 3 (x 1))
(foo 'y 2) ;=> #("xyz" 0 3 (x 1 y 2))
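A minimal sketch of one way to avoid accumulating properties on the literal, assuming the goal is a fresh string on every call. The name `foo-fresh` is hypothetical and is not part of the original message; `copy-sequence` is the standard function that returns a new string object.

```elisp
;; Hypothetical variant of `foo' above: copy the literal before
;; modifying it, so the constant embedded in the code is never touched.
(defun foo-fresh (prop val)
  "Return a fresh copy of \"xyz\" with PROP set to VAL on all of it."
  (let ((s (copy-sequence "xyz")))
    (put-text-property 0 3 prop val s)
    s))

(foo-fresh 'x 1) ;=> #("xyz" 0 3 (x 1))
(foo-fresh 'y 2) ;=> #("xyz" 0 3 (y 2))
```

Each call modifies only its own copy, so the second result no longer carries the property set by the first call.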
From: Gemini Lasswell @ 2018-06-02 22:52 UTC
To: Noam Postavsky; +Cc: 31688
Noam Postavsky <npostavs@gmail.com> writes:
> I don't think this is a bug, the compiler coalesces equal string
> literals. `put-text-property' modifies the string destructively, so you
> shouldn't use it on literals, for the same reason you shouldn't use
> destructive operations on quoted list literals.
Thanks for the explanation. I've just searched the Elisp reference
looking for any warnings not to use destructive functions on
literals, and didn't find anything. Did I miss it? If not, it seems to
me that the node "Self-Evaluating Forms" would be a good place to
discuss the subject.
From: Drew Adams @ 2018-06-02 23:03 UTC
To: Noam Postavsky, Gemini Lasswell; +Cc: 31688
> I don't think this is a bug, the compiler coalesces equal string
> literals. `put-text-property' modifies the string destructively, so you
> shouldn't use it on literals, for the same reason you shouldn't use
> destructive operations on quoted list literals. Another example, not
> dependent on compilation:
>
> (defun foo (prop val)
>   (let ((s "xyz"))
>     (put-text-property 0 3 prop val s)
>     s))
>
> (foo 'x 1) ;=> #("xyz" 0 3 (x 1))
> (foo 'y 2) ;=> #("xyz" 0 3 (x 1 y 2))
Yes. See also this, about Common Lisp:
https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node74.html
A snippet of it:
An additional problem with eq is that the implementation
is permitted to ``collapse'' constants (or portions thereof)
appearing in code to be compiled if they are equal. An object
is considered to be a constant in code to be compiled if it
is a self-evaluating form or is contained in a quote form.
This is why (eq "Foo" "Foo") might be true or false; in
interpreted code it would normally be false, because reading
in the form (eq "Foo" "Foo") would construct distinct strings
for the two arguments to eq, but the compiler might choose to
use the same identical string or two distinct copies as the
two arguments in the call to eq.
From: Noam Postavsky @ 2018-06-02 23:25 UTC
To: Gemini Lasswell; +Cc: 31688
Gemini Lasswell <gazally@runbox.com> writes:
> Thanks for the explanation. I've just searched the Elisp reference
> looking for any warnings not to use destructive functions on
> literals, and didn't find anything. Did I miss it? If not, it seems to
> me that the node "Self-Evaluating Forms" would be a good place to
> discuss the subject.
There is a description with reference to `nconc' in `(elisp)
Rearrangement'. See also Bug#23417.
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=23417
From: Phil Sainty @ 2018-06-02 23:38 UTC
To: Noam Postavsky; +Cc: Gemini Lasswell, bug-gnu-emacs, 31688
On 2018-06-03 06:02, Noam Postavsky wrote:
> I don't think this is a bug, the compiler coalesces equal string
> literals.
Ouch. Has this always been the case? I've been firmly under the
impression that the Lisp reader creates a new Lisp object whenever
it reads a string, so it's hugely surprising to me to learn that
(eq str1 str2) can return different results depending on whether
or not the code was byte-compiled.
I see that this is t when compiled and nil otherwise:
(let ((str1 "abc")
      (str2 "abc"))
  (eq str1 str2))
But this is nil regardless:
(eq "abc" "abc")
This seems kinda horrible?
-Phil
From: Noam Postavsky @ 2018-06-02 23:54 UTC
To: Phil Sainty; +Cc: Gemini Lasswell, bug-gnu-emacs, 31688
Phil Sainty <psainty@orcon.net.nz> writes:
> On 2018-06-03 06:02, Noam Postavsky wrote:
>> I don't think this is a bug, the compiler coalesces equal string
>> literals.
>
> Ouch. Has this always been the case? I've been firmly under the
> impression that the Lisp reader creates a new Lisp object whenever
> it reads a string,
Strictly speaking, that is correct. The reader does that. The byte
compiler doesn't preserve the object identity.
(byte-compile (lambda () (let ((str1 "abc")
                               (str2 "abc"))
                           (eq str1 str2))))
;=> #[0 "<bytecode>" ["abc"] 4]
> But this is nil regardless:
>
> (eq "abc" "abc")
Oh, looks like the compiler performs the `eq' call at compile time.
(byte-compile (lambda () (eq "abc" "abc")))
;=> #[0 "\300\207" [nil] 1]
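The two behaviours can be put side by side with a rough sketch like the following. The helper name `my-eq-literals-report` is hypothetical and not from the thread. The quoted form is built by the reader, so its two "abc" strings start out as distinct objects; `eval` runs it interpreted and `byte-compile` runs it through the compiler.

```elisp
;; Hypothetical helper: evaluate the same lambda form interpreted and
;; byte-compiled, and return both `eq' results in a list.
(defun my-eq-literals-report ()
  "Return (INTERPRETED COMPILED) for `eq' on two equal string literals."
  (let ((form '(lambda ()
                 (let ((str1 "abc")
                       (str2 "abc"))
                   (eq str1 str2)))))
    (list (funcall (eval form))
          (funcall (byte-compile form)))))

(my-eq-literals-report)
```

Going by the results reported in this thread for Emacs 26, the first element should be nil and the second t. The second value depends on whether the compiler chooses to share the constant, so it should not be relied on.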
> This seems kinda horrible?
What, you don't like optimization? ;)
From: Michael Heerdegen @ 2018-06-03 0:02 UTC
To: Phil Sainty; +Cc: Gemini Lasswell, bug-gnu-emacs, Noam Postavsky, 31688
Phil Sainty <psainty@orcon.net.nz> writes:
> Ouch. Has this always been the case? I've been firmly under the
> impression that the Lisp reader creates a new Lisp object whenever it
> reads a string [...]
Note that even interpreted
(eq "" "")
==> t
Michael.
From: Drew Adams @ 2018-06-03 0:40 UTC
To: Gemini Lasswell, Noam Postavsky; +Cc: 31688
> just searched the Elisp reference
> looking for any warnings not to use destructive functions on
> literals, and didn't find anything.
It's not really about destructive functions.
It's about the fact that you might not have two different
strings, and you should not assume that you do.
It's about `eq'. For the Emacs byte-compiler, apparently,
multiple occurrences of a string literal in the code are
compiled to the same string object. Thinking you have two
separate strings is the problem. Anything you might want
to think or say about the use of "destructive" functions
follows from the fact that you have a single string.
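As a sketch of what this means for the test from the original report, assuming the intent was for str2 to be a separate object from str1: creating str2 with `copy-sequence` guarantees a distinct string whether or not the code is compiled. The test name `test-strings-props-fresh` is hypothetical.

```elisp
(require 'ert)

;; Hypothetical rewrite of the reported test: str2 is an explicit copy,
;; so modifying it cannot affect the literal bound to str1.
(ert-deftest test-strings-props-fresh ()
  (let* ((str1 "abcdefghij")
         (obj '(a b))
         (str2 (copy-sequence "abcdefghij")))
    (put-text-property 0 5 'test obj str2)
    (should (equal "\"abcdefghij\"" (prin1-to-string str1)))))
```

With this change the property lands on the copy only, so printing str1 yields the plain string in both interpreted and byte-compiled code.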
From: Drew Adams @ 2018-06-03 0:46 UTC
To: Michael Heerdegen, Phil Sainty
Cc: Gemini Lasswell, Noam Postavsky, bug-gnu-emacs, 31688
> Note that even interpreted (eq "" "") ==> t
Yes, but that's an exception. Someone thought it was
a brilliant idea to add it as an exception in Emacs 23,
and that prevailed.
Before Emacs 23, interpreted (eq "" "") was nil, just as
(eq "abc" "abc") still is.
From: Phil Sainty @ 2018-06-03 12:32 UTC
To: Noam Postavsky; +Cc: Gemini Lasswell, bug-gnu-emacs, 31688
On 2018-06-03 11:54, Noam Postavsky wrote:
> What, you don't like optimization? ;)
I generally dislike it when byte-compiled and interpreted code
give different results.
Offhand I'm struggling to imagine there being really significant
benefits to this optimisation; whereas I can easily imagine that
any bugs on this account, while no doubt unlikely, could be
amazingly painful to track down if the bug only occurred in byte-
compiled code, and instrumenting the function for debugging
simply made the bug go away.
From: Andreas Schwab @ 2018-06-03 13:05 UTC
To: Phil Sainty; +Cc: Gemini Lasswell, bug-gnu-emacs, Noam Postavsky, 31688
On Jun 04 2018, Phil Sainty <psainty@orcon.net.nz> wrote:
> On 2018-06-03 11:54, Noam Postavsky wrote:
>> What, you don't like optimization? ;)
>
> I generally dislike it when byte-compiled and interpreted code
> give different results.
This really has nothing to do with byte-compilation. Whether literals
are shared or not should not be relied upon. You always have to be
careful when modifying values in-place.
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
From: Phil Sainty @ 2018-06-04 10:02 UTC
To: Andreas Schwab; +Cc: Gemini Lasswell, Noam Postavsky, bug-gnu-emacs, 31688
On 2018-06-04 01:05, Andreas Schwab wrote:
> On Jun 04 2018, Phil Sainty <psainty@orcon.net.nz> wrote:
>> I generally dislike it when byte-compiled and interpreted code
>> give different results.
>
> This really has nothing to do with byte-compilation. Whether
> literals are shared or not should not be relied upon. You always
> have to be careful when modifying values in-place.
I don't disagree that one ought to take care when modifying values
in-place, but my general concern is purely that the byte-compiler is
producing code which does not behave the same as the uncompiled code.
(i.e. I think my issue is specifically to do with byte-compilation,
and I would consider such discrepancies to be a problem irrespective
of the sort of code which was affected.)
Surely consistent behaviour between compiled and uncompiled code is
not only desirable, but a primary goal?
I realise (albeit vaguely) that the byte code and its interpreter are
rather different to the uncompiled versions, so I suppose this may not
be the only situation where a discrepancy results; but I think that
known cases ought to be identified and documented (and I think that
eliminating such differences may be a valuable improvement).
The "(elisp)Byte Compilation" info node could certainly do with a
child node detailing the ways in which byte-compiled code behaves
differently from uncompiled code, so that elisp authors can gain an
understanding of all these nuances from a single section of the
manual.
> Whether literals are shared or not should not be relied upon.
Why?
I mean, in this case we already know the answer, but why shouldn't the
behaviour be consistent and dependable between the two variants?
Again, it bothers me to think that someone could observe a bug when
running byte-compiled code, and try to debug it but, through the
process of instrumenting functions for debugging, unwittingly change
the behaviour of the code such that the bug no longer occurs.
-Phil
From: Eli Zaretskii @ 2018-06-04 15:58 UTC
To: Phil Sainty
Cc: gazally, bug-gnu-emacs-bounces+psainty=orcon.net.nz, schwab,
npostavs, 31688
> Date: Mon, 04 Jun 2018 22:02:27 +1200
> From: Phil Sainty <psainty@orcon.net.nz>
> Cc: Gemini Lasswell <gazally@runbox.com>, Noam Postavsky <npostavs@gmail.com>,
> bug-gnu-emacs <bug-gnu-emacs-bounces+psainty=orcon.net.nz@gnu.org>,
> 31688@debbugs.gnu.org
>
> Again, it bothers me to think that someone could observe a bug when
> running byte-compiled code, and try to debug it but, through the
> process of instrumenting functions for debugging, unwittingly change
> the behaviour of the code such that the bug no longer occurs.
Byte compilation includes optimizations, and the fact that optimized
code can behave differently from unoptimized code is well known in
every programming language. When you get differences, you have code
that relies on undefined behavior, which I believe is the point
Andreas was making.
From: Andreas Schwab @ 2018-06-04 17:01 UTC
To: Phil Sainty; +Cc: Gemini Lasswell, Noam Postavsky, bug-gnu-emacs, 31688
On Jun 04 2018, Phil Sainty <psainty@orcon.net.nz> wrote:
> Surely consistent behaviour between compiled and uncompiled code is
> not only desirable, but a primary goal?
Not if you are using self-modifying code.
Andreas.
From: Eli Zaretskii @ 2018-06-08 15:09 UTC
To: Andreas Schwab
Cc: psainty, gazally, 31688-done, npostavs,
bug-gnu-emacs-bounces+psainty=orcon.net.nz
> From: Andreas Schwab <schwab@linux-m68k.org>
> Date: Mon, 04 Jun 2018 19:01:16 +0200
> Cc: Gemini Lasswell <gazally@runbox.com>, Noam Postavsky <npostavs@gmail.com>,
> bug-gnu-emacs <bug-gnu-emacs-bounces+psainty=orcon.net.nz@gnu.org>,
> 31688@debbugs.gnu.org
>
> On Jun 04 2018, Phil Sainty <psainty@orcon.net.nz> wrote:
>
> > Surely consistent behaviour between compiled and uncompiled code is
> > not only desirable, but a primary goal?
>
> Not if you are using self-modifying code.
Thanks to everyone who participated in the discussion. I have now
added some explanation of these issues to the ELisp manual, and I'm
closing the bug report.