From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!.POSTED!not-for-mail
From: Rocky Bernstein <rocky@gnu.org>
Newsgroups: gmane.emacs.devel
Subject: Re: Emacs-devel Digest, Vol 166, Issue 137
Date: Fri, 22 Dec 2017 18:46:27 -0500
Message-ID: <CANCp2ga-FZFE7-RP_mg=2epkZRUkQ7f4znDphNSVTHOyTXVdWA@mail.gmail.com>
References: <mailman.13940.1513973159.27992.emacs-devel@gnu.org>
NNTP-Posting-Host: blaine.gmane.org
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="001a11472b56daf64c0560f66cd5"
X-Trace: blaine.gmane.org 1513986324 6781 195.159.176.226 (22 Dec 2017 23:45:24 GMT)
X-Complaints-To: usenet@blaine.gmane.org
NNTP-Posting-Date: Fri, 22 Dec 2017 23:45:24 +0000 (UTC)
Cc: Stefan Monnier <monnier@iro.umontreal.ca>
To: emacs-devel@gnu.org
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Sat Dec 23 00:45:20 2017
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([208.118.235.17])
	by blaine.gmane.org with esmtp (Exim 4.84_2)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1eSX08-0001F2-4Q
	for ged-emacs-devel@m.gmane.org; Sat, 23 Dec 2017 00:45:16 +0100
Original-Received: from localhost ([::1]:39468 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1eSX26-00040h-E5
	for ged-emacs-devel@m.gmane.org; Fri, 22 Dec 2017 18:47:18 -0500
Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:53338)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <rocky.bernstein@gmail.com>) id 1eSX1L-00040b-DK
	for emacs-devel@gnu.org; Fri, 22 Dec 2017 18:46:33 -0500
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <rocky.bernstein@gmail.com>) id 1eSX1J-0007mC-74
	for emacs-devel@gnu.org; Fri, 22 Dec 2017 18:46:31 -0500
Original-Received: from mail-qt0-x22a.google.com ([2607:f8b0:400d:c0d::22a]:36480)
	by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16)
	(Exim 4.71) (envelope-from <rocky.bernstein@gmail.com>)
	id 1eSX1I-0007lG-VY
	for emacs-devel@gnu.org; Fri, 22 Dec 2017 18:46:29 -0500
Original-Received: by mail-qt0-x22a.google.com with SMTP id a16so38046824qtj.3
	for <emacs-devel@gnu.org>; Fri, 22 Dec 2017 15:46:28 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
	h=mime-version:sender:in-reply-to:references:from:date:message-id
	:subject:to:cc;
	bh=uF5cyGAOeVmnaMIUOnv1pnjNyq7A1nGpdtJnKVyfLtg=;
	b=bjY5rFEaO0RobuDurMzUZf+F6eyKXM7EWm3vyEBMSxpgzKJ0rj8DNOHTr2byMuhKiF
	hbpDRXyJ+fy+mNmzcrEFtwDuAyPPe2L9f22I3UbY/d2Gj+Yan5j7tWj+viqllpyQaYB9
	0PkwaDfGki0baPbI6v11y3YrfGJ6AO72W+Kg3bHZ1GxEhZ/AfUS/WBJ85S3QIFLY3p2b
	K2xrJaTwlSnoa68aOeUnF5scsBwd4wIp/G26MYOLOtVoRXcueSaIlzZaIvqZhxizhUH5
	oXG9Bz8KXSxHUZzUef5+cH0HZbfHSYum5HOBFwiSvjV7CiyFf0pDKolI6Dt8YB2SDu56
	E9dA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:mime-version:sender:in-reply-to:references:from
	:date:message-id:subject:to:cc;
	bh=uF5cyGAOeVmnaMIUOnv1pnjNyq7A1nGpdtJnKVyfLtg=;
	b=tALcgfQiAzt+D3otvDbZCUUnlOZH4Z2Rv2LJhhcEqQuSy7+Q+/Hj5PabFMt4rO5GAc
	Da0fbF5nI2ospZL1DMi5LnSPugRfHzFqFQgA5ybPyNIXnn8HJI3yTtsBPRa8a1SdnwYd
	/1vxOxXfwj5SC4XkgH5X9Y+nx2nOPOrK869aNHGDdvRJ9ZiHaxUWnSLBqcLy9OCyhFzi
	ATRR4pJCuuP2UEZEL+7aSylxJKs72gkvln9lRbGJocXzLeZKyWGqDn0U30r9xmcAx56V
	YIDdkGugkUr/oWHSub2pf2um81BpNkpd8OaOFyL9COVqwm97gZJo0J1XR/u7t1wHLeDx
	lkOA==
X-Gm-Message-State: AKGB3mIEIOi4nfjeAM3zJcK+vdCu+znBj++LEFU0IPK3W0gxLgBudhaq
	EtJmS5WxqEB+0dUGqM2j3/dCCe7zh5Gw94hT2XU8k+mI
X-Google-Smtp-Source: ACJfBosvn9dgOCb6xMzs0w5Wz86b8pvxpfqGen5E8qCmfDhrifdM0UPbiRigh6JFzOv/teGd6YHmEB9rY0vpVblKwDM=
X-Received: by 10.200.41.120 with SMTP id z53mr22700360qtz.305.1513986388062; 
	Fri, 22 Dec 2017 15:46:28 -0800 (PST)
Original-Received: by 10.12.197.8 with HTTP; Fri, 22 Dec 2017 15:46:27 -0800 (PST)
In-Reply-To: <mailman.13940.1513973159.27992.emacs-devel@gnu.org>
X-Google-Sender-Auth: zr264-M2o7EtfnT-4FL3Ee9IxRw
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not
	recognized.
X-Received-From: 2607:f8b0:400d:c0d::22a
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/emacs-devel/>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Original-Sender: "Emacs-devel" <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Xref: news.gmane.org gmane.emacs.devel:221361
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/221361>

--001a11472b56daf64c0560f66cd5
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Fri, 22 Dec 2017 15:05:39 -050 Stefan Monnier informs:

>
> I think Emacs should evolve (and is evolving) towards a model where .elc
> files are handled completely automatically, so there's no need to
> preserve backward compatibility at all, because we can just recompile
> the source file.
>

If you mean always keeping the source code around in the bytecode file,
I'm all for that!

If not, we're back to that discussion on how to find the source text for a
given bytecode file and failing that (or in addition to that) having decent
decompilers for bytecode.

> [ Modulo supporting enough backward compatibility for bootstrapping
>   purposes, since I also think we should get rid of the interpreter.  ]
>
> > My understanding of how this work in a more rational way would be that
> > there shouldn't be incompatible changes between major releases.  So I
> > would hope that incompatible macro changes wouldn't happen within a
> > major release but between major releases, the same as I hope would be
> > the case for bytecode changes.
>
> In theory, that's what we aim for, yes.
>

Good. If that's the case, then most of the cases you report, such as an
incompatible macro expansion, could be detected just by checking whether
the compiler used for compilation has the same major version number as the
bytecode interpreter.
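
A rough sketch of such a check, not an existing Emacs function: the byte
compiler writes a comment of the form ";;; in Emacs version N.M" near the
top of each .elc file, so the major number can be read back from there.
The names my-elc-compiled-major-version and my-elc-major-mismatch-p are
made up for illustration, and the 1000-byte read limit is a guess.

```elisp
(defun my-elc-compiled-major-version (file)
  "Return the major version of the Emacs that compiled FILE, or nil.
Hypothetical helper.  It relies on the \";;; in Emacs version N.M\"
comment that the byte compiler writes near the top of a .elc file."
  (with-temp-buffer
    ;; Only the header is needed.  Reading 1000 bytes is an assumption
    ;; about how far into the file that comment can appear.
    (insert-file-contents-literally file nil 0 1000)
    (goto-char (point-min))
    (when (re-search-forward
           "^;;; in Emacs version \\([0-9]+\\)" nil t)
      (string-to-number (match-string 1)))))

(defun my-elc-major-mismatch-p (file)
  "Return non-nil if FILE was compiled by another major version.
Return nil when the compiling version cannot be determined."
  (let ((compiled (my-elc-compiled-major-version file)))
    (and compiled
         (not (eql compiled emacs-major-version)))))
```

This only looks at the header comment; it says nothing about whether the
bytecode itself is well formed.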


> > Maybe this could be incorporated into a "safe-load-file" function.
>
> Define "safe"
>

Okay, let me call it "safer" then, and define that as detecting problems
that can reasonably be detected before hitting them, instead of giving a
=C2=AF\_(=E3=83=84)_/=C2=AF traceback.
I have recently come to learn that it can be worse than a traceback,
because no checks are done on bytecode...

Want to crash Emacs immediately without a traceback? Run:

emacs -batch -Q --eval '(print (#[0 "\300\207" [] 0]))'

How many times this year have I run into the problem, also seen by
others judging by reports on the Internet, of Emacs blithely running
what was probably an incompatible version of cl-lib?

The bytecode file for cl-lib no doubt said "Hey, I'm Emacs 24", and I
probably ran it on Emacs 25, where there was the kind of
incompatibility that can happen between major releases.

If that were the case (although it is probably not the *only*
scenario), how much nicer would it have been if a safer-load-file had
warned me about running version 24 bytecode.

And if such a safer-load-file package were in ELPA, or somewhere else
where packages are updated much more frequently than Emacs, then when
such conditions arise, safer-load-file could add a check for this
particular cl-lib incompatibility between those particular major
releases.
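
To make the idea concrete, here is a minimal sketch of how such a
package might keep its list of known problems. Everything in it is
hypothetical: the variable, the function, and the entry format are
invented for illustration, and the list starts out empty because I
don't have a confirmed incompatibility to put in it.

```elisp
(defvar my-safer-load-known-problems nil
  "List of known bytecode incompatibilities (hypothetical format).
Each element is (FILE-REGEXP COMPILED-MAJOR RUNNING-MAJOR MESSAGE).")

(defun my-safer-load-problems (file compiled-major running-major)
  "Return the messages of the known problems that apply to FILE.
COMPILED-MAJOR is the major version that byte-compiled FILE, and
RUNNING-MAJOR is the major version that is about to load it."
  (let (messages)
    ;; Collect every matching entry, keeping the list's order.
    (dolist (entry my-safer-load-known-problems (nreverse messages))
      (let ((regexp (nth 0 entry))
            (compiled (nth 1 entry))
            (running (nth 2 entry))
            (message (nth 3 entry)))
        (when (and (string-match-p regexp file)
                   (eql compiled compiled-major)
                   (eql running running-major))
          (push message messages))))))
```

A safer-load-file would call this before loading and show any returned
messages as warnings. Because the list is plain data, an ELPA update
could extend it without touching the loading code.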


> >> FWIW, I think Emacs deserves a new Elisp compilation system (either
> >> a new kind of bytecode (maybe using something like vmgen), or a JIT or
> >> something): the bytecode we use is basically identical to the one we h=
ad
> >> 20 years ago, yet the tradeoffs have changed substantially in the
> >> mean time.
> > I would  be interested in elaboration here about what specific trade of=
fs
> > you mean.
>
> Obviously, the performance characteristics of computers has changed
> drastically, e.g. in terms of memory available, in terms of relative
> costs of ALU instructions vs memory accesses, etc...
>
> But more importantly, the kind of Elisp code run is quite different from
> when the bytecode was introduced.  E.g. it's odd to have a byte-code for
> `skip_chars_forward` but not for `apply`.  This said, I haven't done any
> real bytecode profiling to say how much deserves to change.
>

There is free opcode space available. "apply" could be added if someone
chooses to add it.


> > From what I've seen of Emacs Lisp bytecode, I think it would be a bit
> > difficult to use something like vmgen without a lot of effort.  In the
> > interpreter for vmgen the objects are basically C kinds of objects,
> > not Lisp Objects.  Perhaps that could be negotiated, but it would not
> > be trivial.
>
> I haven't looked closely enough to be sure, but I didn't see anything
> problematic: Lisp_Object in the C source code is very much a C object,
> and that's what the current bytecode manipulates.
>

There may be some glibness here. The benefit of using a lower-level
general-purpose intermediate language like LLVM IR or vmgen is that it is
lower level: it works with registers and pointers, understands some
structure layouts, and is more statically typed, so efficiency can be
gained by specialization.  But if one doesn't break down Lisp_Object and
instead uses it the same way the C interpreter currently does, then I
don't see why vmgen would be any faster than the current interpreter
(other than the benefit that would also be had by rewriting the
interpreter without the bloat and compatibility overhead).


> > As for JITing bytecode, haven't there been a couple of efforts in that
> > direction already?  Again, this is probably hard.
>
> It's a significant effort, yes, but the speed up could be significant
> (the kind of JITing attempts so far haven't tried to optimize the code
> at all, so it just removes some of the bytecode interpreter overhead,
> whereas there is a lot more opportunity if you try to eliminate the type
> checks included in each operation).
>
> There are many fairly good experimental JITs for Javascript, so it's not
> *that* hard.  It'd probably take an MSc thesis to get a prototype working=
.
>
> > I'm not saying it shouldn't be done. Just that these are very serious
> > projects requiring a lot of effort that would take a bit of time, and
> > might cause instability in the interim. All while  Emacs is moving
> > forward on its own.
>
> Indeed.  Note that Emacs's bytecode hasn't been moving very much, so the
> "parallel" development shouldn't be a problem.
>
> > But in any event, a prerequisite for considering doing this is to
> > understand what we got right now. That's why I'm trying to document
> > that more people at least have an understanding of what we are talking
> > about in the replacing or modifying the existing system.
>
> I agree that documenting the current bytecode is a very good idea, and
> I thank you for undertaking such an effort.
>

Thanks for the kind words. It's not something I feel all that knowledgeable
or qualified to do.

--001a11472b56daf64c0560f66cd5--