From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!.POSTED!not-for-mail
From: Stefan Monnier <monnier@iro.umontreal.ca>
Newsgroups: gmane.emacs.devel
Subject: Re: Bytecode interoperability: the good and bad
Date: Fri, 22 Dec 2017 15:05:39 -0500
Message-ID: <jwv7ete7dm1.fsf-monnier+gmane.emacs.devel@gnu.org>
References: <CANCp2gaOUtmgivNkxiFTNfoWk_1vpZtfOeYjeasFCuDQocAHxw@mail.gmail.com>
NNTP-Posting-Host: blaine.gmane.org
Mime-Version: 1.0
Content-Type: text/plain
X-Trace: blaine.gmane.org 1513973089 26709 195.159.176.226 (22 Dec 2017 20:04:49 GMT)
X-Complaints-To: usenet@blaine.gmane.org
NNTP-Posting-Date: Fri, 22 Dec 2017 20:04:49 +0000 (UTC)
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)
To: emacs-devel@gnu.org
Xref: news.gmane.org gmane.emacs.devel:221351
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/221351>

> In the "not true" department, there are instructions 0153 scan_buffer and
> 0163 set_mark which aren't handled in the current interpreter sources in
> bytecode.c

Right, there are a few exceptions where we did remove old instructions.
I haven't heard of anyone using a new enough Emacs with an old enough
.elc file to bump into this problem, so I'm not worried.
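
To make the failure mode concrete, here is a toy sketch of how an
interpreter loop could report such a file cleanly instead of
misbehaving.  It is not taken from bytecode.c: only the two octal
opcode numbers come from the message quoted above, and everything else
(the `toy_` names, the two "live" opcodes, the status codes) is made up
for illustration.

```c
#include <stddef.h>

/* Hypothetical status codes for the toy interpreter below. */
enum toy_status {
    TOY_OK,
    TOY_OBSOLETE_OPCODE,
    TOY_UNKNOWN_OPCODE,
    TOY_STACK_ERROR
};

enum {
    TOY_PUSH1 = 0001,                 /* made-up opcode: push the integer 1 */
    TOY_ADD = 0002,                   /* made-up opcode: add the top two values */
    TOY_SCAN_BUFFER_OBSOLETE = 0153,  /* opcode numbers quoted in this thread */
    TOY_SET_MARK_OBSOLETE = 0163
};

/* Run CODE and store the single remaining stack value in *RESULT.
   Removed instructions are reported explicitly rather than falling
   through to whatever the default case happens to do. */
static enum toy_status toy_run(const unsigned char *code, size_t len,
                               long *result)
{
    long stack[16];
    size_t sp = 0;

    for (size_t pc = 0; pc < len; pc++) {
        switch (code[pc]) {
        case TOY_PUSH1:
            if (sp == 16)
                return TOY_STACK_ERROR;
            stack[sp++] = 1;
            break;
        case TOY_ADD:
            if (sp < 2)
                return TOY_STACK_ERROR;
            stack[sp - 2] += stack[sp - 1];
            sp--;
            break;
        case TOY_SCAN_BUFFER_OBSOLETE:
        case TOY_SET_MARK_OBSOLETE:
            return TOY_OBSOLETE_OPCODE;
        default:
            return TOY_UNKNOWN_OPCODE;
        }
    }
    if (sp != 1)
        return TOY_STACK_ERROR;
    *result = stack[0];
    return TOY_OK;
}
```

A caller that gets TOY_OBSOLETE_OPCODE back could then tell the user to
recompile the file, which is about the only useful reaction anyway.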

> In the "not without downsides" department, this means that when someone
> looks at the bytecode interpreter, it is filled with garbage and bloat.
> This has to have a technology debt associated with it.

Of course, backward compatibility has its costs.

> It is likely that the code that purports to handle obsolete (or no longer
> emitted) instructions is broken,

It's possible, indeed.  Not sure about "likely", tho.

> since I doubt any of this behavior is tested.  Subtle changes in the
> semantics of instructions can cause unintended effects.

In any case, Emacs has enough real, confirmed bugs affecting real
users that I don't worry too much about such hypotheticals.

I think Emacs should evolve (and is evolving) towards a model where .elc
files are handled completely automatically, so there's no need to
preserve backward compatibility at all, because we can just recompile
the source file.
[ Modulo supporting enough backward compatibility for bootstrapping
  purposes, since I also think we should get rid of the interpreter.  ]
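
As a rough sketch of what "handled completely automatically" could mean,
the decision boils down to something like the check below.  This is an
assumption about one possible design, not a description of existing
Emacs code: the struct, the function name and the idea of recording a
bytecode version alongside the compiled file are all hypothetical.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical description of a compiled file found next to its source. */
struct toy_elc_info {
    bool exists;           /* is there a compiled file at all? */
    time_t mtime;          /* when it was last written */
    int bytecode_version;  /* made-up version stamp stored in the file */
};

/* Return true if the source should be recompiled before loading:
   no compiled file, a compiled file produced for a different bytecode
   version, or a compiled file older than its source. */
static bool toy_needs_recompile(time_t src_mtime,
                                const struct toy_elc_info *elc,
                                int running_version)
{
    if (!elc->exists)
        return true;
    if (elc->bytecode_version != running_version)
        return true;
    return elc->mtime < src_mtime;
}
```

With a check of this kind in the loader, an old .elc file never reaches
the interpreter, so the interpreter only has to understand the bytecode
its own compiler emits.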

> My understanding of how this would work in a more rational way would be that
> there shouldn't be incompatible changes between major releases.  So I would
> hope that incompatible macro changes wouldn't happen within a major release
> but between major releases, the same as I hope would be the case for
> bytecode changes.

In theory, that's what we aim for, yes.

> Maybe this could be incorporated into a "safe-load-file" function.

Define "safe".

>> FWIW, I think Emacs deserves a new Elisp compilation system (either
>> a new kind of bytecode (maybe using something like vmgen), or a JIT or
>> something): the bytecode we use is basically identical to the one we had
>> 20 years ago, yet the tradeoffs have changed substantially in the
>> meantime.
> I would be interested in elaboration here about what specific tradeoffs
> you mean.

Obviously, the performance characteristics of computers have changed
drastically, e.g. in terms of memory available, in terms of relative
costs of ALU instructions vs memory accesses, etc.

But more importantly, the kind of Elisp code run is quite different from
when the bytecode was introduced.  E.g. it's odd to have a bytecode for
`skip_chars_forward` but not for `apply`.  This said, I haven't done any
real bytecode profiling to say how much deserves to change.

> From what I've seen of Emacs Lisp bytecode, I think it would be a bit
> difficult to use something like vmgen without a lot of effort.  In the
> interpreter for vmgen the objects are basically C kinds of objects,
> not Lisp Objects.  Perhaps that could be negotiated, but it would not
> be trivial.

I haven't looked closely enough to be sure, but I didn't see anything
problematic: Lisp_Object in the C source code is very much a C object,
and that's what the current bytecode manipulates.

> As for JITing bytecode, haven't there been a couple of efforts in that
> direction already?  Again, this is probably hard.

It's a significant effort, yes, but the speedup could be substantial:
the JIT attempts so far haven't tried to optimize the code at all, so
they only remove some of the bytecode interpreter overhead, whereas
there is a lot more opportunity if you try to eliminate the type
checks included in each operation.
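
To illustrate the kind of type-check elimination I mean, here is a toy
sketch.  The tagged-value layout and all the `toy_` names are invented
for the example; this is not Emacs's Lisp_Object representation and not
what any existing JIT for Emacs does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy tagged value: a small integer N is stored as 2*N + 1, so the low
   bit distinguishes integers from everything else.  Overflow for very
   large N is ignored in this sketch. */
typedef intptr_t toy_value;

static inline bool toy_fixnump(toy_value v) { return (v & 1) == 1; }
static inline toy_value toy_make_fixnum(intptr_t n) { return n * 2 + 1; }
static inline intptr_t toy_fixnum_value(toy_value v) { return (v - 1) / 2; }

/* Generic addition as an interpreter executes it: both operands are
   type-checked on every call, and the result is re-tagged. */
static bool toy_add_checked(toy_value a, toy_value b, toy_value *out)
{
    if (!toy_fixnump(a) || !toy_fixnump(b))
        return false;
    *out = toy_make_fixnum(toy_fixnum_value(a) + toy_fixnum_value(b));
    return true;
}

/* Summing a vector the interpreter's way: two checks, one untag per
   operand and one retag per element. */
static bool toy_sum_interpreted(const toy_value *v, size_t n, toy_value *out)
{
    toy_value acc = toy_make_fixnum(0);
    for (size_t i = 0; i < n; i++)
        if (!toy_add_checked(acc, v[i], &acc))
            return false;
    *out = acc;
    return true;
}

/* What an optimizing compiler could emit instead: the accumulator is
   known to be an integer, so it stays untagged and unchecked; only the
   incoming element still needs its one check. */
static bool toy_sum_specialized(const toy_value *v, size_t n, toy_value *out)
{
    intptr_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        if (!toy_fixnump(v[i]))
            return false;
        acc += toy_fixnum_value(v[i]);
    }
    *out = toy_make_fixnum(acc);
    return true;
}
```

Both functions compute the same result and reject the same inputs; the
second one simply does half the checks and almost none of the tagging
work, and that is before any further optimization.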

There are many fairly good experimental JITs for JavaScript, so it's not
*that* hard.  It'd probably take an MSc thesis to get a prototype working.

> I'm not saying it shouldn't be done. Just that these are very serious
> projects requiring a lot of effort that would take a bit of time, and might
> cause instability in the interim. All while Emacs is moving forward on its
> own.

Indeed.  Note that Emacs's bytecode hasn't been moving very much, so the
"parallel" development shouldn't be a problem.

> But in any event, a prerequisite for considering doing this is to
> understand what we have right now. That's why I'm trying to document it, so that
> more people at least have an understanding of what we are talking about when
> replacing or modifying the existing system.

I agree that documenting the current bytecode is a very good idea, and
I thank you for undertaking such an effort.


        Stefan