From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!.POSTED!not-for-mail
From: Ken Raeburn <raeburn@raeburn.org>
Newsgroups: gmane.emacs.devel
Subject: Re: compiled lisp file format (Re: Skipping unexec via a big .elc
	file)
Date: Wed, 27 Sep 2017 04:31:09 -0400
Message-ID: <633E485D-B414-427B-8257-699EE53C84F4@raeburn.org>
References: <8A8DA980-13A7-4F8B-9D07-391728C673C9@raeburn.org>
	<83inmq53xk.fsf@gnu.org>
	<96D35768-314C-43F5-BD5E-B12187759DCA@raeburn.org>
	<123104DD-447F-4CDB-B3A0-CED80E3AC8C9@raeburn.org>
	<20170403165736.GA2851@acm>
	<2497A2D5-FDB1-47FF-AED3-FD4ABE2FE144@raeburn.org>
	<83lgrhpalq.fsf@gnu.org>
	<0D99B4FE-FEEF-4565-87D6-E230A05DEF3C@raeburn.org>
	<86lgrc4vob.fsf@molnjunk.nocrew.org> <834ly0oew1.fsf@gnu.org>
	<968E8F50-92F6-43C7-B7E4-EE8378943087@raeburn.org>
	<83wpawmj4d.fsf@gnu.org>
	<AA5EB763-1EE7-4B96-8909-45C086A33815@raeburn.org>
	<cc5929fa-c6af-4d6c-81e5-bf1c73c22fdb@gmail.com>
	<CAArVCkQzZSz_d9mn-cWdxuzW8sMwgh1tmR7a9VMP70DD+Qdy7w@mail.gmail.com>
	<1e397033-8291-1625-8b78-a1e1c200aea5@gmail.com>
	<CAArVCkTdVhMxk_p0PvW7dX-4dxvE=k16V5vFaGx0-Dj10b2ctQ@mail.gmail.com>
	<18196f08-408d-8b17-423e-8be54507bb84@gmail.com>
	<CAArVCkT+akdELz6MOaBrZSy3L41UF+wkyLCkMUirco=N=6sf-A@mail.gmail.com>
	<8360hkkcgj.fsf@gnu.org>
	<E1C94ABB-2872-44C0-AEB9-3FD502944EAC@raeburn.org>
	<26b35c16-33e7-0e08-9cc5-6f9b81e40968@cs.ucla.edu>
	<C3BE57F1-5FD5-4CED-AF81-A550276A35DD@raeburn.org>
	<CAArVCkRGnTXt4YjFJt5X4SPi0HX57HNjjYGiaAg4eSPhRejd1Q@mail.gmail.com>
	<A84117D0-12E5-4840-B7E7-B85C1093B173@raeburn.org>
	<CAArVCkSkF-m14Y=amWsCxEov-px39wjyV4M8=gv50kGhWkuHww@mail.gmail.com>
	<ADF52B6E-C3F7-453C-8FE1-1BD71F7B5541@raeburn.org>
	<CAArVCkTd5BgiRF_+pLkBLwJn47N8-rABeL-TbvFA1xy5kDE=Bg@mail.gmail.com>
NNTP-Posting-Host: blaine.gmane.org
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
Content-Type: multipart/alternative;
	boundary="Apple-Mail=_01B585D4-7062-4B19-B3E3-6FBC84D5605C"
X-Trace: blaine.gmane.org 1506501332 5757 195.159.176.226 (27 Sep 2017 08:35:32 GMT)
X-Complaints-To: usenet@blaine.gmane.org
NNTP-Posting-Date: Wed, 27 Sep 2017 08:35:32 +0000 (UTC)
Cc: Paul Eggert <eggert@cs.ucla.edu>, Emacs developers <emacs-devel@gnu.org>
To: Philipp Stephani <p.stephani2@gmail.com>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Wed Sep 27 10:35:25 2017
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([208.118.235.17])
	by blaine.gmane.org with esmtp (Exim 4.84_2)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1dx7oM-0000RK-3R
	for ged-emacs-devel@m.gmane.org; Wed, 27 Sep 2017 10:35:19 +0200
Original-Received: from localhost ([::1]:53406 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1dx7oQ-0005ND-Ez
	for ged-emacs-devel@m.gmane.org; Wed, 27 Sep 2017 04:35:22 -0400
Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:42691)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <raeburn@raeburn.org>) id 1dx7kT-0002vg-5E
	for emacs-devel@gnu.org; Wed, 27 Sep 2017 04:31:24 -0400
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <raeburn@raeburn.org>) id 1dx7kP-0004AD-51
	for emacs-devel@gnu.org; Wed, 27 Sep 2017 04:31:17 -0400
Original-Received: from mail-qt0-x233.google.com ([2607:f8b0:400d:c0d::233]:45014)
	by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16)
	(Exim 4.71) (envelope-from <raeburn@raeburn.org>) id 1dx7kO-00048y-UF
	for emacs-devel@gnu.org; Wed, 27 Sep 2017 04:31:13 -0400
Original-Received: by mail-qt0-x233.google.com with SMTP id o13so12885264qtf.1
	for <emacs-devel@gnu.org>; Wed, 27 Sep 2017 01:31:11 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=raeburn-org.20150623.gappssmtp.com; s=20150623;
	h=mime-version:subject:from:in-reply-to:date:cc:message-id:references
	:to; bh=RHjOkrxKOG8AU9FbxD6bEVtQ8X1gTFz7m1W5d9h/s50=;
	b=UHiQbQxivwm2DYfRNQYS7r8fsyOAZMKTKZEs9VQ9K0q15noV41A98W+pN6xU4d6zPL
	HmzCU+Y+hdYYbmfBGkvsf33Nk8FQcFvqoXEbM9DIEQt8qxWPcMyWJg/JxPjR1conOHhs
	BthrhQmfqd33L1RaqPWV1DFJpxdPTJ1/viyPOdvLV3UW6oD/YYfKYG6X3OYDFYDzGAIO
	qPnM7w37dAatmUNBbK0x/fts/Snh7bDHs9K+pZsBEAp0/tu1+QouWbO9n3BR0Ffphfvh
	2Q+zY1H8VJuMakoxGNb3HtInGx2dYcgncnJ8QJKfkRn++twFn++gFCswl/fHfPoKxfL+
	Ge2w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc
	:message-id:references:to;
	bh=RHjOkrxKOG8AU9FbxD6bEVtQ8X1gTFz7m1W5d9h/s50=;
	b=hrNgpWkEtGxXrhr3NJTse2G+0G9HWo5Y6jmXKpl2HmzSO7lyLBaow5YucAUVNsUGGb
	L3n2ZSAXMtNY/fBmyiq0wC3VdI9YGsosT5PkJw0uaxKwuV5YhxmSwi57wBVf57IbTJsQ
	8q12jDVy6c1+eeb51ZzCbz9xN/Bc5fgCQPtSAFvYVOZmmZ3OAQ6fYSojuUnRGQzDjpe8
	7BC291DqvUXbfjc1NraJvutc5q8+SSa3nBP2Cro55CIkFIEjnuyeVVbcAwa9KK++fK3+
	fd0IMTRuHGL+PArAE/Au9J2J2CRbFmDx+DlG10KHKHpO322Iw8q8tk5sSyxPK72ytjwm
	/zsQ==
X-Gm-Message-State: AHPjjUj8sVkuayQi+xwNukwUZmJsqEnsoYnIH9bIsE5GxB/HZJAQn7UO
	sqeOdELy04hFZVx35nW2Y7kYLg==
X-Google-Smtp-Source: AOwi7QBP3ifH65CN7PAr+3FxkQwPpMZ27HLT67n33QtqhULqhndh5lnZUNEMm2GtFjK4eaKc+APxwA==
X-Received: by 10.200.35.204 with SMTP id r12mr763184qtr.95.1506501071075;
	Wed, 27 Sep 2017 01:31:11 -0700 (PDT)
Original-Received: from [192.168.23.135] (c-73-253-167-23.hsd1.ma.comcast.net.
	[73.253.167.23]) by smtp.gmail.com with ESMTPSA id
	r22sm8127721qtj.94.2017.09.27.01.31.09
	(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
	Wed, 27 Sep 2017 01:31:09 -0700 (PDT)
In-Reply-To: <CAArVCkTd5BgiRF_+pLkBLwJn47N8-rABeL-TbvFA1xy5kDE=Bg@mail.gmail.com>
X-Mailer: Apple Mail (2.3273)
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not
	recognized.
X-Received-From: 2607:f8b0:400d:c0d::233
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/emacs-devel/>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Original-Sender: "Emacs-devel" <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Xref: news.gmane.org gmane.emacs.devel:218819
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/218819>


--Apple-Mail=_01B585D4-7062-4B19-B3E3-6FBC84D5605C
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=utf-8


On Sep 24, 2017, at 09:57, Philipp Stephani <p.stephani2@gmail.com> wrote:
> Ken Raeburn <raeburn@raeburn.org> wrote on Mon, 3 July 2017 at 03:44:
>
> On Jul 2, 2017, at 11:46, Philipp Stephani <p.stephani2@gmail.com> wrote:
>
>> Ken Raeburn <raeburn@raeburn.org> wrote on Mon, 29 May 2017 at 11:33:
>>
>> On May 28, 2017, at 08:43, Philipp Stephani <p.stephani2@gmail.com>
>> wrote:
>>
>>> Ken Raeburn <raeburn@raeburn.org> wrote on Sun, 28 May 2017 at 13:07:
>>>
>>> On May 21, 2017, at 04:53, Paul Eggert <eggert@cs.ucla.edu> wrote:
>>>
>>> > Ken Raeburn wrote:
>>> >> The Guile project has taken this idea pretty far; they're
>>> >> generating ELF object files with a few special sections for Guile
>>> >> objects, using the standard DWARF sections for debug information,
>>> >> etc.  While it has a certain appeal (making C modules and Lisp
>>> >> files look much more similar, maybe being able to link Lisp and C
>>> >> together into one executable image, letting GDB understand some of
>>> >> your data), switching to a machine-specific format would be a
>>> >> pretty drastic change, when we can currently share the files
>>> >> across machines.
>>> >
>>> > Although it does indeed sound like a big change, I don't see why it
>>> > would prevent us from sharing the files across machines. Emacs can
>>> > use standard ELF and DWARF format on any platform if Emacs is doing
>>> > the loading. And there should be some software-engineering benefit
>>> > in using the same format that Guile uses.
>>>
>>> Sorry for the delay in responding.
>>>
>>> The ELF format has header fields indicating the word size,
>>> endianness, machine architecture (though there's a value for
>>> "none"), and OS ABI.  Some fields vary in size or order depending on
>>> whether the 32-bit or 64-bit format is in use.  Some other format
>>> details (e.g., relocation types, interpretation of certain ranges of
>>> values in some fields) are architecture- or OS-dependent; we might
>>> not care about many of those details, but relocations are likely
>>> needed if we want to play linking games or use DWARF.
>>>
>>> I think Guile is using whatever the native word size and
>>> architecture are.  If we do that for Emacs, they're not portable
>>> between platforms.  Currently it works for me to put my Lisp files,
>>> both source and compiled, into ~/elisp and use them from different
>>> kinds of machines if my home directory is NFS-mounted.
>>>
>>> We could instead pick fixed values (say, architecture "none",
>>> little-endian, 32-bit), but then there's no guarantee that we could
>>> use any of the usual GNU tools on them without a bunch of work, or
>>> that we'd ever be able to use non-GNU tools to treat them as object
>>> files.  Then again, we couldn't expect to do the latter portably
>>> anyway, since some of the platforms don't even use ELF.
>>>
>>>
>>> Is there any significant advantage of using ELF, or could this just
>>> use one of the standard binary serialization formats (protobuf,
>>> flatbuffer, ...)?
>>
>> That's an interesting idea.  If one of the popular serialization
>> libraries is compatibly licensed, easy to use, and performs well, it
>> may be better than rolling our own.
>>
>> I've tried this out (with flatbuffers), but I haven't seen
>> significant speed improvements. It might very well be the case that
>> during loading the reader is already fast enough (e.g. for ELC files
>> it doesn't do any decoding), and it's the evaluator that's too slow.
>
> What's your test case, and how are you measuring the performance?
>
> IIRC I've repeatedly loaded one of the biggest .elc files shipped with
> Emacs and measured the total loading time. I haven't done any detailed
> profiling, since I was hoping for a significant speed increase that
> would justify the work.

It'll depend on what the code in that file is doing.

In the raeburn-startup branch, the last bit of profiling I did showed
that nearly 1/3 of the CPU time of a simple run of Emacs in batch mode
was spent reading and parsing the saved Lisp environment.  You can see
the graph at <http://www.mit.edu/~raeburn/emacs.svg>.  If you haven't
read up on flame graphs
(<http://www.brendangregg.com/flamegraphs.html>), they provide a nice
visualization of CPU time consumption broken down by what the current
call stack looks like.  Most of the rest of the CPU time was spent
executing the loaded code (lots of fset and setplist calls), but the
biggest chunk of that was executing a nested load of
international/characters.elc; during that nested load, most of the time
was spent in execution (mostly char table processing) and very little
in parsing.

So... for the saved Lisp environment file, excluding the nested load,
reading and parsing is about 2/3 of the CPU time used; for
characters.elc, reading and parsing is a minuscule portion of the CPU
time.

Loading a Lisp file internally uses the Lisp "read" routine, which
requires an input stream of character values (not byte values) to be
supplied; we examine the stream object and dispatch to various bits of
code depending on its type (buffer, marker, function, certain special
symbols), *for each character*.  Each byte is examined to see if it's
part of a multibyte character.  Each character is considered to see if
it's allowed to be part of a symbol name or string or whatever we're in
the middle of parsing, or if it's a backslash quoting some other
character, etc.
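
To make the per-character cost concrete, here is a toy sketch in Python.
It is not Emacs's actual C reader: the function name is made up for
illustration, and escape handling is reduced to "a backslash quotes the
next character".  The point is only that the loop has to look at every
character before it knows where the string ends.

```python
def read_string_literal(text, pos):
    """Parse a double-quoted literal starting at text[pos].

    Hypothetical helper, for illustration only.  Returns the decoded
    value and the index just past the closing quote.
    """
    if text[pos] != '"':
        raise ValueError("expected opening quote")
    out = []
    i = pos + 1
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            # A backslash quotes the next character (simplified rule).
            if i + 1 >= len(text):
                raise ValueError("dangling backslash")
            out.append(text[i + 1])
            i += 2
        elif ch == '"':
            # Only now do we learn how long the string was.
            return "".join(out), i + 1
        else:
            out.append(ch)
            i += 1
    raise ValueError("unterminated string")
```

A real reader does more work per character than this (stream-type
dispatch, multibyte decoding), so the sketch understates the cost.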

Hence my hopes for a non-text-based format, designed to streamline
reading data from files, where we can do things like specify a vector
length or string length up front instead of having to consider each
character and process character quoting sequences, stuff like that.
E.g., here's a unibyte string of 47 bytes, so just copy the bytes
without considering every one separately.  No human-readable printed
form, no escape sequences needed.
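
A minimal sketch of that idea in Python, under assumptions of my own:
the 4-byte little-endian length field and both function names are
hypothetical, not the layout of any existing branch or of flatbuffers.
It only shows that, with the length stored up front, the payload comes
out in one slice regardless of what bytes it contains.

```python
import struct


def write_unibyte_string(data: bytes) -> bytes:
    """Encode as a 4-byte little-endian length, then the raw bytes."""
    return struct.pack("<I", len(data)) + data


def read_unibyte_string(buf: bytes, offset: int = 0):
    """Decode one record; return (payload, offset of the next record)."""
    (length,) = struct.unpack_from("<I", buf, offset)
    start = offset + 4
    end = start + length
    if end > len(buf):
        raise ValueError("truncated record")
    # One slice copies the whole payload; quotes and backslashes in it
    # need no special handling.
    return buf[start:end], end
```

For the 47-byte example, the encoded record is 51 bytes, and the reader
never inspects the payload bytes individually.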

Another help might be finding a faster way to load the character data.
I've got the branch loading characters.elc at startup because saving
and parsing the generated tables was even slower than evaluating the
Lisp code to generate them.  Perhaps we can do some processing of them
during the build and convert them into some other form that lets us
start up faster.

> If people are generally interested in pursuing this further, I'd be
> happy to put my code into a scratch branch.

I'd be curious to take a look...

Ken

--Apple-Mail=_01B585D4-7062-4B19-B3E3-6FBC84D5605C--