From: Ken Raeburn
Newsgroups: gmane.emacs.devel
Subject: compiled lisp file format (Re: Skipping unexec via a big .elc file)
Date: Sun, 21 May 2017 04:44:01 -0400
To: Emacs developers
In-Reply-To: <8360hkkcgj.fsf@gnu.org>

I haven’t had much time to further the work on the big-elc approach recently, but there is one idea I want to toss out there for possibly improving the load time further: Changing the .elc file format to a binary one. I’m not talking about a memory image like Daniel is working on. I mean a file representing a sequence of S-expressions, but optimized for loading speed rather than for human readability.

The Guile project has taken this idea pretty far; they’re generating ELF object files with a few special sections for Guile objects, using the standard DWARF sections for debug information, etc. While it has a certain appeal (making C modules and Lisp files look much more similar, maybe being able to link Lisp and C together into one executable image, letting GDB understand some of your data), switching to a machine-specific format would be a pretty drastic change, when we can currently share the files across machines.

I haven’t got a complete, concrete proposal, but I see at least a couple of general approaches possible:

1) Follow the model of flat object file formats: Some file sections have data of various types (string content, symbol names, integer or floating-point constants); others (the equivalent of standard object file “relocation” data) would provide info on how to allocate and fill in the container objects (pairs, vectors, etc.) desired, with references to the symbols or strings or other container objects.

2) Continue to use the current recursive processing, but with a binary format. Some (byte? word?) value indicates “this is string data”; it’s followed by a byte count and that many bytes of string content (always using the Emacs internal encoding, so we don’t have to translate when reading). Another value indicates an integer constant. Another value indicates a vector, and is followed by a length and then that many other values, which are each processed recursively before we get back to the object following the vector. The length of each object’s initializer depends on the type and, for container types, on the values contained within.

Either way, getting away from the expensive one-character-at-a-time processing, multibyte coding, escape processing, etc., and pushing around groups of bytes whenever possible should save us time.

This would be usable not just for the dumped.elc file, but for other compiled Lisp files as well, whether in the distribution or from ELPA or the user’s own code.

I did throw together a half-baked attempt to try some of this out. I added a new “#” construct for unibyte strings, putting the byte count into the file so that the string data could be copied with fread() instead of a READCHAR loop.

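
For illustration only, approach 2 could be sketched roughly as below. This is not the code from the experiment described here, and nothing in it comes from the Emacs sources: the tag values, the little-endian 32-bit counts, the 64-bit integers, and the names `struct value`, `struct reader`, `take`, `read_value` and `free_value` are all assumptions made up for the sketch. It reads from an in-memory buffer so that it is easy to try; with a stdio stream, the `memcpy` in `take` would be an `fread`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical tag values; not taken from the Emacs sources. */
enum tag { TAG_INT = 1, TAG_STRING = 2, TAG_VECTOR = 3 };

struct value {
  enum tag tag;
  int64_t integer;       /* TAG_INT payload */
  uint32_t len;          /* byte count (string) or element count (vector) */
  char *bytes;           /* TAG_STRING: len bytes plus a trailing NUL */
  struct value **items;  /* TAG_VECTOR: len child values */
};

/* Cursor over an in-memory copy of the file. */
struct reader { const unsigned char *p, *end; };

/* Copy n bytes out in one step; return -1 if the input is too short. */
static int take(struct reader *r, void *dst, size_t n)
{
  assert(r->p <= r->end);
  if ((size_t)(r->end - r->p) < n) return -1;
  memcpy(dst, r->p, n);
  r->p += n;
  return 0;
}

/* Read a 32-bit little-endian count. */
static int take_u32(struct reader *r, uint32_t *out)
{
  unsigned char b[4];
  if (take(r, b, sizeof b)) return -1;
  *out = (uint32_t)b[0] | ((uint32_t)b[1] << 8)
       | ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
  return 0;
}

/* Free a value and everything it contains; NULL is accepted. */
void free_value(struct value *v)
{
  if (!v) return;
  for (uint32_t i = 0; v->items && i < v->len; i++)
    free_value(v->items[i]);
  free(v->items);
  free(v->bytes);
  free(v);
}

/* Read one object: a tag byte, then a payload whose shape depends on it. */
struct value *read_value(struct reader *r)
{
  unsigned char tag, b[8];
  uint64_t u = 0;
  struct value *v;

  if (take(r, &tag, 1)) return NULL;
  if (!(v = calloc(1, sizeof *v))) return NULL;
  v->tag = (enum tag)tag;

  switch (tag) {
  case TAG_INT:     /* 64-bit little-endian, two's complement assumed */
    if (take(r, b, sizeof b)) goto fail;
    for (int i = 7; i >= 0; i--) u = (u << 8) | b[i];
    v->integer = (int64_t)u;
    return v;
  case TAG_STRING:  /* count, then that many raw bytes, copied as a block */
    if (take_u32(r, &v->len)) goto fail;
    if ((size_t)(r->end - r->p) < v->len) goto fail;
    if (!(v->bytes = malloc((size_t)v->len + 1))) goto fail;
    take(r, v->bytes, v->len);
    v->bytes[v->len] = '\0';
    return v;
  case TAG_VECTOR:  /* count, then that many values, read recursively */
    if (take_u32(r, &v->len)) goto fail;
    /* Every element needs at least its tag byte, so reject absurd counts. */
    if ((size_t)(r->end - r->p) < v->len) goto fail;
    if (!(v->items = calloc((size_t)v->len + 1, sizeof *v->items))) goto fail;
    for (uint32_t i = 0; i < v->len; i++)
      if (!(v->items[i] = read_value(r))) goto fail;
    return v;
  default:          /* unknown tag */
    goto fail;
  }
fail:
  free_value(v);
  return NULL;
}
```

To try it, point a `struct reader` at a byte array (`p` at the first byte, `end` one past the last) and call `read_value`; it returns NULL on truncated input or an unknown tag. The point of the sketch is only that string bodies move as one block and that no escape or decimal parsing is involved. A real format would also need symbols, pairs, floats, shared-structure references and some kind of version header, none of which is attempted here.
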
I also added a new version of the “#n#” syntax that uses a fixed number of READCHAR calls and avoids the decimal arithmetic. So, the file can no longer be processed as Lisp, and it still requires some text parsing, though not nearly as much as before; some of the worst of both worlds. But the load time for dumped.elc did drop by another 12% in my tests (start in batch mode, print a message and exit, from 0.227s down to 0.2s or less per run, still loading a couple of standard-elc-format files during startup).

I’m curious if people think this might be an approach worth pursuing. Or if the Lisp-based elc format is seen as advantageous in ways I’m not seeing….

Ken