From: Ludovic Courtès
To: Andy Wingo
Cc: guix-devel, Guile Devel
Subject: Re: The size of ‘.go’ files
Date: Tue, 09 Jun 2020 18:09:50 +0200
Message-ID: <87sgf4gr01.fsf@gnu.org>
In-Reply-To: <877dwirndv.fsf@igalia.com> (Andy Wingo's message of "Mon, 08 Jun 2020 10:07:56 +0200")
References: <875zc5z18d.fsf@gnu.org> <877dwirndv.fsf@igalia.com>

Hello!

Andy Wingo wrote:

> A few points of information :)

Much appreciated!

> The guile.arities section starts with a sorted array of fixed-size
> headers, then is followed by a sequence of ULEB128 references to local
> variable names, including non-arguments.  The size is a bit perplexing,
> I agree.  I can think of a number of ways to encode that section
> differently but we'd need to understand a bit more about it and why the
> baseline compiler is significantly different.

The size of ‘.guile.arities’ should be proportional to the number of
procedures, right?  Additionally, if there are only/mostly thunks, the
string table for argument names should be small if not empty.

For N thunks, I would expect roughly N 28-byte headers + N×M ULEB128
references, say 100 bytes per thunk; there are 1000 of them, so we
should be at ~100,000 bytes.  This is roughly what we observe with the
baseline compiler.

>> “.rtl-text” is 38% smaller and “.guile.arities” is almost a tenth of
>> what it was.
>
> The difference in the text are the new baseline intrinsics,
> e.g. $vector-ref.  It goes in the opposite direction from instruction
> explosion, which sought to (1) make the JIT compiler easier by
> decomposing compound operations into their atomic parts, (2) make the
> optimizer learn more information from flow rather than type-checking
> side effects, and (3) allow the optimizer to eliminate / hoist / move
> the component pieces of macro-operations.
>
> However in the baseline compiler (2) and (3) aren't possible because
> there is no optimizer on that level, and therefore the result is
> actually a lose -- 10 micro-ops cost more than 1 macro-op because of
> stack traffic overhead, which isn't currently mitigated by the JIT (1).
>
> So instruction explosion is residual code explosion, which should pay
> off in theory, but not for the baseline compiler.  So I added new
> intrinsics for e.g. $vector-ref et al.  Thus the smaller code size.

Yes, that makes a lot of sense.  In particular, this file must use the
struct intrinsics a lot.

> There are more possibilities for making code size smaller, e.g. having
> two equivalent encodings for bytecode, where one is smaller:
>
>   https://webkit.org/blog/9329/a-new-bytecode-format-for-javascriptcore/

Like THUMB, but for bytecode.  :-)

I guess we could first analyze the generated code more closely and see
if there are opportunities there.

Thanks for the explanations!

Ludo’.
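
For reference, the back-of-the-envelope estimate for ‘.guile.arities’
given earlier in this message (N fixed-size headers plus N×M ULEB128
references) can be sketched as follows.  This is only an illustration,
not Guile's actual writer: the 28-byte header size and the figure of
1000 thunks come from the message, while the function names and the
split of the remaining ~72 bytes per thunk into 36 references of 2 bytes
each are assumptions made up for the example.

```python
def uleb128(n: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128 (7 bits per byte)."""
    if n < 0:
        raise ValueError("ULEB128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = n & 0x7F          # low 7 bits of the remaining value
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


HEADER_SIZE = 28  # bytes per arity header, figure taken from the message


def estimate_arities_size(n_procedures: int,
                          refs_per_procedure: int,
                          bytes_per_ref: int) -> int:
    """Hypothetical size model: fixed headers plus ULEB128 references."""
    headers = n_procedures * HEADER_SIZE
    refs = n_procedures * refs_per_procedure * bytes_per_ref
    return headers + refs


# Assumed split: 36 references of 2 bytes each, i.e. 72 bytes of
# references per thunk on top of the 28-byte header.
total = estimate_arities_size(1000, 36, 2)
print(total)
```

A reference below 128 takes a single byte in ULEB128 and one below
16384 takes two, so the per-thunk cost depends on how large the
referenced offsets are; the 2-byte figure above is just one plausible
choice that reproduces the ~100,000-byte total.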