From: Andy Wingo
Newsgroups: gmane.lisp.guile.devel,gmane.comp.gnu.guix.devel
Subject: Re: The size of ‘.go’ files
Date: Mon, 08 Jun 2020 10:07:56 +0200
Message-ID: <877dwirndv.fsf@igalia.com>
References: <875zc5z18d.fsf@gnu.org>
To: Ludovic Courtès
Cc: guix-devel, Guile Devel

Hi :)

A few points of information :)

On Fri 05 Jun 2020 22:50, Ludovic Courtès writes:

> [Sorting] the ELF sections of a .go file by size; for ‘python-xyz.go’,
> I get this:
>
>   $13 = ((".rtl-text" . 3417108)
>          (".guile.arities" . 1358536)
>          (".data" . 586912)
>          (".rodata" . 361599)
>          (".symtab" . 117000)
>          (".debug_line" . 97342)
>          (".debug_info" . 54519)
>          (".guile.frame-maps" . 47114)
>          ("" . 1344)
>          (".guile.arities.strtab" . 681)
>          ("" . 232)
>          (".shstrtab" . 229)
>          (".dynamic" . 112)
>          (".debug_str" . 87)
>          (".strtab" . 75)
>          (".debug_abbrev" . 65)
>          (".guile.docstrs.strtab" . 1)
>          ("" . 0)
>          (".guile.procprops" . 0)
>          (".guile.docstrs" . 0)
>          (".debug_loc" . 0))
>
> More than half of those 6 MiB is code, and more than 1 MiB is
> “.guile.arities” (info "(guile) Object File Format"), which is
> surprisingly large; presumably the file only contains thunks (the
> ‘thunked’ fields of ).

The .guile.arities section starts with a sorted array of fixed-size
headers, followed by a sequence of ULEB128 references to local variable
names, including non-arguments.  The size is a bit perplexing, I agree.
I can think of a number of ways to encode that section differently, but
we'd need to understand a bit more about it and why the baseline
compiler's output is significantly different.

> Stripping the .debug_* sections (if that works) clearly wouldn’t help.

I believe that it should eventually be possible to strip
.guile.arities, fwiw.

> So I guess we could generate less code (reduce ‘.rtl-text’), perhaps by
> tweaking ‘define-record-type*’, but I have little hope there.

Hehe :)  As you mention later:

> With 3.0.3-to-be and -O1, python-xyz.go weighs in at 3.4 MiB instead of
> 5.9 MiB!  Here’s the section size distribution:
>
>   $4 = ((".rtl-text" . 2101168)
>         (".data" . 586392)
>         (".rodata" . 360703)
>         (".guile.arities" . 193106)
>         (".symtab" . 117000)
>         (".debug_line" . 76685)
>         (".debug_info" . 53513)
>         ("" . 1280)
>         (".guile.arities.strtab" . 517)
>         ("" . 232)
>         (".shstrtab" . 211)
>         (".dynamic" . 96)
>         (".debug_str" . 87)
>         (".strtab" . 75)
>         (".debug_abbrev" . 56)
>         (".guile.docstrs.strtab" . 1)
>         ("" . 0)
>         (".guile.procprops" . 0)
>         (".guile.docstrs" . 0)
>         (".debug_loc" . 0))
>   scheme@(guile-user)> (stat:size (stat go))
>   $5 = 3519323
>
> “.rtl-text” is 38% smaller and “.guile.arities” is almost a tenth of
> what it was.

The difference in the text is due to the new baseline intrinsics,
e.g. $vector-ref.  This goes in the opposite direction from instruction
explosion, which sought to (1) make the JIT compiler easier by
decomposing compound operations into their atomic parts, (2) make the
optimizer learn more information from flow rather than type-checking
side effects, and (3) allow the optimizer to eliminate / hoist / move
the component pieces of macro-operations.  However, in the baseline
compiler (2) and (3) aren't possible because there is no optimizer at
that level, and therefore the result is actually a lose -- 10 micro-ops
cost more than 1 macro-op because of stack traffic overhead, which
isn't currently mitigated by the JIT (1).

So instruction explosion is residual code explosion, which should pay
off in theory, but not for the baseline compiler.  So I added new
intrinsics for e.g. $vector-ref et al.  Thus the smaller code size.

I am not sure what causes the significantly different .guile.arities
size!

> Something’s going on here!  Thoughts?

There are more possibilities for making code size smaller, e.g. having
two equivalent encodings for bytecode, where one is smaller:

  https://webkit.org/blog/9329/a-new-bytecode-format-for-javascriptcore/

Or it could be that if we could do register allocation for a
target-dependent fixed set of registers in bytecode already, that could
decrease minimum instruction size, making more instructions fit into
single 32-bit words.  It would be nice if the JIT could rely on the
bytecode compiler to have already done register allocation, and to
reify the corresponding debug information.  Just a thought though, and
not really appropriate to the baseline compiler.

Cheers,

Andy
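
For reference, a minimal sketch of the ULEB128 variable-length integer
encoding mentioned above for the local-variable-name references.  This
is the generic encoding as used in DWARF, written in Python purely for
illustration; the function names are hypothetical and nothing here
reproduces Guile's actual .guile.arities writer or its header layout.

```python
def uleb128_encode(n):
    """Encode a non-negative integer as ULEB128 bytes (generic sketch)."""
    if n < 0:
        raise ValueError("ULEB128 encodes unsigned integers only")
    out = bytearray()
    while True:
        low7 = n & 0x7F          # take the low 7 bits
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def uleb128_decode(data, pos=0):
    """Decode one ULEB128 value from data at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


# Round-trip a few hypothetical string-table offsets.
offsets = [0, 127, 128, 624485]
blob = b"".join(uleb128_encode(o) for o in offsets)
decoded, pos = [], 0
while pos < len(blob):
    value, pos = uleb128_decode(blob, pos)
    decoded.append(value)
```

Values below 128 take a single byte, so with this encoding the size of
a reference list is driven more by how many references there are than
by how large each one is.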