From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Lynn Winebarger Newsgroups: gmane.emacs.devel Subject: Re: Ideal performance of ELisp (was: Why tree-sitter instead of Semantic? (was Re: CC Mode with font-lock-maximum-decoration 2)) Date: Fri, 12 Aug 2022 19:26:14 -0400 Message-ID: References: <83o7wuva9o.fsf@gnu.org> <83mtceupbx.fsf@gnu.org> <83lerxvfnu.fsf@gnu.org> <838rnxvdcq.fsf@gnu.org> <83r11ptksn.fsf@gnu.org> <83a68dti6w.fsf@gnu.org> <874jykzvx9.fsf@yahoo.com> <83fsi4sttn.fsf@gnu.org> <838rnws5c7.fsf@gnu.org> <838rntocb8.fsf@gnu.org> Mime-Version: 1.0 Content-Type: multipart/alternative; boundary="00000000000063a79705e6139cb2" Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="612"; mail-complaints-to="usenet@ciao.gmane.io" Cc: Eli Zaretskii , Po Lu , Alan Mackenzie , emacs-devel , Yuan Fu To: Stefan Monnier Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Sat Aug 13 01:27:33 2022 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1oMe3w-000AXw-RV for ged-emacs-devel@m.gmane-mx.org; Sat, 13 Aug 2022 01:27:32 +0200 Original-Received: from localhost ([::1]:38568 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oMe3v-0005ls-VG for ged-emacs-devel@m.gmane-mx.org; Fri, 12 Aug 2022 19:27:31 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:55244) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oMe2v-00052L-OQ for emacs-devel@gnu.org; Fri, 12 Aug 2022 19:26:29 -0400 Original-Received: from mail-pj1-x1031.google.com ([2607:f8b0:4864:20::1031]:36410) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1oMe2t-0005s4-RQ; Fri, 12 Aug 2022 19:26:29 -0400 Original-Received: by mail-pj1-x1031.google.com with SMTP id 15-20020a17090a098f00b001f305b453feso9621306pjo.1; Fri, 12 Aug 2022 16:26:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc; bh=/SFO2pCMV3gPwvtQOGXn+Qc4NJaclcAlU1SlTQirwW0=; b=pNYkz3tUkAnwl7HNBX3TYkvY51sNlzXdxE7LQRNmwigtU8nCGiLIGAbIy9XRkEqm3A jxYeGX4QlT4pg67o34U3r4jCWVClDEELiJIKPMYQBtjBlIRclIius/r2TE/Ht1wBwFIO zTVQ0pMLyf6y5fKsWvstx5wTt5yfIu08ecbTgO+meiziKSj8BB316+jQZJnBPE2N/7Kg mFbXHCCDFJ4EB75FZwsfnwbprwAWFGbopvzHIj5Mh/5foZbu1poumiRQyhw5NeDJ1aHO 4j2Jwsqpgb5DEIeOmXppLTqO2N2eX2JCt7N/WOynObPYVU15+/J9d50JXd0jHdrZb4Y4 I1gA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc; bh=/SFO2pCMV3gPwvtQOGXn+Qc4NJaclcAlU1SlTQirwW0=; b=d1O/n5l+/Tifa8mNlCHKtNyEU2qTR7T84ACiisUi6xltJl05g2HHWnCsRHEu5om+/r EhLQhQySInwCd/9rGJhh0Yx/dzx09NVZITbRWp/oBe8yjSmnfjFoYXOdu8eledVew8NO e7K7WikVK7I4b7rIjKJCA93HG2QPSiIXZlb1hZc2PclVyXh9b10rMPm/L9HpZYSkOvlx PiqGMR4l5cidLS/O4BS1TLhuIzFSjByiDB1uQxEjfXF1YpIUkUqHpb+QMLezQE+fiEk4 uRmD+CdStz7twwdbtMEnODSYyvW66Yd9ua8WSya6ChvgMmTE2Vwo0kwbmW73I85/KGc5 HiwA== X-Gm-Message-State: ACgBeo3+o78wH9poRAlNB2ltZn8PGv+HV2Pu3CQyMK4HRTClQtzLYLvD dEIfudDyU7l73lO1XdzCbol0r7bw6pcSouZRDiw= X-Google-Smtp-Source: AA6agR7FLWFwjCMVbFFs2iNMCDzo9eEXLs1873ccDk4a+UC6MQ6b/crH5rg9NbY4DS1N3goOdFQm30MOrrPoebVY/BE= X-Received: by 2002:a17:902:cf0b:b0:171:270d:13ad with SMTP id i11-20020a170902cf0b00b00171270d13admr6286148plg.156.1660346786094; Fri, 12 Aug 2022 16:26:26 -0700 (PDT) In-Reply-To: Received-SPF: pass client-ip=2607:f8b0:4864:20::1031; envelope-from=owinebar@gmail.com; helo=mail-pj1-x1031.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.io gmane.emacs.devel:293396 Archived-At: --00000000000063a79705e6139cb2 Content-Type: text/plain; charset="UTF-8" On Fri, Aug 12, 2022, 5:50 PM Stefan Monnier wrote: > >> > I don't have this information. Maybe someone else does. But in > >> > general, it is a very small wonder that a parser written in optimized > >> > C is much faster than anything written in Emacs Lisp, given that Lisp > >> > is an interpreted language that has no special support for writing > >> > parsers. > >> That can be cured over time, now that the bulk of the core of emacs > >> uses lexical scoping. > > I very much doubt that. > > Agreed. > I noted the switch to lexical scope, because dynamic scope prevents any call from ever being a tail call in any meaningful sense. It's not automatic, but when the core of Emacs lisp code used dynamic scoping, changes to the VM to support proper tail recursion would have been meaningless. That is no longer the case. Once the VM supports proper tail recursion, it's straightforward to generate automata that never perform a function call, at least not as part of the recognizer. Lisp with proper tail recursion gives you the equivalent of computed goto's, which are not available in standards-conformant C AFAIK. If the compiler can prove certain local variables are always fixnums and remove the tag/untag overhead, then the generated IR passed to libgccjit (or clang) should be able to match anything expressible in C for those automata. The code in the actions may not be as optimizable as the recognition automata, I'll grant you. > ELisp code cannot match the speed of optimized C code. > > I suspect it could, to some extent, in theory. Getting there would > require a fair bit more work, probably using a different compilation > strategy than the AOT compiler we have now. > A good reference is JavaScript which shouldn't be noticeably easier to > compile efficiently and where many millions of dollars have been poured > over several years to speed it up. The result is indeed able to match > the performance of C nowadays for several non-trivial examples of code, > but it's far from obvious that we have the resources to reproduce this > feat for ELisp. The claim wasn't that every piece of ELisp code could be as optimized as sharply written C or C++ code, but that there is a subset that is (that can be targeted by tools like parser and lexer generators), at least at the level of an abstract RISC machine that the IRs for compiler backends like libgccjit or llvm let you target. I don't think the compiler has to match the level of optimization of the V8 compiler to get significant improvements in the performance of ELisp. We're really only talking about optimizations known to Common Lisp and Scheme compiler writers in the 1980s. However, the most significant improvements in performance will probably only come after changes that remove impediments to parallelizing work and improved memory management. Lynn --00000000000063a79705e6139cb2 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On Fri, Aug 12, 2022, 5:50 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
<= /div>
>> > I don't have this inf= ormation.=C2=A0 Maybe someone else does.=C2=A0 But in
>> > general, it is a very small wonder that a parser written in o= ptimized
>> > C is much faster than anything written in Emacs Lisp, given t= hat Lisp
>> > is an interpreted language that has no special support for wr= iting
>> > parsers.
>> That can be cured over time, now that the bulk of the core of emac= s
>> uses lexical scoping.
> I very much doubt that.

Agreed.

I noted the switch to lexical scope, because dynamic scope prevents = any call from ever being a tail call in any meaningful sense.=C2=A0 It'= s not automatic, but when the core of Emacs lisp code used dynamic scoping,= changes to the VM to support proper tail recursion would have been meaning= less.=C2=A0 That is no longer the case.
Once the VM = supports proper tail recursion, it's straightforward to generate automa= ta that never perform a function call, at least not as part of the recogniz= er.=C2=A0 Lisp with proper tail recursion gives you the equivalent of compu= ted goto's, which are not available in standards-conformant C AFAIK.=C2= =A0 If the compiler can prove certain local variables are always fixnums an= d remove the tag/untag overhead, then the generated IR passed to libgccjit = (or clang) should be able to match anything expressible in C for those auto= mata.
The code in the actions may not be as optimiza= ble as the recognition automata, I'll grant you.

> ELisp code cannot match the speed of optimized C code.

I suspect it could, to some extent, in theory.=C2=A0 Getting there would require a fair bit more work, probably using a different compilation
strategy than the AOT compiler we have now.

A good reference is JavaScript which shouldn't be noticeably easier to<= br> compile efficiently and where many millions of dollars have been poured
over several years to speed it up.=C2=A0 The result is indeed able to match=
the performance of C nowadays for several non-trivial examples of code,
but it's far from obvious that we have the resources to reproduce this<= br> feat for ELisp.

The claim wasn't that every piece of ELisp code could be as = optimized as sharply written C or C++ code, but that there is a subset that= is (that can be targeted by tools like parser and lexer generators), at le= ast at the level of an abstract RISC machine that the IRs for compiler back= ends like libgccjit or llvm let you target.=C2=A0=C2=A0

I don't think the compiler has to match= the level of optimization of the V8 compiler to get significant improvemen= ts in the performance of ELisp.=C2=A0 We're really only talking about o= ptimizations known to Common Lisp and Scheme compiler writers in the 1980s.= =C2=A0 However, the most significant improvements in performance will proba= bly only come after changes that remove impediments to parallelizing work a= nd improved memory management.

Lynn

--00000000000063a79705e6139cb2--