From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Lynn Winebarger Newsgroups: gmane.emacs.devel Subject: Re: Ideal performance of ELisp Date: Wed, 17 Aug 2022 08:41:45 -0400 Message-ID: References: <838rnxvdcq.fsf@gnu.org> <83r11ptksn.fsf@gnu.org> <83a68dti6w.fsf@gnu.org> <874jykzvx9.fsf@yahoo.com> <83fsi4sttn.fsf@gnu.org> <838rnws5c7.fsf@gnu.org> <838rntocb8.fsf@gnu.org> Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="15205"; mail-complaints-to="usenet@ciao.gmane.io" Cc: Eli Zaretskii , Po Lu , Alan Mackenzie , emacs-devel , Yuan Fu To: Stefan Monnier Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Wed Aug 17 14:42:57 2022 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1oOINt-0003jt-II for ged-emacs-devel@m.gmane-mx.org; Wed, 17 Aug 2022 14:42:57 +0200 Original-Received: from localhost ([::1]:60876 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oOINs-0005nf-EW for ged-emacs-devel@m.gmane-mx.org; Wed, 17 Aug 2022 08:42:56 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:50726) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oOIMz-0004tO-Tk for emacs-devel@gnu.org; Wed, 17 Aug 2022 08:42:01 -0400 Original-Received: from mail-pl1-x635.google.com ([2607:f8b0:4864:20::635]:41924) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1oOIMx-0007ee-Nm; Wed, 17 Aug 2022 08:42:01 -0400 Original-Received: by mail-pl1-x635.google.com with SMTP id p18so11905298plr.8; Wed, 17 Aug 2022 05:41:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc; bh=kdHDItGjwAt4ZHiyXHfiQMO573ceRLPhdb+FpKPXc6s=; b=TeRSOirIH2YC9rC9gVaZ7uLytDGz40CYjjWpZMH11NPmmFCv+fBL3z2s4nSxrHiaT8 swUFfNhqUgVfU+ew3CZPauAo3scZaVc5Hd/FqATys8TkQ3zgVMIVx3Q7NS3Nw0KSTZH5 MD/Ps/gKhtL8+xJ9Ce89eQ0eei4RPHn6wK3btBLEaQEinF615e/srqwrirg+PYGBaRka 1kNabwQ5i2nkIVpm4MYUYZLiXMe+TCgDkOyYOHID5mww1uTQAacLsxzsSzFacY0uNnYD ygq9awpfDSjaf4h4g27UlSC8eKBiYplQBFAarA+Fvq2dD+SEUw7tCEyAMLTXmjPyD92X RWtg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc; bh=kdHDItGjwAt4ZHiyXHfiQMO573ceRLPhdb+FpKPXc6s=; b=g1hAQWL8DwVM8TFGS2Q+bkZA85/+KFe0OuF07UYnx1s/fdFT8QUAykgJgiGneSxlKC Xn3FAMU49PbWwbNY0RJzAQ9tOue4nbgVOsNM0VbbXkwZn4TT3B+27ZVUQppxgfWB8zXL MjvF2U5RIVWi8rRlqzogQ/ezwMb9GW/OB9MC8ZbxLh3h55flsTNjdIz4RBXr/rvu82sp uuqlV+PUPg/7lSkpSLTjrmSb6t8HwHRwHi9OowVoSxGEWAwNvE/mVE2E/aHVtXZJSMYc uGZWLdrZ+H++zrVDSnEtpyLidhMHzuesTVXnkMdN7JEAQv9I5BqPhq4Zj3Hc+TyWNQfi 3iZQ== X-Gm-Message-State: ACgBeo0N1chjJuSP/ZASrzEoY0FZvUy2+wELwweQWLTHU0jI6HPsgepG XzzbOiDT1jzgS6ZIOT1/NwhhItJp8TGkkGkSwdE= X-Google-Smtp-Source: AA6agR6mnhz9LNfaXbIfliWsNPADgtcthMx+0HnGlJ/BYnPwXO4kVDN/PBPzqUaYkSuWH5pmOGtyZJN12auc0R9oBUQ= X-Received: by 2002:a17:903:50e:b0:170:d829:b3bb with SMTP id jn14-20020a170903050e00b00170d829b3bbmr25830619plb.93.1660740117570; Wed, 17 Aug 2022 05:41:57 -0700 (PDT) In-Reply-To: Received-SPF: pass client-ip=2607:f8b0:4864:20::635; envelope-from=owinebar@gmail.com; helo=mail-pl1-x635.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.io gmane.emacs.devel:293554 Archived-At: On Tue, Aug 16, 2022 at 1:22 PM Stefan Monnier wrote: > > > BTW, I was being serious - if there's a way to write a simple jump to > > do case-dispatching for that trampolining while loop, I'd definitely > > look at making semantic produce such automata so the native compiler, > > in particular, could optimize the result properly. > > AFAIK a (pcase x ('foo ..) ('bar ...) ...) should be compiled to > a `switch` bytecode which uses a hash-table lookup to find the > destination target to jump to. > > Not sure how well it works for very large tables where we risk bumping > into the 32K limit of bytecode jumps. That did occur to me. Perhaps one way to address that limitation without completely overhauling the VM would be to introduce a form of segmented memory for byte strings. It would require modification of the byte-code vector during the activation frame of exec_byte_code, but I believe there is a scheme that would ensure the byte-code object would be returned to its non-modified state before either executing a funcall or returning. It would require one new byte-code op, which I will call "trampoline", and a vector of byte-code strings (or "code segments") either at a known location in the constants vector, or as an additional entry in the byte-code vector itself. I'd lean toward the latter because it would be easy to discriminate byte-code vectors that are allowed to use the trampoline instruction without imposing additional space overhead for existing byte-vectors. The first entry in the code segment vector would be the "real" or "entry" byte-string, and no funcalls or returns would be performed except when that code segment was active. The trampoline instruction would two arguments off the operand stack - a code segment, and an offset in the code segment - set the byte string of the code vector to the indicated code segment, then set the pc to the offset to perform the jump. A typical entry byte-string for an automata might take a label and an arbitrary number of arguments, look up the label in a dispatch table, and trampoline to that label with the rest of the arguments on the stack. The following instructions might raise an error to indicate the label was invalid. The rest of the entry byte string would contain a block that would expect the operand stack to have the form [ fxn, arg0, .., argn, cont-segment,cont-offset ], then perform a funcall using all the elements on the operand stack but the bottom two, then swap the return value to the bottom of the operand stack and trampoline to the continuation label. Another block would be required that would just perform the return operation. There might be another for dealing with stack unwinding, I don't know the details of how that works. At any rate, I believe that would be a feasible scheme that would allow the byte-compiler to (a) overcome the address space limit, and (b) allow implementation of efficient tail recursion in simple cases like dispatching to locally-bound closures that are closed over the same lexical environment. The compiler could presumably do (b) now if there wasn't the concern for the limited address space. Such an instruction would definitely make writing an efficient automata easier. Lynn