From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Update 1 on Bytecode Offset tracking
Date: Wed, 15 Jul 2020 23:55:19 -0400
References: <87a700fk3j.fsf@gmail.com>
In-Reply-To: <87a700fk3j.fsf@gmail.com> (Zach Shaftel's message of "Wed, 15 Jul 2020 19:10:32 -0400")
To: Zach Shaftel
Cc: emacs-devel@gnu.org

> The second branch saves the offset only before a call.
> Therefore, the traceback on all of the functions other than the current
> one are accurate, but the current one is not accurate if the error
> happens in a byte op.

IIUC this has a negligible performance impact.  The info it provides is
not 100% accurate, but I think it's a "sweet spot": it does provide the
byte-offset info and is cheap enough to be acceptable into `master` with
no real downside.

I'd look at it as a "step" along the way: subsequent steps can be to
make use of that info, or to improve the accuracy of that info.

> The third branch bypasses invoking Ffuncall from within
> exec_byte_code, and instead does essentially the same thing that
> Ffuncall does, right in the Bcall ops.

This would be useful in its own right.  So I suggest you try and get
this code into shape for `master` as well.  I expect this will tend to
suffer from some amount of code duplication.  Maybe we can avoid it via
refactoring, or maybe by "clever" macro tricks, but if the speedup is
important enough, we can probably live with some amount of duplication.

> All of them print the offset next to function names in the backtrace like this:
>
> Debugger entered--Lisp error: (wrong-type-argument stringp t)
>        string-match(t t nil)
>     13 test-condition-case()
>        load("/home/zach/.repos/bench-compare.el/test/test-debug...")
>     78 byte-recompile-file("/home/zach/.repos/bench-compare.el/test/test-debug..." nil 0 t)
>     35 emacs-lisp-byte-compile-and-load()
>        funcall-interactively(emacs-lisp-byte-compile-and-load)
>        call-interactively(emacs-lisp-byte-compile-and-load record nil)
>    101 command-execute(emacs-lisp-byte-compile-and-load record)

Cool!

> With respect to reporting offsets, using code from edebug we have
> a Lisp-Expression reader that will track source-code locations and
> store the information in a source-map-expression cl-struct.  The code
> in progress is here.

How does the performance of this code compare to that of the "native"
`read`?
And to put it into perspective, have you looked at the relative
proportion of time spent in `read` during a "typical" byte compilation?

There's no doubt that preserving source code information will slow down
byte-compilation, but depending on how slow it gets we may find it's not
"worth it".

> Information currently saved is:
>
> * The expression itself
> * The exact string that was read
> * Begin and end points of the sexp in the buffer
> * source-map-expression children (for conses and vectors)

Sounds like a lot of information, which in turn implies a potentially
high overhead (e.g. the "exact string" sounds like it might cost O(N²)
in corner cases, yet provides redundant info that can be recovered from
begin+end points).

Note also that while `read` returns a sexp made exclusively of data
coming from a particular buffer, the code after macro-expansion can
include chunks coming from other buffers, so if we want to keep the same
representation of "sexp with extra info" in both cases, we can't just
assume "the buffer".


        Stefan
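
The remark above that the "exact string" is recoverable from the begin
and end points can be illustrated with a minimal sketch.  It is not the
source-map-expression code under discussion: the struct `my-sexp-loc`
and both functions are hypothetical names made up for this example, and
it assumes the buffer is not modified between reading and recovering
the text.  The record stores the buffer explicitly, since a location
cannot simply assume "the buffer".

```elisp
;;; -*- lexical-binding: t -*-
(require 'cl-lib)

;; Hypothetical record for illustration only; this is NOT the
;; source-map-expression struct from the message being replied to.
(cl-defstruct (my-sexp-loc (:constructor my-sexp-loc-create))
  sexp     ; the Lisp object returned by `read'
  buffer   ; the buffer the text came from
  begin    ; buffer position where the sexp starts
  end)     ; buffer position just after the sexp

(defun my-read-with-loc ()
  "Read one sexp at point in the current buffer, recording its bounds.
Only leading whitespace is skipped before recording the start, so a
comment in front of the sexp would end up inside the recorded span."
  (skip-chars-forward " \t\r\n")
  (let* ((begin (point))
         (sexp (read (current-buffer)))  ; leaves point after the sexp
         (end (point)))
    (my-sexp-loc-create :sexp sexp :buffer (current-buffer)
                        :begin begin :end end)))

(defun my-sexp-loc-string (loc)
  "Return the exact text LOC was read from, using only its bounds."
  (with-current-buffer (my-sexp-loc-buffer loc)
    (buffer-substring-no-properties (my-sexp-loc-begin loc)
                                    (my-sexp-loc-end loc))))

;; Usage: no string is stored at read time, it is rebuilt on demand.
(with-temp-buffer
  (insert "  (foo (bar baz))  ")
  (goto-char (point-min))
  (let ((loc (my-read-with-loc)))
    (list (my-sexp-loc-sexp loc) (my-sexp-loc-string loc))))
```

If every nested sub-expression kept its own copy of the text instead,
the innermost forms would be copied once per enclosing level, which is
presumably where the O(N²) corner case would come from in deeply nested
input; storing two positions per node avoids that.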