From: Zach Shaftel
To: emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Update 2 on Bytecode Offset tracking
Date: Tue, 28 Jul 2020 15:19:24 -0400
Message-ID: <87v9i74dm3.fsf@gmail.com>
References: <87a700fk3j.fsf@gmail.com>

Another update on my Summer of Code project, improving ELisp traceback information.

On the C side (saving the bytecode execution offset so mapbacktrace can pass it to Lisp), I’ve submitted a patch with the offset tracking code to bug-gnu-emacs. This is the version which updates the thread’s offset only before reaching a Bcall op in exec_byte_code.

On the Lisp side (turning the offset into a source code position), right now I have a very rudimentary modified byte-compiler which compiles expressions annotated with source code data (in the form of a cl-struct). The code is up here.

The main entry point is a function source-map-byte-compile-definition, similar to byte-compile, which:

  1. takes a string or sexp containing a function definition
  2. reads it into a source-map-expression struct containing source code information (and the original sexp)
  3. passes the struct through the compilation process, maintaining an alist associating source-map-expressions to LAP.
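
Step 2 above can be illustrated with a minimal sketch, assuming a much simpler struct than the one in the repository. The names my-src-expr and my-read-with-source are hypothetical and are not part of the actual code; the sketch only records the position and text of one top-level form, not of each subexpression, and it does not skip leading comments.

```elisp
(require 'cl-lib)

;; Hypothetical struct for illustration only; the real
;; source-map-expression struct may look quite different.
(cl-defstruct my-src-expr
  form    ; the sexp as returned by `read'
  start   ; buffer position where the form begins
  end     ; buffer position just after the form
  text)   ; the source text between START and END

(defun my-read-with-source (string)
  "Read one form from STRING and return it as a `my-src-expr'.
Leading whitespace is skipped so that START points at the form itself."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (skip-chars-forward " \t\n\r")
    ;; `read' on a buffer starts at point and leaves point after the form.
    (let* ((start (point))
           (form (read (current-buffer)))
           (end (point)))
      (make-my-src-expr
       :form form
       :start start
       :end end
       :text (buffer-substring-no-properties start end)))))
```

Calling (my-read-with-source "(* (+ arg 2) 3)") gives a record whose form slot holds the sexp and whose text slot holds the original source string.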

The result is stored in a bytecode-source-map struct and added to the symbol’s bytecode-source-map property. There are functions to retrieve the code for a specific offset in the function, or to fetch an alist of the code and LAP. For example:

(let ((lexical-binding t))
  (source-map-byte-compile-definition
   '(defalias 'plus2-times3 #'(lambda (arg) (* (+ arg 2) 3)))))

(disassemble #'plus2-times3)
;; byte code for plus2-times3:
;;   doc:   ...
;;   args: (arg1)
;; 0       dup
;; 1       constant  2
;; 2       plus
;; 3       constant  3
;; 4       mult
;; 5       return

(bytecode-source-map 'plus2-times3 0) ;; => "arg"
(bytecode-source-map 'plus2-times3 1) ;; => "2"
(bytecode-source-map 'plus2-times3 2) ;; => "(+ arg 2)"
(bytecode-source-map 'plus2-times3 3) ;; => "3"
(bytecode-source-map 'plus2-times3 4) ;; => "(* (+ arg 2) 3)"

(source-map-bytecomp-annotated-lap 'plus2-times3)
;; (("arg" byte-dup)
;;  ("2" byte-constant 2 . 0)
;;  ("(+ arg 2)" byte-plus . 0)
;;  ("3" byte-constant 3 . 1)
;;  ("(* (+ arg 2) 3)" byte-mult . 0)
;;  ("(* (+ arg 2) 3)" byte-return . 0))

It’s quite limited as it is. byte-optimize is disabled, and it assumes the expression is fully macroexpanded. cconv also isn’t supported yet, so only simple lexically-scoped functions work. If you’d like to try it out, there are instructions in the repository’s README for running the ERT tests (basically cd source-mapping, ./run-tests.sh).

It isn’t that much slower than byte-compile in simple tests, but there’s no way to get a realistic idea of performance while only simple expressions are supported. Since arefs are pretty fast, I’m guessing the struct slot accesses wouldn’t slow down execution too much, but creating so many records certainly uses a lot of memory.

I’d like to know what others think of this: if there are inherent flaws in this approach, if there’s a simpler or more obvious way to integrate this into the byte-compilation process, or any other comments.

-Zach