From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Rocky Bernstein Newsgroups: gmane.emacs.devel Subject: Re: Update 1 on Bytecode Offset tracking Date: Fri, 17 Jul 2020 09:47:38 -0400 Message-ID: References: <87a700fk3j.fsf@gmail.com> <875zanounr.fsf@gmail.com> Mime-Version: 1.0 Content-Type: multipart/alternative; boundary="000000000000281d1b05aaa367b8" Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="35006"; mail-complaints-to="usenet@ciao.gmane.io" To: emacs-devel Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Fri Jul 17 15:48:42 2020 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1jwQj9-0008vl-PR for ged-emacs-devel@m.gmane-mx.org; Fri, 17 Jul 2020 15:48:39 +0200 Original-Received: from localhost ([::1]:46796 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jwQj8-0003SF-Ps for ged-emacs-devel@m.gmane-mx.org; Fri, 17 Jul 2020 09:48:38 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:39242) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jwQiR-00031R-AY for emacs-devel@gnu.org; Fri, 17 Jul 2020 09:47:55 -0400 Original-Received: from mail-lf1-f52.google.com ([209.85.167.52]:40121) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jwQiP-0001MO-M1 for emacs-devel@gnu.org; Fri, 17 Jul 2020 09:47:55 -0400 Original-Received: by mail-lf1-f52.google.com with SMTP id o4so6084765lfi.7 for ; Fri, 17 Jul 2020 06:47:52 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; 
bh=eAcebIFG5+lFt14O6poFaeqFAl89D24jIjjbi4ODAeY=; b=saMJv8QZizRiKRWbqczxGD5F0NcJxWWmv0wMFE5Af4GaJ1j9gr32h12320GHqtwehi gemn/FFDzo1g8urt24P3MdZqSSWvZvER4d/9eHtNAEedsLa1l09QclR9V2dhNtkSXSJK 9s9oGYYdhyiDjfQHaG6WYfsu0XDkktnBlpHR4EmTrd697/hGwzq4GdXhbtPXoijiRdh0 h98W5mSQDDru3QtZTpHCytlsqkYxDx4V05mGLXMqdQHDHGmzU+aXZQW86Id6iZYG19jh gBagvWT3Q/qNpa0qcFvLK0o5oXlCr3tCobQ2HKOCQZ6EAIFHwfeNDQP4tLM6nl6UBtNf JI6g== X-Gm-Message-State: AOAM530+/+LOmvmozFFNGSRj58TK8Y+LkR0RwWKQdMTTj3AkJCeexmbP V4oPWHNdVBVb1k/VI+St7R5V4PWy/0jnjFoWcNUfsLWkIcs= X-Google-Smtp-Source: ABdhPJxvbjsS4lguhxPAREokf8+u3iQT6HLXTify/npnsDDbdc67jfO69XFABOp7hibg7alsrgabkPwkjrOVX7iUnb4= X-Received: by 2002:a05:6512:3190:: with SMTP id i16mr4996877lfe.184.1594993670625; Fri, 17 Jul 2020 06:47:50 -0700 (PDT) In-Reply-To: <875zanounr.fsf@gmail.com> Received-SPF: pass client-ip=209.85.167.52; envelope-from=rocky.bernstein@gmail.com; helo=mail-lf1-f52.google.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/07/17 09:47:51 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -8 X-Spam_score: -0.9 X-Spam_bar: / X-Spam_report: (-0.9 / 5.0 requ) BAYES_00=-1.9, FREEMAIL_FORGED_FROMDOMAIN=1, FREEMAIL_FROM=0.001, HEADER_FROM_DIFFERENT_DOMAINS=1, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "Emacs development discussions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.io gmane.emacs.devel:253018 Archived-At: --000000000000281d1b05aaa367b8 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

On Wed, 15 Jul 2020 23:55:19 -0400 Stefan Monnier wrote:

> Sounds like a lot of information, which in turn implies a potentially
> high overhead (e.g. the "exact string" sounds like it might cost O(N²)
> in corner cases, yet provides redundant info that can be recovered from
> begin+end points). Note also that while `read` returns a sexp made
> exclusively of data coming from a particular buffer, the code after
> macro-expansion can include chunks coming from other buffers, so if we
> want to keep the same representation of "sexp with extra info" in both
> cases, we can't just assume "the buffer".

Yes, when I last looked, there was bloat in the way source mappings are done. But let me explain:

As a Google Summer of Code project, the project has always been a bit behind. So the approach I had been taking was that if something is usable for now, go with it and move on to other uncharted territory. In other words, get something out, complete what remains, and *only then* go back and iterate on the parts that need improving. The C changes were a little bit different because of the (necessarily) long lead time to get things into master and because one can't put something inefficient into the core.

The source-code string is needed in the source map only at the top level. (Oddly, the member name for this is "code".) I had suggested that offsets should be relative to the beginning of the function, and that the function node would hold its position from the beginning of the container (e.g. file) that it is in. However, this isn't a big deal, since conversions are easily done.
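
To make the offset conversion concrete, here is a minimal sketch. It is written in Python rather than Elisp and is not the actual source-map code from the project. The names `FunctionMap`, `Node`, `container_pos`, `to_absolute`, `to_relative` and `text_of` are made up for illustration; only the member name "code" comes from the description above. It also assumes 0-based, end-exclusive offsets, which is not how Emacs buffer positions work.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    """Hypothetical source-map node; offsets are relative to the function start."""
    rel_start: int
    rel_end: int


@dataclass
class FunctionMap:
    """Hypothetical top-level entry; the source string is stored only here."""
    container: Optional[str]  # e.g. a file name, or None if the code never lived in a file
    container_pos: int        # offset of the function within its container
    code: str                 # exact source text of the top-level form


def to_absolute(fm: FunctionMap, node: Node) -> Tuple[int, int]:
    """Convert function-relative offsets to container-relative offsets."""
    return (fm.container_pos + node.rel_start, fm.container_pos + node.rel_end)


def to_relative(fm: FunctionMap, start: int, end: int) -> Tuple[int, int]:
    """Convert container-relative offsets back to function-relative offsets."""
    return (start - fm.container_pos, end - fm.container_pos)


def text_of(fm: FunctionMap, node: Node) -> str:
    """Recover the exact source text of a node from the top-level string."""
    return fm.code[node.rel_start:node.rel_end]


# Usage: one function that starts at offset 120 of some container.
code = "(defun add1 (x) (+ x 1))"
fm = FunctionMap(container="example.el", container_pos=120, code=code)
start = code.index("(+ x 1)")
body = Node(rel_start=start, rel_end=start + len("(+ x 1)"))

print(text_of(fm, body))
print(to_absolute(fm, body))
```

With this layout, subexpressions only need a pair of integers each, and the text of any node can be sliced out of the single top-level string, so going from one offset convention to the other is one addition or subtraction per endpoint.
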
As for handling bits of S-expressions that represent the conglomeration of a number of containers/files, that's pretty easily handled inside the structure. I am not totally clear about how the container information is determined. I imagine some of it would be noticed in the parameters when the macro is defined, and some of it each time the macro is expanded. But once it is determined that certain S-expressions go with certain containers, it is trivial to add that to a source-map object.

One cool thing about having the source string stored in the source-map object (whether just at the top level or in more places) is that tracebacks can give exact information without searching around. In fact, the source code may have *never* existed inside a file and this still works.

Another great thing about this is that it can tolerate mismatches between the Elisp that was compiled and the Elisp that is available. If there were changes outside the top-level object but not inside the object, then it is pretty easy to detect and correct for this. Even if the discrepancy is inside the object, the differences are also easily detected. Adjusting is a little more difficult, but still doable.

--000000000000281d1b05aaa367b8 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
--000000000000281d1b05aaa367b8--