From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: Rocky Bernstein <rocky@gnu.org>
Newsgroups: gmane.emacs.devel
Subject: Re: Update 1 on Bytecode Offset tracking
Date: Fri, 17 Jul 2020 09:47:38 -0400
Message-ID: <CANCp2gYXmRDMc0BugTsaT=oUiDwap6d1OGg_RsDg81B417fHGg@mail.gmail.com>
References: <87a700fk3j.fsf@gmail.com> <xjfsgdrgbuu.fsf@sdf.org>
 <875zanounr.fsf@gmail.com>
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="000000000000281d1b05aaa367b8"
Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214";
	logging-data="35006"; mail-complaints-to="usenet@ciao.gmane.io"
To: emacs-devel <emacs-devel@gnu.org>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Fri Jul 17 15:48:42 2020
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane-mx.org
Original-Received: from lists.gnu.org ([209.51.188.17])
	by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
	(Exim 4.92)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org>)
	id 1jwQj9-0008vl-PR
	for ged-emacs-devel@m.gmane-mx.org; Fri, 17 Jul 2020 15:48:39 +0200
Original-Received: from localhost ([::1]:46796 helo=lists1p.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.90_1)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org>)
	id 1jwQj8-0003SF-Ps
	for ged-emacs-devel@m.gmane-mx.org; Fri, 17 Jul 2020 09:48:38 -0400
Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:39242)
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <rocky.bernstein@gmail.com>)
 id 1jwQiR-00031R-AY
 for emacs-devel@gnu.org; Fri, 17 Jul 2020 09:47:55 -0400
Original-Received: from mail-lf1-f52.google.com ([209.85.167.52]:40121)
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 (Exim 4.90_1) (envelope-from <rocky.bernstein@gmail.com>)
 id 1jwQiP-0001MO-M1
 for emacs-devel@gnu.org; Fri, 17 Jul 2020 09:47:55 -0400
Original-Received: by mail-lf1-f52.google.com with SMTP id o4so6084765lfi.7
 for <emacs-devel@gnu.org>; Fri, 17 Jul 2020 06:47:52 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:references:in-reply-to:from:date
 :message-id:subject:to;
 bh=eAcebIFG5+lFt14O6poFaeqFAl89D24jIjjbi4ODAeY=;
 b=saMJv8QZizRiKRWbqczxGD5F0NcJxWWmv0wMFE5Af4GaJ1j9gr32h12320GHqtwehi
 gemn/FFDzo1g8urt24P3MdZqSSWvZvER4d/9eHtNAEedsLa1l09QclR9V2dhNtkSXSJK
 9s9oGYYdhyiDjfQHaG6WYfsu0XDkktnBlpHR4EmTrd697/hGwzq4GdXhbtPXoijiRdh0
 h98W5mSQDDru3QtZTpHCytlsqkYxDx4V05mGLXMqdQHDHGmzU+aXZQW86Id6iZYG19jh
 gBagvWT3Q/qNpa0qcFvLK0o5oXlCr3tCobQ2HKOCQZ6EAIFHwfeNDQP4tLM6nl6UBtNf
 JI6g==
X-Gm-Message-State: AOAM530+/+LOmvmozFFNGSRj58TK8Y+LkR0RwWKQdMTTj3AkJCeexmbP
 V4oPWHNdVBVb1k/VI+St7R5V4PWy/0jnjFoWcNUfsLWkIcs=
X-Google-Smtp-Source: ABdhPJxvbjsS4lguhxPAREokf8+u3iQT6HLXTify/npnsDDbdc67jfO69XFABOp7hibg7alsrgabkPwkjrOVX7iUnb4=
X-Received: by 2002:a05:6512:3190:: with SMTP id
 i16mr4996877lfe.184.1594993670625; 
 Fri, 17 Jul 2020 06:47:50 -0700 (PDT)
In-Reply-To: <875zanounr.fsf@gmail.com>
Received-SPF: pass client-ip=209.85.167.52;
 envelope-from=rocky.bernstein@gmail.com; helo=mail-lf1-f52.google.com
X-detected-operating-system: by eggs.gnu.org: First seen = 2020/07/17 09:47:51
X-ACL-Warn: Detected OS   = Linux 2.2.x-3.x [generic] [fuzzy]
X-Spam_score_int: -8
X-Spam_score: -0.9
X-Spam_bar: /
X-Spam_report: (-0.9 / 5.0 requ) BAYES_00=-1.9, FREEMAIL_FORGED_FROMDOMAIN=1,
 FREEMAIL_FROM=0.001, HEADER_FROM_DIFFERENT_DOMAINS=1, HTML_MESSAGE=0.001,
 RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-1, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001 autolearn=no autolearn_force=no
X-Spam_action: no action
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-devel>,
 <mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <https://lists.gnu.org/archive/html/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-devel>,
 <mailto:emacs-devel-request@gnu.org?subject=subscribe>
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org
Original-Sender: "Emacs-devel"
 <emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org>
Xref: news.gmane.io gmane.emacs.devel:253018
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/253018>

--000000000000281d1b05aaa367b8
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Wed, 15 Jul 2020 23:55:19 -0400 Stefan Monnier wrote:

> Sounds like a lot of information, which in turn implies a potentially
> high overhead (e.g. the "exact string" sounds like it might cost O(N²)
> in corner cases, yet provides redundant info that can be recovered from
> begin+end points). Note also that while `read` returns a sexp made
> exclusively of data coming from a particular buffer, the code after
> macro-expansion can include chunks coming from other buffers, so if we
> want to keep the same representation of "sexp with extra info" in both
> cases, we can't just assume "the buffer".


Yes, when I last looked there was bloat in the way source mappings are
done. But let me explain:

As a Google Summer of Code project, the project has always been a bit
behind. So the approach I had been taking was that if something is usable
for now, go with it and move on to other uncharted territory. In other
words, get something out, complete what remains and *only then* go back
and iterate on the parts that need improving. The C changes were a little
bit different because of the (necessarily) long lead time to get things
into master and because one can't put something inefficient into the core.

The source-code string is needed in the source map only at the top level.
(Oddly, the member name for this is "code".) I had suggested that offsets
should be relative to the beginning of the function, and that the function
node would record its position from the beginning of the container (e.g.
file) that it is in. However, this isn't a big deal, since conversions
between the two are easily done.
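
To make the conversion concrete, here is a minimal sketch. It is written in
Python purely for illustration and is not the actual Elisp structure: the
names FunctionNode, start_in_container, to_container_offset and
to_relative_offset are invented, only the "code" member name comes from the
real source map, and offsets are assumed to be 0-based (Emacs buffer
positions are 1-based).

```python
from dataclasses import dataclass


@dataclass
class FunctionNode:
    """Top-level source-map entry for one function (hypothetical layout)."""
    container: str            # e.g. a file or buffer name
    start_in_container: int   # 0-based offset of the function's first character
    code: str                 # exact source text of the function


def to_container_offset(fn: FunctionNode, relative: int) -> int:
    """Convert a function-relative offset into a container offset."""
    return fn.start_in_container + relative


def to_relative_offset(fn: FunctionNode, absolute: int) -> int:
    """Convert a container offset back into a function-relative offset."""
    return absolute - fn.start_in_container


fn = FunctionNode("foo.el", 120, "(defun foo (x) (1+ x))")
rel = fn.code.index("(1+ x)")          # offset of the subform inside the function
absolute = to_container_offset(fn, rel)
```

Either convention carries the same information; storing relative offsets
just means that edits elsewhere in the container only change
start_in_container and leave the offsets inside the function alone.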

As for handling bits of S-expressions that represent the conglomeration of
a number of containers/files, that's pretty easily handled inside the
structure. I am not totally clear about how the container information is
determined. I imagine some of it would be noticed in the parameters when
the macro is defined, and some of it each time the macro is expanded. But
once it is determined that certain S-expressions go with certain
containers, it is trivial to add that to a source-map object.
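
One way this could look, again as a hedged Python sketch rather than the
real Elisp object: each node optionally names its own container and
otherwise inherits its parent's. SexpNode, resolve_containers and the file
names are all made up for the example, and how the container gets
determined in the first place is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class SexpNode:
    """One S-expression in a source map (hypothetical layout)."""
    start: int
    end: int
    container: Optional[str] = None   # None means "same container as the parent"
    children: List["SexpNode"] = field(default_factory=list)


def resolve_containers(node: SexpNode, inherited: str) -> Iterator[Tuple[str, int, int]]:
    """Yield (container, start, end) for node and every descendant."""
    here = node.container or inherited
    yield (here, node.start, node.end)
    for child in node.children:
        yield from resolve_containers(child, here)


# A form read from foo.el whose second subform came from a macro defined in bar.el.
top = SexpNode(0, 40, "foo.el", [
    SexpNode(7, 10),
    SexpNode(3, 25, "bar.el", [SexpNode(8, 14)]),
])
spans = list(resolve_containers(top, "<unknown>"))
```

Offsets are then interpreted against whichever container the node resolves
to, so nothing has to assume a single buffer.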

One cool thing about having the source string stored in the source-map
object (whether just at the top level or in more places) is that
tracebacks can give exact information without searching around. In fact,
the source code may *never* have existed inside a file and this still
works.

Another great thing about this is that it can tolerate mismatches between
the Elisp that was compiled and the Elisp that you have available. If there
were changes outside the top-level object but not inside it, then it is
pretty easy to detect and correct for this. Even if the discrepancy is
inside the object, the differences are also easily detected. Adjusting is
a little more difficult, but still doable.
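
A rough Python sketch of that check, under the assumption that the source
map keeps the exact source string and the offset at which it was recorded.
The function name locate and the status strings are invented here, and
offsets are 0-based.

```python
from typing import Optional, Tuple


def locate(code: str, recorded_start: int, current_text: str) -> Tuple[str, Optional[int]]:
    """Find the stored source string `code` in the text that is available now."""
    end = recorded_start + len(code)
    if current_text[recorded_start:end] == code:
        return ("unchanged", recorded_start)
    # Edits outside the object only shift it, so search for the exact text.
    found = current_text.find(code)
    if found != -1:
        return ("moved", found)
    # The object itself was edited: detected, but fixing up offsets needs a diff.
    return ("modified", None)


code = "(defun foo (x) (1+ x))"
original = ";; header\n" + code
recorded = len(";; header\n")

same = locate(code, recorded, original)
shifted = locate(code, recorded, ";; a longer header\n" + code)
changed = locate(code, recorded, original.replace("1+", "1-"))
```

Note that find returns the first occurrence, so a container holding two
textually identical definitions would need an extra tie-break, for example
preferring the match closest to the recorded offset.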


--000000000000281d1b05aaa367b8--