From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!.POSTED!not-for-mail
From: Alan Mackenzie <acm@muc.de>
Newsgroups: gmane.emacs.devel
Subject: Re: Thoughts on getting correct line numbers in the byte compiler's
	warning messages
Date: Wed, 7 Nov 2018 17:00:36 +0000
Message-ID: <20181107170036.GA4934@ACM>
References: <20181101175953.GC4504@ACM>
	<jwvbm788ln6.fsf-monnier+gmane.emacs.devel@gnu.org>
	<20181105105302.GA10520@ACM>
	<jwvo9b2z5y2.fsf-monnier+emacs@gnu.org> <20181106151143.GB4030@ACM>
	<jwvwopq6v8f.fsf-monnier+emacs@gnu.org>
NNTP-Posting-Host: blaine.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: blaine.gmane.org 1541610249 5810 195.159.176.226 (7 Nov 2018 17:04:09 GMT)
X-Complaints-To: usenet@blaine.gmane.org
NNTP-Posting-Date: Wed, 7 Nov 2018 17:04:09 +0000 (UTC)
User-Agent: Mutt/1.10.1 (2018-07-13)
Cc: emacs-devel@gnu.org
To: Stefan Monnier <monnier@iro.umontreal.ca>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Wed Nov 07 18:04:05 2018
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([208.118.235.17])
	by blaine.gmane.org with esmtp (Exim 4.84_2)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1gKRFM-0001PF-Jl
	for ged-emacs-devel@m.gmane.org; Wed, 07 Nov 2018 18:04:04 +0100
Original-Received: from localhost ([::1]:49413 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1gKRHS-0008EF-Vw
	for ged-emacs-devel@m.gmane.org; Wed, 07 Nov 2018 12:06:15 -0500
Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:36796)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from <acm@muc.de>)
	id 1gKRDE-0002x4-Hg
	for emacs-devel@gnu.org; Wed, 07 Nov 2018 12:01:55 -0500
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <acm@muc.de>) id 1gKRDB-0005zD-PE
	for emacs-devel@gnu.org; Wed, 07 Nov 2018 12:01:52 -0500
Original-Received: from colin.muc.de ([193.149.48.1]:63198 helo=mail.muc.de)
	by eggs.gnu.org with smtp (Exim 4.71) (envelope-from <acm@muc.de>)
	id 1gKRD7-0005W0-Vw
	for emacs-devel@gnu.org; Wed, 07 Nov 2018 12:01:47 -0500
Original-Received: (qmail 45708 invoked by uid 3782); 7 Nov 2018 17:01:38 -0000
Original-Received: from acm.muc.de (p5B147AD4.dip0.t-ipconnect.de [91.20.122.212]) by
	colin.muc.de (tmda-ofmipd) with ESMTP;
	Wed, 07 Nov 2018 18:01:37 +0100
Original-Received: (qmail 5153 invoked by uid 1000); 7 Nov 2018 17:00:36 -0000
Content-Disposition: inline
In-Reply-To: <jwvwopq6v8f.fsf-monnier+emacs@gnu.org>
X-Delivery-Agent: TMDA/1.1.12 (Macallan)
X-Primary-Address: acm@muc.de
X-detected-operating-system: by eggs.gnu.org: FreeBSD 9.x [fuzzy]
X-Received-From: 193.149.48.1
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/emacs-devel/>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Original-Sender: "Emacs-devel" <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Xref: news.gmane.org gmane.emacs.devel:231049
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/231049>

Hello, Stefan.

On Tue, Nov 06, 2018 at 11:29:41 -0500, Stefan Monnier wrote:
> > I timed a bootstrap, unoptimised GCC, with an extra tag check and
> > storage to a global variable inserted into XFIXNUM.  (Currently there is
> > no such check there).  The slowdown was around 1.3%

> That accumulates for every data type, and it increases code size,
> reduces cache hit rate...

No, it applies mainly to FIXNUM, because XFIXNUM doesn't already check
the Lisp_Type.  Other object types already perform this check, so while
it would increase the code size (by how much?) it would have a smaller
run-time penalty.  There would be a slowdown in predicates like
symbolp, when the result is false.  This probably wouldn't amount to
much in practice.
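
For illustration only, here is a toy model of the kind of check being
timed.  It is not the real lisp.h code: the names, the 3-bit tag layout
and the choice to write the global only on a mismatch are all
assumptions made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model, NOT the real lisp.h: a Lisp word with a 3-bit tag in
   the low bits.  All names here are hypothetical.  */
typedef intptr_t toy_object;

enum { TOY_TAG_BITS = 3, TOY_TAG_MASK = (1 << TOY_TAG_BITS) - 1 };
enum toy_type { TOY_SYMBOL = 0, TOY_EXTENDED = 1, TOY_FIXNUM = 2 };

/* Global written when the check fails; stands in for the "storage
   to a global variable" (assumed to happen only on a mismatch).  */
int toy_tag_mismatches;

toy_object
toy_make_fixnum (intptr_t n)
{
  /* Shift as unsigned to avoid undefined behaviour for negative N.  */
  return (toy_object) (((uintptr_t) n << TOY_TAG_BITS) | TOY_FIXNUM);
}

/* Unchecked accessor: just shift the tag away.  Relies on an
   arithmetic right shift for negative values, as on common compilers.  */
intptr_t
toy_xfixnum_unchecked (toy_object o)
{
  return o >> TOY_TAG_BITS;
}

/* Checked accessor: one extra mask, compare and (rarely) a store.  */
intptr_t
toy_xfixnum_checked (toy_object o)
{
  if ((o & TOY_TAG_MASK) != TOY_FIXNUM)
    toy_tag_mismatches++;
  return o >> TOY_TAG_BITS;
}
```

The two accessors differ only by the mask, the compare and the branch,
which is the cost being measured; accessors for other types would be
paying for a comparable check already.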

Part of that 1.3% (I don't know how big a part) was GCC outputting
warning messages.

Anyhow, do we really need to worry about code size anymore?  temacs is
only 7.3 MB, and the machines people will be running it on will have
several, or more usually many, GB of RAM.  So what if it became 7.5 MB,
or even 8.0 MB?

> You may find it acceptable, but I don't, mostly because I know
> fundamentally it's not needed: it's only introduced for short/medium
> term convenience (to avoid having to rewrite a lot of code).
> And I can't see how we'll be able to get rid of it in the long run
> (gradually or not).

> So in the long run it's a bad option.

Yes, it may be a bad option, but possibly less bad than the other bad
options we have.

> > Many of the original forms produced by the reader survive these
> > transformations.

This, as it happens, is not true.  Many of the symbols produced by the
reader survive; none of the cons forms do.  cconv, we love you. ;-(

> Yeah, that's why I thought of using a hash-table.

> > I've tried 2., and given up on it: everywhere in the compiler where FORM
> > is transformed to NEWFORM, a copy of a hash has to be created for
> > NEWFORM.

I've rediscovered why I gave up on the hash table approach 2.  That's
because cconv-convert chews up EVERY list it is presented with and
spits out one which is not EQ to the original, though it is usually
EQUAL.  I'm not saying it was written with the object of frustrating
the current exercise (I'm sure it wasn't), but I will say that if that
had been the objective, the end result wouldn't be different from what
we now have.
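
To make the EQ/EQUAL point concrete, here is a small toy model in C
(hypothetical names throughout, nothing to do with the real cconv.el or
Emacs hash tables): positions are recorded in a table keyed on the
address of a cons, and a transformation which rebuilds the list yields
cells the table has never seen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy cons of ints; cdr == NULL ends the list.  */
struct toy_cons { int car; struct toy_cons *cdr; };

struct toy_cons *
toy_make_cons (int car, struct toy_cons *cdr)
{
  struct toy_cons *c = malloc (sizeof *c);
  assert (c != NULL);
  c->car = car;
  c->cdr = cdr;
  return c;
}

/* Structural comparison, in the spirit of `equal'.  */
int
toy_equal (const struct toy_cons *a, const struct toy_cons *b)
{
  for (; a && b; a = a->cdr, b = b->cdr)
    if (a->car != b->car)
      return 0;
  return a == b;		/* True only if both lists ended.  */
}

/* Rebuild a list cell by cell, as a tree-walking pass might.  */
struct toy_cons *
toy_rebuild (const struct toy_cons *l)
{
  return l ? toy_make_cons (l->car, toy_rebuild (l->cdr)) : NULL;
}

/* Tiny identity-keyed table: linear search comparing addresses,
   standing in for a hash table with an `eq' test.  */
enum { TOY_MAX_ENTRIES = 16 };
struct toy_pos_entry { const struct toy_cons *key; int line; };
struct toy_pos_entry toy_pos_table[TOY_MAX_ENTRIES];
int toy_pos_count;

void
toy_record_line (const struct toy_cons *key, int line)
{
  assert (toy_pos_count < TOY_MAX_ENTRIES);
  toy_pos_table[toy_pos_count].key = key;
  toy_pos_table[toy_pos_count].line = line;
  toy_pos_count++;
}

/* Return the recorded line, or -1 if KEY was never registered.  */
int
toy_lookup_line (const struct toy_cons *key)
{
  for (int i = 0; i < toy_pos_count; i++)
    if (toy_pos_table[i].key == key)
      return toy_pos_table[i].line;
  return -1;
}
```

After toy_rebuild, the copy compares equal to the original under
toy_equal, yet toy_lookup_line finds nothing for it.  Every pass that
rebuilds lists would have to re-register its output to keep the
positions.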

cconv.el would need to be entirely rewritten if we stick to the hash
table approach.  It wouldn't survive anything like unscathed even in an
"extended Lisp Object" solution.

Maybe it would be possible to defer cconv.el processing till after macro
expansion and byte-opt.el stuff.  Would this do any good?

The only vague idea I have for saving this, and I don't like it one bit,
is somehow to redefine \` (and possibly \,) in such a way that it would
somehow copy the source position from the original list to the result.

> Same with your new scheme: everywhere where a "big cons-cell" is
> transformed by a macro, you'll get a "small cons-cell".
> That's a constant of all options, AFAICT.

The "extended" symbols would survive.  That is a big plus.

> > Also, there's no convenient key for recording the hash of an
> > occurrence of a symbol (such as `if').

> Ah, right, I keep forgetting this detail.  Yes, that's a major downer.

> > 3. is what I'm proposing, I think.

> Yes [ sorry, you had to guess; I thought it was clear enough].

> > The motivating thing here is that the rest of the system can handle
> > NEW-SPECIAL-OBJECT and get the same result it would have from OBJECT.
> > Hence the use of Lisp_Type 1, or possibly a new pseudovector type.

> How 'bout we don't try to add location to all objects, but only to some
> specific objects?  E.g. only cons-cells?

This could work, together with byte-compile-enclosing-form and a subform
number N to get at the non-cons objects (symbols, strings, ..) in a cons
or vector form.

> We could add a new "big cons-cell" type which shares the same tag, and
> just adds additional info after the end of the normal cons-cell
> (cons-cell would either be allocated from small_cons_blocks or
> big_cons_blocks, so you'd have to look at the enclosing cons_block to
> determine which kind of cons-cell you have).

I've been through this sort of thought.  That idea would be less
effective than the "extended object", since it would only work with
conses, but might be less disruptive.  But why should it only work with
conses?  Why not with symbols, too?

> So normal code is not slowed down at all (except I guess for the GC
> which will be marginally slower).

Hmmm.  Maybe there's something in this idea.  :-)  Somehow we'd need to
determine the enclosing cons block, given the address of a cons, and
that could be slow.
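
One way the lookup might be kept cheap, sketched here purely as an
assumption: if every cons block were allocated on an aligned boundary,
the enclosing block could be found by masking the low bits of the
cons's address.  The names, the 1024-byte block size and the header
layout below are all made up for the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed block size and alignment, and room reserved for a header.  */
enum { TOY_BLOCK_ALIGN = 1024, TOY_HEADER_SIZE = 16 };

struct toy_cell { void *car; void *cdr; };

/* Header at the start of every block: which kind of cell it holds.  */
struct toy_block_header { int is_big; };

/* A "big" cell: an ordinary cell followed by position info.  */
struct toy_big_cell { struct toy_cell cell; int line; int column; };

/* Allocate one block on a TOY_BLOCK_ALIGN boundary (C11).  */
void *
toy_alloc_block (int is_big)
{
  struct toy_block_header *b
    = aligned_alloc (TOY_BLOCK_ALIGN, TOY_BLOCK_ALIGN);
  assert (b != NULL);
  b->is_big = is_big;
  return b;
}

/* Address of the Ith cell in BLOCK; the stride depends on the kind.  */
struct toy_cell *
toy_block_cell (void *block, size_t i)
{
  struct toy_block_header *h = block;
  size_t stride = (h->is_big ? sizeof (struct toy_big_cell)
		   : sizeof (struct toy_cell));
  assert (TOY_HEADER_SIZE + (i + 1) * stride <= (size_t) TOY_BLOCK_ALIGN);
  return (struct toy_cell *) ((char *) block + TOY_HEADER_SIZE + i * stride);
}

/* Find the enclosing block by masking the low address bits.  */
struct toy_block_header *
toy_enclosing_block (const struct toy_cell *c)
{
  return (struct toy_block_header *)
    ((uintptr_t) c & ~(uintptr_t) (TOY_BLOCK_ALIGN - 1));
}

/* Return the recorded line, or -1 for a cell in a small block.  */
int
toy_cell_line (const struct toy_cell *c)
{
  if (!toy_enclosing_block (c)->is_big)
    return -1;
  return ((const struct toy_big_cell *) c)->line;
}
```

Under that assumption the test costs a mask and a load from the block
header, and only code which asks for a position pays it.  Without
aligned blocks it would need some sort of search.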

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).