From: Alan Mackenzie
Newsgroups: gmane.emacs.devel
Subject: Re: Thoughts on getting correct line numbers in the byte compiler's warning messages
Date: Wed, 7 Nov 2018 17:00:36 +0000
Message-ID: <20181107170036.GA4934@ACM>
References: <20181101175953.GC4504@ACM> <20181105105302.GA10520@ACM> <20181106151143.GB4030@ACM>
To: Stefan Monnier
Cc: emacs-devel@gnu.org

Hello, Stefan.

On Tue, Nov 06, 2018 at 11:29:41 -0500, Stefan Monnier wrote:
> > I timed a bootstrap, unoptimised GCC, with an extra tag check and
> > storage to a global variable inserted into XFIXNUM.  (Currently there is
> > no such check there).  The slowdown was around 1.3%

> That accumulates for every data type, and it increases code size,
> reduces cache hit rate...

No, it applies mainly to FIXNUM, because XFIXNUM doesn't already check
the Lisp_Type.  Other object types already perform this check, so while
it would increase the code size (by how much?) it would have a lesser
run-time penalty.  There would be a slowdown in predicates like symbolp,
when the result is false.  This probably wouldn't amount to much in
practice.

Part of that 1.3% (I don't know how big a part) was GCC outputting
warning messages.

Anyhow, do we really need to worry about code size anymore?  temacs is
only 7.3 MB, and the machines people will be running it on will have
several, or more usually many, GB of RAM.  So what if it became 7.5 MB,
or even 8.0 MB?
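
To make the cost being timed concrete, here is a minimal sketch of a fixnum extractor with and without an extra tag check and a store to a global variable. It assumes a simplified, hypothetical tagging scheme (low two bits as the tag); the names `obj_t`, `TAG_POSITIONED`, `last_seen_tag`, `xfixnum_unchecked` and `xfixnum_checked` are invented for illustration and are not Emacs's lisp.h definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified tagging scheme, NOT Emacs's real lisp.h:
   the low 2 bits of a word hold a type tag, the rest hold the payload.  */
typedef intptr_t obj_t;

enum { TAG_BITS = 2, TAG_MASK = (1 << TAG_BITS) - 1 };
enum { TAG_FIXNUM = 0, TAG_POSITIONED = 1 };  /* TAG_POSITIONED is invented */

/* Stands in for the global variable stored to in the timing experiment.  */
int last_seen_tag = -1;

obj_t make_fixnum (intptr_t n)
{
  /* Shift as unsigned to avoid undefined behaviour on negative N.  */
  return (obj_t) (((uintptr_t) n << TAG_BITS) | TAG_FIXNUM);
}

/* Extraction with no tag check at all: just a shift.  Assumes the
   compiler implements >> on signed values as an arithmetic shift,
   as GCC and Clang do.  */
intptr_t xfixnum_unchecked (obj_t o)
{
  return o >> TAG_BITS;
}

/* The same extraction with one extra compare and a conditional store.  */
intptr_t xfixnum_checked (obj_t o)
{
  int tag = (int) (o & TAG_MASK);
  if (tag != TAG_FIXNUM)
    last_seen_tag = tag;
  return o >> TAG_BITS;
}
```

The unchecked version compiles to a single shift; the checked one adds a mask, a compare and a rarely taken branch on every extraction, which is the kind of overhead the 1.3% figure would be measuring.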

> You may find it acceptable, but I don't, mostly because I know
> fundamentally it's not needed: it's only introduced for short/medium
> term convenience (to avoid having to rewrite a lot of code).
> And I can't see how we'll be able to get rid of it in the long run
> (gradually or not).
> So in the long run it's a bad option.

Yes, it may be a bad option, but possibly less bad than the other bad
options we have.

> > Many of the original forms produced by the reader survive these
> > transformations.

This, as it happens, is not true.  Many of the symbols produced by the
reader survive; none of the cons forms do.  cconv, we love you.  ;-(

> Yeah, that's why I thought of using a hash-table.

> > I've tried 2., and given up on it: everywhere in the compiler where FORM
> > is transformed to NEWFORM, a copy of a hash has to be created for
> > NEWFORM.

I've rediscovered why I gave up on the hash table approach 2.  That's
because cconv-convert chews up EVERY list it is presented with and spits
out one which is not EQ to the original, though it is usually EQUAL.
I'm not saying it was written with the object of frustrating the current
exercise (I'm sure it wasn't), but I will say that if that had been the
objective, the end result wouldn't be different from what we now have.

cconv.el would need to be entirely rewritten if we stick to the hash
table approach.  It wouldn't survive anything like unscathed even in an
"extended Lisp Object" solution.

Maybe it would be possible to defer cconv.el processing till after macro
expansion and byte-opt.el stuff.  Would this do any good?

The only vague idea I have for saving this, and I don't like it one bit,
is somehow to redefine \` (and possibly \,) in such a way that it would
copy the source position from the original list to the result.

> Same with your new scheme: everywhere where a "big cons-cell" is
> transformed, by a macro you'll get a "small cons-cell".
> That's a constant of all options, AFAICT.
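
The EQ-versus-EQUAL problem described for cconv-convert above can be sketched outside Lisp. The following is an illustration only, assuming a toy cons type and a small linear table standing in for an EQ hash table; `struct cons`, `record_position`, `lookup_position`, `lists_equal` and `copy_list` are all hypothetical names, not compiler or Emacs functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical cons cell; not Emacs's struct Lisp_Cons.  */
struct cons { int car; struct cons *cdr; };

/* Tiny identity-keyed table mapping a cell's address to a source line.
   A linear array stands in for an EQ hash table.  */
enum { TABLE_SIZE = 64 };
struct entry { const struct cons *key; int line; };
struct entry table[TABLE_SIZE];
size_t table_used;

void record_position (const struct cons *c, int line)
{
  assert (table_used < TABLE_SIZE);
  table[table_used].key = c;
  table[table_used].line = line;
  table_used++;
}

/* Return the recorded line, or -1 if this exact cell was never recorded.  */
int lookup_position (const struct cons *c)
{
  for (size_t i = 0; i < table_used; i++)
    if (table[i].key == c)      /* pointer identity, i.e. EQ */
      return table[i].line;
  return -1;
}

/* Structural equality, i.e. EQUAL for this toy list type.  */
int lists_equal (const struct cons *a, const struct cons *b)
{
  while (a && b)
    {
      if (a->car != b->car)
        return 0;
      a = a->cdr;
      b = b->cdr;
    }
  return a == b;                /* true only if both reached NULL */
}

/* Rebuild a list cell by cell, as a form-rewriting pass might.  */
struct cons *copy_list (const struct cons *c)
{
  struct cons *head = NULL, **tail = &head;
  for (; c; c = c->cdr)
    {
      struct cons *n = malloc (sizeof *n);
      assert (n != NULL);
      n->car = c->car;
      n->cdr = NULL;
      *tail = n;
      tail = &n->cdr;
    }
  return head;
}
```

A position recorded for the original list is found again only through the original cells. The rebuilt list compares structurally equal, yet `lookup_position` returns -1 for it, so every pass that rebuilds forms would have to copy the table entries across explicitly.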

The "extended" symbols would survive.  That is a big plus.

> > Also, there's no convenient key for recording the hash of an
> > occurrence of a symbol (such as `if').

> Ah, right, I keep forgetting this detail.

Yes, that's a major downer.

> > 3. is what I'm proposing, I think.

> Yes [sorry, you had to guess; I thought it was clear enough].

> > The motivating thing here is that the rest of the system can handle
> > NEW-SPECIAL-OBJECT and get the same result it would have from OBJECT.
> > Hence the use of Lisp_Type 1, or possibly a new pseudovector type.

> How 'bout we don't try to add location to all objects, but only to some
> specific objects?  E.g. only cons-cells?

This could work, together with byte-compile-enclosing-form and a subform
number N to get at the non-cons objects (symbols, strings, ...) in a
cons or vector form.

> We could add a new "big cons-cell" type which shares the same tag, and
> just adds additional info after the end of the normal cons-cell
> (cons-cell would either be allocated from small_cons_blocks or
> big_cons_blocks, so you'd have to look at the enclosing cons_block to
> determine which kind of cons-cell you have).

I've been through these sorts of thoughts.  That idea would be less
effective than the "extended object", since it would only work with
conses, but might be less disruptive.  But why should it only work with
conses?  Why not with symbols, too?

> So normal code is not slowed down at all (except I guess for the GC
> which will be marginally slower).

Hmmm.  Maybe there's something in this idea.  :-)  Somehow we'd need to
determine the enclosing cons block, given the address of a cons, and
that could be slow.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).
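
On the cost of finding the enclosing cons block from the address of a cons: one possible approach, sketched below under stated assumptions, is to allocate every block at an address aligned to the (power-of-two) block size, so that masking the low bits of a cell's address yields the block header. Everything here is hypothetical (`BLOCK_SIZE`, `struct block`, `new_block`, `enclosing_block`, `is_big_cons`), and the sketch relies on C11's `aligned_alloc`; it does not claim to show how a big_cons_blocks scheme would actually be laid out in Emacs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: blocks have a power-of-two size and are allocated at an
   address that is a multiple of that size.  */
#define BLOCK_SIZE 1024

enum block_kind { SMALL_CONS, BIG_CONS };

struct cell { void *car, *cdr; };               /* hypothetical cons cell */
struct block_header { enum block_kind kind; };

#define CELLS_PER_BLOCK \
  ((BLOCK_SIZE - sizeof (struct block_header)) / sizeof (struct cell))

struct block
{
  struct block_header header;
  struct cell cells[CELLS_PER_BLOCK];
};

/* The header plus all cells must fit inside one aligned block.  */
_Static_assert (sizeof (struct block) <= BLOCK_SIZE, "block too large");

struct block *new_block (enum block_kind kind)
{
  struct block *b = aligned_alloc (BLOCK_SIZE, BLOCK_SIZE);
  assert (b != NULL);
  b->header.kind = kind;
  return b;
}

/* Clear the low bits of the cell's address to reach its block header.  */
struct block *enclosing_block (const struct cell *c)
{
  return (struct block *) ((uintptr_t) c & ~(uintptr_t) (BLOCK_SIZE - 1));
}

int is_big_cons (const struct cell *c)
{
  return enclosing_block (c)->header.kind == BIG_CONS;
}
```

Under these assumptions the lookup is one mask and one load, so distinguishing a "big" cons from a "small" one would not need a search through the list of blocks. Whether that alignment guarantee is available for a new block type is a separate question.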