From: Alan Mackenzie
Newsgroups: gmane.emacs.devel
Subject: Re: scratch/accurate-warning-pos: Solid progress: the branch now bootstraps.
Date: Mon, 26 Nov 2018 09:48:00 +0000
Message-ID: <20181126094800.GA4030@ACM>
In-Reply-To: <60ac9dfc-b540-89f9-68ea-ec7cceaa8511@cs.ucla.edu>
To: Paul Eggert
Cc: michael_heerdegen@web.de, eliz@gnu.org, emacs-devel@gnu.org,
    cpitclaudel@gmail.com, monnier@IRO.UMontreal.CA

Hello, Paul.

On Sun, Nov 25, 2018 at 17:41:39 -0800, Paul Eggert wrote:
> Alan Mackenzie wrote:

> > Because of macros. These macros are typically already compiled.

> Even a compiled macro operates via the interpreter. So we could have a
> separate interpreter used only by the byte compiler. The byte-compiler
> interpreter would operate more slowly than the normal interpreter, but
> that's OK. The main and the byte-compiler interpreter could mostly be
> written with shared code, without slowing down the main interpreter.
> Admittedly this would not be a project for the fainthearted.

Indeed not. Where and how would this help with getting accurate source
code positions?

> > If you could come up with a solid proposal which would fix the bug
> > without slowing down Emacs at all, we'd all be most appreciative.

> I'm afraid I don't understand the bug well enough yet to know whether
> any proposal I can come up with would be "solid". For one thing, any
> method of outputting source-code locations will founder in the
> presence of macros.

scratch/accurate-warning-pos seems to do rather well in this regard.

> Even GCC, which tries to do a reasonably good job of this and isn't
> limited by the Lisp reader, doesn't do well with the sort of C macros
> I tend to write. My admittedly uninformed guess is that there is no
> such thing as a "solid" solution here, only solutions that work better
> and/or worse on various example sets.

The example set scratch/accurate-warning-pos works well on is the Lisp
code comprising Emacs.

> That being said, here's another possibility: don't bother attaching
> source-code positions to symbols, since duplicate symbols can
> appear in the input and the source-code positions can't be retrieved
> reliably.

The source code positions are attached not to symbols, but to symbol
_occurrences_.

> Instead, attach positions to input objects that are guaranteed to be
> unique so that retrieval is trivial.

I think you mean conses here. I've tried this approach, spending a lot
of time on it but not getting very far. The problem is, Lisp objects
flow through lots of different conses as they are transformed by the
compiler. Have a look at cconv-convert, which processes every function.
I'm not sure that even a single cons in the input form survives through
to the output. The symbols do survive, though, in the main.

> Do the attachment via a hash table so that the input objects are
> unchanged and we don't need to change much of anything except the
> byte-compiler's diagnostic code (plus a read function that fills in
> the hash table as it reads).

Using conses as keys? See previous paragraph. The approach I tried
before to implement this was to ensure that after any source
transformation, the result was written back to the original cons using
setcar and setcdr.
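
The difference between rebuilding a form and writing it back can be
sketched roughly as follows. This is not the code from the attempt
described here: every name prefixed with sketch- is invented for
illustration, the position 42 is made up, and the let-to-progn rewrite
is an arbitrary stand-in for a real compiler transformation.

```elisp
;; Illustrative only: every name below is invented for this sketch.
(defvar sketch-positions (make-hash-table :test #'eq)
  "Map from cons cells (compared by identity) to source positions.")

(defun sketch-rewrite-fresh (form)
  "Return FORM with its head replaced by `progn', using a new cons.
This stands in for any compiler pass that rebuilds the form."
  (cons 'progn (cdr form)))

(defun sketch-rewrite-in-place (form)
  "Like `sketch-rewrite-fresh', but write the result back into FORM."
  (let ((new (sketch-rewrite-fresh form)))
    (setcar form (car new))
    (setcdr form (cdr new))
    form))

(defun sketch-demo ()
  "Return positions found after a fresh rewrite and an in-place one."
  (let ((form (list 'let (list 'a) 'a)))
    (puthash form 42 sketch-positions)  ; 42 is a made-up position
    (list (gethash (sketch-rewrite-fresh form) sketch-positions)
          (gethash (sketch-rewrite-in-place form) sketch-positions))))

(sketch-demo) ; => (nil 42)
```

The first lookup fails because the rebuilt form is a different cons from
the one used as the key; the second succeeds only because the original
cons was reused.
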
This rapidly became unwieldy, with, for example, versions of setq and
mapcar which had extra parameters indicating the result cons, and so on.
This involved extensive amendment of large portions of the compiler.

> When the byte compiler needs the source code location corresponding to
> a symbol, it looks for the closest unique object nearby and uses its
> location.

"Nearby"? Warning messages are typically associated with symbol
occurrences (not conses), and are found when a recursive compiler
routine is presented with a symbol rather than a cons. Not all the
time, but a lot of the time. Hence the scratch/accurate-warning-pos
approach of attaching source positions to symbol occurrences.

> For example, the source expression for the bug#22288 test case:
> (defun test () (let (a)) a)
> has five conses in its top level list, two conses at the top of its
> second level list (let (a)), and one cons in its third level list (a).
> Each cons corresponds to a source code position (or if you prefer more
> accuracy, multiple positions for the start and end of the
> corresponding source-code and/or for the starts and ends of the source
> code corresponding to the cons's car and cdr). This will let the byte
> compiler narrow down where every subexpression lies, with
> significantly more accuracy than what we have now. In the bug#22288
> example, the last cons in the top-level list should be attached to the
> precise source code location for the 'a' that we want to issue a
> diagnostic about.

Yes, in theory. In practice, as already said, the source code flows
between lots of cons cells as it is transformed by functions like
cconv-convert and those in byte-opt.el. As said, countering this would
involve lots of tedious amendment to the compiler, with the emphasis on
"lots". I've already tried this, and given up.

--
Alan Mackenzie (Nuremberg, Germany).