From: Alan Mackenzie
Newsgroups: gmane.emacs.devel
Subject: Thoughts on getting correct line numbers in the byte compiler's warning messages
Date: Thu, 1 Nov 2018 17:59:53 +0000
Message-ID: <20181101175953.GC4504@ACM>
To: emacs-devel@gnu.org

Hello, Emacs.

Most of the time, the byte compiler identifies the correct place of
error in its warning messages.  This is remarkable, given the crude
hack which it uses.  However, it sometimes fails, and this has given
rise to a number of bug reports, e.g., bug #22288, and several others
which have been merged with it.

In bug #22288:

    (defun test ()
      (let (a))
      a)

, the byte compiler correctly reports "reference to free variable 'a'",
but wrongly gives the source position as L2 C9 rather than L3 C3.

The problem is that the Emacs Lisp source code being compiled is first
read, and this discards the line/column numbers of the constructs
created.

I believe that, somehow, accurate source position information must be
preserved.  But how?

It is not easy.  The forms created by the reader go through several
(?many) transformative phases where they get replaced by successor
forms.  This makes things more difficult.

My first idea to track position information was for the reader to
create a hash table of conses (the key) and positions (the value), so
that the position could be found simply by accessing the entry
corresponding with the current form.  This doesn't work so easily,
because the transformative phases described in the previous paragraph
replace the original conses with new ones which have no entry in the
table.  Then I tried duplicating a hash table entry when a
transformation was effected.  This was just too tedious and error
prone, and was also slow.

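To make the difficulty with the first idea concrete, here is a minimal
sketch.  It is only an illustration, not code from the byte compiler:
the names `posdemo-positions', `posdemo-record' and `posdemo-lost' are
hypothetical, and the offset 42 is an arbitrary made-up value.  It
assumes the table is keyed on cons identity (`eq'), and shows that a
form built by a transformation has no entry of its own.

```elisp
;; Hypothetical sketch: none of these names exist in Emacs.
(defvar posdemo-positions (make-hash-table :test 'eq)
  "Maps cons cells created by the reader to their source offsets.")

(defun posdemo-record (form offset)
  "Record OFFSET as the source position of the cons FORM; return FORM."
  (puthash form offset posdemo-positions)
  form)

(defun posdemo-lost ()
  "Return the recorded positions of a form and of its transformed successor."
  (let* ((original (posdemo-record (list 'when 'x 'y) 42))
         ;; A transformation (here a hand-written stand-in for macro
         ;; expansion) builds fresh conses, which are not in the table.
         (expanded (list 'if (nth 1 original)
                         (cons 'progn (nthcdr 2 original)))))
    (list (gethash original posdemo-positions)
          (gethash expanded posdemo-positions))))

;; (posdemo-lost) => (42 nil)
```

Every transformation would have to copy the entry across by hand to
avoid that nil, which is the duplication described above.
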
The second idea was still to maintain this hash table, but on each
transformation to write the result back to the same cons cell as the
original.  I actually put quite a lot of work into this approach, but
in the end didn't get very far.  It was just too much detailed work,
too fiddly.

The third idea is to amend the reader so that whereas it now produces
a form, in a special byte compiler mode it would produce the cons
(form . offset).  So, for example, the text "(not a)" currently gets
read into the form (not . (a . nil)).  The amended reader would produce

    (((not . 1) . ((a . 5) . (nil . 6))) . 0)

(where 0, 1, 5, and 6 are the textual offsets of the elements coded).

Such forms would require special versions of `cons', `car', `cdr',
`cond', ..., `mapcar', ... to be easily manipulable.  These versions
would be macros to begin with, but probably primitives ultimately.
Assuming appropriate design, it should be possible to substitute these
new macros/primitives for the existing cons/car/cdr/...s in the byte
compiler without too much related change.

I'm still exploring this scheme.

I feel that this bug is not intractable, though it will take quite a
lot of work to fix.

Comments?

-- 
Alan Mackenzie (Nuremberg, Germany).
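
P.S.  A rough sketch of how the (form . offset) representation from the
third idea might be handled.  This is an assumption-laden illustration
rather than a proposed implementation: the names `posform-example',
`posform-offset', `posform-car', `posform-strip' and
`posform-strip-chain' are hypothetical, and it assumes that a list cell
always has an annotated element (a cons) in its car, while the
terminator of a list is an annotated atom such as (nil . 6).

```elisp
;; Hypothetical sketch: none of these names exist in Emacs.
(defvar posform-example '(((not . 1) . ((a . 5) . (nil . 6))) . 0)
  "The annotated form for the text \"(not a)\", as given above.")

(defun posform-offset (annotated)
  "Return the textual offset stored in ANNOTATED, a cons (FORM . OFFSET)."
  (cdr annotated))

(defun posform-car (annotated)
  "Return the annotated first element of the list held in ANNOTATED."
  (car (car annotated)))

(defun posform-strip (annotated)
  "Return the plain form encoded by ANNOTATED, discarding all offsets."
  (let ((form (car annotated)))
    (if (consp form)
        (posform-strip-chain form)
      form)))

(defun posform-strip-chain (cell)
  "Strip CELL, which is either a list cell or an annotated terminator."
  (if (consp (car cell))
      ;; List cell: an annotated element followed by the rest of the chain.
      (cons (posform-strip (car cell))
            (posform-strip-chain (cdr cell)))
    ;; Terminator such as (nil . 6): keep the atom, drop the offset.
    (car cell)))

;; (posform-strip posform-example)                => (not a)
;; (posform-offset posform-example)               => 0
;; (posform-offset (posform-car posform-example)) => 1
```

A position-aware `cdr' is deliberately left out here: in this
representation the tail ((a . 5) . (nil . 6)) carries no offset of its
own, so what such a function should return is one of the design
questions still to be settled.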