From: John Williams
Newsgroups: gmane.emacs.devel
Subject: Re: Proposal: stack traces with line numbers
Date: Mon, 16 Oct 2017 15:43:38 -0700
To: Helmut Eller
Cc: emacs-devel@gnu.org
List-Id: "Emacs development discussions."
Xref: news.gmane.org gmane.emacs.devel:219593

On Sun, Oct 15, 2017 at 3:01 AM, Helmut Eller wrote:
> On Sat, Oct 14 2017, John Williams wrote:
>> The information is "attached"
>> using cons cells as keys in a weak-key hash table. [...]
>
> Unless you care about interpreted code, a non-weak hash-table should be
> enough.  I think this hash table should work similar to
> read-symbol-positions-list.

What I had in mind was a single global hashtable, because that way
it's easy to make it look as if the source refs are physically part of
the annotated cons cells, and users of the API don't need to be aware
that a supplementary data structure even exists. But of course using a
global hashtable with strong keys would create a huge space leak in
the reader. Is there any particular disadvantage to using weak keys?

>> - I'm storing the information in vectors because it seems like a
>> reasonably efficient use of memory. [...]
>
> It's debatable whether a [file line column] vector is an efficient
> representation.  E.g. all lists in a source form come from the same file
> (or buffer or string) so storing the same filename many times seems
> redundant.  It might also be reasonable to use different representations
> in the debug info than for the data-structures used by the reader or
> compiler.

The file name would be a single string object shared by every ref in a
given file (or nil when there is no file), so we'd only be saving a
few words per source ref (one for the string itself, plus one or two
saved by using a cons cell instead of a two-element vector).

>> - I'm saving line and column numbers rather than just byte/character
>> offsets [...]
>
> Line/column pairs have the (minor) advantage that line numbers have a
> higher probability to stay the same after small edits to the source.
> But other than that, it seems to me that character offsets encode the
> same information more compactly.

It seems like we may be talking about different things. I'm speaking
strictly about the in-memory representation produced by the reader,
which will be quickly garbage-collected in most cases (assuming most
elisp code is compiled, either as part of the Emacs build process or
by the package manager). I haven't even thought about how to represent
the same information in bytecode, but I assume it will be quite
different, and more focused on compactness.

>> - I'm only attaching information to the head of each list purely as a
>> memory-saving measure.  I can't think of a scenario where you'd need a
>> source reference for a list without having its head available, except
>> maybe in the expansion of a macro that disassembles its arguments and
>> puts them back together in a new list.  If it's an issue in practice,
>
> In Lisp almost everything is a macro, so I bet that this is an issue.

Maybe. From what I can tell, most function calls in macro arguments
are copied directly into the expansion, so no important information
would be lost in the expansion process, except for a few outliers like
iter-defun, which appears to completely re-assemble the code it's
given.

Attaching the same information to every cons cell wouldn't be
difficult, though. Every cell in a given list could share the same
source ref, so the main overhead would be the extra hash table
entries. My guess is that doing so would roughly double or even triple
the average memory footprint of a cons cell produced by the reader,
but I don't think that would be a problem unless you're trying to run
Emacs on an embedded platform, and it's a feature that could be easily
compiled out or disabled at runtime if necessary.

>> I think a better solution would be for the macro expander to propagate
>> source refs to every cons cell in a macro argument at the point where
>> macro expansion takes place.
>
> It's clearly desirable that source positions are propagated
> automatically as often as possible.  That job will be easier if the
> reader records more information.

It would definitely be simpler. I'll defer to others' opinions
regarding the relative merits of each approach.

> So, I think the reader should, at least optionally, also record
> positions of every cons cell not just the first in a list.  Also, in
> addition to the start position the reader could/should also record the
> end position.

That would not be difficult to implement.
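
For illustration only, here is a rough sketch of the side-table idea
discussed at the top of this message: a single global hash table with
weak keys, mapping cons cells to [file line column] vectors. Every
name below (my-source-refs, my-source-ref-put, my-source-ref-get) is
hypothetical and not part of any existing Emacs API or of the actual
patch; the only real calls assumed are make-hash-table, puthash and
gethash.

```elisp
;;; -*- lexical-binding: t -*-
;; Sketch only: all `my-' names are hypothetical, not existing Emacs API.

(defvar my-source-refs (make-hash-table :test 'eq :weakness 'key)
  "Side table mapping cons cells to [FILE LINE COLUMN] vectors.
Keys are compared with `eq' and held weakly, so an entry can be
dropped once its cons cell is no longer referenced elsewhere.")

(defun my-source-ref-put (cell file line column)
  "Attach a source ref for FILE, LINE and COLUMN to CELL; return CELL.
FILE is meant to be one string object shared by all refs in a file."
  (puthash cell (vector file line column) my-source-refs)
  cell)

(defun my-source-ref-get (cell)
  "Return the [FILE LINE COLUMN] vector attached to CELL, or nil."
  (gethash cell my-source-refs))

;; Usage: annotate the head cell of a freshly built list, then look it up.
(let ((file "foo.el")                     ; shared file-name string
      (form (list 'message "hello")))
  (my-source-ref-put form file 12 2)
  (my-source-ref-get form))               ; => ["foo.el" 12 2]
```

Callers only ever see the two accessor functions, so the table itself
stays an implementation detail, and switching between strong and weak
keys would be a one-line change in the defvar.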