From: John Williams
Newsgroups: gmane.emacs.devel
Subject: Re: Proposal: stack traces with line numbers
Date: Wed, 18 Oct 2017 08:00:42 -0700
To: Helmut Eller
Cc: emacs-devel@gnu.org

I just remembered something else about weak hash maps that I forgot to
mention earlier. Most people who try to use weak references eventually
get burned by surprising nondeterministic behavior, but the design I'm
proposing has an interesting property I first learned about in the
context of the JavaScript WeakMap type: as long as the map's :test is
'eq and :weakness is 'key, and the map is only ever accessed using
gethash and puthash, nondeterministic behavior is impossible to
observe, because a given entry becomes eligible for garbage collection
precisely when it is no longer accessible using gethash. This property
allows you to effectively add a new field to certain instances of an
existing type without altering the type itself.

On Mon, Oct 16, 2017 at 3:43 PM, John Williams wrote:
> On Sun, Oct 15, 2017 at 3:01 AM, Helmut Eller wrote:
>> On Sat, Oct 14 2017, John Williams wrote:
>>> The information is "attached"
>>> using cons cells as keys in a weak-key hash table. [...]
>>
>> Unless you care about interpreted code, a non-weak hash-table should be
>> enough. I think this hash table should work similar to
>> read-symbol-positions-list.
>
> What I had in mind was a single global hashtable, because that way
> it's easy to make it look as if the source refs are physically part of
> the annotated cons cells, and users of the API don't need to be aware
> that a supplementary data structure even exists. But of course using a
> global hashtable with strong keys would create a huge space leak in
> the reader.
>
> Is there any particular disadvantage to using weak keys?
>
>>> - I'm storing the information in vectors because it seems like a
>>> reasonably efficient use of memory. [...]
>>
>> It's debatable whether a [file line column] vector is an efficient
>> representation. E.g. all lists in a source form come from the same file
>> (or buffer or string) so storing the same filename many times seems
>> redundant. It might also be reasonable to use different representations
>> in the debug info than for the data-structures used by the reader or
>> compiler.
>
> The file name would be a single string object shared by every ref in a
> given file (or nil when there is no file), so we'd only be saving a
> few words per source ref (one for the string itself, plus one or two
> saved by using a cons cell instead of a two-element vector.)
>
>>> - I'm saving line and column numbers rather than just byte/character
>>> offsets [...]
>>
>> Line/column pairs have the (minor) advantage that line numbers have a
>> higher probability to stay the same after small edits to the source.
>> But other than that, it seems to me that character offsets encode the
>> same information more compactly.
>
> It seems like we may be talking about different things. I'm speaking
> strictly about the in-memory representation produced by the reader,
> which will be quickly garbage-collected in most cases (assuming most
> elisp code is compiled, either as part of the Emacs build processes,
> or by the package manager). I haven't even thought about how to
> represent the same information in bytecode, but I assume it will be
> quite different, and more focused on compactness.
>
>>> - I'm only attaching information to the head of each list purely as a
>>> memory-saving measure. I can't think of a scenario where you'd need a
>>> source reference for a list without having its head available, except
>>> maybe in the expansion of a macro that disassembles its arguments and
>>> puts them back together in a new list. If it's an issue in practice,
>>
>> In Lisp almost everything is a macro, so I bet that this is an issue.
>
> Maybe.
> From what I can tell, most function calls in macro arguments
> are copied directly into the expansion, so no important information
> would be lost in the expansion process, except for a few outliers like
> iter-defun, which appears to completely re-assemble the code it's
> given. Attaching the same information to every cons cell wouldn't be
> difficult, though. Every cell in a given list could share the same
> source ref, so the main overhead would be the extra hash table
> entries. My guess is that doing so would roughly double or even triple
> the average memory footprint of a cons cell produced by the reader,
> but I don't think that would be a problem unless you're trying to run
> Emacs on an embedded platform, and it's a feature that could be easily
> compiled out or disabled at runtime if necessary.
>
>>> I think a better solution would be for the macro expander to propagate
>>> source refs to every cons cell in a macro argument at the point where
>>> macro expansion takes place.
>>
>> It's clearly desirable that source positions are propagated
>> automatically as often as possible. That job will be easier if the
>> reader records more information.
>
> It would definitely be simpler. I'll defer to others' opinions
> regarding the relative merits of each approach.
>
>> So, I think the reader should, at least optionally, also record
>> positions of every cons cell not just the first in a list. Also, in
>> addition to the start position the reader could/should also record the
>> end position.
>
> That would not be difficult to implement.
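
For concreteness, the weak-key idea from the top of this message could
be sketched roughly as below. This is only an illustration, not the
proposed patch: the names my-source-refs, my-set-source-ref and
my-source-ref are hypothetical, and the [file line column] vector
layout is simply the one discussed in the quoted text. The sketch
assumes the table is only ever touched through gethash and puthash
(never maphash or hash-table-count), which is the condition under
which collection of entries cannot be observed.

```elisp
;; Sketch only: `my-source-refs', `my-set-source-ref' and `my-source-ref'
;; are hypothetical names, not part of Emacs or of the proposed patch.

(defvar my-source-refs
  (make-hash-table :test 'eq :weakness 'key)
  "Global table mapping cons cells to [FILE LINE COLUMN] vectors.
Keys are held weakly, so an entry may be collected once its cons
cell is no longer reachable from anywhere else.")

(defun my-set-source-ref (cell file line column)
  "Attach a [FILE LINE COLUMN] source reference to CELL and return CELL."
  (puthash cell (vector file line column) my-source-refs)
  cell)

(defun my-source-ref (cell)
  "Return the source reference attached to CELL, or nil if there is none."
  (gethash cell my-source-refs))

;; Usage: the annotation behaves like an extra field on the cons cell.
(let ((form (list 'foo 1 2)))
  (my-set-source-ref form "example.el" 10 2)
  (my-source-ref form))        ; returns ["example.el" 10 2]
```

Because the test is eq, a structurally identical but distinct list has
no annotation, which is what makes the lookup behave like a field on
one particular object rather than on its contents.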