From: Alan Mackenzie
Newsgroups: gmane.emacs.devel
Subject: The poor quality of Emacs's backtraces
Date: Thu, 13 Jul 2023 13:35:58 +0000
To: emacs-devel@gnu.org

Hello, Emacs.

On running the test suite (make check) on my development version, I recently got lots of errors and backtraces in the files.log. This is fair enough.

What's not good is the message that precedes (or, in the test suite follows) the backtrace. In the current instance, it looks like:

    Test test-kill-buffer-auto-save-delete-no condition:
        (wrong-type-argument listp #[257 "\300\211\2!\262\1\207" [yes-or-no-p] 4 "\n\n(fn ARG124 &optional)" t])
       FAILED  376/406  test-kill-buffer-auto-save-delete-no (0.012664 sec)

So, this says that _something_ wasn't a list, without telling me what the something was. It may have been the value of a variable, or the result returned from evaluating a form, but the message gives no clue.

It says "wrong-type-argument", but doesn't say why it's wrong. It doesn't disclose which primitive detected the fault, though Emacs could easily do this.

It doesn't make any effort to say _where_ the bug was detected. If you're lucky, it might be somewhere in the first function reported in the backtrace, but only if there aren't any frivolous condition-case's in the code which render any remaining backtrace useless. But even if you know what function it's in, unless that function is short, you don't know _where_ in that function the error has occurred.
The Emacs backtrace functions don't output any coordinates. This could be done easily, for example by printing the current position and the total length for a compiled function (something like 56/120), or by outputting neighbouring forms for an interpreted function. This would be helpful information for debugging.

As already said, the actual backtrace itself is often of little use, due to the frivolous use of condition-case's. That's even if you stop the test suite truncating every line at ~70 characters. (Why is this done?) Currently the first few lines of my backtrace look like:

    Test test-kill-buffer-auto-save-delete-no backtrace:
      {comp-spill-lap-function} #f(compiled-function (form) "Byte-compile FORM, spilling data from the byte compiler." #)((lambda (arg124 &optional) (let ((f #'yes-or-no-p)) (funcall f arg124))))
      apply({comp-spill-lap-function} #f(compiled-function (form) "Byte-compile FORM, spilling data from the byte compiler." #) (lambda (arg124 &optional) (let ((f #'yes-or-no-p)) (funcall f arg124))) nil)
      comp-spill-lap-function((lambda (arg124 &optional) (let ((f #'yes-or-no-p)) (funcall f arg124))))
      comp-spill-lap((lambda (arg124 &optional) (let ((f #'yes-or-no-p)) (funcall f arg124))))
      comp--native-compile((lambda (arg124 &optional) (let ((f #'yes-or-no-p)) (funcall f arg124))) nil "/tmp/test-nativecomp-cache-Dmx7GP/30.0.50-7a56150c...")
      comp-trampoline-compile(yes-or-no-p)
      comp-subr-trampoline-install(yes-or-no-p)
      {test-overlay-regions} #f(compiled-function () #)()
      test-kill-buffer-auto-save(110 {test-kill-buffer-auto-save} #f(compiled-function () #))
      {test-kill-buffer-auto-save} #f(compiled-function () #)()

(The symbols in braces are an enhancement I'm currently working on to give more information for anonymous functions.)

If anybody can match up the "wrong-type-argument" message to this backtrace, they're a better hacker than me - What precisely in comp-spill-lap-function is using some unknown list primitive on the byte compiled function?
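
To illustrate why the message alone cannot answer that question, here is a minimal sketch that can be evaluated in any Emacs. The helper name `my-demo-error-data` and the variable `my-demo-data` are made up for this illustration and are not part of Emacs. It shows that the data carried by a `wrong-type-argument` error is just the predicate that failed and the offending value; the primitive that signalled the error (here `car`) is not recorded anywhere in it.

```elisp
;;; -*- lexical-binding: t -*-

;; Hypothetical helper, not part of Emacs: call THUNK and return the
;; error data it signals, or nil if it returns normally.
(defun my-demo-error-data (thunk)
  (condition-case err
      (progn (funcall thunk) nil)
    (error err)))

;; `car' applied to a non-list signals `wrong-type-argument'.
(defvar my-demo-data
  (my-demo-error-data (lambda () (car 1))))

;; The data has the shape (wrong-type-argument PREDICATE VALUE).
;; Nothing in it names `car' or the place where it was called.
(message "%S" my-demo-data)
;; prints: (wrong-type-argument listp 1)
```

Any other primitive that requires a list, such as `cdr` or `length` on a non-sequence, produces data of the same shape, which is why the condition line in the test log does not narrow down the culprit.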
So, on the principle of using ALL the information that is available, we might as well disassemble that byte-compiled function, which might give a clue:

    \300      byte-constant yes-or-no-p
    \211      byte-dup
    \2        stack-ref 2        ; ARG124
    !         byte-call 1
    \262\1    stack-set 1
    \207      return

.... but not much of one. The function is calling yes-or-no-p with its parameter, and then I think it's returning yes-or-no-p's result. Though why it's duplicating yes-or-no-p at the top of the stack, then overwriting it with the result is unclear.

In the test test-kill-buffer-auto-save-delete-no, fancy things are done with cl-letf on (symbol-function 'yes-or-no-p), so the disassembled function above probably has something to do with that.

It's worth pointing out that there doesn't seem to be a way to get Emacs to disassemble a function, only a symbol with a function value.

########################################################################

I will eventually track down the cause of the above bug. But it will have taken me MUCH longer than it would have done, had the missing information actually been present in the backtrace. I think I might not be alone in wishing for these things to be improved.

So I propose that the quality of our backtraces be improved, and nominate myself as the person to do the work.

-- 
Alan Mackenzie (Nuremberg, Germany).