From: "Eric S. Raymond"
Newsgroups: gmane.emacs.devel
Subject: Re: Insight into the mystery hangs
Date: Mon, 12 Feb 2024 13:26:16 -0500
Organization: Eric Conspiracy Secret Labs
References: <20240211213737.3A38C18A1647@snark.thyrsus.com> <868r3psv2s.fsf@gnu.org>
In-Reply-To: <868r3psv2s.fsf@gnu.org>
Reply-To: esr@thyrsus.com
To: Eli Zaretskii
Cc: emacs-devel@gnu.org

Eli Zaretskii:
> > From: "Eric S. Raymond"
> > Date: Sun, 11 Feb 2024 16:37:37 -0500 (EST)
> >
> > However. Emacs is not entirely off the hook here. When I'm not under
> > deadline pressure I will file a bug with a title something like
> > "With debug-on-quit enabled, Emacs does not reliably raise a debug
> > trace on interrupt of call-process"
>
> Isn't that call issued from the mode-line display? If so, that is
> done from redisplay, and redisplay cannot enter debugger, so it
> catches all errors. If you want to produce Lisp backtraces from Lisp
> code called by redisplay, you need to use the facilities documented in
> the node "Debugging Redisplay" in the ELisp Reference manual.

1. Thinking about it, I can see why redisplay can't be allowed to enter
   the debugger. Infinite regress...

2. I don't know if that subprocess is called from the modeline code.
   Probably, but I'd have to dig into vc.el to check. I won't have time
   for that for a few days yet.

3. Assuming that it is called from the modeline code, the question
   shifts from "Why did I have so much trouble generating a debug
   trace?" to "How could I get one at all?" There's some kind of timing
   issue, I think.

Just to make this saga more interesting, after I turned in my last
report I was dismayed to find that the hang on mode initialization
wasn't *entirely* banished.
Fortunately I had a test case that would reproduce it reliably. Some
bisecting revealed that SRC had a *real* hang bug (not a mere
pseudo-hang due to a long-running command). It seems I omitted a loop
break while I was performing what I thought was a safe refactor. Last
September...

644 cases in my test suite and none of them caught it, nor did I
encounter it in heavy production use between then and yesterday. The
trigger is some strange corner case in parsing RCS masters.

I repaired the code, but...

...this directs my attention to the fact that Emacs makes it
generally difficult to notice and diagnose ill-behaved subprocesses,
with problems behind the mode-line being the extreme case.

Alas, I don't know what to do about this other than windmill my arms
at the dev group.
-- 
Eric S. Raymond
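
For anyone following Eli's pointer to the "Debugging Redisplay" node, here is a minimal sketch of that setup. It assumes Emacs 29 or later, where the backtrace-on-redisplay-error variable is available. Whether it captures a quit during a call-process issued from the mode line is not settled anywhere in this thread, so treat it as a starting point rather than a confirmed fix.

```elisp
;; Sketch only; assumes Emacs 29+ (backtrace-on-redisplay-error).

;; Ask Emacs to record a backtrace when Lisp code run by redisplay
;; (for example an :eval mode-line construct) signals an error.
(setq backtrace-on-redisplay-error t)

;; As in the original report.  Per Eli's reply, the debugger cannot be
;; entered from redisplay, so this only helps for code run elsewhere.
(setq debug-on-quit t)

;; After reproducing the problem, look at the trace buffer by hand;
;; Emacs does not pop it up automatically.
(when (get-buffer "*Redisplay-trace*")
  (display-buffer "*Redisplay-trace*"))
```

The last form is only a convenience: if nothing was recorded, the buffer does not exist and the form does nothing.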