From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Daniel Colascione Newsgroups: gmane.emacs.bugs Subject: bug#16775: dbus interacts poorly with lisp-level event loops Date: Mon, 17 Feb 2014 07:22:26 -0800 Message-ID: <53022932.5010503@dancol.org> References: <5301EAE0.5080008@dancol.org> <5301EFB0.7000400@dancol.org> <87k3ctg993.fsf@gmx.de> <53021436.9080704@dancol.org> <87mwhpn5fe.fsf@gmx.de> NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Trace: ger.gmane.org 1392650595 9937 80.91.229.3 (17 Feb 2014 15:23:15 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Mon, 17 Feb 2014 15:23:15 +0000 (UTC) Cc: 16775@debbugs.gnu.org To: Michael Albinus Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Mon Feb 17 16:23:21 2014 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1WFQ2T-0005MR-62 for geb-bug-gnu-emacs@m.gmane.org; Mon, 17 Feb 2014 16:23:21 +0100 Original-Received: from localhost ([::1]:41489 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WFQ2S-0002MI-P2 for geb-bug-gnu-emacs@m.gmane.org; Mon, 17 Feb 2014 10:23:20 -0500 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:54758) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WFQ2I-0002Lo-UA for bug-gnu-emacs@gnu.org; Mon, 17 Feb 2014 10:23:17 -0500 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1WFQ2A-0000y1-Pd for bug-gnu-emacs@gnu.org; Mon, 17 Feb 2014 10:23:10 -0500 Original-Received: from debbugs.gnu.org ([140.186.70.43]:55438) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WFQ2A-0000xt-I1 for bug-gnu-emacs@gnu.org; Mon, 17 Feb 2014 10:23:02 -0500 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.80) (envelope-from ) id 1WFQ2A-0007Tx-9X for bug-gnu-emacs@gnu.org; Mon, 17 Feb 2014 10:23:02 -0500 X-Loop: help-debbugs@gnu.org Resent-From: Daniel Colascione Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Mon, 17 Feb 2014 15:23:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 16775 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: Original-Received: via spool by 16775-submit@debbugs.gnu.org id=B16775.139265055228711 (code B ref 16775); Mon, 17 Feb 2014 15:23:02 +0000 Original-Received: (at 16775) by debbugs.gnu.org; 17 Feb 2014 15:22:32 +0000 Original-Received: from localhost ([127.0.0.1]:56619 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WFQ1f-0007Sy-6K for submit@debbugs.gnu.org; Mon, 17 Feb 2014 10:22:31 -0500 Original-Received: from dancol.org ([96.126.100.184]:48868) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WFQ1c-0007Sp-LL for 16775@debbugs.gnu.org; Mon, 17 Feb 2014 10:22:29 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=dancol.org; s=x; h=Content-Transfer-Encoding:Content-Type:In-Reply-To:References:Subject:CC:To:MIME-Version:From:Date:Message-ID; bh=2uZxuNW7Tk7ORb9EfbijDgtfPl7TOj+7fkYGfhPLZQ0=; b=MzFS870OB9b7QYmSXrj+Vaj6UP3tbqhbOxmAFP7eWBpOy/+AYikVTIgRs7aEKH3XzAAeX3YsfMfyY6lBp47ZamWm90HBs7IarMMAzqyzdqZyKw2wFTT90Mzd6pmjeL6Kly20qSjX+dyUJm3Xn3XIy9gw4+4fSyMB+mlmZrx4lMNLvD73znwbaNK336yLvM4ro8Q2KHtYhx4Z62KoeGuDU/f/i5GMNd6TVwaOKVFKHIDJJekI/xWvSIQsAC/+4w6fCN008oiC5MQYNZWX5jb1d0lLiWy/kBtr2mZGdMpSaiE4ilMRcG6Cw9XDi5LxLIMqDXlsDVyzLAxH9S0Do7lx4g==; Original-Received: from c-76-104-210-106.hsd1.wa.comcast.net ([76.104.210.106] helo=[192.168.1.50]) by dancol.org with esmtpsa (TLS1.0:DHE_RSA_CAMELLIA_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1WFQ1b-0000Xp-HK; Mon, 17 Feb 2014 07:22:27 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 In-Reply-To: <87mwhpn5fe.fsf@gmx.de> X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-Received-From: 140.186.70.43 X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Original-Sender: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.bugs:85735 Archived-At: On 02/17/2014 07:07 AM, Michael Albinus wrote: > Daniel Colascione writes: > >> On 02/17/2014 05:27 AM, Michael Albinus wrote: >>> Daniel Colascione writes: >>> >>>> dbus-call-method expects read-event to return the dbus event >>>> immediately, but read_char in keyboard.c treats the dbus event as a >>>> special event and runs it through special-event-map itself before >>>> sitting and reading another event. The event waiting loop always times >>>> out, so dbus-call-method always takes at least 100ms due to the >>>> hard-coded 0.1 timeout parameter to read-event. >>> >>> dbus-call-method does not expect the D-Bus event to be returned by >>> read-event. >> >> Then why was it calling dbus-check-event on the result? I checked in a >> hack that addresses the immediate issue. > > Just for safety. Likely, I was not convinced the time I was writing > this, that it couldn't be returned by read-event. Don't remember > exactly. > > Does it really hurt? It certainly makes it more difficult to determine the intent of the code. >>> It simply calls read-event in order to trigger event >>> handling. The loop itself checks, whether the respective event has been >>> inserted into dbus-return-values-table. And when *other* but D-Bus events >>> do arrive in the meantime, they must be preserved in unread-command-events. >>> >>> Why is it a problem to wait at least 100ms? D-Bus messages are not >>> expected to perform in real time (whatever this means). >> >> Because secrets.el was taking a whole second to load due almost >> entirely to dbus delays. > > Well, that's serious. I'll check. I used time emacs -Q -batch --eval "(require 'secrets)". > However, is it just secrets.el loading, or also normal operation in > secrets.el, which are delayed I don't know. Secrets has other problems that I have to fix separately. (By the way: dbus appears to hang if there's an error in a message spec and the bus disconnects from underneath us. Try (dbus-call-method :session "org.freedesktop.secrets" "/org/freedesktop/secrets/collection/login" "org.freedesktop.Secret.Collection" "SearchItems" '(:array (:dict-entry "host" ("xxxx" "xxxx")) (:dict-entry "port" "imaps"))). A signal of some sort pertaining to the disconnection is delivered to xd_read_message_1, but it drops this signal on the floor, leading dbus-call-method to loop until its timeout expires. > >>>> This problem is hairy: special-event-map functions can execute arbitrary >>>> code and re-enter the dbus synchronous event loop, and there's no way to >>>> non-locally terminate a particular read-event loop. Here's the >>>> problematic scenario: dbus-call-method works by setting up an >>>> asynchronous dbus call and calling read-event until the specific >>>> asynchronous call on which it is waiting completes. >>> >>> Why do you want to terminate non-locally in dbus-call-method? If you >>> need asynchronous behaviour, there is dbus-call-method-asynchronously. >> >> The goal is to make dbus-call-method return as soon as the method call >> is complete. > > We're speaking about 0.1sec delay. Does it really hurt? (Yes, I'll check > the secrets.el case). Yes, 100 milliseconds is far about the threshold at which interactivity is visibly affected. It's completely unacceptable. It's worse when a single logical operation involves multiple dbus calls. >>>> The immediate problem is that read-event never actually returns because >>>> the dbus event is special >>> >>> As said above this is not a problem but intended. >> >> I find it hard to believe that the overall effects were >> intentional. Randomly delaying all of Emacs because something tried to >> make a dbus call is completely unacceptable. > > You misunderstand me. Intentionally? No. You seem to be operating under the something interesting notion that it's reasonable for an interactive program to simply stop responding for 100ms. The threshold of human perception is widely regarded to be somewhere in the 30ms neighborhood. > It is not intended to delay Emacs. But there was an error with the old > implementation (Emacs was blocked sometimes during dbus-call-method); > that's why it is changed to be called asynchronously internally. By > this, you could interrupt dbus-call-method. Right. I looked up the change history, and making dbus-call-method async was the right choice. The event delivery, however, needs to be refined. There's no reason, in principle, that dbus-call-method shouldn't be able to return instantly on call completion. > >>>> --- but let's say we worked around that >>>> problem by modifying special-event-map around the read-event call so >>>> that read-event returned immediately. We'd still have a serious issue >>>> because *other*, non-dbus special event handles can run arbitrary code >>>> and enter an inner dbus-call-method reply-waiting loop. If the reply to >>>> the outer synchronous dbus call arrives before the reply to the inner >>>> synchronous dbus call, dbus-call-method-handler (which is run from >>>> special-event-map inside read-event or, in our hypothetical partial fix, >>>> manually from the wait loop) will dutifully put the reply on >>>> dbus-return-values-table. But the inner event loop has no way of waking >>>> the *outer* event loop, so when the special event handler that called >>>> the inner dbus-call-method returns, read_char will loop around and wait >>>> for the full timeout period before returning to the outer dbus-call-method. >>> >>> I don't understand the scenario. Could you, please, give a code example? >> >> No, because the current code is so broken that any example I gave >> wouldn't actually demonstrate the problem I'm trying to explain above: > > Soory, then it will be hard to debug ... > >> I wasn't expecting the desirability of random 100ms delays to be a >> point of debate, so I jumped right to the problems one might encounter >> with various solutions. You can trigger the scenario I'm worried about >> by performing a dbus call in an X drag-and-drop handler while somebody >> else is already blocked waiting for a dbus call. > > I've written another small example, using file notifications (see > attached file). No problems there. You misunderstand me. The code I checked in "fixes" the problem for both the base case and the scenario I described. The code that was there yesterday is similarly broken --- with respect to the 100ms delay --- for both uses. There's nothing to debug right now. >>>> If dbus had been implemented as a process type instead of a special >>>> event source, we'd just be able to use accept-process-output in dbus-call. >>> >>> There is already the discussion that such events (dbus, file >>> notification) should be implemented differently. I don't know whether >>> this shall be done as process type or as a separate queue to be checked for. >> >> I have a partial patch that implements this scheme: add another >> argument early-return to Fread-event and sit-for. early-return can >> either be nil for present behavior or a function of no >> arguments. Plumb early-return all the way down to >> wait_reading_process_output, and have that function call early-return >> just before running xg_select. If early-return returns true, abort the >> select loop and return nil from Fread-event. This way, you could pass >> into read-event in dbus.el an early-return function that checked for >> the existence of the return value on dbus-return-values-table. Then, >> the function would return immediately on a dbus call being complete no >> matter how deeply the calls were nested. The mechanism could also take >> over the role of the current wait_for_cell stuff. > > Definitely nothing for 24.4, we're in feature freeze. And before we're > adding such a patch, I recommend to discuss Stefan's proposal to add > another event queue in the main loop for such kind of special events. I'll have to go find that thread. I'm not sure what "another event queue" means, exactly, or how it would help. If it's just a different kind of accept-process-input, one that pulls only events of a certain sort out of the kbd buffer, then it's still vulnerable to the reentrancy problem I described in my first message.