From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
Date: Tue, 12 Sep 2023 22:35:37 +0300
Message-ID: <83sf7jnq0m.fsf@gnu.org>
In-Reply-To: <28a7916e-92d5-77ab-a61e-f85b59ac76b1@gutov.dev> (message from Dmitry Gutov on Tue, 12 Sep 2023 21:48:37 +0300)
To: Dmitry Gutov
Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, 64735@debbugs.gnu.org

> Date: Tue, 12 Sep 2023 21:48:37 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov
>
> > then we could try extending
> > internal-default-process-filter (or writing a new filter function
> > similar to it) so that it inserts the stuff into the gap and then uses
> > decode_coding_gap,
>
> Can that work at all? By the time internal-default-process-filter is
> called, we have already turned the string from char* into Lisp_Object
> text, which we then pass to it. So consing has already happened, IIUC.

That's why I said "or writing a new filter function".
read_and_dispose_of_process_output will have to call this new filter
differently, passing it the raw text read from the subprocess, whereas
read_and_dispose_of_process_output currently first decodes the text
and produces a Lisp string from it. Then the filter would need to do
something similar to what insert-file-contents does: insert the raw
input into the gap, then call decode_coding_gap to decode that
in-place.

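To make the difference between the two data paths concrete, here is a toy model in Python. It is only a sketch under loud assumptions: it is not Emacs's C code, the function names are hypothetical, and Python cannot really decode in place, so the second path still produces one final string. What it does show is the shape of the idea: the first path creates a temporary string for every chunk read, while the second reads raw bytes straight into a growing buffer and decodes once.

```python
import codecs
import io


def read_decode_per_chunk(stream, chunk_size=4096, encoding="utf-8"):
    """Toy model of the current path: decode each chunk into a temporary
    string, then append it. Returns (text, number_of_temporary_strings)."""
    decoder = codecs.getincrementaldecoder(encoding)()
    parts = []
    temporaries = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parts.append(decoder.decode(chunk))  # one temporary string per read
        temporaries += 1
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts), temporaries


def read_raw_then_decode(stream, chunk_size=4096, encoding="utf-8"):
    """Toy model of the proposed path: read raw bytes directly into a
    buffer (standing in for the gap), then decode once at the end."""
    buf = bytearray(chunk_size)
    filled = 0
    while True:
        if len(buf) - filled < chunk_size:
            buf.extend(bytes(len(buf)))  # grow the buffer, like enlarging the gap
        view = memoryview(buf)[filled:filled + chunk_size]
        n = stream.readinto(view)  # raw bytes land in the buffer, no temporary
        view.release()  # release the export so the bytearray can be resized
        if not n:
            break
        filled += n
    # Assumption: a single decode here stands in for decode_coding_gap.
    return buf[:filled].decode(encoding)
```

Both functions return the same text for the same input; the difference is only in how many intermediate objects are created along the way. Whether that translates into a measurable speedup in Emacs is something only measurements can tell.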
> > which converts inserted bytes in-place -- that, at
> > least, will be correct and will avoid consing intermediate temporary
> > strings from the process output, then decoding them, then inserting
> > them. Other than that, the -2 and -3 variants are very close
> > runners-up of -5, so maybe I'm missing something, but I see no reason
> > to be too excited here? I mean, 0.89 vs 0.92? really?
>
> The important part is not 0.89 vs 0.92 (that would be meaningless
> indeed), but that we have an _asynchronous_ implementation of the feature
> that works as fast as the existing synchronous one (or faster! if we
> also bind read-process-output-max to a large value, the time is 0.72).
>
> The possible applications for that range from simple (printing progress
> bar while the scan is happening) to more advanced (launching a
> concurrent process where we pipe the received file names concurrently to
> 'xargs grep'), including visuals (xref buffer which shows the
> intermediate search results right away, updating them gradually, all
> without blocking the UI).

Hold your horses. Emacs only reads output from sub-processes when
it's idle. So printing a progress bar (which makes Emacs not idle)
with the asynchronous implementation is basically the same as having
the synchronous implementation call some callback from time to time
(which will then show the progress).

As for piping to another process, this is best handled by using a
shell pipe, without passing stuff through Emacs. And even if you do
need to pass it through Emacs, you could do the same with the
synchronous implementation -- only the "xargs" part needs to be
asynchronous, the part that reads file names does not. Right?

Please note: I'm not saying that the asynchronous implementation is
not interesting. It might even have advantages in some specific use
cases. So it is good to have it. It just isn't a breakthrough,
that's all.
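
The point about a synchronous reader calling a callback from time to time can be sketched as follows. This is a minimal illustration in Python rather than Emacs Lisp; the function name, the on_progress parameter and the "every" interval are all hypothetical and do not correspond to any existing Emacs API.

```python
import io


def collect_lines(stream, on_progress=None, every=1000):
    """Synchronously read file names, one per line, from `stream`.

    Calls on_progress(count) after every `every` lines, which is where a
    progress bar could be updated. All names here are illustrative."""
    names = []
    for line in stream:
        names.append(line.rstrip("\n"))
        if on_progress is not None and len(names) % every == 0:
            on_progress(len(names))
    return names


# Stand-in for the output of a subprocess; a pipe opened in text mode
# could be passed in the same way.
fake_output = io.StringIO("".join(f"./src/file{i}.c\n" for i in range(2500)))
files = collect_lines(fake_output, on_progress=lambda n: print(f"{n} files so far"))
print(len(files), "files total")
```

The reader itself stays blocking; only the callback gives the caller a chance to show progress between reads.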
And if we want to use it in production, we should probably work on
adding that special default filter which inserts and decodes directly
into the buffer, because that will probably lower the GC pressure and
thus has hope of being faster. Or even replace the default filter
implementation with that new one.

> > About inserting into the buffer: what we do is insert into the gap,
> > and when the gap becomes full, we enlarge it. Enlarging the gap
> > involves: (a) enlarging the chunk of memory allocated to buffer text
> > (which might mean we ask the OS for more memory), and (b) moving the
> > characters after the gap to the right to free space for inserting more
> > stuff. This is pretty fast, but still, with a large pipe buffer and a
> > lot of output, we do this many times, so it could add up to something
> > pretty tangible. It's hard for me to tell whether this is
> > significantly faster than consing strings and inserting them, only
> > measurements can tell.
>
> See the benchmark tables and the POC patch in my previous email. Using a
> better filter function would be ideal, but it seems like that's not
> going to fit the current design. Happy to be proven wrong, though.

I see no reason why reading subprocess output couldn't use the same
technique as insert-file-contents does.
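
The gap-enlarging steps (a) and (b) quoted above can be illustrated with a toy gap buffer. This is a sketch in Python under stated assumptions, not the Emacs implementation: the class, its growth policy (at least doubling) and its counters are all hypothetical, chosen only to make the two steps and their cost visible.

```python
class GapBuffer:
    """Toy gap buffer: text before the gap, the gap, then text after it."""

    def __init__(self, initial_gap=16, after=b""):
        self.data = bytearray(initial_gap) + bytearray(after)
        self.gap_start = 0
        self.gap_end = initial_gap
        self.enlargements = 0  # how many times the gap had to grow
        self.bytes_moved = 0   # how much text after the gap was shifted

    def gap_size(self):
        return self.gap_end - self.gap_start

    def _enlarge_gap(self, needed):
        # Assumed policy: grow by at least the current allocation size.
        extra = max(needed, len(self.data))
        tail = self.data[self.gap_end:]
        self.data.extend(bytes(extra))   # (a) enlarge the allocated memory
        new_gap_end = self.gap_end + extra
        self.data[new_gap_end:] = tail   # (b) move text after the gap right
        self.gap_end = new_gap_end
        self.enlargements += 1
        self.bytes_moved += len(tail)

    def insert(self, chunk):
        """Insert raw bytes at the start of the gap, growing it if needed."""
        if len(chunk) > self.gap_size():
            self._enlarge_gap(len(chunk) - self.gap_size())
        self.data[self.gap_start:self.gap_start + len(chunk)] = chunk
        self.gap_start += len(chunk)

    def text(self):
        return bytes(self.data[:self.gap_start] + self.data[self.gap_end:])


buf = GapBuffer(initial_gap=4, after=b"END")
for chunk in (b"abc", b"defgh", b"ij"):
    buf.insert(chunk)
print(buf.text(), buf.enlargements, buf.bytes_moved)
```

When output is inserted at the end of the buffer, the text after the gap is empty and step (b) costs nothing; the counters make it easy to see how often step (a) happens for a given chunk size.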