From: Eli Zaretskii
To: Spencer Baugh
Cc: yantar92@posteo.net, rms@gnu.org, sbaugh@catern.com, dmitry@gutov.dev, michael.albinus@gmx.de, 64735@debbugs.gnu.org
Newsgroups: gmane.emacs.bugs
Subject: bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
Date: Sun, 23 Jul 2023 09:15:22 +0300
Message-ID: <83lef76tmt.fsf@gnu.org>
In-Reply-To: (message from Spencer Baugh on Sat, 22 Jul 2023 16:53:05 -0400)

> From: Spencer Baugh
> Cc: sbaugh@catern.com, yantar92@posteo.net, rms@gnu.org,
>  dmitry@gutov.dev, michael.albinus@gmx.de, 64735@debbugs.gnu.org
> Date: Sat, 22 Jul 2023 16:53:05 -0400
>
> Can you try this further change on your Windows (and GNU/Linux) box?  I
> just tested on a different box and my original change gets:
>
> (("built-in" . "Elapsed time: 4.506643s (2.276269s in 21 GCs)")
>  ("with-find" . "Elapsed time: 4.114531s (2.848497s in 27 GCs)"))
>
> while this parallel implementation gets
>
> (("built-in" . "Elapsed time: 4.479185s (2.236561s in 21 GCs)")
>  ("with-find" . "Elapsed time: 2.858452s (1.934647s in 19 GCs)"))
>
> so it might have a favorable impact on Windows and your other GNU/Linux
> box.

Almost no effect here on MS-Windows:

  (("built-in" . "Elapsed time: 0.859375s (0.093750s in 4 GCs)")
   ("with-find" . "Elapsed time: 8.437500s (0.078125s in 4 GCs)"))

It was 8.578 sec with the previous version.  (The Lisp version is
somewhat faster in this test because I native-compiled the code for
this test.)

On GNU/Linux:

  (("built-in" . "Elapsed time: 4.244898s (1.934182s in 56 GCs)")
   ("with-find" . "Elapsed time: 3.011574s (1.190498s in 35 GCs)"))

Faster by 10% (previous version yielded 3.327 sec).

Btw, I needed to fix the code: when-let needs 2 open parens after it,
not one.  The original code signals an error from the filter function
in Emacs 29.

> >> (cl-assert (null _predicate) t "find-directory-files-recursively can't accept arbitrary predicates")
> >
> > It should.
>
> This is where I think a fallback would be useful - it's basically
> impossible to support arbitrary predicates efficiently here, since it
> requires us to put Lisp in control of whether find descends into a
> directory.

There's nothing wrong with supporting this less efficiently.  And
there's no need to control where Find descends: you could just filter
out the files from those directories that need to be ignored.

> So I'm thinking I would just fall back to running the old
> directory-files-recursively whenever there's a predicate.  Or just not
> supporting this at all...

We cannot not support it at all, because then it will not be a
replacement.  Fallback is okay, though I'd prefer a self-contained
function.

> >> (if follow-symlinks
> >>     '("-L")
> >>   '("!" "(" "-type" "l" "-xtype" "d" ")"))
> >> (unless (string-empty-p regexp)
> >>   "-regex" (concat ".*" regexp ".*"))
> >> (unless include-directories
> >>   '("!" "-type" "d"))
> >> '("-print0")
> >
> > Some of these switches are specific to GNU Find.  Are we going to
> > support only GNU Find?
>
> POSIX find doesn't support -regex, so I think we have to.  We could
> stick to just POSIX find if we only allowed globs in
> find-directory-files-recursively, instead of full regexes.

The latter would again be incompatible with
directory-files-recursively, so it isn't TRT, IMO.

One other subtlety is non-ASCII file names: you use the -print0 switch
to Find, which produces null bytes, and those could inhibit decoding
of non-ASCII characters.  So you may need to bind
inhibit-null-byte-detection to a non-nil value to get correctly
decoded file names from Find.
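
To illustrate the two suggestions above (filtering the results by
PREDICATE after the fact, and binding inhibit-null-byte-detection
around the call), here is a minimal sketch.  It is not the code from
the patch under discussion: every sketch-* name is hypothetical, the
Find invocation is reduced to "-type f -print0", and it assumes a Find
that supports -print0.  PREDICATE is assumed to follow the convention
of directory-files-recursively, i.e. it is called with a directory
name and returns non-nil if that directory should be descended into.

```elisp
;;; -*- lexical-binding: t -*-
(require 'seq)

(defun sketch-split-print0 (output)
  "Split OUTPUT, the NUL-separated text printed by `find -print0'."
  (split-string output "\0" t))

(defun sketch-under-rejected-dir-p (file root predicate)
  "Return non-nil if a directory between ROOT and FILE fails PREDICATE.
ROOT itself is never passed to PREDICATE."
  (let ((dir (file-name-directory file))
        (root (file-name-as-directory root))
        rejected)
    ;; Walk upwards from FILE's directory until we reach ROOT.
    (while (and dir (not rejected)
                (string-prefix-p root dir)
                (not (string= dir root)))
      (unless (funcall predicate (directory-file-name dir))
        (setq rejected t))
      (setq dir (file-name-directory (directory-file-name dir))))
    rejected))

(defun sketch-filter-by-predicate (files root predicate)
  "Drop from FILES those that live under a directory PREDICATE rejects.
A non-function PREDICATE (nil or t) leaves FILES unchanged here."
  (if (functionp predicate)
      (seq-remove (lambda (file)
                    (sketch-under-rejected-dir-p file root predicate))
                  files)
    files))

(defun sketch-find-files (root predicate)
  "List regular files under ROOT using find(1), honoring PREDICATE.
Find still descends everywhere; the filtering happens in Lisp."
  (let ((inhibit-null-byte-detection t) ; NULs must not disable decoding
        (root (directory-file-name (expand-file-name root))))
    (with-temp-buffer
      (call-process "find" nil t nil root "-type" "f" "-print0")
      (sketch-filter-by-predicate
       (sketch-split-print0 (buffer-string)) root predicate))))
```

This calls PREDICATE once per file per ancestor directory, so it is
less efficient than pruning inside Find, which is the trade-off
described above; caching the verdict per directory would be an obvious
refinement.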