From mboxrd@z Thu Jan 1 00:00:00 1970
From: sbaugh@catern.com
Newsgroups: gmane.emacs.bugs
Subject: bug#69775: [PATCH] Use regexp-opt in dired-omit-regexp
Date: Sat, 16 Mar 2024 17:15:52 +0000 (UTC)
Message-ID: <8734sqjdyz.fsf@catern.com>
References: <86o7bh9iz3.fsf@gnu.org>
In-Reply-To: <86o7bh9iz3.fsf@gnu.org> (Eli Zaretskii's message of "Thu, 14 Mar 2024 13:00:00 +0200")
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
User-Agent: Gnus/5.13 (Gnus v5.13)
Cc: Spencer Baugh, 69775@debbugs.gnu.org
To: Eli Zaretskii

--=-=-=
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Eli Zaretskii writes:

>> From: Spencer Baugh
>> Date: Wed, 13 Mar 2024 11:01:05 -0400
>>
>> In my benchmarking, for large dired buffers, using regexp-opt provides
>> around a 3x speedup in omitting.
>
> Can you show a recipe for such benchmarking? I'd like to try that on
> my systems.
>
> Also, what is the slowdown in the (improbable, but possible) case
> where dired-omit-extensions change for each call of dired-omit-regexp?

Yes, run the following after applying the patch:

(require 'dired)
(require 'dired-x)
(require 'cl-lib)

(defun dired-omit-regexp-old ()
  (concat (if dired-omit-files (concat "\\(" dired-omit-files "\\)") "")
          (if (and dired-omit-files dired-omit-extensions) "\\|" "")
          (if dired-omit-extensions
              (concat ".";; a non-extension part should exist
                      "\\("
                      (mapconcat 'regexp-quote dired-omit-extensions "\\|")
                      "\\)$")
            "")))

(defun my-do-omit (mode)
  (let ((regexp
         (cl-case mode
           (new (dired-omit-regexp))
           (old (dired-omit-regexp-old))
           (new-uncached
            (let ((dired-omit--extension-regexp-cache nil))
              (dired-omit-regexp)))
           (t (error "Bad mode %s" mode)))))
    (dired-mark-if
     (let ((fn (dired-get-filename nil t)))
       (and fn (string-match-p regexp fn)))
     nil)))

(defun my-bench-omit (nfiles ntimes)
  (let ((default-directory (expand-file-name "test-dired-list")))
    (make-directory default-directory t)
    (dolist (file (directory-files "." t "test-file"))
      (delete-file file))
    (dotimes (i nfiles)
      (write-region "" nil (format "test-file%s" i) nil 'nomessage nil 'excl))
    (let ((dired-omit-mode nil))
      (with-current-buffer (let ((inhibit-message t)) (dired-noselect default-directory))
        (revert-buffer)
        (message
         "files %s, ntimes %s: new %s old %s new-uncached %s"
         nfiles ntimes
         (car (benchmark-call (lambda () (my-do-omit 'new)) ntimes))
         (car (benchmark-call (lambda () (my-do-omit 'old)) ntimes))
         (car (benchmark-call (lambda () (my-do-omit 'new-uncached)) ntimes)))
        ))))

(my-bench-omit 1 100)
(my-bench-omit 10 100)
(my-bench-omit 100 100)
(my-bench-omit 1000 100)
(my-bench-omit 10000 100)

For me, I get:

$ ./src/emacs -Q --batch -l ../emacs-29/bench-omit.elc
files 1, ntimes 100: new 0.008839979999999999 old 0.018162129 new-uncached 0.031399762
files 10, ntimes 100: new 0.012037615 old 0.040232355000000004 new-uncached 0.037990543
files 100, ntimes 100: new 0.07368538100000001 old 0.314905271 new-uncached 0.10006527300000001
files 1000, ntimes 100: new 0.669103498 old 3.076339984 new-uncached 0.693134644
files 10000, ntimes 100: new 6.336211434 old 30.926320486 new-uncached 6.442762152999999

So the performance improvement is quite substantial for large
directories.

new-uncached is the performance if dired-omit-extensions changes on each
call of dired-omit-regexp. For a directory of 1 file, the overhead of
recomputing regexp-opt every time makes the performance perhaps 2x-3x
worse, but around 10 files the performance improvement from regexp-opt
exceeds the overhead, and above that the uncached version still
outperforms the old version substantially.

If dired-omit-extensions doesn't change every time, the performance is
improved even for directories of 1 file.

>> regexp-opt takes around 5 milliseconds, so to avoid slowing down
>> omitting in small dired buffers we cache the return value.
>>
>> Since omitting is now 3x faster, increase dired-omit-size-limit by 3x.
>>
>> * lisp/dired-x.el (dired-omit--extension-regexp-cache): Add.
>> (dired-omit-regexp): Use regexp-opt.
>> (dired-omit-size-limit): Increase, since omitting is now faster.
>
> I'm okay with these changes, but:
>
>   . the change in the default value of dired-omit-size-limit should be
>     called out in NEWS
>   . please document this variable in the dired-x.texi manual, where we
>     document all the other variables relevant to dired-omit mode.
>   . the doc string of dired-omit-size-limit is embarrassingly
>     unhelpful, so bonus points for fixing that as well
>
> Thanks.

Certainly, updated patch attached.

--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline; filename=0001-Use-regexp-opt-in-dired-omit-regexp.patch

>From ca0ff9d40b85b281e25b998528c1e1a71b9e70c5 Mon Sep 17 00:00:00 2001
From: Spencer Baugh
Date: Sat, 16 Mar 2024 17:11:24 +0000
Subject: [PATCH] Use regexp-opt in dired-omit-regexp

In my benchmarking, for large dired buffers, using regexp-opt provides
around a 3x speedup in omitting.

regexp-opt takes around 5 milliseconds, so to avoid slowing down
omitting in small dired buffers we cache the return value.

Since omitting is now 3x faster, increase dired-omit-size-limit by 3x.

Also, document dired-omit-size-limit better.

* doc/misc/dired-x.texi (Omitting Variables): Document
dired-omit-size-limit.
* etc/NEWS: Announce increase of dired-omit-size-limit.
* lisp/dired-x.el (dired-omit--extension-regexp-cache): Add.
(dired-omit-regexp): Use regexp-opt. (bug#69775)
(dired-omit-size-limit): Increase and improve docs.
---
 doc/misc/dired-x.texi |  8 ++++++++
 etc/NEWS              |  6 ++++++
 lisp/dired-x.el       | 26 ++++++++++++++++++++------
 3 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/doc/misc/dired-x.texi b/doc/misc/dired-x.texi
index 4cad016a0f6..66045c5f759 100644
--- a/doc/misc/dired-x.texi
+++ b/doc/misc/dired-x.texi
@@ -346,6 +346,14 @@ Omitting Variables
 match the file name relative to the buffer's top-level directory.
 @end defvar
 
+@defvar dired-omit-size-limit
+If non-@code{nil}, omitting will be skipped if the directory listing
+exceeds this size in bytes.  Since omitting can be slow for very large
+directories, this avoids having to wait before seeing the directory.
+This variable is ignored when @code{dired-omit-mode} is called
+interactively, such as by @code{C-x M-o}, so you can still enable
+omitting in the directory after the initial display.
+
 @cindex omitting additional files
 @defvar dired-omit-marker-char
 Temporary marker used by Dired to implement omitting.  Should never be used
diff --git a/etc/NEWS b/etc/NEWS
index b4a1c887f2e..fdface7aa0c 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -668,6 +668,12 @@ marked or clicked on files according to the OS conventions.  For
 example, on systems supporting XDG, this runs 'xdg-open' on the
 files.
 
+*** The default value of 'dired-omit-size-limit' has increased.
+After performance improvements to omitting in large directories, the new
+default value is 300k, up from 100k.  This means 'dired-omit-mode' will
+omit files in directories whose directory listing is up to 300 kilobytes
+in size.
+
 +++
 *** 'dired-listing-switches' handles connection-local values if exist.
 This allows to customize different switches for different remote machines.
diff --git a/lisp/dired-x.el b/lisp/dired-x.el
index 62fdd916e69..d7d15028489 100644
--- a/lisp/dired-x.el
+++ b/lisp/dired-x.el
@@ -77,12 +77,17 @@ dired-vm-read-only-folders
                  (other :tag "non-writable only" if-file-read-only))
   :group 'dired-x)
 
-(defcustom dired-omit-size-limit 100000
-  "Maximum size for the \"omitting\" feature.
+(defcustom dired-omit-size-limit 300000
+  "Maximum buffer size for `dired-omit-mode'.
+
+Omitting will be skipped if the directory listing exceeds this size in
+bytes.  This variable is ignored when `dired-omit-mode' is called
+interactively.
+
 If nil, there is no maximum size."
   :type '(choice (const :tag "no maximum" nil) integer)
   :group 'dired-x
-  :version "29.1")
+  :version "30.1")
 
 (defcustom dired-omit-case-fold 'filesystem
   "Determine whether \"omitting\" patterns are case-sensitive.
@@ -506,14 +511,23 @@ dired-omit-expunge
                            (re-search-forward dired-re-mark nil t))))
     count)))
 
+(defvar dired-omit--extension-regexp-cache
+  nil
+  "A cache of `regexp-opt' applied to `dired-omit-extensions'.
+
+This is a cons whose car is a list of strings and whose cdr is a
+regexp produced by `regexp-opt'.")
+
 (defun dired-omit-regexp ()
+  (unless (equal dired-omit-extensions (car dired-omit--extension-regexp-cache))
+    (setq dired-omit--extension-regexp-cache
+          (cons dired-omit-extensions (regexp-opt dired-omit-extensions))))
   (concat (if dired-omit-files (concat "\\(" dired-omit-files "\\)") "")
           (if (and dired-omit-files dired-omit-extensions) "\\|" "")
           (if dired-omit-extensions
               (concat ".";; a non-extension part should exist
-                      "\\("
-                      (mapconcat 'regexp-quote dired-omit-extensions "\\|")
-                      "\\)$")
+                      (cdr dired-omit--extension-regexp-cache)
+                      "$")
             "")))
 
 ;; Returns t if any work was done, nil otherwise.
-- 
2.44.0

--=-=-=--