Dmitry Gutov writes: > Hi all! > > Time flies, doesn't it? Indeed. First, thanks for working on this. I tried your patches and could reproduce all your numbers with a recent master (643c67cf239). After checking the 2021 version again (254dc6ab4) I've discovered that 2023 Emacs doesn't respond as well to the lazy-hilit patch as 2021 Emacs. It's as if the cost of string copies (that the patch optimized) has gone down significantly. In addition, in 2021, we never had anyone perform the needed changes for icomplete.el to work with Daniel's API. It was only known that without those changes, icomplete.el actually performed worse under the new patch So, backed by the new ability to conduct good benchmarks, I looked at the problem anew. I found some insight in the problem, and came up with a new "lazy-hilit" patch which performs just as well, if not slightly better, than Daniel's, while keeping the changes to lisp/minibuffer.el much more minimal and not adding replacement for the longstanding completion-all-completions. Before I go into benchmarks, it's obvious to me that "lazy" or "deferred" highlighting mean basically the same thing, which is "late" or "just-in-time" highlighting. I also think, as I did in 2021, that we should be careful to separate what are performance-motivated changes from style-motivated changes. The former are easy to discuss objectively, while the latter are much, much more subjective. Whatever the outcome, we're on our way to at least a faster icomplete.el, which is of course good. So here are the benchmarks. The setup is the following, we start Emacs like: src/emacs -nw -Q -f fido-vertical-mode -l ~/Downloads/benchmark.el And make sure to put 300 000 symbols in the obarray. The symbols are prefixed "yoyo" deliberately. (cl-loop repeat 300000 do (intern (symbol-name (gensym "yoyo")))) First a micro-benchmark: ;; Daniel's patch worked by Dmitry (v3) (benchmark-run 50 (let ((completion-styles '(flex))) (completion-filter-completions "" obarray 'fboundp 0 nil) (completion-filter-completions "yo" obarray 'fboundp 0 nil) (completion-filter-completions "yoo" obarray 'fboundp 0 nil) ));; => (12.192422429999999 3 0.107881004) ;; lazy-hilit v4 patch attached in this email (benchmark-run 50 (let ((completion-styles '(flex)) (completion-lazy-hilit (cl-gensym))) (completion-all-completions "" obarray 'fboundp 0 nil) (completion-all-completions "yo" obarray 'fboundp 0 nil) (completion-all-completions "yoo" obarray 'fboundp 0 nil) ));; => (12.267915333 4 0.14799709099999991) Now, tests specific to icomplete.el using Dmitry's instrumentation. This is the "yoyo" test. Evaluate: (completing-read "" obarray) This starts a fido-vertical-mode minibuffer. First type "yo", then repeatedly insert and backspace a "o" to make "yoo" and "yo" in alternating fashion. The number of matches should be exactly 300696 for "yoo" and 301721 for "yo". Highlighting should be correct, of course. Observe the results printed by the instrumentation in `icomplete.el`. Collect them from *Messages* after 18 alternations. The results: ;; with Daniel's patch to minibuffer ;; ;; Elapsed time: 0.967481s (0.401923s in 8 GCs) ;; Elapsed time: 0.703229s (0.252157s in 5 GCs) ;; Elapsed time: 0.945053s (0.401540s in 8 GCs) ;; Elapsed time: 0.721198s (0.252657s in 5 GCs) ;; Elapsed time: 0.951377s (0.394238s in 8 GCs) ;; Elapsed time: 0.699232s (0.254524s in 5 GCs) ;; Elapsed time: 0.940497s (0.400292s in 8 GCs) ;; Elapsed time: 0.709986s (0.253635s in 5 GCs) ;; Elapsed time: 0.943063s (0.399020s in 8 GCs) ;; Elapsed time: 0.720825s (0.251619s in 5 GCs) ;; Elapsed time: 0.972146s (0.407665s in 8 GCs) ;; Elapsed time: 0.709619s (0.255678s in 5 GCs) ;; Elapsed time: 0.947108s (0.397916s in 8 GCs) ;; Elapsed time: 0.727231s (0.254040s in 5 GCs) ;; Elapsed time: 0.966196s (0.398492s in 8 GCs) ;; Elapsed time: 0.701558s (0.252168s in 5 GCs) ;; Elapsed time: 0.936269s (0.388110s in 8 GCs) ;; Elapsed time: 0.694050s (0.249759s in 5 GCs) ;; with my lazy-hilit patch worked minimally by Dmitry ;; ;; Elapsed time: 1.779906s (0.975332s in 15 GCs) ;; Elapsed time: 1.342160s (0.490314s in 5 GCs) ;; Elapsed time: 1.235759s (0.420019s in 4 GCs) ;; Elapsed time: 1.363909s (0.519521s in 5 GCs) ;; Elapsed time: 1.175773s (0.423938s in 4 GCs) ;; Elapsed time: 1.340017s (0.508744s in 5 GCs) ;; Elapsed time: 1.124552s (0.404149s in 4 GCs) ;; Elapsed time: 1.327419s (0.499433s in 5 GCs) ;; Elapsed time: 1.121927s (0.400499s in 4 GCs) ;; Elapsed time: 1.308526s (0.493652s in 5 GCs) ;; Elapsed time: 1.159132s (0.404612s in 4 GCs) ;; Elapsed time: 1.323803s (0.500754s in 5 GCs) ;; Elapsed time: 1.128562s (0.406496s in 4 GCs) ;; Elapsed time: 1.345577s (0.503971s in 5 GCs) ;; Elapsed time: 1.121691s (0.401876s in 4 GCs) ;; Elapsed time: 1.304913s (0.492255s in 5 GCs) ;; Elapsed time: 1.141926s (0.399154s in 4 GCs) ;; Elapsed time: 1.312480s (0.498205s in 5 GCs) ;; Elapsed time: 1.125095s (0.403174s in 4 GCs) ;; Elapsed time: 1.332119s (0.503671s in 5 GCs) ;; Elapsed time: 1.131561s (0.402268s in 4 GCs) ;; New lazy-hilit patch attached: ;; ;; Elapsed time: 0.902985s (0.224307s in 3 GCs) ;; Elapsed time: 0.696391s (0.079687s in 1 GCs) ;; Elapsed time: 0.896176s (0.219964s in 3 GCs) ;; Elapsed time: 0.648318s (0.074765s in 1 GCs) ;; Elapsed time: 0.906288s (0.221534s in 3 GCs) ;; Elapsed time: 0.679141s (0.079102s in 1 GCs) ;; Elapsed time: 0.889320s (0.222668s in 3 GCs) ;; Elapsed time: 0.690199s (0.076926s in 1 GCs) ;; Elapsed time: 0.912206s (0.220297s in 3 GCs) ;; Elapsed time: 0.675524s (0.078875s in 1 GCs) ;; Elapsed time: 0.907111s (0.226627s in 3 GCs) ;; Elapsed time: 0.697139s (0.079571s in 1 GCs) ;; Elapsed time: 0.906650s (0.220946s in 3 GCs) ;; Elapsed time: 0.683808s (0.078001s in 1 GCs) ;; Elapsed time: 0.912471s (0.221977s in 3 GCs) ;; Elapsed time: 0.699742s (0.079009s in 1 GCs) ;; Elapsed time: 0.897566s (0.217786s in 3 GCs) ;; Elapsed time: 0.659043s (0.076046s in 1 GCs) Daniel+Dmitry patch v3: 0.8308954444444444s avg Old lazy-hilit patch: 1.2390678333333334s avg New lazy-hilit patch attached: 0.7922265555555554s avg In conclusion: * I think the two approaches are basically evenly matched in terms of performance, at least for this symbol completion scenario. * In the completion-{sorted|all}-completions micro-benchmark my patch does very marginally worse (0.6%). Probably because of the use of a hash table. I believe I can fix this, though. * In the icomplete.el usability test, my new patch probably does slightly better (4.6%). Probably because it doesn't recalculate the regexp from the pattern every time it needs to late highlight. * My patch doesn't suffer from the 'completion--unquoted' property complication in Daniel+Dmitry's patch." It's possible/likely that the additional memory needed by this property will introduce an additional slowdown which isn't visible in the simpler symbol completion scenario. Both patches propose what are effectively extensions to the large, old and complicated completion API, in my case an additional variable to bind (or set), in Daniel's patch a brand new API entry point and the deprecation of an existing one. The benchmarks show that Daniel's patch is not absolutely necessary to reap the benefits of deferred/lazy/late/just-in-time highlighting. Looking at the two patches side-by-side it seems evident to me that one patch is much simpler than the other. The "maybe-alist-maybe-not" flag dedicated to completely changing the meaning of a number of minibuffer.el functions while keeping backward compatibility is one such item of complexity. Therefore, since we can come up with simpler alternatives that bring these now well-understood benefits, it won't surprise anyone that I think we should go for the simpler choice. João