Hi all!  Time flies, doesn't it?

On 14/08/2021 10:16, Eli Zaretskii wrote:
>>> If one removes these lines, the process becomes much faster, but
>>> there is a problem with highlighting.  My idea is indeed to defer
>>> highlighting by not setting the 'face property directly on that
>>> shared string, but some other property that is read later from the
>>> shared string by compliant frontends.
>>
>> I haven't done any direct benchmarking, but I'm pretty sure that this
>> approach cannot, by definition, be as fast as the non-mutating one.
>
> Daniel seems to think otherwise, AFAIU.
>
>> Because you go through the whole list and mutate all of its elements,
>> you have to perform a certain bit of work W x N times, where N is the
>> size of the whole list.
>>
>> Whereas the deferred-mutation approach will only have to do its bit
>> (which is likely more work, like, WWW) only 20 times, or 100 times, or
>> however many completions are displayed.  And this is usually negligible.
>>
>> However big the difference is going to be, I can't say in advance, of
>> course, or whether it's going to be shadowed by some other performance
>> costs.  But the non-mutating approach should have the best optimization
>> potential when the list is long.
>
> So I guess the suggestion to have a benchmark is still useful, because
> the estimations of which approach has better performance vary between
> you three.  So maybe producing such benchmarks would be a good step?

To cross this out from my TODO, I spent most of the day rebasing both of
the proposed patches (one of them longer than the other) -- one from an
attachment here, the other from a commit inside the
scratch/icomplete-lazy-highlight-attempt-2 branch -- porting icomplete to
Daniel's new completion-filter-completions API (*), and benchmarking.

(*) Included in the attached patch: it needed changing just two lines
inside icomplete, but also a new variable completion-all-sorted-highlight
and updates to completion--cache-all-sorted-completions and
completion-all-sorted-completions.

Both rebased patches are attached to this email for your convenience.

AFAICT, the results confirmed my expectations quoted above.

Using Joao's benchmark, with setup:

  (defmacro lotsoffunctions ()
    `(progn
       ,@(cl-loop
          repeat 150000
          collect `(defun ,(intern (symbol-name (gensym "heyhey"))) ()
                     42))))
  (lotsoffunctions)

I ran the comparisons for empty and non-empty inputs.

With no characters typed:

  (benchmark-run 1
    (let ((completion-styles '(flex))
          (completion-lazy-hilit (cl-gensym))) ; might not be defined
      ;; Uncomment one of the lines below, depending on the patch used.
      ;; (completion-all-completions "" obarray 'fboundp 0 nil)
      ;; (completion-filter-completions "" obarray 'fboundp 0 nil)
      ))

  master           => 0.066
  lazy-hilit       => 0.045
  filter-and-defer => 0.041 (but more often ~0.110 including GC, somehow)

With one character typed:

  (benchmark-run 1
    (let ((completion-styles '(flex))
          (completion-lazy-hilit (cl-gensym))) ; might not be defined
      ;; Uncomment one of the lines below, depending on the patch used.
      ;; (completion-all-completions "h" obarray 'fboundp 1 nil)
      ;; (completion-filter-completions "h" obarray 'fboundp 1 nil)
      ))

  master           => 0.824
  lazy-hilit       => 0.395
  filter-and-defer => 0.125 (!)

This more or less translates into the improvement in speed of
fido-vertical-mode, according to my benchmark-progn call inside
icomplete-exhibit (included in both attached patches for convenience).
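For anyone skimming the thread, the deferred-highlighting idea quoted at
the top amounts to roughly the sketch below.  To be clear, this is not
code from either patch: the property and function names
(my-defer--positions, my-defer-record-match, my-defer-highlight) are
invented for the example.

  ;; Illustrative sketch only, not code from either patch.
  (defun my-defer-record-match (string positions)
    "Record match POSITIONS on STRING without touching its faces.
  Assumes STRING is non-empty.  The face mutation is postponed
  until display time."
    (put-text-property 0 1 'my-defer--positions positions string)
    string)

  (defun my-defer-highlight (string)
    "Apply faces to STRING from its recorded match positions.
  A frontend calls this only for the few candidates it actually draws,
  so the per-candidate face work scales with the display, not with N."
    (dolist (pos (get-text-property 0 'my-defer--positions string))
      (add-face-text-property pos (1+ pos) 'completions-common-part
                              nil string))
    string)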
For non-empty inputs ("h", "hh" or "hhe", to match the generated
functions), filter-and-defer is about 1.5x faster than lazy-hilit: around
450ms vs 640ms.

lazy-hilit is slightly faster than filter-and-defer with the empty input
(around 380ms vs 420ms), and I'm not yet sure why, but that is the
scenario with no highlighting at all (and so no flex scoring/sorting).
Perhaps some short-circuiting can be added somewhere to reach parity, or
it's the cost of extra branching somewhere for backward compatibility
(which could be removed in the future).  Worth additional study.
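By "short-circuiting" I mean something along these lines.  Again, this is
an invented sketch rather than anything in either patch;
my-score-and-hilit just stands in for whatever function walks the
candidates to score/highlight them.

  ;; Invented sketch; not the actual code path in either patch.
  (defun my-score-and-hilit (input candidates)
    "Return CANDIDATES scored/highlighted against INPUT.
  With empty INPUT there is nothing to match or highlight, so skip
  the per-candidate pass entirely instead of walking the whole list."
    (if (equal input "")
        candidates
      (mapcar (lambda (cand)
                ;; ... expensive per-candidate matching work here ...
                cand)
              candidates)))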