unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
* [PATCH] emacs: Use a single buffer invisibility spec to fix quadratic search cost.
@ 2011-11-09  2:55 Austin Clements
  2011-11-09  6:31 ` Dmitry Kurochkin
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Austin Clements @ 2011-11-09  2:55 UTC (permalink / raw)
  To: notmuch

Buffer redisplay requires traversing the buffer's invisibility spec
for every part of the display that has an 'invisible text or overlay
property.  Previously, the search buffer's invisibility spec list
contained roughly one entry for each search result.  As a result,
redisplay took O(NM) time where N is the number of visible lines and M
is the total number of results.  On a slow computer, this is enough to
make even buffer motion noticeably slow.  Worse, during a search
operation, redisplay is triggered for each search result (even if
there are no visible buffer changes), so search was quadratic
(O(NM^2)) in the number of search results.

This change switches to using a single element buffer invisibility
spec.  To un-hide authors, instead of removing an entry from the
invisibility spec, it simply removes the invisibility overlay from
those authors.


I tested using a query with 6633 results on a 9 year old machine.
Before this patch, Emacs took 70 seconds to fill the search buffer;
toward the end of the search, Emacs consumed 10-20x as much CPU as
notmuch; and moving point in the buffer took about a second.  With
this patch, the same query takes 40 seconds, Emacs consumes ~3x the
CPU of notmuch by the end, and there's no noticeable lag to moving
point.  (There's still some source of non-linearity, because Emacs and
notmuch consume roughly the same amount of CPU early in the search.)
---
 emacs/notmuch.el |   11 +++--------
 1 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/emacs/notmuch.el b/emacs/notmuch.el
index bb7565c..58fdea0 100644
--- a/emacs/notmuch.el
+++ b/emacs/notmuch.el
@@ -378,7 +378,7 @@ Complete list of currently available key bindings:
   (make-local-variable 'notmuch-search-target-line)
   (set (make-local-variable 'notmuch-search-continuation) nil)
   (set (make-local-variable 'scroll-preserve-screen-position) t)
-  (add-to-invisibility-spec 'notmuch-search)
+  (add-to-invisibility-spec (cons 'ellipsis t))
   (use-local-map notmuch-search-mode-map)
   (setq truncate-lines t)
   (setq major-mode 'notmuch-search-mode
@@ -679,9 +679,6 @@ foreground and blue background."
 			      (append (overlay-get overlay 'face) attributes)))))
 	  notmuch-search-line-faces)))
 
-(defun notmuch-search-isearch-authors-show (overlay)
-  (remove-from-invisibility-spec (cons (overlay-get overlay 'invisible) t)))
-
 (defun notmuch-search-author-propertize (authors)
   "Split `authors' into matching and non-matching authors and
 propertize appropriately. If no boundary between authors and
@@ -755,13 +752,11 @@ non-authors is found, assume that all of the authors match."
       (insert visible-string)
       (when (not (string= invisible-string ""))
 	(let ((start (point))
-	      (invis-spec (make-symbol "notmuch-search-authors"))
 	      overlay)
 	  (insert invisible-string)
-	  (add-to-invisibility-spec (cons invis-spec t))
 	  (setq overlay (make-overlay start (point)))
-	  (overlay-put overlay 'invisible invis-spec)
-	  (overlay-put overlay 'isearch-open-invisible #'notmuch-search-isearch-authors-show)))
+	  (overlay-put overlay 'invisible 'ellipsis)
+	  (overlay-put overlay 'isearch-open-invisible #'delete-overlay)))
       (insert padding))))
 
 (defun notmuch-search-insert-field (field date count authors subject tags)
-- 
1.7.7.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] emacs: Use a single buffer invisibility spec to fix quadratic search cost.
  2011-11-09  2:55 [PATCH] emacs: Use a single buffer invisibility spec to fix quadratic search cost Austin Clements
@ 2011-11-09  6:31 ` Dmitry Kurochkin
  2011-11-10 13:33 ` Sebastian Spaeth
  2011-11-12 14:35 ` David Bremner
  2 siblings, 0 replies; 12+ messages in thread
From: Dmitry Kurochkin @ 2011-11-09  6:31 UTC (permalink / raw)
  To: Austin Clements, notmuch

Hi Austin.

The patch looks good to me.  Thanks for it!  A very nice optimization!

Regards,
  Dmitry

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] emacs: Use a single buffer invisibility spec to fix quadratic search cost.
  2011-11-09  2:55 [PATCH] emacs: Use a single buffer invisibility spec to fix quadratic search cost Austin Clements
  2011-11-09  6:31 ` Dmitry Kurochkin
@ 2011-11-10 13:33 ` Sebastian Spaeth
  2011-11-10 19:22   ` servilio
  2011-11-12 14:35 ` David Bremner
  2 siblings, 1 reply; 12+ messages in thread
From: Sebastian Spaeth @ 2011-11-10 13:33 UTC (permalink / raw)
  To: Austin Clements, notmuch

[-- Attachment #1: Type: text/plain, Size: 235 bytes --]

On Tue,  8 Nov 2011 21:55:28 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
>  emacs/notmuch.el |   11 +++--------
>  1 files changed, 3 insertions(+), 8 deletions(-)


Tested and works great here! +1 for quick inclusion.

Sebastian

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] emacs: Use a single buffer invisibility spec to fix quadratic search cost.
  2011-11-10 13:33 ` Sebastian Spaeth
@ 2011-11-10 19:22   ` servilio
  2011-11-11  3:04     ` Pieter Praet
  0 siblings, 1 reply; 12+ messages in thread
From: servilio @ 2011-11-10 19:22 UTC (permalink / raw)
  To: Sebastian Spaeth; +Cc: notmuch

On 10 November 2011 08:33, Sebastian Spaeth <Sebastian@sspaeth.de> wrote:
> On Tue,  8 Nov 2011 21:55:28 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
>>  emacs/notmuch.el |   11 +++--------
>>  1 files changed, 3 insertions(+), 8 deletions(-)
>
>
> Tested and works great here! +1 for quick inclusion.

I join the petition, I have been using for a couple of days and the
difference is noticeable.

Servilio

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] emacs: Use a single buffer invisibility spec to fix quadratic search cost.
  2011-11-10 19:22   ` servilio
@ 2011-11-11  3:04     ` Pieter Praet
  2011-11-11  4:53       ` Austin Clements
  0 siblings, 1 reply; 12+ messages in thread
From: Pieter Praet @ 2011-11-11  3:04 UTC (permalink / raw)
  To: servilio, Sebastian Spaeth; +Cc: notmuch

On Thu, 10 Nov 2011 14:22:30 -0500, servilio <servilio@gmail.com> wrote:
> On 10 November 2011 08:33, Sebastian Spaeth <Sebastian@sspaeth.de> wrote:
> > On Tue,  8 Nov 2011 21:55:28 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
> >>  emacs/notmuch.el |   11 +++--------
> >>  1 files changed, 3 insertions(+), 8 deletions(-)
> >
> >
> > Tested and works great here! +1 for quick inclusion.
> 
> I join the petition, I have been using for a couple of days and the
> difference is noticeable.
> 

Subjectively, I'm not seeing any difference, but that may be due to a
relatively recent machine, and Austin's recent rebase [1] of Istvan's
database patch [2] no doubt makes a huge difference as well.


I've tried getting some hard numbers using

  #+begin_src sh
    time emacs --eval '(progn
        (notmuch)
        (notmuch-search "*")
        (while (get-buffer-process (current-buffer))
            (sleep-for 0.1))
        (kill-emacs))'
  #+end_src

... but the results vary wildly on subsequent runs.


How would one go about getting stable results?


> Servilio
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch


Peace

-- 
Pieter

[1] id:"1320599856-24078-1-git-send-email-amdragon@mit.edu"
[2] id:"m3pqnj2j7a.fsf@zsu.kismala.com"

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] emacs: Use a single buffer invisibility spec to fix quadratic search cost.
  2011-11-11  3:04     ` Pieter Praet
@ 2011-11-11  4:53       ` Austin Clements
  2011-11-11  5:27         ` Austin Clements
  2011-11-15 16:45         ` Pieter Praet
  0 siblings, 2 replies; 12+ messages in thread
From: Austin Clements @ 2011-11-11  4:53 UTC (permalink / raw)
  To: Pieter Praet; +Cc: notmuch, servilio

Quoth Pieter Praet on Nov 11 at  4:04 am:
> On Thu, 10 Nov 2011 14:22:30 -0500, servilio <servilio@gmail.com> wrote:
> > On 10 November 2011 08:33, Sebastian Spaeth <Sebastian@sspaeth.de> wrote:
> > > On Tue,  8 Nov 2011 21:55:28 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
> > >>  emacs/notmuch.el |   11 +++--------
> > >>  1 files changed, 3 insertions(+), 8 deletions(-)
> > >
> > >
> > > Tested and works great here! +1 for quick inclusion.
> > 
> > I join the petition, I have been using for a couple of days and the
> > difference is noticeable.
> > 
> 
> Subjectively, I'm not seeing any difference, but that may be due to a
> relatively recent machine, and Austin's recent rebase [1] of Istvan's
> database patch [2] no doubt makes a huge difference as well.

How many results?  Since it's eliminating a quadratic factor, this
only affects large search result buffers (and only once they grow
large; the beginning of the buffer will come in just as quickly with
or without this).

> I've tried getting some hard numbers using
> 
>   #+begin_src sh
>     time emacs --eval '(progn
>         (notmuch)
>         (notmuch-search "*")
>         (while (get-buffer-process (current-buffer))
>             (sleep-for 0.1))
>         (kill-emacs))'
>   #+end_src
> 
> ... but the results vary wildly on subsequent runs.

For me, this doesn't actually display the results buffer (though I
don't know why not), which means it won't test this, since the problem
lies in the Emacs redisplay logic.  However, I think it does yield a
baseline of how long it takes to construct the buffer without
displaying it.  This is interesting because, while this patch does not
achieve this baseline time, this patch combined with another one I
have waiting in the wings, does.

> How would one go about getting stable results?

The results are quite stable for me, both timing it manually and using
the command you gave above.  The only non-obvious thing that comes to
mind is that you're probably testing on multiple cores, in which case
the kernel scheduler could introduce a lot of noise.  You could try
running numactl -C 0 emacs ... to see if there's any merit to this
theory.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] emacs: Use a single buffer invisibility spec to fix quadratic search cost.
  2011-11-11  4:53       ` Austin Clements
@ 2011-11-11  5:27         ` Austin Clements
  2011-11-15 16:47           ` Pieter Praet
  2011-11-15 16:45         ` Pieter Praet
  1 sibling, 1 reply; 12+ messages in thread
From: Austin Clements @ 2011-11-11  5:27 UTC (permalink / raw)
  To: Pieter Praet; +Cc: notmuch, servilio

Quoth myself on Nov 10 at 11:53 pm:
> Quoth Pieter Praet on Nov 11 at  4:04 am:
> > I've tried getting some hard numbers using
> > 
> >   #+begin_src sh
> >     time emacs --eval '(progn
> >         (notmuch)
> >         (notmuch-search "*")
> >         (while (get-buffer-process (current-buffer))
> >             (sleep-for 0.1))
> >         (kill-emacs))'
> >   #+end_src
> > 
> > ... but the results vary wildly on subsequent runs.
> 
> For me, this doesn't actually display the results buffer (though I
> don't know why not), which means it won't test this, since the problem
> lies in the Emacs redisplay logic.

This may or may not actually be correct, but the following seems more
representative on my system:

    time emacs --eval '(progn
        (notmuch)
        (notmuch-search "*")
        (while (get-buffer-process (current-buffer))
            (redisplay)
            (sleep-for 0.1))
        (kill-emacs))'

This at least displays the buffer.  I also tried
(accept-process-output) instead of the (sleep-for 0.1), which clearly
behaved differently, but gave only slightly higher numbers.  If I
timed just the search part, to exclude emacs start-up, I would have a
better idea of which more closely matches my manual measurements.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] emacs: Use a single buffer invisibility spec to fix quadratic search cost.
  2011-11-09  2:55 [PATCH] emacs: Use a single buffer invisibility spec to fix quadratic search cost Austin Clements
  2011-11-09  6:31 ` Dmitry Kurochkin
  2011-11-10 13:33 ` Sebastian Spaeth
@ 2011-11-12 14:35 ` David Bremner
  2 siblings, 0 replies; 12+ messages in thread
From: David Bremner @ 2011-11-12 14:35 UTC (permalink / raw)
  To: Austin Clements; +Cc: notmuch

On Tue,  8 Nov 2011 21:55:28 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
> Buffer redisplay requires traversing the buffer's invisibility spec
> This change switches to using a single element buffer invisibility
> spec.  To un-hide authors, instead of removing an entry from the
> invisibility spec, it simply removes the invisibility overlay from
> those authors.

Pushed to master.

d

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] emacs: Use a single buffer invisibility spec to fix quadratic search cost.
  2011-11-11  4:53       ` Austin Clements
  2011-11-11  5:27         ` Austin Clements
@ 2011-11-15 16:45         ` Pieter Praet
  1 sibling, 0 replies; 12+ messages in thread
From: Pieter Praet @ 2011-11-15 16:45 UTC (permalink / raw)
  To: Austin Clements; +Cc: notmuch, servilio

On Thu, 10 Nov 2011 23:53:41 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
> Quoth Pieter Praet on Nov 11 at  4:04 am:
> > On Thu, 10 Nov 2011 14:22:30 -0500, servilio <servilio@gmail.com> wrote:
> > > On 10 November 2011 08:33, Sebastian Spaeth <Sebastian@sspaeth.de> wrote:
> > > > On Tue,  8 Nov 2011 21:55:28 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
> > > >>  emacs/notmuch.el |   11 +++--------
> > > >>  1 files changed, 3 insertions(+), 8 deletions(-)
> > > >
> > > >
> > > > Tested and works great here! +1 for quick inclusion.
> > > 
> > > I join the petition, I have been using for a couple of days and the
> > > difference is noticeable.
> > > 
> > 
> > Subjectively, I'm not seeing any difference, but that may be due to a
> > relatively recent machine, and Austin's recent rebase [1] of Istvan's
> > database patch [2] no doubt makes a huge difference as well.
> 
> How many results?  Since it's eliminating a quadratic factor, this
> only affects large search result buffers (and only once they grow
> large; the beginning of the buffer will come in just as quickly with
> or without this).
> 

About the same as with my review [1] of the database patch [2], so ~9540.

> > I've tried getting some hard numbers using
> > 
> >   #+begin_src sh
> >     time emacs --eval '(progn
> >         (notmuch)
> >         (notmuch-search "*")
> >         (while (get-buffer-process (current-buffer))
> >             (sleep-for 0.1))
> >         (kill-emacs))'
> >   #+end_src
> > 
> > ... but the results vary wildly on subsequent runs.
> 
> For me, this doesn't actually display the results buffer (though I
> don't know why not), which means it won't test this, since the problem
> lies in the Emacs redisplay logic.  However, I think it does yield a
> baseline of how long it takes to construct the buffer without
> displaying it.  This is interesting because, while this patch does not
> achieve this baseline time, this patch combined with another one I
> have waiting in the wings, does.
> 

That's it!  I've been running the test unattended, so I failed to notice
that the results buffer isn't always displayed (i.e. sometimes it is,
causing the test to take quite a bit longer).

> > How would one go about getting stable results?
> 
> The results are quite stable for me, both timing it manually and using
> the command you gave above.  The only non-obvious thing that comes to
> mind is that you're probably testing on multiple cores, in which case
> the kernel scheduler could introduce a lot of noise.  You could try
> running numactl -C 0 emacs ... to see if there's any merit to this
> theory.

Hmm, numactl isn't available on my system;

I seem to have only a single node @ '/sys/devices/system/node/', and
both cpu0 and cpu1 @ '/sys/devices/system/cpu/' contain a symlink to
'node0'.

I'm guessing my kernel lacks support for NUMA policy?

Either way, this works:
  echo 0 > /sys/devices/system/cpu/cpu1/online


Peace

-- 
Pieter

[1] id:"87obwjtpcd.fsf@praet.org"
[2] id:"1320599856-24078-1-git-send-email-amdragon@mit.edu"

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] emacs: Use a single buffer invisibility spec to fix quadratic search cost.
  2011-11-11  5:27         ` Austin Clements
@ 2011-11-15 16:47           ` Pieter Praet
  2011-11-15 17:17             ` servilio
  0 siblings, 1 reply; 12+ messages in thread
From: Pieter Praet @ 2011-11-15 16:47 UTC (permalink / raw)
  To: Austin Clements; +Cc: notmuch, servilio

On Fri, 11 Nov 2011 00:27:16 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
> Quoth myself on Nov 10 at 11:53 pm:
> > Quoth Pieter Praet on Nov 11 at  4:04 am:
> > > I've tried getting some hard numbers using
> > > 
> > >   #+begin_src sh
> > >     time emacs --eval '(progn
> > >         (notmuch)
> > >         (notmuch-search "*")
> > >         (while (get-buffer-process (current-buffer))
> > >             (sleep-for 0.1))
> > >         (kill-emacs))'
> > >   #+end_src
> > > 
> > > ... but the results vary wildly on subsequent runs.
> > 
> > For me, this doesn't actually display the results buffer (though I
> > don't know why not), which means it won't test this, since the problem
> > lies in the Emacs redisplay logic.
> 
> This may or may not actually be correct, but the following seems more
> representative on my system:
> 
>     time emacs --eval '(progn
>         (notmuch)
>         (notmuch-search "*")
>         (while (get-buffer-process (current-buffer))
>             (redisplay)
>             (sleep-for 0.1))
>         (kill-emacs))'
> 

Yup, that does the trick!

These remained fairly consistent, even with both CPU cores enabled:

|      |         SINGLE        |          DUAL         |
|      | before    | after     | before    | after     |
|------+-----------+-----------+-----------+-----------|
| real | 0m34.560s | 0m31.829s | 0m30.784s | 0m26.587s |
| user | 0m26.188s | 0m23.672s | 0m27.028s | 0m22.889s |
| sys  | 0m0.863s  | 0m0.907s  | 0m1.200s  | 0m1.243s  |


> This at least displays the buffer.  I also tried
> (accept-process-output) instead of the (sleep-for 0.1), which clearly
> behaved differently, but gave only slightly higher numbers.  If I
> timed just the search part, to exclude emacs start-up, I would have a
> better idea of which more closely matches my manual measurements.

Oddly enough...  I ran some tests using `elp-instrument-package' and
your `time-it' macro [1] to keep emacs init out of the equation, and
both not only produced horribly fluctuating results (even with only a
single CPU core enabled), but often took *longer* as well!

I'm stumped...

  #+begin_src emacs-lisp
    (progn
      (require 'notmuch)
      (let ((elp-reset-after-results t))
        (elp-instrument-package "notmuch")
        (notmuch-search "*")
        (while (get-buffer-process (current-buffer))
          (redisplay)
          (sleep-for 0.1))
        (elp-results)))
  #+end_src

  #+begin_src emacs-lisp
    (defmacro time-it (code)
      `(let ((start-time (get-internal-run-time)))
         ,code
         (float-time (time-subtract (get-internal-run-time) start-time))))

    (require 'notmuch)

    (time-it (progn
               (notmuch-search "*")
               (while (get-buffer-process (current-buffer))
                 (redisplay)
                 (sleep-for 0.1))))
  #+end_src


Peace

-- 
Pieter

[1] id:"20110713185721.GI25558@mit.edu"

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] emacs: Use a single buffer invisibility spec to fix quadratic search cost.
  2011-11-15 16:47           ` Pieter Praet
@ 2011-11-15 17:17             ` servilio
  2011-11-16 21:29               ` Pieter Praet
  0 siblings, 1 reply; 12+ messages in thread
From: servilio @ 2011-11-15 17:17 UTC (permalink / raw)
  To: Pieter Praet; +Cc: notmuch

Hi,

Given that this change is about display of search results, I have the
suspicion that the following two factors might be more relevant:

- size of the Emacs frame: bigger would mean more threads to show

- composition of the search results, specially length of the threads
displayed, as the longer they go, the more hidden text there will be

And because of this, though you might already have a proper method for
measuring it, the results are different. I thought about this because
my searches have only ~180 threads, yet I could notice the difference.

Servilio

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] emacs: Use a single buffer invisibility spec to fix quadratic search cost.
  2011-11-15 17:17             ` servilio
@ 2011-11-16 21:29               ` Pieter Praet
  0 siblings, 0 replies; 12+ messages in thread
From: Pieter Praet @ 2011-11-16 21:29 UTC (permalink / raw)
  To: servilio; +Cc: notmuch

On Tue, 15 Nov 2011 12:17:59 -0500, servilio <servilio@gmail.com> wrote:
> Hi,
> 
> Given that this change is about display of search results, I have the
> suspicion that the following two factors might be more relevant:
> 
> - size of the Emacs frame: bigger would mean more threads to show
> 
> - composition of the search results, specially length of the threads
> displayed, as the longer they go, the more hidden text there will be
> 
> And because of this, though you might already have a proper method for
> measuring it, the results are different. I thought about this because
> my searches have only ~180 threads, yet I could notice the difference.
> 

Definitely valid points, but these aren't likely to have affected my
test results, as every iteration started a fresh instance of Emacs,
full-screen, on the same screen, with the same query on the same
dataset, whilst ensuring -to a feasible degree- that Emacs was the
only thing vying for CPU time.

After including `redisplay' in the script (as suggested by Austin [1]),
I did get consistent results as well as a measurable performance
improvement due to the buffer invisibility spec patch.


As for the tests using `elp-instrument-package' and Austin's
`time-it' macro: I'm most likely just doing it wrong :)

> Servilio


Peace

-- 
Pieter

[1] id:"20111111052716.GU2658@mit.edu"

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2011-11-16 21:30 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-11-09  2:55 [PATCH] emacs: Use a single buffer invisibility spec to fix quadratic search cost Austin Clements
2011-11-09  6:31 ` Dmitry Kurochkin
2011-11-10 13:33 ` Sebastian Spaeth
2011-11-10 19:22   ` servilio
2011-11-11  3:04     ` Pieter Praet
2011-11-11  4:53       ` Austin Clements
2011-11-11  5:27         ` Austin Clements
2011-11-15 16:47           ` Pieter Praet
2011-11-15 17:17             ` servilio
2011-11-16 21:29               ` Pieter Praet
2011-11-15 16:45         ` Pieter Praet
2011-11-12 14:35 ` David Bremner

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).