unofficial mirror of emacs-devel@gnu.org 
From: Stefan Monnier <monnier@iro.umontreal.ca>
To: Ihor Radchenko <yantar92@posteo.net>
Cc: emacs-devel@gnu.org
Subject: Re: Markers in a gap array
Date: Thu, 04 Jul 2024 16:11:28 -0400
Message-ID: <jwvzfqx0wde.fsf-monnier+emacs@gnu.org>
In-Reply-To: <87le2hp6ug.fsf@localhost> (Ihor Radchenko's message of "Thu, 04 Jul 2024 14:30:47 +0000")

Ihor Radchenko [2024-07-04 14:30:47] wrote:

> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
>>> Some perf stats:
>>>
>>> ;; Switch to todo and mark next 3 times, on branch
>>>     ;; 28.72%  emacs            emacs            [.] markers_sanity_check
>>
>> Did you build with or without assertions?
>
> Without.
>
>> And indeed, I need to rework them to be "more conditional" (but I was
>> focused on correctness until now).  You should probably remove those
>> calls to `markers_sanity_check` by hand when testing performance, sorry.
>
> Without these calls, I can see some speed improvement in
> buf_bytepos_to_charpos, but I do not currently have a reliable
> reproducer to trigger buf_bytepos_to_charpos slowdown on master, so it
> is comparing very small numbers.

Hmm... I tried a benchmark based on:

    (defconst elb-bytechar-buffer
      (let ((buf (get-buffer-create " *elb-bytechar*")))
        (with-current-buffer buf
          (let ((step (apply #'concat "🙂 foo\n" (make-list 2000 "asdf "))))
            (dotimes (_ (/ 10000000 (length step)))
              (insert step))
            buf))))
    
    (defconst elb-bytechar-re "\\<.\\> \\<.\\> bar")
    
    (defun elb-bytechar--aux (nmarkers lookup &optional marker-fun)
      (with-current-buffer elb-bytechar-buffer
        (let ((step (/ (buffer-size) nmarkers))
              (markers nil))
          (dotimes (i nmarkers)
            (push (copy-marker (funcall (or marker-fun #'identity) (* i step)))
                  markers))
          (dotimes (_ 10)
            (goto-char (point-min))
            (let ((parse-sexp-lookup-properties lookup))
              (re-search-forward elb-bytechar-re nil t))))))

where I call `elb-bytechar--aux` with various arguments.

[ This benchmark is a test of the performance of
  bytepos->charpos conversion because the regexp engine works only with
  bytepos internally and it needs to convert it to charpos whenever it
  looks up the `syntax-table` text-property, which happens for example
  for \< and \>.  IME, this is the most important use of the
  bytepos->charpos conversion.  ]
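
[ For illustration, a minimal sketch of that conversion as seen from
  Lisp.  It uses only the standard `position-bytes` and
  `byte-to-position` functions; the name `my-bytechar-demo` is
  hypothetical and not part of the benchmark.  ]

```elisp
;; Sketch: char positions and byte positions diverge as soon as the
;; buffer contains a multibyte character.
(defun my-bytechar-demo ()
  "Return (CHARPOS BYTEPOS ROUND-TRIP) for the position after an emoji."
  (with-temp-buffer
    (set-buffer-multibyte t)
    ;; #x1F642 is the emoji used in the benchmark text; it takes
    ;; 4 bytes in Emacs's internal (UTF-8 based) representation.
    (insert (string #x1F642) " foo\n")
    (let* ((charpos 2)                          ; just after the emoji
           (bytepos (position-bytes charpos)))  ; charpos -> bytepos
      ;; byte-to-position performs the bytepos -> charpos direction.
      (list charpos bytepos (byte-to-position bytepos)))))
```

Evaluating `(my-bytechar-demo)` should show the byte position running
ahead of the char position, while the round trip gets back to the
original char position.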

And like you, I don't see any speed improvement from the branch.  On the
other hand, my trivial "thinko fix" b595b4598 (which I thought would
have no real-life effect) seems to make a significant difference (see
the results below).  So maybe b595b4598 is the reason you can't
reproduce the slowdown?  And maybe we should install that into
`emacs-30`?

In any case, these benchmarks suggest my branch isn't very exciting
performance-wise.  🙁

Also I don't have an explanation for the difference in performance
between bytechar-100k (8.00) and bytechar-100k-random/rev (~9.00) on
`markers-as-gap-array`: `rev` just builds the markers in reverse order
and `random` puts the markers at random positions.  Since my gap-array
keeps the markers sorted, the order in which they're created should not
affect the end result, and I don't think that placing them randomly in
the text should make much difference either (unless the performance
difference is just due to the time needed to compute `random`?).
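
[ The `rev` and `random` variants are not shown in the message.  As a
  rough sketch only, they could be obtained by passing a MARKER-FUN
  like the ones below to `elb-bytechar--aux`.  The names
  `my-elb-rev-pos` and `my-elb-random-pos` and the argument values in
  the commented calls are assumptions, not the actual benchmark code.  ]

```elisp
;; Hypothetical MARKER-FUN arguments for `elb-bytechar--aux'.
(defun my-elb-rev-pos (pos)
  "Mirror POS within the current buffer.
Increasing inputs give decreasing outputs, so the markers end up
being created from the end of the buffer toward its beginning."
  (- (point-max) pos))

(defun my-elb-random-pos (_pos)
  "Ignore the evenly spaced position and return a random one."
  (+ (point-min) (random (max 1 (buffer-size)))))

;; Assumed invocations (100000 markers, syntax-table lookup enabled):
;; (elb-bytechar--aux 100000 t #'my-elb-rev-pos)
;; (elb-bytechar--aux 100000 t #'my-elb-random-pos)
```

Note that the `random` variant calls `random` once per marker while the
markers are being set up.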


        Stefan


PS: Beware the "tot avg err" column: the machine I used for those
benchmarks is a poor fit, with a CPU whose top frequency varies
enormously depending on temperature and such, and I was using the
machine (lightly, but still) while the benchmarks were running.

* markers-as-gap-array

  | test                   | non-gc avg (s) | gc avg (s) | gcs avg | tot avg (s) | tot avg err (s) |
  |------------------------+----------------+------------+---------+-------------+-----------------|
  | bytechar               |           7.86 |       0.00 |       0 |        7.86 |            0.05 |
  | bytechar-100k          |           8.00 |       0.00 |       0 |        8.00 |            0.15 |
  | bytechar-100k-nolookup |           5.99 |       0.00 |       0 |        5.99 |            0.07 |
  | bytechar-100k-random   |           9.20 |       0.00 |       0 |        9.20 |            0.23 |
  | bytechar-100k-rev      |           9.05 |       0.00 |       0 |        9.05 |            0.59 |
  | bytechar-10k-random    |           8.09 |       0.00 |       0 |        8.09 |            0.06 |
  | bytechar-1k-random     |           7.91 |       0.00 |       0 |        7.91 |            0.03 |
  | bytechar-nolookup      |           5.91 |       0.00 |       0 |        5.91 |            0.01 |
  |------------------------+----------------+------------+---------+-------------+-----------------|

* master

  | test                   | non-gc avg (s) | gc avg (s) | gcs avg | tot avg (s) | tot avg err (s) |
  |------------------------+----------------+------------+---------+-------------+-----------------|
  | bytechar               |           7.73 |       0.00 |       0 |        7.73 |            0.40 |
  | bytechar-100k          |           8.04 |       0.00 |       0 |        8.04 |            0.02 |
  | bytechar-100k-nolookup |           5.93 |       0.00 |       0 |        5.93 |            0.02 |
  | bytechar-100k-random   |          10.05 |       0.00 |       0 |       10.05 |            0.01 |
  | bytechar-100k-rev      |           7.99 |       0.00 |       0 |        7.99 |            0.01 |
  | bytechar-10k-random    |           8.23 |       0.00 |       0 |        8.23 |            0.05 |
  | bytechar-1k-random     |           8.05 |       0.00 |       0 |        8.05 |            0.03 |
  | bytechar-nolookup      |           5.86 |       0.00 |       0 |        5.86 |            0.01 |
  |------------------------+----------------+------------+---------+-------------+-----------------|

* master before commit b595b4598 (mixup byte/char)

  | test                   | non-gc avg (s) | gc avg (s) | gcs avg | tot avg (s) | tot avg err (s) |
  |------------------------+----------------+------------+---------+-------------+-----------------|
  | bytechar               |           7.97 |       0.00 |       0 |        7.97 |            0.60 |
  | bytechar-100k          |          16.64 |       0.00 |       0 |       16.64 |            0.43 |
  | bytechar-100k-nolookup |           6.80 |       0.00 |       0 |        6.80 |            1.07 |
  | bytechar-100k-random   |          16.85 |       0.00 |       0 |       16.85 |            1.03 |
  | bytechar-100k-rev      |          13.56 |       0.00 |       0 |       13.56 |            0.10 |
  | bytechar-10k-random    |          14.15 |       0.00 |       0 |       14.15 |            0.07 |
  | bytechar-1k-random     |          14.06 |       0.00 |       0 |       14.06 |            0.20 |
  | bytechar-nolookup      |           5.93 |       0.00 |       0 |        5.93 |            0.03 |
  |------------------------+----------------+------------+---------+-------------+-----------------|

* /usr/bin/emacs (29.4)

  | test                   | non-gc avg (s) | gc avg (s) | gcs avg | tot avg (s) | tot avg err (s) |
  |------------------------+----------------+------------+---------+-------------+-----------------|
  | bytechar               |           7.92 |       0.00 |       0 |        7.92 |            1.07 |
  | bytechar-100k          |          16.27 |       0.00 |       0 |       16.27 |            1.91 |
  | bytechar-100k-nolookup |           6.16 |       0.00 |       0 |        6.16 |            0.04 |
  | bytechar-100k-random   |          15.91 |       0.00 |       0 |       15.91 |            0.29 |
  | bytechar-100k-rev      |          13.38 |       0.00 |       0 |       13.38 |            0.06 |
  | bytechar-10k-random    |          15.47 |       0.00 |       0 |       15.47 |            3.06 |
  | bytechar-1k-random     |          14.26 |       0.00 |       0 |       14.26 |            1.22 |
  | bytechar-nolookup      |           6.03 |       0.00 |       0 |        6.03 |            0.02 |
  |------------------------+----------------+------------+---------+-------------+-----------------|




