From: Helmut Eller <eller.helmut@gmail.com>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: Pip Cet <pipcet@protonmail.com>,
	 Ihor Radchenko <yantar92@posteo.net>,
	emacs-devel@gnu.org
Subject: Re: Markers in a gap array
Date: Fri, 26 Jul 2024 21:48:46 +0200
Message-ID: <87jzh8djdt.fsf@gmail.com>
In-Reply-To: <jwvr0bqa30h.fsf-monnier+emacs@gnu.org> (Stefan Monnier's message of "Thu, 18 Jul 2024 16:46:04 -0400")

On Thu, Jul 18 2024, Stefan Monnier wrote:

>> The current scratch/igc branch, configured with MPS and -O2
>> -fno-omit-frame-pointer:
>>
>>   | test                   || tot avg (s) | tot avg err (s) |
>>   |------------------------++-------------+-----------------|
>>   | bytechar               ||       12.11 |            0.18 |
>>   | bytechar-100k          ||       12.38 |            0.17 |
>>   | bytechar-100k-nolookup ||        9.14 |            0.22 |
>>   | bytechar-100k-random   ||      271.52 |           14.27 |
>>   | bytechar-100k-rev      ||       12.38 |            0.24 |
>>   | bytechar-10k-random    ||       38.08 |            1.43 |
>>   | bytechar-1k-random     ||       14.95 |            0.48 |
>>   | bytechar-nolookup      ||        8.97 |            0.12 |
>>   |------------------------++-------------+-----------------|
>>   | total                  ||      379.53 |           14.36 |
>>
>> and without MPS:
>>
>>   | test                   || tot avg (s) | tot avg err (s) |
>>   |------------------------++-------------+-----------------|
>>   | bytechar               ||       11.42 |            0.03 |
>>   | bytechar-100k          ||       11.48 |            0.02 |
>>   | bytechar-100k-nolookup ||        9.15 |            0.00 |
>>   | bytechar-100k-random   ||       16.39 |            0.02 |
>>   | bytechar-100k-rev      ||       11.48 |            0.02 |
>>   | bytechar-10k-random    ||       11.97 |            0.02 |
>>   | bytechar-1k-random     ||       11.56 |            0.01 |
>>   | bytechar-nolookup      ||        9.13 |            0.04 |
>>   |------------------------++-------------+-----------------|
>>   | total                  ||       92.58 |            0.06 |
>>
>> So the weak vector doesn't compare very well to the linked list.
>
> Hmm... I wonder why there is such a large difference for markers created
> in a random-order compared to the cases where they're created beg-to-end
> and end-to-beg.  My crystal ball is of no help but suggests that it
> might hint at the fact that it's probably a silly effect that could be
> fixed easily once diagnosed.

One problem with the benchmarks is that they all use the same buffer and
that the markers for the previous benchmark can still linger around.
The benchmark driver calls garbage-collect before running a benchmark
and for the old GC that may be enough to collect all the old markers;
with MPS, the old markers are definitely still there.

If I create a fresh buffer for each benchmark, the times of the MPS and
non-MPS versions are much closer.

>> Maybe because the vector only grows and never shrinks.
>
> But why would that only show up when the order is random?

To figure out what is going on, I ran bytechar-100k followed by
bytechar-10k-random; in GDB I interrupted the benchmark and printed the
marker array.  After index 100000, it contains suspicious duplicates:

  ...
  (99997 9899604)
  (99998 9899703)
  (99999 9899802)
  (100000 9899901)
  (100001 7272795)
  (100002 7272795)
  (100003 8017474)
  (100004 8017474)
  (100005 7087003)
  (100006 7087003)
  (100007 4076094)
  (100008 4076094)
  ...

The first element is the array index and the second is the charpos of
the marker.  Then I set a breakpoint in build_marker and got this:

#0  build_marker (buf=0x7fffe46c9a10, charpos=6001308, bytepos=6003108)
    at alloc.c:4191
#1  0x00005555557be1a7 in buf_charpos_to_bytepos
    (b=0x7fffe46c9a10, charpos=6001308) at marker.c:238
#2  0x00005555557bf184 in set_marker_internal
    (marker=0x7fffe5a0f4dd, position=0x16e4a72, buffer=0x0, restricted=false)
    at marker.c:587
#3  0x00005555557bf2a3 in Fset_marker
    (marker=0x7fffe5a0f4dd, position=0x16e4a72, buffer=0x0) at marker.c:630
#4  0x00005555557bf640 in Fcopy_marker (marker=0x16e4a72, type=0x0)
    at marker.c:788

It looks like Fcopy_marker calls (indirectly) buf_charpos_to_bytepos and
that creates another marker at the same position (0x16e4a72 is the
fixnum for 6001308).  I doubt that this is intentional, but it may not
be a serious problem.
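
To double-check the fixnum reading of 0x16e4a72, here is a small
standalone sketch of the decoding.  It assumes the value sits above a
2-bit tag whose low bits are 2, which is at least consistent with the
numbers in the backtrace; the toy_* names and constants are made up for
illustration and are not copied from lisp.h.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a fixnum word is (value << 2) | 2.  These constants are
   illustrative only; they are not taken from the Emacs sources.  */
enum { TOY_INT_TAG_BITS = 2, TOY_INT_TAG = 2 };

/* Return the integer stored in WORD.  Only non-negative values are
   handled, which is enough for buffer positions.  */
static int64_t
toy_fixnum_value (uint64_t word)
{
  /* The low bits must carry the assumed fixnum tag.  */
  assert ((word & ((1u << TOY_INT_TAG_BITS) - 1)) == TOY_INT_TAG);
  return (int64_t) (word >> TOY_INT_TAG_BITS);
}

/* Inverse of toy_fixnum_value, again for non-negative N only.  */
static uint64_t
toy_make_fixnum (int64_t n)
{
  assert (n >= 0);
  return ((uint64_t) n << TOY_INT_TAG_BITS) | TOY_INT_TAG;
}
```

By hand: 0x16e4a72 is 24005234 in decimal, and 6001308 * 4 + 2 is also
24005234, so the position argument in frames #2 to #4 matches the
charpos in frames #0 and #1.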

So why does the problem only show up for random positions?  Maybe
because the benchmark is spending most of the time not in
re-search-forward, but in copy-marker and for random positions the
caching is ineffective?
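
One way to test that guess would be to count how many entries of the
marker array have a twin at the same charpos.  A rough sketch follows;
count_adjacent_duplicates is a hypothetical helper and not something in
the Emacs tree, and the sample data is just the excerpt of the GDB dump
shown above.

```c
#include <assert.h>
#include <stddef.h>

/* Count the indices I for which CHARPOS[I] == CHARPOS[I + 1].
   Hypothetical helper for inspecting a dumped marker array.  */
static size_t
count_adjacent_duplicates (const long *charpos, size_t n)
{
  assert (charpos != NULL || n == 0);
  size_t dups = 0;
  for (size_t i = 0; i + 1 < n; i++)
    if (charpos[i] == charpos[i + 1])
      dups++;
  return dups;
}

/* Charpos values for array indices 99997..100008, copied from the
   dump above.  */
static const long dump_excerpt[] = {
  9899604, 9899703, 9899802, 9899901,
  7272795, 7272795, 8017474, 8017474,
  7087003, 7087003, 4076094, 4076094,
};
```

If nearly every marker created by the random benchmark turns out to
have such a twin, that would suggest that almost every copy-marker call
missed the cache and went through build_marker.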