all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Stefan Monnier <monnier@iro.umontreal.ca>
To: emacs-devel@gnu.org
Subject: Markers in a gap array
Date: Thu, 04 Jul 2024 00:59:56 -0400	[thread overview]
Message-ID: <jwvv81lg65d.fsf-monnier+emacs@gnu.org> (raw)

I've just pushed to `scratch/markers-as-gap-array` code that seems to be
working correctly (passes all the tests, for example).
The code is not ready for `master` (there are some cleanups to do), but
if you use code that uses many markers, I'd be interested to hear about
your experience with it.

What it does: replace the unordered singly-linked list of markers that
we keep in every buffer, with an "ordered array (with gap) of markers".
The main purpose is to try and eliminate some pathological behavior in
the bytepos<->charpos conversion routines (which "abuse" markers as
a cache of bytepos<->charpos conversions) when you have many markers in
large buffers.

There's no free lunch, so this comes at the cost of slowing down other
marker operations, which is why I'd like to hear about your experience.
Here are those operations and how the performance is expected to compare:

The good:
- Conversion bytepos<->charpos used to be O(N), now it's O(lg N).
  The frequency of such conversions can vary drastically depending on
  the circumstances, so a lot of use-cases will see no difference at all,
  whereas some particular use-cases should see a significant speed up.

The indifferent:
- `make-marker`: Was and is still O(1).
- `marker-position`: Was and is still O(1).
- Adjustments when performing text insertion/deletion: Used to be O(N)
  and is still O(N), but with a smaller constant factor because
  it can fetch the markers in parallel whereas that used to be
  sequentialized by the pointer chasing of the singly-linked list and
  its branch instructions should be easier to predict.
  It's unlikely you'll see the gain, tho, because those adjustments are
  usually drowned in the noise of everything else we do upon text
  insertion/deletion.

The bad:
- `copy-marker`: Used to be O(1), now will cost time O(M + lg N) where
  N is the number of markers and M is the distance between where this
  marker is inserted and the previous position of the gap.
  The `lg N` factor should be hopefully cheap enough.
  The `M` factor could be more worrisome, since M can be as large as N,
  but there are two reasons to be optimistic:
  - The locality which makes our "gap buffer" usually efficient should
    hopefully work its same magic here keeping M usually small.
  - Moving the gap is a `memmove` and this can be performed quite fast
    even for fairly large M (i.e. the constant factor applied to
    M should be quite small).
- `(set-marker m nil)`: Used to be O(N), now costs the same as
  `copy-marker`, hence O(M + lg N).
  Even when M=N, this should be much better than the previous worst case
  because the constant factor applied to M should be *much* smaller.
  So it could be considered to be part of "The good" except that
  (set-marker m nil) was often near O(1) in practice if `m` was recently
  created (in which case it was found near the beginning of the
  singly-linked list).

So, for instance, `save-excursion` used to do a `copy-marker` followed
by a (set-marker m nil), both of which were, typically, O(1) and are now
made slower.

Side note: IIUC this design is similar to the one used in XEmacs, so
we're slowly catching up to them.  🙂


        Stefan




             reply	other threads:[~2024-07-04  4:59 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-04  4:59 Stefan Monnier [this message]
2024-07-04 10:24 ` Markers in a gap array Ihor Radchenko
2024-07-04 13:16   ` Stefan Monnier
2024-07-04 14:30     ` Ihor Radchenko
2024-07-04 20:11       ` Stefan Monnier
2024-07-04 20:34         ` Pip Cet
2024-07-04 20:42           ` Stefan Monnier
2024-07-17 16:48             ` Helmut Eller
2024-07-18 20:46               ` Stefan Monnier
2024-07-26 19:48                 ` Helmut Eller
2024-08-05 19:54                   ` MPS: marker-vector (was: Markers in a gap array) Helmut Eller
2024-08-05 21:14                     ` MPS: marker-vector Pip Cet
2024-08-06  6:28                       ` Helmut Eller
2024-08-06  6:51                         ` Gerd Möllmann
2024-08-06 14:36                         ` Pip Cet
2024-08-06 16:15                           ` Helmut Eller
2024-08-06  3:59                     ` Gerd Möllmann
2024-08-06  6:02                       ` Helmut Eller
2024-07-04 22:24         ` Markers in a gap array Stefan Monnier
2024-07-07 12:31         ` Ihor Radchenko
2024-07-07 13:09         ` Konstantin Kharlamov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=jwvv81lg65d.fsf-monnier+emacs@gnu.org \
    --to=monnier@iro.umontreal.ca \
    --cc=emacs-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.