unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Markers in a gap array
@ 2024-07-04  4:59 Stefan Monnier
  2024-07-04 10:24 ` Ihor Radchenko
  0 siblings, 1 reply; 21+ messages in thread
From: Stefan Monnier @ 2024-07-04  4:59 UTC (permalink / raw)
  To: emacs-devel

I've just pushed to `scratch/markers-as-gap-array` code that seems to be
working correctly (passes all the tests, for example).
The code is not ready for `master` (there are some cleanups to do), but
if you use code that uses many markers, I'd be interested to hear about
your experience with it.

What it does: replace the unordered singly-linked list of markers that
we keep in every buffer, with an "ordered array (with gap) of markers".
The main purpose is to try and eliminate some pathological behavior in
the bytepos<->charpos conversion routines (which "abuse" markers as
a cache of bytepos<->charpos conversions) when you have many markers in
large buffers.

There's no free lunch, so this comes at the cost of slowing down other
marker operations, which is why I'd like to hear about your experience.
Here are those operations and how the performance is expected to compare:

The good:
- Conversion bytepos<->charpos used to be O(N), now it's O(lg N).
  The frequency of such conversions can vary drastically depending on
  the circumstances, so a lot of use-cases will see no difference at all,
  whereas some particular use-cases should see a significant speed up.

The indifferent:
- `make-marker`: Was and is still O(1).
- `marker-position`: Was and is still O(1).
- Adjustments when performing text insertion/deletion: Used to be O(N)
  and is still O(N), but with a smaller constant factor because
  it can fetch the markers in parallel whereas that used to be
  sequentialized by the pointer chasing of the singly-linked list and
  its branch instructions should be easier to predict.
  It's unlikely you'll see the gain, tho, because those adjustments are
  usually drowned in the noise of everything else we do upon text
  insertion/deletion.

The bad:
- `copy-marker`: Used to be O(1), now will cost time O(M + lg N) where
  N is the number of markers and M is the distance between where this
  marker is inserted and the previous position of the gap.
  The `lg N` factor should be hopefully cheap enough.
  The `M` factor could be more worrisome, since M can be as large as N,
  but there are two reasons to be optimistic:
  - The locality which makes our "gap buffer" usually efficient should
    hopefully work its same magic here keeping M usually small.
  - Moving the gap is a `memmove` and this can be performed quite fast
    even for fairly large M (i.e. the constant factor applied to
    M should be quite small).
- `(set-marker m nil)`: Used to be O(N), now costs the same as
  `copy-marker`, hence O(M + lg N).
  Even when M=N, this should be much better than the previous worst case
  because the constant factor applied to M should be *much* smaller.
  So it could be considered to be part of "The good" except that
  (set-marker m nil) was often near O(1) in practice if `m` was recently
  created (in which case it was found near the beginning of the
  singly-linked list).

So, for instance, `save-excursion` used to do a `copy-marker` followed
by a (set-marker m nil), both of which were, typically, O(1) and are now
made slower.

Side note: IIUC this design is similar to the one used in XEmacs, so
we're slowly catching up to them.  🙂


        Stefan




^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2024-08-06 16:15 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-04  4:59 Markers in a gap array Stefan Monnier
2024-07-04 10:24 ` Ihor Radchenko
2024-07-04 13:16   ` Stefan Monnier
2024-07-04 14:30     ` Ihor Radchenko
2024-07-04 20:11       ` Stefan Monnier
2024-07-04 20:34         ` Pip Cet
2024-07-04 20:42           ` Stefan Monnier
2024-07-17 16:48             ` Helmut Eller
2024-07-18 20:46               ` Stefan Monnier
2024-07-26 19:48                 ` Helmut Eller
2024-08-05 19:54                   ` MPS: marker-vector (was: Markers in a gap array) Helmut Eller
2024-08-05 21:14                     ` MPS: marker-vector Pip Cet
2024-08-06  6:28                       ` Helmut Eller
2024-08-06  6:51                         ` Gerd Möllmann
2024-08-06 14:36                         ` Pip Cet
2024-08-06 16:15                           ` Helmut Eller
2024-08-06  3:59                     ` Gerd Möllmann
2024-08-06  6:02                       ` Helmut Eller
2024-07-04 22:24         ` Markers in a gap array Stefan Monnier
2024-07-07 12:31         ` Ihor Radchenko
2024-07-07 13:09         ` Konstantin Kharlamov

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).