unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Stefan Monnier <monnier@iro.umontreal.ca>
To: Matt Armstrong <matt@rfc20.org>
Cc: emacs-devel@gnu.org
Subject: Re: noverlay branch
Date: Tue, 11 Oct 2022 00:09:32 -0400	[thread overview]
Message-ID: <jwv4jwbvwy9.fsf-monnier+emacs@gnu.org> (raw)
In-Reply-To: <87sfjvvx7g.fsf@rfc20.org> (Matt Armstrong's message of "Mon, 10 Oct 2022 20:46:43 -0700")

Matt Armstrong [2022-10-10 20:46:43] wrote:

> Stefan, I read only Eli's reply this morning.  Got to yours just now.
>
>
> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
>>> I had a different idea of tightening the otree invariant to this:
>>>  1) A node's otick must be greater than or equal to its children's.
>> That's already checked in the current code, right?
> Not yet...until...just now.

??
In the code I have from feature/noverlay I see:

    eassert (node->parent == ITREE_NULL || node->parent->otick >= node->otick);

in `interval_tree_inherit_offset`.

>>>  3) All downward tree traversal propagates offsets and otick.
>> I think we already do that, but if there are places we missed, then yes,
>> of course.
> Yes, I think we do.  The wrinkle is that we don't always start
> inheriting at the root, but otick is not updated in that case.

I don't think propagating `otick` is very important during
tree traversals.

>> Regarding `otick`, I can see 2 more options:
>> - Get rid of it completely: its sole purpose is to try and keep
>>   `overlay-start/end` O(1) in the usual case instead of O(log N), but
>>   I'm not convinced it's worth the cost of propagating `otick` everywhere
>>   all the time.
>> - A halfway point is to keep `otick` but update it more lazily,
>>   i.e. only update it when we do `overlay-start/end` (e.g. in
>>   `interval_tree_validate`).
> These ideas are simpler but similar in direction to my idea to use a
> btree instead.

Sorry, I fail to see the connection to btrees.

>> This max_depth also sounds to me like over-engineering.
> I'd like to keep max_depth.

Sorry, not gonna happen.

> My reason for adding it is informed by experience on similar projects.
> It is easier to understand an assertion that the tree is too deep than
> it is to debug, say, stack overflow when there is a cycle in the link
> structure of the tree.

The purpose of ENABLE_CHECKING assertions is:
- avoid errors remaining undetected.
- document the code.

If it crashes with a stack overflow, it still detects the problem, so
`max_depth` is not necessary for that.

Your checking code is nice and detailed, but it's becoming so large that
it increases the maintenance burden.  I want to make it shorter&simpler.

> It is easier to limit the height of checking if, indeed, on some night
> I am diagnosing an unrelated bug but the perf cost here is too much.

Many people run their Emacs always with ENABLE_CHECKING, so the
performance impact should be minimized.  It's OK to check a node's local
invariants in those places where you visit the node anyway.  It's OK to
check the overall state of the tree in those places where the code will
visit all the nodes anyway, but it's not OK to visit significantly more
nodes just for the check than what the normal code would do, especially
if an error there could be detected sooner or later by cheaper
checks elsewhere.

> With max_depth in place I can easily manually hack up
> a functon to verify checks only in a small subset of the tree
> (e.g. around a rotation operation, etc.).  I've done and benefited from
> this sort of thing in the past.

Yes, it's very helpful *while working on the code*.  But it's easy to
sprinkle many more calls to `check_tree` as needed when you're debugging
an error caught by the cheap checks.  And when you do that you can
temporarily pay the price of full tree traversals.

> There is a small concrete benefit to it right now.  MAX_DEPTH is now
> initialized from interval_tree_max_height(), which currently could wildy
> under estimate the height and we wouldn't notice (because generator
> stacks auto-grow).  Since we call that fucntion to make "big enough to
> not need growing" stacks, it is nice to have test coverage for it.

[ FWIW, I'd like to get rid of the `tree->size` field, and thus rely on
  auto-growing more heavily.  ]

>>> +  result.complete = left_result.complete && right_result.complete;
>>> +  if (result.complete)
>> I think all calls to `check_tree` should be complete traversals.
>> Most of the invariants checked in it are already checked incrementally
>> via sprinkled assertions elsewhere, so it's only really useful when
>> debugging a concrete known issue where the "local" checks aren't good
>> enough.
> All `check_tree` calls are complete traversals.

Great, so we can remove the `result.complete` field.

>> We could also include a few "cheap" calls to `check_tree` via `eassert`
>> (i.e. calls which we know shouldn't be algorithmically too expensive
>> because we're already traversing the whole tree for some other reason,
>> e.g. when killing a buffer (e.g. to verify that, at the end of the day,
>> we still preversed the RB invariants :-) or maybe during GC).
> ...I'm not understanding something here.  All `check_tree` calls are in
> `eassert` already.

But some of them are currently at places where they're unacceptable
because they cost a lot more than the surrounding code.


        Stefan




  reply	other threads:[~2022-10-11  4:09 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-25 22:38 noverlay branch Stefan Monnier
2022-09-25 22:50 ` Lars Ingebrigtsen
2022-09-25 22:56   ` Stefan Monnier
2022-09-26  2:52 ` Ihor Radchenko
2022-09-26  3:17   ` Stefan Monnier
2022-09-26  6:11   ` Eli Zaretskii
2022-09-26 13:08     ` Ihor Radchenko
2022-09-26 15:59       ` Eli Zaretskii
     [not found]         ` <87v8ovdosz.fsf@localhost>
2022-10-08  6:57           ` Eli Zaretskii
2022-10-09  3:25             ` Matt Armstrong
2022-10-09  4:30               ` Eli Zaretskii
2022-10-09  3:23           ` Matt Armstrong
2022-10-09  3:47           ` Stefan Monnier
2022-10-13 12:09             ` Ihor Radchenko
2022-09-29 18:12       ` Stefan Monnier
2022-09-27  5:12 ` Matt Armstrong
2022-09-27  6:53   ` Eli Zaretskii
2022-09-27 17:31     ` Matt Armstrong
2022-09-27 18:45       ` Stefan Monnier
2022-09-28 23:09   ` Stefan Monnier
2022-09-29 14:54     ` Gerd Möllmann
2022-09-29 21:36       ` Stefan Monnier
2022-09-30  5:20         ` Gerd Möllmann
2022-10-06  4:47         ` Matt Armstrong
2022-10-06  5:43           ` Gerd Möllmann
2022-10-07  4:11             ` Matt Armstrong
2022-10-07  4:34               ` Gerd Möllmann
2022-10-07 13:33                 ` Stefan Monnier
2022-10-07 14:29                   ` Gerd Möllmann
2022-10-07 14:51                     ` Stefan Monnier
2022-10-07 15:12                       ` Gerd Möllmann
2022-10-07 17:12                         ` Stefan Monnier
2022-10-07 14:56                   ` Stefan Monnier
2022-10-07 15:59                   ` Matt Armstrong
2022-10-07 15:34                 ` Matt Armstrong
2022-10-06 12:08           ` Stefan Monnier
2022-09-27  8:39 ` Gerd Möllmann
2022-09-27  9:38   ` Eli Zaretskii
2022-10-06 20:41 ` Matt Armstrong
2022-10-07 16:51 ` Matt Armstrong
2022-10-07 18:28   ` Stefan Monnier
2022-10-08 23:33     ` Matt Armstrong
2022-10-09  3:44       ` Matt Armstrong
2022-10-09  4:12       ` Stefan Monnier
2022-10-09 15:34         ` Stefan Monnier
2022-10-10  2:57           ` Matt Armstrong
2022-10-10  6:24             ` Eli Zaretskii
2022-10-10 16:26               ` Matt Armstrong
2022-10-10 14:44             ` Stefan Monnier
2022-10-11  3:46               ` Matt Armstrong
2022-10-11  4:09                 ` Stefan Monnier [this message]
2022-10-11 18:02                   ` Matt Armstrong
2022-10-11 18:59                     ` Stefan Monnier
2022-10-12  5:18                       ` Matt Armstrong
2022-10-12 18:02                         ` Stefan Monnier
2022-10-12 22:26                           ` Matt Armstrong
2022-10-13  4:03                             ` Stefan Monnier
2022-10-09 23:47       ` Stefan Monnier
2022-10-10  0:05         ` Emanuel Berg
2022-10-10 16:27           ` Matt Armstrong
2022-10-10 16:33         ` Matt Armstrong
2022-10-10 18:32           ` Matt Armstrong
2022-10-11 16:06             ` Stefan Monnier
2022-10-12 17:33               ` Matt Armstrong
2022-10-13  3:59                 ` Stefan Monnier
2022-10-16 21:53                   ` Matt Armstrong
2022-10-23  4:49 ` Matt Armstrong
2022-10-24  9:14   ` Stefan Kangas
2022-10-24 16:21     ` Matt Armstrong
2022-10-24 12:51   ` Eli Zaretskii
2022-10-25 20:57     ` Dmitry Gutov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=jwv4jwbvwy9.fsf-monnier+emacs@gnu.org \
    --to=monnier@iro.umontreal.ca \
    --cc=emacs-devel@gnu.org \
    --cc=matt@rfc20.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).