From: Matt Armstrong
Newsgroups: gmane.emacs.devel
Subject: Re: noverlay branch
Date: Wed, 12 Oct 2022 15:26:58 -0700
Message-ID: <87ilkollu5.fsf@rfc20.org>
To: Stefan Monnier
Cc: emacs-devel@gnu.org

Stefan Monnier writes:

>> --- a/src/itree.c
>> +++ b/src/itree.c
>> @@ -307,6 +307,7 @@ check_tree (struct interval_tree *tree,
>>    if (tree->root == ITREE_NULL)
>>      return true;
>>    eassert (tree->root->parent == ITREE_NULL);
>> +  eassert (!check_red_black_invariants || !tree->root->red);
>>
>>    struct interval_node *node = tree->root;
>>    struct check_subtree_result result
>
> Does any part of the code care about that?
> I can't see anything that
> would break if this invariant is not satisfied (both in terms of
> correct behavior and in terms of performance).  IOW it seems more like
> an accident than something important would should check.

That is a can of subtle worms.

As a practical matter, `interval_tree_insert_fix` asserts it, and that
was failing for me.  I wanted it to fail earlier.

TLDR: That the root be black isn't an essential property of Red-Black
trees.  We could relax it, but that is uncommon practice and would
slightly complicate `interval_tree_insert_fix`.

Cormen et al. don't list it as one of the four "red-black properties",
but read on.

Haeupler, Sen and Tarjan's Rank-Balanced Trees paper
(http://sidsen.azurewebsites.net/papers/rb-trees-talg.pdf) reformulates
Red-Black trees in terms of rank differences of a node and its parent.
In their formulation, there is an equivalence between Red-Black "color"
and the rank difference between a node and its parent.  Since the root
has no parent, they say the root has no color.  So I think you could
prove that the root's color doesn't matter -- it is inexpressible in an
equivalent formulation of the same class of trees.

But then you get to the implementation, where Cormen et al., subtly
placed deep in their description of their "RB-Insert" function (at least
in my old copy), say this:

> We have made the important assumption that the root of the tree is
> black -- a property we guarantee in line 18 [...]

Why?  Because, in my copy, they do this:

    if p[x] = left[p[p[x]]]

which is ill-defined if p[x] is a red root -- there is no p[p[x]].  They
want a guarantee that p[p[x]] exists.  They've already checked that p[x]
is red, so they just say the root can't be red to save a check that p[x]
is not the root.

(Let me chime in here that I *dislike* it when textbook authors do this
kind of thing, especially in books aimed at undergrads.
They prominently list four essential Red-Black tree properties, and
promptly write code that relies on additional invariants they toss in to
make their pseudo-code look nicer.  I actually remember looking at this
p[p[x]] and struggling to figure out how the four "red-black properties"
made that safe, then finding that little note buried at the end of a
paragraph in their voluminous text, and getting angry.  It is almost as
if they had no appreciation for their poor young readers' minds being
completely full just understanding the very basics.  That was nearly
three decades ago!  A formative memory in the folly of being too
clever.)

Our code is the same in `interval_tree_inherit_offset`:

      if (node->parent == node->parent->parent->left)
        {

If node->parent is a red root that'll crash even with sentinel nodes
(after my change to give even them a truly NULL parent).

At long last, I think we could change this in
`interval_tree_insert_fix`:

  while (null_safe_is_red (node->parent))
    {

to this:

  while (null_safe_is_red (node->parent) && node->parent != tree->root)
    {

...and then we could allow red roots.

I'm inclined to keep it the way it is, if only for reasons of inertia.
Most future maintainers coming to Red-Black tree code will have the
"Cormen et al." mindset.
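
For illustration only, here is a minimal sketch of the difference
between the two loop conditions.  It assumes hypothetical, stripped-down
`struct node` and `struct tree` types (not the real `struct
interval_node` and `struct interval_tree` from itree.c) and a root whose
parent is plain NULL, as described above.  The helper names
`enters_fix_loop_strict`, `enters_fix_loop_relaxed` and
`red_root_reaches_null_grandparent` are made up for this example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the itree.c structs.  */
struct node
{
  struct node *parent;
  struct node *left;
  struct node *right;
  bool red;
};

struct tree
{
  struct node *root;
};

static bool
null_safe_is_red (const struct node *node)
{
  return node != NULL && node->red;
}

/* Current loop condition: relies on the root always being black,
   because the loop body dereferences node->parent->parent.  */
static bool
enters_fix_loop_strict (const struct tree *tree, const struct node *node)
{
  (void) tree;
  return null_safe_is_red (node->parent);
}

/* Relaxed loop condition: also stops at a red root, so the loop body
   never sees a parent that has no grandparent above it.  */
static bool
enters_fix_loop_relaxed (const struct tree *tree, const struct node *node)
{
  return null_safe_is_red (node->parent) && node->parent != tree->root;
}

/* Build a two-node tree with a red root and a red left child, and
   report whether the chosen condition would enter the loop body while
   the grandparent is NULL.  */
static bool
red_root_reaches_null_grandparent (bool relaxed)
{
  struct node root = { NULL, NULL, NULL, true };
  struct node child = { &root, NULL, NULL, true };
  struct tree tree = { &root };
  root.left = &child;

  bool enters = relaxed
    ? enters_fix_loop_relaxed (&tree, &child)
    : enters_fix_loop_strict (&tree, &child);
  return enters && child.parent->parent == NULL;
}
```

With the strict condition and a red root, the loop body would go on to
evaluate node->parent->parent->left with a NULL grandparent; with the
relaxed condition the loop stops before that dereference.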