From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors" Newsgroups: gmane.emacs.bugs Subject: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful Date: Fri, 30 Sep 2022 11:25:30 -0400 Message-ID: References: <83h70qhez0.fsf@gnu.org> <83edvuhaby.fsf@gnu.org> <831qruh67o.fsf@gnu.org> <83y1u2foli.fsf@gnu.org> Reply-To: Stefan Monnier Mime-Version: 1.0 Content-Type: text/plain Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="7734"; mail-complaints-to="usenet@ciao.gmane.io" User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux) Cc: Eli Zaretskii , Andreas Politz , 58158@debbugs.gnu.org To: Gerd =?UTF-8?Q?M=C3=B6llmann?= Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Fri Sep 30 17:26:32 2022 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1oeHuJ-0001me-Qh for geb-bug-gnu-emacs@m.gmane-mx.org; Fri, 30 Sep 2022 17:26:32 +0200 Original-Received: from localhost ([::1]:58500 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oeHuI-0001aI-GK for geb-bug-gnu-emacs@m.gmane-mx.org; Fri, 30 Sep 2022 11:26:30 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:40814) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oeHtw-0001Zc-8i for bug-gnu-emacs@gnu.org; Fri, 30 Sep 2022 11:26:10 -0400 Original-Received: from debbugs.gnu.org ([209.51.188.43]:43597) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1oeHtq-0000NL-Ny for bug-gnu-emacs@gnu.org; Fri, 30 Sep 2022 11:26:04 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1oeHtq-0002Rx-9G for bug-gnu-emacs@gnu.org; Fri, 30 Sep 2022 11:26:02 -0400 X-Loop: help-debbugs@gnu.org Resent-From: Stefan Monnier Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 30 Sep 2022 15:26:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 58158 X-GNU-PR-Package: emacs Original-Received: via spool by 58158-submit@debbugs.gnu.org id=B58158.16645515459390 (code B ref 58158); Fri, 30 Sep 2022 15:26:02 +0000 Original-Received: (at 58158) by debbugs.gnu.org; 30 Sep 2022 15:25:45 +0000 Original-Received: from localhost ([127.0.0.1]:42675 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1oeHtZ-0002RO-05 for submit@debbugs.gnu.org; Fri, 30 Sep 2022 11:25:45 -0400 Original-Received: from mailscanner.iro.umontreal.ca ([132.204.25.50]:6310) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1oeHtT-0002R6-JZ for 58158@debbugs.gnu.org; Fri, 30 Sep 2022 11:25:43 -0400 Original-Received: from pmg3.iro.umontreal.ca (localhost [127.0.0.1]) by pmg3.iro.umontreal.ca (Proxmox) with ESMTP id 207314422FB; Fri, 30 Sep 2022 11:25:34 -0400 (EDT) Original-Received: from mail01.iro.umontreal.ca (unknown [172.31.2.1]) by pmg3.iro.umontreal.ca (Proxmox) with ESMTP id 1C1B8441EDD; Fri, 30 Sep 2022 11:25:32 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=iro.umontreal.ca; s=mail; t=1664551532; bh=OivdNllY/14RVEMogDOTuhTzY4HeG6d3ne33GAuCsiU=; h=From:To:Cc:Subject:In-Reply-To:References:Date:From; b=Bof0Ovt6an/npPjOLoGodQ7V0uMS6DtDIVi6tudSR63kZR487m3Ta4Ngg+DyXWETp qJPDUtwaq6khjS+2hTe1tYQjXl5pSl6rs2YKbzhY5lk6iYCml6u/yhpj0I+eom+HQm dNz9GdL7HLftilZgWvqJavK/iUc2RpASRm4+IEjY22ajQKp0BHsNIBPB6+sWXtS1+k 0dFbRFAkkF5LqxWii5mZf1W5V2Okc0xBRK6NaRr5shbrSU6OI8RvCpHc6l5ttZA79H kJi9jGc7t5pe4cR/1f853GRIWmYQH+I08ICxfKGUPJImp7/2PrMrzQJJSLRxxf3sc2 5TDmq9ZeyhK3g== Original-Received: from pastel (65-110-220-202.cpe.pppoe.ca [65.110.220.202]) by mail01.iro.umontreal.ca (Postfix) with ESMTPSA id C656A12023A; Fri, 30 Sep 2022 11:25:31 -0400 (EDT) In-Reply-To: ("Gerd =?UTF-8?Q?M=C3=B6llmann?="'s message of "Fri, 30 Sep 2022 16:08:36 +0200") X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.io gmane.emacs.bugs:244027 Archived-At: > Maybe you could also help me with the questions below? I'll try (BTW, the original author is Andreas Politz who we can still reach at . He doesn't have much time to devote to it, tho, so best not to Cc him through all the conversations). > I'm assuming, from a comment somewhere, that an interval tree is an > rb-tree with keys being interval start positions, and allowing > duplicates. Yup. > That is, if N is a node, all nodes in the subtree N->left are strictly < > N, and nodes in N->right are >=. The following code in `interval_tree_insert`: while (child != ITREE_NULL) { parent = child; offset += child->offset; child->limit = max (child->limit, node->end - offset); /* This suggests that nodes in the right subtree are strictly greater. But this is not true due to later rotations. */ child = node->begin <= child->begin ? child->left : child->right; } suggests that N->left are <= and N->right are > but my reading of the comment is that the only thing we can rely on is that N-right is >= [ I do understand this comment now :-) ] > 2. Does the tree order duplicates in a particular way? > Maybe by overlay priority, or by interval end? AFAICT, no it does not. > And, perhaps, if it doesn't order duplicates, would it be an idea to > order them, maybe at some later point? I'm asking this because > a successor(N) function would return nodes in the order in the tree, > always, and I don't know what the order is now. Ordering based on interval end could arguably make sense. I'm not completely sure how/where it would benefit us, tho. Most times we look at the itree, we just look at *all* the nodes within a specific interval, so the order in which they're returned doesn't matter very much (the ordering within the tree does matter in terms of how we manage to prune the tree, but that has more to do with which elements are near the root, which is a different kind of "ordering" than the "binary tree ordering" itself). Maybe the `next/previous-overlay-change` code benefit from sub-ordering based on `end`, but I suspect the effect would be lost in the noise: if we want to speed up that part of the code, I expect that a better avenue is to rewrite the `next/previous-single-overlay-change` so as not to work by (while .. (next/previous-overlay-change ..) .. (get-char-property ..) ..) where both functions do the O(log N) itree-iteration dance, but instead doing the itree iteration itself. > 3. If my mental picture is right, we could of course end up with a tree > that has degenerated to a list. Or a subtree, maybe. Do you know if > that can happen? In terms of tree depth, no: the code preserves the RB invariants, which ensure balance regardless of ordering (or lack thereof). Stefan