From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors" Newsgroups: gmane.emacs.bugs Subject: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful Date: Thu, 29 Sep 2022 12:48:34 -0400 Message-ID: References: <83h70qhez0.fsf@gnu.org> <83edvuhaby.fsf@gnu.org> <831qruh67o.fsf@gnu.org> <83y1u2foli.fsf@gnu.org> <83o7uyfh5o.fsf@gnu.org> Reply-To: Stefan Monnier Mime-Version: 1.0 Content-Type: text/plain Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="11903"; mail-complaints-to="usenet@ciao.gmane.io" User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux) Cc: gerd.moellmann@gmail.com, 58158@debbugs.gnu.org To: Eli Zaretskii Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Thu Sep 29 19:36:22 2022 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1odxSP-0002r8-UA for geb-bug-gnu-emacs@m.gmane-mx.org; Thu, 29 Sep 2022 19:36:22 +0200 Original-Received: from localhost ([::1]:53630 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1odxSO-00020n-Sn for geb-bug-gnu-emacs@m.gmane-mx.org; Thu, 29 Sep 2022 13:36:20 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:33434) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1odwic-0000LF-IC for bug-gnu-emacs@gnu.org; Thu, 29 Sep 2022 12:49:03 -0400 Original-Received: from debbugs.gnu.org ([209.51.188.43]:40367) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1odwic-00069j-9e for bug-gnu-emacs@gnu.org; Thu, 29 Sep 2022 12:49:02 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1odwic-0002Nf-4l for bug-gnu-emacs@gnu.org; Thu, 29 Sep 2022 12:49:02 -0400 X-Loop: help-debbugs@gnu.org Resent-From: Stefan Monnier Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Thu, 29 Sep 2022 16:49:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 58158 X-GNU-PR-Package: emacs Original-Received: via spool by 58158-submit@debbugs.gnu.org id=B58158.16644701289118 (code B ref 58158); Thu, 29 Sep 2022 16:49:02 +0000 Original-Received: (at 58158) by debbugs.gnu.org; 29 Sep 2022 16:48:48 +0000 Original-Received: from localhost ([127.0.0.1]:39443 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1odwiO-0002Mz-BA for submit@debbugs.gnu.org; Thu, 29 Sep 2022 12:48:48 -0400 Original-Received: from mailscanner.iro.umontreal.ca ([132.204.25.50]:15985) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1odwiJ-0002Mj-7R for 58158@debbugs.gnu.org; Thu, 29 Sep 2022 12:48:47 -0400 Original-Received: from pmg3.iro.umontreal.ca (localhost [127.0.0.1]) by pmg3.iro.umontreal.ca (Proxmox) with ESMTP id A642C442F98; Thu, 29 Sep 2022 12:48:37 -0400 (EDT) Original-Received: from mail01.iro.umontreal.ca (unknown [172.31.2.1]) by pmg3.iro.umontreal.ca (Proxmox) with ESMTP id 953F3442F94; Thu, 29 Sep 2022 12:48:35 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=iro.umontreal.ca; s=mail; t=1664470115; bh=eOh6g4WtUR/fEw2F4endAWNdlv/ADGvgWQKbYXwoF78=; h=From:To:Cc:Subject:In-Reply-To:References:Date:From; b=SBt3yjNNbLOF1Q6jiHWHYHiiiy1FZ5iTLi3pp5WM5rnkMHFe/hQDmo+kTjyTVGUZJ SmlBpQoPXYIhgtsZEj1WNRNEVWbKggv6wnJzzRvx+5HK48wcjcgOozhqS0STxJO5Zg g2XENSSxTXhds2WpasWOKjFeM3Ltuesci3iSP61+K877MPnhTyo4ZO/omGuo+n2B1d jpnUmnk5DUL1bQrqhbDbaT5o47zwGNCZ8RSJPyCINTMvT2u6/aT0RYOpAJbrhB1qf7 C1InNBNmQNmp9/Jx/eo9LNqk5EOqpOlO9zI3qlAWTast3dl3xsqzOvlGXi1qjGukm6 GslMM8ZVblwxg== Original-Received: from lechazo (lechon.iro.umontreal.ca [132.204.27.242]) by mail01.iro.umontreal.ca (Postfix) with ESMTPSA id 88E03120DA7; Thu, 29 Sep 2022 12:48:35 -0400 (EDT) In-Reply-To: <83o7uyfh5o.fsf@gnu.org> (Eli Zaretskii's message of "Thu, 29 Sep 2022 16:23:47 +0300") X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.io gmane.emacs.bugs:243927 Archived-At: >> > I may be missing something, but it looks like the sole purpose of the >> > iter_start/iter_finish dance is to ensure only one iteration per tree >> > is running at any given time, and that's because the iteration uses >> > some state variable(s) of which there's only one instance per tree. >> > >> > Stefan, am I missing something? >> >> One reason is that traversing a binary tree usually requires something >> like recursion, but that wouldn't fit very conveniently with the current >> code (nor with C in general since you can't make a local recursive >> closure which accesses local variables from the surrounding function). > > I'm not sure I understand how recursion is related. Are you saying > that recursion is replaced with iteration? But then, if _start and > _finish are called by the same caller, we don't really need the > protection, since no one can start another iteration while the first > one runs. Right? Typically, current code will look something like: int x; Lisp_Object y; buffer_overlay_iter_start (current_buffer, prev, pos, ITREE_DESCENDING); while ((node = buffer_overlay_iter_next (current_buffer))) { ... do something that updates x and y ... } buffer_overlay_iter_finish (current_buffer); If we were to use recursion, then we'd need to define a new (recursive) function which does what's currently done in the `while` loop, but this function can't access `x` and `y`, so it would need to take them as argument, or a reference to them... The use of an iterator is definitely convenient (and is a standard approach in many languages). >> Another is the need to update the begin/end fields (these need updating >> because of insertions/deletions but they're updated lazily while >> traversing the tree to avoid an O(N) complexity during the >> insertions/deletions). Hiding that behind 'some kind of "next node" >> function keeps the code more readable. > > Where in the code do you see iteration that adds or deletes nodes? No, I meant insertion/deletion of text in the buffer, thus requiring updates to `begin/end` fields. > Btw, couldn't we handle this by having a flag in the node that tells > us whether the begin/end fields can be trusted? Then the first caller > who need them would run the update and reset the flags, and we still > have lazy update, albeit somewhat less lazy, but without the need for > guarding the iteration with start/finish calls. Would that work? Yes, it would, but it's still O(N). The current approach is inspired by the approach used in `intervals.c` which also updates those fields lazily/ondemand so as to avoid the O(N) performance impact. >> For now, I pushed a simple fix to traverse the tree "by hand" in the GC >> rather than via the iterator. > So this removes the restriction of not having GC during iteration? Yes. > Also, I take it that you don't consider the current code is as > "harmful" as Gerd thinks it is? I don't like the global state it uses, but I think we can fix this aspect without too much trouble. > IOW, you don't share his opinion that this implementation is > a "no-go"? No, indeed, I don't. Stefan