From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Ihor Radchenko Newsgroups: gmane.emacs.devel Subject: Optimizing performance of buffer markers (was: Exposing buffer text modifications to Lisp) Date: Sat, 25 Jun 2022 12:47:38 +0800 Message-ID: <87edzdwdqt.fsf@localhost> References: <2c2746e5f2558a87e8eab6f0914264a020173a9d.camel@pm.me> <27630AA3-8026-4E24-8852-ACCD9325B99D@gmail.com> <0E9E702B-B07C-4794-8498-29B9320E14CC@gmail.com> <871qvorqvv.fsf@localhost> <83tu8jq2vl.fsf@gnu.org> <87sfo37etn.fsf@localhost> <834k0jplcm.fsf@gnu.org> <878rpuwm9w.fsf@localhost> <83mteao3oj.fsf@gnu.org> <87edzmv3i0.fsf@localhost> <83k09eo1p5.fsf@gnu.org> <878rpuv17q.fsf@localhost> <83fsk2nyrm.fsf@gnu.org> <878rpr4kd4.fsf@localhost> <8335fzms6q.fsf@gnu.org> <87wndar5tt.fsf@localhost> <831qvikxqo.fsf@gnu.org> Mime-Version: 1.0 Content-Type: text/plain Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="24232"; mail-complaints-to="usenet@ciao.gmane.io" Cc: casouri@gmail.com, emacs-devel@gnu.org To: Eli Zaretskii Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Sat Jun 25 06:47:15 2022 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1o4xhS-00067e-Lq for ged-emacs-devel@m.gmane-mx.org; Sat, 25 Jun 2022 06:47:14 +0200 Original-Received: from localhost ([::1]:45976 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1o4xhR-0006qN-5E for ged-emacs-devel@m.gmane-mx.org; Sat, 25 Jun 2022 00:47:13 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:35290) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o4xgk-00064Z-IN for emacs-devel@gnu.org; Sat, 25 Jun 2022 00:46:30 -0400 Original-Received: from mail-pl1-x62d.google.com ([2607:f8b0:4864:20::62d]:40805) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1o4xgi-0002zC-SF; Sat, 25 Jun 2022 00:46:30 -0400 Original-Received: by mail-pl1-x62d.google.com with SMTP id k7so3712291plg.7; Fri, 24 Jun 2022 21:46:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:cc:subject:in-reply-to:references:date:message-id :mime-version; bh=u0KxgcqzK++xEAjBQCu7ac1usYjgolQZaHSHoiaDXnI=; b=XH+SSKCqt+2zug17+EeYNzvBB6dWCP13RiU8lqklymktBkiXbRfaHOuhOz/RBUpiMO NT6rxWqRJ8EMzxq3qjfOyEFqqReI0s0nB56Q4+Z9R4oykh1wwn+yYNvxWrCwmAA/RrU3 PFCYvK/szjJKqpKZPFe9bjAnFmAiS3VN+AGZU/5UMaJaXgRvQU53MHCPq8kABJYuEfbA A7/Cx+a4eV6cb+gvMy2BoTScHSUwCbBiNQFwoxpYqwteWHbd8P5ua6ZmYvgMCRNcCWRN NW83N3vAK9xnVQkJcgUvRMY170vHPW9wWps4pPeg8X2R5vKqTxQCd/E3GxGAaTT7KXD6 YRTA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:in-reply-to:references:date :message-id:mime-version; bh=u0KxgcqzK++xEAjBQCu7ac1usYjgolQZaHSHoiaDXnI=; b=1p13dyDQIfIOhw0/w2czva3uT9N6sCg19BBqUt4fBp87liJr5eWPlIAhSRQyPOCivv nenB6j1FwKNSbRxVZGKl3YVlnkxHWiBPL3NvRTzGU57cRuSJb0h6avOq7v5ivQUu0Y6k ZE1oZWTd4Xat/rUbLlgEyATtR+DdgZOWa6+x4ZA1B/sTyt0W569DC+Qn3ffxsXOqlrwU LGx90+a8MDlDcGEOgoL0Z/6zdjdLvLB699UKmZL397qYcc0cXXeGJK+SEF2v65o/eGNk 7HDPGHZi8PVWqlipWd3lDsTeFX+gf4aR3BVn5eiD49QIcxdLcIhapRgJxZO+xiCArkDG pPoQ== X-Gm-Message-State: AJIora8g1y9o2PmpSYhptT7ICwbWJv1v6aBMoOiuFzLmN88es+qggFDo FdnONKVCLvgFx7Q1Ok3BcM8sxqL9kt9ilb9J X-Google-Smtp-Source: AGRyM1sW9KsLJwOVS//99JQdzIU07SHM/vSSKy/6d+t4szVKiOlGdVP6bc6ELs339eQKSK5yXhpImA== X-Received: by 2002:a17:90b:4a4c:b0:1ec:9072:6a72 with SMTP id lb12-20020a17090b4a4c00b001ec90726a72mr2639062pjb.96.1656132386760; Fri, 24 Jun 2022 21:46:26 -0700 (PDT) Original-Received: from localhost ([192.161.177.252]) by smtp.gmail.com with ESMTPSA id o13-20020a170903210d00b00168a216f629sm2662371ple.11.2022.06.24.21.46.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 Jun 2022 21:46:26 -0700 (PDT) In-Reply-To: <831qvikxqo.fsf@gnu.org> Received-SPF: pass client-ip=2607:f8b0:4864:20::62d; envelope-from=yantar92@gmail.com; helo=mail-pl1-x62d.google.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.io gmane.emacs.devel:291575 Archived-At: I think that we settled something workable regarding buffer modifications. I will try the proposed solutions and see if issues pop up on Org ML. Changing subject to reflect the remaining point of the discussion better. Eli Zaretskii writes: >> Clarification: I was asking about C-level trees to store marker list. >> I did not have moving Org AST from Lisp to C-level in mind. We currently >> use built-in Lisp implementation of AVL-tree to search across AST (which >> is not ideal, but good enough for moderately large files). > > Ah, okay. Sorry for my misunderstanding. > > Trees could indeed be relevant, but maybe other data structures as > well? E.g., why not hash tables? Not that I consider myself an > expert on efficient search algorithms... AFAIU, buf_bytepos_to_charpos tries to search for the closest marker near the requested bytepos. It currently does it using the following heuristics (roughly): (let ((threshold 50)) (dolist (marker markers) (if (or (< (abs (- marker bytepos)) threshold) (< (abs (- nearest_previous_marker bytepos)) threshold)) (throw 'found marker) (cl-incf threshold 50)))) If we store markers in a hash table, there will be no benefit - Hash table will only allow to find marker at exact position, not nearby. AFAIK, the most natural data structure to search for data before/after given key is a binary tree. There are more exotic data structures, like skip list, but I do not expect skip lists to be implemented in Emacs C code. Best, Ihor