From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: Stefan Monnier Newsgroups: gmane.emacs.devel Subject: Re: State of the overlay tree branch? Date: Fri, 23 Mar 2018 01:03:10 -0400 Message-ID: References: <834lldp18f.fsf@gnu.org> <9646341d-700b-4240-216b-8c0e753fa79f@arkona-technologies.de> <86d03e78-9984-f33e-a3f3-3faa4b34d78b@arkona-technologies.de> <83vadso9ad.fsf@gnu.org> <5155d5e2-6b5c-581e-89fe-4f3af717304f@arkona-technologies.de> <4c82fcbd-961a-c6ca-b1f0-6b85665cb339@arkona-technologies.de> NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: text/plain X-Trace: blaine.gmane.org 1521781291 5840 195.159.176.226 (23 Mar 2018 05:01:31 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Fri, 23 Mar 2018 05:01:31 +0000 (UTC) User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) To: emacs-devel@gnu.org Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Fri Mar 23 06:01:27 2018 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ezEpQ-0001Nh-Iz for ged-emacs-devel@m.gmane.org; Fri, 23 Mar 2018 06:01:24 +0100 Original-Received: from localhost ([::1]:36058 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ezErU-0006HC-06 for ged-emacs-devel@m.gmane.org; Fri, 23 Mar 2018 01:03:32 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:44295) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ezErO-0006CK-Ei for emacs-devel@gnu.org; Fri, 23 Mar 2018 01:03:27 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ezErK-00049p-Fc for emacs-devel@gnu.org; Fri, 23 Mar 2018 01:03:26 -0400 Original-Received: from [195.159.176.226] (port=57162 helo=blaine.gmane.org) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1ezErK-00049R-7z for emacs-devel@gnu.org; Fri, 23 Mar 2018 01:03:22 -0400 Original-Received: from list by blaine.gmane.org with local (Exim 4.84_2) (envelope-from ) id 1ezEpE-0001BV-Ny for emacs-devel@gnu.org; Fri, 23 Mar 2018 06:01:12 +0100 X-Injected-Via-Gmane: http://gmane.org/ Original-Lines: 92 Original-X-Complaints-To: usenet@blaine.gmane.org Cancel-Lock: sha1:4Dhm8Rh63aIIby/PB3rw8MfG+RU= X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 195.159.176.226 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.org gmane.emacs.devel:223932 Archived-At: > since noverlay performs so well, I guess the technical issue here is already > solved and I'll just have to wait for it to make it into the master > branch. Yes and no: there can still be many markers (even without having any overlays) and that poses the same problem (and the noverlays branch doesn't change that). I've made some further tests and found that the patch I sent indeed doesn't help tremendously with "distance += 10" but that when we reach "+= 50" the effect is more significant. Could you try the patch below (ideally not just with artificial tests but also in actual use) to see if it helps? Stefan diff --git a/src/marker.c b/src/marker.c index 7773c4fce0..6ab0d3d61e 100644 --- a/src/marker.c +++ b/src/marker.c @@ -133,6 +133,28 @@ CHECK_MARKER (Lisp_Object x) CHECK_TYPE (MARKERP (x), Qmarkerp, x); } +/* When converting bytes from/to chars, we look through the list of + markers to try and find a good starting point (since markers keep + track of both bytepos and charpos at the same time). + But if there are many markers, it can take too much time to find a "good" + marker from which to start. Worse yet: if it takes a long time and we end + up finding a nearby markers, we won't add a new marker to cache this + result, so next time around we'll have to go through this same long list + to (re)find this best marker. So the further down the list of + markers we go, the less demanding we are w.r.t what is a good marker. + + The previous code used INITIAL=50 and INCREMENT=0 and this lead to + really poor performance when there are many markers. + I haven't tried to tweak INITIAL, but my experiments on my trusty Thinkpad + T61 using various artificial test cases seem to suggest that INCREMENT=50 + might be "the best compromise": it significantly improved the + worst case and it was rarely slower and never by much. + + The asymptotic behavior is still poor, tho, so in largish buffers with many + overlays (e.g. 300KB and 30K overlays), it can still be a bottlneck. */ +#define BYTECHAR_DISTANCE_INITIAL 50 +#define BYTECHAR_DISTANCE_INCREMENT 50 + /* Return the byte position corresponding to CHARPOS in B. */ ptrdiff_t @@ -141,6 +163,7 @@ buf_charpos_to_bytepos (struct buffer *b, ptrdiff_t charpos) struct Lisp_Marker *tail; ptrdiff_t best_above, best_above_byte; ptrdiff_t best_below, best_below_byte; + ptrdiff_t distance = BYTECHAR_DISTANCE_INITIAL; eassert (BUF_BEG (b) <= charpos && charpos <= BUF_Z (b)); @@ -180,8 +203,10 @@ buf_charpos_to_bytepos (struct buffer *b, ptrdiff_t charpos) /* If we are down to a range of 50 chars, don't bother checking any other markers; scan the intervening chars directly now. */ - if (best_above - best_below < 50) + if (best_above - best_below < distance) break; + else + distance += BYTECHAR_DISTANCE_INCREMENT; } /* We get here if we did not exactly hit one of the known places. @@ -293,6 +318,7 @@ buf_bytepos_to_charpos (struct buffer *b, ptrdiff_t bytepos) struct Lisp_Marker *tail; ptrdiff_t best_above, best_above_byte; ptrdiff_t best_below, best_below_byte; + ptrdiff_t distance = BYTECHAR_DISTANCE_INITIAL; eassert (BUF_BEG_BYTE (b) <= bytepos && bytepos <= BUF_Z_BYTE (b)); @@ -323,8 +349,10 @@ buf_bytepos_to_charpos (struct buffer *b, ptrdiff_t bytepos) /* If we are down to a range of 50 chars, don't bother checking any other markers; scan the intervening chars directly now. */ - if (best_above - best_below < 50) + if (best_above - best_below < distance) break; + else + distance += BYTECHAR_DISTANCE_INCREMENT; } /* We get here if we did not exactly hit one of the known places.