From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: Stefan Monnier Newsgroups: gmane.emacs.devel Subject: Re: State of the overlay tree branch? Date: Fri, 23 Mar 2018 15:39:35 -0400 Message-ID: References: <834lldp18f.fsf@gnu.org> <9646341d-700b-4240-216b-8c0e753fa79f@arkona-technologies.de> <86d03e78-9984-f33e-a3f3-3faa4b34d78b@arkona-technologies.de> <83vadso9ad.fsf@gnu.org> <5155d5e2-6b5c-581e-89fe-4f3af717304f@arkona-technologies.de> <4c82fcbd-961a-c6ca-b1f0-6b85665cb339@arkona-technologies.de> <1ea4248a-11ce-00a9-0644-df7b2e5a3a58@arkona-technologies.de> <838tajhsau.fsf@gnu.org> <837eq2j2i5.fsf@gnu.org> NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable X-Trace: blaine.gmane.org 1521833867 28300 195.159.176.226 (23 Mar 2018 19:37:47 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Fri, 23 Mar 2018 19:37:47 +0000 (UTC) User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) Cc: emacs-devel@gnu.org To: Eli Zaretskii Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Fri Mar 23 20:37:43 2018 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ezSVS-0007Eo-Ko for ged-emacs-devel@m.gmane.org; Fri, 23 Mar 2018 20:37:42 +0100 Original-Received: from localhost ([::1]:39522 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ezSXU-00062y-6l for ged-emacs-devel@m.gmane.org; Fri, 23 Mar 2018 15:39:48 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:60527) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ezSXN-00062c-MV for emacs-devel@gnu.org; Fri, 23 Mar 2018 15:39:43 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ezSXK-0003UI-El for emacs-devel@gnu.org; Fri, 23 Mar 2018 15:39:41 -0400 Original-Received: from pruche.dit.umontreal.ca ([132.204.246.22]:58494) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ezSXK-0003U1-8u; Fri, 23 Mar 2018 15:39:38 -0400 Original-Received: from ceviche.home (lechon.iro.umontreal.ca [132.204.27.242]) by pruche.dit.umontreal.ca (8.14.7/8.14.1) with ESMTP id w2NJdafv017770; Fri, 23 Mar 2018 15:39:36 -0400 Original-Received: by ceviche.home (Postfix, from userid 20848) id 04F056637A; Fri, 23 Mar 2018 15:39:35 -0400 (EDT) In-Reply-To: <837eq2j2i5.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 23 Mar 2018 17:22:10 +0300") X-NAI-Spam-Flag: NO X-NAI-Spam-Threshold: 5 X-NAI-Spam-Score: 0 X-NAI-Spam-Rules: 2 Rules triggered EDT_SA_DN_PASS=0, RV6249=0 X-NAI-Spam-Version: 2.3.0.9418 : core <6249> : inlines <6517> : streams <1782057> : uri <2613559> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 132.204.246.22 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.org gmane.emacs.devel:223975 Archived-At: OK, I think I'm tired of these experiments. Here's my current test, along with the patches I use with it. You can test it with something like src/emacs -Q --batch -l ~/tmp/foo.el --eval '(setq internal--bytechar-d= istance-increment 50 internal--randomize-markers t)' --eval '(bytechar-test= 3000 nil)'=20 Shuffling the markers can make a noticeable difference in some cases but we're only talking about a factor of 2 or 3. It doesn't have much negative impact, so it's not a bad option, but the algorithmic problem remains anyway. Stefan (defun bytechar-test (buffer-kb &optional forward) (random "seed") (with-temp-buffer (dotimes (i (* buffer-kb 33)) (insert "lksajflahalskjdf=E9efawrgfr=FCegf\n")) (message "buffer-size =3D %SkB" (/ (buffer-size) 1024)) (let ((txtbuf (current-buffer)) (goto-iterations (/ 10000000 buffer-kb)) (line-iterations (/ 20000 buffer-kb)) (markers ())) (dotimes (s 4) (with-temp-buffer (insert-buffer-substring txtbuf) (let ((stepsize (lsh 10 (* 4 s)))) (if forward (cl-loop for n from (point-min) upto (point-max) by stepsiz= e do (push (copy-marker n) markers)) (cl-loop for n from (point-max) downto (point-min) by stepsiz= e do (push (copy-marker n) markers)))) ;; The GC's internal--randomize-markers just brings-to-front every ;; 8th marker, so when starting with an ordered list of markers (= like ;; in our case), we need to run the GC at least 8 times before the ;; whole list starts to look somewhat shuffled. (dotimes (i 20) (garbage-collect)) (let ((timing (benchmark-run goto-iterations (goto-char (+ (point-min) (random (buffer-size))))))) (message "ols=3D%S goto-random time=3D%.4f (+ %S)" (/ (buffer-size) (lsh 10 (* 4 s))) (car timing) (cdr timing))) (garbage-collect) ;throw away the transient markers (let ((timing (benchmark-run line-iterations (dotimes (i 5) (line-number-at-pos (+ (point-min) (* i (/ (buffer-size) 4)))))))) (message "nbmks=3D%S pos=3D*/4 time=3D%.4f (+ %S)" (/ (buffer-size) (lsh 10 (* 4 s))) (car timing) (cdr timing))) (dotimes (i 5) (let ((timing (benchmark-run line-iterations (line-number-at-pos (+ (point-min) (* i (/ (buffer-size) 4))))))) (message "nbmks=3D%S pos=3D%S/4 time=3D%.4f (+ %S)" (/ (buffer-size) (lsh 10 (* 4 s))) i (car timing) (cdr timing)))) ))))) diff --git a/lisp/emacs-lisp/benchmark.el b/lisp/emacs-lisp/benchmark.el index b86b56b81e..2f4e38fe35 100644 --- a/lisp/emacs-lisp/benchmark.el +++ b/lisp/emacs-lisp/benchmark.el @@ -50,7 +50,7 @@ benchmark-run garbage collections that ran, and the time taken by garbage collection. See also `benchmark-run-compiled'." (declare (indent 1) (debug t)) - (unless (natnump repetitions) + (unless (or (natnump repetitions) (symbolp repetitions)) (setq forms (cons repetitions forms) repetitions 1)) (let ((i (make-symbol "i")) @@ -58,7 +58,7 @@ benchmark-run (gc (make-symbol "gc"))) `(let ((,gc gc-elapsed) (,gcs gcs-done)) - (list ,(if (> repetitions 1) + (list ,(if (or (symbolp repetitions) (> repetitions 1)) ;; Take account of the loop overhead. `(- (benchmark-elapse (dotimes (,i ,repetitions) ,@forms)) @@ -101,7 +101,7 @@ benchmark For non-interactive use see also `benchmark-run' and `benchmark-run-compiled'." (interactive "p\nxForm: ") - (let ((result (eval `(benchmark-run ,repetitions ,form)))) + (let ((result (eval `(benchmark-run ,repetitions ,form) t))) (if (zerop (nth 1 result)) (message "Elapsed time: %fs" (car result)) (message "Elapsed time: %fs (%fs in %d GCs)" (car result) diff --git a/src/alloc.c b/src/alloc.c index 7ba872aaee..16d11e34cd 100644 --- a/src/alloc.c +++ b/src/alloc.c @@ -7270,10 +7270,22 @@ static void unchain_dead_markers (struct buffer *buffer) { struct Lisp_Marker *this, **prev =3D &BUF_MARKERS (buffer); + /* In order to try and avoid worst case behaviors in buf_charpos_to_byte= pos + we try and randomize the order of markers here. */ + unsigned i =3D 4; =20 while ((this =3D *prev)) if (this->gcmarkbit) - prev =3D &this->next; + { + if (!randomize_markers || i++ % 8) + prev =3D &this->next; + else + { /* Move this one to front, just to randomize things a bit. */ + *prev =3D this->next; + this->next =3D BUF_MARKERS (buffer); + BUF_MARKERS (buffer) =3D this; + } + } else { this->buffer =3D NULL; @@ -7752,6 +7764,9 @@ The time is in seconds as a floating point value. */= ); DEFVAR_INT ("gcs-done", gcs_done, doc: /* Accumulated number of garbage collections done. */); =20 + DEFVAR_BOOL ("internal--randomize-markers", randomize_markers, doc: /* = */); + randomize_markers =3D true; +=20=20 defsubr (&Scons); defsubr (&Slist); defsubr (&Svector); diff --git a/src/marker.c b/src/marker.c index 3d808fd6fa..7c1d164927 100644 --- a/src/marker.c +++ b/src/marker.c @@ -133,6 +133,28 @@ CHECK_MARKER (Lisp_Object x) CHECK_TYPE (MARKERP (x), Qmarkerp, x); } =20 +/* When converting bytes from/to chars, we look through the list of + markers to try and find a good starting point (since markers keep + track of both bytepos and charpos at the same time). + But if there are many markers, it can take too much time to find a "goo= d" + marker from which to start. Worse yet: if it takes a long time and we = end + up finding a nearby markers, we won't add a new marker to cache this + result, so next time around we'll have to go through this same long list + to (re)find this best marker. So the further down the list of + markers we go, the less demanding we are w.r.t what is a good marker. + + The previous code used INITIAL=3D50 and INCREMENT=3D0 and this lead to + really poor performance when there are many markers. + I haven't tried to tweak INITIAL, but my experiments on my trusty Think= pad + T61 using various artificial test cases seem to suggest that INCREMENT= =3D50 + might be "the best compromise": it significantly improved the + worst case and it was rarely slower and never by much. + + The asymptotic behavior is still poor, tho, so in largish buffers with = many + overlays (e.g. 300KB and 30K overlays), it can still be a bottlneck. */ +#define BYTECHAR_DISTANCE_INITIAL 50 +#define BYTECHAR_DISTANCE_INCREMENT bytechar_distance_increment + /* Return the byte position corresponding to CHARPOS in B. */ =20 ptrdiff_t @@ -141,7 +163,7 @@ buf_charpos_to_bytepos (struct buffer *b, ptrdiff_t cha= rpos) struct Lisp_Marker *tail; ptrdiff_t best_above, best_above_byte; ptrdiff_t best_below, best_below_byte; - ptrdiff_t distance =3D 50; + ptrdiff_t distance =3D BYTECHAR_DISTANCE_INITIAL; =20 eassert (BUF_BEG (b) <=3D charpos && charpos <=3D BUF_Z (b)); =20 @@ -184,7 +206,7 @@ buf_charpos_to_bytepos (struct buffer *b, ptrdiff_t cha= rpos) if (best_above - best_below < distance) break; else - distance++; + distance +=3D BYTECHAR_DISTANCE_INCREMENT; } =20 /* We get here if we did not exactly hit one of the known places. @@ -296,7 +318,7 @@ buf_bytepos_to_charpos (struct buffer *b, ptrdiff_t byt= epos) struct Lisp_Marker *tail; ptrdiff_t best_above, best_above_byte; ptrdiff_t best_below, best_below_byte; - ptrdiff_t distance =3D 50; + ptrdiff_t distance =3D BYTECHAR_DISTANCE_INITIAL; =20 eassert (BUF_BEG_BYTE (b) <=3D bytepos && bytepos <=3D BUF_Z_BYTE (b)); =20 @@ -330,7 +352,7 @@ buf_bytepos_to_charpos (struct buffer *b, ptrdiff_t byt= epos) if (best_above - best_below < distance) break; else - distance++; + distance +=3D BYTECHAR_DISTANCE_INCREMENT; } =20 /* We get here if we did not exactly hit one of the known places. @@ -756,4 +778,9 @@ syms_of_marker (void) defsubr (&Scopy_marker); defsubr (&Smarker_insertion_type); defsubr (&Sset_marker_insertion_type); + + DEFVAR_INT ("internal--bytechar-distance-increment", + bytechar_distance_increment, + doc: /* Haha */); + bytechar_distance_increment =3D 50; }