From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Thierry Volpiatto Newsgroups: gmane.emacs.bugs Subject: bug#63861: [PATCH] pp.el: New "pretty printing" code Date: Fri, 09 Jun 2023 05:22:31 +0000 Message-ID: <878rctkwvh.fsf@posteo.net> References: <87edmnics1.fsf@posteo.net> <87bkhpvm35.fsf@posteo.net> Mime-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha512; protocol="application/pgp-signature" Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="14228"; mail-complaints-to="usenet@ciao.gmane.io" Cc: 63861@debbugs.gnu.org To: Stefan Monnier Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Fri Jun 09 07:51:27 2023 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1q7V1z-0003SY-FW for geb-bug-gnu-emacs@m.gmane-mx.org; Fri, 09 Jun 2023 07:51:27 +0200 Original-Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1q7V1m-0003DN-S2; Fri, 09 Jun 2023 01:51:14 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1q7V1a-000386-Mu for bug-gnu-emacs@gnu.org; Fri, 09 Jun 2023 01:51:11 -0400 Original-Received: from debbugs.gnu.org ([209.51.188.43]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1q7V1a-0006GN-EG for bug-gnu-emacs@gnu.org; Fri, 09 Jun 2023 01:51:02 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1q7V1Z-0003Sx-Qq for bug-gnu-emacs@gnu.org; Fri, 09 Jun 2023 01:51:01 -0400 X-Loop: help-debbugs@gnu.org Resent-From: Thierry Volpiatto Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 09 Jun 2023 05:51:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 63861 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch Original-Received: via spool by 63861-submit@debbugs.gnu.org id=B63861.168628982813281 (code B ref 63861); Fri, 09 Jun 2023 05:51:01 +0000 Original-Received: (at 63861) by debbugs.gnu.org; 9 Jun 2023 05:50:28 +0000 Original-Received: from localhost ([127.0.0.1]:58180 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1q7V11-0003S6-MA for submit@debbugs.gnu.org; Fri, 09 Jun 2023 01:50:28 -0400 Original-Received: from mout01.posteo.de ([185.67.36.65]:59149) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1q7V0v-0003Rf-DX for 63861@debbugs.gnu.org; Fri, 09 Jun 2023 01:50:25 -0400 Original-Received: from submission (posteo.de [185.67.36.169]) by mout01.posteo.de (Postfix) with ESMTPS id B9297240027 for <63861@debbugs.gnu.org>; Fri, 9 Jun 2023 07:50:13 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=posteo.net; s=2017; t=1686289813; bh=33UgKrqOPZMLxRVGjw4vLWW3hvTpCGHol0iUUYNEghI=; h=From:To:Cc:Subject:Date:Message-ID:MIME-Version:From; b=qyrgmEpBhc1wkerCLU9R/GhzRjgLZSXR4PUia+Fftg0DhMfQ4l4u3GASZkcq/Jidt l0U2uIVLcIpUSSJEe8U8To5gKe2hGlfLwW6KAU20nzslT7+h8NHW9rpN4nt8jXbJKR HI3Cxm+pj33p9+jClDzwnx+V6nvkEDEP9geyXEFFil9oR0vXCUsnGX8cPEHRNDPgMU XUjHPgipSYn8zTrnWnVTp93hL7kgYfHzH3Hh8aO+paF3s1+MeKNGOfZFQF91vX/cUG xxyT7130/8AgXNn5LTBy/ZN6OVE7z+jcpj1twKsNuPpXp/w/fqvZGhNJMj8N8wB8Gp 8hQrqEzOiEE0w== Original-Received: from customer (localhost [127.0.0.1]) by submission (posteo.de) with ESMTPSA id 4QcqtD52cnz6tsf; Fri, 9 Jun 2023 07:50:12 +0200 (CEST) In-reply-to: X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Original-Sender: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Xref: news.gmane.io gmane.emacs.bugs:263154 Archived-At: --=-=-= Content-Type: text/plain Content-Transfer-Encoding: quoted-printable Stefan Monnier writes: >>> I find these results (mine) quite odd: they suggest that my `pp-region` >>> is *faster* than the old `pp-buffer` for `load-history` and `bookmark-a= list` >>> data, which I definitely did not expect (and don't know how to explain >>> either). > > I've just redone my tests a bit differently, added `pp-emacs-lisp-code`, > and also introduced a var to control whether to activate the `lisp-ppss` > patch or not. I also fixed my `foo.el` file: its content was > accidentally already pretty printed rather than being on a single line, > which totally changes the behavior of `pp-region` and `pp-buffer`). > > For reference: > > % (cd ~/tmp; l foo.el test*.el) > -rw------- 1 monnier monnier 1125154 8 jun 11:20 test-load-history.el > -rw------- 1 monnier monnier 163258 8 jun 11:20 test-bookmark-alist= .el > -rw-r--r-- 1 monnier monnier 77101 8 jun 17:20 foo.el > % > > Here's the code I used to run the test: > > for f in ~/tmp/foo.el ~/tmp/test-bookmark-alist.el ~/tmp/test-load-hi= story.el; do for ppss in nil t; do for v in '(pp-buffer)' '(pp-region (poin= t-min) (point-max))' '(tv/pp-region (point-min) (point-max))' '(let ((s (re= ad (current-buffer)))) (erase-buffer) (pp-emacs-lisp-code s))'; do src/emac= s -Q --batch -l ~/tmp/describe-variable --eval "(with-temp-buffer (emacs-li= sp-mode) (insert-file-contents \"$f\") (setq pp-buffer-use-pp-region nil li= sp--faster-ppss $ppss) (message \"%S %S %S %S\" (file-name-nondirectory \"$= f\") (benchmark-run $v) '$v '$ppss))"&; done; done; done > > So, by file, from fastest to slowest: > > foo.el (0.859482743 0 0.0) (pp-buffer) t > foo.el (0.890402623 0 0.0) (pp-buffer) nil > foo.el (4.62344853 4 1.7225397670000002) (tv/pp-region (point-min) (p= oint-max)) t > foo.el (4.687414465 4 1.7116580980000002) (tv/pp-region (point-min) (= point-max)) nil > foo.el (7.932661181 1 0.3435169600000001) (pp-region (point-min) (poi= nt-max)) t > foo.el (196.183345212 1 0.618591124) (pp-region (point-min) (point-ma= x)) nil > foo.el (2997.739238575 505 105.82851685700001) (let ((s (read (curren= t-buffer)))) (erase-buffer) (pp-emacs-lisp-code s)) t > > Here we see that my `pp-region` code is slower than `pp-buffer` by > a factor ~10x: I'm not very happy about it, but this `foo.el` file was > selected because it was the worst case I had come across (tho that was > before I found the `lisp-ppss` patch). > > The last element in each line is whether we activated the `lisp-ppss` > patch. As we can see here, the `lisp-ppss` patch makes an enormous > difference (~24x) for my code, but not for `pp-buffer` (because it > relies on `lisp-indent-region` rather than `lisp-indent-line`) and not > for `tv/pp-region` either (because it doesn't indent at all). > > We also see that `pp-emacs-lisp-code` is *much* slower. I don't include > other results for this function in this email because they're still > running :-) > > test-bookmark-alist.el (13.237991019999999 6 2.403892035) (tv/pp-regi= on (point-min) (point-max)) nil > test-bookmark-alist.el (14.853056353 6 2.705511935) (tv/pp-region (po= int-min) (point-max)) t > test-bookmark-alist.el (28.059658418 5 2.005039257) (pp-region (point= -min) (point-max)) t > test-bookmark-alist.el (180.390204026 5 2.1182066760000002) (pp-regio= n (point-min) (point-max)) nil > test-bookmark-alist.el (265.95429676599997 10 4.445954908) (pp-buffer= ) t > test-bookmark-alist.el (268.975666886 10 3.6774180120000004) (pp-buff= er) nil > > Here we see that my `pp-region` code can be faster (even significantly > so) than `pp-buffer`. I'm not sure why, but I'll take the win :-) > We also see that the faster `lisp-ppss` again makes an important > difference in the performance of `pp-region` (~8x), tho the effect is > not as drastic as in the case of `foo.el`. > > test-load-history.el (235.134082197 8 4.440112806999999) (tv/pp-regio= n (point-min) (point-max)) nil > test-load-history.el (235.873981253 8 4.416064476) (tv/pp-region (poi= nt-min) (point-max)) t > test-load-history.el (506.770548196 31 9.706665932) (pp-region (point= -min) (point-max)) t > test-load-history.el (701.991875274 43 12.390212449) (pp-buffer) t > test-load-history.el (710.843618646 43 12.156289354) (pp-buffer) nil > test-load-history.el (1419.268184021 36 9.260999640000001) (pp-region= (point-min) (point-max)) nil > > Here again, we see that `pp-region` is competitive with `pp-buffer` and > the `lisp-ppss` patch speeds it up significantly (~3x). > > Another thing we see in those tests is that `pp-region` (with the > `lisp-ppss` patch) is ~2x slower than `tv/pp-region`, whereas the > performance differential with `pp-buffer` varies a lot more. Also if we > compare the time it takes to the size of the file, we see: > > 77101B / 7.932661181s =3D 9719 B/s > 163258B / 28.059658418s =3D 5818 B/s > 1125154B / 506.770548196s =3D 2220 B/s > > `pp-region`s performance is not quite linear in the size of the file :-( > Interestingly, the same holds for `tv/pp-region`: > > 77101B / 4.62344853s =3D 16676 B/s > 163258B / 13.237991019s =3D 12332 B/s > 1125154B / 235.134082197s =3D 4785 B/s > > even though it works in a fundamentally very different way (which, to > my naive eye should result in a linear performance), so maybe the > slowdown here is due to something external (such as the fact that > various operations on buffers get slower as the buffer gets bigger). > >> hmm, don't know, I ran pp-buffer with M-: from the test-load-history.el = or the >> test-bookmark-alist.el buffer. May be using emacs --batch makes a >> difference? > > I don't see any significant performance difference between batch and > non-batch :-( > >> is the data really printed in such case? > > Yes, definitely. > >> More or less the code using pp-region takes between 42 to 48s and the one >> with old pp-buffer around 6s. > > I wonder why we see such wildly different performance. In my tests on > your `test-bookmark-alist.el` I basically see the reverse ratio! > >> Also sorry about your last patch I tested it too fast, it is indeed >> slightly faster, but not much: 42 vs 46s. > > This is also perplexing. In my tests, the patch has a very significant > impact on the performance of `pp-region`. > Are you sure the patch is used (`lisp-mode.el` is preloaded, so you need > to re-dump Emacs, or otherwise manually force-reload `lisp-mode.elc` > into your Emacs session)? No, I just C-M-x lisp-ppss, I will try today to recompile an emacs with your patchs applied and see if it makes a difference. I also use emacs-28, will try with 30. > FWIW, I'm running my tests on Emacs's `master` branch with the native > ELisp compiler enabled (tho I don't see much difference in performance > on these tests when running my other Emacs build without the native > compiler) on an i386 Debian testing system. I don't use native compilation. >>> And do I understand correctly that `tv/pp-region` does not indent its >>> output? >> No, it does indent, see function tv/pp which use pp-to-string which use = pp-buffer >> and pp-buffer indent the whole sexp at the end. > > AFAICT `tv/pp` uses `pp-to-string` only on "atomic" values (i.e. not > lists, vectors, or hash-tables), so there's usually not much to indent in= there. > What I see in the output after calling `tv/pp-region` are non-indented li= sts. Hmm maybe, seems it was similar to pp-buffer but perhaps I didn't look carefully. >>> What was the reason for this choice? >> Because indentation is done at the end by pp-buffer. > > When I use `describe-variable` with your code, the printed value is > indeed indented, but that code uses only `pp-buffer` and not > `tv/pp-region` (i.e. `tv/describe-variable` does not call > `tv/pp-region`, neither directly nor indirectly). Yes you have to run manually tv/pp-value-in-help when you need to read the value of a variable (unless it is a small var). >> PS (unrelated to pp-region): About the old pp-buffer, there is >> a difficult to find bug where the same operation is running twice >> (newline is added, then removed, then added again and then the loop >> continue) >> >> You can see it by edebugging pp-buffer on a simple alist like this: >> >> (defvar bar '((1 . "un") (2 . "deux") (3 . "trois") (4 . "quatre") (5 . = "cinq") (6 . "six"))) > > That might be part of the poor performance we see on > `test-bookmark-alist.el`? Not sure it makes a big difference but for sure it is slower to defeat the operation done in previous turn and do it again. > > Stefan =2D-=20 Thierry --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQHHBAEBCgAxFiEEI9twfRN7r3nig/xwDsVtFB0W75MFAmSCvZITHHRoaWV2b2xA cG9zdGVvLm5ldAAKCRAOxW0UHRbvk1fjC/wKGtKtEctIoIOhvTh5fauCQymCkEu3 gvxDZPmEMTnhui1u77KHOdjR/lYQKzknNRLHz/i7iWtTHg01H2Je/QlFOjddnaja VbXq4ZMsPiyqkp/9GjHn9kz3Fmmg6Z8qx6AAeq17ZUH9eE1nKM9hyJDsru9G3/OZ ilagbYDjRIkTTbUsYj4h1emR+nRiiWykgeKvvcupXiizeiPeFvQZsyqmPxar8vW/ K7dBElSDGhN1SofSU+Q7z/cA6qvadd1Ms81tzvTSkfs0BaBpgLd4+SPonuuCLRVy MmeSSJehxGItsmmBJq8cZGXG0lk06gsZ0K3AlwUdesxhUyPoUSojcPKVtdO4jisB i1aWKFojbacesLbwVssUBbvVFGX0dv8t4M3L8XJbyjnbV+LVh3b2djNi7L9NW19n cxqy+S1E30jaoC1qu6azEET/7sHYshQYMnrfbH6+XtpW4iKvPwYgHJIDvf5nF0+f Br3qF4C7i8wRFHLmuNRxZuHAGjnuDsaY/FQ= =xvmp -----END PGP SIGNATURE----- --=-=-=--