From: Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Newsgroups: gmane.emacs.bugs
Subject: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Thu, 08 Jun 2023 12:15:47 -0400
References: <87edmnics1.fsf@posteo.net>
In-Reply-To: <87edmnics1.fsf@posteo.net> (Thierry Volpiatto's message of "Wed, 07 Jun 2023 14:10:22 +0000")
To: Thierry Volpiatto
Cc: 63861@debbugs.gnu.org

>> The new code is in a new function `pp-region`.
>> The old code redirects to the new code if `pp-buffer-use-pp-region` is
>> non-nil, tho I'm not sure we want to bother users with such
>> a config var. Hopefully, the new code should be good enough that users
>> don't need to choose. Maybe I should make it a `defvar` and have it
>> default to t, so new users will complain if it's not good enough?
>
> I tried your code and it looks very slow (but looks nice once printed).
> Testing on my bookmark-alist printed in some buffer.
> Here with a slightly modified version of pp-buffer (not much faster than
> the original one):
>
> (benchmark-run-compiled 1 (pp-buffer))
> => (6.942135047 0 0.0)
> And here with your version (using pp-region):
> (benchmark-run-compiled 1 (pp-buffer))
> => (46.141411097 0 0.0)

Hmm... that's weird. With the bookmark-alist.el file you sent me, I get
pretty much the reverse result:

    % for f in foo.el test-load-history.el test-bookmark-alist.el; do \
        for v in nil t; do \
          time src/emacs -Q --batch "$f" -l pp \
               --eval "(progn (setq pp-buffer-use-pp-region $v) \
                         (message \"%s %s %S\" (file-name-nondirectory buffer-file-name) \
                                  pp-buffer-use-pp-region \
                                  (benchmark-run (pp-buffer))))"; \
      done; done
    foo.el nil (0.210123295 1 0.057426385)
    src/emacs -Q --batch "$f" -l pp --eval  0.34s user 0.04s system 99% cpu 0.382 total
    foo.el t (0.07107641199999999 0 0.0)
    src/emacs -Q --batch "$f" -l pp --eval  0.19s user 0.03s system 99% cpu 0.222 total
    test-load-history.el nil (156.07942386099998 17 1.432754161)
    src/emacs -Q --batch "$f" -l pp --eval  156.04s user 0.20s system 99% cpu 2:36.26 total
    test-load-history.el t (96.480110987 24 1.9799413479999999)
    src/emacs -Q --batch "$f" -l pp --eval  96.40s user 0.27s system 99% cpu 1:36.69 total
    test-bookmark-alist.el nil (51.211047973 8 0.6690439610000001)
    src/emacs -Q --batch "$f" -l pp --eval  51.29s user 0.11s system 99% cpu 51.401 total
    test-bookmark-alist.el t (5.110458941 6 0.468187075)
    src/emacs -Q --batch "$f" -l pp --eval  5.21s user 0.09s system 99% cpu 5.302 total
    %

This is comparing "the old pp-buffer" with "the new `pp-region`", using
the patch below. This is not using your `tv/pp-region` code.
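
For reference, the ratios implied by the timings above can be worked out
with a short shell sketch. The helper name `speedup` is made up for this
illustration and is not part of the patch; it only divides the elapsed
time (the first element of each `benchmark-run` result) of the old code
by that of the new code:

```shell
# Hypothetical helper: ratio of old elapsed time to new elapsed time.
# A value above 1 means the new `pp-region` path was faster in that run.
speedup() {
  awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", old / new }'
}

# Elapsed times copied from the `benchmark-run` results quoted above
# (old code with pp-buffer-use-pp-region nil, new code with it set to t).
printf 'foo.el                 %sx\n' "$(speedup 0.210123295 0.07107641199999999)"
printf 'test-load-history.el   %sx\n' "$(speedup 156.07942386099998 96.480110987)"
printf 'test-bookmark-alist.el %sx\n' "$(speedup 51.211047973 5.110458941)"
```

With these inputs the ratios come out at about 3.0 for foo.el, 1.6 for
test-load-history.el and 10.0 for test-bookmark-alist.el.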

I find these results (mine) quite odd: they suggest that my `pp-region`
is *faster* than the old `pp-buffer` for `load-history` and
`bookmark-alist` data, which I definitely did not expect (and don't know
how to explain either).

These tests were run on a machine whose CPU's speed can vary quite
drastically depending on the load, so take those numbers with a grain of
salt, but the dynamic frequency fluctuations shouldn't cause more than
a factor of 2 difference (and according to my CPU frequency monitor
widget, the frequency was reasonably stable during the test).

> For describe variable I use a modified version of `pp` which is very
> fast (nearly instant to pretty print value above) but maybe unsafe with
> some vars, didn't have any problems though, see
> https://github.com/thierryvolpiatto/emacs-config/blob/main/describe-variable.el.

So, IIUC the numbers you cite above compare my `pp-region` to your
`tv/pp-region`, right? And do I understand correctly that `tv/pp-region`
does not indent its output? What was the reason for this choice?


        Stefan


PS: BTW, looking at the output of `pp` on the bookmark-data, it's not
a test where `pp-region` shines: the old pp uses up more lines, but is
more regular and arguably more readable :-(


diff --git a/lisp/emacs-lisp/lisp-mode.el b/lisp/emacs-lisp/lisp-mode.el
index d44c9d6e23d..9914ededb85 100644
--- a/lisp/emacs-lisp/lisp-mode.el
+++ b/lisp/emacs-lisp/lisp-mode.el
@@ -876,7 +876,7 @@ lisp-ppss
 2 (counting from 0). This is important for Lisp indentation."
   (unless pos (setq pos (point)))
   (let ((pss (syntax-ppss pos)))
-    (if (nth 9 pss)
+    (if (and (not (nth 2 pss)) (nth 9 pss))
         (let ((sexp-start (car (last (nth 9 pss)))))
           (parse-partial-sexp sexp-start pos nil nil (syntax-ppss sexp-start)))
       pss)))
diff --git a/lisp/emacs-lisp/pp.el b/lisp/emacs-lisp/pp.el
index e6e3cd6c6f4..89d7325a491 100644
--- a/lisp/emacs-lisp/pp.el
+++ b/lisp/emacs-lisp/pp.el
@@ -74,31 +74,127 @@ pp-to-string
       (pp-buffer)
       (buffer-string))))
 
+(defun pp--within-fill-column-p ()
+  "Return non-nil if point is within `fill-column'."
+  ;; Try and make it O(fill-column) rather than O(current-column),
+  ;; so as to avoid major slowdowns on long lines.
+  ;; FIXME: This doesn't account for invisible text or `display' properties :-(
+  (and (save-excursion
+         (re-search-backward
+          "^\\|\n" (max (point-min) (- (point) fill-column)) t))
+       (<= (current-column) fill-column)))
+
+(defun pp-region (beg end)
+  "Insert newlines in BEG..END to try and fit within `fill-column'.
+Presumes the current buffer contains Lisp code and has indentation properly
+configured for that.
+Designed under the assumption that the region occupies a single line,
+tho it should also work if that's not the case."
+  (interactive "r")
+  (goto-char beg)
+  (let ((end (copy-marker end t))
+        (newline (lambda ()
+                   (skip-chars-forward ")]}")
+                   (unless (save-excursion (skip-chars-forward " \t") (eolp))
+                     (insert "\n")
+                     (indent-according-to-mode)))))
+    (while (progn (forward-comment (point-max))
+                  (< (point) end))
+      (let ((beg (point))
+            ;; Whether we're in front of an element with paired delimiters.
+            ;; Can be something funky like #'(lambda ..) or ,'#s(...).
+            (paired (when (looking-at "['`,#]*[[:alpha:]]*\\([({[\"]\\)")
+                      (match-beginning 1))))
+        ;; Go to the end of the sexp.
+        (goto-char (or (scan-sexps (or paired (point)) 1) end))
+        (unless
+            (and
+             ;; The sexp is all on a single line.
+             (save-excursion (not (search-backward "\n" beg t)))
+             ;; And its end is within `fill-column'.
+             (or (pp--within-fill-column-p)
+                 ;; If the end of the sexp is beyond `fill-column',
+                 ;; try to move the sexp to its own line.
+                 (and
+                  (save-excursion
+                    (goto-char beg)
+                    (if (save-excursion (skip-chars-backward " \t({[',") (bolp))
+                        ;; The sexp was already on its own line.
+                        nil
+                      (skip-chars-backward " \t")
+                      (setq beg (copy-marker beg t))
+                      (if paired (setq paired (copy-marker paired t)))
+                      ;; We could try to undo this insertion if it
+                      ;; doesn't reduce the indentation depth, but I'm
+                      ;; not sure it's worth the trouble.
+                      (insert "\n") (indent-according-to-mode)
+                      t))
+                  ;; Check again if we moved the whole exp to a new line.
+                  (pp--within-fill-column-p))))
+          ;; The sexp is spread over several lines, and/or its end is
+          ;; (still) beyond `fill-column'.
+          (when (and paired (not (eq ?\" (char-after paired))))
+            ;; The sexp has sub-parts, so let's try and spread
+            ;; them over several lines.
+            (save-excursion
+              (goto-char beg)
+              (when (looking-at "(\\([^][()\" \t\n;']+\\)")
+                ;; Inside an expression of the form (SYM ARG1
+                ;; ARG2 ... ARGn) where SYM has a `lisp-indent-function'
+                ;; property that's a number, insert a newline after
+                ;; the corresponding ARGi, because it tends to lead to
+                ;; more natural and less indented code.
+                (let* ((sym (intern-soft (match-string 1)))
+                       (lif (and sym (get sym 'lisp-indent-function))))
+                  (if (eq lif 'defun) (setq lif 2))
+                  (when (natnump lif)
+                    (goto-char (match-end 0))
+                    (forward-sexp lif)
+                    (funcall newline)))))
+            (save-excursion
+              (pp-region (1+ paired) (1- (point)))))
+          ;; Now the sexp either ends beyond `fill-column' or is
+          ;; spread over several lines (or both). Either way, the rest of the
+          ;; line should be moved to its own line.
+          (funcall newline))))))
+
+(defcustom pp-buffer-use-pp-region t
+  "If non-nil, `pp-buffer' uses the new `pp-region' code."
+  :type 'boolean)
+
 ;;;###autoload
 (defun pp-buffer ()
   "Prettify the current buffer with printed representation of a Lisp object."
   (interactive)
   (goto-char (point-min))
-  (while (not (eobp))
-    (cond
-     ((ignore-errors (down-list 1) t)
-      (save-excursion
-        (backward-char 1)
-        (skip-chars-backward "'`#^")
-        (when (and (not (bobp)) (memq (char-before) '(?\s ?\t ?\n)))
-          (delete-region
-           (point)
-           (progn (skip-chars-backward " \t\n") (point)))
-          (insert "\n"))))
-     ((ignore-errors (up-list 1) t)
-      (skip-syntax-forward ")")
-      (delete-region
-       (point)
-       (progn (skip-chars-forward " \t\n") (point)))
-      (insert ?\n))
-     (t (goto-char (point-max)))))
-  (goto-char (point-min))
-  (indent-sexp))
+  (if pp-buffer-use-pp-region
+      (with-syntax-table emacs-lisp-mode-syntax-table
+        (let ((fill-column (max fill-column 70))
+              (indent-line-function
+               (if (local-variable-p 'indent-line-function)
+                   indent-line-function
+                 #'lisp-indent-line)))
+          (pp-region (point-min) (point-max))))
+    (while (not (eobp))
+      (cond
+       ((ignore-errors (down-list 1) t)
+        (save-excursion
+          (backward-char 1)
+          (skip-chars-backward "'`#^")
+          (when (and (not (bobp)) (memq (char-before) '(?\s ?\t ?\n)))
+            (delete-region
+             (point)
+             (progn (skip-chars-backward " \t\n") (point)))
+            (insert "\n"))))
+       ((ignore-errors (up-list 1) t)
+        (skip-syntax-forward ")")
+        (delete-region
+         (point)
+         (progn (skip-chars-forward " \t\n") (point)))
+        (insert ?\n))
+       (t (goto-char (point-max)))))
+    (goto-char (point-min))
+    (indent-sexp)))
 
 ;;;###autoload
 (defun pp (object &optional stream)