unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#37393: 26.2.90; [PATCH] Speed up 'csv-align-fields'
From: Simen Heggestøyl @ 2019-09-12 17:07 UTC
  To: 37393; +Cc: Stefan Monnier, Leo Liu

[-- Attachment #1: Type: text/plain, Size: 1015 bytes --]


The attached patch attempts to speed up the 'csv-align-fields' command
by avoiding expensive calls to 'current-column', instead reusing field
widths already computed by 'csv--column-widths'.
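
To illustrate the idea outside of a buffer, here is a minimal sketch that
works on plain lists of strings. It is not the code from the patch: the
names `my-csv-sketch-widths' and `my-csv-sketch-align' are hypothetical,
and `string-width' stands in for the `current-column' arithmetic. The
first pass records every field width while computing the column maxima,
and the second pass pads using those saved widths instead of measuring
each field again.

```elisp
(defun my-csv-sketch-widths (records)
  "Return (COLUMN-WIDTHS FIELD-WIDTHS) for RECORDS, a list of lists of strings.
COLUMN-WIDTHS holds the maximum width of each column; FIELD-WIDTHS
holds the width of every field in reading order."
  (let ((column-widths '())
        (field-widths '()))
    (dolist (record records)
      (let ((col 0))
        (dolist (field record)
          (let ((width (string-width field)))
            ;; Remember this field's width for the alignment pass.
            (push width field-widths)
            (if (< col (length column-widths))
                (setf (nth col column-widths)
                      (max width (nth col column-widths)))
              ;; First time this column is seen: extend the list.
              (setq column-widths (append column-widths (list width)))))
          (setq col (1+ col)))))
    (list column-widths (nreverse field-widths))))

(defun my-csv-sketch-align (records)
  "Return RECORDS as strings with each field right-padded to its column width.
Padding comes from the widths saved by `my-csv-sketch-widths', so no
field is measured a second time."
  (pcase-let ((`(,column-widths ,field-widths)
               (my-csv-sketch-widths records)))
    (mapcar
     (lambda (record)
       (let ((w column-widths))
         (mapconcat
          (lambda (field)
            ;; Required padding = column width - saved field width.
            (let ((padding (- (pop w) (pop field-widths))))
              (concat field (make-string padding ?\s))))
          record ",")))
     records)))
```

For example, (my-csv-sketch-align '(("a" "bbb") ("cc" "d" "e"))) returns
("a ,bbb" "cc,d  ,e").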

I felt an urge to speed up the command a bit while working with large
(100 000+ lines) CSV files. Below are benchmarks produced by running

  (benchmark 3 '(csv-align-fields nil (point-min) (point-max)))

on three real-world CSV files of various sizes. In these cases the
speedup ranges from about 1.4x to about 2x.

~400 line file:
  Before: Elapsed time: 0.175867s
  After:  Elapsed time: 0.086809s

~50 000 line file:
  Before: Elapsed time: 34.665853s (7.480686s in 35 GCs)
  After:  Elapsed time: 24.349081s (7.154716s in 27 GCs)

~110 000 line file:
  Before: Elapsed time: 82.444038s (19.799686s in 51 GCs)
  After:  Elapsed time: 40.184331s (9.037813s in 25 GCs)
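
The speedup factors follow directly from the elapsed times above. A small
sketch of the arithmetic, where `my-csv-speedup' is a hypothetical helper
and not part of csv-mode:

```elisp
(defun my-csv-speedup (before after)
  "Return the speedup factor from BEFORE to AFTER seconds as a string."
  (format "%.2fx" (/ before after)))

;; Elapsed times copied from the benchmark results above.
(list (my-csv-speedup 0.175867 0.086809)    ; ~400 line file
      (my-csv-speedup 34.665853 24.349081)  ; ~50 000 line file
      (my-csv-speedup 82.444038 40.184331)) ; ~110 000 line file
```

This evaluates to ("2.03x" "1.42x" "2.05x").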

(I've CC'd the two of you who seem to have done most of the recent work
on this mode; I hope that's OK.)

-- Simen

[-- Attachment #2: 0001-Speed-up-csv-align-fields.patch --]
[-- Type: text/x-diff, Size: 3851 bytes --]

From 4fc82f1f66c736bcfbc15d20ff53bd3e21e8a8e1 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Simen=20Heggest=C3=B8yl?= <simenheg@gmail.com>
Date: Thu, 12 Sep 2019 18:54:28 +0200
Subject: [PATCH] Speed up 'csv-align-fields'

* packages/csv-mode/csv-mode.el: Bump version number and make the
dependency on Emacs 24.1 or higher explicit.
(csv--column-widths): Return the field widths as well.
(csv-align-fields): Speed up by using the field widths already computed
by 'csv--column-widths'.
---
 packages/csv-mode/csv-mode.el | 30 ++++++++++++++++--------------
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/packages/csv-mode/csv-mode.el b/packages/csv-mode/csv-mode.el
index 40f70330a..dc2555687 100644
--- a/packages/csv-mode/csv-mode.el
+++ b/packages/csv-mode/csv-mode.el
@@ -4,7 +4,8 @@
 
 ;; Author: "Francis J. Wright" <F.J.Wright@qmul.ac.uk>
 ;; Time-stamp: <23 August 2004>
-;; Version: 1.7
+;; Version: 1.8
+;; Package-Requires: ((emacs "24.1"))
 ;; Keywords: convenience
 
 ;; This package is free software; you can redistribute it and/or modify
@@ -969,24 +970,26 @@ The fields yanked are those last killed by `csv-kill-fields'."
   (and (overlay-get o 'csv) (delete-overlay o)))
 
 (defun csv--column-widths ()
-  (let ((widths '()))
+  (let ((column-widths '())
+        (field-widths '()))
     ;; Construct list of column widths:
     (while (not (eobp))                   ; for each record...
       (or (csv-not-looking-at-record)
-          (let ((w widths)
+          (let ((w column-widths)
                 (col (current-column))
-                x)
+                field-width)
             (while (not (eolp))
               (csv-end-of-field)
-              (setq x (- (current-column) col)) ; Field width.
+              (setq field-width (- (current-column) col))
+              (push field-width field-widths)
               (if w
-                  (if (> x (car w)) (setcar w x))
-                (setq w (list x)
-                      widths (nconc widths w)))
+                  (if (> field-width (car w)) (setcar w field-width))
+                (setq w (list field-width)
+                      column-widths (nconc column-widths w)))
               (or (eolp) (forward-char))  ; Skip separator.
               (setq w (cdr w) col (current-column)))))
       (forward-line))
-    widths))
+    (list column-widths (nreverse field-widths))))
 
 (defun csv-align-fields (hard beg end)
   "Align all the fields in the region to form columns.
@@ -1017,23 +1020,22 @@ If there is no selected region, default to the whole buffer."
       (narrow-to-region beg end)
       (set-marker end nil)
       (goto-char (point-min))
-      (let ((widths (csv--column-widths)))
+      (pcase-let ((`(,column-widths ,field-widths) (csv--column-widths)))
 
 	;; Align fields:
 	(goto-char (point-min))
 	(while (not (eobp))		; for each record...
 	  (unless (csv-not-looking-at-record)
-            (let ((w widths)
+            (let ((w column-widths)
                   (column 0))    ;Desired position of left-side of this column.
               (while (and w (not (eolp)))
                 (let* ((beg (point))
                        (align-padding (if (bolp) 0 csv-align-padding))
                        (left-padding 0) (right-padding 0)
-                       (field-width
-                        (- (- (current-column)
-                              (progn (csv-end-of-field) (current-column)))))
+                       (field-width (pop field-widths))
                        (column-width (pop w))
                        (x (- column-width field-width))) ; Required padding.
+                  (csv-end-of-field)
                   (set-marker end (point)) ; End of current field.
                   ;; beg = beginning of current field
                   ;; end = (point) = end of current field
-- 
2.23.0



Thread overview: 15+ messages
2019-09-12 17:07 bug#37393: 26.2.90; [PATCH] Speed up 'csv-align-fields' Simen Heggestøyl
2019-09-12 17:46 ` Stefan Monnier
2019-09-15 15:55   ` Simen Heggestøyl
2019-09-15 16:17     ` Eli Zaretskii
2019-09-15 18:43     ` Stefan Monnier
2019-09-17 16:53       ` Simen Heggestøyl
2019-09-17 17:23         ` Eli Zaretskii
2019-09-17 19:14           ` Stefan Monnier
2019-09-18  2:34             ` Eli Zaretskii
2019-09-18 19:59               ` Simen Heggestøyl
2019-09-18 20:08                 ` Stefan Monnier
2019-09-19 15:51                   ` Simen Heggestøyl
2019-09-19 17:30                     ` Eli Zaretskii
2019-10-09 16:33                       ` Simen Heggestøyl
2019-09-17 19:12         ` Stefan Monnier
