From mboxrd@z Thu Jan 1 00:00:00 1970
From: Simen Heggestøyl
Newsgroups: gmane.emacs.bugs
Subject: bug#37393: 26.2.90; [PATCH] Speed up 'csv-align-fields'
Date: Sun, 15 Sep 2019 17:55:44 +0200
Message-ID: <5d7e5f00.1c69fb81.43f55.3fa2@mx.google.com>
References: <5d7a7b56.1c69fb81.c32f6.840d@mx.google.com>
In-Reply-To: (message from Stefan Monnier on Thu, 12 Sep 2019 13:46:36 -0400)
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
To: Stefan Monnier
Cc: 37393@debbugs.gnu.org, sdl.web@gmail.com
X-GNU-PR-Message: followup 37393
X-GNU-PR-Package: emacs
X-GNU-PR-Keywords: patch
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.org gmane.emacs.bugs:166509

--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

Stefan Monnier writes:

> Sounds good.  I rarely use large CSV files, but I know the operation is slow.
>
> I'm OK with the patch, tho please see my comment below.

Thanks for reviewing it.

> 40s is still slow, but a factor of 2 is good, thanks.

Yes (though 40s is the total time for all three benchmark runs, so a
single alignment takes 40s/3, roughly 13s).

> If you're interested in this line, I think there are two avenues to
> improve the behavior further:
> - align lazily via jit-lock (this way the time is determined by the
>   amount of text displayed rather than the total file size).

Wouldn't that still depend on knowing the column widths? In my
measurements, the column width computation takes about 80% of the time
of a 'csv-align-fields' call (after the patch).

> - make `align-fields' into a mode, where fields are kept aligned even while
>   the buffer is modified.

That sounds nice.

>>  (defun csv--column-widths ()
>> -  (let ((widths '()))
>> +  (let ((column-widths '())
>> +        (field-widths '()))
>
> I think the return value is now sufficiently complex that the function
> deserves a docstring describing it.

Agreed, I'll add one before I install the patch.

I've also attached a new suggestion for speeding up the column width
computation itself by eliminating another 'current-column' call. I'm
not too sure about its correctness yet, but it seems to work in the few
tests I've done, and it sped up 'csv--column-widths' by a factor of
1.3–1.4.

--=-=-=
Content-Type: text/x-diff
Content-Disposition: inline; filename=0001-WIP.patch

>From c3c077170aefa8ba0cd5d8f8b824c85eb0f01a66 Mon Sep 17 00:00:00 2001
From: Simen Heggestøyl
Date: Sun, 15 Sep 2019 17:31:40 +0200
Subject: [PATCH] WIP

---
 packages/csv-mode/csv-mode.el | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/packages/csv-mode/csv-mode.el b/packages/csv-mode/csv-mode.el
index dc2555687..00107f51e 100644
--- a/packages/csv-mode/csv-mode.el
+++ b/packages/csv-mode/csv-mode.el
@@ -976,18 +976,26 @@ The fields yanked are those last killed by `csv-kill-fields'."
     (while (not (eobp))                 ; for each record...
       (or (csv-not-looking-at-record)
           (let ((w column-widths)
-                (col (current-column))
+                (col-beg (current-column))
+                col-end
                 field-width)
             (while (not (eolp))
               (csv-end-of-field)
-              (setq field-width (- (current-column) col))
+              (setq col-end (current-column))
+              (setq field-width (- col-end col-beg))
               (push field-width field-widths)
               (if w
                   (if (> field-width (car w)) (setcar w field-width))
                 (setq w (list field-width)
                       column-widths (nconc column-widths w)))
-              (or (eolp) (forward-char))  ; Skip separator.
-              (setq w (cdr w) col (current-column)))))
+              (unless (eolp)
+                (forward-char)            ; Skip separator.
+                (setq w (cdr w))
+                (setq col-beg (if (= (char-before) ?\t)
+                                  (* (/ (+ col-end tab-width)
+                                        tab-width)
+                                     tab-width)
+                                (+ col-end (char-width (char-before)))))))))
       (forward-line))
     (list column-widths (nreverse field-widths))))
 
-- 
2.23.0

--=-=-=
Content-Type: text/plain

-- 
Simen

--=-=-=--
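
For readers following the patch, the separator arithmetic it introduces
can be tried in isolation. The sketch below is not part of the patch:
the function name `my-csv--column-after-separator' and its TABW
argument are hypothetical, introduced only for illustration. It models
the same two cases as the patch: a tab separator jumps to the next
multiple of the tab width, and any other separator advances by its
`char-width'.

```elisp
;; Illustration only: the function name and the TABW argument are
;; hypothetical; they do not exist in csv-mode.el.
(defun my-csv--column-after-separator (col-end sep tabw)
  "Return the column reached after separator SEP placed at COL-END.
TABW is the tab width to assume (Emacs defaults `tab-width' to 8)."
  (if (eq sep ?\t)
      ;; Integer division rounds down, so this yields the next
      ;; multiple of TABW strictly greater than COL-END.
      (* (/ (+ col-end tabw) tabw) tabw)
    ;; Any other separator advances by its display width.
    (+ col-end (char-width sep))))
```

With a tab width of 8, a tab after column 5 lands on column 8, a tab
after column 8 lands on column 16, and a comma after column 5 lands on
column 6. The sketch covers only these two cases; whether they agree
with 'current-column' in every buffer is the open correctness question
mentioned in the message.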