unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#26079: 26.0.50; Performance regression in delete-trailing-whitespace
From: Sho Takemori @ 2017-03-13  2:13 UTC
  To: 26079


Dear developers,

In Emacs 26, delete-trailing-whitespace is very slow on large files.

For example, it took about 1.6s on this file:
https://raw.githubusercontent.com/stakemori/e8theta_degree3/master/results/wt18_17_5/wt18_17_5.org
But in Emacs 25, it took about 0.003s.
Code similar to the following is used in delete-trailing-whitespace, and
it is slow on large files.

(save-excursion
    (let ((end-marker nil))
      (goto-char (point-min))
      (with-syntax-table (make-syntax-table (syntax-table))
        (modify-syntax-entry ?\f "_")
        (modify-syntax-entry ?\n "_")
        (re-search-forward "\\s-+$" end-marker t))))
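
A timing triple like the one under "Recent messages" below can be obtained by wrapping the form in `benchmark-run'. The following is only a sketch: the function name `my/time-trailing-ws-search' is made up for illustration, and the numbers it returns depend entirely on the buffer contents and the Emacs build.

```elisp
(require 'benchmark)

;; Hypothetical helper, not part of Emacs.
(defun my/time-trailing-ws-search ()
  "Time one search for trailing whitespace in the current buffer.
Return the `benchmark-run' result: (ELAPSED GC-COUNT GC-ELAPSED)."
  (benchmark-run 1
    (save-excursion
      (goto-char (point-min))
      ;; Work on a copy so the buffer's real syntax table is untouched.
      (with-syntax-table (make-syntax-table (syntax-table))
        ;; Treat formfeed and newline as symbol constituents, not whitespace.
        (modify-syntax-entry ?\f "_")
        (modify-syntax-entry ?\n "_")
        (re-search-forward "\\s-+$" nil t)))))
```

Evaluate the definition, visit the file in question, and run M-: (my/time-trailing-ws-search) to get the three numbers.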

Best regards,
Sho Takemori

In GNU Emacs 26.0.50 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.18.9)
 of 2017-03-13 built on 500-270jp
Repository revision: cf670b49a7704d63575863f832426d32bf6a8c3c
Windowing system distributor 'The X.Org Foundation', version 11.0.11804000
System Description: Ubuntu 16.04.2 LTS

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
delete-backward-char: Text is read-only
(1.6173726940000002 0 0.0)
Quit
Configured using:
 'configure --with-sound=no --with-modules'

Configured features:
XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK GPM DBUS GCONF GSETTINGS NOTIFY
ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB
TOOLKIT_SCROLL_BARS GTK3 X11 MODULES

Important settings:
  value of $LC_MONETARY: ja_JP.UTF-8
  value of $LC_NUMERIC: ja_JP.UTF-8
  value of $LC_TIME: ja_JP.UTF-8
  value of $LANG: ja_JP.UTF-8
  value of $XMODIFIERS: @im=ibus
  locale-coding-system: utf-8-unix

Major mode: Org

Minor modes in effect:
  diff-auto-refine-mode: t
  shell-dirtrack-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message puny rfc822 mml mml-sec epa
derived epg epg-config mm-decode mm-bodies mm-encode mail-parse rfc2231
mailabbrev gmm-utils mailheader sendmail benchmark vc-git diff-mode
org-element org-rmail org-mhe org-irc org-info org-gnus gnus-util rmail
rmail-loaddefs rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils
org-docview doc-view jka-compr image-mode dired dired-loaddefs
org-bibtex bibtex org-bbdb org-w3m org org-macro org-footnote
org-pcomplete org-list org-faces org-entities noutline outline
easy-mmode org-version ob-emacs-lisp ob ob-tangle ob-ref ob-lob ob-table
ob-exp org-src ob-keys ob-comint ob-core ob-eval org-compat org-macs
org-loaddefs find-func seq cal-menu calendar cal-loaddefs tramp
tramp-compat tramp-loaddefs trampver ucs-normalize shell pcomplete
comint ansi-color ring parse-time format-spec advice auth-source cl-seq
eieio byte-opt subr-x bytecomp byte-compile cl-extra help-mode easymenu
cconv eieio-core cl-macs gv eieio-loaddefs cl-loaddefs pcase cl-lib
password-cache time-date mule-util japan-util tooltip eldoc electric
uniquify ediff-hook vc-hooks lisp-float-type mwheel term/x-win x-win
term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode elisp-mode lisp-mode
prog-mode register page menu-bar rfn-eshadow isearch timer select
scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame cl-generic cham georgian utf-8-lang misc-lang
vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932
hebrew greek romanian slovak czech european ethiopic indian cyrillic
chinese composite charscript case-table epa-hook jka-cmpr-hook help
simple abbrev obarray minibuffer cl-preloaded nadvice loaddefs button
faces cus-face macroexp files text-properties overlay sha1 md5 base64
format env code-pages mule custom widget hashtable-print-readable
backquote dbusbind inotify dynamic-setting system-font-setting
font-render-setting move-toolbar gtk x-toolkit x multi-tty
make-network-process emacs)

Memory information:
((conses 16 275028 6205)
 (symbols 48 28629 1)
 (miscs 40 79 128)
 (strings 32 45484 10223)
 (string-bytes 1 1457234)
 (vectors 16 45224)
 (vector-slots 8 888199 4865)
 (floats 8 105 76)
 (intervals 56 394 5)
 (buffers 976 12)
 (heap 1024 30354 1435))


* bug#26079: 26.0.50; Performance regression in delete-trailing-whitespace
From: npostavs @ 2017-03-14  2:01 UTC
  To: Sho Takemori; +Cc: 26079, Michal Nazarewicz

tags 26079 confirmed
quit

Sho Takemori <stakemorii@gmail.com> writes:

> Dear developers,
>
> delete-trailing-whitespace in Emacs 26 for large files is very slow.
>
> For example, it took about 1.6s for this file (https://raw.githubusercontent.com/stakemori/e8theta_degree3/master/results/wt18_17_5/wt18_17_5.org).
> But in Emacs 25, it took about 0.003s.
> A similar code to the following is used in delete-trailing-whitespace. And it is slow for large files.
>
> (save-excursion
>   (let ((end-marker nil))
>     (goto-char (point-min))
>     (with-syntax-table (make-syntax-table (syntax-table))
>       (modify-syntax-entry ?\f "_")
>       (modify-syntax-entry ?\n "_")
>       (re-search-forward "\\s-+$" end-marker t))))

It seems that this regex causes a lot of backtracking when \n is not
whitespace.  It was introduced in [1: 7c6317a049]; restoring the
previous strategy seems to make it fast again.  Michal, do you think
that's the best way to fix this?

---   i/lisp/simple.el
+++   i/lisp/simple.el
@@ -632,12 +632,11 @@ delete-trailing-whitespace
         (goto-char (or start (point-min)))
         (with-syntax-table (make-syntax-table (syntax-table))
           ;; Don't delete formfeeds, even if they are considered whitespace.
           (modify-syntax-entry ?\f "_")
-          ;; Treating \n as non-whitespace makes things easier.
-          (modify-syntax-entry ?\n "_")
-          (while (re-search-forward "\\s-+$" end-marker t)
-            (let ((b (match-beginning 0)) (e (match-end 0)))
+          (while (re-search-forward "\\s-$" end-marker t)
+            (skip-syntax-backward "-" (line-beginning-position))
+            (let ((b (point)) (e (match-end 0)))
               (when (region-modifiable-p b e)
                 (delete-region b e)))))
         (if end
             (set-marker end-marker nil)


1: 2016-07-04 23:44:06 +0200 7c6317a0498b6690ea668909ac012cb45e6f809b
  Simplify ‘delete-trailing-whitespace’ by not treating \n as whitespace
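
For readers who want to try the restored strategy outside simple.el, here is a minimal standalone sketch. It is not the code in Emacs: the name `my/delete-trailing-ws' is made up, and it assumes the whole buffer is to be processed, so it leaves out the START/END arguments, the end marker, and the `region-modifiable-p' check from the patch above.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my/delete-trailing-ws ()
  "Delete trailing whitespace on every line of the current buffer."
  (save-excursion
    (goto-char (point-min))
    (with-syntax-table (make-syntax-table (syntax-table))
      ;; Don't delete formfeeds, even if they are considered whitespace.
      (modify-syntax-entry ?\f "_")
      ;; Match only the last whitespace character before end of line.
      (while (re-search-forward "\\s-$" nil t)
        ;; Walk back over the rest of the whitespace run, but never
        ;; past the start of the current line.
        (skip-syntax-backward "-" (line-beginning-position))
        (delete-region (point) (match-end 0))))))
```

The regex matches just one character and `skip-syntax-backward' finds the start of the run.  The `line-beginning-position' limit is what keeps a newline matched in front of an empty line from being deleted: point is already at the start of that line, so the region to delete is empty.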






* bug#26079: 26.0.50; Performance regression in delete-trailing-whitespace
From: Michal Nazarewicz @ 2017-03-14 13:00 UTC
  To: npostavs, Sho Takemori; +Cc: 26079

On Mon, Mar 13 2017, npostavs wrote:
> tags 26079 confirmed
> quit
>
> Sho Takemori <stakemorii@gmail.com> writes:
>
>> Dear developers,
>>
>> delete-trailing-whitespace in Emacs 26 for large files is very slow.
>>
>> For example, it took about 1.6s for this file (https://raw.githubusercontent.com/stakemori/e8theta_degree3/master/results/wt18_17_5/wt18_17_5.org).
>> But in Emacs 25, it took about 0.003s.
>> A similar code to the following is used in delete-trailing-whitespace. And it is slow for large files.
>>
>> (save-excursion
>>   (let ((end-marker nil))
>>     (goto-char (point-min))
>>     (with-syntax-table (make-syntax-table (syntax-table))
>>       (modify-syntax-entry ?\f "_")
>>       (modify-syntax-entry ?\n "_")
>>       (re-search-forward "\\s-+$" end-marker t))))
>
> It seems that this regex causes a lot of backtracking when \n is not
> whitespace.  It was introduced in [1: 7c6317a049]; restoring the
> strategy from before seems make it fast again.  Michal, do you think
> that's the best way to fix this?

I wish Emacs’ RE was O(n). :(

Yeah, I think reverting my commit is the best course of action.

> ---   i/lisp/simple.el
> +++   i/lisp/simple.el
> @@ -632,12 +632,11 @@ delete-trailing-whitespace
>          (goto-char (or start (point-min)))
>          (with-syntax-table (make-syntax-table (syntax-table))
>            ;; Don't delete formfeeds, even if they are considered whitespace.
>            (modify-syntax-entry ?\f "_")
> -          ;; Treating \n as non-whitespace makes things easier.
> -          (modify-syntax-entry ?\n "_")
> -          (while (re-search-forward "\\s-+$" end-marker t)
> -            (let ((b (match-beginning 0)) (e (match-end 0)))
> +          (while (re-search-forward "\\s-$" end-marker t)
> +            (skip-syntax-backward "-" (line-beginning-position))
> +            (let ((b (point)) (e (match-end 0)))
>                (when (region-modifiable-p b e)
>                  (delete-region b e)))))
>          (if end
>              (set-marker end-marker nil)
>
>
> 1: 2016-07-04 23:44:06 +0200 7c6317a0498b6690ea668909ac012cb45e6f809b
>   Simplify ‘delete-trailing-whitespace’ by not treating \n as whitespace

-- 
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»






* bug#26079: 26.0.50; Performance regression in delete-trailing-whitespace
From: npostavs @ 2017-03-15  2:36 UTC
  To: Michal Nazarewicz; +Cc: Sho Takemori, 26079

tags 26079 fixed
close 26079 
quit

Michal Nazarewicz <mina86@mina86.com> writes:

>
> I wish Emacs’ RE was O(n). :(

Hah yeah, the trouble is it's easy to fix each regexp as it comes up,
but it would take a lot of hard work to change the regex engine...

> Yeah, I think reverting my commit is the best course of action.

Done [1: c66aaa6163].

1: 2017-03-14 22:14:30 -0400 c66aaa61639e72a70a4f2c4bc73645048caebe53
  Recomplexify ‘delete-trailing-whitespace’ by treating \n as whitespace again





