* bug#38104: 27.0.50; elixir-mode fontification is very slow
@ 2019-11-07 15:40 Dmitry Gutov
2019-11-26 16:26 ` Dmitry Gutov
0 siblings, 1 reply; 7+ messages in thread
From: Dmitry Gutov @ 2019-11-07 15:40 UTC (permalink / raw)
To: 38104
[-- Attachment #1: Type: text/plain, Size: 983 bytes --]
I haven't been able to track this to a particular component (e.g. a
regexp) for now, but font-lock-fontify-region is now considerably slower
than it was in Emacs 26 (at least at revision cb8fb597e5bf4f14).
To reproduce: install elixir-mode (e.g. from MELPA Stable):
(add-to-list 'package-archives
'("melpa-stable" . "https://stable.melpa.org/packages/") t)
M-x list-packages, install elixir-mode.
Savet the attached tiny.__ex__ as tiny.ex.
Visit tiny.ex.
Eval: (benchmark 1 '(font-lock-fontify-region (point-min) (point-max))).
"Elapsed time: 0.158824s"
With larger files, the times are much longer.
I had a break from Elixir, so I noticed this only now.
In GNU Emacs 27.0.50 (build 11, x86_64-pc-linux-gnu, GTK+ Version 3.24.8)
of 2019-11-05 built on potemkin
Repository revision: dd19cc3aa16ccc441a8a2bfcdeb3005a6eef2543
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12004000
System Description: Ubuntu 19.04
[-- Attachment #2: tiny.__ex__ --]
[-- Type: text/plain, Size: 162 bytes --]
# module_name.ex
defmodule ModuleName do
def message({to, %IncidentMessage{title: _t, body: msg}}) do
msg
|> message_body
|> send_message
end
end
^ permalink raw reply [flat|nested] 7+ messages in thread
* bug#38104: 27.0.50; elixir-mode fontification is very slow
2019-11-07 15:40 bug#38104: 27.0.50; elixir-mode fontification is very slow Dmitry Gutov
@ 2019-11-26 16:26 ` Dmitry Gutov
2019-11-26 16:30 ` Dmitry Gutov
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Dmitry Gutov @ 2019-11-26 16:26 UTC (permalink / raw)
To: 38104; +Cc: Mattias Engdegård
I did a 'git bisect', and it came down to:
commit 2ed71227c626c6cfdc684948644ccf3d9eaeb15b
Author: Mattias Engdegård <mattiase@acm.org>
Date: Wed Sep 25 14:29:50 2019 -0700
New rx implementation
Mattias, could you look into it?
elixir-mode does use rx, heavily. Albeit with a thin wrapper.
To be clear, elixir-mode is quite unusable now.
On 07.11.2019 17:40, Dmitry Gutov wrote:
> I haven't been able to track this to a particular component (e.g. a
> regexp) for now, but font-lock-fontify-region is now considerably slower
> than it was in Emacs 26 (at least at revision cb8fb597e5bf4f14).
>
> To reproduce: install elixir-mode (e.g. from MELPA Stable):
>
> (add-to-list 'package-archives
> '("melpa-stable" . "https://stable.melpa.org/packages/") t)
>
> M-x list-packages, install elixir-mode.
>
> Savet the attached tiny.__ex__ as tiny.ex.
>
> Visit tiny.ex.
>
> Eval: (benchmark 1 '(font-lock-fontify-region (point-min) (point-max))).
>
> "Elapsed time: 0.158824s"
>
> With larger files, the times are much longer.
>
> I had a break from Elixir, so I noticed this only now.
>
> In GNU Emacs 27.0.50 (build 11, x86_64-pc-linux-gnu, GTK+ Version 3.24.8)
> of 2019-11-05 built on potemkin
> Repository revision: dd19cc3aa16ccc441a8a2bfcdeb3005a6eef2543
> Repository branch: master
> Windowing system distributor 'The X.Org Foundation', version 11.0.12004000
> System Description: Ubuntu 19.04
^ permalink raw reply [flat|nested] 7+ messages in thread
* bug#38104: 27.0.50; elixir-mode fontification is very slow
2019-11-26 16:26 ` Dmitry Gutov
@ 2019-11-26 16:30 ` Dmitry Gutov
2019-11-26 16:59 ` Mattias Engdegård
2019-11-26 19:32 ` Mattias Engdegård
2 siblings, 0 replies; 7+ messages in thread
From: Dmitry Gutov @ 2019-11-26 16:30 UTC (permalink / raw)
To: 38104; +Cc: Mattias Engdegård
On 26.11.2019 18:26, Dmitry Gutov wrote:
> elixir-mode does use rx, heavily. Albeit with a thin wrapper.
And one more thing: this wrapper makes use of rx-constituents.
^ permalink raw reply [flat|nested] 7+ messages in thread
* bug#38104: 27.0.50; elixir-mode fontification is very slow
2019-11-26 16:26 ` Dmitry Gutov
2019-11-26 16:30 ` Dmitry Gutov
@ 2019-11-26 16:59 ` Mattias Engdegård
2019-11-26 17:03 ` Dmitry Gutov
2019-11-26 19:32 ` Mattias Engdegård
2 siblings, 1 reply; 7+ messages in thread
From: Mattias Engdegård @ 2019-11-26 16:59 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: 38104
26 nov. 2019 kl. 17.26 skrev Dmitry Gutov <dgutov@yandex.ru>:
> Mattias, could you look into it?
Thanks, will have a look. By the way, when byte-compiling I get complaints about 'looking-back' being called with too few arguments in elixir-smie.el; may want to do something about that.
^ permalink raw reply [flat|nested] 7+ messages in thread
* bug#38104: 27.0.50; elixir-mode fontification is very slow
2019-11-26 16:59 ` Mattias Engdegård
@ 2019-11-26 17:03 ` Dmitry Gutov
0 siblings, 0 replies; 7+ messages in thread
From: Dmitry Gutov @ 2019-11-26 17:03 UTC (permalink / raw)
To: Mattias Engdegård; +Cc: 38104
On 26.11.2019 18:59, Mattias Engdegård wrote:
> By the way, when byte-compiling I get complaints about 'looking-back' being called with too few arguments in elixir-smie.el; may want to do something about that.
Yes, I have a few other pending patches for that project as well.
^ permalink raw reply [flat|nested] 7+ messages in thread
* bug#38104: 27.0.50; elixir-mode fontification is very slow
2019-11-26 16:26 ` Dmitry Gutov
2019-11-26 16:30 ` Dmitry Gutov
2019-11-26 16:59 ` Mattias Engdegård
@ 2019-11-26 19:32 ` Mattias Engdegård
2019-11-27 21:58 ` Dmitry Gutov
2 siblings, 1 reply; 7+ messages in thread
From: Mattias Engdegård @ 2019-11-26 19:32 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: 38104
26 nov. 2019 kl. 17.26 skrev Dmitry Gutov <dgutov@yandex.ru>:
> elixir-mode does use rx, heavily. Albeit with a thin wrapper.
As it turned out, rx is fine (now); elixir-mode, not quite. In elixir-mode.el, we have
(identifiers . ,(rx (one-or-more (any "A-Z" "a-z" "_"))
(zero-or-more (any "A-Z" "a-z" "0-9" "_"))
(optional (or "?" "!"))))
First, this regex is suboptimal: the first character of an identifier should occur exactly once, or you get bad backtracking behaviour. Just remove the one-or-more construct:
(identifiers . ,(rx (any "A-Z" "a-z" "_")
(zero-or-more (any "A-Z" "a-z" "0-9" "_"))
(optional (or "?" "!"))))
This definition is then used in several places, but two in particular are of interest to us:
;; Module attributes
(,(elixir-rx (and "@" (1+ identifiers)))
The construct (1+ identifiers) was perhaps meant to match multiple identifiers, but it doesn't (no separator); it just matches an identifier in several ways, which again leads to bad backtracking behaviour.
The same problem here:
;; Map keys
(,(elixir-rx (group (and (one-or-more identifiers) ":")) space)
Remove the 1+ and one-or-more and it's fast again.
Why did this "work" with the old rx implementation? Because that code had a nasty bug: it does not bracket definitions in rx-constituents properly. Example:
(let ((rx-constituents (cons '(hello . "HELLO") rx-constituents)))
(rx-to-string '(1+ hello) t))
=> "HELLO+"
The new rx implementation does not suffer from this bug.
The result in your case is that the old rx, when translating (1+ identifiers), only tacked the "+" onto whatever regexp 'identifiers' produced, resulting in
"[A-Z_a-z]+[0-9A-Z_a-z]*[!?]?+"
which is a lot faster, since only the final [!?] is repeated twice (and it probably doesn't match very often).
^ permalink raw reply [flat|nested] 7+ messages in thread
* bug#38104: 27.0.50; elixir-mode fontification is very slow
2019-11-26 19:32 ` Mattias Engdegård
@ 2019-11-27 21:58 ` Dmitry Gutov
0 siblings, 0 replies; 7+ messages in thread
From: Dmitry Gutov @ 2019-11-27 21:58 UTC (permalink / raw)
To: Mattias Engdegård; +Cc: 38104-done
Hi Mattias,
On 26.11.2019 21:32, Mattias Engdegård wrote:
> As it turned out, rx is fine (now); elixir-mode, not quite. In elixir-mode.el, we have
>
> (identifiers . ,(rx (one-or-more (any "A-Z" "a-z" "_"))
> (zero-or-more (any "A-Z" "a-z" "0-9" "_"))
> (optional (or "?" "!"))))
>
> First, this regex is suboptimal: the first character of an identifier should occur exactly once, or you get bad backtracking behaviour. Just remove the one-or-more construct:
>
> (identifiers . ,(rx (any "A-Z" "a-z" "_")
> (zero-or-more (any "A-Z" "a-z" "0-9" "_"))
> (optional (or "?" "!"))))
>
> This definition is then used in several places, but two in particular are of interest to us:
>
> ;; Module attributes
> (,(elixir-rx (and "@" (1+ identifiers)))
>
> The construct (1+ identifiers) was perhaps meant to match multiple identifiers, but it doesn't (no separator); it just matches an identifier in several ways, which again leads to bad backtracking behaviour.
> The same problem here:
>
> ;; Map keys
> (,(elixir-rx (group (and (one-or-more identifiers) ":")) space)
>
> Remove the 1+ and one-or-more and it's fast again.
That makes a lot of sense. I removed these one-or-more's and 1+ (and a
few others), and it became fast again.
I'll send a patch upstream. Thanks for your help!
(Looking at the tracker, they have a minor version of this change
submitted already).
> Why did this "work" with the old rx implementation? Because that code had a nasty bug: it does not bracket definitions in rx-constituents properly. Example:
>
> (let ((rx-constituents (cons '(hello . "HELLO") rx-constituents)))
> (rx-to-string '(1+ hello) t))
> => "HELLO+"
>
> The new rx implementation does not suffer from this bug.
>
> The result in your case is that the old rx, when translating (1+ identifiers), only tacked the "+" onto whatever regexp 'identifiers' produced, resulting in
>
> "[A-Z_a-z]+[0-9A-Z_a-z]*[!?]?+"
>
> which is a lot faster, since only the final [!?] is repeated twice (and it probably doesn't match very often).
It's funny to think how someone probably beaten the current code into
submission by trial and error.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2019-11-27 21:58 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-11-07 15:40 bug#38104: 27.0.50; elixir-mode fontification is very slow Dmitry Gutov
2019-11-26 16:26 ` Dmitry Gutov
2019-11-26 16:30 ` Dmitry Gutov
2019-11-26 16:59 ` Mattias Engdegård
2019-11-26 17:03 ` Dmitry Gutov
2019-11-26 19:32 ` Mattias Engdegård
2019-11-27 21:58 ` Dmitry Gutov
Code repositories for project(s) associated with this external index
https://git.savannah.gnu.org/cgit/emacs.git
https://git.savannah.gnu.org/cgit/emacs/org-mode.git
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.