* bug#58302: 29.0.50; browse-url-emacs is extremely slow (and I think always has been?)
From: Phil Sainty @ 2022-10-05 11:07 UTC
To: 58302
browse-url-emacs has always been inexplicably slow for me,
since at least Emacs 24 (but maybe just 'always').
I've just done some basic benchmarking with:
(benchmark-run (save-selected-window
                 (browse-url-emacs "http://www.example.com")))
;; If I delete the "www.example.com" buffer after each attempt, this
;; call takes nearly 3 seconds:
(2.819470404 1 0.046148434000000016)
(2.669529036 0 0.0)
(2.837350438 1 0.02421054800000011)
;; If I retain the "www.example.com" buffer, then each retry takes
;; <0.5 seconds:
(0.374270464 0 0.0)
(0.428719681 0 0.0)
(0.476068586 1 0.044311528999999794)
;; Whereas benchmarking only `url-retrieve-synchronously':
(benchmark-run (url-retrieve-synchronously "http://www.example.com"))
;; This takes <0.25 seconds.
(0.172364234 0 0.0)
(0.24314511600000002 0 0.0)
(0.19534228 0 0.0)
;; Deleting the network connection via M-x list-processes between
;; attempts adds about 0.25 seconds for all sets of benchmarks, so the
;; network connection time is not a factor.
;; eww is also fast:
(benchmark-run (save-window-excursion
                 (let ((eww-retrieve-command 'sync))
                   (eww "http://www.example.com"))))
(0.197008672 0 0.0)
(0.25098304499999996 0 0.0)
(0.28419454299999997 0 0.0)
So something to do with browse-url-emacs is taking an additional 2.5s
on top of the basic URL request -- unless the buffer already exists,
in which case it's much faster (albeit still twice as slow as the
other options).
Presumably this could be improved?
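The timing comparison above can be made repeatable with a small helper. This is only a sketch: `my-bench-repeat' is a hypothetical name, not something in Emacs. It assumes the standard `benchmark-run' macro, whose return value starts with the elapsed time in seconds.

```elisp
;;; -*- lexical-binding: t -*-
(require 'benchmark)

(defun my-bench-repeat (n thunk &optional cleanup)
  "Call THUNK N times and return the list of elapsed times in seconds.
If CLEANUP is non-nil, call it after each run, outside the timed region.
Hypothetical helper for illustration only."
  (let (times)
    (dotimes (_ n)
      ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-TIME); keep ELAPSED.
      (push (car (benchmark-run 1 (funcall thunk))) times)
      (when cleanup
        (funcall cleanup)))
    (nreverse times)))
```

For the "delete the buffer after each attempt" case, it could be called like this:

```elisp
(my-bench-repeat
 3
 (lambda () (save-selected-window
              (browse-url-emacs "http://www.example.com")))
 (lambda () (kill-buffer "www.example.com")))
```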
If I benchmark-progn the final (funcall func url) in browse-url-emacs
I can see that this is where all the time is spent. func is set to
`find-file-other-window'; so this is equivalent:
(benchmark-run
  (save-selected-window
    (let ((file-name-handler-alist
           (cons (cons url-handler-regexp 'url-file-handler)
                 file-name-handler-alist)))
      (find-file-noselect "http://www.example.com")))
  (kill-buffer "www.example.com"))
If the buffer already existed, *Messages* says:
Contacting host: www.example.com:80
(0.408067329 0 0.0)
If the buffer did not already exist, *Messages* says:
Contacting host: www.example.com:80 [2 times]
File exists, but cannot be read
(2.617302471 0 0.0)
(The "[2 times]" is not on account of a previous test;
they are both generated by this single call.)
Benchmarking the end of `find-file-noselect' like so:
(benchmark-progn
  (find-file-noselect-1 buf filename nowarn
                        rawfile truename number))
Gives me:
Elapsed time: 1.876737s (0.057295s in 1 GCs) ;; find-file-noselect-1
(2.680159853 1 0.057295379000000146) ;; the overall benchmark-run
And that's as far as I've gone.
-Phil
In GNU Emacs 29.0.50 (build 4, x86_64-pc-linux-gnu, X toolkit, cairo
version 1.15.10, Xaw scroll bars)
of 2022-07-15 built on phil-lp
Repository revision: 00eb894a56d63fad3573a53dd57c323289711512
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version
11.0.12008000
System Description: Ubuntu 18.04.6 LTS
Configured using:
'configure --prefix=/home/phil/emacs/trunk/usr/local
--with-x-toolkit=lucid --without-sound
'--program-transform-name=s/^ctags$/ctags_emacs/''
Configured features:
CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG JSON
LCMS2 LIBXML2 MODULES NOTIFY INOTIFY PDUMPER PNG RSVG SECCOMP THREADS
TIFF TOOLKIT_SCROLL_BARS WEBP X11 XDBE XIM XINPUT2 XPM LUCID ZLIB
Important settings:
value of $LC_MONETARY: en_NZ.UTF-8
value of $LC_NUMERIC: en_NZ.UTF-8
value of $LC_TIME: en_NZ.UTF-8
value of $LANG: en_GB.UTF-8
value of $XMODIFIERS: @im=ibus
locale-coding-system: utf-8-unix
Major mode: Lisp Interaction
Minor modes in effect:
savehist-mode: t
windmove-mode: t
winner-mode: t
tooltip-mode: t
global-eldoc-mode: t
eldoc-mode: t
show-paren-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
line-number-mode: t
indent-tabs-mode: t
transient-mark-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
Load-path shadows:
None found.
Features:
(shadow sort mail-extr emacsbug mule-util display-line-numbers dcl-mode
tempo mm-archive message sendmail yank-media rfc822 mml mml-sec epa
derived epg rfc6068 epg-config gnus-util mailabbrev gmm-utils mailheader
mm-decode mm-bodies mm-encode url-dav parse-time iso8601 cl-extra
cl-print debug backtrace find-func benchmark shortdoc url-about
url-handlers thingatpt help-fns radix-tree help-mode dabbrev time-date
textsec uni-scripts idna-mapping ucs-normalize uni-confusable
textsec-check mail-utils gnutls network-stream url-http mail-parse
rfc2231 rfc2047 rfc2045 mm-util ietf-drums mail-prsvr url-gw nsm
url-cache url-auth shr text-property-search pixel-fill kinsoku url-file
url-dired puny svg xml dom browse-url url url-proxy url-privacy
url-expand url-methods url-history url-cookie generate-lisp-file
url-domsuf url-util url-parse auth-source cl-seq eieio eieio-core
cl-macs password-cache json subr-x map byte-opt gv bytecomp byte-compile
cconv url-vars mailcap savehist windmove winner ring dired-aux
cl-loaddefs cl-lib dired dired-loaddefs advice rmc iso-transl tooltip
eldoc paren electric uniquify ediff-hook vc-hooks lisp-float-type
elisp-mode mwheel term/x-win x-win term/common-win x-dnd tool-bar dnd
fontset image regexp-opt fringe tabulated-list replace newcomment
text-mode lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow
isearch easymenu timer select scroll-bar mouse jit-lock font-lock syntax
font-core term/tty-colors frame minibuffer nadvice seq simple cl-generic
indonesian philippine cham georgian utf-8-lang misc-lang vietnamese
tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
composite emoji-zwj charscript charprop case-table epa-hook
jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs
faces cus-face macroexp files window text-properties overlay sha1 md5
base64 format env code-pages mule custom widget keymap
hashtable-print-readable backquote threads dbusbind inotify lcms2
dynamic-setting system-font-setting font-render-setting cairo x-toolkit
xinput2 x multi-tty make-network-process emacs)
Memory information:
((conses 16 210718 27584)
(symbols 48 9278 0)
(strings 32 47429 2202)
(string-bytes 1 1030861)
(vectors 16 40826)
(vector-slots 8 595360 31429)
(floats 8 105 82)
(intervals 56 1185 0)
(buffers 992 38))

From: Lars Ingebrigtsen @ 2022-10-06 12:53 UTC
To: Phil Sainty; +Cc: 58302
Phil Sainty <psainty@orcon.net.nz> writes:
> If the buffer did not already exist, *Messages* says:
> Contacting host: www.example.com:80 [2 times]
> File exists, but cannot be read
> (2.617302471 0 0.0)
I can reproduce this, too. I think it's likely that the delay is coming
from the error message (which is a misleading error message). There's
probably a "sleep-for 2" after displaying the error message?
There's a bug report in debbugs somewhere about fixing the error
message, but I can't find it now.

From: Phil Sainty @ 2022-10-06 23:25 UTC
To: Lars Ingebrigtsen; +Cc: 58302
On 2022-10-07 01:53, Lars Ingebrigtsen wrote:
> Phil Sainty <psainty@orcon.net.nz> writes:
>> If the buffer did not already exist, *Messages* says:
>> File exists, but cannot be read
>
> I can reproduce this, too. I think it's likely that the delay is
> coming
> from the error message (which is a misleading error message). There's
> probably a "sleep-for 2" after displaying the error message?
>
> There's a bug report in debbugs somewhere about fixing the error
> message, but I can't find it now.
Perhaps https://debbugs.gnu.org/cgi/bugreport.cgi?bug=42431 ?
There are also a handful of other hits at:
https://debbugs.gnu.org/cgi/search.cgi?phrase=%22File+exists%2C+but+cannot+be+read%22&search=search

From: Lars Ingebrigtsen @ 2022-10-07 11:48 UTC
To: Phil Sainty; +Cc: 58302
Phil Sainty <psainty@orcon.net.nz> writes:
>> coming
>> from the error message (which is a misleading error message). There's
>> probably a "sleep-for 2" after displaying the error message?
>> There's a bug report in debbugs somewhere about fixing the error
>> message, but I can't find it now.
>
> Perhaps https://debbugs.gnu.org/cgi/bugreport.cgi?bug=42431 ?
Yes, that's the one I was thinking about, and it was apparently fixed at
the time? But this looks like pretty much the same problem, but with a
different code path...
Phil Sainty <psainty@orcon.net.nz> writes:
> True, `after-find-file' does this:
>
> (or not-serious (sit-for 1 t))
>
> And indeed that accounts for 1s of the ~2.5s delay; so this is
> a significant factor, yet seemingly not the only issue.
>
> With that commented out I now get:
>
> Elapsed time: 0.835925s (0.039518s in 1 GCs) ;; find-file-noselect-1
> (1.7317769889999999 1 0.03951770099999996) ;; the overall benchmark-run
>
> Instead of the former:
>
> Elapsed time: 1.876737s (0.057295s in 1 GCs) ;; find-file-noselect-1
> (2.680159853 1 0.057295379000000146) ;; the overall benchmark-run
Perhaps there's something else that also wants to sleep a bit after a
file error... In any case, I think the real fix is to not signal an
error here, because that's wrong.
I haven't looked at this code in a while, though.

From: Phil Sainty @ 2022-10-07 2:47 UTC
To: Lars Ingebrigtsen; +Cc: 58302
On 2022-10-07 01:53, Lars Ingebrigtsen wrote:
> I can reproduce this, too. I think it's likely that the delay is
> coming
> from the error message (which is a misleading error message). There's
> probably a "sleep-for 2" after displaying the error message?
True, `after-find-file' does this:
(or not-serious (sit-for 1 t))
And indeed that accounts for 1s of the ~2.5s delay; so this is
a significant factor, yet seemingly not the only issue.
With that commented out I now get:
Elapsed time: 0.835925s (0.039518s in 1 GCs) ;; find-file-noselect-1
(1.7317769889999999 1 0.03951770099999996) ;; the overall benchmark-run
Instead of the former:
Elapsed time: 1.876737s (0.057295s in 1 GCs) ;; find-file-noselect-1
(2.680159853 1 0.057295379000000146) ;; the overall benchmark-run

From: Phil Sainty @ 2022-10-12 10:25 UTC
To: Lars Ingebrigtsen; +Cc: 58302
On 2022-10-07 15:47, Phil Sainty wrote:
> (or not-serious (sit-for 1 t))
With that commented out, I tried to do some profiling like this:
(progn
  (profiler-start 'cpu)
  (browse-url-emacs "http://www.example.com")
  (profiler-report)
  (profiler-stop)
  (profiler-reset)
  (kill-buffer "www.example.com"))
The results were perplexing in their variability -- all I can
suggest is that you run that code multiple times, and C-u RET
to expand the full profile after each run, and see whether you
also observe a variety of fairly different outcomes.
Here's one example where we can see `url-retrieve-synchronously'
being called 4 times; but other times it was called 2-3 times,
and the profile looked rather different.
   23  69% - browse-url-emacs
   23  69%  - find-file-other-window
   23  69%   - find-file-noselect
   17  51%    - find-file-noselect-1
    8  24%     - after-find-file
    8  24%      - if
    4  12%       - let*
    4  12%        - cond
    4  12%         - and
    4  12%          - file-exists-p
    4  12%           - url-file-handler
    4  12%            - apply
    4  12%             - url-file-exists-p
    4  12%              - url-http-file-exists-p
    4  12%               - url-http-head
    4  12%                - url-retrieve-synchronously
    4  12%                 - accept-process-output
    4  12%                  - url-http-generic-filter
    4  12%                   - url-http-wait-for-headers-change-function
    4  12%                      mail-fetch-field
    4  12%       - run-hooks
    4  12%        - vc-refresh-state
    4  12%         - vc-backend
    4  12%          - vc-file-getprop
    4  12%           - expand-file-name
    4  12%              url-file-handler
    6  18%     - insert-file-contents
    6  18%      - url-file-handler
    6  18%       - apply
    6  18%        - url-insert-file-contents
    4  12%           url-retrieve-synchronously
    2   6%         - url-insert-buffer-contents
    2   6%          - url-insert
    2   6%           - mm-dissect-buffer
    2   6%            - mm-dissect-singlepart
    2   6%             - mm-copy-to-buffer
    2   6%                generate-new-buffer
    3   9%     - file-readable-p
    3   9%      - url-file-handler
    3   9%       - apply
    3   9%        - url-file-exists-p
    3   9%         - url-http-file-exists-p
    3   9%          - url-http-head
    3   9%           - url-retrieve-synchronously
    3   9%            - url-retrieve
    3   9%             - url-retrieve-internal
    3   9%                url-http
    6  18%    - file-attributes
    6  18%     - url-file-handler
    6  18%      - apply
    6  18%       - url-file-attributes
    6  18%        - url-http-file-attributes
    6  18%         - url-http-head-file-attributes
    6  18%          - url-http-head
    6  18%           - url-retrieve-synchronously
    6  18%            - url-retrieve
    6  18%             - url-retrieve-internal
    6  18%              - url-http
    6  18%                 generate-new-buffer
   10  30%   Automatic GC
I'm not very familiar with the ins and outs of these code paths,
but my first impression is that we've initiated an operation which
needs to deal with a particular URL and if we were to make a high-
level binding to indicate that we were doing this, we could then
cache and re-use the results of those network requests for the
extent of that binding.
3 of the 4 `url-retrieve-synchronously' calls above are from
`url-http-head'; twice on account of `url-file-exists-p', and another
from `url-file-attributes'.
I see the following in the code:
(defun url-http-head (url)
  (let ((url-request-method "HEAD")
        (url-request-data nil))
    (url-retrieve-synchronously url)))
(defun url-http-file-exists-p (url)
  (let ((buffer (url-http-head url)))
    ...))
(defalias 'url-http-file-readable-p 'url-http-file-exists-p)
(defun url-http-head-file-attributes (url &optional _id-format)
  (let ((buffer (url-http-head url)))
    ...))
(defun url-http-file-attributes (url &optional id-format)
  (if (url-dav-supported-p url)
      (url-dav-file-attributes url id-format)
    (url-http-head-file-attributes url id-format)))
In principle, I don't see why we couldn't re-use the buffer
returned by the first call to `url-http-head' in each of the
subsequent calls.
Furthermore, we could *probably* flag the fact that we are 100%
intending to request the entire file later on in the command,
and use that information to just do a GET request instead of a
HEAD request in the first place -- the resulting buffer for which
can then *also* be re-used by the eventual `url-insert-file-contents'
call.
I think `url-http-head' itself should only ever do a HEAD request,
but `url-http-head-file-attributes' and `url-http-file-exists-p'
could conditionally use the full GET buffer.
-Phil
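
The "high-level binding" idea in the message above could be sketched roughly as follows. This is an illustration under stated assumptions, not a proposed patch: `my-url-head-cache', `my-url--cached-exists-p' and `my-with-url-head-cache' are hypothetical names. The sketch caches the boolean result of `url-http-file-exists-p' rather than the response buffer itself, on the assumption that a caller may dispose of that buffer once it has read the status.

```elisp
;;; -*- lexical-binding: t -*-
(require 'url-http)
(require 'url-parse)

(defvar my-url-head-cache nil
  "Hypothetical: when a hash table, maps URL strings to cached results.")

(defun my-url--cached-exists-p (orig-fun url)
  "Around-advice: memoize ORIG-FUN's result for URL while a cache is bound."
  (if (not (hash-table-p my-url-head-cache))
      ;; No cache bound: behave exactly as before.
      (funcall orig-fun url)
    (let* ((key (if (stringp url) url (url-recreate-url url)))
           (hit (gethash key my-url-head-cache 'miss)))
      (if (eq hit 'miss)
          ;; First request for this URL inside the binding: do it once.
          (puthash key (funcall orig-fun url) my-url-head-cache)
        hit))))

(advice-add 'url-http-file-exists-p :around #'my-url--cached-exists-p)

(defmacro my-with-url-head-cache (&rest body)
  "Run BODY with a fresh cache that lives only for the extent of BODY."
  (declare (indent 0) (debug t))
  `(let ((my-url-head-cache (make-hash-table :test #'equal)))
     ,@body))
```

Wrapping the `find-file' call in `my-with-url-head-cache' would then let repeated existence checks for the same URL share one request, while code outside the binding is unaffected. The same pattern would be needed for the attribute lookups to cover every HEAD request shown in the profile.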

From: Lars Ingebrigtsen @ 2022-10-12 11:03 UTC
To: Phil Sainty; +Cc: 58302
Phil Sainty <psainty@orcon.net.nz> writes:
> I'm not very familiar with the ins and outs of these code paths,
> but my first impression is that we've initiated an operation which
> needs to deal with a particular URL and if we were to make a high-
> level binding to indicate that we were doing this, we could then
> cache and re-use the results of those network requests for the
> extent of that binding.
[excellent analysis elided]
I think the conclusion here is that using the file-name-handler-alist
stuff for this is the absolutely pessimal way to implement
`browse-url-emacs'.
It should be pretty easy to rewrite browse-url-emacs to just call
`url-retrieve-synchronously' explicitly, and then display the resulting
data -- and it should be much, much faster.
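
A minimal sketch of that direct approach might look like the following. It is not the change that was made to `browse-url-emacs'; `my-url-response-body' and `my-browse-url-direct' are hypothetical names, and the sketch assumes the response buffer holds the raw HTTP headers, a blank line, then the body. It performs no charset or content-encoding handling.

```elisp
;;; -*- lexical-binding: t -*-
(require 'url)

(defun my-url-response-body (response-buffer)
  "Return the text following the HTTP headers in RESPONSE-BUFFER.
Hypothetical helper; returns the empty string if no blank line is found."
  (with-current-buffer response-buffer
    (save-excursion
      (goto-char (point-min))
      ;; The headers end at the first empty line (with or without CR).
      (if (re-search-forward "^\r?\n" nil t)
          (buffer-substring-no-properties (point) (point-max))
        ""))))

(defun my-browse-url-direct (url)
  "Fetch URL with a single request and show the body in a new buffer.
Hypothetical sketch; the body is inserted as-is, without decoding."
  (interactive "sURL: ")
  (let ((response (url-retrieve-synchronously url)))
    (unless response
      (error "No response from %s" url))
    (let ((body (unwind-protect
                    (my-url-response-body response)
                  (kill-buffer response)))
          (buf (generate-new-buffer url)))
      (with-current-buffer buf
        (insert body)
        (goto-char (point-min))
        (set-buffer-modified-p nil))
      (pop-to-buffer buf))))
```

Evaluating (my-browse-url-direct "http://www.example.com") would issue one request, compared with the several HEAD and GET requests seen when going through the file-name handler.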

From: Phil Sainty @ 2022-10-12 11:28 UTC
To: Lars Ingebrigtsen; +Cc: 58302
On 2022-10-13 00:03, Lars Ingebrigtsen wrote:
> I think the conclusion here is that using the file-name-handler-alist
> stuff for this is the absolutely pessimal way to implement
> `browse-url-emacs'.
>
> It should be pretty easy to rewrite browse-url-emacs to just call
> `url-retrieve-synchronously' explicitly, and then display the resulting
> data -- and it should be much, much faster.
Undoubtedly so; but making the existing approach more efficient might
also bring the same benefits to other functionality?
E.g.:
(url-handler-mode 1)
(trace-function 'url-retrieve-synchronously "*trace-output*"
                (lambda () (format " [%s]" url-request-method)))
(find-file "http://www.example.com")
======================================================================
1 -> (url-retrieve-synchronously #s(url "http" nil nil "www.example.com"
nil "" nil nil t nil t t)) [OPTIONS]
1 <- url-retrieve-synchronously: #<buffer *http www.example.com:80*>
[OPTIONS]
======================================================================
1 -> (url-retrieve-synchronously #s(url "http" nil nil "www.example.com"
nil "" nil nil t nil t nil)) [HEAD]
1 <- url-retrieve-synchronously: #<buffer *http www.example.com:80*>
[HEAD]
======================================================================
1 -> (url-retrieve-synchronously #s(url "http" nil nil "www.example.com"
nil "" nil nil t nil t t)) [OPTIONS]
1 <- url-retrieve-synchronously: #<buffer *http www.example.com:80*>
[OPTIONS]
======================================================================
1 -> (url-retrieve-synchronously #s(url "http" nil nil "www.example.com"
nil "" nil nil t nil t nil)) [HEAD]
1 <- url-retrieve-synchronously: #<buffer *http www.example.com:80*>
[HEAD]
======================================================================
1 -> (url-retrieve-synchronously "http://www.example.com") [nil]
1 <- url-retrieve-synchronously: #<buffer *http www.example.com:80*>
[nil]
======================================================================
1 -> (url-retrieve-synchronously #s(url "http" nil nil "www.example.com"
nil "" nil nil t nil t t)) [HEAD]
1 <- url-retrieve-synchronously: #<buffer *http www.example.com:80*>
[HEAD]
======================================================================
1 -> (url-retrieve-synchronously #s(url "http" nil nil "www.example.com"
nil "" nil nil t nil t t)) [HEAD]
1 <- url-retrieve-synchronously: #<buffer *http www.example.com:80*>
[HEAD]
======================================================================
1 -> (url-retrieve-synchronously #s(url "http" nil nil "www.example.com"
nil "" nil nil t nil t t)) [HEAD]
1 <- url-retrieve-synchronously: #<buffer *http www.example.com:80*>
[HEAD]

From: Lars Ingebrigtsen @ 2022-10-12 11:33 UTC
To: Phil Sainty; +Cc: 58302
Phil Sainty <psainty@orcon.net.nz> writes:
> Undoubtedly so; but making the existing approach more efficient might
> also bring the same benefits to other functionality?
>
> E.g.:
>
> (url-handler-mode 1)
>
> (trace-function 'url-retrieve-synchronously "*trace-output*"
> (lambda () (format " [%s]" url-request-method)))
>
> (find-file "http://www.example.com")
That's true, but I kinda feel that this stuff is something that nobody
uses -- it's a fun trick, but you can't do much with it. That is, you
can load "http://www.example.com" as a file, but you can't save it, so...
It's a fun toy and a demonstration of how you can hook into the Emacs
file machinery. But if you're writing code that's actually going to be
used (like browse-url-emacs), it's the worst way.