unofficial mirror of bug-gnu-emacs@gnu.org 
From: Phil Sainty <psainty@orcon.net.nz>
To: Lars Ingebrigtsen <larsi@gnus.org>
Cc: 58302@debbugs.gnu.org
Subject: bug#58302: 29.0.50; browse-url-emacs is extremely slow (and I think always has been?)
Date: Wed, 12 Oct 2022 23:25:52 +1300
Message-ID: <d3c5f2c00773da6c4148c4f29fe6a29e@webmail.orcon.net.nz>
In-Reply-To: <18018e9dddf6e6d5d07a567a36b59a65@webmail.orcon.net.nz>

On 2022-10-07 15:47, Phil Sainty wrote:
>  (or not-serious (sit-for 1 t))

With that commented out, I tried to do some profiling like this:

(progn
   (profiler-start 'cpu)
   (browse-url-emacs "http://www.example.com")
   (profiler-report)
   (profiler-stop)
   (profiler-reset)
   (kill-buffer "www.example.com"))

The results were perplexing in their variability -- all I can
suggest is that you run that code multiple times, and C-u RET
to expand the full profile after each run, and see whether you
also observe a variety of fairly different outcomes.

Here's one example where we can see `url-retrieve-synchronously'
being called 4 times; but other times it was called 2-3 times,
and the profile looked rather different.

           23  69%         - browse-url-emacs
           23  69%          - find-file-other-window
           23  69%           - find-file-noselect
           17  51%            - find-file-noselect-1
            8  24%             - after-find-file
            8  24%              - if
            4  12%               - let*
            4  12%                - cond
            4  12%                 - and
            4  12%                  - file-exists-p
            4  12%                   - url-file-handler
            4  12%                    - apply
            4  12%                     - url-file-exists-p
            4  12%                      - url-http-file-exists-p
            4  12%                       - url-http-head
            4  12%                        - url-retrieve-synchronously
            4  12%                         - accept-process-output
            4  12%                          - url-http-generic-filter
            4  12%                           - url-http-wait-for-headers-change-function
            4  12%                              mail-fetch-field
            4  12%               - run-hooks
            4  12%                - vc-refresh-state
            4  12%                 - vc-backend
            4  12%                  - vc-file-getprop
            4  12%                   - expand-file-name
            4  12%                      url-file-handler
            6  18%             - insert-file-contents
            6  18%              - url-file-handler
            6  18%               - apply
            6  18%                - url-insert-file-contents
            4  12%                   url-retrieve-synchronously
            2   6%                 - url-insert-buffer-contents
            2   6%                  - url-insert
            2   6%                   - mm-dissect-buffer
            2   6%                    - mm-dissect-singlepart
            2   6%                     - mm-copy-to-buffer
            2   6%                        generate-new-buffer
            3   9%             - file-readable-p
            3   9%              - url-file-handler
            3   9%               - apply
            3   9%                - url-file-exists-p
            3   9%                 - url-http-file-exists-p
            3   9%                  - url-http-head
            3   9%                   - url-retrieve-synchronously
            3   9%                    - url-retrieve
            3   9%                     - url-retrieve-internal
            3   9%                        url-http
            6  18%            - file-attributes
            6  18%             - url-file-handler
            6  18%              - apply
            6  18%               - url-file-attributes
            6  18%                - url-http-file-attributes
            6  18%                 - url-http-head-file-attributes
            6  18%                  - url-http-head
            6  18%                   - url-retrieve-synchronously
            6  18%                    - url-retrieve
            6  18%                     - url-retrieve-internal
            6  18%                      - url-http
            6  18%                         generate-new-buffer
           10  30%    Automatic GC



I'm not very familiar with the ins and outs of these code paths,
but my first impression is that we've initiated an operation which
needs to deal with a particular URL and if we were to make a high-
level binding to indicate that we were doing this, we could then
cache and re-use the results of those network requests for the
extent of that binding.

3 of the 4 `url-retrieve-synchronously' calls above are from
`url-http-head'; twice on account of `url-file-exists-p', and another
from `url-file-attributes'.
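
Because the profiler only samples, the number of calls is easier to
confirm by counting them directly. The following is a minimal sketch
of one way to do that with advice; every `my/' name is hypothetical
and not part of Emacs.

```elisp
(require 'cl-lib)
(require 'url)

(defvar my/url-sync-call-count 0
  "Number of `url-retrieve-synchronously' calls seen (hypothetical helper).")

(defun my/count-url-sync-call (&rest _args)
  "Increment `my/url-sync-call-count'; meant to be used as :before advice."
  (cl-incf my/url-sync-call-count))

(defmacro my/with-url-sync-call-count (&rest body)
  "Run BODY and return how often `url-retrieve-synchronously' was called."
  (declare (indent 0))
  `(let ((my/url-sync-call-count 0))
     (advice-add 'url-retrieve-synchronously
                 :before #'my/count-url-sync-call)
     (unwind-protect
         (progn ,@body my/url-sync-call-count)
       ;; Always remove the advice again, even if BODY signals an error.
       (advice-remove 'url-retrieve-synchronously
                      #'my/count-url-sync-call))))

;; Usage (needs network access):
;; (my/with-url-sync-call-count
;;   (browse-url-emacs "http://www.example.com"))
```

Running the usage form several times should show whether the count
itself varies between runs, or only the profiler's view of it.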

I see the following in the code:

(defun url-http-head (url)
   (let ((url-request-method "HEAD")
         (url-request-data nil))
     (url-retrieve-synchronously url)))

(defun url-http-file-exists-p (url)
   (let ((buffer (url-http-head url)))
     ...))

(defalias 'url-http-file-readable-p 'url-http-file-exists-p)

(defun url-http-head-file-attributes (url &optional _id-format)
   (let ((buffer (url-http-head url)))
     ...))

(defun url-http-file-attributes (url &optional id-format)
   (if (url-dav-supported-p url)
       (url-dav-file-attributes url id-format)
     (url-http-head-file-attributes url id-format)))


In principle, I don't see why we couldn't be re-using the buffer
returned by the first call to `url-http-head' in each of the
subsequent calls.
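
A rough sketch of what such re-use might look like, written as around
advice so that it can be tried without editing url-http.el. All `my/'
names are hypothetical. One caveat, as far as I can tell: the elided
bodies of the callers kill the buffer they are given, so with the
callers unchanged the cached buffer would usually be dead by the next
call and this would just re-fetch. A real change would need the
callers to leave the buffer alone while the cache is active.

```elisp
(require 'url-parse)                    ; for `url-recreate-url'

(defvar my/url-head-cache nil
  "When non-nil, a hash table mapping URL strings to HEAD response buffers.
Hypothetical; meant to be bound only for the extent of one command.")

(defun my/url-head-cache-key (url)
  "Return URL as a string, whether it is a string or a parsed URL object."
  (if (stringp url) url (url-recreate-url url)))

(defun my/url-http-head-cached (orig-fun url)
  "Around advice for `url-http-head': reuse a live cached buffer for URL."
  (if (not my/url-head-cache)
      (funcall orig-fun url)            ; no cache bound: behave as before
    (let* ((key (my/url-head-cache-key url))
           (cached (gethash key my/url-head-cache)))
      (if (buffer-live-p cached)
          cached
        ;; `puthash' returns the stored value, i.e. the new buffer.
        (puthash key (funcall orig-fun url) my/url-head-cache)))))

(defmacro my/with-url-head-cache (&rest body)
  "Run BODY with a fresh HEAD cache, killing cached buffers afterwards."
  (declare (indent 0))
  `(let ((my/url-head-cache (make-hash-table :test #'equal)))
     (unwind-protect
         (progn ,@body)
       (maphash (lambda (_key buf)
                  (when (buffer-live-p buf) (kill-buffer buf)))
                my/url-head-cache))))

;; To try it (this affects every caller of `url-http-head'):
;; (advice-add 'url-http-head :around #'my/url-http-head-cached)
;; (my/with-url-head-cache
;;   (browse-url-emacs "http://www.example.com"))
```

Binding the cache dynamically keeps its lifetime tied to the one
command, so a stale response cannot leak into later, unrelated
requests.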

Furthermore, we could *probably* flag the fact that we are 100%
intending to request the entire file later on in the command,
and use that information to just do a GET request instead of a
HEAD request in the first place -- the resulting buffer for which
can then *also* be re-used by the eventual `url-insert-file-contents'
call.

I think `url-http-head' itself should only ever do a HEAD request,
but `url-http-head-file-attributes' and `url-http-file-exists-p'
could conditionally use the full GET buffer.
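
As a sketch only, the conditional part could be a small helper that
the two callers use in place of calling `url-http-head' directly. The
flag variable and the helper name are hypothetical; nothing like them
exists in url-http.el today.

```elisp
(require 'url)
(require 'url-http)

(defvar my/url-http-will-fetch-body nil
  "Non-nil while a command is known to want the whole document.
Hypothetical flag; a command such as `browse-url-emacs' would bind it.")

(defun my/url-http-probe (url)
  "Fetch URL with GET when the body is wanted anyway, else with HEAD.
`url-http-head' itself is left untouched and always sends HEAD."
  (if my/url-http-will-fetch-body
      (let ((url-request-method "GET")
            (url-request-data nil))
        (url-retrieve-synchronously url))
    (url-http-head url)))

;; Usage (needs network access):
;; (let ((my/url-http-will-fetch-body t))
;;   (my/url-http-probe "http://www.example.com"))
```

On its own this saves nothing, since it swaps a small request for a
larger one. The gain would only come once the GET buffer is also what
`url-insert-file-contents' ends up using.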


-Phil





