unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
@ 2022-03-31 19:49 dal-blazej--- via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-04-01  6:05 ` Eli Zaretskii
  2022-04-02 15:33 ` Lars Ingebrigtsen
  0 siblings, 2 replies; 25+ messages in thread
From: dal-blazej--- via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-03-31 19:49 UTC (permalink / raw)
  To: 54657


Hi,

I was surprised to see that visiting that particular site with eww leads
to 100% CPU usage for 2-3 minutes.

See the profiler output below:

---------- CPU
     83,835,054  84% - url-http-generic-filter
     83,687,707  84%  - url-http-content-length-after-change-function
     82,024,107  82%   - url-http-activate-callback
     82,022,955  82%    - eww-render
     55,943,666  56%     - eww-display-html
     30,476,531  30%      - funcall-with-delayed-message
     30,476,531  30%       + #<compiled 0x6f61fe78c524dc6>
            704   0%        plist-put
             21   0%        url-generic-parse-url
     13,373,648  13%     + eww--after-page-change
          4,255   0%     + mail-header-parse-content-type
          1,098   0%     + url-generic-parse-url
          1,056   0%     + set-buffer-file-coding-system
      1,370,741   1%     file-size-human-readable-iec
          5,088   0%   + url-http-parse-headers
          8,360   0%  + url-http-wait-for-headers-change-function
     14,811,644  14% + command-execute
        186,472   0% + redisplay_internal (C function)
         26,027   0% + url-http-async-sentinel
         22,984   0% + timer-event-handler
         21,350   0% + nsm-verify-connection
            232   0% + gui-set-selection
            232   0% + deactivate-mark
             24   0% + eldoc-schedule-timer
              0   0%   ...
----------

---------- RAM
       50462  98% - url-http-generic-filter
       50458  98%  - url-http-content-length-after-change-function
       50366  98%   - url-http-activate-callback
       50366  98%    - eww-render
       50357  97%     - eww-display-html
          89   0%      + funcall-with-delayed-message
           1   0%     + eww--after-page-change
          74   0%     file-size-human-readable-iec
         689   1% + ...
         230   0% + command-execute
           8   0% + redisplay_internal (C function)
           2   0% + timer-event-handler
           1   0% + nsm-verify-connection
----------

I usually find eww not too CPU-hungry, so I thought you might be
interested in this report.



In GNU Emacs 29.0.50 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.24, cairo version 1.16.0)
 of 2022-03-31 built on localhost
Repository revision: 948181df9cbdcc8845fc3662e2007d8e09f48c71
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12011000
System Description: Debian GNU/Linux 11 (bullseye)

Configured using:
 'configure --build x86_64-linux-gnu --prefix=/opt/emacs
 --with-mailutils --with-sound=yes --without-gconf --without-gsettings
 --with-x=yes --without-toolkit-scroll-bars --with-x-toolkit=gtk3
 --with-json --with-native-compilation --with-xwidgets
 build_alias=x86_64-linux-gnu 'CFLAGS=-O2 -Wall ''

Configured features:
CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS HARFBUZZ JPEG JSON LIBSELINUX
LIBXML2 MODULES NATIVE_COMP NOTIFY INOTIFY PDUMPER PNG SECCOMP SOUND
SQLITE3 THREADS TIFF X11 XDBE XIM XPM XWIDGETS GTK3 ZLIB

Important settings:
  value of $LANG: en_US
  locale-coding-system: utf-8

Major mode: Summary

Load-path shadows:
/home/user/.emacs.d/elpa/transient-0.3.7/transient hides /opt/emacs/share/emacs/29.0.50/lisp/transient






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-03-31 19:49 bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/ dal-blazej--- via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-04-01  6:05 ` Eli Zaretskii
  2022-04-02 15:35   ` Lars Ingebrigtsen
  2022-04-02 15:33 ` Lars Ingebrigtsen
  1 sibling, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2022-04-01  6:05 UTC (permalink / raw)
  To: dal-blazej; +Cc: 54657

> Date: Thu, 31 Mar 2022 21:49:40 +0200
> From: dal-blazej--- via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
> 
> I was surprised to see that visiting that particular site with eww leads
> to 100% CPU usage for 2-3 minutes.
> 
> See the profiler output below:
> 
> ---------- CPU
>      83,835,054  84% - url-http-generic-filter
>      83,687,707  84%  - url-http-content-length-after-change-function
>      82,024,107  82%   - url-http-activate-callback
>      82,022,955  82%    - eww-render
>      55,943,666  56%     - eww-display-html
>      30,476,531  30%      - funcall-with-delayed-message
>      30,476,531  30%       + #<compiled 0x6f61fe78c524dc6>

Type 'v' to see the page's source, and you will immediately understand
why.  99% of that page is a huge JS script that is basically a single
humongous line whose length is 12MB.  Even Less chokes on this page,
just showing its end on the screen.

IOW, IMO this is just another instance of the well-known problem in
the display engine with very long lines.
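
To check how long the longest line in the source buffer really is,
something like the following could be evaluated there.  This is only a
sketch; `my/longest-line-length' is a hypothetical helper name, not an
existing Emacs function.

```elisp
(defun my/longest-line-length ()
  "Return the length in characters of the longest line in the current buffer."
  (save-excursion
    (goto-char (point-min))
    (let ((longest 0))
      ;; Walk the buffer line by line, remembering the largest width seen.
      (while (not (eobp))
        (setq longest (max longest
                           (- (line-end-position) (line-beginning-position))))
        (forward-line 1))
      longest)))
```

Calling (my/longest-line-length) in the buffer shown by 'v' gives a
number that can be compared against the total buffer size.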

Maybe EWW could be smarter when displaying pages with JS scripts, but
how many pages out there have something similar?






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-03-31 19:49 bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/ dal-blazej--- via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-04-01  6:05 ` Eli Zaretskii
@ 2022-04-02 15:33 ` Lars Ingebrigtsen
  2022-04-02 23:55   ` dal-blazej--- via Bug reports for GNU Emacs, the Swiss army knife of text editors
       [not found]   ` <87lewnt4td.fsf@onenetbeyond.org>
  1 sibling, 2 replies; 25+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-02 15:33 UTC (permalink / raw)
  To: dal-blazej; +Cc: 54657

dal-blazej@onenetbeyond.org writes:

> I was surprised to see that visiting that particular site with eww leads
> to 100% CPU usage for 2-3 minutes.

I'm unable to reproduce the problem -- rendering that site is nearly
instantaneous for me.  But perhaps it's serving out different HTML to
you?  Try hitting `v', save the HTML somewhere, and post the resulting
URL, and I can take a look.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-01  6:05 ` Eli Zaretskii
@ 2022-04-02 15:35   ` Lars Ingebrigtsen
  2022-04-02 15:45     ` Eli Zaretskii
  2022-04-02 17:17     ` Eli Zaretskii
  0 siblings, 2 replies; 25+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-02 15:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 54657, dal-blazej

Eli Zaretskii <eliz@gnu.org> writes:

> Type 'v' to see the page's source, and you will immediately understand
> why.  99% of that page is a huge JS script that is basically a single
> humongous line whose length is 12MB.

eww doesn't render <script> things, so it doesn't matter how long they
are for eww.  (It does matter for libxml-parse-html-region, but it
should be able to parse 12MB of <script> in a few milliseconds.)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-02 15:35   ` Lars Ingebrigtsen
@ 2022-04-02 15:45     ` Eli Zaretskii
  2022-04-02 16:04       ` Lars Ingebrigtsen
  2022-04-02 17:17     ` Eli Zaretskii
  1 sibling, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2022-04-02 15:45 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 54657, dal-blazej

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: dal-blazej@onenetbeyond.org,  54657@debbugs.gnu.org
> Date: Sat, 02 Apr 2022 17:35:03 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Type 'v' to see the page's source, and you will immediately understand
> > why.  99% of that page is a huge JS script that is basically a single
> > humongous line whose length is 12MB.
> 
> eww doesn't render <script> things, so it doesn't matter how long they
> are for eww.

What do you mean by "doesn't render"?  Does it traverse it?






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-02 15:45     ` Eli Zaretskii
@ 2022-04-02 16:04       ` Lars Ingebrigtsen
  2022-04-02 16:18         ` Eli Zaretskii
  0 siblings, 1 reply; 25+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-02 16:04 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 54657, dal-blazej

Eli Zaretskii <eliz@gnu.org> writes:

> What do you mean by "doesn't render"?  Does it traverse it?

Yes.  A <script> is a node containing one string (which is ignored).
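
As an illustration of what "ignored" means here, a traversal over the
parsed DOM can simply skip the subtree of any `script' node.  The
following is a minimal sketch and not shr's actual code;
`my/dom-visible-text' is a hypothetical name, and it assumes the usual
(TAG ATTRIBUTES CHILDREN...) list form that libxml-parse-html-region
returns.

```elisp
(defun my/dom-visible-text (dom)
  "Return a list of the text strings in DOM, skipping <script> subtrees."
  (cond
   ;; A text child is a plain string: keep it.
   ((stringp dom) (list dom))
   ;; A <script> node is visited, but its single string child is dropped.
   ((and (consp dom) (eq (car dom) 'script)) nil)
   ;; Any other element: recurse into the children after TAG and ATTRIBUTES.
   ((consp dom) (mapcan #'my/dom-visible-text (cddr dom)))))
```

With this shape of traversal, the cost of a <script> node does not
depend on how long its contents are, since the string is never examined.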

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-02 16:04       ` Lars Ingebrigtsen
@ 2022-04-02 16:18         ` Eli Zaretskii
  2022-04-02 16:29           ` Lars Ingebrigtsen
  0 siblings, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2022-04-02 16:18 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 54657, dal-blazej

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: dal-blazej@onenetbeyond.org,  54657@debbugs.gnu.org
> Date: Sat, 02 Apr 2022 18:04:38 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > What do you mean by "doesn't render"?  Does it traverse it?
> 
> Yes.  A <script> is a node containing one string (which is ignored).

And is shr-fill-lines involved in this in any way, shape or form?






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-02 16:18         ` Eli Zaretskii
@ 2022-04-02 16:29           ` Lars Ingebrigtsen
  0 siblings, 0 replies; 25+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-02 16:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 54657, dal-blazej

Eli Zaretskii <eliz@gnu.org> writes:

> And is shr-fill-lines involved in this in any way, shape or form?

The contents of the <script> node are ignored, so no.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-02 15:35   ` Lars Ingebrigtsen
  2022-04-02 15:45     ` Eli Zaretskii
@ 2022-04-02 17:17     ` Eli Zaretskii
  2022-04-03 11:52       ` Lars Ingebrigtsen
  1 sibling, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2022-04-02 17:17 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 54657, dal-blazej

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: dal-blazej@onenetbeyond.org,  54657@debbugs.gnu.org
> Date: Sat, 02 Apr 2022 17:35:03 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Type 'v' to see the page's source, and you will immediately understand
> > why.  99% of that page is a huge JS script that is basically a single
> > humongous line whose length is 12MB.
> 
> eww doesn't render <script> things, so it doesn't matter how long they
> are for eww.  (It does matter for libxml-parse-html-region, but it
> should be able to parse 12MB of <script> in a few milliseconds.)

According to what I see here, libxml-parse-html-region is indeed the
part that takes most of the CPU time, and I measured about 30 sec here
it took it to parse that page.  If you see something very different,
maybe the important factor here is the version of libxml2?






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-02 15:33 ` Lars Ingebrigtsen
@ 2022-04-02 23:55   ` dal-blazej--- via Bug reports for GNU Emacs, the Swiss army knife of text editors
       [not found]   ` <87lewnt4td.fsf@onenetbeyond.org>
  1 sibling, 0 replies; 25+ messages in thread
From: dal-blazej--- via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-04-02 23:55 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 54657


Maybe it's a question of horsepower?
I run Emacs with 2 CPUs and 4 GB of RAM within a virtual machine.

Starting from master with -Q, the CPU still stays at 100% for more than
3 minutes when accessing the page. I cannot see the source from within
eww: it waits for 2-3 minutes, then there is a 3-minute CPU spike, and
finally it shows only the rendered HTML, not the source. I tried 3 times.

I can see the source from Firefox and yes, that's ... yikes.






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
       [not found]   ` <87lewnt4td.fsf@onenetbeyond.org>
@ 2022-04-03 11:51     ` Lars Ingebrigtsen
  0 siblings, 0 replies; 25+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-03 11:51 UTC (permalink / raw)
  To: dal-blazej; +Cc: 54657

dal-blazej@onenetbeyond.org writes:

> See the attached archive.

Something went wrong when you saved the source -- it's just a fragment
of the page, and it's encoded as...  I don't even know what to call it.
"Meta-HTML"?  That is, every < is encoded as &lt; etc, and then put into
a number of <span> statements.

What did you use to get that?  It's uniquely weird.  :-)

Instead, use the `v' command in eww to get the source buffer, and save
that instead.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-02 17:17     ` Eli Zaretskii
@ 2022-04-03 11:52       ` Lars Ingebrigtsen
  2022-04-03 12:06         ` Andreas Schwab
  2022-04-03 12:07         ` Eli Zaretskii
  0 siblings, 2 replies; 25+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-03 11:52 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 54657, dal-blazej

Eli Zaretskii <eliz@gnu.org> writes:

> According to what I see here, libxml-parse-html-region is indeed the
> part that takes most of the CPU time, and I measured about 30 sec here
> it took it to parse that page.  If you see something very different,
> maybe the important factor here is the version of libxml2?

Wow.  If you call

(benchmark-run 1 (libxml-parse-html-region (point-min) (point-max)))

in the 12MB source buffer for that page, it reports 30 seconds?  It
reports 0.01 seconds for me.
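
For anyone who wants to repeat the measurement, the form above can be
wrapped so that it also copes with an Emacs built without libxml2.
This is a sketch only; `my/time-libxml-parse' is a hypothetical name,
and the check assumes `libxml-available-p' (Emacs 27.1 or later).

```elisp
(require 'benchmark)

(defun my/time-libxml-parse ()
  "Return seconds spent parsing the current buffer as HTML with libxml2.
Return nil when Emacs was built without libxml2 support."
  (when (and (fboundp 'libxml-available-p) (libxml-available-p))
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED); keep ELAPSED.
    (car (benchmark-run 1
           (libxml-parse-html-region (point-min) (point-max))))))
```

Evaluate (my/time-libxml-parse) in the source buffer obtained with 'v'.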

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-03 11:52       ` Lars Ingebrigtsen
@ 2022-04-03 12:06         ` Andreas Schwab
  2022-04-03 12:22           ` Lars Ingebrigtsen
  2022-04-03 12:07         ` Eli Zaretskii
  1 sibling, 1 reply; 25+ messages in thread
From: Andreas Schwab @ 2022-04-03 12:06 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 54657, dal-blazej

On Apr 03 2022, Lars Ingebrigtsen wrote:

> Eli Zaretskii <eliz@gnu.org> writes:
>
>> According to what I see here, libxml-parse-html-region is indeed the
>> part that takes most of the CPU time, and I measured about 30 sec here
>> it took it to parse that page.  If you see something very different,
>> maybe the important factor here is the version of libxml2?
>
> Wow.  If you call
>
> (benchmark-run 1 (libxml-parse-html-region (point-min) (point-max)))
>
> in the 12MB source buffer for that page, it reports 30 seconds?  It
> reports 0.01 seconds for me.

I'm getting 46 seconds.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-03 11:52       ` Lars Ingebrigtsen
  2022-04-03 12:06         ` Andreas Schwab
@ 2022-04-03 12:07         ` Eli Zaretskii
  2022-04-03 12:21           ` Eli Zaretskii
  2022-04-03 12:21           ` Lars Ingebrigtsen
  1 sibling, 2 replies; 25+ messages in thread
From: Eli Zaretskii @ 2022-04-03 12:07 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 54657, dal-blazej

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: dal-blazej@onenetbeyond.org,  54657@debbugs.gnu.org
> Date: Sun, 03 Apr 2022 13:52:42 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > According to what I see here, libxml-parse-html-region is indeed the
> > part that takes most of the CPU time, and I measured about 30 sec here
> > it took it to parse that page.  If you see something very different,
> > maybe the important factor here is the version of libxml2?
> 
> Wow.  If you call
> 
> (benchmark-run 1 (libxml-parse-html-region (point-min) (point-max)))
> 
> in the 12MB source buffer for that page, it reports 30 seconds?  It
> reports 0.01 seconds for me.

I didn't do the above, I just stepped through eww-display-html in
Edebug, and looked at my watch (30 sec is easy to measure without any
instruments).

So what is your version of libxml2?  Maybe that is the important
aspect here.






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-03 12:07         ` Eli Zaretskii
@ 2022-04-03 12:21           ` Eli Zaretskii
  2022-04-03 12:25             ` Lars Ingebrigtsen
  2022-04-03 12:21           ` Lars Ingebrigtsen
  1 sibling, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2022-04-03 12:21 UTC (permalink / raw)
  To: larsi; +Cc: 54657, dal-blazej

> Date: Sun, 03 Apr 2022 15:07:53 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: 54657@debbugs.gnu.org, dal-blazej@onenetbeyond.org
> 
> > From: Lars Ingebrigtsen <larsi@gnus.org>
> > Cc: dal-blazej@onenetbeyond.org,  54657@debbugs.gnu.org
> > Date: Sun, 03 Apr 2022 13:52:42 +0200
> > 
> > Wow.  If you call
> > 
> > (benchmark-run 1 (libxml-parse-html-region (point-min) (point-max)))
> > 
> > in the 12MB source buffer for that page, it reports 30 seconds?  It
> > reports 0.01 seconds for me.
> 
> I didn't do the above, I just stepped through eww-display-html in
> Edebug, and looked at my watch (30 sec is easy to measure without any
> instruments).
> 
> So what is your version of libxml2?  Maybe that is the important
> aspect here.

And in addition, I hope you verified that when
libxml-parse-html-region finishes in your case after so little time,
it returns the document's DOM, not something trivial?






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-03 12:07         ` Eli Zaretskii
  2022-04-03 12:21           ` Eli Zaretskii
@ 2022-04-03 12:21           ` Lars Ingebrigtsen
  2022-04-03 12:44             ` Eli Zaretskii
  1 sibling, 1 reply; 25+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-03 12:21 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 54657, dal-blazej

Eli Zaretskii <eliz@gnu.org> writes:

> I didn't do the above, I just stepped through eww-display-html in
> Edebug, and looked at my watch (30 sec is easy to measure without any
> instruments).

Well, that doesn't say whether it's libxml2 or something else...  could
you try the benchmark-run?

> So what is your version of libxml2?  Maybe that is the important
> aspect here.

apt says:

libxml2/testing,now 2.9.13+dfsg-1 amd64 [installed,automatic]

I guess it could also have something to do with how we're interfacing
with the library -- perhaps we're creating strings (or something) in a
sub-optimal way on some architectures and not others?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-03 12:06         ` Andreas Schwab
@ 2022-04-03 12:22           ` Lars Ingebrigtsen
  2022-04-03 13:57             ` Andreas Schwab
  0 siblings, 1 reply; 25+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-03 12:22 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: 54657, dal-blazej

Andreas Schwab <schwab@linux-m68k.org> writes:

> I'm getting 46 seconds.

Wow.  What OS is this on, and what's the libxml2 version?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-03 12:21           ` Eli Zaretskii
@ 2022-04-03 12:25             ` Lars Ingebrigtsen
  0 siblings, 0 replies; 25+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-03 12:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 54657, dal-blazej

Eli Zaretskii <eliz@gnu.org> writes:

> And in addition, I hope you verified that when
> libxml-parse-html-region finishes in your case after so little time,
> it returns the document's DOM, not something trivial?

The results look fine to me.  (memory-report-object-size
(libxml-parse-html-region (point-min) (point-max))) reports a 10MB
structure.
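
Another rough sanity check, besides the size in bytes, would be to
count the element nodes in the result.  A sketch, with
`my/dom-node-count' as a hypothetical helper that assumes the
(TAG ATTRIBUTES CHILDREN...) list form:

```elisp
(defun my/dom-node-count (dom)
  "Return the number of element nodes in DOM, including DOM itself."
  (if (consp dom)
      (let ((count 1))
        ;; Children start after the TAG and ATTRIBUTES elements.
        (dolist (child (cddr dom))
          (setq count (+ count (my/dom-node-count child))))
        count)
    ;; Text children are plain strings and are not counted.
    0))

;; Usage, in the page's source buffer:
;; (my/dom-node-count (libxml-parse-html-region (point-min) (point-max)))
```

A result of only a handful of nodes for a large page would suggest the
parse stopped early.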

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-03 12:21           ` Lars Ingebrigtsen
@ 2022-04-03 12:44             ` Eli Zaretskii
  0 siblings, 0 replies; 25+ messages in thread
From: Eli Zaretskii @ 2022-04-03 12:44 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 54657, dal-blazej

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: dal-blazej@onenetbeyond.org,  54657@debbugs.gnu.org
> Date: Sun, 03 Apr 2022 14:21:55 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > I didn't do the above, I just stepped through eww-display-html in
> > Edebug, and looked at my watch (30 sec is easy to measure without any
> > instruments).
> 
> Well, that doesn't say whether it's libxml2 or something else...

Of course it does: I stepped through the code one sexp at a time, and
measured only the time it took to execute libxml-parse-html-region
(after seeing that it doesn't return quickly enough).

> libxml2/testing,now 2.9.13+dfsg-1 amd64 [installed,automatic]

It's 2.7.8 here.

> I guess it could also have something to do with how we're interfacing
> with the library -- perhaps we're creating strings (or something) in a
> sub-optimal way on some architectures and not others?

The OP is on x86_64-pc-linux-gnu.






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-03 12:22           ` Lars Ingebrigtsen
@ 2022-04-03 13:57             ` Andreas Schwab
  2022-04-03 14:55               ` Lars Ingebrigtsen
  0 siblings, 1 reply; 25+ messages in thread
From: Andreas Schwab @ 2022-04-03 13:57 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 54657, dal-blazej

Name        : libxml2-2
Version     : 2.9.7
Release     : 3.37.1
Architecture: x86_64
Install Date: Sa 26 Jun 2021 14:16:06 CEST
Group       : System/Libraries
Size        : 1620022
License     : MIT
Signature   : RSA/SHA256, Fr 21 Mai 2021 16:18:38 CEST, Key ID 70af9e8139db7c82
Source RPM  : libxml2-2.9.7-3.37.1.src.rpm
Build Date  : Fr 21 Mai 2021 16:17:51 CEST
Build Host  : sheep62
Relocations : (not relocatable)
Packager    : https://www.suse.com/
Vendor      : SUSE LLC <https://www.suse.com/>
URL         : http://xmlsoft.org
Summary     : A Library to Manipulate XML Files
Description :
The XML C library was initially developed for the GNOME project. It is
now used by many programs to load and save extensible data structures
or manipulate any kind of XML files.

This library implements a number of existing standards related to
markup languages, including the XML standard, name spaces in XML, XML
Base, RFC 2396, XPath, XPointer, HTML4, XInclude, SGML catalogs, and
XML catalogs. In most cases, libxml tries to implement the
specification in a rather strict way. To some extent, it provides
support for the following specifications, but does not claim to
implement them: DOM, FTP client, HTTP client, and SAX.

The library also supports RelaxNG. Support for W3C XML Schemas is in
progress.
Distribution: SUSE Linux Enterprise 15

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-03 13:57             ` Andreas Schwab
@ 2022-04-03 14:55               ` Lars Ingebrigtsen
  2022-04-03 15:05                 ` Eli Zaretskii
  2022-04-03 15:05                 ` Andreas Schwab
  0 siblings, 2 replies; 25+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-03 14:55 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: 54657, dal-blazej

Andreas Schwab <schwab@linux-m68k.org> writes:

> Version     : 2.9.7

I tried this on a Debian/bullseye machine, which has 2.9.10, and I can
reproduce the problem there -- libxml-parse-html-region takes 20 seconds
there.

So I guess this is something the libxml people have fixed sometime
between 2.9.10 and 2.9.13.  

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-03 14:55               ` Lars Ingebrigtsen
@ 2022-04-03 15:05                 ` Eli Zaretskii
  2022-04-04 10:31                   ` Lars Ingebrigtsen
  2022-04-03 15:05                 ` Andreas Schwab
  1 sibling, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2022-04-03 15:05 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 54657, schwab, dal-blazej

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: 54657@debbugs.gnu.org,  Eli Zaretskii <eliz@gnu.org>,
>   dal-blazej@onenetbeyond.org
> Date: Sun, 03 Apr 2022 16:55:32 +0200
> 
> Andreas Schwab <schwab@linux-m68k.org> writes:
> 
> > Version     : 2.9.7
> 
> I tried this on a Debian/bullseye machine, which has 2.9.10, and I can
> reproduce the problem there -- libxml-parse-html-region takes 20 seconds
> there.
> 
> So I guess this is something the libxml people have fixed sometime
> between 2.9.10 and 2.9.13.  

I guess this should be in PROBLEMS?






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-03 14:55               ` Lars Ingebrigtsen
  2022-04-03 15:05                 ` Eli Zaretskii
@ 2022-04-03 15:05                 ` Andreas Schwab
  1 sibling, 0 replies; 25+ messages in thread
From: Andreas Schwab @ 2022-04-03 15:05 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 54657, dal-blazej

On Apr 03 2022, Lars Ingebrigtsen wrote:

> So I guess this is something the libxml people have fixed sometime
> between 2.9.10 and 2.9.13.  

https://gitlab.gnome.org/GNOME/libxml2/-/commit/faea2fa9

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-03 15:05                 ` Eli Zaretskii
@ 2022-04-04 10:31                   ` Lars Ingebrigtsen
  2022-04-21 14:23                     ` dal-blazej--- via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 25+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-04 10:31 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 54657, schwab, dal-blazej

Eli Zaretskii <eliz@gnu.org> writes:

>> So I guess this is something the libxml people have fixed sometime
>> between 2.9.10 and 2.9.13.  
>
> I guess this should be in PROBLEMS?

I guess so, but it's a pretty obscure problem -- it's really unusual to
have that much inline <script> stuff in the HTML.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#54657: 29.0.50; 100% CPU usage with eww on https://blogsurf.io/
  2022-04-04 10:31                   ` Lars Ingebrigtsen
@ 2022-04-21 14:23                     ` dal-blazej--- via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 25+ messages in thread
From: dal-blazej--- via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-04-21 14:23 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 54657, Eli Zaretskii, schwab


> I guess so, but it's a pretty obscure problem

I think so. I'll close.

Thanks for your insights.




