From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: "John Coughlin" Newsgroups: gmane.emacs.devel Subject: Iteration over frame->face_alist is a huge performance suck Date: Thu, 01 Jul 2021 20:58:07 -0700 Message-ID: <3be10451-2126-4f55-b460-6d8191535c30@www.fastmail.com> Mime-Version: 1.0 Content-Type: multipart/alternative; boundary=5293a3fb5e1940fbabc0dd0f57363e82 Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="17302"; mail-complaints-to="usenet@ciao.gmane.io" User-Agent: Cyrus-JMAP/3.5.0-alpha0-530-gd0c265785f-fm-20210616.002-gd0c26578 To: emacs-devel@gnu.org Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Fri Jul 02 05:59:57 2021 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1lzALN-0004Jh-9K for ged-emacs-devel@m.gmane-mx.org; Fri, 02 Jul 2021 05:59:57 +0200 Original-Received: from localhost ([::1]:53024 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lzALM-0004tH-4q for ged-emacs-devel@m.gmane-mx.org; Thu, 01 Jul 2021 23:59:56 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:45164) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lzAKU-00048s-Lv for emacs-devel@gnu.org; Thu, 01 Jul 2021 23:59:02 -0400 Original-Received: from out3-smtp.messagingengine.com ([66.111.4.27]:41713) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lzAKQ-00071O-VN for emacs-devel@gnu.org; Thu, 01 Jul 2021 23:59:02 -0400 Original-Received: from compute3.internal (compute3.nyi.internal [10.202.2.43]) by mailout.nyi.internal (Postfix) with ESMTP id D515A5C00D9 for ; Thu, 1 Jul 2021 23:58:55 -0400 (EDT) Original-Received: from imap35 ([10.202.2.85]) by compute3.internal (MEProxy); Thu, 01 Jul 2021 23:58:55 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=content-type:date:from:message-id :mime-version:subject:to:x-me-proxy:x-me-proxy:x-me-sender :x-me-sender:x-sasl-enc; s=fm3; bh=xO4wU/4qLB1fb6BrrEEhdiDkggZOx VLDBWaWzt317jg=; b=lK1nJBXlMLOr1ojPJbyORmuzEu+XQGpTyWNArB1jV5pEe jFNP1v/Ehbfa3BC45J7uAOBsowAG9JglsPP59VTSUZ/srIJq7ZbAJ+SD8278WZs1 PuBLKzYGD+G/1QOmEX/vzoeWmn2GuTv0/KbD8dwsS/6wv0jHKLU3R+SkfCxPupHh DZ2G62FSwllo4tN8V+wM48eZLUpxT04d5MsP6JhNhRNQuCyrh774NFAA6pWj1/8S jCXXJbiGGr0yxAVI4ppIwS9Q3LYW9awcFSOeFrnMn+RcwsVHTmEwINOmjDGlYflp giCOskSxS6CxAo5bDe6bMs8aeS1tHa/FD8TEGYJ/Q== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrfeeijedgjeehucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpefofgggkfffhffvufgtsegrtderre erredtnecuhfhrohhmpedflfhohhhnucevohhughhhlhhinhdfuceojhgrtghksehjohhh nhgstghouhhghhhlihhnrdgtohhmqeenucggtffrrghtthgvrhhnpefhheduvddukeefhf ejleefieffieektdeuieekjefgveehledvgeekteejgfetleenucevlhhushhtvghrufhi iigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehjrggtkhesjhhohhhnsggtohhugh hhlhhinhdrtghomh X-ME-Proxy: Original-Received: by mailuser.nyi.internal (Postfix, from userid 501) id 8545315A005E; Thu, 1 Jul 2021 23:58:54 -0400 (EDT) X-Mailer: MessagingEngine.com Webmail Interface Received-SPF: none client-ip=66.111.4.27; envelope-from=jack@johnbcoughlin.com; helo=out3-smtp.messagingengine.com X-Spam_score_int: -25 X-Spam_score: -2.6 X-Spam_bar: -- X-Spam_report: (-2.6 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_PASS=-0.001, SPF_NONE=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.io gmane.emacs.devel:271098 Archived-At: --5293a3fb5e1940fbabc0dd0f57363e82 Content-Type: text/plain Hello all, I'm brand new to emacs development, so please forgive any breaches of protocol in the following missive. This is mostly based on a rather janky println debugging branch I hacked together, plus haphazard profiling of a quite "non -Q" session. Recently I have been investigating slowdowns in overall responsiveness and snappiness in my emacs setup, which arise during the course of normal work. I attached a sampling profiler to the process (Instruments on MacOS), and recorded ten or so seconds of mashing the movement cursors in my org-agenda window. The result is that 93.4% of the total samples are inside of the function lface_from_face_name_no_resolve, in xfaces.c. The culprit seems to be a large association list, frame->face_list, which in my current session contains over 1000 faces. This association list is queried every time that face properties need to be resolved. I believe several factors are contributing to this being a major bottleneck: - Some important faces are looked up once per line, on every redisplay. For example :fringe and :line-number. - new faces are *pre*-pended to the alist, so old, commonly used faces get pushed to the back. The linear search in assq_no_quit takes longer and longer to find them. This is compounded as packages create faces for their own purposes. See above about my session containing over 1000 faces. - This may be less of a problem in vanilla emacs, but some packages create faces that result in quite deep recursive calls to merge_named_face. Each such frame in the stack (I observed upwards of 50 such frames with my org-agenda button mashing) is doing its own face lookups. - display-line-number-mode is also a major offender, since it has to look up several faces and merge them, at least once per line number. To verify this bottleneck for yourself, I encourage you to attach a sampling profiler to an emacs -Q session, and determine the percentage of time spent inside the function lface_from_face_name_no_resolve. By turning on display-line-numbers-mode, I was able to get this proportion above 40% in just a few seconds of typing and mashing the up and down arrows. The other diagnostic I used was to compile a duplicate function my_assq_no_quit which returns the number of iterations required to find the key in the list. This can then be logged inside of lface_from_face_name_no_resolve. It is easy to verify that as more faces are registered (e.g. by turning on hl-line-mode, line numbers, org-agenda, etc.), the number of iterations required to find commonly needed faces like :fringe grows. So, what should be done about this? The frame's face_alist field is exposed directly to elisp via the (frame-face-alist) function. It is probably not a good idea to break any contract there, explicit or implicit. An alternative is to add a sister field to the frame struct, which would be a fixed-size cache of the most recently accessed faces, laid out contiguously in memory for access speed. The invalidation strategy for such a cache can be extremely simple, even as simple as clearing the whole cache. As the profiling demonstrates, accesses should dominate face updates by many orders of magnitude. An implementation like this with, say, 100 cached pointers to face objects, would benefit from cache coherency and probably outperform the association list lookup by quite a lot for almost all faces. ---- I have not even touched on the question of why faces like :fringe need to be looked up so many times per redisplay. I imagine that there may be a reason like, in principle it's possible for elisp to modify the value of :fringe in between consecutive uses of it. Setting that aside, I think that the use of the association list for such a hot function as lface_from_face_name_no_resolve should be reconsidered. ---- Thank you for reading this far! I look forward to a fruitful conversation about everyone's favorite editor. :) -Jack ---- Appendix: output of face lookup logging, piped through uniq -c. Each pair of lines is a single redisplay. [N iters] indicates the number of iterations the assq_no_quit lookup took. N goes up as the number of faces in the frame goes up. ``` 48 [97 iters] recalling face obj named fringe 68 [68 iters] recalling face obj named line-number-current-line 50 [97 iters] recalling face obj named fringe 68 [68 iters] recalling face obj named line-number-current-line 52 [97 iters] recalling face obj named fringe 68 [68 iters] recalling face obj named line-number-current-line 54 [97 iters] recalling face obj named fringe 68 [68 iters] recalling face obj named line-number-current-line 56 [97 iters] recalling face obj named fringe 68 [68 iters] recalling face obj named line-number-current-line 58 [97 iters] recalling face obj named fringe 68 [68 iters] recalling face obj named line-number-current-line 60 [97 iters] recalling face obj named fringe 68 [68 iters] recalling face obj named line-number-current-line 62 [97 iters] recalling face obj named fringe 68 [68 iters] recalling face obj named line-number-current-line 64 [97 iters] recalling face obj named fringe 68 [68 iters] recalling face obj named line-number-current-line 148 [97 iters] recalling face obj named fringe ``` --5293a3fb5e1940fbabc0dd0f57363e82 Content-Type: text/html Content-Transfer-Encoding: quoted-printable
Hello all,=

I'm brand new to emacs development, so ple= ase forgive any breaches of
protocol in the following miss= ive. This is mostly based on a rather
janky println debugg= ing branch I hacked together, plus haphazard
profiling of = a quite "non -Q" session.

Recently I have b= een investigating slowdowns in overall responsiveness
and = snappiness in my emacs setup, which arise during the course of
=
normal work. I attached a sampling profiler to the process
(Instruments on MacOS), and recorded ten or so seconds of
mashing the movement cursors in my org-agenda window. The result i= s
that 93.4% of the total samples are inside of the functi= on
lface_from_face_name_no_resolve, in xfaces.c. The culpr= it seems to be
a large association list, frame->face_li= st, which in my current
session contains over 1000 faces.<= br>

This association list is queried every time= that face properties need
to be resolved. I believe sever= al factors are contributing to this
being a major bottlene= ck:

- Some important faces are looked up on= ce per line, on every redisplay. For
  example :fring= e and :line-number.
- new faces are *pre*-pended to the al= ist, so old, commonly used faces get
  pushed to the = back. The linear search in assq_no_quit takes longer and longer
  to find them. This is compounded as packages create faces f= or their own
  purposes. See above about my session c= ontaining over 1000 faces.
- This may be less of a problem= in vanilla emacs, but some packages create faces
  t= hat result in quite deep recursive calls to merge_named_face. Each such = frame
  in the stack (I observed upwards of 50 such f= rames with my org-agenda button
  mashing) is doing i= ts own face lookups.
  - display-line-number-mode is = also a major offender, since it has to
    = look up several faces and merge them, at least once per line number.
=

To verify this bottleneck for yourself, I enco= urage you to attach a sampling
profiler to an emacs -Q ses= sion, and determine the percentage of time spent
inside th= e function lface_from_face_name_no_resolve. By turning on
= display-line-numbers-mode, I was able to get this proportion above 40% i= n just a
few seconds of typing and mashing the up and down= arrows. The other diagnostic I used was
to compile a dupl= icate function my_assq_no_quit which returns the number of
iterations required to find the key in the list. This can then be logge= d inside
of lface_from_face_name_no_resolve. It is easy to= verify that as more faces are
registered (e.g. by turning= on hl-line-mode, line numbers, org-agenda, etc.),
the num= ber of iterations required to find commonly needed faces like :fringe
grows.


So, what = should be done about this? The frame's face_alist field is exposed
directly to elisp via the (frame-face-alist) function. It is pr= obably not a good
idea to break any contract there, explic= it or implicit. An alternative is to add
a sister field to= the frame struct, which would be a fixed-size cache of the
most recently accessed faces, laid out contiguously in memory for acce= ss speed.
The invalidation strategy for such a cache can b= e extremely simple, even as
simple as clearing the whole c= ache. As the profiling demonstrates,
accesses should domin= ate face updates by many orders of magnitude. An implementation
like this with, say, 100 cached pointers to face objects, would be= nefit from
cache coherency and probably outperform the ass= ociation list lookup by
quite a lot for almost all faces.<= br>

----

I have no= t even touched on the question of why faces like :fringe need to be
<= /div>
looked up so many times per redisplay. I imagine that there ma= y be a reason
like, in principle it's possible for elisp t= o modify the value of :fringe in
between consecutive uses = of it. Setting that aside, I think
that the use of the ass= ociation list for such a hot function as
lface_from_face_n= ame_no_resolve should be reconsidered.

----=

Thank you for reading this far! I look for= ward to a fruitful conversation about
everyone's favorite = editor. :)
-Jack

----

Appendix: output of face lookup logging, piped thr= ough uniq -c. Each pair of
lines is a single redisplay. [N= iters] indicates the number of iterations the
assq_no_qui= t lookup took. N goes up as the number of faces in the
fra= me goes up.

```
  48 [97= iters] recalling face obj named fringe
  68 [68 iter= s] recalling face obj named line-number-current-line
 = ; 50 [97 iters] recalling face obj named fringe
  68 = [68 iters] recalling face obj named line-number-current-line
  52 [97 iters] recalling face obj named fringe
&n= bsp; 68 [68 iters] recalling face obj named line-number-current-line
=
  54 [97 iters] recalling face obj named fringe
  68 [68 iters] recalling face obj named line-number-current-= line
  56 [97 iters] recalling face obj named fringe<= br>
  68 [68 iters] recalling face obj named line-number-= current-line
  58 [97 iters] recalling face obj named= fringe
  68 [68 iters] recalling face obj named line= -number-current-line
  60 [97 iters] recalling face o= bj named fringe
  68 [68 iters] recalling face obj na= med line-number-current-line
  62 [97 iters] recallin= g face obj named fringe
  68 [68 iters] recalling fac= e obj named line-number-current-line
  64 [97 iters] = recalling face obj named fringe
  68 [68 iters] recal= ling face obj named line-number-current-line
148 [97 iters= ] recalling face obj named fringe
```

--5293a3fb5e1940fbabc0dd0f57363e82--