unofficial mirror of emacs-devel@gnu.org 
* Performance bottleneck in ns_draw_fringe_bitmap
@ 2024-06-26 11:56 Ben Simms
  2024-08-05  9:36 ` Mattias Engdegård
From: Ben Simms @ 2024-06-26 11:56 UTC
  To: emacs-devel


Hi all. I recently started using Emacs HEAD (NS port) on an ARM Mac running
macOS Sonoma.

I've noticed that ns_draw_fringe_bitmap is a fairly large performance sink
when using pixel scrolling (to the point of 99% of cpu time being inside
this function, with Emacs drawing at approx 5Hz). The slowness here isn't
as obvious when not pixel scrolling, presumably because Emacs never tries
to redraw at 60+Hz otherwise.

I did some profiling and found that, in the worst case I observed, of the
99% of CPU time spent in ns_draw_fringe_bitmap, approx. 50% is spent in
[NSBezierPath copy] and approx. 30% in [NSBezierPath fill].

I used the following benchmark with emacs -Q to try to reproduce the
problem. It does not reproduce the extreme case I see with my own config,
but it was enough to validate a patch that fixes the slowdown I'm
experiencing:

(defun scroll-up-benchmark ()
  "Pixel-scroll five lines up and back down, ten times; print GCs and time."
  (interactive)
  (let ((oldgc gcs-done)
        (oldtime (float-time)))
    (dotimes (_ 10)
      (pixel--whistlestop-pixel-up (* 5 (pixel-line-height)))
      (pixel-scroll-pixel-down (* 5 (pixel-line-height))))
    ;; Report on stderr so the result is visible outside the Emacs frame.
    (princ (format "GCs: %d Elapsed time: %f seconds\n"
                   (- gcs-done oldgc) (- (float-time) oldtime))
           #'external-debugging-output)))

(defun add-fringes ()
  (interactive)
  (dotimes (_ 20)
    (newline-and-indent)
    (insert "A")
    (goto-char (line-beginning-position))
    (let ((s "x")
          (fringe-overlay (make-overlay (point) (1+ (point)))))
      (put-text-property 0 1 'display (list 'left-fringe 'left-triangle) s)
      (overlay-put fringe-overlay 'before-string s))
    (goto-char (line-end-position))))

(scroll-bar-mode -1)
(menu-bar-mode -1)
(pixel-scroll-mode)
(pixel-scroll-precision-mode)
(setq pixel-scroll-precision-use-momentum t)

(dotimes (_ 20)
  (add-fringes))
(dotimes (_ 5)
  (end-of-buffer)
  (condition-case nil (while t (scroll-down))
    (error nil))
  (scroll-up-benchmark))


On Emacs (e4e1d0cd0) this reports the following:

GCs: 14 Elapsed time: 5.449032 seconds
GCs: 14 Elapsed time: 5.209006 seconds
GCs: 13 Elapsed time: 5.187779 seconds
GCs: 13 Elapsed time: 5.178472 seconds
GCs: 13 Elapsed time: 5.184741 seconds

The profiler output for this is the following:

Weight             Self Weight  Symbol Name
 11.30 Gc  100.0%  1.00 Mc      Fredisplay
 11.30 Gc   99.9%  -            redisplay_preserve_echo_area
 11.12 Gc   98.3%  1.00 Mc      redisplay_internal
  6.97 Gc   61.6%  -            internal_condition_case_1
  6.97 Gc   61.6%  -            redisplay_window_1
  6.97 Gc   61.6%  -            redisplay_window
  5.60 Gc   49.5%  -            try_window
  1.25 Gc   11.0%  1.00 Mc      display_mode_lines
111.68 Mc    0.9%  -            update_frame_tool_bar
  6.00 Mc    0.0%  -            gui_consider_frame_title
  2.00 Mc    0.0%  1.00 Mc      update_window_fringes
  1.00 Mc    0.0%  1.00 Mc      reconsider_clip_changes
  1.00 Mc    0.0%  -            cursor_row_fully_visible_p
  1.00 Mc    0.0%  1.00 Mc      ___chkstk_darwin
  1.00 Mc    0.0%  1.00 Mc      redisplay_tab_bar
  1.00 Mc    0.0%  1.00 Mc      try_window_id
  3.64 Gc   32.2%  -            update_frame
  3.47 Gc   30.6%  -            update_window_tree
  3.47 Gc   30.6%  1.00 Mc      update_window
  2.15 Gc   19.0%  -            gui_update_window_end
  2.11 Gc   18.6%  -            draw_window_fringes
  2.09 Gc   18.4%  1.00 Mc      draw_row_fringe_bitmaps
 20.00 Mc    0.1%  -            set_buffer_internal_1
 40.00 Mc    0.3%  -            display_and_set_cursor
  1.00 Mc    0.0%  1.00 Mc      gui_draw_vertical_border
586.18 Kc    0.0%  -            unblock_input
  1.28 Gc   11.3%  3.00 Mc      update_window_line
 25.00 Mc    0.2%  3.00 Mc      scrolling_window
  9.39 Mc    0.0%  3.00 Mc      update_window_fringes
  1.00 Mc    0.0%  1.00 Mc      redraw_overlapped_rows
  1.00 Mc    0.0%  1.00 Mc      redraw_overlapping_rows
152.53 Mc    1.3%  -            update_begin
 18.72 Mc    0.1%  -            update_end
397.05 Mc    3.5%  -            prepare_menu_bars
 91.41 Mc    0.8%  -            echo_area_display
  5.00 Mc    0.0%  -            ns_frame_up_to_date
  4.01 Mc    0.0%  -            start_polling
  4.00 Mc    0.0%  1.00 Mc      unbind_to
  3.00 Mc    0.0%  2.00 Mc      run_window_change_functions
  2.00 Mc    0.0%  -            hscroll_windows
  1.00 Mc    0.0%  1.00 Mc      XCONS
  1.00 Mc    0.0%  -            Fgethash
  1.00 Mc    0.0%  -            clear_desired_matrices
  1.00 Mc    0.0%  1.00 Mc      mark_window_display_accurate_1
  1.00 Mc    0.0%  -            ns_set_doc_edited
  1.00 Mc    0.0%  1.00 Mc      clear_garbaged_frames
181.29 Mc    1.6%  -            flush_frame
  1.00 Mc    0.0%  -            unbind_to


11.90 Gc  100.0% - Fredisplay
11.90 Gc   99.9% - redisplay_preserve_echo_area
11.66 Gc   97.9% 2.00 Mc  redisplay_internal
7.48 Gc   62.8% -   internal_condition_case_1
7.48 Gc   62.8% -    redisplay_window_1
7.48 Gc   62.8% 3.00 Mc     redisplay_window
6.35 Gc   53.3% -      try_window
6.23 Gc   52.3% 37.03 Mc       display_line
73.70 Mc    0.6% -       start_display
44.01 Mc    0.3% -       partial_line_height
1.00 Mc    0.0% 1.00 Mc       append_space_for_newline
1.00 Mc    0.0% 1.00 Mc       gui_produce_glyphs
977.48 Mc    8.2% 1.00 Mc      display_mode_lines
125.80 Mc    1.0% -      update_frame_tool_bar
10.00 Mc    0.0% -      gui_consider_frame_title
7.00 Mc    0.0% 3.00 Mc      update_window_fringes
1.00 Mc    0.0% -      cursor_row_fully_visible_p
1.00 Mc    0.0% -      unbind_to
1.00 Mc    0.0% -      WINDOWP
1.00 Mc    0.0% -      window_wants_mode_line
1.00 Mc    0.0% 1.00 Mc      window_scroll_margin
3.24 Gc   27.2% -   update_frame
3.06 Gc   25.7% -    update_window_tree
3.06 Gc   25.7% 1.00 Mc     update_window
1.58 Gc   13.2% 1.00 Mc      gui_update_window_end
1.54 Gc   12.9% -       draw_window_fringes
1.50 Gc   12.6% -        draw_row_fringe_bitmaps
1.50 Gc   12.6% 2.00 Mc         draw_fringe_bitmap
608.35 Kc    0.0% 608.35 Kc         FRAME_RIGHT_FRINGE_WIDTH
34.01 Mc    0.2% -        set_buffer_internal_1
38.00 Mc    0.3% 1.00 Mc       display_and_set_cursor
1.44 Gc   12.0% -      update_window_line
31.00 Mc    0.2% 2.00 Mc      scrolling_window
12.00 Mc    0.1% 6.00 Mc      update_window_fringes
1.00 Mc    0.0% 1.00 Mc      window_wants_mode_line
1.00 Mc    0.0% 1.00 Mc      window_text_bottom_y
1.00 Mc    0.0% 1.00 Mc      redraw_overlapped_rows
214.49 Kc    0.0% 214.49 Kc      clip_to_bounds
159.74 Mc    1.3% -    update_begin
20.00 Mc    0.1% -    update_end
20.00 Mc    0.1% -     ns_update_end
815.15 Mc    6.8% -   prepare_menu_bars
106.17 Mc    0.8% -   echo_area_display
10.00 Mc    0.0% -   ns_frame_up_to_date
3.00 Mc    0.0% 1.00 Mc   run_window_change_functions
3.00 Mc    0.0% -   ns_set_doc_edited
1.00 Mc    0.0% -   specbind
1.00 Mc    0.0% -   start_polling
1.00 Mc    0.0% -   update_overlay_arrows
1.00 Mc    0.0% -   hscroll_windows
237.72 Mc    1.9% -  flush_frame
188.97 Kc    0.0% -  unbind_to
2.00 Mc    0.0% - swallow_events

And with a patch I have developed that draws fringes from masked bitmaps
instead of Bézier paths:

GCs: 14 Elapsed time: 5.091162 seconds
GCs: 14 Elapsed time: 4.825966 seconds
GCs: 13 Elapsed time: 4.793364 seconds
GCs: 13 Elapsed time: 4.785960 seconds
GCs: 13 Elapsed time: 4.782470 seconds

With the following profiler output:

8.55 Gc  100.0% - Fredisplay
8.45 Gc   98.9% - redisplay_preserve_echo_area
8.28 Gc   96.9% 1.00 Mc  redisplay_internal
5.35 Gc   62.5% -   internal_condition_case_1
5.35 Gc   62.5% -    redisplay_window_1
5.35 Gc   62.5% 1.00 Mc     redisplay_window
4.61 Gc   53.8% 1.11 Mc      try_window
639.37 Mc    7.4% -      display_mode_lines
86.85 Mc    1.0% -      update_frame_tool_bar
8.00 Mc    0.0% 1.00 Mc      gui_consider_frame_title
2.00 Mc    0.0% 2.00 Mc      update_window_fringes
2.00 Mc    0.0% 2.00 Mc      try_window_id
1.00 Mc    0.0% -      window_wants_mode_line
1.00 Mc    0.0% 1.00 Mc    push_handler
2.37 Gc   27.7% -   update_frame
2.22 Gc   25.9% -    update_window_tree
2.22 Gc   25.9% 3.00 Mc     update_window
1.12 Gc   13.0% -      gui_update_window_end
1.10 Gc   12.8% 1.00 Mc       draw_window_fringes
1.08 Gc   12.6% -        draw_row_fringe_bitmaps
1.08 Gc   12.6% 1.00 Mc         draw_fringe_bitmap
1.08 Gc   12.5% 12.48 Mc          draw_fringe_bitmap_1
905.12 Mc   10.5% 1.03 Mc           ns_draw_fringe_bitmap
667.92 Mc    7.8% -            CGContextDrawImageWithOptions
81.91 Mc    0.9% 1.00 Mc            NSRectFill
45.65 Mc    0.5% -            NSColorSetWithFillAndStroke
29.07 Mc    0.3% 2.00 Mc            ns_row_rect
22.53 Mc    0.2% -            ns_focus
11.00 Mc    0.1% 2.00 Mc            +[NSColor(EmacsColor) colorWithUnsignedLong:]
8.00 Mc    0.0% 1.00 Mc            ns_unfocus
8.00 Mc    0.0% -            -[_NSTaggedPointerColor CGColor]
8.00 Mc    0.0% 2.00 Mc            -[__NSDictionaryM objectForKey:]
4.00 Mc    0.0% 4.00 Mc            NSUnionRect
3.00 Mc    0.0% 3.00 Mc            objc_msgSend
2.00 Mc    0.0% 2.00 Mc            gui_define_fringe_bitmap
2.00 Mc    0.0% 2.00 Mc            CGContextSetCompositeOperation
1.00 Mc    0.0% -            objc_autorelease
1.00 Mc    0.0% 1.00 Mc            CGContextClipToRect
1.00 Mc    0.0% -            CGGStateSetFillColor
1.00 Mc    0.0% 1.00 Mc            _objc_rootAutorelease
1.00 Mc    0.0% 1.00 Mc            NSIntersectionRect
1.00 Mc    0.0% 1.00 Mc            CGContextSaveGState
1.00 Mc    0.0% 1.00 Mc            NSMakeRect
1.00 Mc    0.0% 1.00 Mc            -[NSObject autorelease]
1.00 Mc    0.0% 1.00 Mc            CGDataProviderIsZombie
1.00 Mc    0.0% 1.00 Mc            DYLD-STUB$$NSRectFill
1.00 Mc    0.0% 1.00 Mc            objc_msgSend$colorWithUnsignedLong:
129.38 Mc    1.5% 11.34 Mc           lookup_named_face
8.00 Mc    0.0% -           window_box_right
4.31 Mc    0.0% 1.00 Mc           window_wants_tab_line
5.00 Mc    0.0% 1.00 Mc           window_wants_header_line
3.09 Mc    0.0% 3.00 Mc           builtin_lisp_symbol
2.00 Mc    0.0% 1.00 Mc           window_box_left
2.00 Mc    0.0% 1.00 Mc           FRAME_INTERNAL_BORDER_WIDTH
1.55 Mc    0.0% 1.55 Mc           FACE_FROM_ID_OR_NULL
1.00 Mc    0.0% -           WINDOWP
1.00 Mc    0.0% 1.00 Mc           EQ
260.83 Kc    0.0% 260.83 Kc           EQ
1.00 Mc    0.0% 1.00 Mc           get_fringe_bitmap_data
1.00 Mc    0.0% 1.00 Mc          ns_draw_fringe_bitmap
1.00 Mc    0.0% 1.00 Mc         FRAME_RIGHT_FRINGE_WIDTH
20.00 Mc    0.2% 1.00 Mc        set_buffer_internal_1
19.00 Mc    0.2% 1.00 Mc       display_and_set_cursor
1.07 Gc   12.4% 2.00 Mc      update_window_line
21.25 Mc    0.2% 6.00 Mc      scrolling_window
7.00 Mc    0.0% 2.00 Mc      update_window_fringes
646.38 Kc    0.0% -      window_text_bottom_y
139.87 Mc    1.6% -    update_begin
16.01 Mc    0.1% -    update_end
484.59 Mc    5.6% -   prepare_menu_bars
63.12 Mc    0.7% 365.24 Kc   echo_area_display
3.13 Mc    0.0% -   hscroll_windows
2.51 Mc    0.0% -   ns_set_doc_edited
2.00 Mc    0.0% 2.00 Mc   overlay_arrows_changed_p
2.00 Mc    0.0% 1.00 Mc   run_window_change_functions
1.00 Mc    0.0% 1.00 Mc   reset_outermost_restrictions
651.79 Kc    0.0% 651.79 Kc   ___chkstk_darwin
1.00 Mc    0.0% 1.00 Mc   mark_window_display_accurate_1
1.00 Mc    0.0% -   unbind_to
1.00 Mc    0.0% -   start_polling
169.35 Mc    1.9% -  flush_frame
1.00 Mc    0.0% -  unbind_to
51.01 Mc    0.5% - detect_input_pending_run_timers
41.13 Mc    0.4% - swallow_events


I've put together a patch that draws using a masked CGImage instead of an
NSBezierPath. However, I'm not sure whether it's any good, as I'm completely
unfamiliar with the Emacs codebase and the macOS/NS graphics APIs. Also, I
believe NS Emacs wants to stay compatible with GNUstep, and I have no idea
whether this patch does:

From d54c091002aff8f74f8165dd21d65075f028e728 Mon Sep 17 00:00:00 2001
From: Ben Simms <ben@bensimms.moe>
Date: Mon, 24 Jun 2024 23:35:29 +0200
Subject: [PATCH] Draw fringe using bitmaps, not huge beziers

---
 src/nsterm.m | 55 ++++++++++++++++++++++++----------------------------
 1 file changed, 25 insertions(+), 30 deletions(-)

diff --git a/src/nsterm.m b/src/nsterm.m
index 794630de1c..9c781e3bd6 100644
--- a/src/nsterm.m
+++ b/src/nsterm.m
@@ -2903,22 +2903,24 @@ Hide the window (X11 semantics)
 static void
 ns_define_fringe_bitmap (int which, unsigned short *bits, int h, int w)
 {
-  NSBezierPath *p = [NSBezierPath bezierPath];
-
   if (!fringe_bmp)
     fringe_bmp = [[NSMutableDictionary alloc] initWithCapacity:25];

-  [p moveToPoint:NSMakePoint (0, 0)];

-  for (int y = 0 ; y < h ; y++)
-    for (int x = 0 ; x < w ; x++)
-      {
-        bool bit = bits[y] & (1 << (w - x - 1));
-        if (bit)
-          [p appendBezierPathWithRect:NSMakeRect (x, y, 1, 1)];
-      }
+  for (int i = 0; i < h; i++)
+    bits[i] = ~bits[i];
+
+  CGDataProviderRef provider = CGDataProviderCreateWithData (NULL, bits,
+					   sizeof (unsigned short) * h, NULL);
+  if (provider) {
+    id p = (id)CGImageMaskCreate (w, h, 1, 1,
+                 sizeof (unsigned short),
+                 provider, NULL, 0);
+    CGDataProviderRelease (provider);
+
+    [fringe_bmp setObject:p forKey:[NSNumber numberWithInt:which]];
+  }

-  [fringe_bmp setObject:p forKey:[NSNumber numberWithInt:which]];
 }


@@ -2981,37 +2983,30 @@ Hide the window (X11 semantics)
       NSRectFill (clearRect);
     }

-  NSBezierPath *bmp = [fringe_bmp objectForKey:[NSNumber numberWithInt:p->which]];
+  CGImageRef bmp = (CGImageRef)[fringe_bmp objectForKey:[NSNumber numberWithInt:p->which]];

   if (bmp == nil
       && p->which < max_used_fringe_bitmap)
     {
       gui_define_fringe_bitmap (f, p->which);
-      bmp = [fringe_bmp objectForKey: [NSNumber numberWithInt: p->which]];
+      bmp = (CGImageRef)[fringe_bmp objectForKey: [NSNumber numberWithInt: p->which]];
     }

   if (bmp)
     {
-      NSAffineTransform *transform = [NSAffineTransform transform];
-      NSColor *bm_color;
+      CGRect bounds = CGRectMake (p->x, p->y - p->dh,
+			   CGImageGetWidth (bmp), CGImageGetHeight (bmp));

-      /* Because the image is defined at (0, 0) we need to take a copy
-         and then transform that copy to the new origin.  */
-      bmp = [bmp copy];
-      [transform translateXBy:p->x yBy:p->y - p->dh];
-      [bmp transformUsingAffineTransform:transform];
+      NSGraphicsContext *ctx = [NSGraphicsContext currentContext];
+      CGContextRef context = [ctx CGContext];

-      if (!p->cursor_p)
-        bm_color = [NSColor colorWithUnsignedLong:face->foreground];
-      else if (p->overlay_p)
-        bm_color = [NSColor colorWithUnsignedLong:face->background];
-      else
-        bm_color = f->output_data.ns->cursor_color;
+      CGContextTranslateCTM (context,
+			     CGRectGetMinX (bounds), CGRectGetMaxY (bounds));
+      CGContextScaleCTM (context, 1, -1);

-      [bm_color set];
-      [bmp fill];
-
-      [bmp release];
+      CGContextSetFillColorWithColor (context, [[NSColor colorWithUnsignedLong:face->foreground] CGColor]);
+      bounds.origin = CGPointZero;
+      CGContextDrawImage (context, bounds, bmp);
     }
   ns_unfocus (f);
 }
-- 
2.45.1



-- 
Ben Simms



* Re: Performance bottleneck in ns_draw_fringe_bitmap
From: Mattias Engdegård @ 2024-08-05  9:36 UTC
  To: Ben Simms; +Cc: emacs-devel

On 26 June 2024 at 13:56, Ben Simms <ben@bensimms.moe> wrote:

> I've noticed that ns_draw_fringe_bitmap is a fairly large performance sink when using pixel scrolling (to the point of 99% of cpu time being inside this function, with Emacs drawing at approx 5Hz). The slowness here isn't as obvious when not pixel scrolling, presumably because Emacs never tries to redraw at 60+Hz otherwise.

Thank you, and sorry about the late reply. Your patch seems to speed up your benchmark a bit on an x86 mac with an older macOS, but I haven't done any profiling.
Perhaps you should report this to bug-gnu-emacs for more focussed treatment.

(For reference: original message with patch archived at https://lists.gnu.org/archive/html/emacs-devel/2024-06/msg00900.html)



