unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Using __builtin_expect (likely/unlikely macros)
@ 2019-04-15  0:15 Alex Gramiak
  2019-04-15  1:18 ` Paul Eggert
  0 siblings, 1 reply; 45+ messages in thread
From: Alex Gramiak @ 2019-04-15  0:15 UTC (permalink / raw)
  To: emacs-devel

Would using these, if available, be acceptable? I noticed that some code
in configure.ac already defines HAVE___BUILTIN_EXPECT, though I couldn't
see what part of autoconf does this.

Good candidates here include the conditionals around emacs_abort/fatal,
and possibly in cases like the *alloc procedures and xfree.

Even if there's no noticeable speedup, I think that the likely/unlikely
macros are a nice way to indicate that a branch is exceedingly
rare/common.

  #define unlikely(expr) __builtin_expect(!!(expr), 0)
  #define likely(expr) __builtin_expect(!!(expr), 1)



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-15  0:15 Using __builtin_expect (likely/unlikely macros) Alex Gramiak
@ 2019-04-15  1:18 ` Paul Eggert
  2019-04-15  3:11   ` Alex Gramiak
  2020-04-15  3:14   ` John Carter
  0 siblings, 2 replies; 45+ messages in thread
From: Paul Eggert @ 2019-04-15  1:18 UTC (permalink / raw)
  To: Alex Gramiak; +Cc: emacs-devel

Alex Gramiak wrote:
> the likely/unlikely
> macros are a nice way to indicate that a branch is exceedingly
> rare/common.

The cost (in making the C code harder to read, write and maintain) so often 
exceeds that benefit that I'd rather avoid these macros unless there's a good 
performance case for putting them in.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-15  1:18 ` Paul Eggert
@ 2019-04-15  3:11   ` Alex Gramiak
  2019-04-15  4:41     ` Paul Eggert
  2020-04-15  3:14   ` John Carter
  1 sibling, 1 reply; 45+ messages in thread
From: Alex Gramiak @ 2019-04-15  3:11 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

Paul Eggert <eggert@cs.ucla.edu> writes:

> Alex Gramiak wrote:
>> the likely/unlikely
>> macros are a nice way to indicate that a branch is exceedingly
>> rare/common.
>
> The cost (in making the C code harder to read, write and maintain) so often
> exceeds that benefit that I'd rather avoid these macros unless there's a good
> performance case for putting them in.

I doubt that any performance benefit would be noticeable, at least
without some microbenchmarks that I don't have.

Though I don't think that these macros, used sparingly around code such
as emacs_abort, would make it harder to read/write/maintain. I'm
certainly not suggesting to throw them around without care.

Still, I understand the resistance to include them. 



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-15  3:11   ` Alex Gramiak
@ 2019-04-15  4:41     ` Paul Eggert
  2019-04-16  0:16       ` Alex Gramiak
  0 siblings, 1 reply; 45+ messages in thread
From: Paul Eggert @ 2019-04-15  4:41 UTC (permalink / raw)
  To: Alex Gramiak; +Cc: emacs-devel

Alex Gramiak wrote:
> Though I don't think that these macros, used sparingly around code such
> as emacs_abort, would make it harder to read/write/maintain.

These macro calls would not help near calls to emacs_abort, as it should already 
be obvious to a careful human reader that the jump to emacs_abort is the road 
less traveled. (That's also obvious to GCC, since emacs_abort is _Noreturn.)



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-15  4:41     ` Paul Eggert
@ 2019-04-16  0:16       ` Alex Gramiak
  2019-04-16  2:34         ` Eli Zaretskii
  2019-04-16  3:42         ` Paul Eggert
  0 siblings, 2 replies; 45+ messages in thread
From: Alex Gramiak @ 2019-04-16  0:16 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 934 bytes --]

Paul Eggert <eggert@cs.ucla.edu> writes:

> Alex Gramiak wrote:
>> Though I don't think that these macros, used sparingly around code such
>> as emacs_abort, would make it harder to read/write/maintain.
>
> These macro calls would not help near calls to emacs_abort, as it should already
> be obvious to a careful human reader that the jump to emacs_abort is the road
> less traveled. (That's also obvious to GCC, since emacs_abort is _Noreturn.)

To human readers, yes, but from what I can tell, GCC is mixed on this.
When applying the following diff that surrounds emacs_abort in
bytecode.c and xdisp.c, the assembly in both applied/unapplied is the
same for bytecode.c, but not for xdisp.c (even without the LIKELY
cases). I tested using gcc -O2 -S.

Also, I tried putting a LIKELY in lisp_h_CHECK_TYPE (seems like a nice
place for one) and that yielded different assembly for fns.c, even
though wrong_type_argument is _Noreturn.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: likely --]
[-- Type: text/x-patch, Size: 6351 bytes --]

diff --git a/src/bytecode.c b/src/bytecode.c
index 40977799bf..ce1a7bd254 100644
--- a/src/bytecode.c
+++ b/src/bytecode.c
@@ -39,6 +39,7 @@ along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.  */
 #ifndef BYTE_CODE_SAFE
 # define BYTE_CODE_SAFE false
 #endif
+#define unlikely(expr) (__builtin_expect (expr, 0))
 
 /* Define BYTE_CODE_METER to generate a byte-op usage histogram.  */
 /* #define BYTE_CODE_METER */
@@ -404,7 +405,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
       int op;
       enum handlertype type;
 
-      if (BYTE_CODE_SAFE && ! (stack_base <= top && top < stack_lim))
+      if (unlikely (BYTE_CODE_SAFE && ! (stack_base <= top && top < stack_lim)))
 	emacs_abort ();
 
 #ifdef BYTE_CODE_METER
@@ -664,9 +665,9 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
 	op_branch:
 	  op -= pc - bytestr_data;
 	op_relative_branch:
-	  if (BYTE_CODE_SAFE
+	  if (unlikely (BYTE_CODE_SAFE
 	      && ! (bytestr_data - pc <= op
-		    && op < bytestr_data + bytestr_length - pc))
+		    && op < bytestr_data + bytestr_length - pc)))
 	    emacs_abort ();
 	  quitcounter += op < 0;
 	  if (!quitcounter)
@@ -1397,7 +1398,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
 	       number of cases is less, which uses a simple vector for linear
 	       search as the jump table.  */
             Lisp_Object jmp_table = POP;
-	    if (BYTE_CODE_SAFE && !HASH_TABLE_P (jmp_table))
+	    if (unlikely (BYTE_CODE_SAFE && !HASH_TABLE_P (jmp_table)))
               emacs_abort ();
             Lisp_Object v1 = POP;
             ptrdiff_t i;
@@ -1426,7 +1427,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
 	    if (i >= 0)
 	      {
 		Lisp_Object val = HASH_VALUE (h, i);
-		if (BYTE_CODE_SAFE && !FIXNUMP (val))
+		if (unlikely (BYTE_CODE_SAFE && !FIXNUMP (val)))
 		  emacs_abort ();
 		op = XFIXNUM (val);
 		goto op_branch;
@@ -1436,8 +1437,8 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
 
 	CASE_DEFAULT
 	CASE (Bconstant):
-	  if (BYTE_CODE_SAFE
-	      && ! (Bconstant <= op && op < Bconstant + const_length))
+	  if (unlikely (BYTE_CODE_SAFE
+                        && ! (Bconstant <= op && op < Bconstant + const_length)))
 	    emacs_abort ();
 	  PUSH (vectorp[op - Bconstant]);
 	  NEXT;
diff --git a/src/xdisp.c b/src/xdisp.c
index a88fc698b8..9872f69cb0 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -344,7 +344,8 @@ along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.  */
 
 /* Holds the list (error).  */
 static Lisp_Object list_of_error;
-
+#define UNLIKELY(expr) (__builtin_expect (expr, 0))
+#define LIKELY(expr) (__builtin_expect (expr, 1))
 #ifdef HAVE_WINDOW_SYSTEM
 
 /* Test if overflow newline into fringe.  Called with iterator IT
@@ -8322,7 +8323,7 @@ compute_stop_pos_backwards (struct it *it)
       reseat_1 (it, pos, false);
       compute_stop_pos (it);
       /* We must advance forward, right?  */
-      if (it->stop_charpos <= charpos)
+      if (UNLIKELY (it->stop_charpos <= charpos))
 	emacs_abort ();
     }
   while (charpos > BEGV && it->stop_charpos >= it->end_charpos);
@@ -8371,7 +8372,7 @@ handle_stop_backwards (struct it *it, ptrdiff_t charpos)
 	it->current.string_pos = string_pos (charpos, it->string);
       compute_stop_pos (it);
       /* We must advance forward, right?  */
-      if (it->stop_charpos <= it->prev_stop)
+      if (UNLIKELY (it->stop_charpos <= it->prev_stop))
 	emacs_abort ();
       charpos = it->stop_charpos;
     }
@@ -11443,7 +11444,7 @@ pop_message_unwind (void)
 void
 check_message_stack (void)
 {
-  if (!NILP (Vmessage_stack))
+  if (UNLIKELY (!NILP (Vmessage_stack)))
     emacs_abort ();
 }
 
@@ -15495,7 +15496,7 @@ set_cursor_from_row (struct window *w, struct glyph_row *row,
       /* Need to compute x that corresponds to GLYPH.  */
       for (g = row->glyphs[TEXT_AREA], x = row->x; g < glyph; g++)
 	{
-	  if (g >= row->glyphs[TEXT_AREA] + row->used[TEXT_AREA])
+	  if (UNLIKELY (g >= row->glyphs[TEXT_AREA] + row->used[TEXT_AREA]))
 	    emacs_abort ();
 	  x += g->pixel_width;
 	}
@@ -16827,9 +16828,9 @@ redisplay_window (Lisp_Object window, bool just_this_one_p)
 
   /* Some sanity checks.  */
   CHECK_WINDOW_END (w);
-  if (Z == Z_BYTE && CHARPOS (opoint) != BYTEPOS (opoint))
+  if (UNLIKELY (Z == Z_BYTE && CHARPOS (opoint) != BYTEPOS (opoint)))
     emacs_abort ();
-  if (BYTEPOS (opoint) < CHARPOS (opoint))
+  if (UNLIKELY (BYTEPOS (opoint) < CHARPOS (opoint)))
     emacs_abort ();
 
   if (mode_line_update_needed (w))
@@ -19318,9 +19319,9 @@ try_window_id (struct window *w)
       adjust_window_ends (w, last_text_row, false);
       eassert (w->window_end_bytepos >= 0);
     }
-  else if (first_unchanged_at_end_row == NULL
-	   && last_text_row == NULL
-	   && last_text_row_at_end == NULL)
+  else if (LIKELY (first_unchanged_at_end_row == NULL
+                   && last_text_row == NULL
+                   && last_text_row_at_end == NULL))
     {
       /* Displayed to end of window, but no line containing text was
 	 displayed.  Lines were deleted at the end of the window.  */
@@ -21015,7 +21016,7 @@ find_row_edges (struct it *it, struct glyph_row *row,
 	   which puts the iterator at the beginning of the next line, in
 	   the logical order. */
 	row->maxpos = it->current.pos;
-      else if (max_pos == min_pos && it->method != GET_FROM_BUFFER)
+      else if (LIKELY (max_pos == min_pos && it->method != GET_FROM_BUFFER))
 	/* A line that is entirely from a string/image/stretch...  */
 	row->maxpos = row->minpos;
       else
@@ -25210,7 +25211,7 @@ display_string (const char *string, Lisp_Object lisp_string, Lisp_Object face_st
 		}
 	      break;
 	    }
-	  else if (x + glyph->pixel_width >= it->first_visible_x)
+	  else if (LIKELY (x + glyph->pixel_width >= it->first_visible_x))
 	    {
 	      /* Glyph is at least partially visible.  */
 	      ++it->hpos;
@@ -27847,7 +27848,7 @@ produce_special_glyphs (struct it *it, enum display_element_type what)
 	  spec_glyph_lookup_face (XWINDOW (it->window), &glyph);
 	}
     }
-  else if (what == IT_TRUNCATION)
+  else if (LIKELY (what == IT_TRUNCATION))
     {
       /* Truncation glyph.  */
       SET_GLYPH_FROM_CHAR (glyph, '$');

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-16  0:16       ` Alex Gramiak
@ 2019-04-16  2:34         ` Eli Zaretskii
  2019-04-16  5:33           ` Alex Gramiak
  2019-04-16  3:42         ` Paul Eggert
  1 sibling, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2019-04-16  2:34 UTC (permalink / raw)
  To: Alex Gramiak; +Cc: eggert, emacs-devel

> From: Alex Gramiak <agrambot@gmail.com>
> Date: Mon, 15 Apr 2019 18:16:02 -0600
> Cc: emacs-devel@gnu.org
> 
> To human readers, yes, but from what I can tell, GCC is mixed on this.
> When applying the following diff that surrounds emacs_abort in
> bytecode.c and xdisp.c, the assembly in both applied/unapplied is the
> same for bytecode.c, but not for xdisp.c (even without the LIKELY
> cases). I tested using gcc -O2 -S.

I'm not sure whether you put LIKELY and UNLIKELY somewhat randomly,
just for testing, or did you really think each place is
likely/unlikely as in the change, but at least some places in xdisp.c
are wrong: they use LIKELY where UNLIKELY is more appropriate, abd
vice versa.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-16  0:16       ` Alex Gramiak
  2019-04-16  2:34         ` Eli Zaretskii
@ 2019-04-16  3:42         ` Paul Eggert
  2019-04-16 13:05           ` Stefan Monnier
  1 sibling, 1 reply; 45+ messages in thread
From: Paul Eggert @ 2019-04-16  3:42 UTC (permalink / raw)
  To: Alex Gramiak; +Cc: emacs-devel

Alex Gramiak wrote:
>> These macro calls would not help near calls to emacs_abort, as it should already
>> be obvious to a careful human reader that the jump to emacs_abort is the road
>> less traveled. (That's also obvious to GCC, since emacs_abort is _Noreturn.)

> To human readers, yes, but from what I can tell, GCC is mixed on this.

Then we should fix GCC, if the code it generates has a performance problem 
(whatever it is, it's quite small). That'd be better than littering the Emacs 
source code with __builtin_expect or UNLIKELY calls. The GCC manual recommends 
against manually inserting such calls; performance nerds are supposed to use 
-fprofile-arcs instead. In my experience the calls are typically more trouble 
than they're worth.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-16  2:34         ` Eli Zaretskii
@ 2019-04-16  5:33           ` Alex Gramiak
  2019-04-16 15:23             ` Eli Zaretskii
  0 siblings, 1 reply; 45+ messages in thread
From: Alex Gramiak @ 2019-04-16  5:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> I'm not sure whether you put LIKELY and UNLIKELY somewhat randomly,
> just for testing, or did you really think each place is
> likely/unlikely as in the change, but at least some places in xdisp.c
> are wrong: they use LIKELY where UNLIKELY is more appropriate, abd
> vice versa.

Out of curiosity, which places would that be? I only put UNLIKELY calls
around the checks for emacs_abort (which I presume to be unlikely), and
the LIKELY calls are of the form:

  else if (LIKELY (<cond>))
    {
      ...
    }
  else
    emacs_abort ();

So if already at the "else if" clause, it's likely that the condition is
true, otherwise an abort would take place.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-16  3:42         ` Paul Eggert
@ 2019-04-16 13:05           ` Stefan Monnier
  2019-04-16 15:22             ` Paul Eggert
  0 siblings, 1 reply; 45+ messages in thread
From: Stefan Monnier @ 2019-04-16 13:05 UTC (permalink / raw)
  To: emacs-devel

>>> These macro calls would not help near calls to emacs_abort, as it
>>> should already be obvious to a careful human reader that the jump to
>>> emacs_abort is the road less traveled. (That's also obvious to GCC,
>>> since emacs_abort is _Noreturn.)
>> To human readers, yes, but from what I can tell, GCC is mixed on this.
> Then we should fix GCC, if the code it generates has a performance problem
> (whatever it is, it's quite small).

FWIW, the fact that a function is "_Noreturn" doesn't necessarily mean
that a call to it is unlikely (in many cases it is, I guess, but
definitely not all), so maybe GCC maintainers consciously decided not to
link the two.

> The GCC manual recommends against manually inserting such calls;

It's likely based on some past experiments that showed programmers
aren't very good at understanding what is likely and what isn't in
their code.

BTW I think instead of marking branches as likely or unlikely, I'd
prefer to tell GCC that some functions "should be slow"
(e.g. emacs_abort) so it optimizes the code paths that don't go through
those functions to the detriment of those that do.


        Stefan




^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-16 13:05           ` Stefan Monnier
@ 2019-04-16 15:22             ` Paul Eggert
  2019-04-16 16:10               ` Alex Gramiak
  0 siblings, 1 reply; 45+ messages in thread
From: Paul Eggert @ 2019-04-16 15:22 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier wrote:

> FWIW, the fact that a function is "_Noreturn" doesn't necessarily mean
> that a call to it is unlikely (in many cases it is, I guess, but
> definitely not all), so maybe GCC maintainers consciously decided not to
> link the two.

Possibly they did. In hindsight I'd argue that was a mistake. If one has no 
other evidence about the likelihood of a branch, a branch to a _Noreturn call 
should default to being unlikely.

> BTW I think instead of marking branches as likely or unlikely, I'd
> prefer to tell GCC that some functions "should be slow"
> (e.g. emacs_abort) so it optimizes the code paths that don't go through
> those functions to the detriment of those that do.

GCC has the function attribute 'cold' for that. This is less intrusive than 
__builtin_expect and so would be preferable. Still, the GCC manual says that 
__attribute__ ((cold)) is ignored when profile feedback is available, which is 
another indication that people who care about performance should be using 
-fprofile-use etc. And as far as I know __attribute__ ((cold)) is rarely used: 
even glibc uses it only once, in obscure code never used on GNU/Linux. 
Presumably this is partly because the attribute didn't exist until about five 
years ago.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-16  5:33           ` Alex Gramiak
@ 2019-04-16 15:23             ` Eli Zaretskii
  2019-04-16 15:47               ` Alex Gramiak
  0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2019-04-16 15:23 UTC (permalink / raw)
  To: Alex Gramiak; +Cc: eggert, emacs-devel

> From: Alex Gramiak <agrambot@gmail.com>
> Cc: eggert@cs.ucla.edu,  emacs-devel@gnu.org
> Date: Mon, 15 Apr 2019 23:33:51 -0600
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > I'm not sure whether you put LIKELY and UNLIKELY somewhat randomly,
> > just for testing, or did you really think each place is
> > likely/unlikely as in the change, but at least some places in xdisp.c
> > are wrong: they use LIKELY where UNLIKELY is more appropriate, abd
> > vice versa.
> 
> Out of curiosity, which places would that be? I only put UNLIKELY calls
> around the checks for emacs_abort (which I presume to be unlikely), and
> the LIKELY calls are of the form:
> 
>   else if (LIKELY (<cond>))
>     {
>       ...
>     }
>   else
>     emacs_abort ();

OK, it's possible that I don't understand the exact semantics of
__builtin_expect in these situations.

The problem is that you used LIKELY like this:

  if (A)
    do_A;
  else if (B)
    do_B;
  else if (C)
    do_C;
  else if (LIKELY (D))
    do_D;
  else
    cant_happen ();

Essentially, the above is a moral equivalent of a 'switch' with the
'default' case aborting because it "cannot happen".  In such code, the
order of the clauses doesn't necessarily tell anything about their
likelihood; up front, they all are equally "likely".  So using LIKELY
only in the last one sends a wrong signal: that last condition is
neither more nor less likely than all the others.  Actually, in some
cases it might be _less_ likely than the preceding ones, because if I
knew that some of these conditions happens much more frequently, I'd
test it first.

Now, it's possible that the effect of __builtin_expect doesn't care
about this issue.  The GCC manual doesn't help to figure out whether
this is the case, because it only talks about a simple case of a
single 'if' clause, and doesn't tell any details about what GCC is
allowed to do when it sees __builtin_expect.  But just by looking at
how the code looks, I immediately raised a brow.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-16 15:23             ` Eli Zaretskii
@ 2019-04-16 15:47               ` Alex Gramiak
  0 siblings, 0 replies; 45+ messages in thread
From: Alex Gramiak @ 2019-04-16 15:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> OK, it's possible that I don't understand the exact semantics of
> __builtin_expect in these situations.
>
> The problem is that you used LIKELY like this:
>
>   if (A)
>     do_A;
>   else if (B)
>     do_B;
>   else if (C)
>     do_C;
>   else if (LIKELY (D))
>     do_D;
>   else
>     cant_happen ();
>
> Essentially, the above is a moral equivalent of a 'switch' with the
> 'default' case aborting because it "cannot happen".  In such code, the
> order of the clauses doesn't necessarily tell anything about their
> likelihood; up front, they all are equally "likely".  So using LIKELY
> only in the last one sends a wrong signal: that last condition is
> neither more nor less likely than all the others.  Actually, in some
> cases it might be _less_ likely than the preceding ones, because if I
> knew that some of these conditions happens much more frequently, I'd
> test it first.

It was my understanding that since an else if is equivalent to else { if
... }, it would only affect the last two branches. Though I could easily
be wrong here.

> Now, it's possible that the effect of __builtin_expect doesn't care
> about this issue.  The GCC manual doesn't help to figure out whether
> this is the case, because it only talks about a simple case of a
> single 'if' clause, and doesn't tell any details about what GCC is
> allowed to do when it sees __builtin_expect.  But just by looking at
> how the code looks, I immediately raised a brow.

Right, considering the confusion it would be counterproductive to use
them in this fashion. A workaround to the confusion would be to do:

  else
    {
      if (LIKELY (D))
        do_D;
      else
        cant_happen ();
    }



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-16 15:22             ` Paul Eggert
@ 2019-04-16 16:10               ` Alex Gramiak
  2019-04-16 17:54                 ` Paul Eggert
  0 siblings, 1 reply; 45+ messages in thread
From: Alex Gramiak @ 2019-04-16 16:10 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Stefan Monnier, emacs-devel

Paul Eggert <eggert@cs.ucla.edu> writes:

>> BTW I think instead of marking branches as likely or unlikely, I'd
>> prefer to tell GCC that some functions "should be slow"
>> (e.g. emacs_abort) so it optimizes the code paths that don't go through
>> those functions to the detriment of those that do.
>
> GCC has the function attribute 'cold' for that. This is less intrusive than
> __builtin_expect and so would be preferable. Still, the GCC manual says that
> __attribute__ ((cold)) is ignored when profile feedback is available, which is
> another indication that people who care about performance should be using
> -fprofile-use etc. And as far as I know __attribute__ ((cold)) is rarely used:
> even glibc uses it only once, in obscure code never used on GNU/Linux.
> Presumably this is partly because the attribute didn't exist until about five
> years ago.

It would still be good to use it for default builds, no? The GCC
documentation states:

  It is thus useful to mark functions used to handle unlikely
  conditions, such as perror, as cold to improve optimization of hot
  functions that do call marked functions in rare occasions.

So the error/signal calls, wrong_type_argument, etc. would be good
places for this. It doesn't indicate anything to the programmer at call
sites, but that point is controversial, so slapping a few attributes
down seems like a better way to go.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-16 16:10               ` Alex Gramiak
@ 2019-04-16 17:54                 ` Paul Eggert
  2019-04-16 20:50                   ` Alex Gramiak
  0 siblings, 1 reply; 45+ messages in thread
From: Paul Eggert @ 2019-04-16 17:54 UTC (permalink / raw)
  To: Alex Gramiak; +Cc: Stefan Monnier, emacs-devel

On 4/16/19 9:10 AM, Alex Gramiak wrote:
> It would still be good to use it for default builds, no? The GCC
> documentation states:
>
>   It is thus useful to mark functions used to handle unlikely
>   conditions, such as perror, as cold to improve optimization of hot
>   functions that do call marked functions in rare occasions.

That argument would be stronger if functions like 'perror' actually used
__attribute__ ((cold)), which they don't (at least, not in GNU systems).

There is a cost and a benefit to adding these attributes. The benefit is
that they should improve runtime performance very slightly, and (more
important) they should help nonexpert human readers know that a function
is rarely called. The cost is that they make the code harder to
maintain, thus placing a burden on maintainers. Perhaps I'm biased as I
would bear the cost and get none of the benefit; still, the overall
cost-benefit ratio doesn't look all that favorable for Emacs.

That being said, it might make sense for a few obviously-rarely-called
functions like 'emacs-abort' to be marked with __attribute__ ((cold)),
so long as we don't turn this into a mission to mark all cold functions
(which would cost us more than it would benefit). That is what GCC
itself does, with its own functions. However, I'd like to see
performance figures. Could you try it out on the benchmark of 'cd lisp
&& time make compile-always'?



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-16 17:54                 ` Paul Eggert
@ 2019-04-16 20:50                   ` Alex Gramiak
  2019-04-16 21:11                     ` Alex Gramiak
                                       ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Alex Gramiak @ 2019-04-16 20:50 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Stefan Monnier, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 6171 bytes --]

Paul Eggert <eggert@cs.ucla.edu> writes:

> That being said, it might make sense for a few obviously-rarely-called
> functions like 'emacs-abort' to be marked with __attribute__ ((cold)),
> so long as we don't turn this into a mission to mark all cold functions
> (which would cost us more than it would benefit). That is what GCC
> itself does, with its own functions. However, I'd like to see
> performance figures. Could you try it out on the benchmark of 'cd lisp
> && time make compile-always'?

Right, I agree that if used, they should be used sparingly. I tested
three versions a few times each with both 'make' and 'make -j4':

a) Regular Emacs master.
b) The below diff with only the _Cold attribute
c) The below diff with both _Cold and _Hot attributes

a) Normal
real    4:17.97s
user    3:57.18s
sys     20.394s

real    1:17.67s
user    4:23.78s
sys     18.888s

b) Cold
real    4:10.92s
user    3:50.34s
sys     20.178s

real    1:15.77s
user    4:16.73s
sys     18.943s

c) Hot/Cold
real    4:11.43s
user    3:51.07s
sys     19.961s

real    1:16.01s
user    4:17.63s
sys     18.662s

So not much of a difference. For some reason the Hot/Cold performed
consistently worse than Cold.

I also tested startup/shutdown with perf:

 Performance counter stats for '../emacs-normal -f kill-emacs' (20 runs):

            762.17 msec task-clock:u              #    0.844 CPUs utilized            ( +-  0.23% )
                 0      context-switches:u        #    0.000 K/sec                  
                 0      cpu-migrations:u          #    0.000 K/sec                  
            12,941      page-faults:u             #    0.017 M/sec                    ( +-  0.01% )
     2,998,322,125      cycles:u                  #    3.934 GHz                      ( +-  0.06% )
     1,392,869,413      stalled-cycles-frontend:u #   46.45% frontend cycles idle     ( +-  0.15% )
       982,206,843      stalled-cycles-backend:u  #   32.76% backend cycles idle      ( +-  0.18% )
     4,874,186,825      instructions:u            #    1.63  insn per cycle         
                                                  #    0.29  stalled cycles per insn  ( +-  0.01% )
     1,037,929,374      branches:u                # 1361.802 M/sec                    ( +-  0.01% )
        17,930,471      branch-misses:u           #    1.73% of all branches          ( +-  0.16% )
     1,209,539,215      L1-dcache-loads:u         # 1586.960 M/sec                    ( +-  0.01% )
        42,346,229      L1-dcache-load-misses:u   #    3.50% of all L1-dcache hits    ( +-  0.05% )
         9,088,647      LLC-loads:u               #   11.925 M/sec                    ( +-  0.29% )
   <not supported>      LLC-load-misses:u                                           

           0.90325 +- 0.00441 seconds time elapsed  ( +-  0.49% )



 Performance counter stats for '../emacs.cold -f kill-emacs' (20 runs):

            755.94 msec task-clock:u              #    0.845 CPUs utilized            ( +-  0.24% )
                 0      context-switches:u        #    0.000 K/sec                  
                 0      cpu-migrations:u          #    0.000 K/sec                  
            12,941      page-faults:u             #    0.017 M/sec                    ( +-  0.01% )
     2,976,036,365      cycles:u                  #    3.937 GHz                      ( +-  0.06% )
     1,374,451,779      stalled-cycles-frontend:u #   46.18% frontend cycles idle     ( +-  0.14% )
       990,227,732      stalled-cycles-backend:u  #   33.27% backend cycles idle      ( +-  0.18% )
     4,878,661,927      instructions:u            #    1.64  insn per cycle         
                                                  #    0.28  stalled cycles per insn  ( +-  0.00% )
     1,038,495,525      branches:u                # 1373.782 M/sec                    ( +-  0.00% )
        17,859,906      branch-misses:u           #    1.72% of all branches          ( +-  0.16% )
     1,209,345,531      L1-dcache-loads:u         # 1599.792 M/sec                    ( +-  0.00% )
        42,444,358      L1-dcache-load-misses:u   #    3.51% of all L1-dcache hits    ( +-  0.06% )
         9,204,368      LLC-loads:u               #   12.176 M/sec                    ( +-  0.41% )
   <not supported>      LLC-load-misses:u                                           

           0.89430 +- 0.00217 seconds time elapsed  ( +-  0.24% )


 Performance counter stats for '../emacs.hot-cold -f kill-emacs' (20 runs):

            761.97 msec task-clock:u              #    0.845 CPUs utilized            ( +-  0.20% )
                 0      context-switches:u        #    0.000 K/sec                  
                 0      cpu-migrations:u          #    0.000 K/sec                  
            12,947      page-faults:u             #    0.017 M/sec                    ( +-  0.01% )
     2,989,750,359      cycles:u                  #    3.924 GHz                      ( +-  0.04% )
     1,383,312,275      stalled-cycles-frontend:u #   46.27% frontend cycles idle     ( +-  0.12% )
       994,643,853      stalled-cycles-backend:u  #   33.27% backend cycles idle      ( +-  0.13% )
     4,879,318,990      instructions:u            #    1.63  insn per cycle         
                                                  #    0.28  stalled cycles per insn  ( +-  0.00% )
     1,038,584,045      branches:u                # 1363.022 M/sec                    ( +-  0.00% )
        17,863,736      branch-misses:u           #    1.72% of all branches          ( +-  0.13% )
     1,209,327,347      L1-dcache-loads:u         # 1587.103 M/sec                    ( +-  0.00% )
        42,501,374      L1-dcache-load-misses:u   #    3.51% of all L1-dcache hits    ( +-  0.05% )
         9,201,311      LLC-loads:u               #   12.076 M/sec                    ( +-  0.28% )
   <not supported>      LLC-load-misses:u                                           

           0.90132 +- 0.00201 seconds time elapsed  ( +-  0.22% )


Which again shows a slight improvement with the Cold attributes, and
still shows the hot attributes degrading performance. Perhaps I was too
overzealous with the hot tagging?


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: hot/cold --]
[-- Type: text/x-patch, Size: 17489 bytes --]

diff --git a/src/alloc.c b/src/alloc.c
index fe2cdb8788..dd783863be 100644
--- a/src/alloc.c
+++ b/src/alloc.c
@@ -633,7 +633,7 @@ display_malloc_warning (void)
 \f
 /* Called if we can't allocate relocatable space for a buffer.  */
 
-_Cold void
+void
 buffer_memory_full (ptrdiff_t nbytes)
 {
   /* If buffers use the relocating allocator, no need to free
@@ -886,7 +886,7 @@ static void *lrealloc (void *, size_t);
 
 /* Like malloc but check for no memory and block interrupt input.  */
 
-_Hot void *
+void *
 xmalloc (size_t size)
 {
   void *val;
@@ -903,7 +903,7 @@ xmalloc (size_t size)
 
 /* Like the above, but zeroes out the memory just allocated.  */
 
-_Hot void *
+void *
 xzalloc (size_t size)
 {
   void *val;
@@ -921,7 +921,7 @@ xzalloc (size_t size)
 
 /* Like realloc but check for no memory and block interrupt input..  */
 
-_Hot void *
+void *
 xrealloc (void *block, size_t size)
 {
   void *val;
@@ -944,7 +944,7 @@ xrealloc (void *block, size_t size)
 
 /* Like free but block interrupt input.  */
 
-_Hot void
+void
 xfree (void *block)
 {
   if (!block)
@@ -1125,7 +1125,7 @@ record_xmalloc (size_t size)
 void *lisp_malloc_loser EXTERNALLY_VISIBLE;
 #endif
 
-static _Hot void *
+static void *
 lisp_malloc (size_t nbytes, enum mem_type type)
 {
   register void *val;
@@ -1170,7 +1170,7 @@ lisp_malloc (size_t nbytes, enum mem_type type)
 /* Free BLOCK.  This must be called to free memory allocated with a
    call to lisp_malloc.  */
 
-static _Hot void
+static void
 lisp_free (void *block)
 {
   if (pdumper_object_p (block))
@@ -2325,7 +2325,7 @@ compact_small_strings (void)
   current_sblock = tb;
 }
 
-_Cold void
+void
 string_overflow (void)
 {
   error ("Maximum string size exceeded");
@@ -4085,7 +4085,7 @@ set_interval_marked (INTERVAL i)
    either case this counts as memory being full even though malloc did
    not fail.  */
 
-_Cold void
+void
 memory_full (size_t nbytes)
 {
   /* Do not go into hysterics merely because a large request failed.  */
diff --git a/src/bytecode.c b/src/bytecode.c
index e2fe7153b0..40977799bf 100644
--- a/src/bytecode.c
+++ b/src/bytecode.c
@@ -336,7 +336,7 @@ bcall0 (Lisp_Object f)
    ARGS are pushed on the stack according to ARGS_TEMPLATE before
    executing BYTESTR.  */
 
-_Hot Lisp_Object
+Lisp_Object
 exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
 		Lisp_Object args_template, ptrdiff_t nargs, Lisp_Object *args)
 {
diff --git a/src/data.c b/src/data.c
index 1291a92955..11cd598ed8 100644
--- a/src/data.c
+++ b/src/data.c
@@ -130,7 +130,7 @@ set_blv_valcell (struct Lisp_Buffer_Local_Value *blv, Lisp_Object val)
   blv->valcell = val;
 }
 
-static _Cold _Noreturn void
+static _Noreturn void
 wrong_length_argument (Lisp_Object a1, Lisp_Object a2, Lisp_Object a3)
 {
   Lisp_Object size1 = make_fixnum (bool_vector_size (a1));
@@ -142,7 +142,7 @@ wrong_length_argument (Lisp_Object a1, Lisp_Object a2, Lisp_Object a3)
 	      make_fixnum (bool_vector_size (a3)));
 }
 
-_Cold _Noreturn void
+_Noreturn void
 wrong_type_argument (register Lisp_Object predicate, register Lisp_Object value)
 {
   /* If VALUE is not even a valid Lisp object, we'd want to abort here
@@ -155,25 +155,25 @@ wrong_type_argument (register Lisp_Object predicate, register Lisp_Object value)
   xsignal2 (Qwrong_type_argument, predicate, value);
 }
 
-_Cold void
+void
 pure_write_error (Lisp_Object obj)
 {
   xsignal2 (Qerror, build_string ("Attempt to modify read-only object"), obj);
 }
 
-_Cold void
+void
 args_out_of_range (Lisp_Object a1, Lisp_Object a2)
 {
   xsignal2 (Qargs_out_of_range, a1, a2);
 }
 
-_Cold void
+void
 args_out_of_range_3 (Lisp_Object a1, Lisp_Object a2, Lisp_Object a3)
 {
   xsignal3 (Qargs_out_of_range, a1, a2, a3);
 }
 
-_Cold void
+void
 circular_list (Lisp_Object list)
 {
   xsignal1 (Qcircular_list, list);
@@ -1018,7 +1018,7 @@ do_symval_forwarding (lispfwd valcontents)
 /* Used to signal a user-friendly error when symbol WRONG is
    not a member of CHOICE, which should be a list of symbols.  */
 
-_Cold void
+void
 wrong_choice (Lisp_Object choice, Lisp_Object wrong)
 {
   ptrdiff_t i = 0, len = list_length (choice);
@@ -1051,7 +1051,7 @@ wrong_choice (Lisp_Object choice, Lisp_Object wrong)
 /* Used to signal a user-friendly error if WRONG is not a number or
    integer/floating-point number outsize of inclusive MIN..MAX range.  */
 
-static _Cold void
+static void
 wrong_range (Lisp_Object min, Lisp_Object max, Lisp_Object wrong)
 {
   AUTO_STRING (value_should_be_from, "Value should be from ");
diff --git a/src/dispnew.c b/src/dispnew.c
index db9166cbe6..ccb08ec1b9 100644
--- a/src/dispnew.c
+++ b/src/dispnew.c
@@ -360,7 +360,7 @@ verify_row_hash (struct glyph_row *row)
    what is displayed on the screen.  While this is usually fast, it
    leads to screen flickering.  */
 
-static _Hot void
+static void
 adjust_glyph_matrix (struct window *w, struct glyph_matrix *matrix, int x, int y, struct dim dim)
 {
   int i;
@@ -2994,7 +2994,7 @@ window_to_frame_hpos (struct window *w, int hpos)
 
 /* Redraw frame F.  */
 
-_Hot void
+void
 redraw_frame (struct frame *f)
 {
   /* Error if F has no glyphs.  */
@@ -3048,7 +3048,7 @@ DEFUN ("redraw-display", Fredraw_display, Sredraw_display, 0, 0, "",
 
    Value is true if redisplay was stopped due to pending input.  */
 
-_Hot bool
+bool
 update_frame (struct frame *f, bool force_p, bool inhibit_hairy_id_p)
 {
   /* True means display has been paused because of pending input.  */
@@ -3384,7 +3384,7 @@ check_current_matrix_flags (struct window *w)
 /* Update display of window W.
    If FORCE_P, don't stop updating when input is pending.  */
 
-static _Hot bool
+static bool
 update_window (struct window *w, bool force_p)
 {
   struct glyph_matrix *desired_matrix = w->desired_matrix;
@@ -3580,7 +3580,7 @@ update_marginal_area (struct window *w, struct glyph_row *updated_row,
 /* Update the display of the text area of row VPOS in window W.
    Value is true if display has changed.  */
 
-static _Hot bool
+static bool
 update_text_area (struct window *w, struct glyph_row *updated_row, int vpos)
 {
   struct glyph_row *current_row = MATRIX_ROW (w->current_matrix, vpos);
@@ -4476,7 +4476,7 @@ scrolling_window (struct window *w, bool header_line_p)
 
    Value is true if update was stopped due to pending input.  */
 
-static _Hot bool
+static bool
 update_frame_1 (struct frame *f, bool force_p, bool inhibit_id_p,
 		bool set_cursor_p, bool updating_menu_p)
 {
@@ -4779,7 +4779,7 @@ count_match (struct glyph *str1, struct glyph *end1, struct glyph *str2, struct
 
 /* Perform a frame-based update on line VPOS in frame FRAME.  */
 
-static _Hot void
+static void
 update_frame_line (struct frame *f, int vpos, bool updating_menu_p)
 {
   struct glyph *obody, *nbody, *op1, *op2, *np1, *nend;
diff --git a/src/eval.c b/src/eval.c
index eb3d856a8f..e9f118c5cb 100644
--- a/src/eval.c
+++ b/src/eval.c
@@ -1707,25 +1707,25 @@ signal_or_quit (Lisp_Object error_symbol, Lisp_Object data, bool keyboard_quit)
 
 /* Like xsignal, but takes 0, 1, 2, or 3 args instead of a list.  */
 
-_Cold void
+void
 xsignal0 (Lisp_Object error_symbol)
 {
   xsignal (error_symbol, Qnil);
 }
 
-_Cold void
+void
 xsignal1 (Lisp_Object error_symbol, Lisp_Object arg)
 {
   xsignal (error_symbol, list1 (arg));
 }
 
-_Cold void
+void
 xsignal2 (Lisp_Object error_symbol, Lisp_Object arg1, Lisp_Object arg2)
 {
   xsignal (error_symbol, list2 (arg1, arg2));
 }
 
-_Cold void
+void
 xsignal3 (Lisp_Object error_symbol, Lisp_Object arg1, Lisp_Object arg2, Lisp_Object arg3)
 {
   xsignal (error_symbol, list3 (arg1, arg2, arg3));
@@ -1734,7 +1734,7 @@ xsignal3 (Lisp_Object error_symbol, Lisp_Object arg1, Lisp_Object arg2, Lisp_Obj
 /* Signal `error' with message S, and additional arg ARG.
    If ARG is not a proper list, make it a one-element list.  */
 
-_Cold void
+void
 signal_error (const char *s, Lisp_Object arg)
 {
   if (NILP (Fproper_list_p (arg)))
@@ -1745,7 +1745,7 @@ signal_error (const char *s, Lisp_Object arg)
 
 /* Use this for arithmetic overflow, e.g., when an integer result is
    too large even for a bignum.  */
-_Cold void
+void
 overflow_error (void)
 {
   xsignal0 (Qoverflow_error);
@@ -1892,7 +1892,7 @@ vformat_string (const char *m, va_list ap)
 }
 
 /* Dump an error message; called like vprintf.  */
-_Cold void
+void
 verror (const char *m, va_list ap)
 {
   xsignal1 (Qerror, vformat_string (m, ap));
@@ -1902,7 +1902,7 @@ verror (const char *m, va_list ap)
 /* Dump an error message; called like printf.  */
 
 /* VARARGS 1 */
-_Cold void
+void
 error (const char *m, ...)
 {
   va_list ap;
@@ -2171,7 +2171,7 @@ record_in_backtrace (Lisp_Object function, Lisp_Object *args, ptrdiff_t nargs)
 
 /* Eval a sub-expression of the current expression (i.e. in the same
    lexical scope).  */
-_Hot Lisp_Object
+Lisp_Object
 eval_sub (Lisp_Object form)
 {
   Lisp_Object fun, val, original_fun, original_args;
@@ -2863,7 +2863,7 @@ usage: (funcall FUNCTION &rest ARGUMENTS)  */)
 /* Apply a C subroutine SUBR to the NUMARGS evaluated arguments in ARG_VECTOR
    and return the result of evaluation.  */
 
-_Hot Lisp_Object
+Lisp_Object
 funcall_subr (struct Lisp_Subr *subr, ptrdiff_t numargs, Lisp_Object *args)
 {
   if (numargs < subr->min_args
@@ -2942,7 +2942,7 @@ funcall_subr (struct Lisp_Subr *subr, ptrdiff_t numargs, Lisp_Object *args)
     }
 }
 
-static _Hot Lisp_Object
+static Lisp_Object
 apply_lambda (Lisp_Object fun, Lisp_Object args, ptrdiff_t count)
 {
   Lisp_Object *arg_vector;
@@ -2978,7 +2978,7 @@ apply_lambda (Lisp_Object fun, Lisp_Object args, ptrdiff_t count)
    FUN must be either a lambda-expression, a compiled-code object,
    or a module function.  */
 
-static _Hot Lisp_Object
+static Lisp_Object
 funcall_lambda (Lisp_Object fun, ptrdiff_t nargs,
 		register Lisp_Object *arg_vector)
 {
@@ -3322,7 +3322,7 @@ do_specbind (struct Lisp_Symbol *sym, union specbinding *bind,
      i.e. bindings to the default value of a variable which can be
      buffer-local.  */
 
-_Hot void
+void
 specbind (Lisp_Object symbol, Lisp_Object value)
 {
   struct Lisp_Symbol *sym;
@@ -3580,7 +3580,7 @@ set_unwind_protect_ptr (ptrdiff_t count, void (*func) (void *), void *arg)
 /* Pop and execute entries from the unwind-protect stack until the
    depth COUNT is reached.  Return VALUE.  */
 
-_Hot Lisp_Object
+Lisp_Object
 unbind_to (ptrdiff_t count, Lisp_Object value)
 {
   Lisp_Object quitf = Vquit_flag;
diff --git a/src/keyboard.c b/src/keyboard.c
index 5569d0db2c..8fb6db987b 100644
--- a/src/keyboard.c
+++ b/src/keyboard.c
@@ -1038,7 +1038,7 @@ static Lisp_Object top_level_1 (Lisp_Object);
    This level has the catches for exiting/returning to editor command loop.
    It returns nil to exit recursive edit, t to abort it.  */
 
-_Hot Lisp_Object
+Lisp_Object
 command_loop (void)
 {
 #ifdef HAVE_STACK_OVERFLOW_HANDLING
@@ -8856,7 +8856,7 @@ void init_raw_keybuf_count (void)
    If FIX_CURRENT_BUFFER, we restore current_buffer
    from the selected window's buffer.  */
 
-static _Hot int
+static int
 read_key_sequence (Lisp_Object *keybuf, Lisp_Object prompt,
 		   bool dont_downcase_last, bool can_return_switch_frame,
 		   bool fix_current_buffer, bool prevent_redisplay)
diff --git a/src/lisp.h b/src/lisp.h
index b98cad5e8e..681efc3b52 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -34,9 +34,6 @@ along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.  */
 #include <intprops.h>
 #include <verify.h>
 
-#define _Cold __attribute__((cold))
-#define _Hot  __attribute__((hot))
-
 INLINE_HEADER_BEGIN
 
 /* Define a TYPE constant ID as an externally visible name.  Use like this:
@@ -624,7 +621,7 @@ extern Lisp_Object char_table_ref (Lisp_Object, int);
 extern void char_table_set (Lisp_Object, int, Lisp_Object);
 
 /* Defined in data.c.  */
-extern _Cold _Noreturn void wrong_type_argument (Lisp_Object, Lisp_Object);
+extern _Noreturn void wrong_type_argument (Lisp_Object, Lisp_Object);
 
 
 /* Defined in emacs.c.  */
@@ -4098,18 +4095,18 @@ extern Lisp_Object run_hook_with_args (ptrdiff_t nargs, Lisp_Object *args,
 				       Lisp_Object (*funcall)
 				       (ptrdiff_t nargs, Lisp_Object *args));
 extern Lisp_Object quit (void);
-INLINE _Cold _Noreturn void
+INLINE _Noreturn void
 xsignal (Lisp_Object error_symbol, Lisp_Object data)
 {
   Fsignal (error_symbol, data);
 }
-extern _Cold _Noreturn void xsignal0 (Lisp_Object);
-extern _Cold _Noreturn void xsignal1 (Lisp_Object, Lisp_Object);
-extern _Cold _Noreturn void xsignal2 (Lisp_Object, Lisp_Object, Lisp_Object);
-extern _Cold _Noreturn void xsignal3 (Lisp_Object, Lisp_Object, Lisp_Object,
+extern _Noreturn void xsignal0 (Lisp_Object);
+extern _Noreturn void xsignal1 (Lisp_Object, Lisp_Object);
+extern _Noreturn void xsignal2 (Lisp_Object, Lisp_Object, Lisp_Object);
+extern _Noreturn void xsignal3 (Lisp_Object, Lisp_Object, Lisp_Object,
 				Lisp_Object);
-extern _Cold _Noreturn void signal_error (const char *, Lisp_Object);
-extern _Cold _Noreturn void overflow_error (void);
+extern _Noreturn void signal_error (const char *, Lisp_Object);
+extern _Noreturn void overflow_error (void);
 extern bool FUNCTIONP (Lisp_Object);
 extern Lisp_Object funcall_subr (struct Lisp_Subr *subr, ptrdiff_t numargs, Lisp_Object *arg_vector);
 extern Lisp_Object eval_sub (Lisp_Object form);
@@ -4148,8 +4145,8 @@ extern void set_unwind_protect_ptr (ptrdiff_t, void (*) (void *), void *);
 extern Lisp_Object unbind_to (ptrdiff_t, Lisp_Object);
 extern void rebind_for_thread_switch (void);
 extern void unbind_for_thread_switch (struct thread_state *);
-extern _Cold _Noreturn void error (const char *, ...) ATTRIBUTE_FORMAT_PRINTF (1, 2);
-extern _Cold _Noreturn void verror (const char *, va_list)
+extern _Noreturn void error (const char *, ...) ATTRIBUTE_FORMAT_PRINTF (1, 2);
+extern _Noreturn void verror (const char *, va_list)
   ATTRIBUTE_FORMAT_PRINTF (1, 0);
 extern Lisp_Object vformat_string (const char *, va_list)
   ATTRIBUTE_FORMAT_PRINTF (1, 0);
diff --git a/src/lread.c b/src/lread.c
index 16ce4afd21..5f33fcd695 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -1912,7 +1912,7 @@ readevalloop_eager_expand_eval (Lisp_Object val, Lisp_Object macroexpand)
    START, END specify region to read in current buffer (from eval-region).
    If the input is not from a buffer, they must be nil.  */
 
-static _Hot void
+static void
 readevalloop (Lisp_Object readcharfun,
 	      struct infile *infile0,
 	      Lisp_Object sourcename,
@@ -2736,7 +2736,7 @@ read_integer (Lisp_Object readcharfun, EMACS_INT radix)
 
    FIRST_IN_LIST is true if this is the first element of a list.  */
 
-static _Hot Lisp_Object
+static Lisp_Object
 read1 (Lisp_Object readcharfun, int *pch, bool first_in_list)
 {
   int c;
@@ -4092,7 +4092,7 @@ check_obarray (Lisp_Object obarray)
 
 /* Intern symbol SYM in OBARRAY using bucket INDEX.  */
 
-static _Hot Lisp_Object
+static Lisp_Object
 intern_sym (Lisp_Object sym, Lisp_Object obarray, Lisp_Object index)
 {
   Lisp_Object *ptr;
@@ -4116,7 +4116,7 @@ intern_sym (Lisp_Object sym, Lisp_Object obarray, Lisp_Object index)
 
 /* Intern a symbol with name STRING in OBARRAY using bucket INDEX.  */
 
-_Hot Lisp_Object
+Lisp_Object
 intern_driver (Lisp_Object string, Lisp_Object obarray, Lisp_Object index)
 {
   return intern_sym (Fmake_symbol (string), obarray, index);
@@ -4125,7 +4125,7 @@ intern_driver (Lisp_Object string, Lisp_Object obarray, Lisp_Object index)
 /* Intern the C string STR: return a symbol with that name,
    interned in the current obarray.  */
 
-_Hot Lisp_Object
+Lisp_Object
 intern_1 (const char *str, ptrdiff_t len)
 {
   Lisp_Object obarray = check_obarray (Vobarray);
diff --git a/src/term.c b/src/term.c
index 472d8d19e5..a492276c88 100644
--- a/src/term.c
+++ b/src/term.c
@@ -4393,7 +4393,8 @@ use the Bourne shell command 'TERM=...; export TERM' (C-shell:\n\
   return terminal;
 }
 
-static _Cold void
+
+static void
 vfatal (const char *str, va_list ap)
 {
   fprintf (stderr, "emacs: ");
@@ -4409,7 +4410,7 @@ vfatal (const char *str, va_list ap)
    Delete TERMINAL, then call error or fatal with str1 or str2,
    respectively, according to whether MUST_SUCCEED is true.  */
 
-static _Cold void
+static void
 maybe_fatal (bool must_succeed, struct terminal *terminal,
 	     const char *str1, const char *str2, ...)
 {
@@ -4424,7 +4425,7 @@ maybe_fatal (bool must_succeed, struct terminal *terminal,
     verror (str1, ap);
 }
 
-_Cold void
+void
 fatal (const char *str, ...)
 {
   va_list ap;
diff --git a/src/xdisp.c b/src/xdisp.c
index aca6f09b05..a88fc698b8 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -13906,7 +13906,7 @@ do { if (polling_stopped_here) start_polling ();	\
 /* Perhaps in the future avoid recentering windows if it
    is not necessary; currently that causes some problems.  */
 
-static _Hot void
+static void
 redisplay_internal (void)
 {
   struct window *w = XWINDOW (selected_window);
@@ -14925,7 +14925,7 @@ buffer_flipping_blocked_p (void)
 
 /* Redisplay all leaf windows in the window tree rooted at WINDOW.  */
 
-static _Hot void
+static void
 redisplay_windows (Lisp_Object window)
 {
   while (!NILP (window))
@@ -16684,7 +16684,7 @@ set_horizontal_scroll_bar (struct window *w)
       showing point will be fully (as opposed to partially) visible on
       display.  */
 
-static _Hot void
+static void
 redisplay_window (Lisp_Object window, bool just_this_one_p)
 {
   struct window *w = XWINDOW (window);

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-16 20:50                   ` Alex Gramiak
@ 2019-04-16 21:11                     ` Alex Gramiak
  2019-04-16 21:27                     ` Stefan Monnier
  2019-04-16 21:27                     ` Konstantin Kharlamov
  2 siblings, 0 replies; 45+ messages in thread
From: Alex Gramiak @ 2019-04-16 21:11 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 6707 bytes --]

Alex Gramiak <agrambot@gmail.com> writes:
> Perhaps I was too overzealous with the hot tagging?

I definitely was, since I removed the _Hot attributes from the
*malloc/free procedures, which turned out to be in bad judgment.
Removing those yielded:

 Performance counter stats for 'src/emacs.hot-cold -f kill-emacs' (20 runs):

            751.22 msec task-clock:u              #    0.841 CPUs utilized            ( +-  0.14% )
                 0      context-switches:u        #    0.000 K/sec                  
                 0      cpu-migrations:u          #    0.000 K/sec                  
            12,928      page-faults:u             #    0.017 M/sec                    ( +-  0.02% )
     2,959,142,867      cycles:u                  #    3.939 GHz                      ( +-  0.05% )
     1,350,640,657      stalled-cycles-frontend:u #   45.64% frontend cycles idle     ( +-  0.12% )
       972,612,724      stalled-cycles-backend:u  #   32.87% backend cycles idle      ( +-  0.15% )
     4,865,119,525      instructions:u            #    1.64  insn per cycle         
                                                  #    0.28  stalled cycles per insn  ( +-  0.00% )
     1,035,582,401      branches:u                # 1378.539 M/sec                    ( +-  0.00% )
        17,794,068      branch-misses:u           #    1.72% of all branches          ( +-  0.14% )
     1,206,398,515      L1-dcache-loads:u         # 1605.925 M/sec                    ( +-  0.00% )
        42,095,141      L1-dcache-load-misses:u   #    3.49% of all L1-dcache hits    ( +-  0.05% )
         9,057,830      LLC-loads:u               #   12.058 M/sec                    ( +-  0.39% )
   <not supported>      LLC-load-misses:u                                           

           0.89309 +- 0.00484 seconds time elapsed  ( +-  0.54% )

I also realized that I completely forgot putting the attribute on
emacs_abort. With the _Cold attribute on emacs_abort:


 Performance counter stats for 'src/emacs -f kill-emacs' (20 runs):

            760.73 msec task-clock:u              #    0.846 CPUs utilized            ( +-  0.22% )
                 0      context-switches:u        #    0.000 K/sec                  
                 0      cpu-migrations:u          #    0.000 K/sec                  
            12,925      page-faults:u             #    0.017 M/sec                    ( +-  0.02% )
     2,991,729,656      cycles:u                  #    3.933 GHz                      ( +-  0.04% )
     1,388,332,047      stalled-cycles-frontend:u #   46.41% frontend cycles idle     ( +-  0.13% )
       976,840,303      stalled-cycles-backend:u  #   32.65% backend cycles idle      ( +-  0.17% )
     4,867,077,504      instructions:u            #    1.63  insn per cycle         
                                                  #    0.29  stalled cycles per insn  ( +-  0.00% )
     1,036,158,051      branches:u                # 1362.059 M/sec                    ( +-  0.00% )
        17,860,346      branch-misses:u           #    1.72% of all branches          ( +-  0.19% )
     1,207,924,887      L1-dcache-loads:u         # 1587.852 M/sec                    ( +-  0.00% )
        42,358,754      L1-dcache-load-misses:u   #    3.51% of all L1-dcache hits    ( +-  0.06% )
         9,046,859      LLC-loads:u               #   11.892 M/sec                    ( +-  0.32% )
   <not supported>      LLC-load-misses:u                                           

           0.89896 +- 0.00200 seconds time elapsed  ( +-  0.22% )


 Performance counter stats for 'src/emacs.cold -f kill-emacs' (20 runs):

            755.78 msec task-clock:u              #    0.845 CPUs utilized            ( +-  0.18% )
                 0      context-switches:u        #    0.000 K/sec                  
                 0      cpu-migrations:u          #    0.000 K/sec                  
            12,929      page-faults:u             #    0.017 M/sec                    ( +-  0.02% )
     2,978,556,838      cycles:u                  #    3.941 GHz                      ( +-  0.05% )
     1,370,387,120      stalled-cycles-frontend:u #   46.01% frontend cycles idle     ( +-  0.12% )
       978,514,384      stalled-cycles-backend:u  #   32.85% backend cycles idle      ( +-  0.16% )
     4,866,672,758      instructions:u            #    1.63  insn per cycle         
                                                  #    0.28  stalled cycles per insn  ( +-  0.00% )
     1,035,997,172      branches:u                # 1370.762 M/sec                    ( +-  0.00% )
        17,838,674      branch-misses:u           #    1.72% of all branches          ( +-  0.13% )
     1,206,937,944      L1-dcache-loads:u         # 1596.939 M/sec                    ( +-  0.00% )
        42,110,067      L1-dcache-load-misses:u   #    3.49% of all L1-dcache hits    ( +-  0.05% )
         9,088,714      LLC-loads:u               #   12.026 M/sec                    ( +-  0.28% )
   <not supported>      LLC-load-misses:u                                           

           0.89401 +- 0.00185 seconds time elapsed  ( +-  0.21% )



 Performance counter stats for 'src/emacs.hot-cold -f kill-emacs' (20 runs):

            752.56 msec task-clock:u              #    0.846 CPUs utilized            ( +-  0.18% )
                 0      context-switches:u        #    0.000 K/sec                  
                 0      cpu-migrations:u          #    0.000 K/sec                  
            12,923      page-faults:u             #    0.017 M/sec                    ( +-  0.01% )
     2,973,502,618      cycles:u                  #    3.951 GHz                      ( +-  0.05% )
     1,368,004,926      stalled-cycles-frontend:u #   46.01% frontend cycles idle     ( +-  0.11% )
       974,077,949      stalled-cycles-backend:u  #   32.76% backend cycles idle      ( +-  0.13% )
     4,865,128,800      instructions:u            #    1.64  insn per cycle         
                                                  #    0.28  stalled cycles per insn  ( +-  0.00% )
     1,035,577,546      branches:u                # 1376.070 M/sec                    ( +-  0.00% )
        17,721,035      branch-misses:u           #    1.71% of all branches          ( +-  0.17% )
     1,206,420,627      L1-dcache-loads:u         # 1603.086 M/sec                    ( +-  0.00% )
        42,129,928      L1-dcache-load-misses:u   #    3.49% of all L1-dcache hits    ( +-  0.04% )
         9,033,444      LLC-loads:u               #   12.004 M/sec                    ( +-  0.40% )
   <not supported>      LLC-load-misses:u                                           

           0.88928 +- 0.00161 seconds time elapsed  ( +-  0.18% )



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: hot/cold v2 --]
[-- Type: text/x-patch, Size: 16571 bytes --]

diff --git a/src/alloc.c b/src/alloc.c
index dd783863be..21aba9e7a9 100644
--- a/src/alloc.c
+++ b/src/alloc.c
@@ -633,7 +633,7 @@ display_malloc_warning (void)
 \f
 /* Called if we can't allocate relocatable space for a buffer.  */
 
-void
+_Cold void
 buffer_memory_full (ptrdiff_t nbytes)
 {
   /* If buffers use the relocating allocator, no need to free
@@ -2325,7 +2325,7 @@ compact_small_strings (void)
   current_sblock = tb;
 }
 
-void
+_Cold void
 string_overflow (void)
 {
   error ("Maximum string size exceeded");
@@ -4085,7 +4085,7 @@ set_interval_marked (INTERVAL i)
    either case this counts as memory being full even though malloc did
    not fail.  */
 
-void
+_Cold void
 memory_full (size_t nbytes)
 {
   /* Do not go into hysterics merely because a large request failed.  */
diff --git a/src/bytecode.c b/src/bytecode.c
index 40977799bf..e2fe7153b0 100644
--- a/src/bytecode.c
+++ b/src/bytecode.c
@@ -336,7 +336,7 @@ bcall0 (Lisp_Object f)
    ARGS are pushed on the stack according to ARGS_TEMPLATE before
    executing BYTESTR.  */
 
-Lisp_Object
+_Hot Lisp_Object
 exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
 		Lisp_Object args_template, ptrdiff_t nargs, Lisp_Object *args)
 {
diff --git a/src/data.c b/src/data.c
index 11cd598ed8..1291a92955 100644
--- a/src/data.c
+++ b/src/data.c
@@ -130,7 +130,7 @@ set_blv_valcell (struct Lisp_Buffer_Local_Value *blv, Lisp_Object val)
   blv->valcell = val;
 }
 
-static _Noreturn void
+static _Cold _Noreturn void
 wrong_length_argument (Lisp_Object a1, Lisp_Object a2, Lisp_Object a3)
 {
   Lisp_Object size1 = make_fixnum (bool_vector_size (a1));
@@ -142,7 +142,7 @@ wrong_length_argument (Lisp_Object a1, Lisp_Object a2, Lisp_Object a3)
 	      make_fixnum (bool_vector_size (a3)));
 }
 
-_Noreturn void
+_Cold _Noreturn void
 wrong_type_argument (register Lisp_Object predicate, register Lisp_Object value)
 {
   /* If VALUE is not even a valid Lisp object, we'd want to abort here
@@ -155,25 +155,25 @@ wrong_type_argument (register Lisp_Object predicate, register Lisp_Object value)
   xsignal2 (Qwrong_type_argument, predicate, value);
 }
 
-void
+_Cold void
 pure_write_error (Lisp_Object obj)
 {
   xsignal2 (Qerror, build_string ("Attempt to modify read-only object"), obj);
 }
 
-void
+_Cold void
 args_out_of_range (Lisp_Object a1, Lisp_Object a2)
 {
   xsignal2 (Qargs_out_of_range, a1, a2);
 }
 
-void
+_Cold void
 args_out_of_range_3 (Lisp_Object a1, Lisp_Object a2, Lisp_Object a3)
 {
   xsignal3 (Qargs_out_of_range, a1, a2, a3);
 }
 
-void
+_Cold void
 circular_list (Lisp_Object list)
 {
   xsignal1 (Qcircular_list, list);
@@ -1018,7 +1018,7 @@ do_symval_forwarding (lispfwd valcontents)
 /* Used to signal a user-friendly error when symbol WRONG is
    not a member of CHOICE, which should be a list of symbols.  */
 
-void
+_Cold void
 wrong_choice (Lisp_Object choice, Lisp_Object wrong)
 {
   ptrdiff_t i = 0, len = list_length (choice);
@@ -1051,7 +1051,7 @@ wrong_choice (Lisp_Object choice, Lisp_Object wrong)
 /* Used to signal a user-friendly error if WRONG is not a number or
    integer/floating-point number outsize of inclusive MIN..MAX range.  */
 
-static void
+static _Cold void
 wrong_range (Lisp_Object min, Lisp_Object max, Lisp_Object wrong)
 {
   AUTO_STRING (value_should_be_from, "Value should be from ");
diff --git a/src/dispnew.c b/src/dispnew.c
index ccb08ec1b9..db9166cbe6 100644
--- a/src/dispnew.c
+++ b/src/dispnew.c
@@ -360,7 +360,7 @@ verify_row_hash (struct glyph_row *row)
    what is displayed on the screen.  While this is usually fast, it
    leads to screen flickering.  */
 
-static void
+static _Hot void
 adjust_glyph_matrix (struct window *w, struct glyph_matrix *matrix, int x, int y, struct dim dim)
 {
   int i;
@@ -2994,7 +2994,7 @@ window_to_frame_hpos (struct window *w, int hpos)
 
 /* Redraw frame F.  */
 
-void
+_Hot void
 redraw_frame (struct frame *f)
 {
   /* Error if F has no glyphs.  */
@@ -3048,7 +3048,7 @@ DEFUN ("redraw-display", Fredraw_display, Sredraw_display, 0, 0, "",
 
    Value is true if redisplay was stopped due to pending input.  */
 
-bool
+_Hot bool
 update_frame (struct frame *f, bool force_p, bool inhibit_hairy_id_p)
 {
   /* True means display has been paused because of pending input.  */
@@ -3384,7 +3384,7 @@ check_current_matrix_flags (struct window *w)
 /* Update display of window W.
    If FORCE_P, don't stop updating when input is pending.  */
 
-static bool
+static _Hot bool
 update_window (struct window *w, bool force_p)
 {
   struct glyph_matrix *desired_matrix = w->desired_matrix;
@@ -3580,7 +3580,7 @@ update_marginal_area (struct window *w, struct glyph_row *updated_row,
 /* Update the display of the text area of row VPOS in window W.
    Value is true if display has changed.  */
 
-static bool
+static _Hot bool
 update_text_area (struct window *w, struct glyph_row *updated_row, int vpos)
 {
   struct glyph_row *current_row = MATRIX_ROW (w->current_matrix, vpos);
@@ -4476,7 +4476,7 @@ scrolling_window (struct window *w, bool header_line_p)
 
    Value is true if update was stopped due to pending input.  */
 
-static bool
+static _Hot bool
 update_frame_1 (struct frame *f, bool force_p, bool inhibit_id_p,
 		bool set_cursor_p, bool updating_menu_p)
 {
@@ -4779,7 +4779,7 @@ count_match (struct glyph *str1, struct glyph *end1, struct glyph *str2, struct
 
 /* Perform a frame-based update on line VPOS in frame FRAME.  */
 
-static void
+static _Hot void
 update_frame_line (struct frame *f, int vpos, bool updating_menu_p)
 {
   struct glyph *obody, *nbody, *op1, *op2, *np1, *nend;
diff --git a/src/eval.c b/src/eval.c
index e9f118c5cb..eb3d856a8f 100644
--- a/src/eval.c
+++ b/src/eval.c
@@ -1707,25 +1707,25 @@ signal_or_quit (Lisp_Object error_symbol, Lisp_Object data, bool keyboard_quit)
 
 /* Like xsignal, but takes 0, 1, 2, or 3 args instead of a list.  */
 
-void
+_Cold void
 xsignal0 (Lisp_Object error_symbol)
 {
   xsignal (error_symbol, Qnil);
 }
 
-void
+_Cold void
 xsignal1 (Lisp_Object error_symbol, Lisp_Object arg)
 {
   xsignal (error_symbol, list1 (arg));
 }
 
-void
+_Cold void
 xsignal2 (Lisp_Object error_symbol, Lisp_Object arg1, Lisp_Object arg2)
 {
   xsignal (error_symbol, list2 (arg1, arg2));
 }
 
-void
+_Cold void
 xsignal3 (Lisp_Object error_symbol, Lisp_Object arg1, Lisp_Object arg2, Lisp_Object arg3)
 {
   xsignal (error_symbol, list3 (arg1, arg2, arg3));
@@ -1734,7 +1734,7 @@ xsignal3 (Lisp_Object error_symbol, Lisp_Object arg1, Lisp_Object arg2, Lisp_Obj
 /* Signal `error' with message S, and additional arg ARG.
    If ARG is not a proper list, make it a one-element list.  */
 
-void
+_Cold void
 signal_error (const char *s, Lisp_Object arg)
 {
   if (NILP (Fproper_list_p (arg)))
@@ -1745,7 +1745,7 @@ signal_error (const char *s, Lisp_Object arg)
 
 /* Use this for arithmetic overflow, e.g., when an integer result is
    too large even for a bignum.  */
-void
+_Cold void
 overflow_error (void)
 {
   xsignal0 (Qoverflow_error);
@@ -1892,7 +1892,7 @@ vformat_string (const char *m, va_list ap)
 }
 
 /* Dump an error message; called like vprintf.  */
-void
+_Cold void
 verror (const char *m, va_list ap)
 {
   xsignal1 (Qerror, vformat_string (m, ap));
@@ -1902,7 +1902,7 @@ verror (const char *m, va_list ap)
 /* Dump an error message; called like printf.  */
 
 /* VARARGS 1 */
-void
+_Cold void
 error (const char *m, ...)
 {
   va_list ap;
@@ -2171,7 +2171,7 @@ record_in_backtrace (Lisp_Object function, Lisp_Object *args, ptrdiff_t nargs)
 
 /* Eval a sub-expression of the current expression (i.e. in the same
    lexical scope).  */
-Lisp_Object
+_Hot Lisp_Object
 eval_sub (Lisp_Object form)
 {
   Lisp_Object fun, val, original_fun, original_args;
@@ -2863,7 +2863,7 @@ usage: (funcall FUNCTION &rest ARGUMENTS)  */)
 /* Apply a C subroutine SUBR to the NUMARGS evaluated arguments in ARG_VECTOR
    and return the result of evaluation.  */
 
-Lisp_Object
+_Hot Lisp_Object
 funcall_subr (struct Lisp_Subr *subr, ptrdiff_t numargs, Lisp_Object *args)
 {
   if (numargs < subr->min_args
@@ -2942,7 +2942,7 @@ funcall_subr (struct Lisp_Subr *subr, ptrdiff_t numargs, Lisp_Object *args)
     }
 }
 
-static Lisp_Object
+static _Hot Lisp_Object
 apply_lambda (Lisp_Object fun, Lisp_Object args, ptrdiff_t count)
 {
   Lisp_Object *arg_vector;
@@ -2978,7 +2978,7 @@ apply_lambda (Lisp_Object fun, Lisp_Object args, ptrdiff_t count)
    FUN must be either a lambda-expression, a compiled-code object,
    or a module function.  */
 
-static Lisp_Object
+static _Hot Lisp_Object
 funcall_lambda (Lisp_Object fun, ptrdiff_t nargs,
 		register Lisp_Object *arg_vector)
 {
@@ -3322,7 +3322,7 @@ do_specbind (struct Lisp_Symbol *sym, union specbinding *bind,
      i.e. bindings to the default value of a variable which can be
      buffer-local.  */
 
-void
+_Hot void
 specbind (Lisp_Object symbol, Lisp_Object value)
 {
   struct Lisp_Symbol *sym;
@@ -3580,7 +3580,7 @@ set_unwind_protect_ptr (ptrdiff_t count, void (*func) (void *), void *arg)
 /* Pop and execute entries from the unwind-protect stack until the
    depth COUNT is reached.  Return VALUE.  */
 
-Lisp_Object
+_Hot Lisp_Object
 unbind_to (ptrdiff_t count, Lisp_Object value)
 {
   Lisp_Object quitf = Vquit_flag;
diff --git a/src/keyboard.c b/src/keyboard.c
index 8fb6db987b..5569d0db2c 100644
--- a/src/keyboard.c
+++ b/src/keyboard.c
@@ -1038,7 +1038,7 @@ static Lisp_Object top_level_1 (Lisp_Object);
    This level has the catches for exiting/returning to editor command loop.
    It returns nil to exit recursive edit, t to abort it.  */
 
-Lisp_Object
+_Hot Lisp_Object
 command_loop (void)
 {
 #ifdef HAVE_STACK_OVERFLOW_HANDLING
@@ -8856,7 +8856,7 @@ void init_raw_keybuf_count (void)
    If FIX_CURRENT_BUFFER, we restore current_buffer
    from the selected window's buffer.  */
 
-static int
+static _Hot int
 read_key_sequence (Lisp_Object *keybuf, Lisp_Object prompt,
 		   bool dont_downcase_last, bool can_return_switch_frame,
 		   bool fix_current_buffer, bool prevent_redisplay)
diff --git a/src/lisp.h b/src/lisp.h
index 681efc3b52..b98cad5e8e 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -34,6 +34,9 @@ along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.  */
 #include <intprops.h>
 #include <verify.h>
 
+#define _Cold __attribute__((cold))
+#define _Hot  __attribute__((hot))
+
 INLINE_HEADER_BEGIN
 
 /* Define a TYPE constant ID as an externally visible name.  Use like this:
@@ -621,7 +624,7 @@ extern Lisp_Object char_table_ref (Lisp_Object, int);
 extern void char_table_set (Lisp_Object, int, Lisp_Object);
 
 /* Defined in data.c.  */
-extern _Noreturn void wrong_type_argument (Lisp_Object, Lisp_Object);
+extern _Cold _Noreturn void wrong_type_argument (Lisp_Object, Lisp_Object);
 
 
 /* Defined in emacs.c.  */
@@ -4095,18 +4098,18 @@ extern Lisp_Object run_hook_with_args (ptrdiff_t nargs, Lisp_Object *args,
 				       Lisp_Object (*funcall)
 				       (ptrdiff_t nargs, Lisp_Object *args));
 extern Lisp_Object quit (void);
-INLINE _Noreturn void
+INLINE _Cold _Noreturn void
 xsignal (Lisp_Object error_symbol, Lisp_Object data)
 {
   Fsignal (error_symbol, data);
 }
-extern _Noreturn void xsignal0 (Lisp_Object);
-extern _Noreturn void xsignal1 (Lisp_Object, Lisp_Object);
-extern _Noreturn void xsignal2 (Lisp_Object, Lisp_Object, Lisp_Object);
-extern _Noreturn void xsignal3 (Lisp_Object, Lisp_Object, Lisp_Object,
+extern _Cold _Noreturn void xsignal0 (Lisp_Object);
+extern _Cold _Noreturn void xsignal1 (Lisp_Object, Lisp_Object);
+extern _Cold _Noreturn void xsignal2 (Lisp_Object, Lisp_Object, Lisp_Object);
+extern _Cold _Noreturn void xsignal3 (Lisp_Object, Lisp_Object, Lisp_Object,
 				Lisp_Object);
-extern _Noreturn void signal_error (const char *, Lisp_Object);
-extern _Noreturn void overflow_error (void);
+extern _Cold _Noreturn void signal_error (const char *, Lisp_Object);
+extern _Cold _Noreturn void overflow_error (void);
 extern bool FUNCTIONP (Lisp_Object);
 extern Lisp_Object funcall_subr (struct Lisp_Subr *subr, ptrdiff_t numargs, Lisp_Object *arg_vector);
 extern Lisp_Object eval_sub (Lisp_Object form);
@@ -4145,8 +4148,8 @@ extern void set_unwind_protect_ptr (ptrdiff_t, void (*) (void *), void *);
 extern Lisp_Object unbind_to (ptrdiff_t, Lisp_Object);
 extern void rebind_for_thread_switch (void);
 extern void unbind_for_thread_switch (struct thread_state *);
-extern _Noreturn void error (const char *, ...) ATTRIBUTE_FORMAT_PRINTF (1, 2);
-extern _Noreturn void verror (const char *, va_list)
+extern _Cold _Noreturn void error (const char *, ...) ATTRIBUTE_FORMAT_PRINTF (1, 2);
+extern _Cold _Noreturn void verror (const char *, va_list)
   ATTRIBUTE_FORMAT_PRINTF (1, 0);
 extern Lisp_Object vformat_string (const char *, va_list)
   ATTRIBUTE_FORMAT_PRINTF (1, 0);
diff --git a/src/lread.c b/src/lread.c
index 5f33fcd695..16ce4afd21 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -1912,7 +1912,7 @@ readevalloop_eager_expand_eval (Lisp_Object val, Lisp_Object macroexpand)
    START, END specify region to read in current buffer (from eval-region).
    If the input is not from a buffer, they must be nil.  */
 
-static void
+static _Hot void
 readevalloop (Lisp_Object readcharfun,
 	      struct infile *infile0,
 	      Lisp_Object sourcename,
@@ -2736,7 +2736,7 @@ read_integer (Lisp_Object readcharfun, EMACS_INT radix)
 
    FIRST_IN_LIST is true if this is the first element of a list.  */
 
-static Lisp_Object
+static _Hot Lisp_Object
 read1 (Lisp_Object readcharfun, int *pch, bool first_in_list)
 {
   int c;
@@ -4092,7 +4092,7 @@ check_obarray (Lisp_Object obarray)
 
 /* Intern symbol SYM in OBARRAY using bucket INDEX.  */
 
-static Lisp_Object
+static _Hot Lisp_Object
 intern_sym (Lisp_Object sym, Lisp_Object obarray, Lisp_Object index)
 {
   Lisp_Object *ptr;
@@ -4116,7 +4116,7 @@ intern_sym (Lisp_Object sym, Lisp_Object obarray, Lisp_Object index)
 
 /* Intern a symbol with name STRING in OBARRAY using bucket INDEX.  */
 
-Lisp_Object
+_Hot Lisp_Object
 intern_driver (Lisp_Object string, Lisp_Object obarray, Lisp_Object index)
 {
   return intern_sym (Fmake_symbol (string), obarray, index);
@@ -4125,7 +4125,7 @@ intern_driver (Lisp_Object string, Lisp_Object obarray, Lisp_Object index)
 /* Intern the C string STR: return a symbol with that name,
    interned in the current obarray.  */
 
-Lisp_Object
+_Hot Lisp_Object
 intern_1 (const char *str, ptrdiff_t len)
 {
   Lisp_Object obarray = check_obarray (Vobarray);
diff --git a/src/sysdep.c b/src/sysdep.c
index 57ea8220ca..84e118c250 100644
--- a/src/sysdep.c
+++ b/src/sysdep.c
@@ -2447,7 +2447,7 @@ emacs_backtrace (int backtrace_limit)
 }
 \f
 #ifndef HAVE_NTGUI
-void
+_Cold void
 emacs_abort (void)
 {
   terminate_due_to_signal (SIGABRT, 40);
diff --git a/src/term.c b/src/term.c
index a492276c88..472d8d19e5 100644
--- a/src/term.c
+++ b/src/term.c
@@ -4393,8 +4393,7 @@ use the Bourne shell command 'TERM=...; export TERM' (C-shell:\n\
   return terminal;
 }
 
-
-static void
+static _Cold void
 vfatal (const char *str, va_list ap)
 {
   fprintf (stderr, "emacs: ");
@@ -4410,7 +4409,7 @@ vfatal (const char *str, va_list ap)
    Delete TERMINAL, then call error or fatal with str1 or str2,
    respectively, according to whether MUST_SUCCEED is true.  */
 
-static void
+static _Cold void
 maybe_fatal (bool must_succeed, struct terminal *terminal,
 	     const char *str1, const char *str2, ...)
 {
@@ -4425,7 +4424,7 @@ maybe_fatal (bool must_succeed, struct terminal *terminal,
     verror (str1, ap);
 }
 
-void
+_Cold void
 fatal (const char *str, ...)
 {
   va_list ap;
diff --git a/src/xdisp.c b/src/xdisp.c
index a88fc698b8..aca6f09b05 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -13906,7 +13906,7 @@ do { if (polling_stopped_here) start_polling ();	\
 /* Perhaps in the future avoid recentering windows if it
    is not necessary; currently that causes some problems.  */
 
-static void
+static _Hot void
 redisplay_internal (void)
 {
   struct window *w = XWINDOW (selected_window);
@@ -14925,7 +14925,7 @@ buffer_flipping_blocked_p (void)
 
 /* Redisplay all leaf windows in the window tree rooted at WINDOW.  */
 
-static void
+static _Hot void
 redisplay_windows (Lisp_Object window)
 {
   while (!NILP (window))
@@ -16684,7 +16684,7 @@ set_horizontal_scroll_bar (struct window *w)
       showing point will be fully (as opposed to partially) visible on
       display.  */
 
-static void
+static _Hot void
 redisplay_window (Lisp_Object window, bool just_this_one_p)
 {
   struct window *w = XWINDOW (window);

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-16 20:50                   ` Alex Gramiak
  2019-04-16 21:11                     ` Alex Gramiak
@ 2019-04-16 21:27                     ` Stefan Monnier
  2019-04-16 21:27                     ` Konstantin Kharlamov
  2 siblings, 0 replies; 45+ messages in thread
From: Stefan Monnier @ 2019-04-16 21:27 UTC (permalink / raw)
  To: Alex Gramiak; +Cc: Paul Eggert, emacs-devel

> Which again shows a slight improvement with the Cold attributes, and
> still shows the hot attributes degrading performance.  Perhaps I was too
> overzealous with the hot tagging?

I think it's pretty tricky to decide which functions should be "hot", so
I'm not surprised you get worse results there: That matches the past
experience with programmer-annotated likelihood ;-)

For "cold" I think the meaning is pretty clear: assume this code is
almost never executed which can both mean that functions that call it
can be optimized under the assumption that those calls are unlikely, and
at the same time spend less time optimizing the function (and move it
"out of the way" to a different page).

But for "hot", it's not clear whether it means "called very often" or
"we spend a lot of time inside of it".  E.g. it would never have
occurred to me to mark  redisplay_internal as "hot".


        Stefan



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-16 20:50                   ` Alex Gramiak
  2019-04-16 21:11                     ` Alex Gramiak
  2019-04-16 21:27                     ` Stefan Monnier
@ 2019-04-16 21:27                     ` Konstantin Kharlamov
  2019-04-18  8:25                       ` Paul Eggert
  2 siblings, 1 reply; 45+ messages in thread
From: Konstantin Kharlamov @ 2019-04-16 21:27 UTC (permalink / raw)
  To: Alex Gramiak; +Cc: Paul Eggert, Stefan Monnier, emacs-devel

FWIW I was in a similar search not so long ago, and I was told that 
e.g. "cold" attribute can sometimes produce unbearably slow code 
https://gcc.gnu.org/ml/gcc-help/2019-01/msg00035.html

В Вт, апр 16, 2019 at 14:50, Alex Gramiak <agrambot@gmail.com> 
написал:
> Paul Eggert <eggert@cs.ucla.edu> writes:
> 
>>  That being said, it might make sense for a few 
>> obviously-rarely-called
>>  functions like 'emacs-abort' to be marked with __attribute__ 
>> ((cold)),
>>  so long as we don't turn this into a mission to mark all cold 
>> functions
>>  (which would cost us more than it would benefit). That is what GCC
>>  itself does, with its own functions. However, I'd like to see
>>  performance figures. Could you try it out on the benchmark of 'cd 
>> lisp
>>  && time make compile-always'?
> 
> Right, I agree that if used, they should be used sparingly. I tested
> three versions a few times each with both 'make' and 'make -j4':
> 
> a) Regular Emacs master.
> b) The below diff with only the _Cold attribute
> c) The below diff with both _Cold and _Hot attributes
> 
> a) Normal
> real    4:17.97s
> user    3:57.18s
> sys     20.394s
> 
> real    1:17.67s
> user    4:23.78s
> sys     18.888s
> 
> b) Cold
> real    4:10.92s
> user    3:50.34s
> sys     20.178s
> 
> real    1:15.77s
> user    4:16.73s
> sys     18.943s
> 
> c) Hot/Cold
> real    4:11.43s
> user    3:51.07s
> sys     19.961s
> 
> real    1:16.01s
> user    4:17.63s
> sys     18.662s
> 
> So not much of a difference. For some reason the Hot/Cold performed
> consistently worse than Cold.
> 
> I also tested startup/shutdown with perf:
> 
>  Performance counter stats for '../emacs-normal -f kill-emacs' (20 
> runs):
> 
>             762.17 msec task-clock:u              #    0.844 CPUs 
> utilized            ( +-  0.23% )
>                  0      context-switches:u        #    0.000 K/sec
>                  0      cpu-migrations:u          #    0.000 K/sec
>             12,941      page-faults:u             #    0.017 M/sec    
>                 ( +-  0.01% )
>      2,998,322,125      cycles:u                  #    3.934 GHz      
>                 ( +-  0.06% )
>      1,392,869,413      stalled-cycles-frontend:u #   46.45% frontend 
> cycles idle     ( +-  0.15% )
>        982,206,843      stalled-cycles-backend:u  #   32.76% backend 
> cycles idle      ( +-  0.18% )
>      4,874,186,825      instructions:u            #    1.63  insn per 
> cycle
>                                                   #    0.29  stalled 
> cycles per insn  ( +-  0.01% )
>      1,037,929,374      branches:u                # 1361.802 M/sec    
>                 ( +-  0.01% )
>         17,930,471      branch-misses:u           #    1.73% of all 
> branches          ( +-  0.16% )
>      1,209,539,215      L1-dcache-loads:u         # 1586.960 M/sec    
>                 ( +-  0.01% )
>         42,346,229      L1-dcache-load-misses:u   #    3.50% of all 
> L1-dcache hits    ( +-  0.05% )
>          9,088,647      LLC-loads:u               #   11.925 M/sec    
>                 ( +-  0.29% )
>    <not supported>      LLC-load-misses:u
> 
>            0.90325 +- 0.00441 seconds time elapsed  ( +-  0.49% )
> 
> 
> 
>  Performance counter stats for '../emacs.cold -f kill-emacs' (20 
> runs):
> 
>             755.94 msec task-clock:u              #    0.845 CPUs 
> utilized            ( +-  0.24% )
>                  0      context-switches:u        #    0.000 K/sec
>                  0      cpu-migrations:u          #    0.000 K/sec
>             12,941      page-faults:u             #    0.017 M/sec    
>                 ( +-  0.01% )
>      2,976,036,365      cycles:u                  #    3.937 GHz      
>                 ( +-  0.06% )
>      1,374,451,779      stalled-cycles-frontend:u #   46.18% frontend 
> cycles idle     ( +-  0.14% )
>        990,227,732      stalled-cycles-backend:u  #   33.27% backend 
> cycles idle      ( +-  0.18% )
>      4,878,661,927      instructions:u            #    1.64  insn per 
> cycle
>                                                   #    0.28  stalled 
> cycles per insn  ( +-  0.00% )
>      1,038,495,525      branches:u                # 1373.782 M/sec    
>                 ( +-  0.00% )
>         17,859,906      branch-misses:u           #    1.72% of all 
> branches          ( +-  0.16% )
>      1,209,345,531      L1-dcache-loads:u         # 1599.792 M/sec    
>                 ( +-  0.00% )
>         42,444,358      L1-dcache-load-misses:u   #    3.51% of all 
> L1-dcache hits    ( +-  0.06% )
>          9,204,368      LLC-loads:u               #   12.176 M/sec    
>                 ( +-  0.41% )
>    <not supported>      LLC-load-misses:u
> 
>            0.89430 +- 0.00217 seconds time elapsed  ( +-  0.24% )
> 
> 
>  Performance counter stats for '../emacs.hot-cold -f kill-emacs' (20 
> runs):
> 
>             761.97 msec task-clock:u              #    0.845 CPUs 
> utilized            ( +-  0.20% )
>                  0      context-switches:u        #    0.000 K/sec
>                  0      cpu-migrations:u          #    0.000 K/sec
>             12,947      page-faults:u             #    0.017 M/sec    
>                 ( +-  0.01% )
>      2,989,750,359      cycles:u                  #    3.924 GHz      
>                 ( +-  0.04% )
>      1,383,312,275      stalled-cycles-frontend:u #   46.27% frontend 
> cycles idle     ( +-  0.12% )
>        994,643,853      stalled-cycles-backend:u  #   33.27% backend 
> cycles idle      ( +-  0.13% )
>      4,879,318,990      instructions:u            #    1.63  insn per 
> cycle
>                                                   #    0.28  stalled 
> cycles per insn  ( +-  0.00% )
>      1,038,584,045      branches:u                # 1363.022 M/sec    
>                 ( +-  0.00% )
>         17,863,736      branch-misses:u           #    1.72% of all 
> branches          ( +-  0.13% )
>      1,209,327,347      L1-dcache-loads:u         # 1587.103 M/sec    
>                 ( +-  0.00% )
>         42,501,374      L1-dcache-load-misses:u   #    3.51% of all 
> L1-dcache hits    ( +-  0.05% )
>          9,201,311      LLC-loads:u               #   12.076 M/sec    
>                 ( +-  0.28% )
>    <not supported>      LLC-load-misses:u
> 
>            0.90132 +- 0.00201 seconds time elapsed  ( +-  0.22% )
> 
> 
> Which again shows a slight improvement with the Cold attributes, and
> still shows the hot attributes degrading performance. Perhaps I was 
> too
> overzealous with the hot tagging?
> 





^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-16 21:27                     ` Konstantin Kharlamov
@ 2019-04-18  8:25                       ` Paul Eggert
  2019-04-18  8:43                         ` Konstantin Kharlamov
                                           ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Paul Eggert @ 2019-04-18  8:25 UTC (permalink / raw)
  To: Konstantin Kharlamov; +Cc: Stefan Monnier, Alex Gramiak, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 601 bytes --]

Konstantin Kharlamov wrote:
> I was told that e.g. "cold" 
> attribute can sometimes produce unbearably slow code 
> https://gcc.gnu.org/ml/gcc-help/2019-01/msg00035.html

Although cold functions can be slow, it appears that overall it's a win for 
Emacs to mark _Noreturn error function declarations as cold: on my platform, 
'make compile-always' ran about 1.3% faster. So I installed the attached patch 
into master. (Like Stefan, I'm wary of marking functions 'hot' so I didn't do that.)

This patch also adds a convenience macro AVOID for the now-common pattern 
'_Noreturn ATTRIBUTE_COLD void'.

[-- Attachment #2: 0001-Mark-_Noreturn-error-functions-as-cold.txt --]
[-- Type: text/plain, Size: 27463 bytes --]

From 6d6c55db2cdfb6b354873f17285a3f602e011817 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Thu, 18 Apr 2019 00:30:24 -0700
Subject: [PATCH] Mark _Noreturn error functions as cold
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On my platform this made ‘make compile-always’ 1.3% faster.
Suggested by Alex Gramiak in:
https://lists.gnu.org/r/emacs-devel/2019-04/msg00684.html
* configure.ac (nw): Don’t use -Wsuggest-attribute=cold.
* lib-src/make-docfile.c (write_globals):
Mark noreturn functions as cold.
* src/callproc.c (exec_failed):
* src/data.c (wrong_length_argument, wrong_type_argument):
* src/emacs-module.c (module_abort):
* src/emacs.c (terminate_due_to_signal):
* src/eval.c (unwind_to_catch):
* src/image.c (my_png_error, my_error_exit):
* src/json.c (json_out_of_memory, json_parse_error):
* src/keyboard.c (quit_throw_to_read_char, user_error):
* src/lisp.h (die, wrong_type_argument, wrong_choice)
(args_out_of_range, args_out_of_range_3, circular_list)
(buffer_overflow, memory_full, buffer_memory_full)
(string_overflow, xsignal, xsignal0, xsignal1, xsignal2)
(xsignal3, signal_error, overflow_error, error, verror)
(nsberror, report_file_errno, report_file_error)
(report_file_notify_error, terminate_due_to_signal)
(emacs_abort, fatal):
* src/lread.c (load_error_old_style_backquotes)
(end_of_file_error, invalid_syntax):
* src/pdumper.c (error_unsupported_dump_object):
* src/puresize.h (pure_write_error):
* src/search.c (matcher_overflow):
* src/sound.c (sound_perror, alsa_sound_perror):
* src/sysdep.c (handle_arith_signal):
* src/systime.h (time_overflow):
* src/term.c (maybe_fatal, vfatal):
* src/textprop.c (text_read_only):
* src/timefns.c (invalid_time_zone_specification)
(time_error, invalid_hz):
* src/xterm.c (x_connection_closed):
Use AVOID instead of _Noreturn void, so that it’s marked cold.
* src/conf_post.h (__has_attribute_cold) [!__has_attribute]:
New macro.
(ATTRIBUTE_COLD): New macro.
* src/frame.h (WINDOW_SYSTEM_RETURN): Add ATTRIBUTE_COLD.
* src/lisp.h (AVOID): New macro.
* src/xterm.c: Omit unnecessary static decls, so that we needn’t
worry about which functions should be marked cold.
(x_io_error_quitter): Mark as cold.
---
 configure.ac           |  3 +++
 lib-src/make-docfile.c |  2 ++
 src/callproc.c         |  2 +-
 src/conf_post.h        |  7 +++++
 src/data.c             |  4 +--
 src/emacs-module.c     |  6 ++---
 src/emacs.c            |  2 +-
 src/eval.c             |  2 +-
 src/frame.h            |  2 +-
 src/image.c            |  4 +--
 src/json.c             |  4 +--
 src/keyboard.c         |  6 ++---
 src/lisp.h             | 58 +++++++++++++++++++++---------------------
 src/lread.c            |  6 ++---
 src/pdumper.c          |  3 +--
 src/puresize.h         |  2 +-
 src/search.c           |  2 +-
 src/sound.c            |  4 +--
 src/sysdep.c           |  2 +-
 src/systime.h          |  2 +-
 src/term.c             |  7 +++--
 src/textprop.c         |  2 +-
 src/timefns.c          |  6 ++---
 src/xterm.c            | 35 +++----------------------
 24 files changed, 75 insertions(+), 98 deletions(-)

diff --git a/configure.ac b/configure.ac
index 3cebf3d78c..9d39bdd76b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1068,6 +1068,9 @@ AC_DEFUN
   # Emacs's use of alloca inhibits protecting the stack.
   nw="$nw -Wstack-protector"
 
+  # Emacs's use of __attribute__ ((cold)) causes false alarms with this option.
+  nw="$nw -Wsuggest-attribute=cold"
+
   # Emacs's use of partly-const functions such as Fgnutls_available_p
   # make this option problematic.
   nw="$nw -Wsuggest-attribute=const"
diff --git a/lib-src/make-docfile.c b/lib-src/make-docfile.c
index ccd245e013..4d25b0a6b9 100644
--- a/lib-src/make-docfile.c
+++ b/lib-src/make-docfile.c
@@ -747,6 +747,8 @@ write_globals (void)
 	    printf ("%d", globals[i].v.value);
 	  putchar (')');
 
+	  if (globals[i].flags & DEFUN_noreturn)
+	    fputs (" ATTRIBUTE_COLD", stdout);
 	  if (globals[i].flags & DEFUN_const)
 	    fputs (" ATTRIBUTE_CONST", stdout);
 
diff --git a/src/callproc.c b/src/callproc.c
index 2cdf84d9a8..98c67312b4 100644
--- a/src/callproc.c
+++ b/src/callproc.c
@@ -1135,7 +1135,7 @@ add_env (char **env, char **new_env, char *string)
    mess up the allocator's data structures in the parent.
    Report the error and exit the child.  */
 
-static _Noreturn void
+static AVOID
 exec_failed (char const *name, int err)
 {
   /* Avoid deadlock if the child's perror writes to a full pipe; the
diff --git a/src/conf_post.h b/src/conf_post.h
index f8254cfa9d..6ea2c7b664 100644
--- a/src/conf_post.h
+++ b/src/conf_post.h
@@ -73,6 +73,7 @@ typedef bool bool_bf;
 # define __has_attribute(a) __has_attribute_##a
 # define __has_attribute_alloc_size GNUC_PREREQ (4, 3, 0)
 # define __has_attribute_cleanup GNUC_PREREQ (3, 4, 0)
+# define __has_attribute_cold GNUC_PREREQ (4, 3, 0)
 # define __has_attribute_externally_visible GNUC_PREREQ (4, 1, 0)
 # define __has_attribute_no_address_safety_analysis false
 # define __has_attribute_no_sanitize_address GNUC_PREREQ (4, 8, 0)
@@ -223,6 +224,12 @@ extern void _DebPrint (const char *fmt, ...);
 extern char *emacs_getenv_TZ (void);
 extern int emacs_setenv_TZ (char const *);
 
+#if __has_attribute (cold)
+# define ATTRIBUTE_COLD __attribute__ ((cold))
+#else
+# define ATTRIBUTE_COLD
+#endif
+
 #if __GNUC__ >= 3  /* On GCC 3.0 we might get a warning.  */
 #define NO_INLINE __attribute__((noinline))
 #else
diff --git a/src/data.c b/src/data.c
index 11cd598ed8..1b2431011e 100644
--- a/src/data.c
+++ b/src/data.c
@@ -130,7 +130,7 @@ set_blv_valcell (struct Lisp_Buffer_Local_Value *blv, Lisp_Object val)
   blv->valcell = val;
 }
 
-static _Noreturn void
+static AVOID
 wrong_length_argument (Lisp_Object a1, Lisp_Object a2, Lisp_Object a3)
 {
   Lisp_Object size1 = make_fixnum (bool_vector_size (a1));
@@ -142,7 +142,7 @@ wrong_length_argument (Lisp_Object a1, Lisp_Object a2, Lisp_Object a3)
 	      make_fixnum (bool_vector_size (a3)));
 }
 
-_Noreturn void
+AVOID
 wrong_type_argument (register Lisp_Object predicate, register Lisp_Object value)
 {
   /* If VALUE is not even a valid Lisp object, we'd want to abort here
diff --git a/src/emacs-module.c b/src/emacs-module.c
index 47ca3368c0..09a768e18e 100644
--- a/src/emacs-module.c
+++ b/src/emacs-module.c
@@ -147,8 +147,7 @@ static enum emacs_funcall_exit module_non_local_exit_check (emacs_env *);
 static void module_assert_thread (void);
 static void module_assert_runtime (struct emacs_runtime *);
 static void module_assert_env (emacs_env *);
-static _Noreturn void module_abort (const char *format, ...)
-  ATTRIBUTE_FORMAT_PRINTF(1, 2);
+static AVOID module_abort (const char *, ...) ATTRIBUTE_FORMAT_PRINTF (1, 2);
 static emacs_env *initialize_environment (emacs_env *,
 					  struct emacs_env_private *);
 static void finalize_environment (emacs_env *);
@@ -1163,8 +1162,7 @@ init_module_assertions (bool enable)
   initialize_storage (&global_storage);
 }
 
-static _Noreturn void
-ATTRIBUTE_FORMAT_PRINTF(1, 2)
+static AVOID ATTRIBUTE_FORMAT_PRINTF (1, 2)
 module_abort (const char *format, ...)
 {
   fputs ("Emacs module assertion: ", stderr);
diff --git a/src/emacs.c b/src/emacs.c
index 6ed4b0ed87..5eba88c74f 100644
--- a/src/emacs.c
+++ b/src/emacs.c
@@ -370,7 +370,7 @@ using_utf8 (void)
 
 /* Report a fatal error due to signal SIG, output a backtrace of at
    most BACKTRACE_LIMIT lines, and exit.  */
-_Noreturn void
+AVOID
 terminate_due_to_signal (int sig, int backtrace_limit)
 {
   signal (sig, SIG_DFL);
diff --git a/src/eval.c b/src/eval.c
index fa7b2d0603..c2e996a947 100644
--- a/src/eval.c
+++ b/src/eval.c
@@ -1133,7 +1133,7 @@ internal_catch (Lisp_Object tag,
 
    This is used for correct unwinding in Fthrow and Fsignal.  */
 
-static _Noreturn void
+static AVOID
 unwind_to_catch (struct handler *catch, Lisp_Object value)
 {
   bool last_time;
diff --git a/src/frame.h b/src/frame.h
index ec8f61465f..e2e8eaab9b 100644
--- a/src/frame.h
+++ b/src/frame.h
@@ -1262,7 +1262,7 @@ extern int frame_default_tool_bar_height;
 #ifdef HAVE_WINDOW_SYSTEM
 # define WINDOW_SYSTEM_RETURN
 #else
-# define WINDOW_SYSTEM_RETURN _Noreturn
+# define WINDOW_SYSTEM_RETURN _Noreturn ATTRIBUTE_COLD
 #endif
 
 extern WINDOW_SYSTEM_RETURN struct frame *
diff --git a/src/image.c b/src/image.c
index c1a23edefc..8e3a9a4930 100644
--- a/src/image.c
+++ b/src/image.c
@@ -6079,7 +6079,7 @@ init_png_functions (void)
 /* Error and warning handlers installed when the PNG library
    is initialized.  */
 
-static _Noreturn void
+static AVOID
 my_png_error (png_struct *png_ptr, const char *msg)
 {
   eassert (png_ptr != NULL);
@@ -6718,7 +6718,7 @@ struct my_jpeg_error_mgr
 };
 
 
-static _Noreturn void
+static AVOID
 my_error_exit (j_common_ptr cinfo)
 {
   struct my_jpeg_error_mgr *mgr = (struct my_jpeg_error_mgr *) cinfo->err;
diff --git a/src/json.c b/src/json.c
index 74e0534065..5917212899 100644
--- a/src/json.c
+++ b/src/json.c
@@ -255,7 +255,7 @@ json_encode (Lisp_Object string)
   return code_convert_string (string, Qutf_8_unix, Qt, true, true, true);
 }
 
-static _Noreturn void
+static AVOID
 json_out_of_memory (void)
 {
   xsignal0 (Qjson_out_of_memory);
@@ -263,7 +263,7 @@ json_out_of_memory (void)
 
 /* Signal a Lisp error corresponding to the JSON ERROR.  */
 
-static _Noreturn void
+static AVOID
 json_parse_error (const json_error_t *error)
 {
   Lisp_Object symbol;
diff --git a/src/keyboard.c b/src/keyboard.c
index 8fb6db987b..dff8f6b2fc 100644
--- a/src/keyboard.c
+++ b/src/keyboard.c
@@ -364,7 +364,7 @@ static void restore_getcjmp (void *);
 static Lisp_Object apply_modifiers (int, Lisp_Object);
 static void restore_kboard_configuration (int);
 static void handle_interrupt (bool);
-static _Noreturn void quit_throw_to_read_char (bool);
+static AVOID quit_throw_to_read_char (bool);
 static void timer_start_idle (void);
 static void timer_stop_idle (void);
 static void timer_resume_idle (void);
@@ -1131,13 +1131,12 @@ This also exits all active minibuffers.  */
   Fthrow (Qtop_level, Qnil);
 }
 
-static _Noreturn void
+static AVOID
 user_error (const char *msg)
 {
   xsignal1 (Quser_error, build_string (msg));
 }
 
-/* _Noreturn will be added to prototype by make-docfile.  */
 DEFUN ("exit-recursive-edit", Fexit_recursive_edit, Sexit_recursive_edit, 0, 0, "",
        doc: /* Exit from the innermost recursive edit or minibuffer.  */
        attributes: noreturn)
@@ -1149,7 +1148,6 @@ DEFUN ("exit-recursive-edit", Fexit_recursive_edit, Sexit_recursive_edit, 0, 0,
   user_error ("No recursive edit is in progress");
 }
 
-/* _Noreturn will be added to prototype by make-docfile.  */
 DEFUN ("abort-recursive-edit", Fabort_recursive_edit, Sabort_recursive_edit, 0, 0, "",
        doc: /* Abort the command that requested this recursive edit or minibuffer input.  */
        attributes: noreturn)
diff --git a/src/lisp.h b/src/lisp.h
index 2915944ffe..25d0a3d6ac 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -166,6 +166,9 @@ typedef EMACS_UINT uprintmax_t;
 # define pD "t"
 #endif
 
+/* Convenience macro for rarely-used functions that do not return.  */
+#define AVOID _Noreturn ATTRIBUTE_COLD void
+
 /* Extra internal type checking?  */
 
 /* Define Emacs versions of <assert.h>'s 'assert (COND)' and <verify.h>'s
@@ -196,7 +199,7 @@ typedef EMACS_UINT uprintmax_t;
 # define eassume(cond) assume (cond)
 #else /* ENABLE_CHECKING */
 
-extern _Noreturn void die (const char *, const char *, int);
+extern AVOID die (const char *, const char *, int);
 
 extern bool suppress_checking EXTERNALLY_VISIBLE;
 
@@ -621,7 +624,7 @@ extern Lisp_Object char_table_ref (Lisp_Object, int);
 extern void char_table_set (Lisp_Object, int, Lisp_Object);
 
 /* Defined in data.c.  */
-extern _Noreturn void wrong_type_argument (Lisp_Object, Lisp_Object);
+extern AVOID wrong_type_argument (Lisp_Object, Lisp_Object);
 
 
 /* Defined in emacs.c.  */
@@ -3528,7 +3531,7 @@ modiff_to_integer (modiff_count a)
 }
 
 /* Defined in data.c.  */
-extern _Noreturn void wrong_choice (Lisp_Object, Lisp_Object);
+extern AVOID wrong_choice (Lisp_Object, Lisp_Object);
 extern void notify_variable_watchers (Lisp_Object, Lisp_Object,
 				      Lisp_Object, Lisp_Object);
 extern Lisp_Object indirect_function (Lisp_Object);
@@ -3555,10 +3558,9 @@ extern intmax_t cons_to_signed (Lisp_Object, intmax_t, intmax_t);
 extern uintmax_t cons_to_unsigned (Lisp_Object, uintmax_t);
 
 extern struct Lisp_Symbol *indirect_variable (struct Lisp_Symbol *);
-extern _Noreturn void args_out_of_range (Lisp_Object, Lisp_Object);
-extern _Noreturn void args_out_of_range_3 (Lisp_Object, Lisp_Object,
-					   Lisp_Object);
-extern _Noreturn void circular_list (Lisp_Object);
+extern AVOID args_out_of_range (Lisp_Object, Lisp_Object);
+extern AVOID args_out_of_range_3 (Lisp_Object, Lisp_Object, Lisp_Object);
+extern AVOID circular_list (Lisp_Object);
 extern Lisp_Object do_symval_forwarding (lispfwd);
 enum Set_Internal_Bind {
   SET_INTERNAL_SET,
@@ -3666,7 +3668,7 @@ extern void syms_of_json (void);
 
 /* Defined in insdel.c.  */
 extern void move_gap_both (ptrdiff_t, ptrdiff_t);
-extern _Noreturn void buffer_overflow (void);
+extern AVOID buffer_overflow (void);
 extern void make_gap (ptrdiff_t);
 extern void make_gap_1 (struct buffer *, ptrdiff_t);
 extern ptrdiff_t copy_text (const unsigned char *, unsigned char *,
@@ -3766,8 +3768,8 @@ extern void *my_heap_start (void);
 extern void check_pure_size (void);
 extern void allocate_string_data (struct Lisp_String *, EMACS_INT, EMACS_INT);
 extern void malloc_warning (const char *);
-extern _Noreturn void memory_full (size_t);
-extern _Noreturn void buffer_memory_full (ptrdiff_t);
+extern AVOID memory_full (size_t);
+extern AVOID buffer_memory_full (ptrdiff_t);
 extern bool survives_gc_p (Lisp_Object);
 extern void mark_object (Lisp_Object);
 #if defined REL_ALLOC && !defined SYSTEM_MALLOC && !defined HYBRID_MALLOC
@@ -3848,7 +3850,7 @@ list4i (EMACS_INT x, EMACS_INT y, EMACS_INT w, EMACS_INT h)
 
 extern Lisp_Object make_uninit_bool_vector (EMACS_INT);
 extern Lisp_Object bool_vector_fill (Lisp_Object, Lisp_Object);
-extern _Noreturn void string_overflow (void);
+extern AVOID string_overflow (void);
 extern Lisp_Object make_string (const char *, ptrdiff_t);
 extern Lisp_Object make_formatted_string (char *, const char *, ...)
   ATTRIBUTE_FORMAT_PRINTF (2, 3);
@@ -4095,18 +4097,17 @@ extern Lisp_Object run_hook_with_args (ptrdiff_t nargs, Lisp_Object *args,
 				       Lisp_Object (*funcall)
 				       (ptrdiff_t nargs, Lisp_Object *args));
 extern Lisp_Object quit (void);
-INLINE _Noreturn void
+INLINE AVOID
 xsignal (Lisp_Object error_symbol, Lisp_Object data)
 {
   Fsignal (error_symbol, data);
 }
-extern _Noreturn void xsignal0 (Lisp_Object);
-extern _Noreturn void xsignal1 (Lisp_Object, Lisp_Object);
-extern _Noreturn void xsignal2 (Lisp_Object, Lisp_Object, Lisp_Object);
-extern _Noreturn void xsignal3 (Lisp_Object, Lisp_Object, Lisp_Object,
-				Lisp_Object);
-extern _Noreturn void signal_error (const char *, Lisp_Object);
-extern _Noreturn void overflow_error (void);
+extern AVOID xsignal0 (Lisp_Object);
+extern AVOID xsignal1 (Lisp_Object, Lisp_Object);
+extern AVOID xsignal2 (Lisp_Object, Lisp_Object, Lisp_Object);
+extern AVOID xsignal3 (Lisp_Object, Lisp_Object, Lisp_Object, Lisp_Object);
+extern AVOID signal_error (const char *, Lisp_Object);
+extern AVOID overflow_error (void);
 extern bool FUNCTIONP (Lisp_Object);
 extern Lisp_Object funcall_subr (struct Lisp_Subr *subr, ptrdiff_t numargs, Lisp_Object *arg_vector);
 extern Lisp_Object eval_sub (Lisp_Object form);
@@ -4145,8 +4146,8 @@ extern void set_unwind_protect_ptr (ptrdiff_t, void (*) (void *), void *);
 extern Lisp_Object unbind_to (ptrdiff_t, Lisp_Object);
 extern void rebind_for_thread_switch (void);
 extern void unbind_for_thread_switch (struct thread_state *);
-extern _Noreturn void error (const char *, ...) ATTRIBUTE_FORMAT_PRINTF (1, 2);
-extern _Noreturn void verror (const char *, va_list)
+extern AVOID error (const char *, ...) ATTRIBUTE_FORMAT_PRINTF (1, 2);
+extern AVOID verror (const char *, va_list)
   ATTRIBUTE_FORMAT_PRINTF (1, 0);
 extern Lisp_Object vformat_string (const char *, va_list)
   ATTRIBUTE_FORMAT_PRINTF (1, 0);
@@ -4243,7 +4244,7 @@ extern void syms_of_editfns (void);
 /* Defined in buffer.c.  */
 extern bool mouse_face_overlay_overlaps (Lisp_Object);
 extern Lisp_Object disable_line_numbers_overlay_at_eob (void);
-extern _Noreturn void nsberror (Lisp_Object);
+extern AVOID nsberror (Lisp_Object);
 extern void adjust_overlays_for_insert (ptrdiff_t, ptrdiff_t);
 extern void adjust_overlays_for_delete (ptrdiff_t, ptrdiff_t);
 extern void fix_start_end_in_overlays (ptrdiff_t, ptrdiff_t);
@@ -4286,9 +4287,9 @@ extern void close_file_unwind (int);
 extern void fclose_unwind (void *);
 extern void restore_point_unwind (Lisp_Object);
 extern Lisp_Object get_file_errno_data (const char *, Lisp_Object, int);
-extern _Noreturn void report_file_errno (const char *, Lisp_Object, int);
-extern _Noreturn void report_file_error (const char *, Lisp_Object);
-extern _Noreturn void report_file_notify_error (const char *, Lisp_Object);
+extern AVOID report_file_errno (const char *, Lisp_Object, int);
+extern AVOID report_file_error (const char *, Lisp_Object);
+extern AVOID report_file_notify_error (const char *, Lisp_Object);
 extern bool internal_delete_file (Lisp_Object);
 extern Lisp_Object emacs_readlinkat (int, const char *);
 extern bool file_directory_p (Lisp_Object);
@@ -4409,7 +4410,7 @@ extern bool display_arg;
 #endif
 extern Lisp_Object decode_env_path (const char *, const char *, bool);
 extern Lisp_Object empty_unibyte_string, empty_multibyte_string;
-extern _Noreturn void terminate_due_to_signal (int, int);
+extern AVOID terminate_due_to_signal (int, int);
 #ifdef WINDOWSNT
 extern Lisp_Object Vlibrary_cache;
 #endif
@@ -4574,7 +4575,7 @@ extern EMACS_INT get_random (void);
 extern void seed_random (void *, ptrdiff_t);
 extern void init_random (void);
 extern void emacs_backtrace (int);
-extern _Noreturn void emacs_abort (void) NO_INLINE;
+extern AVOID emacs_abort (void) NO_INLINE;
 extern int emacs_open (const char *, int, int);
 extern int emacs_pipe (int[2]);
 extern int emacs_close (int);
@@ -4615,8 +4616,7 @@ extern Lisp_Object directory_files_internal (Lisp_Object, Lisp_Object,
 /* Defined in term.c.  */
 extern int *char_ins_del_vector;
 extern void syms_of_term (void);
-extern _Noreturn void fatal (const char *msgid, ...)
-  ATTRIBUTE_FORMAT_PRINTF (1, 2);
+extern AVOID fatal (const char *msgid, ...) ATTRIBUTE_FORMAT_PRINTF (1, 2);
 
 /* Defined in terminal.c.  */
 extern void syms_of_terminal (void);
diff --git a/src/lread.c b/src/lread.c
index 5f33fcd695..8cb4b63cc3 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -1019,7 +1019,7 @@ load_error_handler (Lisp_Object data)
   return Qnil;
 }
 
-static _Noreturn void
+static AVOID
 load_error_old_style_backquotes (void)
 {
   if (NILP (Vload_file_name))
@@ -1874,7 +1874,7 @@ readevalloop_1 (int old)
 /* Signal an `end-of-file' error, if possible with file name
    information.  */
 
-static _Noreturn void
+static AVOID
 end_of_file_error (void)
 {
   if (STRINGP (Vload_file_name))
@@ -2297,7 +2297,7 @@ read_internal_start (Lisp_Object stream, Lisp_Object start, Lisp_Object end)
 /* Signal Qinvalid_read_syntax error.
    S is error string of length N (if > 0)  */
 
-static _Noreturn void
+static AVOID
 invalid_syntax (const char *s)
 {
   xsignal1 (Qinvalid_read_syntax, build_string (s));
diff --git a/src/pdumper.c b/src/pdumper.c
index 600c5b3ca3..2cc9af7f56 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -680,8 +680,7 @@ static void dump_remember_cold_op (struct dump_context *ctx,
                                    enum cold_op op,
                                    Lisp_Object arg);
 
-_Noreturn
-static void
+static AVOID
 error_unsupported_dump_object (struct dump_context *ctx,
                                Lisp_Object object,
 			       const char *msg)
diff --git a/src/puresize.h b/src/puresize.h
index f120a4b330..f5fad8b42b 100644
--- a/src/puresize.h
+++ b/src/puresize.h
@@ -77,7 +77,7 @@ INLINE_HEADER_BEGIN
 #define PURESIZE  (BASE_PURESIZE * PURESIZE_RATIO * PURESIZE_CHECKING_RATIO)
 #endif
 
-extern _Noreturn void pure_write_error (Lisp_Object);
+extern AVOID pure_write_error (Lisp_Object);
 
 extern EMACS_INT pure[];
 
diff --git a/src/search.c b/src/search.c
index a450e920b0..7a6e680685 100644
--- a/src/search.c
+++ b/src/search.c
@@ -73,7 +73,7 @@ static EMACS_INT search_buffer (Lisp_Object, ptrdiff_t, ptrdiff_t,
 
 Lisp_Object re_match_object;
 
-static _Noreturn void
+static AVOID
 matcher_overflow (void)
 {
   error ("Stack overflow in regexp matcher");
diff --git a/src/sound.c b/src/sound.c
index 2b8715010e..4ba826e82c 100644
--- a/src/sound.c
+++ b/src/sound.c
@@ -297,7 +297,7 @@ static int do_play_sound (const char *, unsigned long);
 #ifndef WINDOWSNT
 /* Like perror, but signals an error.  */
 
-static _Noreturn void
+static AVOID
 sound_perror (const char *msg)
 {
   int saved_errno = errno;
@@ -874,7 +874,7 @@ vox_write (struct sound_device *sd, const char *buffer, ptrdiff_t nbytes)
 #define DEFAULT_ALSA_SOUND_DEVICE "default"
 #endif
 
-static _Noreturn void
+static AVOID
 alsa_sound_perror (const char *msg, int err)
 {
   error ("%s: %s", msg, snd_strerror (err));
diff --git a/src/sysdep.c b/src/sysdep.c
index 57ea8220ca..bc88e70dcb 100644
--- a/src/sysdep.c
+++ b/src/sysdep.c
@@ -1805,7 +1805,7 @@ deliver_fatal_thread_signal (int sig)
   deliver_thread_signal (sig, handle_fatal_signal);
 }
 
-static _Noreturn void
+static AVOID
 handle_arith_signal (int sig)
 {
   pthread_sigmask (SIG_SETMASK, &empty_mask, 0);
diff --git a/src/systime.h b/src/systime.h
index 9080cd2bba..89af0c5e3d 100644
--- a/src/systime.h
+++ b/src/systime.h
@@ -92,7 +92,7 @@ extern Lisp_Object make_lisp_time (struct timespec);
 extern bool list4_to_timespec (Lisp_Object, Lisp_Object, Lisp_Object,
 			       Lisp_Object, struct timespec *);
 extern struct timespec lisp_time_argument (Lisp_Object);
-extern _Noreturn void time_overflow (void);
+extern AVOID time_overflow (void);
 extern void init_timefns (void);
 extern void syms_of_timefns (void);
 
diff --git a/src/term.c b/src/term.c
index a492276c88..2de0a0e664 100644
--- a/src/term.c
+++ b/src/term.c
@@ -73,11 +73,10 @@ static void clear_tty_hooks (struct terminal *terminal);
 static void set_tty_hooks (struct terminal *terminal);
 static void dissociate_if_controlling_tty (int fd);
 static void delete_tty (struct terminal *);
-static _Noreturn void maybe_fatal (bool, struct terminal *,
-				   const char *, const char *, ...)
+static AVOID maybe_fatal (bool, struct terminal *, const char *, const char *,
+			  ...)
   ATTRIBUTE_FORMAT_PRINTF (3, 5) ATTRIBUTE_FORMAT_PRINTF (4, 5);
-static _Noreturn void vfatal (const char *str, va_list ap)
-  ATTRIBUTE_FORMAT_PRINTF (1, 0);
+static AVOID vfatal (const char *, va_list) ATTRIBUTE_FORMAT_PRINTF (1, 0);
 
 
 #define OUTPUT(tty, a)                                          \
diff --git a/src/textprop.c b/src/textprop.c
index bb063d3eaa..ae42c44185 100644
--- a/src/textprop.c
+++ b/src/textprop.c
@@ -65,7 +65,7 @@ static Lisp_Object interval_insert_in_front_hooks;
 /* Signal a `text-read-only' error.  This function makes it easier
    to capture that error in GDB by putting a breakpoint on it.  */
 
-static _Noreturn void
+static AVOID
 text_read_only (Lisp_Object propval)
 {
   if (STRINGP (propval))
diff --git a/src/timefns.c b/src/timefns.c
index 514fa24f8b..cb953d1b4c 100644
--- a/src/timefns.c
+++ b/src/timefns.c
@@ -172,7 +172,7 @@ emacs_localtime_rz (timezone_t tz, time_t const *t, struct tm *tm)
   return tm;
 }
 
-static _Noreturn void
+static AVOID
 invalid_time_zone_specification (Lisp_Object zone)
 {
   xsignal2 (Qerror, build_string ("Invalid time zone specification"), zone);
@@ -337,7 +337,7 @@ time_overflow (void)
   error ("Specified time is not representable");
 }
 
-static _Noreturn void
+static AVOID
 time_error (int err)
 {
   switch (err)
@@ -348,7 +348,7 @@ time_error (int err)
     }
 }
 
-static _Noreturn void
+static AVOID
 invalid_hz (Lisp_Object hz)
 {
   xsignal2 (Qerror, build_string ("Invalid time frequency"), hz);
diff --git a/src/xterm.c b/src/xterm.c
index 0facb52454..0b83263a0e 100644
--- a/src/xterm.c
+++ b/src/xterm.c
@@ -1505,37 +1505,8 @@ x_draw_fringe_bitmap (struct window *w, struct glyph_row *row, struct draw_fring
 			    Glyph display
  ***********************************************************************/
 
-
-
-static void x_set_glyph_string_clipping (struct glyph_string *);
-static void x_set_glyph_string_gc (struct glyph_string *);
-static void x_draw_glyph_string_foreground (struct glyph_string *);
-static void x_draw_composite_glyph_string_foreground (struct glyph_string *);
-static void x_draw_glyph_string_box (struct glyph_string *);
-static void x_draw_glyph_string  (struct glyph_string *);
-static _Noreturn void x_delete_glyphs (struct frame *, int);
-static void x_compute_glyph_string_overhangs (struct glyph_string *);
-static void x_set_cursor_gc (struct glyph_string *);
-static void x_set_mode_line_face_gc (struct glyph_string *);
-static void x_set_mouse_face_gc (struct glyph_string *);
 static bool x_alloc_lighter_color (struct frame *, Display *, Colormap,
 				   unsigned long *, double, int);
-static void x_setup_relief_color (struct frame *, struct relief *,
-                                  double, int, unsigned long);
-static void x_setup_relief_colors (struct glyph_string *);
-static void x_draw_image_glyph_string (struct glyph_string *);
-static void x_draw_image_relief (struct glyph_string *);
-static void x_draw_image_foreground (struct glyph_string *);
-#ifndef USE_CAIRO
-static void x_draw_image_foreground_1 (struct glyph_string *, Pixmap);
-#endif
-static void x_clear_glyph_string_rect (struct glyph_string *, int,
-                                       int, int, int);
-static void x_draw_relief_rect (struct frame *, int, int, int, int,
-                                int, bool, bool, bool, bool, bool,
-                                XRectangle *);
-static void x_draw_box_rect (struct glyph_string *, int, int, int, int,
-                             int, bool, bool, XRectangle *);
 static void x_scroll_bar_clear (struct frame *);
 
 #ifdef GLYPH_DEBUG
@@ -3975,7 +3946,7 @@ x_shift_glyphs_for_insert (struct frame *f, int x, int y, int width, int height,
    for X frames.  */
 
 static void
-x_delete_glyphs (struct frame *f, register int n)
+x_delete_glyphs (struct frame *f, int n)
 {
   emacs_abort ();
 }
@@ -9842,7 +9813,7 @@ static char *error_msg;
 /* Handle the loss of connection to display DPY.  ERROR_MESSAGE is
    the text of an error message that lead to the connection loss.  */
 
-static _Noreturn void
+static AVOID
 x_connection_closed (Display *dpy, const char *error_message, bool ioerror)
 {
   struct x_display_info *dpyinfo = x_display_info_for_display (dpy);
@@ -10005,7 +9976,7 @@ x_error_quitter (Display *display, XErrorEvent *event)
    It kills all frames on the display that we lost touch with.
    If that was the only one, it prints an error message and kills Emacs.  */
 
-static _Noreturn int
+static _Noreturn ATTRIBUTE_COLD int
 x_io_error_quitter (Display *display)
 {
   char buf[256];
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-18  8:25                       ` Paul Eggert
@ 2019-04-18  8:43                         ` Konstantin Kharlamov
  2019-04-18 13:47                         ` Andy Moreton
  2019-04-19 13:45                         ` Alex Gramiak
  2 siblings, 0 replies; 45+ messages in thread
From: Konstantin Kharlamov @ 2019-04-18  8:43 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Stefan Monnier, Alex Gramiak, emacs-devel

Thanks. I wonder if "AVOID" macro may sound more clear about its 
intended use as "NEVER_RETURN" or "DIE"?

On Чт, Apr 18, 2019 at 01:25, Paul Eggert <eggert@cs.ucla.edu> wrote:
> Konstantin Kharlamov wrote:
>> I was told that e.g. "cold" \x7fattribute can sometimes produce 
>> unbearably slow code 
>> \x7fLINKIFYaHaBBGCfcBGBICFeJGIedABcbGIDccAffeIAffeI
> 
> Although cold functions can be slow, it appears that overall it's a 
> win for Emacs to mark _Noreturn error function declarations as cold: 
> on my platform, 'make compile-always' ran about 1.3% faster. So I 
> installed the attached patch into master. (Like Stefan, I'm wary of 
> marking functions 'hot' so I didn't do that.)
> 
> This patch also adds a convenience macro AVOID for the now-common 
> pattern '_Noreturn ATTRIBUTE_COLD void'.





^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-18  8:25                       ` Paul Eggert
  2019-04-18  8:43                         ` Konstantin Kharlamov
@ 2019-04-18 13:47                         ` Andy Moreton
  2019-04-18 17:27                           ` Paul Eggert
  2019-04-19 13:45                         ` Alex Gramiak
  2 siblings, 1 reply; 45+ messages in thread
From: Andy Moreton @ 2019-04-18 13:47 UTC (permalink / raw)
  To: emacs-devel

On Thu 18 Apr 2019, Paul Eggert wrote:

> Konstantin Kharlamov wrote:
>> I was told that e.g. "cold" attribute can sometimes produce unbearably slow
>> code https://gcc.gnu.org/ml/gcc-help/2019-01/msg00035.html
>
> Although cold functions can be slow, it appears that overall it's a win for
> Emacs to mark _Noreturn error function declarations as cold: on my platform,
> 'make compile-always' ran about 1.3% faster. So I installed the attached patch
> into master. (Like Stefan, I'm wary of marking functions 'hot' so I didn't do
> that.)
>
> This patch also adds a convenience macro AVOID for the now-common pattern
> '_Noreturn ATTRIBUTE_COLD void'.

Please don't use this macro, as it makes the code much less readable.

It is reasonable to have a macro that combines the '_Noreturn' and
'ATTRIBUTE_COLD' decorations, but that should not be combined with the
return type.

    AndyM




^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-18 13:47                         ` Andy Moreton
@ 2019-04-18 17:27                           ` Paul Eggert
  2019-04-18 17:56                             ` Andy Moreton
  0 siblings, 1 reply; 45+ messages in thread
From: Paul Eggert @ 2019-04-18 17:27 UTC (permalink / raw)
  To: Andy Moreton; +Cc: emacs-devel

On 4/18/19 6:47 AM, Andy Moreton wrote:
> Please don't use this macro, as it makes the code much less readable.
>
> It is reasonable to have a macro that combines the '_Noreturn' and
> 'ATTRIBUTE_COLD' decorations, but that should not be combined with the
> return type.

Actually I first tried it the way that you suggested, but found that
having a single macro improved readability for me, partly because
_Noreturn functions don't return anything so their return type doesn't
matter except for static type checking. 'AVOID' lets a traditionalist
reader easily see that the type 'void' is intended, while also connoting
that the function is rarely used because calls to it are normally
avoided. (I tried to shoehorn "cold" and "does not return" into the
macro's name too, but couldn't come up with anything better than
"AVOID_SKIJUMP". :-)




^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-18 17:27                           ` Paul Eggert
@ 2019-04-18 17:56                             ` Andy Moreton
  2019-04-18 19:32                               ` Paul Eggert
  0 siblings, 1 reply; 45+ messages in thread
From: Andy Moreton @ 2019-04-18 17:56 UTC (permalink / raw)
  To: emacs-devel

On Thu 18 Apr 2019, Paul Eggert wrote:

> On 4/18/19 6:47 AM, Andy Moreton wrote:
>> Please don't use this macro, as it makes the code much less readable.
>>
>> It is reasonable to have a macro that combines the '_Noreturn' and
>> 'ATTRIBUTE_COLD' decorations, but that should not be combined with the
>> return type.
>
> Actually I first tried it the way that you suggested, but found that
> having a single macro improved readability for me, partly because
> _Noreturn functions don't return anything so their return type doesn't
> matter except for static type checking. 'AVOID' lets a traditionalist
> reader easily see that the type 'void' is intended, while also connoting
> that the function is rarely used because calls to it are normally
> avoided. (I tried to shoehorn "cold" and "does not return" into the
> macro's name too, but couldn't come up with anything better than
> "AVOID_SKIJUMP". :-)

I had an entirely different impression: AVOID is a weird name for a
return type, so where is the return type, and does this even compile ?

In addition, using this macro makes it entirely non-obvious that the
function has a "_NoReturn" specifier, making it harder for the reader to
understand the intended semantics.

I have no objection to adding "noreturn" and "noreturn_cold" macros if
that helps, but the "AVOID" macro should be avoided as harmful to
readability.

    AndyM





^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-18 17:56                             ` Andy Moreton
@ 2019-04-18 19:32                               ` Paul Eggert
  0 siblings, 0 replies; 45+ messages in thread
From: Paul Eggert @ 2019-04-18 19:32 UTC (permalink / raw)
  To: Andy Moreton; +Cc: emacs-devel

On 4/18/19 10:56 AM, Andy Moreton wrote:
> AVOID is a weird name for a
> return type, so where is the return type, and does this even compile ?

Heh. I had the same reaction many years ago when I first saw the EXFUN
macro: where's the return type? AVOID is better than EXFUN in that it
"looks" syntactically correct as a type name, and its unusual name
provides a useful mnemonic (obviously it compiles).

This is just a style issue. We can remove the definition of AVOID and
replace all its uses with "_Noreturn ATTRIBUTE_COLD void", or we can
rename AVOID to some other identifier, or we can define some other macro
with some slightly-different meaning. Although I considered those
alternatives and liked AVOID the best, I can see where other opinions
might differ.




^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-18  8:25                       ` Paul Eggert
  2019-04-18  8:43                         ` Konstantin Kharlamov
  2019-04-18 13:47                         ` Andy Moreton
@ 2019-04-19 13:45                         ` Alex Gramiak
  2019-04-19 13:58                           ` Konstantin Kharlamov
  2019-04-19 17:33                           ` Paul Eggert
  2 siblings, 2 replies; 45+ messages in thread
From: Alex Gramiak @ 2019-04-19 13:45 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel, Stefan Monnier, Konstantin Kharlamov

[-- Attachment #1: Type: text/plain, Size: 1114 bytes --]

Paul Eggert <eggert@cs.ucla.edu> writes:

> Konstantin Kharlamov wrote:
>> I was told that e.g. "cold" attribute can sometimes produce unbearably slow
>> code https://gcc.gnu.org/ml/gcc-help/2019-01/msg00035.html
>
> Although cold functions can be slow, it appears that overall it's a win for
> Emacs to mark _Noreturn error function declarations as cold: on my platform,
> 'make compile-always' ran about 1.3% faster. So I installed the attached patch
> into master.

Thanks. What do you think about marking a few bytecode cases as
cold/unused? I've attached a diff below that does that. Does it make a
difference for you on your setup? It seems to slow Emacs down a slight
bit for me, but I was hoping you might know why it would do so. Since
the newly cold attributes should be unused, is this perhaps a GCC bug?

> This patch also adds a convenience macro AVOID for the now-common pattern
> '_Noreturn ATTRIBUTE_COLD void'.

I'm not sure about the name. If I wasn't part of this discussion I might
have thought AVOID meant that one should avoid usage of the procedure in
new code. Not a big deal, of course.



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: cold attributes --]
[-- Type: text/x-patch, Size: 2159 bytes --]

diff --git a/src/bytecode.c b/src/bytecode.c
index 40977799bf..17c4716fe0 100644
--- a/src/bytecode.c
+++ b/src/bytecode.c
@@ -656,6 +656,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
 	CASE (Bunbind_all):	/* Obsolete.  Never used.  */
 	  /* To unbind back to the beginning of this frame.  Not used yet,
 	     but will be needed for tail-recursion elimination.  */
+          __attribute__((cold, unused));
 	  unbind_to (count, Qnil);
 	  NEXT;
 
@@ -749,6 +750,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
 	  NEXT;
 
 	CASE (Bsave_window_excursion): /* Obsolete since 24.1.  */
+          ATTRIBUTE_COLD;
 	  {
 	    ptrdiff_t count1 = SPECPDL_INDEX ();
 	    record_unwind_protect (restore_window_configuration,
@@ -808,6 +810,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
 	  }
 
 	CASE (Bcondition_case):		/* Obsolete since 24.4.  */
+          ATTRIBUTE_COLD;
 	  {
 	    Lisp_Object handlers = POP, body = POP;
 	    TOP = internal_lisp_condition_case (TOP, body, handlers);
@@ -815,12 +818,14 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
 	  }
 
 	CASE (Btemp_output_buffer_setup): /* Obsolete since 24.1.  */
+          ATTRIBUTE_COLD;
 	  CHECK_STRING (TOP);
 	  temp_output_buffer_setup (SSDATA (TOP));
 	  TOP = Vstandard_output;
 	  NEXT;
 
 	CASE (Btemp_output_buffer_show): /* Obsolete since 24.1.  */
+          ATTRIBUTE_COLD;
 	  {
 	    Lisp_Object v1 = POP;
 	    temp_output_buffer_show (TOP);
@@ -1138,6 +1143,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
 	  NEXT;
 
 	CASE (Binteractive_p):	/* Obsolete since 24.1.  */
+          ATTRIBUTE_COLD;
 	  PUSH (call0 (intern ("interactive-p")));
 	  NEXT;
 
@@ -1342,6 +1348,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
 	  /* Actually this is Bstack_ref with offset 0, but we use Bdup
 	     for that instead.  */
 	  /* CASE (Bstack_ref): */
+          ATTRIBUTE_COLD;
 	  error ("Invalid byte opcode: op=%d, ptr=%"pD"d",
 		 op, pc - 1 - bytestr_data);
 

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-19 13:45                         ` Alex Gramiak
@ 2019-04-19 13:58                           ` Konstantin Kharlamov
  2019-04-19 14:45                             ` Alex Gramiak
  2019-04-19 17:33                           ` Paul Eggert
  1 sibling, 1 reply; 45+ messages in thread
From: Konstantin Kharlamov @ 2019-04-19 13:58 UTC (permalink / raw)
  To: Alex Gramiak; +Cc: Paul Eggert, Stefan Monnier, emacs-devel



On Пт, Apr 19, 2019 at 07:45, Alex Gramiak <agrambot@gmail.com> wrote:
> Paul Eggert <eggert@cs.ucla.edu> writes:
> 
>>  Konstantin Kharlamov wrote:
>>>  I was told that e.g. "cold" attribute can sometimes produce 
>>> unbearably slow
>>>  code https://gcc.gnu.org/ml/gcc-help/2019-01/msg00035.html
>> 
>>  Although cold functions can be slow, it appears that overall it's a 
>> win for
>>  Emacs to mark _Noreturn error function declarations as cold: on my 
>> platform,
>>  'make compile-always' ran about 1.3% faster. So I installed the 
>> attached patch
>>  into master.
> 
> Thanks. What do you think about marking a few bytecode cases as
> cold/unused? I've attached a diff below that does that. Does it make a
> difference for you on your setup? It seems to slow Emacs down a slight
> bit for me, but I was hoping you might know why it would do so. Since
> the newly cold attributes should be unused, is this perhaps a GCC bug?

The slowdown is to be expected, given that GCC devs said that "cold" 
produces very inefficient code.

It worked in Paul's case because he marked as "cold" functions that 
never return.





^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-19 13:58                           ` Konstantin Kharlamov
@ 2019-04-19 14:45                             ` Alex Gramiak
  0 siblings, 0 replies; 45+ messages in thread
From: Alex Gramiak @ 2019-04-19 14:45 UTC (permalink / raw)
  To: Konstantin Kharlamov; +Cc: Paul Eggert, Stefan Monnier, emacs-devel

Konstantin Kharlamov <hi-angel@yandex.ru> writes:

> The slowdown is to be expected, given that GCC devs said that "cold" produces
> very inefficient code.

I took his statement to mean that the code generated in cold branches
are very inefficient. In this case that would refer to the
obsolete/unused/error CASEs, which should not be getting called, so I
don't believe the slowdown is to be expected.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-19 13:45                         ` Alex Gramiak
  2019-04-19 13:58                           ` Konstantin Kharlamov
@ 2019-04-19 17:33                           ` Paul Eggert
  2019-04-19 20:53                             ` Alex Gramiak
  2019-04-20 15:29                             ` Andy Moreton
  1 sibling, 2 replies; 45+ messages in thread
From: Paul Eggert @ 2019-04-19 17:33 UTC (permalink / raw)
  To: Alex Gramiak; +Cc: emacs-devel, Stefan Monnier, Konstantin Kharlamov

On 4/19/19 6:45 AM, Alex Gramiak wrote:
> What do you think about marking a few bytecode cases as
> cold/unused? I've attached a diff below that does that. Does it make a
> difference for you on your setup?

No, it generates the same machine code for me on Fedora 29 x86-64, gcc
8.3.1 20190223 (Red Hat 8.3.1-2), with gcc -O2.

> It seems to slow Emacs down a slight
> bit for me, but I was hoping you might know why it would do so. Since
> the newly cold attributes should be unused, is this perhaps a GCC bug?

Possibly. Or it could be just bad luck. I've often made small changes
and noted slight slowdowns or speedups that are due to luck rather than
any real effects. Sometimes a simple 'make bootstrap' reverses the
slowdown or speedup; sometimes it's more complicated than that. I
suppose it could be related to lucky alignment in executables, or to
code that just happens to exceed the associativity of my L1 or TLB
caches, or whatever. So one must view these performance measurements
with some skepticism.

>> This patch also adds a convenience macro AVOID for the now-common pattern
>> '_Noreturn ATTRIBUTE_COLD void'.
> I'm not sure about the name. If I wasn't part of this discussion I might
> have thought AVOID meant that one should avoid usage of the procedure in
> new code. Not a big deal, of course.

Yes, I considered other names but they all had problems too, and AVOID
worked the best for me of the names that I thought of. A mnemonic is
that 'extern AVOID time_overflow (void);' means that execution typically
avoids calling time_overflow. If someone can think of a better name that
would be good. Or we can just get accustomed to AVOID; as you say, it's
not that big a deal.




^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-19 17:33                           ` Paul Eggert
@ 2019-04-19 20:53                             ` Alex Gramiak
  2019-04-20  0:05                               ` Alan Mackenzie
  2019-04-20 15:29                             ` Andy Moreton
  1 sibling, 1 reply; 45+ messages in thread
From: Alex Gramiak @ 2019-04-19 20:53 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

Thanks for testing it out. I'll take your advice about measurements in
the future.

Paul Eggert <eggert@cs.ucla.edu> writes:

> Yes, I considered other names but they all had problems too, and AVOID
> worked the best for me of the names that I thought of. A mnemonic is
> that 'extern AVOID time_overflow (void);' means that execution typically
> avoids calling time_overflow. If someone can think of a better name that
> would be good. Or we can just get accustomed to AVOID; as you say, it's
> not that big a deal.

The only ones I can think of at the moment are RARE/RARELY, or a slight
change to AVOIDS.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-19 20:53                             ` Alex Gramiak
@ 2019-04-20  0:05                               ` Alan Mackenzie
  2019-04-20  0:42                                 ` Paul Eggert
  0 siblings, 1 reply; 45+ messages in thread
From: Alan Mackenzie @ 2019-04-20  0:05 UTC (permalink / raw)
  To: Alex Gramiak; +Cc: Paul Eggert, emacs-devel

Hello, Alex and Paul.

On Fri, Apr 19, 2019 at 14:53:51 -0600, Alex Gramiak wrote:
> Thanks for testing it out. I'll take your advice about measurements in
> the future.

> Paul Eggert <eggert@cs.ucla.edu> writes:

> > Yes, I considered other names but they all had problems too, and AVOID
> > worked the best for me of the names that I thought of. A mnemonic is
> > that 'extern AVOID time_overflow (void);' means that execution typically
> > avoids calling time_overflow. If someone can think of a better name that
> > would be good. Or we can just get accustomed to AVOID; as you say, it's
> > not that big a deal.

> The only ones I can think of at the moment are RARE/RARELY, or a slight
> change to AVOIDS.

Can I ask that if anybody starts using a new attribute macro, that they
update the value of c-noise-macro-names under c-mode in .dir-locals?
That way, correct fontification will then continue to happen whilst
editing our C files.

Thanks!

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-20  0:05                               ` Alan Mackenzie
@ 2019-04-20  0:42                                 ` Paul Eggert
  2019-04-20 19:46                                   ` Alan Mackenzie
  0 siblings, 1 reply; 45+ messages in thread
From: Paul Eggert @ 2019-04-20  0:42 UTC (permalink / raw)
  To: Alan Mackenzie, Alex Gramiak; +Cc: emacs-devel

Alan Mackenzie wrote:
> Can I ask that if anybody starts using a new attribute macro, that they
> update the value of c-noise-macro-names under c-mode in .dir-locals?
> That way, correct fontification will then continue to happen whilst
> editing our C files.

Can we put "ATTRIBUTE_.*" there? If not, perhaps we should.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-19 17:33                           ` Paul Eggert
  2019-04-19 20:53                             ` Alex Gramiak
@ 2019-04-20 15:29                             ` Andy Moreton
  2019-04-20 15:57                               ` Paul Eggert
  1 sibling, 1 reply; 45+ messages in thread
From: Andy Moreton @ 2019-04-20 15:29 UTC (permalink / raw)
  To: emacs-devel

On Fri 19 Apr 2019, Paul Eggert wrote:

> On 4/19/19 6:45 AM, Alex Gramiak wrote:
>> What do you think about marking a few bytecode cases as
>> cold/unused? I've attached a diff below that does that. Does it make a
>> difference for you on your setup?
>
> No, it generates the same machine code for me on Fedora 29 x86-64, gcc
> 8.3.1 20190223 (Red Hat 8.3.1-2), with gcc -O2.
>
>> It seems to slow Emacs down a slight
>> bit for me, but I was hoping you might know why it would do so. Since
>> the newly cold attributes should be unused, is this perhaps a GCC bug?
>
> Possibly. Or it could be just bad luck. I've often made small changes
> and noted slight slowdowns or speedups that are due to luck rather than
> any real effects. Sometimes a simple 'make bootstrap' reverses the
> slowdown or speedup; sometimes it's more complicated than that. I
> suppose it could be related to lucky alignment in executables, or to
> code that just happens to exceed the associativity of my L1 or TLB
> caches, or whatever. So one must view these performance measurements
> with some skepticism.
>
>>> This patch also adds a convenience macro AVOID for the now-common pattern
>>> '_Noreturn ATTRIBUTE_COLD void'.
>> I'm not sure about the name. If I wasn't part of this discussion I might
>> have thought AVOID meant that one should avoid usage of the procedure in
>> new code. Not a big deal, of course.
>
> Yes, I considered other names but they all had problems too, and AVOID
> worked the best for me of the names that I thought of. A mnemonic is
> that 'extern AVOID time_overflow (void);' means that execution typically
> avoids calling time_overflow. If someone can think of a better name that
> would be good. Or we can just get accustomed to AVOID; as you say, it's
> not that big a deal.

If the patches from this discussion on marking code as cold or
liely/unlikely does not result in improved code generation and
measureable performance gains, then they should be dropped as needless
clutter in the source code.

Performance gains are useful, but changes that claim to improve
performance should be accompanied by measurements to demonstrate the
claimed benefits. Evidence to motivate these changes does not appear
convincing.

    AndyM




^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-20 15:29                             ` Andy Moreton
@ 2019-04-20 15:57                               ` Paul Eggert
  2019-04-20 16:03                                 ` Eli Zaretskii
  2019-04-20 16:28                                 ` Óscar Fuentes
  0 siblings, 2 replies; 45+ messages in thread
From: Paul Eggert @ 2019-04-20 15:57 UTC (permalink / raw)
  To: Andy Moreton; +Cc: emacs-devel

Andy Moreton wrote:
> If the patches from this discussion on marking code as cold or
> liely/unlikely does not result in improved code generation and
> measureable performance gains, then they should be dropped as needless
> clutter in the source code.

As I mentioned in <https://lists.gnu.org/r/emacs-devel/2019-04/msg00767.html> 
the 'cold' changes that I installed into master improved overall performance on 
a largish practical benchmark (cd lisp && make compile-always) by 1.3% on my 
platform: Fedora 29, GCC 8.3.1 20190223 (Red Hat 8.3.1-2), AMD Phenom II X4 
910e. The other changes proposed in this thread seem to not help performance (or 
even hurt it) and we have not installed them.

I like to see significant benefits for performance-related changes where the 
effects are not obvious. For improvements where the generated code is 
"obviously" faster (fewer and simpler instructions, say), I typically don't 
bother with measurements as my own time is limited too.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-20 15:57                               ` Paul Eggert
@ 2019-04-20 16:03                                 ` Eli Zaretskii
  2019-04-20 16:11                                   ` Paul Eggert
  2019-04-20 16:28                                 ` Óscar Fuentes
  1 sibling, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2019-04-20 16:03 UTC (permalink / raw)
  To: Paul Eggert; +Cc: andrewjmoreton, emacs-devel

> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sat, 20 Apr 2019 08:57:57 -0700
> Cc: emacs-devel@gnu.org
> 
> As I mentioned in <https://lists.gnu.org/r/emacs-devel/2019-04/msg00767.html> 
> the 'cold' changes that I installed into master improved overall performance on 
> a largish practical benchmark (cd lisp && make compile-always) by 1.3% on my 
> platform

FWIW, I consider 1.3% performance improvement as insignificant for all
practical purposes.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-20 16:03                                 ` Eli Zaretskii
@ 2019-04-20 16:11                                   ` Paul Eggert
  2019-04-20 16:18                                     ` Eli Zaretskii
  0 siblings, 1 reply; 45+ messages in thread
From: Paul Eggert @ 2019-04-20 16:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: andrewjmoreton, emacs-devel

Eli Zaretskii wrote:
> I consider 1.3% performance improvement as insignificant for all
> practical purposes.

When I'm talking about performance measurements I try to use the word 
"significant" in its usual scientific sense 
<https://en.wikipedia.org/wiki/Statistical_significance>.

Agreed that 1.3% is no big deal by itself, but if one can make a series of 1.3% 
performance improvements that build on each other, their effects multiply and 
the overall effort has worthwhile practical benefits.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-20 16:11                                   ` Paul Eggert
@ 2019-04-20 16:18                                     ` Eli Zaretskii
  2019-04-20 16:57                                       ` Paul Eggert
  0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2019-04-20 16:18 UTC (permalink / raw)
  To: Paul Eggert; +Cc: andrewjmoreton, emacs-devel

> Cc: andrewjmoreton@gmail.com, emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sat, 20 Apr 2019 09:11:30 -0700
> 
> Eli Zaretskii wrote:
> > I consider 1.3% performance improvement as insignificant for all
> > practical purposes.
> 
> When I'm talking about performance measurements I try to use the word 
> "significant" in its usual scientific sense 
> <https://en.wikipedia.org/wiki/Statistical_significance>.

Of course.  I didn't mean to imply that the improvement was
insignificant statistically, only that it's insignificant in practice.

> Agreed that 1.3% is no big deal by itself, but if one can make a series of 1.3% 
> performance improvements that build on each other, their effects multiply and 
> the overall effort has worthwhile practical benefits.

Sure, but the effect on code readability also multiplies, right?



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-20 15:57                               ` Paul Eggert
  2019-04-20 16:03                                 ` Eli Zaretskii
@ 2019-04-20 16:28                                 ` Óscar Fuentes
  2019-04-20 18:58                                   ` Paul Eggert
  1 sibling, 1 reply; 45+ messages in thread
From: Óscar Fuentes @ 2019-04-20 16:28 UTC (permalink / raw)
  To: emacs-devel

Paul Eggert <eggert@cs.ucla.edu> writes:

> I like to see significant benefits for performance-related changes
> where the effects are not obvious. For improvements where the
> generated code is "obviously" faster (fewer and simpler instructions,
> say), I typically don't bother with measurements as my own time is
> limited too.

You will be surprised.

Modern hardware is complex.

> Agreed that 1.3% is no big deal by itself, but if one can make a
> series of 1.3% performance improvements that build on each other,
> their effects multiply and the overall effort has worthwhile practical
> benefits.

If the 1.3% improvement in performance requires non-minimal source code
complexity growth, IMHO it is not worhwile, at least on Emacs' case.

Another problem with this type of hacks is that they tend to dilute over
time, because compilers/architectures evolve. And because people change
the code without testing that those pesky declarations end in the right
place; it becomes a kind of cargo-cult.




^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-20 16:18                                     ` Eli Zaretskii
@ 2019-04-20 16:57                                       ` Paul Eggert
  2019-04-20 17:22                                         ` Eli Zaretskii
  0 siblings, 1 reply; 45+ messages in thread
From: Paul Eggert @ 2019-04-20 16:57 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: andrewjmoreton, emacs-devel

Eli Zaretskii wrote:
> Sure, but the effect on code readability also multiplies, right?

Yes, of course one must weigh cost and benefits. In this case the patch not only 
improved performance, it also made the code smaller and in my opinion more 
readable: the total source code size shrank by 1775 bytes. Source-code 
improvements add up too, and we should encourage them even if they're small, as 
they are a welcome partial antidote to the bloat that often afflicts older 
software projects.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-20 16:57                                       ` Paul Eggert
@ 2019-04-20 17:22                                         ` Eli Zaretskii
  2019-04-20 19:13                                           ` Paul Eggert
  0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2019-04-20 17:22 UTC (permalink / raw)
  To: Paul Eggert; +Cc: andrewjmoreton, emacs-devel

> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sat, 20 Apr 2019 09:57:14 -0700
> Cc: andrewjmoreton@gmail.com, emacs-devel@gnu.org
> 
> In this case the patch not only improved performance, it also made
> the code smaller and in my opinion more readable: the total source
> code size shrank by 1775 bytes.

I meant code in C, not the produced machine code, so the size is not
really relevant.

I'm not sure how do you see that the code became more readable: we now
have AVOID where previously we had _Noreturn, and ATTRIBUTE_COLD where
we previously had nothing.  Both make the code slightly less readable,
IMO (because the reader must first learn what they mean).



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-20 16:28                                 ` Óscar Fuentes
@ 2019-04-20 18:58                                   ` Paul Eggert
  2019-04-20 19:35                                     ` Óscar Fuentes
  0 siblings, 1 reply; 45+ messages in thread
From: Paul Eggert @ 2019-04-20 18:58 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: emacs-devel

Óscar Fuentes wrote:
>> For improvements where the
>> generated code is "obviously" faster (fewer and simpler instructions,
>> say), I typically don't bother with measurements as my own time is
>> limited too.
> You will be surprised.
> 
> Modern hardware is complex.

I'm often surprised :-), but I don't expect to be surprised when the generated 
code seems obviously faster to me. Although I'm not as expert as the computer 
architecture researcher whose office sits next to mine, I know how modern 
hardware works reasonably well and regularly give lectures on topics like μops 
and TLBs so I won't be surprised as often as a naive programmer might be.
> If the 1.3% improvement in performance requires non-minimal source code
> complexity growth

It doesn't. In this case the source code shrank slightly. Only very slightly: 
just by 0.001% (this counts all files under Git control). Still, a win's a win.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-20 17:22                                         ` Eli Zaretskii
@ 2019-04-20 19:13                                           ` Paul Eggert
  0 siblings, 0 replies; 45+ messages in thread
From: Paul Eggert @ 2019-04-20 19:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: andrewjmoreton, emacs-devel

Eli Zaretskii wrote:
>> In this case the patch not only improved performance, it also made
>> the code smaller and in my opinion more readable: the total source
>> code size shrank by 1775 bytes.
> I meant code in C, not the produced machine code

Yes, that's what I meant too. That is, the patch made the C code shorter.

> I'm not sure how do you see that the code became more readable

It's more readable to me partly because the common case is shorter, e.g.:

-static _Noreturn void module_abort (const char *format, ...)
-  ATTRIBUTE_FORMAT_PRINTF(1, 2);
+static AVOID module_abort (const char *, ...) ATTRIBUTE_FORMAT_PRINTF (1, 2);

and partly because the unusual case is more obviously unusual. For example, in:

-static _Noreturn void
+static AVOID
  x_connection_closed (Display *dpy, const char *error_message, bool ioerror)
...
-static _Noreturn int
+static _Noreturn ATTRIBUTE_COLD int
  x_io_error_quitter (Display *display)

the x_connection_closed closed is obviously typical (it uses the now-common 
"AVOID"), whereas the x_io_error_quitter is obviously atypical (one normally 
doesn't see ATTRIBUTE_COLD in .c files), and this lets the reader more-easily 
see that x_io_error_quitter (a) returns 'int' and (b) never returns anything, 
which makes it even more of an oddball than the usual AVOID functions are.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-20 18:58                                   ` Paul Eggert
@ 2019-04-20 19:35                                     ` Óscar Fuentes
  2019-04-20 22:54                                       ` Paul Eggert
  0 siblings, 1 reply; 45+ messages in thread
From: Óscar Fuentes @ 2019-04-20 19:35 UTC (permalink / raw)
  To: emacs-devel

Paul Eggert <eggert@cs.ucla.edu> writes:

> Óscar Fuentes wrote:
>>> For improvements where the
>>> generated code is "obviously" faster (fewer and simpler instructions,
>>> say), I typically don't bother with measurements as my own time is
>>> limited too.
>> You will be surprised.
>>
>> Modern hardware is complex.
>
> I'm often surprised :-), but I don't expect to be surprised when the
> generated code seems obviously faster to me. Although I'm not as
> expert as the computer architecture researcher whose office sits next
> to mine, I know how modern hardware works reasonably well and
> regularly give lectures on topics like μops and TLBs so I won't be
> surprised as often as a naive programmer might be.

I served many years as an Assembler programmer replacing C code where it
was considered too slow so, excuse my arrogance, I'm far from being a
naive programmer.

I've seen many times cases where removing code (deleting unnecessary
instructions and even full procedure calls) made the program run
noticeably slower. "noticeably" as in "you don't need a stopwatch to
notice it".

Also, I've seen plenty of speedups on my development machine that
vanished or even reversed when tried on a different microarchitecture,
or the same microarchitecture with different cache sizes, or even the
same CPU with different RAM modules.

On our time of multi-level caches, branch prediction, speculative
execution and what-not, acting as if your hard-earned 1% speedup will
apply to most machines out there is a risky bet.

>> If the 1.3% improvement in performance requires non-minimal source
>> code complexity growth
>
> It doesn't. In this case the source code shrank slightly. Only very
> slightly: just by 0.001% (this counts all files under Git control).
> Still, a win's a win.

Yea, but now everybody has to learn about and, worse, deal with those
non-standard macros. That's added complexity. And I'm pretty sure that
the speedup will vanish as code is modified and compilers and CPUs
change.




^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-20  0:42                                 ` Paul Eggert
@ 2019-04-20 19:46                                   ` Alan Mackenzie
  0 siblings, 0 replies; 45+ messages in thread
From: Alan Mackenzie @ 2019-04-20 19:46 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Alex Gramiak, emacs-devel

Hello, Paul.

On Fri, Apr 19, 2019 at 17:42:50 -0700, Paul Eggert wrote:
> Alan Mackenzie wrote:
> > Can I ask that if anybody starts using a new attribute macro, that they
> > update the value of c-noise-macro-names under c-mode in .dir-locals?
> > That way, correct fontification will then continue to happen whilst
> > editing our C files.

> Can we put "ATTRIBUTE_.*" there? If not, perhaps we should.

Not at the moment, no.  Possibly because people might write regexps as
loosely as your suggestion, and that would mangle CC Mode.  Whats wanted
would be something more precise like "ATTRIBUTE_[A-Z]*".

For macros with semicolons, I allowed a regexp to be entered directly
instead of a list of identifiers, but there it is mandatory that the
regexp not match anything but valid identifiers.  For noise macros, a
regexp just seems less needed, somehow.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-20 19:35                                     ` Óscar Fuentes
@ 2019-04-20 22:54                                       ` Paul Eggert
  0 siblings, 0 replies; 45+ messages in thread
From: Paul Eggert @ 2019-04-20 22:54 UTC (permalink / raw)
  To: emacs-devel

Óscar Fuentes wrote:
> I'm pretty sure that
> the speedup will vanish as code is modified and compilers and CPUs
> change.

I agree that if you disable GCC optimization or do lots of other 
plausible-but-low-performance things, that patch will not help performance. 
However, I'm skeptical that the patch would make performance worse in ordinary 
production builds on current Emacs platforms. And I wasn't surprised that the 
patch was a significant (albeit small) win on the platform I tested on, because 
I looked at the machine code that it generated and I knew something of my 
platform's architecture.

Of course I could be unpleasantly surprised about how well this performance win 
generalizes to other platforms - and as I wrote, I am often surprised. To become 
surprised, though, we need to see some numbers.

> I'm far from being a naive programmer.

I didn't think you were naive, and apologize if my comments could be interpreted 
that way. My comments were intended only to make it clear that I'm not a naive 
programmer.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Using __builtin_expect (likely/unlikely macros)
  2019-04-15  1:18 ` Paul Eggert
  2019-04-15  3:11   ` Alex Gramiak
@ 2020-04-15  3:14   ` John Carter
  1 sibling, 0 replies; 45+ messages in thread
From: John Carter @ 2020-04-15  3:14 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Alex Gramiak, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1264 bytes --]

As a small datapoint I recently converted the largish (about 2megasloc)
code base I maintain to using likely/unlikely.

I have been very pleasantly surprised by how much more readable it made the
__builtin's..

And then surprised again by how much more readable it makes code in general.

I now occasionally add them to my code, not so much as an optimization
hint, but as documentation to indicate the expected "Happy Path".

In fact, so much so, I find interesting those cases where I consider it....
and then say, meh, I don't know which path is more likely...



On Mon, Apr 15, 2019 at 1:19 PM Paul Eggert <eggert@cs.ucla.edu> wrote:

> Alex Gramiak wrote:
> > the likely/unlikely
> > macros are a nice way to indicate that a branch is exceedingly
> > rare/common.
>
> The cost (in making the C code harder to read, write and maintain) so
> often
> exceeds that benefit that I'd rather avoid these macros unless there's a
> good
> performance case for putting them in.
>
>

-- 
John Carter
Phone : (64)(3) 358 6639
Tait Electronics
PO Box 1645 Christchurch
New Zealand

-- 
This Communication is Confidential. We only send and receive email on the

basis of the terms set out at www.taitradio.com/email_disclaimer 
<http://www.taitradio.com/email_disclaimer>

[-- Attachment #2: Type: text/html, Size: 1949 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2020-04-15  3:14 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-15  0:15 Using __builtin_expect (likely/unlikely macros) Alex Gramiak
2019-04-15  1:18 ` Paul Eggert
2019-04-15  3:11   ` Alex Gramiak
2019-04-15  4:41     ` Paul Eggert
2019-04-16  0:16       ` Alex Gramiak
2019-04-16  2:34         ` Eli Zaretskii
2019-04-16  5:33           ` Alex Gramiak
2019-04-16 15:23             ` Eli Zaretskii
2019-04-16 15:47               ` Alex Gramiak
2019-04-16  3:42         ` Paul Eggert
2019-04-16 13:05           ` Stefan Monnier
2019-04-16 15:22             ` Paul Eggert
2019-04-16 16:10               ` Alex Gramiak
2019-04-16 17:54                 ` Paul Eggert
2019-04-16 20:50                   ` Alex Gramiak
2019-04-16 21:11                     ` Alex Gramiak
2019-04-16 21:27                     ` Stefan Monnier
2019-04-16 21:27                     ` Konstantin Kharlamov
2019-04-18  8:25                       ` Paul Eggert
2019-04-18  8:43                         ` Konstantin Kharlamov
2019-04-18 13:47                         ` Andy Moreton
2019-04-18 17:27                           ` Paul Eggert
2019-04-18 17:56                             ` Andy Moreton
2019-04-18 19:32                               ` Paul Eggert
2019-04-19 13:45                         ` Alex Gramiak
2019-04-19 13:58                           ` Konstantin Kharlamov
2019-04-19 14:45                             ` Alex Gramiak
2019-04-19 17:33                           ` Paul Eggert
2019-04-19 20:53                             ` Alex Gramiak
2019-04-20  0:05                               ` Alan Mackenzie
2019-04-20  0:42                                 ` Paul Eggert
2019-04-20 19:46                                   ` Alan Mackenzie
2019-04-20 15:29                             ` Andy Moreton
2019-04-20 15:57                               ` Paul Eggert
2019-04-20 16:03                                 ` Eli Zaretskii
2019-04-20 16:11                                   ` Paul Eggert
2019-04-20 16:18                                     ` Eli Zaretskii
2019-04-20 16:57                                       ` Paul Eggert
2019-04-20 17:22                                         ` Eli Zaretskii
2019-04-20 19:13                                           ` Paul Eggert
2019-04-20 16:28                                 ` Óscar Fuentes
2019-04-20 18:58                                   ` Paul Eggert
2019-04-20 19:35                                     ` Óscar Fuentes
2019-04-20 22:54                                       ` Paul Eggert
2020-04-15  3:14   ` John Carter

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).