* Using __builtin_expect (likely/unlikely macros) @ 2019-04-15 0:15 Alex Gramiak 2019-04-15 1:18 ` Paul Eggert 0 siblings, 1 reply; 45+ messages in thread From: Alex Gramiak @ 2019-04-15 0:15 UTC (permalink / raw) To: emacs-devel Would using these, if available, be acceptable? I noticed that some code in configure.ac already defines HAVE___BUILTIN_EXPECT, though I couldn't see what part of autoconf does this. Good candidates here include the conditionals around emacs_abort/fatal, and possibly in cases like the *alloc procedures and xfree. Even if there's no noticeable speedup, I think that the likely/unlikely macros are a nice way to indicate that a branch is exceedingly rare/common. #define unlikely(expr) __builtin_expect(!!(expr), 0) #define likely(expr) __builtin_expect(!!(expr), 1) ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-15 0:15 Using __builtin_expect (likely/unlikely macros) Alex Gramiak @ 2019-04-15 1:18 ` Paul Eggert 2019-04-15 3:11 ` Alex Gramiak 2020-04-15 3:14 ` John Carter 0 siblings, 2 replies; 45+ messages in thread From: Paul Eggert @ 2019-04-15 1:18 UTC (permalink / raw) To: Alex Gramiak; +Cc: emacs-devel Alex Gramiak wrote: > the likely/unlikely > macros are a nice way to indicate that a branch is exceedingly > rare/common. The cost (in making the C code harder to read, write and maintain) so often exceeds that benefit that I'd rather avoid these macros unless there's a good performance case for putting them in. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-15 1:18 ` Paul Eggert @ 2019-04-15 3:11 ` Alex Gramiak 2019-04-15 4:41 ` Paul Eggert 2020-04-15 3:14 ` John Carter 1 sibling, 1 reply; 45+ messages in thread From: Alex Gramiak @ 2019-04-15 3:11 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel Paul Eggert <eggert@cs.ucla.edu> writes: > Alex Gramiak wrote: >> the likely/unlikely >> macros are a nice way to indicate that a branch is exceedingly >> rare/common. > > The cost (in making the C code harder to read, write and maintain) so often > exceeds that benefit that I'd rather avoid these macros unless there's a good > performance case for putting them in. I doubt that any performance benefit would be noticeable, at least without some microbenchmarks that I don't have. Though I don't think that these macros, used sparingly around code such as emacs_abort, would make it harder to read/write/maintain. I'm certainly not suggesting to throw them around without care. Still, I understand the resistance to include them. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-15 3:11 ` Alex Gramiak @ 2019-04-15 4:41 ` Paul Eggert 2019-04-16 0:16 ` Alex Gramiak 0 siblings, 1 reply; 45+ messages in thread From: Paul Eggert @ 2019-04-15 4:41 UTC (permalink / raw) To: Alex Gramiak; +Cc: emacs-devel Alex Gramiak wrote: > Though I don't think that these macros, used sparingly around code such > as emacs_abort, would make it harder to read/write/maintain. These macro calls would not help near calls to emacs_abort, as it should already be obvious to a careful human reader that the jump to emacs_abort is the road less traveled. (That's also obvious to GCC, since emacs_abort is _Noreturn.) ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-15 4:41 ` Paul Eggert @ 2019-04-16 0:16 ` Alex Gramiak 2019-04-16 2:34 ` Eli Zaretskii 2019-04-16 3:42 ` Paul Eggert 0 siblings, 2 replies; 45+ messages in thread From: Alex Gramiak @ 2019-04-16 0:16 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 934 bytes --] Paul Eggert <eggert@cs.ucla.edu> writes: > Alex Gramiak wrote: >> Though I don't think that these macros, used sparingly around code such >> as emacs_abort, would make it harder to read/write/maintain. > > These macro calls would not help near calls to emacs_abort, as it should already > be obvious to a careful human reader that the jump to emacs_abort is the road > less traveled. (That's also obvious to GCC, since emacs_abort is _Noreturn.) To human readers, yes, but from what I can tell, GCC is mixed on this. When applying the following diff that surrounds emacs_abort in bytecode.c and xdisp.c, the assembly in both applied/unapplied is the same for bytecode.c, but not for xdisp.c (even without the LIKELY cases). I tested using gcc -O2 -S. Also, I tried putting a LIKELY in lisp_h_CHECK_TYPE (seems like a nice place for one) and that yielded different assembly for fns.c, even though wrong_type_argument is _Noreturn. [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: likely --] [-- Type: text/x-patch, Size: 6351 bytes --] diff --git a/src/bytecode.c b/src/bytecode.c index 40977799bf..ce1a7bd254 100644 --- a/src/bytecode.c +++ b/src/bytecode.c @@ -39,6 +39,7 @@ along with GNU Emacs. If not, see <https://www.gnu.org/licenses/>. */ #ifndef BYTE_CODE_SAFE # define BYTE_CODE_SAFE false #endif +#define unlikely(expr) (__builtin_expect (expr, 0)) /* Define BYTE_CODE_METER to generate a byte-op usage histogram. */ /* #define BYTE_CODE_METER */ @@ -404,7 +405,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, int op; enum handlertype type; - if (BYTE_CODE_SAFE && ! (stack_base <= top && top < stack_lim)) + if (unlikely (BYTE_CODE_SAFE && ! (stack_base <= top && top < stack_lim))) emacs_abort (); #ifdef BYTE_CODE_METER @@ -664,9 +665,9 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, op_branch: op -= pc - bytestr_data; op_relative_branch: - if (BYTE_CODE_SAFE + if (unlikely (BYTE_CODE_SAFE && ! (bytestr_data - pc <= op - && op < bytestr_data + bytestr_length - pc)) + && op < bytestr_data + bytestr_length - pc))) emacs_abort (); quitcounter += op < 0; if (!quitcounter) @@ -1397,7 +1398,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, number of cases is less, which uses a simple vector for linear search as the jump table. */ Lisp_Object jmp_table = POP; - if (BYTE_CODE_SAFE && !HASH_TABLE_P (jmp_table)) + if (unlikely (BYTE_CODE_SAFE && !HASH_TABLE_P (jmp_table))) emacs_abort (); Lisp_Object v1 = POP; ptrdiff_t i; @@ -1426,7 +1427,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, if (i >= 0) { Lisp_Object val = HASH_VALUE (h, i); - if (BYTE_CODE_SAFE && !FIXNUMP (val)) + if (unlikely (BYTE_CODE_SAFE && !FIXNUMP (val))) emacs_abort (); op = XFIXNUM (val); goto op_branch; @@ -1436,8 +1437,8 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, CASE_DEFAULT CASE (Bconstant): - if (BYTE_CODE_SAFE - && ! (Bconstant <= op && op < Bconstant + const_length)) + if (unlikely (BYTE_CODE_SAFE + && ! (Bconstant <= op && op < Bconstant + const_length))) emacs_abort (); PUSH (vectorp[op - Bconstant]); NEXT; diff --git a/src/xdisp.c b/src/xdisp.c index a88fc698b8..9872f69cb0 100644 --- a/src/xdisp.c +++ b/src/xdisp.c @@ -344,7 +344,8 @@ along with GNU Emacs. If not, see <https://www.gnu.org/licenses/>. */ /* Holds the list (error). */ static Lisp_Object list_of_error; - +#define UNLIKELY(expr) (__builtin_expect (expr, 0)) +#define LIKELY(expr) (__builtin_expect (expr, 1)) #ifdef HAVE_WINDOW_SYSTEM /* Test if overflow newline into fringe. Called with iterator IT @@ -8322,7 +8323,7 @@ compute_stop_pos_backwards (struct it *it) reseat_1 (it, pos, false); compute_stop_pos (it); /* We must advance forward, right? */ - if (it->stop_charpos <= charpos) + if (UNLIKELY (it->stop_charpos <= charpos)) emacs_abort (); } while (charpos > BEGV && it->stop_charpos >= it->end_charpos); @@ -8371,7 +8372,7 @@ handle_stop_backwards (struct it *it, ptrdiff_t charpos) it->current.string_pos = string_pos (charpos, it->string); compute_stop_pos (it); /* We must advance forward, right? */ - if (it->stop_charpos <= it->prev_stop) + if (UNLIKELY (it->stop_charpos <= it->prev_stop)) emacs_abort (); charpos = it->stop_charpos; } @@ -11443,7 +11444,7 @@ pop_message_unwind (void) void check_message_stack (void) { - if (!NILP (Vmessage_stack)) + if (UNLIKELY (!NILP (Vmessage_stack))) emacs_abort (); } @@ -15495,7 +15496,7 @@ set_cursor_from_row (struct window *w, struct glyph_row *row, /* Need to compute x that corresponds to GLYPH. */ for (g = row->glyphs[TEXT_AREA], x = row->x; g < glyph; g++) { - if (g >= row->glyphs[TEXT_AREA] + row->used[TEXT_AREA]) + if (UNLIKELY (g >= row->glyphs[TEXT_AREA] + row->used[TEXT_AREA])) emacs_abort (); x += g->pixel_width; } @@ -16827,9 +16828,9 @@ redisplay_window (Lisp_Object window, bool just_this_one_p) /* Some sanity checks. */ CHECK_WINDOW_END (w); - if (Z == Z_BYTE && CHARPOS (opoint) != BYTEPOS (opoint)) + if (UNLIKELY (Z == Z_BYTE && CHARPOS (opoint) != BYTEPOS (opoint))) emacs_abort (); - if (BYTEPOS (opoint) < CHARPOS (opoint)) + if (UNLIKELY (BYTEPOS (opoint) < CHARPOS (opoint))) emacs_abort (); if (mode_line_update_needed (w)) @@ -19318,9 +19319,9 @@ try_window_id (struct window *w) adjust_window_ends (w, last_text_row, false); eassert (w->window_end_bytepos >= 0); } - else if (first_unchanged_at_end_row == NULL - && last_text_row == NULL - && last_text_row_at_end == NULL) + else if (LIKELY (first_unchanged_at_end_row == NULL + && last_text_row == NULL + && last_text_row_at_end == NULL)) { /* Displayed to end of window, but no line containing text was displayed. Lines were deleted at the end of the window. */ @@ -21015,7 +21016,7 @@ find_row_edges (struct it *it, struct glyph_row *row, which puts the iterator at the beginning of the next line, in the logical order. */ row->maxpos = it->current.pos; - else if (max_pos == min_pos && it->method != GET_FROM_BUFFER) + else if (LIKELY (max_pos == min_pos && it->method != GET_FROM_BUFFER)) /* A line that is entirely from a string/image/stretch... */ row->maxpos = row->minpos; else @@ -25210,7 +25211,7 @@ display_string (const char *string, Lisp_Object lisp_string, Lisp_Object face_st } break; } - else if (x + glyph->pixel_width >= it->first_visible_x) + else if (LIKELY (x + glyph->pixel_width >= it->first_visible_x)) { /* Glyph is at least partially visible. */ ++it->hpos; @@ -27847,7 +27848,7 @@ produce_special_glyphs (struct it *it, enum display_element_type what) spec_glyph_lookup_face (XWINDOW (it->window), &glyph); } } - else if (what == IT_TRUNCATION) + else if (LIKELY (what == IT_TRUNCATION)) { /* Truncation glyph. */ SET_GLYPH_FROM_CHAR (glyph, '$'); ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-16 0:16 ` Alex Gramiak @ 2019-04-16 2:34 ` Eli Zaretskii 2019-04-16 5:33 ` Alex Gramiak 2019-04-16 3:42 ` Paul Eggert 1 sibling, 1 reply; 45+ messages in thread From: Eli Zaretskii @ 2019-04-16 2:34 UTC (permalink / raw) To: Alex Gramiak; +Cc: eggert, emacs-devel > From: Alex Gramiak <agrambot@gmail.com> > Date: Mon, 15 Apr 2019 18:16:02 -0600 > Cc: emacs-devel@gnu.org > > To human readers, yes, but from what I can tell, GCC is mixed on this. > When applying the following diff that surrounds emacs_abort in > bytecode.c and xdisp.c, the assembly in both applied/unapplied is the > same for bytecode.c, but not for xdisp.c (even without the LIKELY > cases). I tested using gcc -O2 -S. I'm not sure whether you put LIKELY and UNLIKELY somewhat randomly, just for testing, or did you really think each place is likely/unlikely as in the change, but at least some places in xdisp.c are wrong: they use LIKELY where UNLIKELY is more appropriate, abd vice versa. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-16 2:34 ` Eli Zaretskii @ 2019-04-16 5:33 ` Alex Gramiak 2019-04-16 15:23 ` Eli Zaretskii 0 siblings, 1 reply; 45+ messages in thread From: Alex Gramiak @ 2019-04-16 5:33 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > I'm not sure whether you put LIKELY and UNLIKELY somewhat randomly, > just for testing, or did you really think each place is > likely/unlikely as in the change, but at least some places in xdisp.c > are wrong: they use LIKELY where UNLIKELY is more appropriate, abd > vice versa. Out of curiosity, which places would that be? I only put UNLIKELY calls around the checks for emacs_abort (which I presume to be unlikely), and the LIKELY calls are of the form: else if (LIKELY (<cond>)) { ... } else emacs_abort (); So if already at the "else if" clause, it's likely that the condition is true, otherwise an abort would take place. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-16 5:33 ` Alex Gramiak @ 2019-04-16 15:23 ` Eli Zaretskii 2019-04-16 15:47 ` Alex Gramiak 0 siblings, 1 reply; 45+ messages in thread From: Eli Zaretskii @ 2019-04-16 15:23 UTC (permalink / raw) To: Alex Gramiak; +Cc: eggert, emacs-devel > From: Alex Gramiak <agrambot@gmail.com> > Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org > Date: Mon, 15 Apr 2019 23:33:51 -0600 > > Eli Zaretskii <eliz@gnu.org> writes: > > > I'm not sure whether you put LIKELY and UNLIKELY somewhat randomly, > > just for testing, or did you really think each place is > > likely/unlikely as in the change, but at least some places in xdisp.c > > are wrong: they use LIKELY where UNLIKELY is more appropriate, abd > > vice versa. > > Out of curiosity, which places would that be? I only put UNLIKELY calls > around the checks for emacs_abort (which I presume to be unlikely), and > the LIKELY calls are of the form: > > else if (LIKELY (<cond>)) > { > ... > } > else > emacs_abort (); OK, it's possible that I don't understand the exact semantics of __builtin_expect in these situations. The problem is that you used LIKELY like this: if (A) do_A; else if (B) do_B; else if (C) do_C; else if (LIKELY (D)) do_D; else cant_happen (); Essentially, the above is a moral equivalent of a 'switch' with the 'default' case aborting because it "cannot happen". In such code, the order of the clauses doesn't necessarily tell anything about their likelihood; up front, they all are equally "likely". So using LIKELY only in the last one sends a wrong signal: that last condition is neither more nor less likely than all the others. Actually, in some cases it might be _less_ likely than the preceding ones, because if I knew that some of these conditions happens much more frequently, I'd test it first. Now, it's possible that the effect of __builtin_expect doesn't care about this issue. The GCC manual doesn't help to figure out whether this is the case, because it only talks about a simple case of a single 'if' clause, and doesn't tell any details about what GCC is allowed to do when it sees __builtin_expect. But just by looking at how the code looks, I immediately raised a brow. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-16 15:23 ` Eli Zaretskii @ 2019-04-16 15:47 ` Alex Gramiak 0 siblings, 0 replies; 45+ messages in thread From: Alex Gramiak @ 2019-04-16 15:47 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > OK, it's possible that I don't understand the exact semantics of > __builtin_expect in these situations. > > The problem is that you used LIKELY like this: > > if (A) > do_A; > else if (B) > do_B; > else if (C) > do_C; > else if (LIKELY (D)) > do_D; > else > cant_happen (); > > Essentially, the above is a moral equivalent of a 'switch' with the > 'default' case aborting because it "cannot happen". In such code, the > order of the clauses doesn't necessarily tell anything about their > likelihood; up front, they all are equally "likely". So using LIKELY > only in the last one sends a wrong signal: that last condition is > neither more nor less likely than all the others. Actually, in some > cases it might be _less_ likely than the preceding ones, because if I > knew that some of these conditions happens much more frequently, I'd > test it first. It was my understanding that since an else if is equivalent to else { if ... }, it would only affect the last two branches. Though I could easily be wrong here. > Now, it's possible that the effect of __builtin_expect doesn't care > about this issue. The GCC manual doesn't help to figure out whether > this is the case, because it only talks about a simple case of a > single 'if' clause, and doesn't tell any details about what GCC is > allowed to do when it sees __builtin_expect. But just by looking at > how the code looks, I immediately raised a brow. Right, considering the confusion it would be counterproductive to use them in this fashion. A workaround to the confusion would be to do: else { if (LIKELY (D)) do_D; else cant_happen (); } ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-16 0:16 ` Alex Gramiak 2019-04-16 2:34 ` Eli Zaretskii @ 2019-04-16 3:42 ` Paul Eggert 2019-04-16 13:05 ` Stefan Monnier 1 sibling, 1 reply; 45+ messages in thread From: Paul Eggert @ 2019-04-16 3:42 UTC (permalink / raw) To: Alex Gramiak; +Cc: emacs-devel Alex Gramiak wrote: >> These macro calls would not help near calls to emacs_abort, as it should already >> be obvious to a careful human reader that the jump to emacs_abort is the road >> less traveled. (That's also obvious to GCC, since emacs_abort is _Noreturn.) > To human readers, yes, but from what I can tell, GCC is mixed on this. Then we should fix GCC, if the code it generates has a performance problem (whatever it is, it's quite small). That'd be better than littering the Emacs source code with __builtin_expect or UNLIKELY calls. The GCC manual recommends against manually inserting such calls; performance nerds are supposed to use -fprofile-arcs instead. In my experience the calls are typically more trouble than they're worth. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-16 3:42 ` Paul Eggert @ 2019-04-16 13:05 ` Stefan Monnier 2019-04-16 15:22 ` Paul Eggert 0 siblings, 1 reply; 45+ messages in thread From: Stefan Monnier @ 2019-04-16 13:05 UTC (permalink / raw) To: emacs-devel >>> These macro calls would not help near calls to emacs_abort, as it >>> should already be obvious to a careful human reader that the jump to >>> emacs_abort is the road less traveled. (That's also obvious to GCC, >>> since emacs_abort is _Noreturn.) >> To human readers, yes, but from what I can tell, GCC is mixed on this. > Then we should fix GCC, if the code it generates has a performance problem > (whatever it is, it's quite small). FWIW, the fact that a function is "_Noreturn" doesn't necessarily mean that a call to it is unlikely (in many cases it is, I guess, but definitely not all), so maybe GCC maintainers consciously decided not to link the two. > The GCC manual recommends against manually inserting such calls; It's likely based on some past experiments that showed programmers aren't very good at understanding what is likely and what isn't in their code. BTW I think instead of marking branches as likely or unlikely, I'd prefer to tell GCC that some functions "should be slow" (e.g. emacs_abort) so it optimizes the code paths that don't go through those functions to the detriment of those that do. Stefan ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-16 13:05 ` Stefan Monnier @ 2019-04-16 15:22 ` Paul Eggert 2019-04-16 16:10 ` Alex Gramiak 0 siblings, 1 reply; 45+ messages in thread From: Paul Eggert @ 2019-04-16 15:22 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel Stefan Monnier wrote: > FWIW, the fact that a function is "_Noreturn" doesn't necessarily mean > that a call to it is unlikely (in many cases it is, I guess, but > definitely not all), so maybe GCC maintainers consciously decided not to > link the two. Possibly they did. In hindsight I'd argue that was a mistake. If one has no other evidence about the likelihood of a branch, a branch to a _Noreturn call should default to being unlikely. > BTW I think instead of marking branches as likely or unlikely, I'd > prefer to tell GCC that some functions "should be slow" > (e.g. emacs_abort) so it optimizes the code paths that don't go through > those functions to the detriment of those that do. GCC has the function attribute 'cold' for that. This is less intrusive than __builtin_expect and so would be preferable. Still, the GCC manual says that __attribute__ ((cold)) is ignored when profile feedback is available, which is another indication that people who care about performance should be using -fprofile-use etc. And as far as I know __attribute__ ((cold)) is rarely used: even glibc uses it only once, in obscure code never used on GNU/Linux. Presumably this is partly because the attribute didn't exist until about five years ago. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-16 15:22 ` Paul Eggert @ 2019-04-16 16:10 ` Alex Gramiak 2019-04-16 17:54 ` Paul Eggert 0 siblings, 1 reply; 45+ messages in thread From: Alex Gramiak @ 2019-04-16 16:10 UTC (permalink / raw) To: Paul Eggert; +Cc: Stefan Monnier, emacs-devel Paul Eggert <eggert@cs.ucla.edu> writes: >> BTW I think instead of marking branches as likely or unlikely, I'd >> prefer to tell GCC that some functions "should be slow" >> (e.g. emacs_abort) so it optimizes the code paths that don't go through >> those functions to the detriment of those that do. > > GCC has the function attribute 'cold' for that. This is less intrusive than > __builtin_expect and so would be preferable. Still, the GCC manual says that > __attribute__ ((cold)) is ignored when profile feedback is available, which is > another indication that people who care about performance should be using > -fprofile-use etc. And as far as I know __attribute__ ((cold)) is rarely used: > even glibc uses it only once, in obscure code never used on GNU/Linux. > Presumably this is partly because the attribute didn't exist until about five > years ago. It would still be good to use it for default builds, no? The GCC documentation states: It is thus useful to mark functions used to handle unlikely conditions, such as perror, as cold to improve optimization of hot functions that do call marked functions in rare occasions. So the error/signal calls, wrong_type_argument, etc. would be good places for this. It doesn't indicate anything to the programmer at call sites, but that point is controversial, so slapping a few attributes down seems like a better way to go. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-16 16:10 ` Alex Gramiak @ 2019-04-16 17:54 ` Paul Eggert 2019-04-16 20:50 ` Alex Gramiak 0 siblings, 1 reply; 45+ messages in thread From: Paul Eggert @ 2019-04-16 17:54 UTC (permalink / raw) To: Alex Gramiak; +Cc: Stefan Monnier, emacs-devel On 4/16/19 9:10 AM, Alex Gramiak wrote: > It would still be good to use it for default builds, no? The GCC > documentation states: > > It is thus useful to mark functions used to handle unlikely > conditions, such as perror, as cold to improve optimization of hot > functions that do call marked functions in rare occasions. That argument would be stronger if functions like 'perror' actually used __attribute__ ((cold)), which they don't (at least, not in GNU systems). There is a cost and a benefit to adding these attributes. The benefit is that they should improve runtime performance very slightly, and (more important) they should help nonexpert human readers know that a function is rarely called. The cost is that they make the code harder to maintain, thus placing a burden on maintainers. Perhaps I'm biased as I would bear the cost and get none of the benefit; still, the overall cost-benefit ratio doesn't look all that favorable for Emacs. That being said, it might make sense for a few obviously-rarely-called functions like 'emacs-abort' to be marked with __attribute__ ((cold)), so long as we don't turn this into a mission to mark all cold functions (which would cost us more than it would benefit). That is what GCC itself does, with its own functions. However, I'd like to see performance figures. Could you try it out on the benchmark of 'cd lisp && time make compile-always'? ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-16 17:54 ` Paul Eggert @ 2019-04-16 20:50 ` Alex Gramiak 2019-04-16 21:11 ` Alex Gramiak ` (2 more replies) 0 siblings, 3 replies; 45+ messages in thread From: Alex Gramiak @ 2019-04-16 20:50 UTC (permalink / raw) To: Paul Eggert; +Cc: Stefan Monnier, emacs-devel [-- Attachment #1: Type: text/plain, Size: 6171 bytes --] Paul Eggert <eggert@cs.ucla.edu> writes: > That being said, it might make sense for a few obviously-rarely-called > functions like 'emacs-abort' to be marked with __attribute__ ((cold)), > so long as we don't turn this into a mission to mark all cold functions > (which would cost us more than it would benefit). That is what GCC > itself does, with its own functions. However, I'd like to see > performance figures. Could you try it out on the benchmark of 'cd lisp > && time make compile-always'? Right, I agree that if used, they should be used sparingly. I tested three versions a few times each with both 'make' and 'make -j4': a) Regular Emacs master. b) The below diff with only the _Cold attribute c) The below diff with both _Cold and _Hot attributes a) Normal real 4:17.97s user 3:57.18s sys 20.394s real 1:17.67s user 4:23.78s sys 18.888s b) Cold real 4:10.92s user 3:50.34s sys 20.178s real 1:15.77s user 4:16.73s sys 18.943s c) Hot/Cold real 4:11.43s user 3:51.07s sys 19.961s real 1:16.01s user 4:17.63s sys 18.662s So not much of a difference. For some reason the Hot/Cold performed consistently worse than Cold. I also tested startup/shutdown with perf: Performance counter stats for '../emacs-normal -f kill-emacs' (20 runs): 762.17 msec task-clock:u # 0.844 CPUs utilized ( +- 0.23% ) 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 12,941 page-faults:u # 0.017 M/sec ( +- 0.01% ) 2,998,322,125 cycles:u # 3.934 GHz ( +- 0.06% ) 1,392,869,413 stalled-cycles-frontend:u # 46.45% frontend cycles idle ( +- 0.15% ) 982,206,843 stalled-cycles-backend:u # 32.76% backend cycles idle ( +- 0.18% ) 4,874,186,825 instructions:u # 1.63 insn per cycle # 0.29 stalled cycles per insn ( +- 0.01% ) 1,037,929,374 branches:u # 1361.802 M/sec ( +- 0.01% ) 17,930,471 branch-misses:u # 1.73% of all branches ( +- 0.16% ) 1,209,539,215 L1-dcache-loads:u # 1586.960 M/sec ( +- 0.01% ) 42,346,229 L1-dcache-load-misses:u # 3.50% of all L1-dcache hits ( +- 0.05% ) 9,088,647 LLC-loads:u # 11.925 M/sec ( +- 0.29% ) <not supported> LLC-load-misses:u 0.90325 +- 0.00441 seconds time elapsed ( +- 0.49% ) Performance counter stats for '../emacs.cold -f kill-emacs' (20 runs): 755.94 msec task-clock:u # 0.845 CPUs utilized ( +- 0.24% ) 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 12,941 page-faults:u # 0.017 M/sec ( +- 0.01% ) 2,976,036,365 cycles:u # 3.937 GHz ( +- 0.06% ) 1,374,451,779 stalled-cycles-frontend:u # 46.18% frontend cycles idle ( +- 0.14% ) 990,227,732 stalled-cycles-backend:u # 33.27% backend cycles idle ( +- 0.18% ) 4,878,661,927 instructions:u # 1.64 insn per cycle # 0.28 stalled cycles per insn ( +- 0.00% ) 1,038,495,525 branches:u # 1373.782 M/sec ( +- 0.00% ) 17,859,906 branch-misses:u # 1.72% of all branches ( +- 0.16% ) 1,209,345,531 L1-dcache-loads:u # 1599.792 M/sec ( +- 0.00% ) 42,444,358 L1-dcache-load-misses:u # 3.51% of all L1-dcache hits ( +- 0.06% ) 9,204,368 LLC-loads:u # 12.176 M/sec ( +- 0.41% ) <not supported> LLC-load-misses:u 0.89430 +- 0.00217 seconds time elapsed ( +- 0.24% ) Performance counter stats for '../emacs.hot-cold -f kill-emacs' (20 runs): 761.97 msec task-clock:u # 0.845 CPUs utilized ( +- 0.20% ) 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 12,947 page-faults:u # 0.017 M/sec ( +- 0.01% ) 2,989,750,359 cycles:u # 3.924 GHz ( +- 0.04% ) 1,383,312,275 stalled-cycles-frontend:u # 46.27% frontend cycles idle ( +- 0.12% ) 994,643,853 stalled-cycles-backend:u # 33.27% backend cycles idle ( +- 0.13% ) 4,879,318,990 instructions:u # 1.63 insn per cycle # 0.28 stalled cycles per insn ( +- 0.00% ) 1,038,584,045 branches:u # 1363.022 M/sec ( +- 0.00% ) 17,863,736 branch-misses:u # 1.72% of all branches ( +- 0.13% ) 1,209,327,347 L1-dcache-loads:u # 1587.103 M/sec ( +- 0.00% ) 42,501,374 L1-dcache-load-misses:u # 3.51% of all L1-dcache hits ( +- 0.05% ) 9,201,311 LLC-loads:u # 12.076 M/sec ( +- 0.28% ) <not supported> LLC-load-misses:u 0.90132 +- 0.00201 seconds time elapsed ( +- 0.22% ) Which again shows a slight improvement with the Cold attributes, and still shows the hot attributes degrading performance. Perhaps I was too overzealous with the hot tagging? [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: hot/cold --] [-- Type: text/x-patch, Size: 17489 bytes --] diff --git a/src/alloc.c b/src/alloc.c index fe2cdb8788..dd783863be 100644 --- a/src/alloc.c +++ b/src/alloc.c @@ -633,7 +633,7 @@ display_malloc_warning (void) \f /* Called if we can't allocate relocatable space for a buffer. */ -_Cold void +void buffer_memory_full (ptrdiff_t nbytes) { /* If buffers use the relocating allocator, no need to free @@ -886,7 +886,7 @@ static void *lrealloc (void *, size_t); /* Like malloc but check for no memory and block interrupt input. */ -_Hot void * +void * xmalloc (size_t size) { void *val; @@ -903,7 +903,7 @@ xmalloc (size_t size) /* Like the above, but zeroes out the memory just allocated. */ -_Hot void * +void * xzalloc (size_t size) { void *val; @@ -921,7 +921,7 @@ xzalloc (size_t size) /* Like realloc but check for no memory and block interrupt input.. */ -_Hot void * +void * xrealloc (void *block, size_t size) { void *val; @@ -944,7 +944,7 @@ xrealloc (void *block, size_t size) /* Like free but block interrupt input. */ -_Hot void +void xfree (void *block) { if (!block) @@ -1125,7 +1125,7 @@ record_xmalloc (size_t size) void *lisp_malloc_loser EXTERNALLY_VISIBLE; #endif -static _Hot void * +static void * lisp_malloc (size_t nbytes, enum mem_type type) { register void *val; @@ -1170,7 +1170,7 @@ lisp_malloc (size_t nbytes, enum mem_type type) /* Free BLOCK. This must be called to free memory allocated with a call to lisp_malloc. */ -static _Hot void +static void lisp_free (void *block) { if (pdumper_object_p (block)) @@ -2325,7 +2325,7 @@ compact_small_strings (void) current_sblock = tb; } -_Cold void +void string_overflow (void) { error ("Maximum string size exceeded"); @@ -4085,7 +4085,7 @@ set_interval_marked (INTERVAL i) either case this counts as memory being full even though malloc did not fail. */ -_Cold void +void memory_full (size_t nbytes) { /* Do not go into hysterics merely because a large request failed. */ diff --git a/src/bytecode.c b/src/bytecode.c index e2fe7153b0..40977799bf 100644 --- a/src/bytecode.c +++ b/src/bytecode.c @@ -336,7 +336,7 @@ bcall0 (Lisp_Object f) ARGS are pushed on the stack according to ARGS_TEMPLATE before executing BYTESTR. */ -_Hot Lisp_Object +Lisp_Object exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, Lisp_Object args_template, ptrdiff_t nargs, Lisp_Object *args) { diff --git a/src/data.c b/src/data.c index 1291a92955..11cd598ed8 100644 --- a/src/data.c +++ b/src/data.c @@ -130,7 +130,7 @@ set_blv_valcell (struct Lisp_Buffer_Local_Value *blv, Lisp_Object val) blv->valcell = val; } -static _Cold _Noreturn void +static _Noreturn void wrong_length_argument (Lisp_Object a1, Lisp_Object a2, Lisp_Object a3) { Lisp_Object size1 = make_fixnum (bool_vector_size (a1)); @@ -142,7 +142,7 @@ wrong_length_argument (Lisp_Object a1, Lisp_Object a2, Lisp_Object a3) make_fixnum (bool_vector_size (a3))); } -_Cold _Noreturn void +_Noreturn void wrong_type_argument (register Lisp_Object predicate, register Lisp_Object value) { /* If VALUE is not even a valid Lisp object, we'd want to abort here @@ -155,25 +155,25 @@ wrong_type_argument (register Lisp_Object predicate, register Lisp_Object value) xsignal2 (Qwrong_type_argument, predicate, value); } -_Cold void +void pure_write_error (Lisp_Object obj) { xsignal2 (Qerror, build_string ("Attempt to modify read-only object"), obj); } -_Cold void +void args_out_of_range (Lisp_Object a1, Lisp_Object a2) { xsignal2 (Qargs_out_of_range, a1, a2); } -_Cold void +void args_out_of_range_3 (Lisp_Object a1, Lisp_Object a2, Lisp_Object a3) { xsignal3 (Qargs_out_of_range, a1, a2, a3); } -_Cold void +void circular_list (Lisp_Object list) { xsignal1 (Qcircular_list, list); @@ -1018,7 +1018,7 @@ do_symval_forwarding (lispfwd valcontents) /* Used to signal a user-friendly error when symbol WRONG is not a member of CHOICE, which should be a list of symbols. */ -_Cold void +void wrong_choice (Lisp_Object choice, Lisp_Object wrong) { ptrdiff_t i = 0, len = list_length (choice); @@ -1051,7 +1051,7 @@ wrong_choice (Lisp_Object choice, Lisp_Object wrong) /* Used to signal a user-friendly error if WRONG is not a number or integer/floating-point number outsize of inclusive MIN..MAX range. */ -static _Cold void +static void wrong_range (Lisp_Object min, Lisp_Object max, Lisp_Object wrong) { AUTO_STRING (value_should_be_from, "Value should be from "); diff --git a/src/dispnew.c b/src/dispnew.c index db9166cbe6..ccb08ec1b9 100644 --- a/src/dispnew.c +++ b/src/dispnew.c @@ -360,7 +360,7 @@ verify_row_hash (struct glyph_row *row) what is displayed on the screen. While this is usually fast, it leads to screen flickering. */ -static _Hot void +static void adjust_glyph_matrix (struct window *w, struct glyph_matrix *matrix, int x, int y, struct dim dim) { int i; @@ -2994,7 +2994,7 @@ window_to_frame_hpos (struct window *w, int hpos) /* Redraw frame F. */ -_Hot void +void redraw_frame (struct frame *f) { /* Error if F has no glyphs. */ @@ -3048,7 +3048,7 @@ DEFUN ("redraw-display", Fredraw_display, Sredraw_display, 0, 0, "", Value is true if redisplay was stopped due to pending input. */ -_Hot bool +bool update_frame (struct frame *f, bool force_p, bool inhibit_hairy_id_p) { /* True means display has been paused because of pending input. */ @@ -3384,7 +3384,7 @@ check_current_matrix_flags (struct window *w) /* Update display of window W. If FORCE_P, don't stop updating when input is pending. */ -static _Hot bool +static bool update_window (struct window *w, bool force_p) { struct glyph_matrix *desired_matrix = w->desired_matrix; @@ -3580,7 +3580,7 @@ update_marginal_area (struct window *w, struct glyph_row *updated_row, /* Update the display of the text area of row VPOS in window W. Value is true if display has changed. */ -static _Hot bool +static bool update_text_area (struct window *w, struct glyph_row *updated_row, int vpos) { struct glyph_row *current_row = MATRIX_ROW (w->current_matrix, vpos); @@ -4476,7 +4476,7 @@ scrolling_window (struct window *w, bool header_line_p) Value is true if update was stopped due to pending input. */ -static _Hot bool +static bool update_frame_1 (struct frame *f, bool force_p, bool inhibit_id_p, bool set_cursor_p, bool updating_menu_p) { @@ -4779,7 +4779,7 @@ count_match (struct glyph *str1, struct glyph *end1, struct glyph *str2, struct /* Perform a frame-based update on line VPOS in frame FRAME. */ -static _Hot void +static void update_frame_line (struct frame *f, int vpos, bool updating_menu_p) { struct glyph *obody, *nbody, *op1, *op2, *np1, *nend; diff --git a/src/eval.c b/src/eval.c index eb3d856a8f..e9f118c5cb 100644 --- a/src/eval.c +++ b/src/eval.c @@ -1707,25 +1707,25 @@ signal_or_quit (Lisp_Object error_symbol, Lisp_Object data, bool keyboard_quit) /* Like xsignal, but takes 0, 1, 2, or 3 args instead of a list. */ -_Cold void +void xsignal0 (Lisp_Object error_symbol) { xsignal (error_symbol, Qnil); } -_Cold void +void xsignal1 (Lisp_Object error_symbol, Lisp_Object arg) { xsignal (error_symbol, list1 (arg)); } -_Cold void +void xsignal2 (Lisp_Object error_symbol, Lisp_Object arg1, Lisp_Object arg2) { xsignal (error_symbol, list2 (arg1, arg2)); } -_Cold void +void xsignal3 (Lisp_Object error_symbol, Lisp_Object arg1, Lisp_Object arg2, Lisp_Object arg3) { xsignal (error_symbol, list3 (arg1, arg2, arg3)); @@ -1734,7 +1734,7 @@ xsignal3 (Lisp_Object error_symbol, Lisp_Object arg1, Lisp_Object arg2, Lisp_Obj /* Signal `error' with message S, and additional arg ARG. If ARG is not a proper list, make it a one-element list. */ -_Cold void +void signal_error (const char *s, Lisp_Object arg) { if (NILP (Fproper_list_p (arg))) @@ -1745,7 +1745,7 @@ signal_error (const char *s, Lisp_Object arg) /* Use this for arithmetic overflow, e.g., when an integer result is too large even for a bignum. */ -_Cold void +void overflow_error (void) { xsignal0 (Qoverflow_error); @@ -1892,7 +1892,7 @@ vformat_string (const char *m, va_list ap) } /* Dump an error message; called like vprintf. */ -_Cold void +void verror (const char *m, va_list ap) { xsignal1 (Qerror, vformat_string (m, ap)); @@ -1902,7 +1902,7 @@ verror (const char *m, va_list ap) /* Dump an error message; called like printf. */ /* VARARGS 1 */ -_Cold void +void error (const char *m, ...) { va_list ap; @@ -2171,7 +2171,7 @@ record_in_backtrace (Lisp_Object function, Lisp_Object *args, ptrdiff_t nargs) /* Eval a sub-expression of the current expression (i.e. in the same lexical scope). */ -_Hot Lisp_Object +Lisp_Object eval_sub (Lisp_Object form) { Lisp_Object fun, val, original_fun, original_args; @@ -2863,7 +2863,7 @@ usage: (funcall FUNCTION &rest ARGUMENTS) */) /* Apply a C subroutine SUBR to the NUMARGS evaluated arguments in ARG_VECTOR and return the result of evaluation. */ -_Hot Lisp_Object +Lisp_Object funcall_subr (struct Lisp_Subr *subr, ptrdiff_t numargs, Lisp_Object *args) { if (numargs < subr->min_args @@ -2942,7 +2942,7 @@ funcall_subr (struct Lisp_Subr *subr, ptrdiff_t numargs, Lisp_Object *args) } } -static _Hot Lisp_Object +static Lisp_Object apply_lambda (Lisp_Object fun, Lisp_Object args, ptrdiff_t count) { Lisp_Object *arg_vector; @@ -2978,7 +2978,7 @@ apply_lambda (Lisp_Object fun, Lisp_Object args, ptrdiff_t count) FUN must be either a lambda-expression, a compiled-code object, or a module function. */ -static _Hot Lisp_Object +static Lisp_Object funcall_lambda (Lisp_Object fun, ptrdiff_t nargs, register Lisp_Object *arg_vector) { @@ -3322,7 +3322,7 @@ do_specbind (struct Lisp_Symbol *sym, union specbinding *bind, i.e. bindings to the default value of a variable which can be buffer-local. */ -_Hot void +void specbind (Lisp_Object symbol, Lisp_Object value) { struct Lisp_Symbol *sym; @@ -3580,7 +3580,7 @@ set_unwind_protect_ptr (ptrdiff_t count, void (*func) (void *), void *arg) /* Pop and execute entries from the unwind-protect stack until the depth COUNT is reached. Return VALUE. */ -_Hot Lisp_Object +Lisp_Object unbind_to (ptrdiff_t count, Lisp_Object value) { Lisp_Object quitf = Vquit_flag; diff --git a/src/keyboard.c b/src/keyboard.c index 5569d0db2c..8fb6db987b 100644 --- a/src/keyboard.c +++ b/src/keyboard.c @@ -1038,7 +1038,7 @@ static Lisp_Object top_level_1 (Lisp_Object); This level has the catches for exiting/returning to editor command loop. It returns nil to exit recursive edit, t to abort it. */ -_Hot Lisp_Object +Lisp_Object command_loop (void) { #ifdef HAVE_STACK_OVERFLOW_HANDLING @@ -8856,7 +8856,7 @@ void init_raw_keybuf_count (void) If FIX_CURRENT_BUFFER, we restore current_buffer from the selected window's buffer. */ -static _Hot int +static int read_key_sequence (Lisp_Object *keybuf, Lisp_Object prompt, bool dont_downcase_last, bool can_return_switch_frame, bool fix_current_buffer, bool prevent_redisplay) diff --git a/src/lisp.h b/src/lisp.h index b98cad5e8e..681efc3b52 100644 --- a/src/lisp.h +++ b/src/lisp.h @@ -34,9 +34,6 @@ along with GNU Emacs. If not, see <https://www.gnu.org/licenses/>. */ #include <intprops.h> #include <verify.h> -#define _Cold __attribute__((cold)) -#define _Hot __attribute__((hot)) - INLINE_HEADER_BEGIN /* Define a TYPE constant ID as an externally visible name. Use like this: @@ -624,7 +621,7 @@ extern Lisp_Object char_table_ref (Lisp_Object, int); extern void char_table_set (Lisp_Object, int, Lisp_Object); /* Defined in data.c. */ -extern _Cold _Noreturn void wrong_type_argument (Lisp_Object, Lisp_Object); +extern _Noreturn void wrong_type_argument (Lisp_Object, Lisp_Object); /* Defined in emacs.c. */ @@ -4098,18 +4095,18 @@ extern Lisp_Object run_hook_with_args (ptrdiff_t nargs, Lisp_Object *args, Lisp_Object (*funcall) (ptrdiff_t nargs, Lisp_Object *args)); extern Lisp_Object quit (void); -INLINE _Cold _Noreturn void +INLINE _Noreturn void xsignal (Lisp_Object error_symbol, Lisp_Object data) { Fsignal (error_symbol, data); } -extern _Cold _Noreturn void xsignal0 (Lisp_Object); -extern _Cold _Noreturn void xsignal1 (Lisp_Object, Lisp_Object); -extern _Cold _Noreturn void xsignal2 (Lisp_Object, Lisp_Object, Lisp_Object); -extern _Cold _Noreturn void xsignal3 (Lisp_Object, Lisp_Object, Lisp_Object, +extern _Noreturn void xsignal0 (Lisp_Object); +extern _Noreturn void xsignal1 (Lisp_Object, Lisp_Object); +extern _Noreturn void xsignal2 (Lisp_Object, Lisp_Object, Lisp_Object); +extern _Noreturn void xsignal3 (Lisp_Object, Lisp_Object, Lisp_Object, Lisp_Object); -extern _Cold _Noreturn void signal_error (const char *, Lisp_Object); -extern _Cold _Noreturn void overflow_error (void); +extern _Noreturn void signal_error (const char *, Lisp_Object); +extern _Noreturn void overflow_error (void); extern bool FUNCTIONP (Lisp_Object); extern Lisp_Object funcall_subr (struct Lisp_Subr *subr, ptrdiff_t numargs, Lisp_Object *arg_vector); extern Lisp_Object eval_sub (Lisp_Object form); @@ -4148,8 +4145,8 @@ extern void set_unwind_protect_ptr (ptrdiff_t, void (*) (void *), void *); extern Lisp_Object unbind_to (ptrdiff_t, Lisp_Object); extern void rebind_for_thread_switch (void); extern void unbind_for_thread_switch (struct thread_state *); -extern _Cold _Noreturn void error (const char *, ...) ATTRIBUTE_FORMAT_PRINTF (1, 2); -extern _Cold _Noreturn void verror (const char *, va_list) +extern _Noreturn void error (const char *, ...) ATTRIBUTE_FORMAT_PRINTF (1, 2); +extern _Noreturn void verror (const char *, va_list) ATTRIBUTE_FORMAT_PRINTF (1, 0); extern Lisp_Object vformat_string (const char *, va_list) ATTRIBUTE_FORMAT_PRINTF (1, 0); diff --git a/src/lread.c b/src/lread.c index 16ce4afd21..5f33fcd695 100644 --- a/src/lread.c +++ b/src/lread.c @@ -1912,7 +1912,7 @@ readevalloop_eager_expand_eval (Lisp_Object val, Lisp_Object macroexpand) START, END specify region to read in current buffer (from eval-region). If the input is not from a buffer, they must be nil. */ -static _Hot void +static void readevalloop (Lisp_Object readcharfun, struct infile *infile0, Lisp_Object sourcename, @@ -2736,7 +2736,7 @@ read_integer (Lisp_Object readcharfun, EMACS_INT radix) FIRST_IN_LIST is true if this is the first element of a list. */ -static _Hot Lisp_Object +static Lisp_Object read1 (Lisp_Object readcharfun, int *pch, bool first_in_list) { int c; @@ -4092,7 +4092,7 @@ check_obarray (Lisp_Object obarray) /* Intern symbol SYM in OBARRAY using bucket INDEX. */ -static _Hot Lisp_Object +static Lisp_Object intern_sym (Lisp_Object sym, Lisp_Object obarray, Lisp_Object index) { Lisp_Object *ptr; @@ -4116,7 +4116,7 @@ intern_sym (Lisp_Object sym, Lisp_Object obarray, Lisp_Object index) /* Intern a symbol with name STRING in OBARRAY using bucket INDEX. */ -_Hot Lisp_Object +Lisp_Object intern_driver (Lisp_Object string, Lisp_Object obarray, Lisp_Object index) { return intern_sym (Fmake_symbol (string), obarray, index); @@ -4125,7 +4125,7 @@ intern_driver (Lisp_Object string, Lisp_Object obarray, Lisp_Object index) /* Intern the C string STR: return a symbol with that name, interned in the current obarray. */ -_Hot Lisp_Object +Lisp_Object intern_1 (const char *str, ptrdiff_t len) { Lisp_Object obarray = check_obarray (Vobarray); diff --git a/src/term.c b/src/term.c index 472d8d19e5..a492276c88 100644 --- a/src/term.c +++ b/src/term.c @@ -4393,7 +4393,8 @@ use the Bourne shell command 'TERM=...; export TERM' (C-shell:\n\ return terminal; } -static _Cold void + +static void vfatal (const char *str, va_list ap) { fprintf (stderr, "emacs: "); @@ -4409,7 +4410,7 @@ vfatal (const char *str, va_list ap) Delete TERMINAL, then call error or fatal with str1 or str2, respectively, according to whether MUST_SUCCEED is true. */ -static _Cold void +static void maybe_fatal (bool must_succeed, struct terminal *terminal, const char *str1, const char *str2, ...) { @@ -4424,7 +4425,7 @@ maybe_fatal (bool must_succeed, struct terminal *terminal, verror (str1, ap); } -_Cold void +void fatal (const char *str, ...) { va_list ap; diff --git a/src/xdisp.c b/src/xdisp.c index aca6f09b05..a88fc698b8 100644 --- a/src/xdisp.c +++ b/src/xdisp.c @@ -13906,7 +13906,7 @@ do { if (polling_stopped_here) start_polling (); \ /* Perhaps in the future avoid recentering windows if it is not necessary; currently that causes some problems. */ -static _Hot void +static void redisplay_internal (void) { struct window *w = XWINDOW (selected_window); @@ -14925,7 +14925,7 @@ buffer_flipping_blocked_p (void) /* Redisplay all leaf windows in the window tree rooted at WINDOW. */ -static _Hot void +static void redisplay_windows (Lisp_Object window) { while (!NILP (window)) @@ -16684,7 +16684,7 @@ set_horizontal_scroll_bar (struct window *w) showing point will be fully (as opposed to partially) visible on display. */ -static _Hot void +static void redisplay_window (Lisp_Object window, bool just_this_one_p) { struct window *w = XWINDOW (window); ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-16 20:50 ` Alex Gramiak @ 2019-04-16 21:11 ` Alex Gramiak 2019-04-16 21:27 ` Stefan Monnier 2019-04-16 21:27 ` Konstantin Kharlamov 2 siblings, 0 replies; 45+ messages in thread From: Alex Gramiak @ 2019-04-16 21:11 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 6707 bytes --] Alex Gramiak <agrambot@gmail.com> writes: > Perhaps I was too overzealous with the hot tagging? I definitely was, since I removed the _Hot attributes from the *malloc/free procedures, which turned out to be in bad judgment. Removing those yielded: Performance counter stats for 'src/emacs.hot-cold -f kill-emacs' (20 runs): 751.22 msec task-clock:u # 0.841 CPUs utilized ( +- 0.14% ) 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 12,928 page-faults:u # 0.017 M/sec ( +- 0.02% ) 2,959,142,867 cycles:u # 3.939 GHz ( +- 0.05% ) 1,350,640,657 stalled-cycles-frontend:u # 45.64% frontend cycles idle ( +- 0.12% ) 972,612,724 stalled-cycles-backend:u # 32.87% backend cycles idle ( +- 0.15% ) 4,865,119,525 instructions:u # 1.64 insn per cycle # 0.28 stalled cycles per insn ( +- 0.00% ) 1,035,582,401 branches:u # 1378.539 M/sec ( +- 0.00% ) 17,794,068 branch-misses:u # 1.72% of all branches ( +- 0.14% ) 1,206,398,515 L1-dcache-loads:u # 1605.925 M/sec ( +- 0.00% ) 42,095,141 L1-dcache-load-misses:u # 3.49% of all L1-dcache hits ( +- 0.05% ) 9,057,830 LLC-loads:u # 12.058 M/sec ( +- 0.39% ) <not supported> LLC-load-misses:u 0.89309 +- 0.00484 seconds time elapsed ( +- 0.54% ) I also realized that I completely forgot putting the attribute on emacs_abort. With the _Cold attribute on emacs_abort: Performance counter stats for 'src/emacs -f kill-emacs' (20 runs): 760.73 msec task-clock:u # 0.846 CPUs utilized ( +- 0.22% ) 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 12,925 page-faults:u # 0.017 M/sec ( +- 0.02% ) 2,991,729,656 cycles:u # 3.933 GHz ( +- 0.04% ) 1,388,332,047 stalled-cycles-frontend:u # 46.41% frontend cycles idle ( +- 0.13% ) 976,840,303 stalled-cycles-backend:u # 32.65% backend cycles idle ( +- 0.17% ) 4,867,077,504 instructions:u # 1.63 insn per cycle # 0.29 stalled cycles per insn ( +- 0.00% ) 1,036,158,051 branches:u # 1362.059 M/sec ( +- 0.00% ) 17,860,346 branch-misses:u # 1.72% of all branches ( +- 0.19% ) 1,207,924,887 L1-dcache-loads:u # 1587.852 M/sec ( +- 0.00% ) 42,358,754 L1-dcache-load-misses:u # 3.51% of all L1-dcache hits ( +- 0.06% ) 9,046,859 LLC-loads:u # 11.892 M/sec ( +- 0.32% ) <not supported> LLC-load-misses:u 0.89896 +- 0.00200 seconds time elapsed ( +- 0.22% ) Performance counter stats for 'src/emacs.cold -f kill-emacs' (20 runs): 755.78 msec task-clock:u # 0.845 CPUs utilized ( +- 0.18% ) 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 12,929 page-faults:u # 0.017 M/sec ( +- 0.02% ) 2,978,556,838 cycles:u # 3.941 GHz ( +- 0.05% ) 1,370,387,120 stalled-cycles-frontend:u # 46.01% frontend cycles idle ( +- 0.12% ) 978,514,384 stalled-cycles-backend:u # 32.85% backend cycles idle ( +- 0.16% ) 4,866,672,758 instructions:u # 1.63 insn per cycle # 0.28 stalled cycles per insn ( +- 0.00% ) 1,035,997,172 branches:u # 1370.762 M/sec ( +- 0.00% ) 17,838,674 branch-misses:u # 1.72% of all branches ( +- 0.13% ) 1,206,937,944 L1-dcache-loads:u # 1596.939 M/sec ( +- 0.00% ) 42,110,067 L1-dcache-load-misses:u # 3.49% of all L1-dcache hits ( +- 0.05% ) 9,088,714 LLC-loads:u # 12.026 M/sec ( +- 0.28% ) <not supported> LLC-load-misses:u 0.89401 +- 0.00185 seconds time elapsed ( +- 0.21% ) Performance counter stats for 'src/emacs.hot-cold -f kill-emacs' (20 runs): 752.56 msec task-clock:u # 0.846 CPUs utilized ( +- 0.18% ) 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 12,923 page-faults:u # 0.017 M/sec ( +- 0.01% ) 2,973,502,618 cycles:u # 3.951 GHz ( +- 0.05% ) 1,368,004,926 stalled-cycles-frontend:u # 46.01% frontend cycles idle ( +- 0.11% ) 974,077,949 stalled-cycles-backend:u # 32.76% backend cycles idle ( +- 0.13% ) 4,865,128,800 instructions:u # 1.64 insn per cycle # 0.28 stalled cycles per insn ( +- 0.00% ) 1,035,577,546 branches:u # 1376.070 M/sec ( +- 0.00% ) 17,721,035 branch-misses:u # 1.71% of all branches ( +- 0.17% ) 1,206,420,627 L1-dcache-loads:u # 1603.086 M/sec ( +- 0.00% ) 42,129,928 L1-dcache-load-misses:u # 3.49% of all L1-dcache hits ( +- 0.04% ) 9,033,444 LLC-loads:u # 12.004 M/sec ( +- 0.40% ) <not supported> LLC-load-misses:u 0.88928 +- 0.00161 seconds time elapsed ( +- 0.18% ) [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: hot/cold v2 --] [-- Type: text/x-patch, Size: 16571 bytes --] diff --git a/src/alloc.c b/src/alloc.c index dd783863be..21aba9e7a9 100644 --- a/src/alloc.c +++ b/src/alloc.c @@ -633,7 +633,7 @@ display_malloc_warning (void) \f /* Called if we can't allocate relocatable space for a buffer. */ -void +_Cold void buffer_memory_full (ptrdiff_t nbytes) { /* If buffers use the relocating allocator, no need to free @@ -2325,7 +2325,7 @@ compact_small_strings (void) current_sblock = tb; } -void +_Cold void string_overflow (void) { error ("Maximum string size exceeded"); @@ -4085,7 +4085,7 @@ set_interval_marked (INTERVAL i) either case this counts as memory being full even though malloc did not fail. */ -void +_Cold void memory_full (size_t nbytes) { /* Do not go into hysterics merely because a large request failed. */ diff --git a/src/bytecode.c b/src/bytecode.c index 40977799bf..e2fe7153b0 100644 --- a/src/bytecode.c +++ b/src/bytecode.c @@ -336,7 +336,7 @@ bcall0 (Lisp_Object f) ARGS are pushed on the stack according to ARGS_TEMPLATE before executing BYTESTR. */ -Lisp_Object +_Hot Lisp_Object exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, Lisp_Object args_template, ptrdiff_t nargs, Lisp_Object *args) { diff --git a/src/data.c b/src/data.c index 11cd598ed8..1291a92955 100644 --- a/src/data.c +++ b/src/data.c @@ -130,7 +130,7 @@ set_blv_valcell (struct Lisp_Buffer_Local_Value *blv, Lisp_Object val) blv->valcell = val; } -static _Noreturn void +static _Cold _Noreturn void wrong_length_argument (Lisp_Object a1, Lisp_Object a2, Lisp_Object a3) { Lisp_Object size1 = make_fixnum (bool_vector_size (a1)); @@ -142,7 +142,7 @@ wrong_length_argument (Lisp_Object a1, Lisp_Object a2, Lisp_Object a3) make_fixnum (bool_vector_size (a3))); } -_Noreturn void +_Cold _Noreturn void wrong_type_argument (register Lisp_Object predicate, register Lisp_Object value) { /* If VALUE is not even a valid Lisp object, we'd want to abort here @@ -155,25 +155,25 @@ wrong_type_argument (register Lisp_Object predicate, register Lisp_Object value) xsignal2 (Qwrong_type_argument, predicate, value); } -void +_Cold void pure_write_error (Lisp_Object obj) { xsignal2 (Qerror, build_string ("Attempt to modify read-only object"), obj); } -void +_Cold void args_out_of_range (Lisp_Object a1, Lisp_Object a2) { xsignal2 (Qargs_out_of_range, a1, a2); } -void +_Cold void args_out_of_range_3 (Lisp_Object a1, Lisp_Object a2, Lisp_Object a3) { xsignal3 (Qargs_out_of_range, a1, a2, a3); } -void +_Cold void circular_list (Lisp_Object list) { xsignal1 (Qcircular_list, list); @@ -1018,7 +1018,7 @@ do_symval_forwarding (lispfwd valcontents) /* Used to signal a user-friendly error when symbol WRONG is not a member of CHOICE, which should be a list of symbols. */ -void +_Cold void wrong_choice (Lisp_Object choice, Lisp_Object wrong) { ptrdiff_t i = 0, len = list_length (choice); @@ -1051,7 +1051,7 @@ wrong_choice (Lisp_Object choice, Lisp_Object wrong) /* Used to signal a user-friendly error if WRONG is not a number or integer/floating-point number outsize of inclusive MIN..MAX range. */ -static void +static _Cold void wrong_range (Lisp_Object min, Lisp_Object max, Lisp_Object wrong) { AUTO_STRING (value_should_be_from, "Value should be from "); diff --git a/src/dispnew.c b/src/dispnew.c index ccb08ec1b9..db9166cbe6 100644 --- a/src/dispnew.c +++ b/src/dispnew.c @@ -360,7 +360,7 @@ verify_row_hash (struct glyph_row *row) what is displayed on the screen. While this is usually fast, it leads to screen flickering. */ -static void +static _Hot void adjust_glyph_matrix (struct window *w, struct glyph_matrix *matrix, int x, int y, struct dim dim) { int i; @@ -2994,7 +2994,7 @@ window_to_frame_hpos (struct window *w, int hpos) /* Redraw frame F. */ -void +_Hot void redraw_frame (struct frame *f) { /* Error if F has no glyphs. */ @@ -3048,7 +3048,7 @@ DEFUN ("redraw-display", Fredraw_display, Sredraw_display, 0, 0, "", Value is true if redisplay was stopped due to pending input. */ -bool +_Hot bool update_frame (struct frame *f, bool force_p, bool inhibit_hairy_id_p) { /* True means display has been paused because of pending input. */ @@ -3384,7 +3384,7 @@ check_current_matrix_flags (struct window *w) /* Update display of window W. If FORCE_P, don't stop updating when input is pending. */ -static bool +static _Hot bool update_window (struct window *w, bool force_p) { struct glyph_matrix *desired_matrix = w->desired_matrix; @@ -3580,7 +3580,7 @@ update_marginal_area (struct window *w, struct glyph_row *updated_row, /* Update the display of the text area of row VPOS in window W. Value is true if display has changed. */ -static bool +static _Hot bool update_text_area (struct window *w, struct glyph_row *updated_row, int vpos) { struct glyph_row *current_row = MATRIX_ROW (w->current_matrix, vpos); @@ -4476,7 +4476,7 @@ scrolling_window (struct window *w, bool header_line_p) Value is true if update was stopped due to pending input. */ -static bool +static _Hot bool update_frame_1 (struct frame *f, bool force_p, bool inhibit_id_p, bool set_cursor_p, bool updating_menu_p) { @@ -4779,7 +4779,7 @@ count_match (struct glyph *str1, struct glyph *end1, struct glyph *str2, struct /* Perform a frame-based update on line VPOS in frame FRAME. */ -static void +static _Hot void update_frame_line (struct frame *f, int vpos, bool updating_menu_p) { struct glyph *obody, *nbody, *op1, *op2, *np1, *nend; diff --git a/src/eval.c b/src/eval.c index e9f118c5cb..eb3d856a8f 100644 --- a/src/eval.c +++ b/src/eval.c @@ -1707,25 +1707,25 @@ signal_or_quit (Lisp_Object error_symbol, Lisp_Object data, bool keyboard_quit) /* Like xsignal, but takes 0, 1, 2, or 3 args instead of a list. */ -void +_Cold void xsignal0 (Lisp_Object error_symbol) { xsignal (error_symbol, Qnil); } -void +_Cold void xsignal1 (Lisp_Object error_symbol, Lisp_Object arg) { xsignal (error_symbol, list1 (arg)); } -void +_Cold void xsignal2 (Lisp_Object error_symbol, Lisp_Object arg1, Lisp_Object arg2) { xsignal (error_symbol, list2 (arg1, arg2)); } -void +_Cold void xsignal3 (Lisp_Object error_symbol, Lisp_Object arg1, Lisp_Object arg2, Lisp_Object arg3) { xsignal (error_symbol, list3 (arg1, arg2, arg3)); @@ -1734,7 +1734,7 @@ xsignal3 (Lisp_Object error_symbol, Lisp_Object arg1, Lisp_Object arg2, Lisp_Obj /* Signal `error' with message S, and additional arg ARG. If ARG is not a proper list, make it a one-element list. */ -void +_Cold void signal_error (const char *s, Lisp_Object arg) { if (NILP (Fproper_list_p (arg))) @@ -1745,7 +1745,7 @@ signal_error (const char *s, Lisp_Object arg) /* Use this for arithmetic overflow, e.g., when an integer result is too large even for a bignum. */ -void +_Cold void overflow_error (void) { xsignal0 (Qoverflow_error); @@ -1892,7 +1892,7 @@ vformat_string (const char *m, va_list ap) } /* Dump an error message; called like vprintf. */ -void +_Cold void verror (const char *m, va_list ap) { xsignal1 (Qerror, vformat_string (m, ap)); @@ -1902,7 +1902,7 @@ verror (const char *m, va_list ap) /* Dump an error message; called like printf. */ /* VARARGS 1 */ -void +_Cold void error (const char *m, ...) { va_list ap; @@ -2171,7 +2171,7 @@ record_in_backtrace (Lisp_Object function, Lisp_Object *args, ptrdiff_t nargs) /* Eval a sub-expression of the current expression (i.e. in the same lexical scope). */ -Lisp_Object +_Hot Lisp_Object eval_sub (Lisp_Object form) { Lisp_Object fun, val, original_fun, original_args; @@ -2863,7 +2863,7 @@ usage: (funcall FUNCTION &rest ARGUMENTS) */) /* Apply a C subroutine SUBR to the NUMARGS evaluated arguments in ARG_VECTOR and return the result of evaluation. */ -Lisp_Object +_Hot Lisp_Object funcall_subr (struct Lisp_Subr *subr, ptrdiff_t numargs, Lisp_Object *args) { if (numargs < subr->min_args @@ -2942,7 +2942,7 @@ funcall_subr (struct Lisp_Subr *subr, ptrdiff_t numargs, Lisp_Object *args) } } -static Lisp_Object +static _Hot Lisp_Object apply_lambda (Lisp_Object fun, Lisp_Object args, ptrdiff_t count) { Lisp_Object *arg_vector; @@ -2978,7 +2978,7 @@ apply_lambda (Lisp_Object fun, Lisp_Object args, ptrdiff_t count) FUN must be either a lambda-expression, a compiled-code object, or a module function. */ -static Lisp_Object +static _Hot Lisp_Object funcall_lambda (Lisp_Object fun, ptrdiff_t nargs, register Lisp_Object *arg_vector) { @@ -3322,7 +3322,7 @@ do_specbind (struct Lisp_Symbol *sym, union specbinding *bind, i.e. bindings to the default value of a variable which can be buffer-local. */ -void +_Hot void specbind (Lisp_Object symbol, Lisp_Object value) { struct Lisp_Symbol *sym; @@ -3580,7 +3580,7 @@ set_unwind_protect_ptr (ptrdiff_t count, void (*func) (void *), void *arg) /* Pop and execute entries from the unwind-protect stack until the depth COUNT is reached. Return VALUE. */ -Lisp_Object +_Hot Lisp_Object unbind_to (ptrdiff_t count, Lisp_Object value) { Lisp_Object quitf = Vquit_flag; diff --git a/src/keyboard.c b/src/keyboard.c index 8fb6db987b..5569d0db2c 100644 --- a/src/keyboard.c +++ b/src/keyboard.c @@ -1038,7 +1038,7 @@ static Lisp_Object top_level_1 (Lisp_Object); This level has the catches for exiting/returning to editor command loop. It returns nil to exit recursive edit, t to abort it. */ -Lisp_Object +_Hot Lisp_Object command_loop (void) { #ifdef HAVE_STACK_OVERFLOW_HANDLING @@ -8856,7 +8856,7 @@ void init_raw_keybuf_count (void) If FIX_CURRENT_BUFFER, we restore current_buffer from the selected window's buffer. */ -static int +static _Hot int read_key_sequence (Lisp_Object *keybuf, Lisp_Object prompt, bool dont_downcase_last, bool can_return_switch_frame, bool fix_current_buffer, bool prevent_redisplay) diff --git a/src/lisp.h b/src/lisp.h index 681efc3b52..b98cad5e8e 100644 --- a/src/lisp.h +++ b/src/lisp.h @@ -34,6 +34,9 @@ along with GNU Emacs. If not, see <https://www.gnu.org/licenses/>. */ #include <intprops.h> #include <verify.h> +#define _Cold __attribute__((cold)) +#define _Hot __attribute__((hot)) + INLINE_HEADER_BEGIN /* Define a TYPE constant ID as an externally visible name. Use like this: @@ -621,7 +624,7 @@ extern Lisp_Object char_table_ref (Lisp_Object, int); extern void char_table_set (Lisp_Object, int, Lisp_Object); /* Defined in data.c. */ -extern _Noreturn void wrong_type_argument (Lisp_Object, Lisp_Object); +extern _Cold _Noreturn void wrong_type_argument (Lisp_Object, Lisp_Object); /* Defined in emacs.c. */ @@ -4095,18 +4098,18 @@ extern Lisp_Object run_hook_with_args (ptrdiff_t nargs, Lisp_Object *args, Lisp_Object (*funcall) (ptrdiff_t nargs, Lisp_Object *args)); extern Lisp_Object quit (void); -INLINE _Noreturn void +INLINE _Cold _Noreturn void xsignal (Lisp_Object error_symbol, Lisp_Object data) { Fsignal (error_symbol, data); } -extern _Noreturn void xsignal0 (Lisp_Object); -extern _Noreturn void xsignal1 (Lisp_Object, Lisp_Object); -extern _Noreturn void xsignal2 (Lisp_Object, Lisp_Object, Lisp_Object); -extern _Noreturn void xsignal3 (Lisp_Object, Lisp_Object, Lisp_Object, +extern _Cold _Noreturn void xsignal0 (Lisp_Object); +extern _Cold _Noreturn void xsignal1 (Lisp_Object, Lisp_Object); +extern _Cold _Noreturn void xsignal2 (Lisp_Object, Lisp_Object, Lisp_Object); +extern _Cold _Noreturn void xsignal3 (Lisp_Object, Lisp_Object, Lisp_Object, Lisp_Object); -extern _Noreturn void signal_error (const char *, Lisp_Object); -extern _Noreturn void overflow_error (void); +extern _Cold _Noreturn void signal_error (const char *, Lisp_Object); +extern _Cold _Noreturn void overflow_error (void); extern bool FUNCTIONP (Lisp_Object); extern Lisp_Object funcall_subr (struct Lisp_Subr *subr, ptrdiff_t numargs, Lisp_Object *arg_vector); extern Lisp_Object eval_sub (Lisp_Object form); @@ -4145,8 +4148,8 @@ extern void set_unwind_protect_ptr (ptrdiff_t, void (*) (void *), void *); extern Lisp_Object unbind_to (ptrdiff_t, Lisp_Object); extern void rebind_for_thread_switch (void); extern void unbind_for_thread_switch (struct thread_state *); -extern _Noreturn void error (const char *, ...) ATTRIBUTE_FORMAT_PRINTF (1, 2); -extern _Noreturn void verror (const char *, va_list) +extern _Cold _Noreturn void error (const char *, ...) ATTRIBUTE_FORMAT_PRINTF (1, 2); +extern _Cold _Noreturn void verror (const char *, va_list) ATTRIBUTE_FORMAT_PRINTF (1, 0); extern Lisp_Object vformat_string (const char *, va_list) ATTRIBUTE_FORMAT_PRINTF (1, 0); diff --git a/src/lread.c b/src/lread.c index 5f33fcd695..16ce4afd21 100644 --- a/src/lread.c +++ b/src/lread.c @@ -1912,7 +1912,7 @@ readevalloop_eager_expand_eval (Lisp_Object val, Lisp_Object macroexpand) START, END specify region to read in current buffer (from eval-region). If the input is not from a buffer, they must be nil. */ -static void +static _Hot void readevalloop (Lisp_Object readcharfun, struct infile *infile0, Lisp_Object sourcename, @@ -2736,7 +2736,7 @@ read_integer (Lisp_Object readcharfun, EMACS_INT radix) FIRST_IN_LIST is true if this is the first element of a list. */ -static Lisp_Object +static _Hot Lisp_Object read1 (Lisp_Object readcharfun, int *pch, bool first_in_list) { int c; @@ -4092,7 +4092,7 @@ check_obarray (Lisp_Object obarray) /* Intern symbol SYM in OBARRAY using bucket INDEX. */ -static Lisp_Object +static _Hot Lisp_Object intern_sym (Lisp_Object sym, Lisp_Object obarray, Lisp_Object index) { Lisp_Object *ptr; @@ -4116,7 +4116,7 @@ intern_sym (Lisp_Object sym, Lisp_Object obarray, Lisp_Object index) /* Intern a symbol with name STRING in OBARRAY using bucket INDEX. */ -Lisp_Object +_Hot Lisp_Object intern_driver (Lisp_Object string, Lisp_Object obarray, Lisp_Object index) { return intern_sym (Fmake_symbol (string), obarray, index); @@ -4125,7 +4125,7 @@ intern_driver (Lisp_Object string, Lisp_Object obarray, Lisp_Object index) /* Intern the C string STR: return a symbol with that name, interned in the current obarray. */ -Lisp_Object +_Hot Lisp_Object intern_1 (const char *str, ptrdiff_t len) { Lisp_Object obarray = check_obarray (Vobarray); diff --git a/src/sysdep.c b/src/sysdep.c index 57ea8220ca..84e118c250 100644 --- a/src/sysdep.c +++ b/src/sysdep.c @@ -2447,7 +2447,7 @@ emacs_backtrace (int backtrace_limit) } \f #ifndef HAVE_NTGUI -void +_Cold void emacs_abort (void) { terminate_due_to_signal (SIGABRT, 40); diff --git a/src/term.c b/src/term.c index a492276c88..472d8d19e5 100644 --- a/src/term.c +++ b/src/term.c @@ -4393,8 +4393,7 @@ use the Bourne shell command 'TERM=...; export TERM' (C-shell:\n\ return terminal; } - -static void +static _Cold void vfatal (const char *str, va_list ap) { fprintf (stderr, "emacs: "); @@ -4410,7 +4409,7 @@ vfatal (const char *str, va_list ap) Delete TERMINAL, then call error or fatal with str1 or str2, respectively, according to whether MUST_SUCCEED is true. */ -static void +static _Cold void maybe_fatal (bool must_succeed, struct terminal *terminal, const char *str1, const char *str2, ...) { @@ -4425,7 +4424,7 @@ maybe_fatal (bool must_succeed, struct terminal *terminal, verror (str1, ap); } -void +_Cold void fatal (const char *str, ...) { va_list ap; diff --git a/src/xdisp.c b/src/xdisp.c index a88fc698b8..aca6f09b05 100644 --- a/src/xdisp.c +++ b/src/xdisp.c @@ -13906,7 +13906,7 @@ do { if (polling_stopped_here) start_polling (); \ /* Perhaps in the future avoid recentering windows if it is not necessary; currently that causes some problems. */ -static void +static _Hot void redisplay_internal (void) { struct window *w = XWINDOW (selected_window); @@ -14925,7 +14925,7 @@ buffer_flipping_blocked_p (void) /* Redisplay all leaf windows in the window tree rooted at WINDOW. */ -static void +static _Hot void redisplay_windows (Lisp_Object window) { while (!NILP (window)) @@ -16684,7 +16684,7 @@ set_horizontal_scroll_bar (struct window *w) showing point will be fully (as opposed to partially) visible on display. */ -static void +static _Hot void redisplay_window (Lisp_Object window, bool just_this_one_p) { struct window *w = XWINDOW (window); ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-16 20:50 ` Alex Gramiak 2019-04-16 21:11 ` Alex Gramiak @ 2019-04-16 21:27 ` Stefan Monnier 2019-04-16 21:27 ` Konstantin Kharlamov 2 siblings, 0 replies; 45+ messages in thread From: Stefan Monnier @ 2019-04-16 21:27 UTC (permalink / raw) To: Alex Gramiak; +Cc: Paul Eggert, emacs-devel > Which again shows a slight improvement with the Cold attributes, and > still shows the hot attributes degrading performance. Perhaps I was too > overzealous with the hot tagging? I think it's pretty tricky to decide which functions should be "hot", so I'm not surprised you get worse results there: That matches the past experience with programmer-annotated likelihood ;-) For "cold" I think the meaning is pretty clear: assume this code is almost never executed which can both mean that functions that call it can be optimized under the assumption that those calls are unlikely, and at the same time spend less time optimizing the function (and move it "out of the way" to a different page). But for "hot", it's not clear whether it means "called very often" or "we spend a lot of time inside of it". E.g. it would never have occurred to me to mark redisplay_internal as "hot". Stefan ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-16 20:50 ` Alex Gramiak 2019-04-16 21:11 ` Alex Gramiak 2019-04-16 21:27 ` Stefan Monnier @ 2019-04-16 21:27 ` Konstantin Kharlamov 2019-04-18 8:25 ` Paul Eggert 2 siblings, 1 reply; 45+ messages in thread From: Konstantin Kharlamov @ 2019-04-16 21:27 UTC (permalink / raw) To: Alex Gramiak; +Cc: Paul Eggert, Stefan Monnier, emacs-devel FWIW I was in a similar search not so long ago, and I was told that e.g. "cold" attribute can sometimes produce unbearably slow code https://gcc.gnu.org/ml/gcc-help/2019-01/msg00035.html В Вт, апр 16, 2019 at 14:50, Alex Gramiak <agrambot@gmail.com> написал: > Paul Eggert <eggert@cs.ucla.edu> writes: > >> That being said, it might make sense for a few >> obviously-rarely-called >> functions like 'emacs-abort' to be marked with __attribute__ >> ((cold)), >> so long as we don't turn this into a mission to mark all cold >> functions >> (which would cost us more than it would benefit). That is what GCC >> itself does, with its own functions. However, I'd like to see >> performance figures. Could you try it out on the benchmark of 'cd >> lisp >> && time make compile-always'? > > Right, I agree that if used, they should be used sparingly. I tested > three versions a few times each with both 'make' and 'make -j4': > > a) Regular Emacs master. > b) The below diff with only the _Cold attribute > c) The below diff with both _Cold and _Hot attributes > > a) Normal > real 4:17.97s > user 3:57.18s > sys 20.394s > > real 1:17.67s > user 4:23.78s > sys 18.888s > > b) Cold > real 4:10.92s > user 3:50.34s > sys 20.178s > > real 1:15.77s > user 4:16.73s > sys 18.943s > > c) Hot/Cold > real 4:11.43s > user 3:51.07s > sys 19.961s > > real 1:16.01s > user 4:17.63s > sys 18.662s > > So not much of a difference. For some reason the Hot/Cold performed > consistently worse than Cold. > > I also tested startup/shutdown with perf: > > Performance counter stats for '../emacs-normal -f kill-emacs' (20 > runs): > > 762.17 msec task-clock:u # 0.844 CPUs > utilized ( +- 0.23% ) > 0 context-switches:u # 0.000 K/sec > 0 cpu-migrations:u # 0.000 K/sec > 12,941 page-faults:u # 0.017 M/sec > ( +- 0.01% ) > 2,998,322,125 cycles:u # 3.934 GHz > ( +- 0.06% ) > 1,392,869,413 stalled-cycles-frontend:u # 46.45% frontend > cycles idle ( +- 0.15% ) > 982,206,843 stalled-cycles-backend:u # 32.76% backend > cycles idle ( +- 0.18% ) > 4,874,186,825 instructions:u # 1.63 insn per > cycle > # 0.29 stalled > cycles per insn ( +- 0.01% ) > 1,037,929,374 branches:u # 1361.802 M/sec > ( +- 0.01% ) > 17,930,471 branch-misses:u # 1.73% of all > branches ( +- 0.16% ) > 1,209,539,215 L1-dcache-loads:u # 1586.960 M/sec > ( +- 0.01% ) > 42,346,229 L1-dcache-load-misses:u # 3.50% of all > L1-dcache hits ( +- 0.05% ) > 9,088,647 LLC-loads:u # 11.925 M/sec > ( +- 0.29% ) > <not supported> LLC-load-misses:u > > 0.90325 +- 0.00441 seconds time elapsed ( +- 0.49% ) > > > > Performance counter stats for '../emacs.cold -f kill-emacs' (20 > runs): > > 755.94 msec task-clock:u # 0.845 CPUs > utilized ( +- 0.24% ) > 0 context-switches:u # 0.000 K/sec > 0 cpu-migrations:u # 0.000 K/sec > 12,941 page-faults:u # 0.017 M/sec > ( +- 0.01% ) > 2,976,036,365 cycles:u # 3.937 GHz > ( +- 0.06% ) > 1,374,451,779 stalled-cycles-frontend:u # 46.18% frontend > cycles idle ( +- 0.14% ) > 990,227,732 stalled-cycles-backend:u # 33.27% backend > cycles idle ( +- 0.18% ) > 4,878,661,927 instructions:u # 1.64 insn per > cycle > # 0.28 stalled > cycles per insn ( +- 0.00% ) > 1,038,495,525 branches:u # 1373.782 M/sec > ( +- 0.00% ) > 17,859,906 branch-misses:u # 1.72% of all > branches ( +- 0.16% ) > 1,209,345,531 L1-dcache-loads:u # 1599.792 M/sec > ( +- 0.00% ) > 42,444,358 L1-dcache-load-misses:u # 3.51% of all > L1-dcache hits ( +- 0.06% ) > 9,204,368 LLC-loads:u # 12.176 M/sec > ( +- 0.41% ) > <not supported> LLC-load-misses:u > > 0.89430 +- 0.00217 seconds time elapsed ( +- 0.24% ) > > > Performance counter stats for '../emacs.hot-cold -f kill-emacs' (20 > runs): > > 761.97 msec task-clock:u # 0.845 CPUs > utilized ( +- 0.20% ) > 0 context-switches:u # 0.000 K/sec > 0 cpu-migrations:u # 0.000 K/sec > 12,947 page-faults:u # 0.017 M/sec > ( +- 0.01% ) > 2,989,750,359 cycles:u # 3.924 GHz > ( +- 0.04% ) > 1,383,312,275 stalled-cycles-frontend:u # 46.27% frontend > cycles idle ( +- 0.12% ) > 994,643,853 stalled-cycles-backend:u # 33.27% backend > cycles idle ( +- 0.13% ) > 4,879,318,990 instructions:u # 1.63 insn per > cycle > # 0.28 stalled > cycles per insn ( +- 0.00% ) > 1,038,584,045 branches:u # 1363.022 M/sec > ( +- 0.00% ) > 17,863,736 branch-misses:u # 1.72% of all > branches ( +- 0.13% ) > 1,209,327,347 L1-dcache-loads:u # 1587.103 M/sec > ( +- 0.00% ) > 42,501,374 L1-dcache-load-misses:u # 3.51% of all > L1-dcache hits ( +- 0.05% ) > 9,201,311 LLC-loads:u # 12.076 M/sec > ( +- 0.28% ) > <not supported> LLC-load-misses:u > > 0.90132 +- 0.00201 seconds time elapsed ( +- 0.22% ) > > > Which again shows a slight improvement with the Cold attributes, and > still shows the hot attributes degrading performance. Perhaps I was > too > overzealous with the hot tagging? > ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-16 21:27 ` Konstantin Kharlamov @ 2019-04-18 8:25 ` Paul Eggert 2019-04-18 8:43 ` Konstantin Kharlamov ` (2 more replies) 0 siblings, 3 replies; 45+ messages in thread From: Paul Eggert @ 2019-04-18 8:25 UTC (permalink / raw) To: Konstantin Kharlamov; +Cc: Stefan Monnier, Alex Gramiak, emacs-devel [-- Attachment #1: Type: text/plain, Size: 601 bytes --] Konstantin Kharlamov wrote: > I was told that e.g. "cold" > attribute can sometimes produce unbearably slow code > https://gcc.gnu.org/ml/gcc-help/2019-01/msg00035.html Although cold functions can be slow, it appears that overall it's a win for Emacs to mark _Noreturn error function declarations as cold: on my platform, 'make compile-always' ran about 1.3% faster. So I installed the attached patch into master. (Like Stefan, I'm wary of marking functions 'hot' so I didn't do that.) This patch also adds a convenience macro AVOID for the now-common pattern '_Noreturn ATTRIBUTE_COLD void'. [-- Attachment #2: 0001-Mark-_Noreturn-error-functions-as-cold.txt --] [-- Type: text/plain, Size: 27463 bytes --] From 6d6c55db2cdfb6b354873f17285a3f602e011817 Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert@cs.ucla.edu> Date: Thu, 18 Apr 2019 00:30:24 -0700 Subject: [PATCH] Mark _Noreturn error functions as cold MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit On my platform this made ‘make compile-always’ 1.3% faster. Suggested by Alex Gramiak in: https://lists.gnu.org/r/emacs-devel/2019-04/msg00684.html * configure.ac (nw): Don’t use -Wsuggest-attribute=cold. * lib-src/make-docfile.c (write_globals): Mark noreturn functions as cold. * src/callproc.c (exec_failed): * src/data.c (wrong_length_argument, wrong_type_argument): * src/emacs-module.c (module_abort): * src/emacs.c (terminate_due_to_signal): * src/eval.c (unwind_to_catch): * src/image.c (my_png_error, my_error_exit): * src/json.c (json_out_of_memory, json_parse_error): * src/keyboard.c (quit_throw_to_read_char, user_error): * src/lisp.h (die, wrong_type_argument, wrong_choice) (args_out_of_range, args_out_of_range_3, circular_list) (buffer_overflow, memory_full, buffer_memory_full) (string_overflow, xsignal, xsignal0, xsignal1, xsignal2) (xsignal3, signal_error, overflow_error, error, verror) (nsberror, report_file_errno, report_file_error) (report_file_notify_error, terminate_due_to_signal) (emacs_abort, fatal): * src/lread.c (load_error_old_style_backquotes) (end_of_file_error, invalid_syntax): * src/pdumper.c (error_unsupported_dump_object): * src/puresize.h (pure_write_error): * src/search.c (matcher_overflow): * src/sound.c (sound_perror, alsa_sound_perror): * src/sysdep.c (handle_arith_signal): * src/systime.h (time_overflow): * src/term.c (maybe_fatal, vfatal): * src/textprop.c (text_read_only): * src/timefns.c (invalid_time_zone_specification) (time_error, invalid_hz): * src/xterm.c (x_connection_closed): Use AVOID instead of _Noreturn void, so that it’s marked cold. * src/conf_post.h (__has_attribute_cold) [!__has_attribute]: New macro. (ATTRIBUTE_COLD): New macro. * src/frame.h (WINDOW_SYSTEM_RETURN): Add ATTRIBUTE_COLD. * src/lisp.h (AVOID): New macro. * src/xterm.c: Omit unnecessary static decls, so that we needn’t worry about which functions should be marked cold. (x_io_error_quitter): Mark as cold. --- configure.ac | 3 +++ lib-src/make-docfile.c | 2 ++ src/callproc.c | 2 +- src/conf_post.h | 7 +++++ src/data.c | 4 +-- src/emacs-module.c | 6 ++--- src/emacs.c | 2 +- src/eval.c | 2 +- src/frame.h | 2 +- src/image.c | 4 +-- src/json.c | 4 +-- src/keyboard.c | 6 ++--- src/lisp.h | 58 +++++++++++++++++++++--------------------- src/lread.c | 6 ++--- src/pdumper.c | 3 +-- src/puresize.h | 2 +- src/search.c | 2 +- src/sound.c | 4 +-- src/sysdep.c | 2 +- src/systime.h | 2 +- src/term.c | 7 +++-- src/textprop.c | 2 +- src/timefns.c | 6 ++--- src/xterm.c | 35 +++---------------------- 24 files changed, 75 insertions(+), 98 deletions(-) diff --git a/configure.ac b/configure.ac index 3cebf3d78c..9d39bdd76b 100644 --- a/configure.ac +++ b/configure.ac @@ -1068,6 +1068,9 @@ AC_DEFUN # Emacs's use of alloca inhibits protecting the stack. nw="$nw -Wstack-protector" + # Emacs's use of __attribute__ ((cold)) causes false alarms with this option. + nw="$nw -Wsuggest-attribute=cold" + # Emacs's use of partly-const functions such as Fgnutls_available_p # make this option problematic. nw="$nw -Wsuggest-attribute=const" diff --git a/lib-src/make-docfile.c b/lib-src/make-docfile.c index ccd245e013..4d25b0a6b9 100644 --- a/lib-src/make-docfile.c +++ b/lib-src/make-docfile.c @@ -747,6 +747,8 @@ write_globals (void) printf ("%d", globals[i].v.value); putchar (')'); + if (globals[i].flags & DEFUN_noreturn) + fputs (" ATTRIBUTE_COLD", stdout); if (globals[i].flags & DEFUN_const) fputs (" ATTRIBUTE_CONST", stdout); diff --git a/src/callproc.c b/src/callproc.c index 2cdf84d9a8..98c67312b4 100644 --- a/src/callproc.c +++ b/src/callproc.c @@ -1135,7 +1135,7 @@ add_env (char **env, char **new_env, char *string) mess up the allocator's data structures in the parent. Report the error and exit the child. */ -static _Noreturn void +static AVOID exec_failed (char const *name, int err) { /* Avoid deadlock if the child's perror writes to a full pipe; the diff --git a/src/conf_post.h b/src/conf_post.h index f8254cfa9d..6ea2c7b664 100644 --- a/src/conf_post.h +++ b/src/conf_post.h @@ -73,6 +73,7 @@ typedef bool bool_bf; # define __has_attribute(a) __has_attribute_##a # define __has_attribute_alloc_size GNUC_PREREQ (4, 3, 0) # define __has_attribute_cleanup GNUC_PREREQ (3, 4, 0) +# define __has_attribute_cold GNUC_PREREQ (4, 3, 0) # define __has_attribute_externally_visible GNUC_PREREQ (4, 1, 0) # define __has_attribute_no_address_safety_analysis false # define __has_attribute_no_sanitize_address GNUC_PREREQ (4, 8, 0) @@ -223,6 +224,12 @@ extern void _DebPrint (const char *fmt, ...); extern char *emacs_getenv_TZ (void); extern int emacs_setenv_TZ (char const *); +#if __has_attribute (cold) +# define ATTRIBUTE_COLD __attribute__ ((cold)) +#else +# define ATTRIBUTE_COLD +#endif + #if __GNUC__ >= 3 /* On GCC 3.0 we might get a warning. */ #define NO_INLINE __attribute__((noinline)) #else diff --git a/src/data.c b/src/data.c index 11cd598ed8..1b2431011e 100644 --- a/src/data.c +++ b/src/data.c @@ -130,7 +130,7 @@ set_blv_valcell (struct Lisp_Buffer_Local_Value *blv, Lisp_Object val) blv->valcell = val; } -static _Noreturn void +static AVOID wrong_length_argument (Lisp_Object a1, Lisp_Object a2, Lisp_Object a3) { Lisp_Object size1 = make_fixnum (bool_vector_size (a1)); @@ -142,7 +142,7 @@ wrong_length_argument (Lisp_Object a1, Lisp_Object a2, Lisp_Object a3) make_fixnum (bool_vector_size (a3))); } -_Noreturn void +AVOID wrong_type_argument (register Lisp_Object predicate, register Lisp_Object value) { /* If VALUE is not even a valid Lisp object, we'd want to abort here diff --git a/src/emacs-module.c b/src/emacs-module.c index 47ca3368c0..09a768e18e 100644 --- a/src/emacs-module.c +++ b/src/emacs-module.c @@ -147,8 +147,7 @@ static enum emacs_funcall_exit module_non_local_exit_check (emacs_env *); static void module_assert_thread (void); static void module_assert_runtime (struct emacs_runtime *); static void module_assert_env (emacs_env *); -static _Noreturn void module_abort (const char *format, ...) - ATTRIBUTE_FORMAT_PRINTF(1, 2); +static AVOID module_abort (const char *, ...) ATTRIBUTE_FORMAT_PRINTF (1, 2); static emacs_env *initialize_environment (emacs_env *, struct emacs_env_private *); static void finalize_environment (emacs_env *); @@ -1163,8 +1162,7 @@ init_module_assertions (bool enable) initialize_storage (&global_storage); } -static _Noreturn void -ATTRIBUTE_FORMAT_PRINTF(1, 2) +static AVOID ATTRIBUTE_FORMAT_PRINTF (1, 2) module_abort (const char *format, ...) { fputs ("Emacs module assertion: ", stderr); diff --git a/src/emacs.c b/src/emacs.c index 6ed4b0ed87..5eba88c74f 100644 --- a/src/emacs.c +++ b/src/emacs.c @@ -370,7 +370,7 @@ using_utf8 (void) /* Report a fatal error due to signal SIG, output a backtrace of at most BACKTRACE_LIMIT lines, and exit. */ -_Noreturn void +AVOID terminate_due_to_signal (int sig, int backtrace_limit) { signal (sig, SIG_DFL); diff --git a/src/eval.c b/src/eval.c index fa7b2d0603..c2e996a947 100644 --- a/src/eval.c +++ b/src/eval.c @@ -1133,7 +1133,7 @@ internal_catch (Lisp_Object tag, This is used for correct unwinding in Fthrow and Fsignal. */ -static _Noreturn void +static AVOID unwind_to_catch (struct handler *catch, Lisp_Object value) { bool last_time; diff --git a/src/frame.h b/src/frame.h index ec8f61465f..e2e8eaab9b 100644 --- a/src/frame.h +++ b/src/frame.h @@ -1262,7 +1262,7 @@ extern int frame_default_tool_bar_height; #ifdef HAVE_WINDOW_SYSTEM # define WINDOW_SYSTEM_RETURN #else -# define WINDOW_SYSTEM_RETURN _Noreturn +# define WINDOW_SYSTEM_RETURN _Noreturn ATTRIBUTE_COLD #endif extern WINDOW_SYSTEM_RETURN struct frame * diff --git a/src/image.c b/src/image.c index c1a23edefc..8e3a9a4930 100644 --- a/src/image.c +++ b/src/image.c @@ -6079,7 +6079,7 @@ init_png_functions (void) /* Error and warning handlers installed when the PNG library is initialized. */ -static _Noreturn void +static AVOID my_png_error (png_struct *png_ptr, const char *msg) { eassert (png_ptr != NULL); @@ -6718,7 +6718,7 @@ struct my_jpeg_error_mgr }; -static _Noreturn void +static AVOID my_error_exit (j_common_ptr cinfo) { struct my_jpeg_error_mgr *mgr = (struct my_jpeg_error_mgr *) cinfo->err; diff --git a/src/json.c b/src/json.c index 74e0534065..5917212899 100644 --- a/src/json.c +++ b/src/json.c @@ -255,7 +255,7 @@ json_encode (Lisp_Object string) return code_convert_string (string, Qutf_8_unix, Qt, true, true, true); } -static _Noreturn void +static AVOID json_out_of_memory (void) { xsignal0 (Qjson_out_of_memory); @@ -263,7 +263,7 @@ json_out_of_memory (void) /* Signal a Lisp error corresponding to the JSON ERROR. */ -static _Noreturn void +static AVOID json_parse_error (const json_error_t *error) { Lisp_Object symbol; diff --git a/src/keyboard.c b/src/keyboard.c index 8fb6db987b..dff8f6b2fc 100644 --- a/src/keyboard.c +++ b/src/keyboard.c @@ -364,7 +364,7 @@ static void restore_getcjmp (void *); static Lisp_Object apply_modifiers (int, Lisp_Object); static void restore_kboard_configuration (int); static void handle_interrupt (bool); -static _Noreturn void quit_throw_to_read_char (bool); +static AVOID quit_throw_to_read_char (bool); static void timer_start_idle (void); static void timer_stop_idle (void); static void timer_resume_idle (void); @@ -1131,13 +1131,12 @@ This also exits all active minibuffers. */ Fthrow (Qtop_level, Qnil); } -static _Noreturn void +static AVOID user_error (const char *msg) { xsignal1 (Quser_error, build_string (msg)); } -/* _Noreturn will be added to prototype by make-docfile. */ DEFUN ("exit-recursive-edit", Fexit_recursive_edit, Sexit_recursive_edit, 0, 0, "", doc: /* Exit from the innermost recursive edit or minibuffer. */ attributes: noreturn) @@ -1149,7 +1148,6 @@ DEFUN ("exit-recursive-edit", Fexit_recursive_edit, Sexit_recursive_edit, 0, 0, user_error ("No recursive edit is in progress"); } -/* _Noreturn will be added to prototype by make-docfile. */ DEFUN ("abort-recursive-edit", Fabort_recursive_edit, Sabort_recursive_edit, 0, 0, "", doc: /* Abort the command that requested this recursive edit or minibuffer input. */ attributes: noreturn) diff --git a/src/lisp.h b/src/lisp.h index 2915944ffe..25d0a3d6ac 100644 --- a/src/lisp.h +++ b/src/lisp.h @@ -166,6 +166,9 @@ typedef EMACS_UINT uprintmax_t; # define pD "t" #endif +/* Convenience macro for rarely-used functions that do not return. */ +#define AVOID _Noreturn ATTRIBUTE_COLD void + /* Extra internal type checking? */ /* Define Emacs versions of <assert.h>'s 'assert (COND)' and <verify.h>'s @@ -196,7 +199,7 @@ typedef EMACS_UINT uprintmax_t; # define eassume(cond) assume (cond) #else /* ENABLE_CHECKING */ -extern _Noreturn void die (const char *, const char *, int); +extern AVOID die (const char *, const char *, int); extern bool suppress_checking EXTERNALLY_VISIBLE; @@ -621,7 +624,7 @@ extern Lisp_Object char_table_ref (Lisp_Object, int); extern void char_table_set (Lisp_Object, int, Lisp_Object); /* Defined in data.c. */ -extern _Noreturn void wrong_type_argument (Lisp_Object, Lisp_Object); +extern AVOID wrong_type_argument (Lisp_Object, Lisp_Object); /* Defined in emacs.c. */ @@ -3528,7 +3531,7 @@ modiff_to_integer (modiff_count a) } /* Defined in data.c. */ -extern _Noreturn void wrong_choice (Lisp_Object, Lisp_Object); +extern AVOID wrong_choice (Lisp_Object, Lisp_Object); extern void notify_variable_watchers (Lisp_Object, Lisp_Object, Lisp_Object, Lisp_Object); extern Lisp_Object indirect_function (Lisp_Object); @@ -3555,10 +3558,9 @@ extern intmax_t cons_to_signed (Lisp_Object, intmax_t, intmax_t); extern uintmax_t cons_to_unsigned (Lisp_Object, uintmax_t); extern struct Lisp_Symbol *indirect_variable (struct Lisp_Symbol *); -extern _Noreturn void args_out_of_range (Lisp_Object, Lisp_Object); -extern _Noreturn void args_out_of_range_3 (Lisp_Object, Lisp_Object, - Lisp_Object); -extern _Noreturn void circular_list (Lisp_Object); +extern AVOID args_out_of_range (Lisp_Object, Lisp_Object); +extern AVOID args_out_of_range_3 (Lisp_Object, Lisp_Object, Lisp_Object); +extern AVOID circular_list (Lisp_Object); extern Lisp_Object do_symval_forwarding (lispfwd); enum Set_Internal_Bind { SET_INTERNAL_SET, @@ -3666,7 +3668,7 @@ extern void syms_of_json (void); /* Defined in insdel.c. */ extern void move_gap_both (ptrdiff_t, ptrdiff_t); -extern _Noreturn void buffer_overflow (void); +extern AVOID buffer_overflow (void); extern void make_gap (ptrdiff_t); extern void make_gap_1 (struct buffer *, ptrdiff_t); extern ptrdiff_t copy_text (const unsigned char *, unsigned char *, @@ -3766,8 +3768,8 @@ extern void *my_heap_start (void); extern void check_pure_size (void); extern void allocate_string_data (struct Lisp_String *, EMACS_INT, EMACS_INT); extern void malloc_warning (const char *); -extern _Noreturn void memory_full (size_t); -extern _Noreturn void buffer_memory_full (ptrdiff_t); +extern AVOID memory_full (size_t); +extern AVOID buffer_memory_full (ptrdiff_t); extern bool survives_gc_p (Lisp_Object); extern void mark_object (Lisp_Object); #if defined REL_ALLOC && !defined SYSTEM_MALLOC && !defined HYBRID_MALLOC @@ -3848,7 +3850,7 @@ list4i (EMACS_INT x, EMACS_INT y, EMACS_INT w, EMACS_INT h) extern Lisp_Object make_uninit_bool_vector (EMACS_INT); extern Lisp_Object bool_vector_fill (Lisp_Object, Lisp_Object); -extern _Noreturn void string_overflow (void); +extern AVOID string_overflow (void); extern Lisp_Object make_string (const char *, ptrdiff_t); extern Lisp_Object make_formatted_string (char *, const char *, ...) ATTRIBUTE_FORMAT_PRINTF (2, 3); @@ -4095,18 +4097,17 @@ extern Lisp_Object run_hook_with_args (ptrdiff_t nargs, Lisp_Object *args, Lisp_Object (*funcall) (ptrdiff_t nargs, Lisp_Object *args)); extern Lisp_Object quit (void); -INLINE _Noreturn void +INLINE AVOID xsignal (Lisp_Object error_symbol, Lisp_Object data) { Fsignal (error_symbol, data); } -extern _Noreturn void xsignal0 (Lisp_Object); -extern _Noreturn void xsignal1 (Lisp_Object, Lisp_Object); -extern _Noreturn void xsignal2 (Lisp_Object, Lisp_Object, Lisp_Object); -extern _Noreturn void xsignal3 (Lisp_Object, Lisp_Object, Lisp_Object, - Lisp_Object); -extern _Noreturn void signal_error (const char *, Lisp_Object); -extern _Noreturn void overflow_error (void); +extern AVOID xsignal0 (Lisp_Object); +extern AVOID xsignal1 (Lisp_Object, Lisp_Object); +extern AVOID xsignal2 (Lisp_Object, Lisp_Object, Lisp_Object); +extern AVOID xsignal3 (Lisp_Object, Lisp_Object, Lisp_Object, Lisp_Object); +extern AVOID signal_error (const char *, Lisp_Object); +extern AVOID overflow_error (void); extern bool FUNCTIONP (Lisp_Object); extern Lisp_Object funcall_subr (struct Lisp_Subr *subr, ptrdiff_t numargs, Lisp_Object *arg_vector); extern Lisp_Object eval_sub (Lisp_Object form); @@ -4145,8 +4146,8 @@ extern void set_unwind_protect_ptr (ptrdiff_t, void (*) (void *), void *); extern Lisp_Object unbind_to (ptrdiff_t, Lisp_Object); extern void rebind_for_thread_switch (void); extern void unbind_for_thread_switch (struct thread_state *); -extern _Noreturn void error (const char *, ...) ATTRIBUTE_FORMAT_PRINTF (1, 2); -extern _Noreturn void verror (const char *, va_list) +extern AVOID error (const char *, ...) ATTRIBUTE_FORMAT_PRINTF (1, 2); +extern AVOID verror (const char *, va_list) ATTRIBUTE_FORMAT_PRINTF (1, 0); extern Lisp_Object vformat_string (const char *, va_list) ATTRIBUTE_FORMAT_PRINTF (1, 0); @@ -4243,7 +4244,7 @@ extern void syms_of_editfns (void); /* Defined in buffer.c. */ extern bool mouse_face_overlay_overlaps (Lisp_Object); extern Lisp_Object disable_line_numbers_overlay_at_eob (void); -extern _Noreturn void nsberror (Lisp_Object); +extern AVOID nsberror (Lisp_Object); extern void adjust_overlays_for_insert (ptrdiff_t, ptrdiff_t); extern void adjust_overlays_for_delete (ptrdiff_t, ptrdiff_t); extern void fix_start_end_in_overlays (ptrdiff_t, ptrdiff_t); @@ -4286,9 +4287,9 @@ extern void close_file_unwind (int); extern void fclose_unwind (void *); extern void restore_point_unwind (Lisp_Object); extern Lisp_Object get_file_errno_data (const char *, Lisp_Object, int); -extern _Noreturn void report_file_errno (const char *, Lisp_Object, int); -extern _Noreturn void report_file_error (const char *, Lisp_Object); -extern _Noreturn void report_file_notify_error (const char *, Lisp_Object); +extern AVOID report_file_errno (const char *, Lisp_Object, int); +extern AVOID report_file_error (const char *, Lisp_Object); +extern AVOID report_file_notify_error (const char *, Lisp_Object); extern bool internal_delete_file (Lisp_Object); extern Lisp_Object emacs_readlinkat (int, const char *); extern bool file_directory_p (Lisp_Object); @@ -4409,7 +4410,7 @@ extern bool display_arg; #endif extern Lisp_Object decode_env_path (const char *, const char *, bool); extern Lisp_Object empty_unibyte_string, empty_multibyte_string; -extern _Noreturn void terminate_due_to_signal (int, int); +extern AVOID terminate_due_to_signal (int, int); #ifdef WINDOWSNT extern Lisp_Object Vlibrary_cache; #endif @@ -4574,7 +4575,7 @@ extern EMACS_INT get_random (void); extern void seed_random (void *, ptrdiff_t); extern void init_random (void); extern void emacs_backtrace (int); -extern _Noreturn void emacs_abort (void) NO_INLINE; +extern AVOID emacs_abort (void) NO_INLINE; extern int emacs_open (const char *, int, int); extern int emacs_pipe (int[2]); extern int emacs_close (int); @@ -4615,8 +4616,7 @@ extern Lisp_Object directory_files_internal (Lisp_Object, Lisp_Object, /* Defined in term.c. */ extern int *char_ins_del_vector; extern void syms_of_term (void); -extern _Noreturn void fatal (const char *msgid, ...) - ATTRIBUTE_FORMAT_PRINTF (1, 2); +extern AVOID fatal (const char *msgid, ...) ATTRIBUTE_FORMAT_PRINTF (1, 2); /* Defined in terminal.c. */ extern void syms_of_terminal (void); diff --git a/src/lread.c b/src/lread.c index 5f33fcd695..8cb4b63cc3 100644 --- a/src/lread.c +++ b/src/lread.c @@ -1019,7 +1019,7 @@ load_error_handler (Lisp_Object data) return Qnil; } -static _Noreturn void +static AVOID load_error_old_style_backquotes (void) { if (NILP (Vload_file_name)) @@ -1874,7 +1874,7 @@ readevalloop_1 (int old) /* Signal an `end-of-file' error, if possible with file name information. */ -static _Noreturn void +static AVOID end_of_file_error (void) { if (STRINGP (Vload_file_name)) @@ -2297,7 +2297,7 @@ read_internal_start (Lisp_Object stream, Lisp_Object start, Lisp_Object end) /* Signal Qinvalid_read_syntax error. S is error string of length N (if > 0) */ -static _Noreturn void +static AVOID invalid_syntax (const char *s) { xsignal1 (Qinvalid_read_syntax, build_string (s)); diff --git a/src/pdumper.c b/src/pdumper.c index 600c5b3ca3..2cc9af7f56 100644 --- a/src/pdumper.c +++ b/src/pdumper.c @@ -680,8 +680,7 @@ static void dump_remember_cold_op (struct dump_context *ctx, enum cold_op op, Lisp_Object arg); -_Noreturn -static void +static AVOID error_unsupported_dump_object (struct dump_context *ctx, Lisp_Object object, const char *msg) diff --git a/src/puresize.h b/src/puresize.h index f120a4b330..f5fad8b42b 100644 --- a/src/puresize.h +++ b/src/puresize.h @@ -77,7 +77,7 @@ INLINE_HEADER_BEGIN #define PURESIZE (BASE_PURESIZE * PURESIZE_RATIO * PURESIZE_CHECKING_RATIO) #endif -extern _Noreturn void pure_write_error (Lisp_Object); +extern AVOID pure_write_error (Lisp_Object); extern EMACS_INT pure[]; diff --git a/src/search.c b/src/search.c index a450e920b0..7a6e680685 100644 --- a/src/search.c +++ b/src/search.c @@ -73,7 +73,7 @@ static EMACS_INT search_buffer (Lisp_Object, ptrdiff_t, ptrdiff_t, Lisp_Object re_match_object; -static _Noreturn void +static AVOID matcher_overflow (void) { error ("Stack overflow in regexp matcher"); diff --git a/src/sound.c b/src/sound.c index 2b8715010e..4ba826e82c 100644 --- a/src/sound.c +++ b/src/sound.c @@ -297,7 +297,7 @@ static int do_play_sound (const char *, unsigned long); #ifndef WINDOWSNT /* Like perror, but signals an error. */ -static _Noreturn void +static AVOID sound_perror (const char *msg) { int saved_errno = errno; @@ -874,7 +874,7 @@ vox_write (struct sound_device *sd, const char *buffer, ptrdiff_t nbytes) #define DEFAULT_ALSA_SOUND_DEVICE "default" #endif -static _Noreturn void +static AVOID alsa_sound_perror (const char *msg, int err) { error ("%s: %s", msg, snd_strerror (err)); diff --git a/src/sysdep.c b/src/sysdep.c index 57ea8220ca..bc88e70dcb 100644 --- a/src/sysdep.c +++ b/src/sysdep.c @@ -1805,7 +1805,7 @@ deliver_fatal_thread_signal (int sig) deliver_thread_signal (sig, handle_fatal_signal); } -static _Noreturn void +static AVOID handle_arith_signal (int sig) { pthread_sigmask (SIG_SETMASK, &empty_mask, 0); diff --git a/src/systime.h b/src/systime.h index 9080cd2bba..89af0c5e3d 100644 --- a/src/systime.h +++ b/src/systime.h @@ -92,7 +92,7 @@ extern Lisp_Object make_lisp_time (struct timespec); extern bool list4_to_timespec (Lisp_Object, Lisp_Object, Lisp_Object, Lisp_Object, struct timespec *); extern struct timespec lisp_time_argument (Lisp_Object); -extern _Noreturn void time_overflow (void); +extern AVOID time_overflow (void); extern void init_timefns (void); extern void syms_of_timefns (void); diff --git a/src/term.c b/src/term.c index a492276c88..2de0a0e664 100644 --- a/src/term.c +++ b/src/term.c @@ -73,11 +73,10 @@ static void clear_tty_hooks (struct terminal *terminal); static void set_tty_hooks (struct terminal *terminal); static void dissociate_if_controlling_tty (int fd); static void delete_tty (struct terminal *); -static _Noreturn void maybe_fatal (bool, struct terminal *, - const char *, const char *, ...) +static AVOID maybe_fatal (bool, struct terminal *, const char *, const char *, + ...) ATTRIBUTE_FORMAT_PRINTF (3, 5) ATTRIBUTE_FORMAT_PRINTF (4, 5); -static _Noreturn void vfatal (const char *str, va_list ap) - ATTRIBUTE_FORMAT_PRINTF (1, 0); +static AVOID vfatal (const char *, va_list) ATTRIBUTE_FORMAT_PRINTF (1, 0); #define OUTPUT(tty, a) \ diff --git a/src/textprop.c b/src/textprop.c index bb063d3eaa..ae42c44185 100644 --- a/src/textprop.c +++ b/src/textprop.c @@ -65,7 +65,7 @@ static Lisp_Object interval_insert_in_front_hooks; /* Signal a `text-read-only' error. This function makes it easier to capture that error in GDB by putting a breakpoint on it. */ -static _Noreturn void +static AVOID text_read_only (Lisp_Object propval) { if (STRINGP (propval)) diff --git a/src/timefns.c b/src/timefns.c index 514fa24f8b..cb953d1b4c 100644 --- a/src/timefns.c +++ b/src/timefns.c @@ -172,7 +172,7 @@ emacs_localtime_rz (timezone_t tz, time_t const *t, struct tm *tm) return tm; } -static _Noreturn void +static AVOID invalid_time_zone_specification (Lisp_Object zone) { xsignal2 (Qerror, build_string ("Invalid time zone specification"), zone); @@ -337,7 +337,7 @@ time_overflow (void) error ("Specified time is not representable"); } -static _Noreturn void +static AVOID time_error (int err) { switch (err) @@ -348,7 +348,7 @@ time_error (int err) } } -static _Noreturn void +static AVOID invalid_hz (Lisp_Object hz) { xsignal2 (Qerror, build_string ("Invalid time frequency"), hz); diff --git a/src/xterm.c b/src/xterm.c index 0facb52454..0b83263a0e 100644 --- a/src/xterm.c +++ b/src/xterm.c @@ -1505,37 +1505,8 @@ x_draw_fringe_bitmap (struct window *w, struct glyph_row *row, struct draw_fring Glyph display ***********************************************************************/ - - -static void x_set_glyph_string_clipping (struct glyph_string *); -static void x_set_glyph_string_gc (struct glyph_string *); -static void x_draw_glyph_string_foreground (struct glyph_string *); -static void x_draw_composite_glyph_string_foreground (struct glyph_string *); -static void x_draw_glyph_string_box (struct glyph_string *); -static void x_draw_glyph_string (struct glyph_string *); -static _Noreturn void x_delete_glyphs (struct frame *, int); -static void x_compute_glyph_string_overhangs (struct glyph_string *); -static void x_set_cursor_gc (struct glyph_string *); -static void x_set_mode_line_face_gc (struct glyph_string *); -static void x_set_mouse_face_gc (struct glyph_string *); static bool x_alloc_lighter_color (struct frame *, Display *, Colormap, unsigned long *, double, int); -static void x_setup_relief_color (struct frame *, struct relief *, - double, int, unsigned long); -static void x_setup_relief_colors (struct glyph_string *); -static void x_draw_image_glyph_string (struct glyph_string *); -static void x_draw_image_relief (struct glyph_string *); -static void x_draw_image_foreground (struct glyph_string *); -#ifndef USE_CAIRO -static void x_draw_image_foreground_1 (struct glyph_string *, Pixmap); -#endif -static void x_clear_glyph_string_rect (struct glyph_string *, int, - int, int, int); -static void x_draw_relief_rect (struct frame *, int, int, int, int, - int, bool, bool, bool, bool, bool, - XRectangle *); -static void x_draw_box_rect (struct glyph_string *, int, int, int, int, - int, bool, bool, XRectangle *); static void x_scroll_bar_clear (struct frame *); #ifdef GLYPH_DEBUG @@ -3975,7 +3946,7 @@ x_shift_glyphs_for_insert (struct frame *f, int x, int y, int width, int height, for X frames. */ static void -x_delete_glyphs (struct frame *f, register int n) +x_delete_glyphs (struct frame *f, int n) { emacs_abort (); } @@ -9842,7 +9813,7 @@ static char *error_msg; /* Handle the loss of connection to display DPY. ERROR_MESSAGE is the text of an error message that lead to the connection loss. */ -static _Noreturn void +static AVOID x_connection_closed (Display *dpy, const char *error_message, bool ioerror) { struct x_display_info *dpyinfo = x_display_info_for_display (dpy); @@ -10005,7 +9976,7 @@ x_error_quitter (Display *display, XErrorEvent *event) It kills all frames on the display that we lost touch with. If that was the only one, it prints an error message and kills Emacs. */ -static _Noreturn int +static _Noreturn ATTRIBUTE_COLD int x_io_error_quitter (Display *display) { char buf[256]; -- 2.17.1 ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-18 8:25 ` Paul Eggert @ 2019-04-18 8:43 ` Konstantin Kharlamov 2019-04-18 13:47 ` Andy Moreton 2019-04-19 13:45 ` Alex Gramiak 2 siblings, 0 replies; 45+ messages in thread From: Konstantin Kharlamov @ 2019-04-18 8:43 UTC (permalink / raw) To: Paul Eggert; +Cc: Stefan Monnier, Alex Gramiak, emacs-devel Thanks. I wonder if "AVOID" macro may sound more clear about its intended use as "NEVER_RETURN" or "DIE"? On Чт, Apr 18, 2019 at 01:25, Paul Eggert <eggert@cs.ucla.edu> wrote: > Konstantin Kharlamov wrote: >> I was told that e.g. "cold" \x7fattribute can sometimes produce >> unbearably slow code >> \x7fLINKIFYecAaeFaEfDaHeHcbGGJDcIBBBcICeDCFEFHBBIda > > Although cold functions can be slow, it appears that overall it's a > win for Emacs to mark _Noreturn error function declarations as cold: > on my platform, 'make compile-always' ran about 1.3% faster. So I > installed the attached patch into master. (Like Stefan, I'm wary of > marking functions 'hot' so I didn't do that.) > > This patch also adds a convenience macro AVOID for the now-common > pattern '_Noreturn ATTRIBUTE_COLD void'. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-18 8:25 ` Paul Eggert 2019-04-18 8:43 ` Konstantin Kharlamov @ 2019-04-18 13:47 ` Andy Moreton 2019-04-18 17:27 ` Paul Eggert 2019-04-19 13:45 ` Alex Gramiak 2 siblings, 1 reply; 45+ messages in thread From: Andy Moreton @ 2019-04-18 13:47 UTC (permalink / raw) To: emacs-devel On Thu 18 Apr 2019, Paul Eggert wrote: > Konstantin Kharlamov wrote: >> I was told that e.g. "cold" attribute can sometimes produce unbearably slow >> code https://gcc.gnu.org/ml/gcc-help/2019-01/msg00035.html > > Although cold functions can be slow, it appears that overall it's a win for > Emacs to mark _Noreturn error function declarations as cold: on my platform, > 'make compile-always' ran about 1.3% faster. So I installed the attached patch > into master. (Like Stefan, I'm wary of marking functions 'hot' so I didn't do > that.) > > This patch also adds a convenience macro AVOID for the now-common pattern > '_Noreturn ATTRIBUTE_COLD void'. Please don't use this macro, as it makes the code much less readable. It is reasonable to have a macro that combines the '_Noreturn' and 'ATTRIBUTE_COLD' decorations, but that should not be combined with the return type. AndyM ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-18 13:47 ` Andy Moreton @ 2019-04-18 17:27 ` Paul Eggert 2019-04-18 17:56 ` Andy Moreton 0 siblings, 1 reply; 45+ messages in thread From: Paul Eggert @ 2019-04-18 17:27 UTC (permalink / raw) To: Andy Moreton; +Cc: emacs-devel On 4/18/19 6:47 AM, Andy Moreton wrote: > Please don't use this macro, as it makes the code much less readable. > > It is reasonable to have a macro that combines the '_Noreturn' and > 'ATTRIBUTE_COLD' decorations, but that should not be combined with the > return type. Actually I first tried it the way that you suggested, but found that having a single macro improved readability for me, partly because _Noreturn functions don't return anything so their return type doesn't matter except for static type checking. 'AVOID' lets a traditionalist reader easily see that the type 'void' is intended, while also connoting that the function is rarely used because calls to it are normally avoided. (I tried to shoehorn "cold" and "does not return" into the macro's name too, but couldn't come up with anything better than "AVOID_SKIJUMP". :-) ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-18 17:27 ` Paul Eggert @ 2019-04-18 17:56 ` Andy Moreton 2019-04-18 19:32 ` Paul Eggert 0 siblings, 1 reply; 45+ messages in thread From: Andy Moreton @ 2019-04-18 17:56 UTC (permalink / raw) To: emacs-devel On Thu 18 Apr 2019, Paul Eggert wrote: > On 4/18/19 6:47 AM, Andy Moreton wrote: >> Please don't use this macro, as it makes the code much less readable. >> >> It is reasonable to have a macro that combines the '_Noreturn' and >> 'ATTRIBUTE_COLD' decorations, but that should not be combined with the >> return type. > > Actually I first tried it the way that you suggested, but found that > having a single macro improved readability for me, partly because > _Noreturn functions don't return anything so their return type doesn't > matter except for static type checking. 'AVOID' lets a traditionalist > reader easily see that the type 'void' is intended, while also connoting > that the function is rarely used because calls to it are normally > avoided. (I tried to shoehorn "cold" and "does not return" into the > macro's name too, but couldn't come up with anything better than > "AVOID_SKIJUMP". :-) I had an entirely different impression: AVOID is a weird name for a return type, so where is the return type, and does this even compile ? In addition, using this macro makes it entirely non-obvious that the function has a "_NoReturn" specifier, making it harder for the reader to understand the intended semantics. I have no objection to adding "noreturn" and "noreturn_cold" macros if that helps, but the "AVOID" macro should be avoided as harmful to readability. AndyM ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-18 17:56 ` Andy Moreton @ 2019-04-18 19:32 ` Paul Eggert 0 siblings, 0 replies; 45+ messages in thread From: Paul Eggert @ 2019-04-18 19:32 UTC (permalink / raw) To: Andy Moreton; +Cc: emacs-devel On 4/18/19 10:56 AM, Andy Moreton wrote: > AVOID is a weird name for a > return type, so where is the return type, and does this even compile ? Heh. I had the same reaction many years ago when I first saw the EXFUN macro: where's the return type? AVOID is better than EXFUN in that it "looks" syntactically correct as a type name, and its unusual name provides a useful mnemonic (obviously it compiles). This is just a style issue. We can remove the definition of AVOID and replace all its uses with "_Noreturn ATTRIBUTE_COLD void", or we can rename AVOID to some other identifier, or we can define some other macro with some slightly-different meaning. Although I considered those alternatives and liked AVOID the best, I can see where other opinions might differ. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-18 8:25 ` Paul Eggert 2019-04-18 8:43 ` Konstantin Kharlamov 2019-04-18 13:47 ` Andy Moreton @ 2019-04-19 13:45 ` Alex Gramiak 2019-04-19 13:58 ` Konstantin Kharlamov 2019-04-19 17:33 ` Paul Eggert 2 siblings, 2 replies; 45+ messages in thread From: Alex Gramiak @ 2019-04-19 13:45 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel, Stefan Monnier, Konstantin Kharlamov [-- Attachment #1: Type: text/plain, Size: 1114 bytes --] Paul Eggert <eggert@cs.ucla.edu> writes: > Konstantin Kharlamov wrote: >> I was told that e.g. "cold" attribute can sometimes produce unbearably slow >> code https://gcc.gnu.org/ml/gcc-help/2019-01/msg00035.html > > Although cold functions can be slow, it appears that overall it's a win for > Emacs to mark _Noreturn error function declarations as cold: on my platform, > 'make compile-always' ran about 1.3% faster. So I installed the attached patch > into master. Thanks. What do you think about marking a few bytecode cases as cold/unused? I've attached a diff below that does that. Does it make a difference for you on your setup? It seems to slow Emacs down a slight bit for me, but I was hoping you might know why it would do so. Since the newly cold attributes should be unused, is this perhaps a GCC bug? > This patch also adds a convenience macro AVOID for the now-common pattern > '_Noreturn ATTRIBUTE_COLD void'. I'm not sure about the name. If I wasn't part of this discussion I might have thought AVOID meant that one should avoid usage of the procedure in new code. Not a big deal, of course. [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: cold attributes --] [-- Type: text/x-patch, Size: 2159 bytes --] diff --git a/src/bytecode.c b/src/bytecode.c index 40977799bf..17c4716fe0 100644 --- a/src/bytecode.c +++ b/src/bytecode.c @@ -656,6 +656,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, CASE (Bunbind_all): /* Obsolete. Never used. */ /* To unbind back to the beginning of this frame. Not used yet, but will be needed for tail-recursion elimination. */ + __attribute__((cold, unused)); unbind_to (count, Qnil); NEXT; @@ -749,6 +750,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, NEXT; CASE (Bsave_window_excursion): /* Obsolete since 24.1. */ + ATTRIBUTE_COLD; { ptrdiff_t count1 = SPECPDL_INDEX (); record_unwind_protect (restore_window_configuration, @@ -808,6 +810,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, } CASE (Bcondition_case): /* Obsolete since 24.4. */ + ATTRIBUTE_COLD; { Lisp_Object handlers = POP, body = POP; TOP = internal_lisp_condition_case (TOP, body, handlers); @@ -815,12 +818,14 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, } CASE (Btemp_output_buffer_setup): /* Obsolete since 24.1. */ + ATTRIBUTE_COLD; CHECK_STRING (TOP); temp_output_buffer_setup (SSDATA (TOP)); TOP = Vstandard_output; NEXT; CASE (Btemp_output_buffer_show): /* Obsolete since 24.1. */ + ATTRIBUTE_COLD; { Lisp_Object v1 = POP; temp_output_buffer_show (TOP); @@ -1138,6 +1143,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, NEXT; CASE (Binteractive_p): /* Obsolete since 24.1. */ + ATTRIBUTE_COLD; PUSH (call0 (intern ("interactive-p"))); NEXT; @@ -1342,6 +1348,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, /* Actually this is Bstack_ref with offset 0, but we use Bdup for that instead. */ /* CASE (Bstack_ref): */ + ATTRIBUTE_COLD; error ("Invalid byte opcode: op=%d, ptr=%"pD"d", op, pc - 1 - bytestr_data); ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-19 13:45 ` Alex Gramiak @ 2019-04-19 13:58 ` Konstantin Kharlamov 2019-04-19 14:45 ` Alex Gramiak 2019-04-19 17:33 ` Paul Eggert 1 sibling, 1 reply; 45+ messages in thread From: Konstantin Kharlamov @ 2019-04-19 13:58 UTC (permalink / raw) To: Alex Gramiak; +Cc: Paul Eggert, Stefan Monnier, emacs-devel On Пт, Apr 19, 2019 at 07:45, Alex Gramiak <agrambot@gmail.com> wrote: > Paul Eggert <eggert@cs.ucla.edu> writes: > >> Konstantin Kharlamov wrote: >>> I was told that e.g. "cold" attribute can sometimes produce >>> unbearably slow >>> code https://gcc.gnu.org/ml/gcc-help/2019-01/msg00035.html >> >> Although cold functions can be slow, it appears that overall it's a >> win for >> Emacs to mark _Noreturn error function declarations as cold: on my >> platform, >> 'make compile-always' ran about 1.3% faster. So I installed the >> attached patch >> into master. > > Thanks. What do you think about marking a few bytecode cases as > cold/unused? I've attached a diff below that does that. Does it make a > difference for you on your setup? It seems to slow Emacs down a slight > bit for me, but I was hoping you might know why it would do so. Since > the newly cold attributes should be unused, is this perhaps a GCC bug? The slowdown is to be expected, given that GCC devs said that "cold" produces very inefficient code. It worked in Paul's case because he marked as "cold" functions that never return. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-19 13:58 ` Konstantin Kharlamov @ 2019-04-19 14:45 ` Alex Gramiak 0 siblings, 0 replies; 45+ messages in thread From: Alex Gramiak @ 2019-04-19 14:45 UTC (permalink / raw) To: Konstantin Kharlamov; +Cc: Paul Eggert, Stefan Monnier, emacs-devel Konstantin Kharlamov <hi-angel@yandex.ru> writes: > The slowdown is to be expected, given that GCC devs said that "cold" produces > very inefficient code. I took his statement to mean that the code generated in cold branches are very inefficient. In this case that would refer to the obsolete/unused/error CASEs, which should not be getting called, so I don't believe the slowdown is to be expected. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-19 13:45 ` Alex Gramiak 2019-04-19 13:58 ` Konstantin Kharlamov @ 2019-04-19 17:33 ` Paul Eggert 2019-04-19 20:53 ` Alex Gramiak 2019-04-20 15:29 ` Andy Moreton 1 sibling, 2 replies; 45+ messages in thread From: Paul Eggert @ 2019-04-19 17:33 UTC (permalink / raw) To: Alex Gramiak; +Cc: emacs-devel, Stefan Monnier, Konstantin Kharlamov On 4/19/19 6:45 AM, Alex Gramiak wrote: > What do you think about marking a few bytecode cases as > cold/unused? I've attached a diff below that does that. Does it make a > difference for you on your setup? No, it generates the same machine code for me on Fedora 29 x86-64, gcc 8.3.1 20190223 (Red Hat 8.3.1-2), with gcc -O2. > It seems to slow Emacs down a slight > bit for me, but I was hoping you might know why it would do so. Since > the newly cold attributes should be unused, is this perhaps a GCC bug? Possibly. Or it could be just bad luck. I've often made small changes and noted slight slowdowns or speedups that are due to luck rather than any real effects. Sometimes a simple 'make bootstrap' reverses the slowdown or speedup; sometimes it's more complicated than that. I suppose it could be related to lucky alignment in executables, or to code that just happens to exceed the associativity of my L1 or TLB caches, or whatever. So one must view these performance measurements with some skepticism. >> This patch also adds a convenience macro AVOID for the now-common pattern >> '_Noreturn ATTRIBUTE_COLD void'. > I'm not sure about the name. If I wasn't part of this discussion I might > have thought AVOID meant that one should avoid usage of the procedure in > new code. Not a big deal, of course. Yes, I considered other names but they all had problems too, and AVOID worked the best for me of the names that I thought of. A mnemonic is that 'extern AVOID time_overflow (void);' means that execution typically avoids calling time_overflow. If someone can think of a better name that would be good. Or we can just get accustomed to AVOID; as you say, it's not that big a deal. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-19 17:33 ` Paul Eggert @ 2019-04-19 20:53 ` Alex Gramiak 2019-04-20 0:05 ` Alan Mackenzie 2019-04-20 15:29 ` Andy Moreton 1 sibling, 1 reply; 45+ messages in thread From: Alex Gramiak @ 2019-04-19 20:53 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel Thanks for testing it out. I'll take your advice about measurements in the future. Paul Eggert <eggert@cs.ucla.edu> writes: > Yes, I considered other names but they all had problems too, and AVOID > worked the best for me of the names that I thought of. A mnemonic is > that 'extern AVOID time_overflow (void);' means that execution typically > avoids calling time_overflow. If someone can think of a better name that > would be good. Or we can just get accustomed to AVOID; as you say, it's > not that big a deal. The only ones I can think of at the moment are RARE/RARELY, or a slight change to AVOIDS. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-19 20:53 ` Alex Gramiak @ 2019-04-20 0:05 ` Alan Mackenzie 2019-04-20 0:42 ` Paul Eggert 0 siblings, 1 reply; 45+ messages in thread From: Alan Mackenzie @ 2019-04-20 0:05 UTC (permalink / raw) To: Alex Gramiak; +Cc: Paul Eggert, emacs-devel Hello, Alex and Paul. On Fri, Apr 19, 2019 at 14:53:51 -0600, Alex Gramiak wrote: > Thanks for testing it out. I'll take your advice about measurements in > the future. > Paul Eggert <eggert@cs.ucla.edu> writes: > > Yes, I considered other names but they all had problems too, and AVOID > > worked the best for me of the names that I thought of. A mnemonic is > > that 'extern AVOID time_overflow (void);' means that execution typically > > avoids calling time_overflow. If someone can think of a better name that > > would be good. Or we can just get accustomed to AVOID; as you say, it's > > not that big a deal. > The only ones I can think of at the moment are RARE/RARELY, or a slight > change to AVOIDS. Can I ask that if anybody starts using a new attribute macro, that they update the value of c-noise-macro-names under c-mode in .dir-locals? That way, correct fontification will then continue to happen whilst editing our C files. Thanks! -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-20 0:05 ` Alan Mackenzie @ 2019-04-20 0:42 ` Paul Eggert 2019-04-20 19:46 ` Alan Mackenzie 0 siblings, 1 reply; 45+ messages in thread From: Paul Eggert @ 2019-04-20 0:42 UTC (permalink / raw) To: Alan Mackenzie, Alex Gramiak; +Cc: emacs-devel Alan Mackenzie wrote: > Can I ask that if anybody starts using a new attribute macro, that they > update the value of c-noise-macro-names under c-mode in .dir-locals? > That way, correct fontification will then continue to happen whilst > editing our C files. Can we put "ATTRIBUTE_.*" there? If not, perhaps we should. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-20 0:42 ` Paul Eggert @ 2019-04-20 19:46 ` Alan Mackenzie 0 siblings, 0 replies; 45+ messages in thread From: Alan Mackenzie @ 2019-04-20 19:46 UTC (permalink / raw) To: Paul Eggert; +Cc: Alex Gramiak, emacs-devel Hello, Paul. On Fri, Apr 19, 2019 at 17:42:50 -0700, Paul Eggert wrote: > Alan Mackenzie wrote: > > Can I ask that if anybody starts using a new attribute macro, that they > > update the value of c-noise-macro-names under c-mode in .dir-locals? > > That way, correct fontification will then continue to happen whilst > > editing our C files. > Can we put "ATTRIBUTE_.*" there? If not, perhaps we should. Not at the moment, no. Possibly because people might write regexps as loosely as your suggestion, and that would mangle CC Mode. Whats wanted would be something more precise like "ATTRIBUTE_[A-Z]*". For macros with semicolons, I allowed a regexp to be entered directly instead of a list of identifiers, but there it is mandatory that the regexp not match anything but valid identifiers. For noise macros, a regexp just seems less needed, somehow. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-19 17:33 ` Paul Eggert 2019-04-19 20:53 ` Alex Gramiak @ 2019-04-20 15:29 ` Andy Moreton 2019-04-20 15:57 ` Paul Eggert 1 sibling, 1 reply; 45+ messages in thread From: Andy Moreton @ 2019-04-20 15:29 UTC (permalink / raw) To: emacs-devel On Fri 19 Apr 2019, Paul Eggert wrote: > On 4/19/19 6:45 AM, Alex Gramiak wrote: >> What do you think about marking a few bytecode cases as >> cold/unused? I've attached a diff below that does that. Does it make a >> difference for you on your setup? > > No, it generates the same machine code for me on Fedora 29 x86-64, gcc > 8.3.1 20190223 (Red Hat 8.3.1-2), with gcc -O2. > >> It seems to slow Emacs down a slight >> bit for me, but I was hoping you might know why it would do so. Since >> the newly cold attributes should be unused, is this perhaps a GCC bug? > > Possibly. Or it could be just bad luck. I've often made small changes > and noted slight slowdowns or speedups that are due to luck rather than > any real effects. Sometimes a simple 'make bootstrap' reverses the > slowdown or speedup; sometimes it's more complicated than that. I > suppose it could be related to lucky alignment in executables, or to > code that just happens to exceed the associativity of my L1 or TLB > caches, or whatever. So one must view these performance measurements > with some skepticism. > >>> This patch also adds a convenience macro AVOID for the now-common pattern >>> '_Noreturn ATTRIBUTE_COLD void'. >> I'm not sure about the name. If I wasn't part of this discussion I might >> have thought AVOID meant that one should avoid usage of the procedure in >> new code. Not a big deal, of course. > > Yes, I considered other names but they all had problems too, and AVOID > worked the best for me of the names that I thought of. A mnemonic is > that 'extern AVOID time_overflow (void);' means that execution typically > avoids calling time_overflow. If someone can think of a better name that > would be good. Or we can just get accustomed to AVOID; as you say, it's > not that big a deal. If the patches from this discussion on marking code as cold or liely/unlikely does not result in improved code generation and measureable performance gains, then they should be dropped as needless clutter in the source code. Performance gains are useful, but changes that claim to improve performance should be accompanied by measurements to demonstrate the claimed benefits. Evidence to motivate these changes does not appear convincing. AndyM ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-20 15:29 ` Andy Moreton @ 2019-04-20 15:57 ` Paul Eggert 2019-04-20 16:03 ` Eli Zaretskii 2019-04-20 16:28 ` Óscar Fuentes 0 siblings, 2 replies; 45+ messages in thread From: Paul Eggert @ 2019-04-20 15:57 UTC (permalink / raw) To: Andy Moreton; +Cc: emacs-devel Andy Moreton wrote: > If the patches from this discussion on marking code as cold or > liely/unlikely does not result in improved code generation and > measureable performance gains, then they should be dropped as needless > clutter in the source code. As I mentioned in <https://lists.gnu.org/r/emacs-devel/2019-04/msg00767.html> the 'cold' changes that I installed into master improved overall performance on a largish practical benchmark (cd lisp && make compile-always) by 1.3% on my platform: Fedora 29, GCC 8.3.1 20190223 (Red Hat 8.3.1-2), AMD Phenom II X4 910e. The other changes proposed in this thread seem to not help performance (or even hurt it) and we have not installed them. I like to see significant benefits for performance-related changes where the effects are not obvious. For improvements where the generated code is "obviously" faster (fewer and simpler instructions, say), I typically don't bother with measurements as my own time is limited too. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-20 15:57 ` Paul Eggert @ 2019-04-20 16:03 ` Eli Zaretskii 2019-04-20 16:11 ` Paul Eggert 2019-04-20 16:28 ` Óscar Fuentes 1 sibling, 1 reply; 45+ messages in thread From: Eli Zaretskii @ 2019-04-20 16:03 UTC (permalink / raw) To: Paul Eggert; +Cc: andrewjmoreton, emacs-devel > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Sat, 20 Apr 2019 08:57:57 -0700 > Cc: emacs-devel@gnu.org > > As I mentioned in <https://lists.gnu.org/r/emacs-devel/2019-04/msg00767.html> > the 'cold' changes that I installed into master improved overall performance on > a largish practical benchmark (cd lisp && make compile-always) by 1.3% on my > platform FWIW, I consider 1.3% performance improvement as insignificant for all practical purposes. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-20 16:03 ` Eli Zaretskii @ 2019-04-20 16:11 ` Paul Eggert 2019-04-20 16:18 ` Eli Zaretskii 0 siblings, 1 reply; 45+ messages in thread From: Paul Eggert @ 2019-04-20 16:11 UTC (permalink / raw) To: Eli Zaretskii; +Cc: andrewjmoreton, emacs-devel Eli Zaretskii wrote: > I consider 1.3% performance improvement as insignificant for all > practical purposes. When I'm talking about performance measurements I try to use the word "significant" in its usual scientific sense <https://en.wikipedia.org/wiki/Statistical_significance>. Agreed that 1.3% is no big deal by itself, but if one can make a series of 1.3% performance improvements that build on each other, their effects multiply and the overall effort has worthwhile practical benefits. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-20 16:11 ` Paul Eggert @ 2019-04-20 16:18 ` Eli Zaretskii 2019-04-20 16:57 ` Paul Eggert 0 siblings, 1 reply; 45+ messages in thread From: Eli Zaretskii @ 2019-04-20 16:18 UTC (permalink / raw) To: Paul Eggert; +Cc: andrewjmoreton, emacs-devel > Cc: andrewjmoreton@gmail.com, emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Sat, 20 Apr 2019 09:11:30 -0700 > > Eli Zaretskii wrote: > > I consider 1.3% performance improvement as insignificant for all > > practical purposes. > > When I'm talking about performance measurements I try to use the word > "significant" in its usual scientific sense > <https://en.wikipedia.org/wiki/Statistical_significance>. Of course. I didn't mean to imply that the improvement was insignificant statistically, only that it's insignificant in practice. > Agreed that 1.3% is no big deal by itself, but if one can make a series of 1.3% > performance improvements that build on each other, their effects multiply and > the overall effort has worthwhile practical benefits. Sure, but the effect on code readability also multiplies, right? ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-20 16:18 ` Eli Zaretskii @ 2019-04-20 16:57 ` Paul Eggert 2019-04-20 17:22 ` Eli Zaretskii 0 siblings, 1 reply; 45+ messages in thread From: Paul Eggert @ 2019-04-20 16:57 UTC (permalink / raw) To: Eli Zaretskii; +Cc: andrewjmoreton, emacs-devel Eli Zaretskii wrote: > Sure, but the effect on code readability also multiplies, right? Yes, of course one must weigh cost and benefits. In this case the patch not only improved performance, it also made the code smaller and in my opinion more readable: the total source code size shrank by 1775 bytes. Source-code improvements add up too, and we should encourage them even if they're small, as they are a welcome partial antidote to the bloat that often afflicts older software projects. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-20 16:57 ` Paul Eggert @ 2019-04-20 17:22 ` Eli Zaretskii 2019-04-20 19:13 ` Paul Eggert 0 siblings, 1 reply; 45+ messages in thread From: Eli Zaretskii @ 2019-04-20 17:22 UTC (permalink / raw) To: Paul Eggert; +Cc: andrewjmoreton, emacs-devel > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Sat, 20 Apr 2019 09:57:14 -0700 > Cc: andrewjmoreton@gmail.com, emacs-devel@gnu.org > > In this case the patch not only improved performance, it also made > the code smaller and in my opinion more readable: the total source > code size shrank by 1775 bytes. I meant code in C, not the produced machine code, so the size is not really relevant. I'm not sure how do you see that the code became more readable: we now have AVOID where previously we had _Noreturn, and ATTRIBUTE_COLD where we previously had nothing. Both make the code slightly less readable, IMO (because the reader must first learn what they mean). ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-20 17:22 ` Eli Zaretskii @ 2019-04-20 19:13 ` Paul Eggert 0 siblings, 0 replies; 45+ messages in thread From: Paul Eggert @ 2019-04-20 19:13 UTC (permalink / raw) To: Eli Zaretskii; +Cc: andrewjmoreton, emacs-devel Eli Zaretskii wrote: >> In this case the patch not only improved performance, it also made >> the code smaller and in my opinion more readable: the total source >> code size shrank by 1775 bytes. > I meant code in C, not the produced machine code Yes, that's what I meant too. That is, the patch made the C code shorter. > I'm not sure how do you see that the code became more readable It's more readable to me partly because the common case is shorter, e.g.: -static _Noreturn void module_abort (const char *format, ...) - ATTRIBUTE_FORMAT_PRINTF(1, 2); +static AVOID module_abort (const char *, ...) ATTRIBUTE_FORMAT_PRINTF (1, 2); and partly because the unusual case is more obviously unusual. For example, in: -static _Noreturn void +static AVOID x_connection_closed (Display *dpy, const char *error_message, bool ioerror) ... -static _Noreturn int +static _Noreturn ATTRIBUTE_COLD int x_io_error_quitter (Display *display) the x_connection_closed closed is obviously typical (it uses the now-common "AVOID"), whereas the x_io_error_quitter is obviously atypical (one normally doesn't see ATTRIBUTE_COLD in .c files), and this lets the reader more-easily see that x_io_error_quitter (a) returns 'int' and (b) never returns anything, which makes it even more of an oddball than the usual AVOID functions are. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-20 15:57 ` Paul Eggert 2019-04-20 16:03 ` Eli Zaretskii @ 2019-04-20 16:28 ` Óscar Fuentes 2019-04-20 18:58 ` Paul Eggert 1 sibling, 1 reply; 45+ messages in thread From: Óscar Fuentes @ 2019-04-20 16:28 UTC (permalink / raw) To: emacs-devel Paul Eggert <eggert@cs.ucla.edu> writes: > I like to see significant benefits for performance-related changes > where the effects are not obvious. For improvements where the > generated code is "obviously" faster (fewer and simpler instructions, > say), I typically don't bother with measurements as my own time is > limited too. You will be surprised. Modern hardware is complex. > Agreed that 1.3% is no big deal by itself, but if one can make a > series of 1.3% performance improvements that build on each other, > their effects multiply and the overall effort has worthwhile practical > benefits. If the 1.3% improvement in performance requires non-minimal source code complexity growth, IMHO it is not worhwile, at least on Emacs' case. Another problem with this type of hacks is that they tend to dilute over time, because compilers/architectures evolve. And because people change the code without testing that those pesky declarations end in the right place; it becomes a kind of cargo-cult. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-20 16:28 ` Óscar Fuentes @ 2019-04-20 18:58 ` Paul Eggert 2019-04-20 19:35 ` Óscar Fuentes 0 siblings, 1 reply; 45+ messages in thread From: Paul Eggert @ 2019-04-20 18:58 UTC (permalink / raw) To: Óscar Fuentes; +Cc: emacs-devel Óscar Fuentes wrote: >> For improvements where the >> generated code is "obviously" faster (fewer and simpler instructions, >> say), I typically don't bother with measurements as my own time is >> limited too. > You will be surprised. > > Modern hardware is complex. I'm often surprised :-), but I don't expect to be surprised when the generated code seems obviously faster to me. Although I'm not as expert as the computer architecture researcher whose office sits next to mine, I know how modern hardware works reasonably well and regularly give lectures on topics like μops and TLBs so I won't be surprised as often as a naive programmer might be. > If the 1.3% improvement in performance requires non-minimal source code > complexity growth It doesn't. In this case the source code shrank slightly. Only very slightly: just by 0.001% (this counts all files under Git control). Still, a win's a win. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-20 18:58 ` Paul Eggert @ 2019-04-20 19:35 ` Óscar Fuentes 2019-04-20 22:54 ` Paul Eggert 0 siblings, 1 reply; 45+ messages in thread From: Óscar Fuentes @ 2019-04-20 19:35 UTC (permalink / raw) To: emacs-devel Paul Eggert <eggert@cs.ucla.edu> writes: > Óscar Fuentes wrote: >>> For improvements where the >>> generated code is "obviously" faster (fewer and simpler instructions, >>> say), I typically don't bother with measurements as my own time is >>> limited too. >> You will be surprised. >> >> Modern hardware is complex. > > I'm often surprised :-), but I don't expect to be surprised when the > generated code seems obviously faster to me. Although I'm not as > expert as the computer architecture researcher whose office sits next > to mine, I know how modern hardware works reasonably well and > regularly give lectures on topics like μops and TLBs so I won't be > surprised as often as a naive programmer might be. I served many years as an Assembler programmer replacing C code where it was considered too slow so, excuse my arrogance, I'm far from being a naive programmer. I've seen many times cases where removing code (deleting unnecessary instructions and even full procedure calls) made the program run noticeably slower. "noticeably" as in "you don't need a stopwatch to notice it". Also, I've seen plenty of speedups on my development machine that vanished or even reversed when tried on a different microarchitecture, or the same microarchitecture with different cache sizes, or even the same CPU with different RAM modules. On our time of multi-level caches, branch prediction, speculative execution and what-not, acting as if your hard-earned 1% speedup will apply to most machines out there is a risky bet. >> If the 1.3% improvement in performance requires non-minimal source >> code complexity growth > > It doesn't. In this case the source code shrank slightly. Only very > slightly: just by 0.001% (this counts all files under Git control). > Still, a win's a win. Yea, but now everybody has to learn about and, worse, deal with those non-standard macros. That's added complexity. And I'm pretty sure that the speedup will vanish as code is modified and compilers and CPUs change. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-20 19:35 ` Óscar Fuentes @ 2019-04-20 22:54 ` Paul Eggert 0 siblings, 0 replies; 45+ messages in thread From: Paul Eggert @ 2019-04-20 22:54 UTC (permalink / raw) To: emacs-devel Óscar Fuentes wrote: > I'm pretty sure that > the speedup will vanish as code is modified and compilers and CPUs > change. I agree that if you disable GCC optimization or do lots of other plausible-but-low-performance things, that patch will not help performance. However, I'm skeptical that the patch would make performance worse in ordinary production builds on current Emacs platforms. And I wasn't surprised that the patch was a significant (albeit small) win on the platform I tested on, because I looked at the machine code that it generated and I knew something of my platform's architecture. Of course I could be unpleasantly surprised about how well this performance win generalizes to other platforms - and as I wrote, I am often surprised. To become surprised, though, we need to see some numbers. > I'm far from being a naive programmer. I didn't think you were naive, and apologize if my comments could be interpreted that way. My comments were intended only to make it clear that I'm not a naive programmer. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Using __builtin_expect (likely/unlikely macros) 2019-04-15 1:18 ` Paul Eggert 2019-04-15 3:11 ` Alex Gramiak @ 2020-04-15 3:14 ` John Carter 1 sibling, 0 replies; 45+ messages in thread From: John Carter @ 2020-04-15 3:14 UTC (permalink / raw) To: Paul Eggert; +Cc: Alex Gramiak, emacs-devel [-- Attachment #1: Type: text/plain, Size: 1264 bytes --] As a small datapoint I recently converted the largish (about 2megasloc) code base I maintain to using likely/unlikely. I have been very pleasantly surprised by how much more readable it made the __builtin's.. And then surprised again by how much more readable it makes code in general. I now occasionally add them to my code, not so much as an optimization hint, but as documentation to indicate the expected "Happy Path". In fact, so much so, I find interesting those cases where I consider it.... and then say, meh, I don't know which path is more likely... On Mon, Apr 15, 2019 at 1:19 PM Paul Eggert <eggert@cs.ucla.edu> wrote: > Alex Gramiak wrote: > > the likely/unlikely > > macros are a nice way to indicate that a branch is exceedingly > > rare/common. > > The cost (in making the C code harder to read, write and maintain) so > often > exceeds that benefit that I'd rather avoid these macros unless there's a > good > performance case for putting them in. > > -- John Carter Phone : (64)(3) 358 6639 Tait Electronics PO Box 1645 Christchurch New Zealand -- This Communication is Confidential. We only send and receive email on the basis of the terms set out at www.taitradio.com/email_disclaimer <http://www.taitradio.com/email_disclaimer> [-- Attachment #2: Type: text/html, Size: 1949 bytes --] ^ permalink raw reply [flat|nested] 45+ messages in thread
end of thread, other threads:[~2020-04-15 3:14 UTC | newest] Thread overview: 45+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2019-04-15 0:15 Using __builtin_expect (likely/unlikely macros) Alex Gramiak 2019-04-15 1:18 ` Paul Eggert 2019-04-15 3:11 ` Alex Gramiak 2019-04-15 4:41 ` Paul Eggert 2019-04-16 0:16 ` Alex Gramiak 2019-04-16 2:34 ` Eli Zaretskii 2019-04-16 5:33 ` Alex Gramiak 2019-04-16 15:23 ` Eli Zaretskii 2019-04-16 15:47 ` Alex Gramiak 2019-04-16 3:42 ` Paul Eggert 2019-04-16 13:05 ` Stefan Monnier 2019-04-16 15:22 ` Paul Eggert 2019-04-16 16:10 ` Alex Gramiak 2019-04-16 17:54 ` Paul Eggert 2019-04-16 20:50 ` Alex Gramiak 2019-04-16 21:11 ` Alex Gramiak 2019-04-16 21:27 ` Stefan Monnier 2019-04-16 21:27 ` Konstantin Kharlamov 2019-04-18 8:25 ` Paul Eggert 2019-04-18 8:43 ` Konstantin Kharlamov 2019-04-18 13:47 ` Andy Moreton 2019-04-18 17:27 ` Paul Eggert 2019-04-18 17:56 ` Andy Moreton 2019-04-18 19:32 ` Paul Eggert 2019-04-19 13:45 ` Alex Gramiak 2019-04-19 13:58 ` Konstantin Kharlamov 2019-04-19 14:45 ` Alex Gramiak 2019-04-19 17:33 ` Paul Eggert 2019-04-19 20:53 ` Alex Gramiak 2019-04-20 0:05 ` Alan Mackenzie 2019-04-20 0:42 ` Paul Eggert 2019-04-20 19:46 ` Alan Mackenzie 2019-04-20 15:29 ` Andy Moreton 2019-04-20 15:57 ` Paul Eggert 2019-04-20 16:03 ` Eli Zaretskii 2019-04-20 16:11 ` Paul Eggert 2019-04-20 16:18 ` Eli Zaretskii 2019-04-20 16:57 ` Paul Eggert 2019-04-20 17:22 ` Eli Zaretskii 2019-04-20 19:13 ` Paul Eggert 2019-04-20 16:28 ` Óscar Fuentes 2019-04-20 18:58 ` Paul Eggert 2019-04-20 19:35 ` Óscar Fuentes 2019-04-20 22:54 ` Paul Eggert 2020-04-15 3:14 ` John Carter
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).