* Lisp_Marker size on 32bit systems
From: Stefan Monnier @ 2018-09-06 0:41 UTC
To: emacs-devel

The new Lisp_Marker is larger than the old one on 32bit systems:
sizeof (struct Lisp_Marker) used to be 24 (bytes) when we used Lisp_Misc,
but it is now 32 (bytes) instead!

The reason seems to be that the vectorlike_header (which contains just
a simple int) occupies 8 bytes!  So those 8 bytes, plus 20 bytes of
actual real data, lead to 28 bytes, which are rounded up to 32 for
alignment purposes.

Why does vectorlike_header occupy 8 bytes?  Because we use

    union vectorlike_header
      {
        ptrdiff_t size;
        /* Align the union so that there is no padding after it.  */
        Lisp_Object align;
        GCALIGNED_UNION
      };

where GCALIGNED_UNION forces alignment on a multiple of 8 and hence
a minimum size of 8 as well.

So, on 32bit hosts, our vectorlike_header carries 4 bytes of useful info
but occupies 8 bytes anyway.  This sucks.

This misfeature was introduced by the following commit:

    commit b1573a97e17b518723ab3f906eb6d521caed196d
    Author: Paul Eggert <eggert@cs.ucla.edu>
    Date:   Mon Nov 13 08:51:41 2017 -0800

        Use alignas to fix GCALIGN-related bugs

Could we get this fixed, to reduce the overhead of our vectors on 32bit
hosts (including bringing Lisp_Marker back to 24 bytes)?


        Stefan
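The size arithmetic in this message can be checked with a small stand-in. This is only a sketch under stated assumptions: the type names are hypothetical, int32_t plays the role of a 32-bit ptrdiff_t, and five int32_t fields stand in for the marker's 20 bytes of real data. None of it is the Emacs definition.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical stand-ins; int32_t plays the role of a 32-bit ptrdiff_t.  */

/* Header with the alignment member: alignment 8, hence size 8.  */
union header_gcaligned
{
  int32_t size;
  char alignas (8) gcaligned;
};

/* Header without it: only the 4 useful bytes.  */
union header_plain
{
  int32_t size;
};

/* 20 bytes of "real data" placed after each kind of header.  */
struct marker_with_aligned_header
{
  union header_gcaligned header;
  int32_t data[5];
};

struct marker_with_plain_header
{
  union header_plain header;
  int32_t data[5];
};

/* 4 useful bytes, 8 occupied.  */
static_assert (sizeof (union header_gcaligned) == 8, "padded header");
/* 8 + 20 = 28, rounded up to a multiple of 8.  */
static_assert (sizeof (struct marker_with_aligned_header) == 32, "28 -> 32");
/* 4 + 20 = 24, no rounding needed.  */
static_assert (sizeof (struct marker_with_plain_header) == 24, "24");
```

The static assertions make the compiler itself confirm the 8/32/24 figures, in the same spirit as the verify checks in lisp.h.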
* Re: Lisp_Marker size on 32bit systems
From: Paul Eggert @ 2018-09-06 6:51 UTC
To: Stefan Monnier, emacs-devel

Stefan Monnier wrote:
> used to be 24 (bytes) when we used Lisp_Misc, but it is now 32 (bytes) instead!

I'll take a look at it.  I was hoping those 8 bytes wouldn't make enough
difference to worry about, but evidently not....
* Re: Lisp_Marker size on 32bit systems
From: Stefan Monnier @ 2018-09-06 12:17 UTC
To: Paul Eggert
Cc: emacs-devel

>> used to be 24 (bytes) when we used Lisp_Misc, but it is now 32
>> (bytes) instead!
> I'll take a look at it.

Thanks.  AFAICT the only solution is to use the GCALIGNED_UNION trick in
each and every "real Lisp_Object struct" rather than once and for all in
vectorlike_header.

Maybe we should use a LISP_STRUCT macro like

    #define LISP_STRUCT(name, fields) \
      struct name { union { struct { fields } s; GCALIGNED_UNION; } u; }

> I was hoping those 8 bytes wouldn't make enough
> difference to worry about, but evidently not....

It's really the 4 extra padding bytes incurred by all vectorlikes that
annoy me.  The resulting extra 8 bytes in markers is just a symptom ;-)


        Stefan
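One way to read the suggestion in this message: leave the header at its natural size and put the alignment member in a union wrapped around each struct's fields. A minimal sketch, with hypothetical names throughout (LISP_STRUCT_SKETCH, GCALIGNMENT_SKETCH, marker_sketch) and int32_t standing in for a 32-bit ptrdiff_t. It is a variation on the macro sketched above, not code from Emacs.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical stand-in for the alignment the GC needs.  */
#define GCALIGNMENT_SKETCH 8

/* Wrap FIELDS in a union whose extra member forces the alignment.
   FIELDS must not contain top-level commas, since they would split
   the macro argument.  */
#define LISP_STRUCT_SKETCH(name, fields)                \
  struct name                                           \
  {                                                     \
    union                                               \
    {                                                   \
      struct { fields } s;                              \
      char alignas (GCALIGNMENT_SKETCH) gcaligned;      \
    } u;                                                \
  }

/* A marker-like object: 4-byte header plus 20 bytes of data.  */
LISP_STRUCT_SKETCH (marker_sketch, int32_t header; int32_t data[5];);

/* The struct is 8-aligned, yet no padding is added: 24 is already
   a multiple of 8.  */
static_assert (alignof (struct marker_sketch) == 8, "GC-aligned");
static_assert (sizeof (struct marker_sketch) == 24, "no padding");
```

The cost of this shape is that every field access goes through the wrapper, as in m.u.s.header, which is one reason a per-struct attribute may be more convenient where the compiler supports it.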
* Re: Lisp_Marker size on 32bit systems
From: Paul Eggert @ 2018-09-07 7:15 UTC
To: Stefan Monnier
Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1176 bytes --]

Stefan Monnier wrote:

> Thanks.  AFAICT the only solution is to use the GCALIGNED_UNION trick in
> each and every "real Lisp_Object struct" rather than once and forall in
> vectorlike_header.

The trick does need to move out of union vectorlike_header.  However, the trick
is not needed for most of those structs, since they're allocated only by the GC
and are therefore already GC-aligned.  The trick is needed only for structs that
C might allocate statically or on the stack, and whose addresses are tagged as
Lisp pointers.  Just a few types do that, and I've noted them in the first
attached patch.

Although the first attached patch shrinks sizeof (struct Lisp_Marker) from 32 to
24 bytes on x86 as requested, allocate_pseudovector still *allocates* 32 bytes
for the struct, as it rounds the size up to the next multiple of alignof
(max_align_t), which is 16 on x86.  It's not hard to change that to 8 (please see
2nd attached patch) but this causes a 20% CPU performance hit (!) to 'make
compile-always' on my platform (AMD Phenom II X4 910e circa 2010, Fedora 28
x86-64, gcc -m32 -march=native), so I didn't install and can't recommend the 2nd
attached patch.

[-- Attachment #2: 0001-Shrink-pseudovectors-a-bit.patch --]
[-- Type: text/x-patch, Size: 15465 bytes --]

From 2ccf72b1af5eef8746b9b6facb7c09e6258afb90 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Thu, 6 Sep 2018 19:17:14 -0700
Subject: [PATCH] Shrink pseudovectors a bit

sizeof (struct Lisp_Marker) was 32 on x86, where 24 would do.
Problem noted by Stefan Monnier in: https://lists.gnu.org/r/emacs-devel/2018-09/msg00165.html * src/bignum.h (struct Lisp_Bignum): * src/frame.h (struct frame): * src/lisp.h (struct Lisp_Vector, struct Lisp_Bool_Vector) (struct Lisp_Char_Table, struct Lisp_Hash_Table) (struct Lisp_Marker, struct Lisp_Overlay) (struct Lisp_Misc_Ptr, struct Lisp_User_Ptr) (struct Lisp_Finalizer, struct Lisp_Float) (struct Lisp_Module_Function): * src/process.h (struct Lisp_Process): * src/termhooks.h (struct terminal): * src/thread.h (struct thread_state, struct Lisp_Mutex) (struct Lisp_CondVar): * src/window.c (struct save_window_data): * src/window.h (struct window): * src/xterm.h (struct scroll_bar): * src/xwidget.h (struct xwidget, struct xwidget_view): Add GCALIGNED_STRUCT attribute. * src/lisp.h (GCALIGNED_UNION_MEMBER): Renamed from GCALIGNED_UNION. All uses changed. (GCALIGNED_STRUCT_MEMBER, GCALIGNED_STRUCT, GCALIGNED): New macros. All uses of open-coded GCALIGNED changed to use GCALIGNED. (union vectorlike_header): No longer GC-aligned. (PSEUDOVECSIZE): Yield 0 for pseudovectors without Lisp objects that place a member before where the first Lisp object member would be. --- src/alloc.c | 8 +++-- src/bignum.h | 2 +- src/fileio.c | 4 +-- src/frame.h | 2 +- src/keymap.c | 4 +-- src/lisp.h | 90 ++++++++++++++++++++++++++++++------------------- src/process.h | 2 +- src/termhooks.h | 2 +- src/thread.h | 6 ++-- src/window.c | 2 +- src/window.h | 2 +- src/xterm.h | 2 +- src/xwidget.h | 4 +-- 13 files changed, 76 insertions(+), 54 deletions(-) diff --git a/src/alloc.c b/src/alloc.c index 28ca7804ee..abb98a9eb6 100644 --- a/src/alloc.c +++ b/src/alloc.c @@ -641,9 +641,11 @@ buffer_memory_full (ptrdiff_t nbytes) implement Lisp objects; since pseudovectors can contain any C type, this is max_align_t. On recent GNU/Linux x86 and x86-64 this can often waste up to 8 bytes, since alignof (max_align_t) is 16 but - typical vectors need only an alignment of 8. 
However, it is not - worth the hassle to avoid this waste. */ -enum { LISP_ALIGNMENT = alignof (union { max_align_t x; GCALIGNED_UNION }) }; + typical vectors need only an alignment of 8. Although shrinking + the alignment to 8 would save memory, it cost a 20% hit to Emacs + CPU performance on Fedora 28 x86-64 when compiled with gcc -m32. */ +enum { LISP_ALIGNMENT = alignof (union { max_align_t x; + GCALIGNED_UNION_MEMBER }) }; verify (LISP_ALIGNMENT % GCALIGNMENT == 0); /* True if malloc (N) is known to return storage suitably aligned for diff --git a/src/bignum.h b/src/bignum.h index 0e38c615ee..6551549343 100644 --- a/src/bignum.h +++ b/src/bignum.h @@ -39,7 +39,7 @@ struct Lisp_Bignum { union vectorlike_header header; mpz_t value; -}; +} GCALIGNED_STRUCT; extern mpz_t mpz[4]; diff --git a/src/fileio.c b/src/fileio.c index 66b2333317..5ca7c595f7 100644 --- a/src/fileio.c +++ b/src/fileio.c @@ -3394,9 +3394,9 @@ union read_non_regular int fd; ptrdiff_t inserted, trytry; } s; - GCALIGNED_UNION + GCALIGNED_UNION_MEMBER }; -verify (alignof (union read_non_regular) % GCALIGNMENT == 0); +verify (GCALIGNED (union read_non_regular)); static Lisp_Object read_non_regular (Lisp_Object state) diff --git a/src/frame.h b/src/frame.h index a3bb633e57..ad7376a653 100644 --- a/src/frame.h +++ b/src/frame.h @@ -578,7 +578,7 @@ struct frame enum ns_appearance_type ns_appearance; bool_bf ns_transparent_titlebar; #endif -}; +} GCALIGNED_STRUCT; /* Most code should use these functions to set Lisp fields in struct frame. 
*/ diff --git a/src/keymap.c b/src/keymap.c index 52db7b491f..79dce15a81 100644 --- a/src/keymap.c +++ b/src/keymap.c @@ -554,9 +554,9 @@ union map_keymap Lisp_Object args; void *data; } s; - GCALIGNED_UNION + GCALIGNED_UNION_MEMBER }; -verify (alignof (union map_keymap) % GCALIGNMENT == 0); +verify (GCALIGNED (union map_keymap)); static void map_keymap_char_table_item (Lisp_Object args, Lisp_Object key, Lisp_Object val) diff --git a/src/lisp.h b/src/lisp.h index 78c25f97dc..7e365e8f47 100644 --- a/src/lisp.h +++ b/src/lisp.h @@ -229,7 +229,7 @@ extern bool suppress_checking EXTERNALLY_VISIBLE; USE_LSB_TAG not only requires the least 3 bits of pointers returned by malloc to be 0 but also needs to be able to impose a mult-of-8 alignment on some non-GC Lisp_Objects, all of which are aligned via - GCALIGNED_UNION at the end of a union. */ + GCALIGNED_UNION_MEMBER, GCALIGNED_STRUCT_MEMBER, and GCALIGNED_STRUCT. */ enum Lisp_Bits { @@ -282,7 +282,35 @@ error !; # define GCALIGNMENT 1 #endif -#define GCALIGNED_UNION char alignas (GCALIGNMENT) gcaligned; +/* If a struct is always allocated by the GC and is therefore always + GC-aligned, put GCALIGNED_STRUCT after its closing '}'; this can + help the compiler generate better code. + + To cause a union to have alignment of at least GCALIGNMENT, put + GCALIGNED_UNION_MEMBER in its member list. Similarly for a struct + and GCALIGNED_STRUCT_MEMBER, although this may make the struct a + bit bigger on non-GCC platforms. Any struct using + GCALIGNED_STRUCT_MEMBER should also use GCALIGNED_STRUCT. + + Although these macros are reasonably portable, they are not + guaranteed on non-GCC platforms, as C11 does not require support + for alignment to GCALIGNMENT and older compilers may ignore + alignment requests. For any type T where garbage collection + requires alignment, use verify (GCALIGNED (T)) to verify the + requirement on the current platform. 
Types need this check if + their objects can be allocated outside the garbage collector. For + example, struct Lisp_Symbol needs the check because of lispsym and + struct Lisp_Cons needs it because of STACK_CONS. */ + +#define GCALIGNED_UNION_MEMBER char alignas (GCALIGNMENT) gcaligned; +#if HAVE_STRUCT_ATTRIBUTE_ALIGNED +# define GCALIGNED_STRUCT_MEMBER +# define GCALIGNED_STRUCT __attribute__ ((aligned (GCALIGNMENT))) +#else +# define GCALIGNED_STRUCT_MEMBER GCALIGNED_UNION_MEMBER +# define GCALIGNED_STRUCT +#endif +#define GCALIGNED(type) (alignof (type) % GCALIGNMENT == 0) /* Lisp_Word is a scalar word suitable for holding a tagged pointer or integer. Usually it is a pointer to a deliberately-incomplete type @@ -751,10 +779,10 @@ struct Lisp_Symbol /* Next symbol in obarray bucket, if the symbol is interned. */ struct Lisp_Symbol *next; } s; - GCALIGNED_UNION + GCALIGNED_UNION_MEMBER } u; }; -verify (alignof (struct Lisp_Symbol) % GCALIGNMENT == 0); +verify (GCALIGNED (struct Lisp_Symbol)); /* Declare a Lisp-callable function. The MAXARGS parameter has the same meaning as in the DEFUN macro, and is used to construct a prototype. */ @@ -843,7 +871,9 @@ typedef EMACS_UINT Lisp_Word_tag; and PSEUDOVECTORP cast their pointers to union vectorlike_header *, because when two such pointers potentially alias, a compiler won't incorrectly reorder loads and stores to their size fields. See - Bug#8546. */ + Bug#8546. This union formerly contained more members, and there's + no compelling reason to change it to a struct merely because the + number of members has been reduced to one. */ union vectorlike_header { /* The main member contains various pieces of information: @@ -866,20 +896,7 @@ union vectorlike_header Current layout limits the pseudovectors to 63 PVEC_xxx subtypes, 4095 Lisp_Objects in GC-ed area and 4095 word-sized other slots. */ ptrdiff_t size; - /* Align the union so that there is no padding after it. 
- This is needed for the following reason: - If the alignment constraint of Lisp_Object is greater than the size of - vectorlike_header (e.g. with-wide-int), vectorlike objects which have - 0 Lisp_Object fields and whose 1st field has a smaller alignment - constraint than Lisp_Object may end up with their 1st field "before - pseudovector index 0", in which case PSEUDOVECSIZE will return - a "negative" number. We could fix PSEUDOVECSIZE, but it's easier to - just force rounding up the size of vectorlike_header to the alignment - of Lisp_Object. */ - Lisp_Object align; - GCALIGNED_UNION }; -verify (alignof (union vectorlike_header) % GCALIGNMENT == 0); INLINE bool (SYMBOLP) (Lisp_Object x) @@ -1251,10 +1268,10 @@ struct Lisp_Cons struct Lisp_Cons *chain; } u; } s; - GCALIGNED_UNION + GCALIGNED_UNION_MEMBER } u; }; -verify (alignof (struct Lisp_Cons) % GCALIGNMENT == 0); +verify (GCALIGNED (struct Lisp_Cons)); INLINE bool (NILP) (Lisp_Object x) @@ -1373,10 +1390,10 @@ struct Lisp_String unsigned char *data; } s; struct Lisp_String *next; - GCALIGNED_UNION + GCALIGNED_UNION_MEMBER } u; }; -verify (alignof (struct Lisp_String) % GCALIGNMENT == 0); +verify (GCALIGNED (struct Lisp_String)); INLINE bool STRINGP (Lisp_Object x) @@ -1507,7 +1524,7 @@ struct Lisp_Vector { union vectorlike_header header; Lisp_Object contents[FLEXIBLE_ARRAY_MEMBER]; - }; + } GCALIGNED_STRUCT; INLINE bool (VECTORLIKEP) (Lisp_Object x) @@ -1599,7 +1616,7 @@ struct Lisp_Bool_Vector The bits are in little-endian order in the bytes, and the bytes are in little-endian order in the words. */ bits_word data[FLEXIBLE_ARRAY_MEMBER]; - }; + } GCALIGNED_STRUCT; /* Some handy constants for calculating sizes and offsets, mostly of vectorlike objects. */ @@ -1765,7 +1782,8 @@ memclear (void *p, ptrdiff_t nbytes) ones that the GC needs to trace). */ #define PSEUDOVECSIZE(type, nonlispfield) \ - ((offsetof (type, nonlispfield) - header_size) / word_size) + (offsetof (type, nonlispfield) < header_size \ + ? 
0 : (offsetof (type, nonlispfield) - header_size) / word_size) /* Compute A OP B, using the unsigned comparison operator OP. A and B should be integer expressions. This is not the same as @@ -1830,7 +1848,7 @@ struct Lisp_Char_Table /* These hold additional data. It is a vector. */ Lisp_Object extras[FLEXIBLE_ARRAY_MEMBER]; - }; + } GCALIGNED_STRUCT; INLINE bool CHAR_TABLE_P (Lisp_Object a) @@ -1942,7 +1960,9 @@ struct Lisp_Subr const char *symbol_name; const char *intspec; EMACS_INT doc; - }; + GCALIGNED_STRUCT_MEMBER + } GCALIGNED_STRUCT; +verify (GCALIGNED (struct Lisp_Subr)); INLINE bool SUBRP (Lisp_Object a) @@ -2194,7 +2214,7 @@ struct Lisp_Hash_Table /* Next weak hash table if this is a weak hash table. The head of the list is in weak_hash_tables. */ struct Lisp_Hash_Table *next_weak; -}; +} GCALIGNED_STRUCT; INLINE bool @@ -2313,7 +2333,7 @@ struct Lisp_Marker used to implement the functionality of markers, but rather to (ab)use markers as a cache for char<->byte mappings). */ ptrdiff_t bytepos; -}; +} GCALIGNED_STRUCT; /* START and END are markers in the overlay's buffer, and PLIST is the overlay's property list. */ @@ -2335,13 +2355,13 @@ struct Lisp_Overlay Lisp_Object end; Lisp_Object plist; struct Lisp_Overlay *next; - }; + } GCALIGNED_STRUCT; struct Lisp_Misc_Ptr { union vectorlike_header header; void *pointer; - }; + } GCALIGNED_STRUCT; extern Lisp_Object make_misc_ptr (void *); @@ -2388,7 +2408,7 @@ struct Lisp_User_Ptr union vectorlike_header header; void (*finalizer) (void *); void *p; -}; +} GCALIGNED_STRUCT; #endif /* A finalizer sentinel. */ @@ -2404,7 +2424,7 @@ struct Lisp_Finalizer /* Circular list of all active weak references. 
*/ struct Lisp_Finalizer *prev; struct Lisp_Finalizer *next; - }; + } GCALIGNED_STRUCT; INLINE bool FINALIZERP (Lisp_Object x) @@ -2616,7 +2636,7 @@ struct Lisp_Float double data; struct Lisp_Float *chain; } u; - }; + } GCALIGNED_STRUCT; INLINE bool (FLOATP) (Lisp_Object x) @@ -3946,7 +3966,7 @@ struct Lisp_Module_Function ptrdiff_t min_arity, max_arity; emacs_subr subr; void *data; -}; +} GCALIGNED_STRUCT; INLINE bool MODULE_FUNCTIONP (Lisp_Object o) diff --git a/src/process.h b/src/process.h index 6bc22146a7..3c6dd7b91f 100644 --- a/src/process.h +++ b/src/process.h @@ -203,7 +203,7 @@ struct Lisp_Process bool_bf gnutls_p : 1; bool_bf gnutls_complete_negotiation_p : 1; #endif -}; + } GCALIGNED_STRUCT; INLINE bool PROCESSP (Lisp_Object a) diff --git a/src/termhooks.h b/src/termhooks.h index 8b5f648b43..211429169b 100644 --- a/src/termhooks.h +++ b/src/termhooks.h @@ -661,7 +661,7 @@ struct terminal frames on the terminal when it calls this hook, so infinite recursion is prevented. */ void (*delete_terminal_hook) (struct terminal *); -}; +} GCALIGNED_STRUCT; INLINE bool TERMINALP (Lisp_Object a) diff --git a/src/thread.h b/src/thread.h index 8ecb00824d..28d8d864fb 100644 --- a/src/thread.h +++ b/src/thread.h @@ -184,7 +184,7 @@ struct thread_state /* Threads are kept on a linked list. */ struct thread_state *next_thread; -}; +} GCALIGNED_STRUCT; INLINE bool THREADP (Lisp_Object a) @@ -231,7 +231,7 @@ struct Lisp_Mutex /* The lower-level mutex object. */ lisp_mutex_t mutex; -}; +} GCALIGNED_STRUCT; INLINE bool MUTEXP (Lisp_Object a) @@ -265,7 +265,7 @@ struct Lisp_CondVar /* The lower-level condition variable object. */ sys_cond_t cond; -}; +} GCALIGNED_STRUCT; INLINE bool CONDVARP (Lisp_Object a) diff --git a/src/window.c b/src/window.c index d4fc5568a5..04de965680 100644 --- a/src/window.c +++ b/src/window.c @@ -6268,7 +6268,7 @@ struct save_window_data /* These are currently unused. We need them as soon as we convert to pixels. 
*/ int frame_menu_bar_height, frame_tool_bar_height; - }; + } GCALIGNED_STRUCT; /* This is saved as a Lisp_Vector. */ struct saved_window diff --git a/src/window.h b/src/window.h index 013083eb9a..cc0b6b6667 100644 --- a/src/window.h +++ b/src/window.h @@ -400,7 +400,7 @@ struct window /* Z_BYTE - buffer position of the last glyph in the current matrix of W. Should be nonnegative, and only valid if window_end_valid is true. */ ptrdiff_t window_end_bytepos; - }; + } GCALIGNED_STRUCT; INLINE bool WINDOWP (Lisp_Object a) diff --git a/src/xterm.h b/src/xterm.h index 1849a5c953..2ea8a93f8c 100644 --- a/src/xterm.h +++ b/src/xterm.h @@ -937,7 +937,7 @@ struct scroll_bar /* True if the scroll bar is horizontal. */ bool horizontal; -}; +} GCALIGNED_STRUCT; /* Turning a lisp vector value into a pointer to a struct scroll_bar. */ #define XSCROLL_BAR(vec) ((struct scroll_bar *) XVECTOR (vec)) diff --git a/src/xwidget.h b/src/xwidget.h index 89fc7ff458..c203d4f60c 100644 --- a/src/xwidget.h +++ b/src/xwidget.h @@ -61,7 +61,7 @@ struct xwidget /* Kill silently if Emacs is exited. */ bool_bf kill_without_query : 1; -}; +} GCALIGNED_STRUCT; struct xwidget_view { @@ -88,7 +88,7 @@ struct xwidget_view int clip_left; long handler_id; -}; +} GCALIGNED_STRUCT; #endif /* Test for xwidget pseudovector. */ -- 2.17.1 [-- Attachment #3: marker24.diff --] [-- Type: text/x-patch, Size: 1590 bytes --] diff --git a/src/alloc.c b/src/alloc.c index a0639fd577..cbeb51bbc9 100644 --- a/src/alloc.c +++ b/src/alloc.c @@ -638,13 +638,23 @@ buffer_memory_full (ptrdiff_t nbytes) /* LISP_ALIGNMENT is the alignment of Lisp objects. It must be at least GCALIGNMENT so that pointers can be tagged. It also must be at least as strict as the alignment of all the C types used to - implement Lisp objects; since pseudovectors can contain any C type, - this is max_align_t. 
On recent GNU/Linux x86 and x86-64 this can - often waste up to 8 bytes, since alignof (max_align_t) is 16 but - typical vectors need only an alignment of 8. However, it is not - worth the hassle to avoid this waste. */ -enum { LISP_ALIGNMENT = alignof (union { max_align_t x; - GCALIGNED_UNION_MEMBER }) }; + implement Lisp objects. This union contains all the C types whose + alignment contributes to LISP_ALIGNMENT. This is not an exhaustive + list of the types, just enough so that the answer works on all + practical Emacs targets. This union does not contain max_align_t, + because with recent GCC on x86 that has an alignment of 16, but + Emacs does not use any types requiring an alignment more than 8. + Emacs modules must respect the alignment limit here. */ +union Lisp_kitchen_sink +{ + double d; + intmax_t i; + uintmax_t u; + void (*f) (void); + void *p; + GCALIGNED_UNION_MEMBER +}; +enum { LISP_ALIGNMENT = alignof (union Lisp_kitchen_sink) }; verify (LISP_ALIGNMENT % GCALIGNMENT == 0); /* True if malloc (N) is known to return storage suitably aligned for ^ permalink raw reply related [flat|nested] 28+ messages in thread
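The allocation rounding described at the start of this message (a 24-byte struct still occupying 32 bytes) is plain arithmetic. A small sketch, assuming a hypothetical helper named round_up_to_alignment; it illustrates the rounding only and is not the allocator's own code.

```c
#include <assert.h>
#include <stddef.h>

/* Round SIZE up to the next multiple of ALIGNMENT.
   ALIGNMENT must be a power of 2.  */
static size_t
round_up_to_alignment (size_t size, size_t alignment)
{
  return (size + alignment - 1) & ~(alignment - 1);
}
```

With an alignment of 16, a 24-byte marker is rounded up to 32 bytes; with an alignment of 8 it stays at 24, which is the saving the second patch is after.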
* Re: Lisp_Marker size on 32bit systems
From: Eli Zaretskii @ 2018-09-07 8:05 UTC
To: Paul Eggert
Cc: monnier, emacs-devel

> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Fri, 7 Sep 2018 00:15:57 -0700
> Cc: emacs-devel@gnu.org
>
> Stefan Monnier wrote:
>
> > Thanks.  AFAICT the only solution is to use the GCALIGNED_UNION trick in
> > each and every "real Lisp_Object struct" rather than once and forall in
> > vectorlike_header.
>
> The trick does need to move out of union vectorlike_header.  However, the trick
> is not needed for most of those structs, since they're allocated only by the GC
> and are therefore already GC-aligned.  The trick is needed only for structs that
> C might allocate statically or on the stack, and whose addresses are tagged as
> Lisp pointers.  Just a few types do that, and I've noted them in the first
> attached patch.
>
> Although the first attached patch shrinks sizeof (struct Lisp_Marker) from 32 to
> 24 bytes on x86 as requested, allocate_pseudovector still *allocates* 32 bytes
> for the struct, as it rounds the size up to the next multiple of alignof
> (max_align_t), which is 16 on x86.  It's not hard to change that to 8 (please see
> 2nd attached patch) but this causes a 20% CPU performance hit (!) to 'make
> compile-always' on my platform (AMD Phenom II X4 910e circa 2010, Fedora 28
> x86-64, gcc -m32 -march=native), so I didn't install and can't recommend the 2nd
> attached patch.

The current master fails to build in the x86 32-bit configuration with
wide ints:

    In file included from lisp.h:35:0,
                     from window.c:25:
    ../lib/verify.h:207:21: error: static assertion failed: "verify (header_size == sizeof (union vectorlike_header))"
     # define _GL_VERIFY _Static_assert
                         ^
    ../lib/verify.h:252:20: note: in expansion of macro '_GL_VERIFY'
     # define verify(R) _GL_VERIFY (R, "verify (" #R ")")
                        ^~~~~~~~~~
    lisp.h:1630:1: note: in expansion of macro 'verify'
     verify (header_size == sizeof (union vectorlike_header));
     ^~~~~~
    Makefile:385: recipe for target `window.o' failed
* Re: Lisp_Marker size on 32bit systems
From: Paul Eggert @ 2018-09-07 13:45 UTC
To: Eli Zaretskii
Cc: monnier, emacs-devel

Eli Zaretskii wrote:
> The current master fails to build in the x86 32-bit configuration with
> wide ints:
>
> In file included from lisp.h:35:0,
>                  from window.c:25:
> ../lib/verify.h:207:21: error: static assertion failed: "verify (header_size == sizeof (union vectorlike_header))"

It works for me in that configuration in Fedora 28.  I get the following values;
what do you get?

    sizeof (ptrdiff_t) = 4
    sizeof (union vectorlike_header) = 4
    offsetof (struct Lisp_Vector, contents) = 4
    offsetof (struct Lisp_Sub_Char_Table, depth) == 4
    offsetof (struct Lisp_Sub_Char_Table, contents) == 12

If you're getting different values, it could be that the fact that the code ever
worked at all is just luck.

I am using gcc 8.1.1 20180712 (Red Hat 8.1.1-5), and configure this way (because
many modules don't work in 32-bit mode):

    ./configure --with-wide-int CC=gcc -m32 -march=native --enable-gcc-warnings
    --without-sound --without-dbus --without-file-notification --without-gconf
    --without-gif --without-gsettings --without-imagemagick --without-rsvg
    --with-x-toolkit=no --with-modules
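The failing assertion compares the offset of the first vector element with the size of the header, so its outcome depends on how strictly the element type is aligned. A sketch with hypothetical stand-in types (a 4-byte header followed by 8-byte elements), where alignas is used to force the stricter alignment. Which of the two layouts a given compiler and ABI produce for the real Lisp_Object is an assumption left open here; that is what the values requested above are meant to settle.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* An 8-byte element that needs only 4-byte alignment.  */
struct elt_align4
{
  int32_t half[2];
};

/* The same 8-byte element with 8-byte alignment forced.  */
struct elt_align8
{
  alignas (8) int32_t half[2];
};

/* A 4-byte header followed by each kind of element.  */
struct vector_align4
{
  int32_t header;
  struct elt_align4 contents[1];
};

struct vector_align8
{
  int32_t header;
  struct elt_align8 contents[1];
};

/* Elements start right after the header: the offset equals the
   header size, so a header_size-style check would pass.  */
static_assert (offsetof (struct vector_align4, contents) == sizeof (int32_t),
               "no hole after the header");

/* A 4-byte hole follows the header: the offset is 8 while the header
   is only 4 bytes, so the same check would fail.  */
static_assert (offsetof (struct vector_align8, contents) == 8,
               "4-byte hole after the header");
```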
* GDB and compiler-operations (was: Lisp_Marker size on 32bit systems)
From: Stefan Monnier @ 2018-09-07 14:12 UTC
To: emacs-devel

> sizeof (ptrdiff_t) = 4
> sizeof (union vectorlike_header) = 4
> offsetof (struct Lisp_Vector, contents) = 4
> offsetof (struct Lisp_Sub_Char_Table, depth) == 4
> offsetof (struct Lisp_Sub_Char_Table, contents) == 12

BTW, after too many years learning to only use C functions and variables
in GDB (and not macros, CPP constants, or other compile-time-only
thingies), I only recently started to extend my GDB world.

Along the way I discovered that while `sizeof` works great, `offsetof`
gives me an error:

    No symbol "offsetof" in current context.

any idea why this is (and how to fix it)?


        Stefan
* Re: GDB and compiler-operations (was: Lisp_Marker size on 32bit systems)
From: Eli Zaretskii @ 2018-09-07 14:23 UTC
To: Stefan Monnier
Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Fri, 07 Sep 2018 10:12:06 -0400
>
> BTW, after too many years learning to only use C functions and variables
> in GDB (and not macros, CPP constants, or other compile-time-only
> thingies), I only recently started to extend my GDB world.

Good for you!

> Along the way I discovered that while `sizeof` works great, `offsetof`
> gives me an error:
>
>     No symbol "offsetof" in current context.

AFAIK, GDB has special support for sizeof, but not for offsetof.
Maybe you should ask for a new GDB feature.  Or you could use the
trick I used when Paul asked for values of offsets.
* Re: GDB and compiler-operations
From: Andreas Schwab @ 2018-09-07 15:16 UTC
To: Stefan Monnier
Cc: emacs-devel

On Sep 07 2018, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> Along the way I discovered that while `sizeof` works great, `offsetof`
> gives me an error:
>
>     No symbol "offsetof" in current context.
>
> any idea why this is (and how to fix it)?

sizeof is a keyword, offsetof a macro.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."
* Re: GDB and compiler-operations (was: Lisp_Marker size on 32bit systems)
From: Paul Eggert @ 2018-09-07 15:48 UTC
To: Stefan Monnier, emacs-devel

On 09/07/2018 07:12 AM, Stefan Monnier wrote:
>      No symbol "offsetof" in current context.

On a newer platform where GDB can see C macros (you really should enable
this if you can, by the way, it makes debugging easier), I see this:

    (gdb) p offsetof (struct Lisp_Vector, contents)
    No symbol "__builtin_offsetof" in current context.

So the problem for me is that GDB does not support __builtin_offsetof.
This is a known bug in GDB:

https://sourceware.org/bugzilla/show_bug.cgi?id=16240
* Re: GDB and compiler-operations
From: Stefan Monnier @ 2018-09-07 15:58 UTC
To: emacs-devel

> On a newer platform where GDB can see C macros (you really should enable
> this if you can, by the way, it makes debugging easier),

I'm on Debian testing, which doesn't strike me as old.
How new does it need to be?

> I see this:
>
> (gdb) p offsetof (struct Lisp_Vector, contents)
> No symbol "__builtin_offsetof" in current context.

Hmm... indeed that's the error I was seeing the other day, but today
I get the other one.  IIRC this was the same machine, tho (and if not,
it was one running Debian testing or Debian stable, so nothing newer).


        Stefan
* Re: GDB and compiler-operations
From: Eli Zaretskii @ 2018-09-07 17:11 UTC
To: Stefan Monnier
Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Fri, 07 Sep 2018 11:58:06 -0400
>
> > (gdb) p offsetof (struct Lisp_Vector, contents)
> > No symbol "__builtin_offsetof" in current context.
>
> Hmm... indeed that's the error I was seeing the other day, but today
> I get the other one.  IIRC this was the same machine, tho (and if not,
> it was one running Debian testing or Debian stable, so nothing newer).

It depends on whether the source was compiled with -g3 or not.
* Re: GDB and compiler-operations
From: Paul Eggert @ 2018-09-07 17:15 UTC
To: Stefan Monnier, emacs-devel

On 09/07/2018 08:58 AM, Stefan Monnier wrote:
>> On a newer platform where GDB can see C macros (you really should enable
>> this if you can, by the way, it makes debugging easier),
> I'm on Debian testing, which doesn't strike me as old.
> How new does it need to be?

Sorry, don't know offhand.  But tool age doesn't appear to apply to you
(see below).

>> I see this:
>>
>> (gdb) p offsetof (struct Lisp_Vector, contents)
>> No symbol "__builtin_offsetof" in current context.
> Hmm... indeed that's the error I was seeing the other day, but today
> I get the other one.  IIRC this was the same machine, tho (and if not,
> it was one running Debian testing or Debian stable, so nothing newer).

Most likely when you were getting the other message you compiled with
plain -g instead of -g3.  './configure' tries to default to -g3 if
available but perhaps you overrode that, or it didn't work for you.
* Re: GDB and compiler-operations 2018-09-07 14:12 ` GDB and compiler-operations (was: Lisp_Marker size on 32bit systems) Stefan Monnier ` (2 preceding siblings ...) 2018-09-07 15:48 ` GDB and compiler-operations (was: Lisp_Marker size on 32bit systems) Paul Eggert @ 2018-09-07 19:59 ` Tom Tromey 3 siblings, 0 replies; 28+ messages in thread From: Tom Tromey @ 2018-09-07 19:59 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel >>>>> "Stefan" == Stefan Monnier <monnier@iro.umontreal.ca> writes: Stefan> Along the way I discovered that while `sizeof` works great, `offsetof` Stefan> gives me an error: Stefan> No symbol "offsetof" in current context. Stefan> any idea why this is (and how to fix it)? Other people answered the why. To fix it you have two options. One, define an offsetof macro: (gdb) macro define offsetof(type, field) ((int) &(((type *) 0)->field)) The gdb C expression parser will automatically use macros you define interactively. Two, instead of using offsetof to inspect a type, upgrade to a newish (8.1 or better) gdb and use "ptype/o": (gdb) ptype/o struct Lisp_Vector /* offset | size */ type = struct Lisp_Vector { /* 0 | 8 */ union vectorlike_header { /* 8 */ ptrdiff_t size; /* 1 */ char gcaligned; /* total size (bytes): 8 */ } header; /* 8 | 0 */ Lisp_Object contents[]; /* total size (bytes): 8 */ } The output here is modeled on the pahole utility. If you can't upgrade gdb, there's a "pahole.py" script out there that adds a pahole command to gdb instead. I can email it if you really need it, but some distros installed it by default. So you could just try "(gdb) pahole struct Lisp_Vector". Tom ^ permalink raw reply [flat|nested] 28+ messages in thread
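For readers who want to see what the interactive macro computes, here is a minimal C sketch of the same null-pointer idiom, which takes the address of a member of an object imagined at address 0. The names struct demo_vector and DEMO_OFFSETOF are hypothetical and are not Emacs definitions; the idiom is formally undefined in ISO C but is accepted by GCC and Clang, and in compiled code the standard offsetof from <stddef.h> is the better choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a vector-like struct; NOT Emacs's definition.  */
struct demo_vector
{
  int32_t size;        /* offset 0 */
  int32_t depth;       /* offset 4 */
  int64_t contents[];  /* offset 8: the two 4-byte fields already end at 8 */
};

/* Null-pointer formulation of offsetof (an assumption for illustration):
   take the ADDRESS of the member, then convert that address to an integer.
   Without the '&' the expression would read the member at address 0.  */
#define DEMO_OFFSETOF(type, field) ((size_t) &(((type *) 0)->field))

/* Offsets computed with the hand-written macro.  */
size_t
demo_depth_offset (void)
{
  return DEMO_OFFSETOF (struct demo_vector, depth);
}

size_t
demo_contents_offset (void)
{
  return DEMO_OFFSETOF (struct demo_vector, contents);
}
```

Both functions should agree with the standard offsetof for the same members, which is the property the GDB macro is standing in for.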
* Re: Lisp_Marker size on 32bit systems 2018-09-07 13:45 ` Paul Eggert 2018-09-07 14:12 ` GDB and compiler-operations (was: Lisp_Marker size on 32bit systems) Stefan Monnier @ 2018-09-07 14:19 ` Eli Zaretskii 2018-09-07 16:27 ` Paul Eggert 1 sibling, 1 reply; 28+ messages in thread From: Eli Zaretskii @ 2018-09-07 14:19 UTC (permalink / raw) To: Paul Eggert; +Cc: monnier, emacs-devel > Cc: monnier@IRO.UMontreal.CA, emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Fri, 7 Sep 2018 06:45:48 -0700 > > > In file included from lisp.h:35:0, > > from window.c:25: > > ../lib/verify.h:207:21: error: static assertion failed: "verify (header_size == sizeof (union vectorlike_header))" > > It works for me in that configuration in Fedora 28. I get the following values; > what do you get? > > sizeof (ptrdiff_t) = 4 > sizeof (union vectorlike_header) = 4 > offsetof (struct Lisp_Vector, contents) = 4 > offsetof (struct Lisp_Sub_Char_Table, depth) == 4 > offsetof (struct Lisp_Sub_Char_Table, contents) == 12 I obtained the below by building Emacs after commenting out the offending 'verify': (gdb) ptype union vectorlike_header type = union vectorlike_header { ptrdiff_t size; } (gdb) p sizeof(union vectorlike_header) $3 = 4 (gdb) ptype /o struct Lisp_Vector /* offset | size */ type = struct Lisp_Vector { /* 0 | 4 */ union vectorlike_header { /* 4 */ ptrdiff_t size; /* total size (bytes): 4 */ } header; /* XXX 4-byte hole */ /* 8 | 0 */ Lisp_Object contents[]; /* total size (bytes): 8 */ } (gdb) p sizeof(ptrdiff_t) $4 = 4 (gdb) ptype /o struct Lisp_Sub_Char_Table /* offset | size */ type = struct Lisp_Sub_Char_Table { /* 0 | 4 */ union vectorlike_header { /* 4 */ ptrdiff_t size; /* total size (bytes): 4 */ } header; /* 4 | 4 */ int depth; /* 8 | 4 */ int min_char; /* XXX 4-byte hole */ /* 16 | 0 */ Lisp_Object contents[]; /* total size (bytes): 16 */ } I think GCC aligns the Lisp_Object array within the structures because a Lisp_Object is an 8-byte data type in this 
configuration. ^ permalink raw reply [flat|nested] 28+ messages in thread
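The 4-byte hole in the ptype /o output above can be reproduced outside Emacs. The following is a rough sketch, assuming a 4-byte header and two hypothetical 8-byte word types (word_align8 and word_align4, neither of which is Lisp_Object): one with 8-byte alignment, one with 4-byte alignment. Only the alignment differs, yet the offset of the trailing array changes.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-byte word whose alignment is forced to 8.  */
typedef struct { alignas (8) int64_t v; } word_align8;

/* Hypothetical 8-byte word built from two 4-byte halves, so its
   alignment is only 4.  */
typedef struct { int32_t lo, hi; } word_align4;

/* A 4-byte header, like ptrdiff_t on a 32-bit host.  */
struct demo_header { int32_t size; };

struct vec_align8 { struct demo_header header; word_align8 contents[]; };
struct vec_align4 { struct demo_header header; word_align4 contents[]; };

/* With 8-byte alignment the compiler pads 4 bytes after the header.  */
static_assert (offsetof (struct vec_align8, contents) == 8,
               "8-byte-aligned contents leave a 4-byte hole");

/* With 4-byte alignment the contents follow the header directly.  */
static_assert (offsetof (struct vec_align4, contents) == 4,
               "4-byte-aligned contents leave no hole");

size_t contents_offset_align8 (void) { return offsetof (struct vec_align8, contents); }
size_t contents_offset_align4 (void) { return offsetof (struct vec_align4, contents); }
```

A check of the form header_size == sizeof (header) holds for the second layout but fails for the first, which matches the failing verify reported earlier in the thread.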
* Re: Lisp_Marker size on 32bit systems 2018-09-07 14:19 ` Lisp_Marker size on 32bit systems Eli Zaretskii @ 2018-09-07 16:27 ` Paul Eggert 2018-09-07 17:16 ` Eli Zaretskii 0 siblings, 1 reply; 28+ messages in thread From: Paul Eggert @ 2018-09-07 16:27 UTC (permalink / raw) To: Eli Zaretskii; +Cc: monnier, emacs-devel [-- Attachment #1: Type: text/plain, Size: 479 bytes --] On 09/07/2018 07:19 AM, Eli Zaretskii wrote: > I think GCC aligns the Lisp_Object array within the structures because > a Lisp_Object is an 8-byte data type in this configuration. That alignment is platform-dependent. On Fedora 28 configured --with-wide-int and with gcc -m32, a Lisp_Object is 8 bytes but its alignment is only 4 bytes. Apparently the alignment of 'long long' is 4 on Fedora 28 x86, but 8 on MS-Windows x86. I installed the attached; please give it a try. [-- Attachment #2: 0001-Fix-overenthusiastic-header-size-check.patch --] [-- Type: text/x-patch, Size: 3876 bytes --] From 8776b3ccc765bff54b0186cadeba7c0a6fc60779 Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert@cs.ucla.edu> Date: Fri, 7 Sep 2018 09:17:25 -0700 Subject: [PATCH] Fix overenthusiastic header size check Problem reported by Eli Zaretskii in: https://lists.gnu.org/r/emacs-devel/2018-09/msg00222.html * doc/lispref/internals.texi (Garbage Collection): Document vector sizes and slot counts more accurately. * src/lisp.h: Omit header_size sanity check that was too picky. Add some less-picky checks. --- doc/lispref/internals.texi | 4 +++- src/lisp.h | 26 +++++++++++++++++++------- 2 files changed, 22 insertions(+), 8 deletions(-) diff --git a/doc/lispref/internals.texi b/doc/lispref/internals.texi index 3fe28446ea..d42e2444e6 100644 --- a/doc/lispref/internals.texi +++ b/doc/lispref/internals.texi @@ -382,7 +382,7 @@ Garbage Collection The total size of all string data in bytes. @item vector-size -Internal size of a vector header, i.e., @code{sizeof (struct Lisp_Vector)}. 
+Size in bytes of a vector of length 1, including its header. @item used-vectors The number of vector headers allocated from the vector blocks. @@ -392,6 +392,8 @@ Garbage Collection @item used-slots The number of slots in all used vectors. +Slot counts might include some or all overhead from vector headers, +depending on the platform. @item free-slots The number of free slots in all vector blocks. diff --git a/src/lisp.h b/src/lisp.h index 7e365e8f47..56623a75f7 100644 --- a/src/lisp.h +++ b/src/lisp.h @@ -1619,7 +1619,16 @@ struct Lisp_Bool_Vector } GCALIGNED_STRUCT; /* Some handy constants for calculating sizes - and offsets, mostly of vectorlike objects. */ + and offsets, mostly of vectorlike objects. + + The garbage collector assumes that the initial part of any struct + that starts with a union vectorlike_header followed by N + Lisp_Objects (some possibly in arrays and/or a trailing flexible + array) will be laid out like a struct Lisp_Vector with N + Lisp_Objects. This assumption is true in practice on known Emacs + targets even though the C standard does not guarantee it. This + header contains a few sanity checks that should suffice to detect + violations of this assumption on plausible practical hosts. */ enum { @@ -1627,7 +1636,6 @@ enum bool_header_size = offsetof (struct Lisp_Bool_Vector, data), word_size = sizeof (Lisp_Object) }; -verify (header_size == sizeof (union vectorlike_header)); /* The number of data words and bytes in a bool vector with SIZE bits. */ @@ -1989,6 +1997,13 @@ enum char_table_specials SUB_CHAR_TABLE_OFFSET = PSEUDOVECSIZE (struct Lisp_Sub_Char_Table, contents) }; +/* Sanity-check pseudovector layout. 
*/ +verify (offsetof (struct Lisp_Char_Table, defalt) == header_size); +verify (offsetof (struct Lisp_Char_Table, extras) + == header_size + CHAR_TABLE_STANDARD_SLOTS * sizeof (Lisp_Object)); +verify (offsetof (struct Lisp_Sub_Char_Table, contents) + == header_size + SUB_CHAR_TABLE_OFFSET * sizeof (Lisp_Object)); + /* Return the number of "extra" slots in the char table CT. */ INLINE int @@ -1998,11 +2013,6 @@ CHAR_TABLE_EXTRA_SLOTS (struct Lisp_Char_Table *ct) - CHAR_TABLE_STANDARD_SLOTS); } -/* Make sure that sub char-table contents slot is where we think it is. */ -verify (offsetof (struct Lisp_Sub_Char_Table, contents) - == (offsetof (struct Lisp_Vector, contents) - + SUB_CHAR_TABLE_OFFSET * sizeof (Lisp_Object))); - /* Save and restore the instruction and environment pointers, without affecting the signal mask. */ @@ -2216,6 +2226,8 @@ struct Lisp_Hash_Table struct Lisp_Hash_Table *next_weak; } GCALIGNED_STRUCT; +/* Sanity-check pseudovector layout. */ +verify (offsetof (struct Lisp_Hash_Table, weak) == header_size); INLINE bool HASH_TABLE_P (Lisp_Object a) -- 2.17.1 ^ permalink raw reply related [flat|nested] 28+ messages in thread
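The idea behind the replacement checks can be sketched in standalone C11 with static_assert, which plays the same role as the verify macro here. All names below (demo_object, demo_vector, demo_pseudovector, demo_header_size) are hypothetical stand-ins, not the lisp.h definitions: the point is only that the header size is taken from the offset of the contents in the vector layout, and each pseudovector slot is checked against that rather than against sizeof the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object word and header; not the Emacs types.  */
typedef int64_t demo_object;
struct demo_header { ptrdiff_t size; };

/* Reference layout: header followed by a flexible array of objects.  */
struct demo_vector
{
  struct demo_header header;
  demo_object contents[];
};

/* A pseudovector-like struct: header, two object slots, then other data.  */
struct demo_pseudovector
{
  struct demo_header header;
  demo_object first;
  demo_object second;
  int32_t non_object_field;
};

/* Header size is defined as where the contents start, padding included.  */
enum { demo_header_size = offsetof (struct demo_vector, contents) };

/* Sanity-check that the object slots line up with the vector layout.  */
static_assert (offsetof (struct demo_pseudovector, first)
               == demo_header_size,
               "first slot must sit where contents[0] would");
static_assert (offsetof (struct demo_pseudovector, second)
               == demo_header_size + sizeof (demo_object),
               "second slot must sit where contents[1] would");

/* Runtime view of the same comparison, for use in tests.  */
int
demo_layout_matches (void)
{
  return (offsetof (struct demo_pseudovector, first) == demo_header_size
          && (offsetof (struct demo_pseudovector, second)
              == demo_header_size + sizeof (demo_object)));
}
```

Because both sides of each comparison include whatever padding the platform inserts, these checks pass whether the object word is 4-byte or 8-byte aligned.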
* Re: Lisp_Marker size on 32bit systems 2018-09-07 16:27 ` Paul Eggert @ 2018-09-07 17:16 ` Eli Zaretskii 2018-09-07 18:13 ` Paul Eggert 0 siblings, 1 reply; 28+ messages in thread From: Eli Zaretskii @ 2018-09-07 17:16 UTC (permalink / raw) To: Paul Eggert; +Cc: monnier, emacs-devel > Cc: monnier@IRO.UMontreal.CA, emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Fri, 7 Sep 2018 09:27:56 -0700 > > On 09/07/2018 07:19 AM, Eli Zaretskii wrote: > > I think GCC aligns the Lisp_Object array within the structures because > > a Lisp_Object is an 8-byte data type in this configuration. > > That alignment is platform-dependent. On Fedora 28 configured > --with-wide-int and with gcc -m32, a Lisp_Object is 8 bytes but its > alignment is only 4 bytes. Apparently the alignment of 'long long' is 4 > on Fedora 28 x86, but 8 on MS-Windows x86. Isn't it strange, though? Why would that be platform dependent? Could it be due to GCC version differences (mine is 7.3.0)? > I installed the attached; please give it a try. Builds fine, thanks. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems 2018-09-07 17:16 ` Eli Zaretskii @ 2018-09-07 18:13 ` Paul Eggert 2018-09-07 18:32 ` Eli Zaretskii 0 siblings, 1 reply; 28+ messages in thread From: Paul Eggert @ 2018-09-07 18:13 UTC (permalink / raw) To: Eli Zaretskii; +Cc: monnier, emacs-devel On 09/07/2018 10:16 AM, Eli Zaretskii wrote: > Why would that be platform dependent? There are differences in GCC struct layout between MS-Windows and GNU/Linux; see the GCC option -mms-bitfields. I expect it's the usual story about platforms making different tradeoffs between backward compatibility vs performance. I wasn't aware of the 'long long' alignment issue until now, though. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems 2018-09-07 18:13 ` Paul Eggert @ 2018-09-07 18:32 ` Eli Zaretskii 2018-09-07 19:05 ` Paul Eggert 0 siblings, 1 reply; 28+ messages in thread From: Eli Zaretskii @ 2018-09-07 18:32 UTC (permalink / raw) To: Paul Eggert; +Cc: monnier, emacs-devel > Cc: monnier@IRO.UMontreal.CA, emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Fri, 7 Sep 2018 11:13:49 -0700 > > There are differences in GCC struct layout between MS-Windows and > GNU/Linux; see the GCC option -mms-bitfields. "-mms-bitfields" is about something very different, and specifically for compatibility with Microsoft compilers. I don't see how that could affect long long. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems 2018-09-07 18:32 ` Eli Zaretskii @ 2018-09-07 19:05 ` Paul Eggert 2018-09-07 19:22 ` Eli Zaretskii 0 siblings, 1 reply; 28+ messages in thread From: Paul Eggert @ 2018-09-07 19:05 UTC (permalink / raw) To: Eli Zaretskii; +Cc: monnier, emacs-devel On 09/07/2018 11:32 AM, Eli Zaretskii wrote: >> There are differences in GCC struct layout between MS-Windows and >> GNU/Linux; see the GCC option -mms-bitfields. > "-mms-bitfields" is about something very different, and specifically > for compatibility with Microsoft compilers. I don't see how that > could affect long long. Sure, but the point is that GCC does not lay out structures identically on MS-Windows vs GNU/Linux. If it makes an exception for bitfields it may very well make an exception for long long. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems 2018-09-07 19:05 ` Paul Eggert @ 2018-09-07 19:22 ` Eli Zaretskii 0 siblings, 0 replies; 28+ messages in thread From: Eli Zaretskii @ 2018-09-07 19:22 UTC (permalink / raw) To: Paul Eggert; +Cc: monnier, emacs-devel > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Fri, 7 Sep 2018 12:05:18 -0700 > Cc: monnier@IRO.UMontreal.CA, emacs-devel@gnu.org > > Sure, but the point is that GCC does not lay out structures identically > on MS-Windows vs GNU/Linux. If it makes an exception for bitfields it > may very well make an exception for long long. Yes, the facts are unequivocal. I just don't understand why the different behavior. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems 2018-09-07 7:15 ` Paul Eggert 2018-09-07 8:05 ` Eli Zaretskii @ 2018-09-07 12:16 ` Stefan Monnier 2018-09-07 19:04 ` Paul Eggert 1 sibling, 1 reply; 28+ messages in thread From: Stefan Monnier @ 2018-09-07 12:16 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel > Although the first attached patch shrinks sizeof (struct Lisp_Marker) from 32 > to 24 bytes on x86 as requested, allocate_pseudovector still *allocates* 32 > bytes for the struct, as it rounds the size up to the next multiple of > alignof (max_align_t), which is 16 on x86. It's not hard to change that to > 8 (please see 2nd attached patch) but this causes a 20% CPU performance hit > (!) to 'make compile-always' on my platform (AMD Phenom II X4 910e circa > 2010, Fedora 28 x86-64, gcc -m32 -march=native), so I didn't install and > can't recommend the 2nd attached patch. Where does this 20% slow down come from? Stefan ^ permalink raw reply [flat|nested] 28+ messages in thread
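The rounding described in the quoted text is ordinary round-up-to-a-multiple arithmetic. As a minimal sketch (the function name demo_round_up is hypothetical and this is not the allocate_pseudovector code), assuming the alignment is a power of two:

```c
#include <assert.h>
#include <stddef.h>

/* Round NBYTES up to the next multiple of ALIGNMENT.
   ALIGNMENT must be a power of two (an assumption of this sketch).  */
size_t
demo_round_up (size_t nbytes, size_t alignment)
{
  return (nbytes + alignment - 1) & ~(alignment - 1);
}
```

With these definitions, a 24-byte struct occupies 32 bytes when rounded to a multiple of 16 and stays at 24 bytes when rounded to a multiple of 8; the 28-byte case from the start of the thread rounds to 32 under 8-byte alignment.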
* Re: Lisp_Marker size on 32bit systems 2018-09-07 12:16 ` Stefan Monnier @ 2018-09-07 19:04 ` Paul Eggert 2018-09-07 19:45 ` Stefan Monnier 0 siblings, 1 reply; 28+ messages in thread From: Paul Eggert @ 2018-09-07 19:04 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel On 09/07/2018 05:16 AM, Stefan Monnier wrote: > Where does this 20% slow down come from? Ha! It was because the 16-byte alignment caused pure space to overflow, which disabled GC. No wonder there was such a big performance difference. I fixed that, and now the patch to shrink marker allocations from 32 to 24 bytes on x86 causes my standard benchmark (make compile-always) to run only 1.0% slower, which is more reasonable. A microbenchmark of running (make-marker) over and over again for 10,000,000 times runs 36% slower (815 vs 597 ns for a single call). So it still looks like we should be following GCC's max_align_t hint and using 16-byte alignment on x86, even though this wastes memory. Maybe we should be using 4 mark bits instead of 3? ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems 2018-09-07 19:04 ` Paul Eggert @ 2018-09-07 19:45 ` Stefan Monnier 2018-09-07 21:03 ` Paul Eggert 0 siblings, 1 reply; 28+ messages in thread From: Stefan Monnier @ 2018-09-07 19:45 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel >> Where does this 20% slow down come from? > Ha! It was because the 16-byte alignment caused pure space to overflow, > which disabled GC. No wonder there was such a big performance difference. OK, that makes more sense. > I fixed that, and now the patch to shrink marker allocations from 32 to 24 > bytes on x86 causes my standard benchmark (make compile-always) to run only > 1.0% slower, which is more reasonable. A microbenchmark of running > (make-marker) over and over again for 10,000,000 times runs 36% slower (815 > vs 597 ns for a single call). I still wonder why it would be slower at all. > Maybe we should be using 4 mark bits instead of 3? On 32bit systems, both cons cells and float cells use 8 bytes each, so aligning on multiples of 16 would double their memory use. And we currently have one free tag, so not only would using 4 tag bits significantly increase memory use for those objects, but it's not clear what the extra tags would be useful for. Also with the bignum support, the pressure to maximize the size of our fixnums is much lower, so we could even consider using fewer Lisp_Int tags if we feel like we need more tags. FWIW, IIUC XEmacs uses a 2bit tag which simply distinguishes between Lisp_Int0, Lisp_Int1, Char, and other objects. Since we don't have chars, that's like using a single-bit tag for us. Maybe we should introduce some way to instrument SYMBOLP/STRINGP/VECTORP/MARKERP/CONSP/... in order to try and figure out which objects are more deserving of having their tag right there in the Lisp_Object rather than just in the vectorlike_header. Stefan ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems 2018-09-07 19:45 ` Stefan Monnier @ 2018-09-07 21:03 ` Paul Eggert 2018-09-08 1:54 ` Stefan Monnier 0 siblings, 1 reply; 28+ messages in thread From: Paul Eggert @ 2018-09-07 21:03 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 1053 bytes --] Stefan Monnier wrote: > I still wonder why it would be slower at all. My guess is cache effects. My processor has a cache line size of 64 bytes, so if objects are allocated in 32-byte chunks they won't straddle cache boundaries and code will be less likely to thrash the cache. I ran this benchmark in the 'lisp' subdirectory: EMACSLOADPATH= perf stat -dd '../src/emacs' -batch --no-site-file --no-site-lisp --eval '(setq load-prefer-newer t)' -f batch-byte-compile org/org.el and am attaching the results for the 24-byte allocation (a bit slower) and the 32-byte allocation (a bit faster), and they are in line with this guess. >> Maybe we should be using 4 mark bits instead of 3? > On 32bit systems, both cons cells and float cells use 8 bytes each, so > aligning on multiples of 16 would double their memory use. We'd use two tags for both conses and float cells, so that shouldn't be a problem. > it's not clear > what the extra tags would be useful for. Presumably to help performance elsewhere. Admittedly I'm blue-skying a bit here. 
[-- Attachment #2: perf-stat-emacs-24.txt --] [-- Type: text/plain, Size: 2140 bytes --] Performance counter stats for '../src/emacs -batch --no-site-file --no-site-lisp --eval (setq load-prefer-newer t) -f batch-byte-compile org/org.el': 3710.809824 task-clock:u (msec) # 0.998 CPUs utilized 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 4,298 page-faults:u # 0.001 M/sec 9,414,411,487 cycles:u # 2.537 GHz (18.79%) 757,273,968 stalled-cycles-frontend:u # 8.04% frontend cycles idle (18.80%) 4,395,722,568 stalled-cycles-backend:u # 46.69% backend cycles idle (18.80%) 9,301,424,026 instructions:u # 0.99 insn per cycle # 0.47 stalled cycles per insn (18.79%) 1,390,988,726 branches:u # 374.848 M/sec (18.76%) 67,045,930 branch-misses:u # 4.82% of all branches (18.74%) 5,219,768,450 L1-dcache-loads:u # 1406.639 M/sec (18.73%) 43,583,947 L1-dcache-load-misses:u # 0.83% of all L1-dcache hits (18.72%) 85,736,523 LLC-loads:u # 23.105 M/sec (18.73%) 14,635,030 LLC-load-misses:u # 17.07% of all LL-cache hits (18.73%) 2,429,036,804 L1-icache-loads:u # 654.584 M/sec (18.73%) 7,260,772 L1-icache-load-misses:u # 0.30% of all L1-icache hits (18.72%) 5,206,401,315 dTLB-loads:u # 1403.036 M/sec (18.72%) 4,878,637 dTLB-load-misses:u # 0.09% of all dTLB cache hits (18.72%) 2,418,273,074 iTLB-loads:u # 651.683 M/sec (18.75%) 2,946 iTLB-load-misses:u # 0.00% of all iTLB cache hits (18.77%) 3.718081199 seconds time elapsed [-- Attachment #3: perf-stat-emacs-32.txt --] [-- Type: text/plain, Size: 2140 bytes --] Performance counter stats for '../src/emacs -batch --no-site-file --no-site-lisp --eval (setq load-prefer-newer t) -f batch-byte-compile org/org.el': 3643.107970 task-clock:u (msec) # 0.998 CPUs utilized 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 4,320 page-faults:u # 0.001 M/sec 9,235,689,147 cycles:u # 2.535 GHz (18.77%) 683,279,700 stalled-cycles-frontend:u # 7.40% frontend cycles idle (18.74%) 4,395,369,277 stalled-cycles-backend:u # 
47.59% backend cycles idle (18.73%) 9,226,971,053 instructions:u # 1.00 insn per cycle # 0.48 stalled cycles per insn (18.74%) 1,360,574,869 branches:u # 373.465 M/sec (18.75%) 68,352,263 branch-misses:u # 5.02% of all branches (18.75%) 5,125,592,214 L1-dcache-loads:u # 1406.928 M/sec (18.74%) 41,529,042 L1-dcache-load-misses:u # 0.81% of all L1-dcache hits (18.74%) 77,752,725 LLC-loads:u # 21.342 M/sec (18.74%) 14,778,615 LLC-load-misses:u # 19.01% of all LL-cache hits (18.75%) 2,394,079,664 L1-icache-loads:u # 657.153 M/sec (18.74%) 6,663,498 L1-icache-load-misses:u # 0.28% of all L1-icache hits (18.74%) 5,164,701,717 dTLB-loads:u # 1417.664 M/sec (18.74%) 4,443,995 dTLB-load-misses:u # 0.09% of all dTLB cache hits (18.76%) 2,345,758,968 iTLB-loads:u # 643.889 M/sec (18.78%) 3,369 iTLB-load-misses:u # 0.00% of all iTLB cache hits (18.79%) 3.650332037 seconds time elapsed ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems 2018-09-07 21:03 ` Paul Eggert @ 2018-09-08 1:54 ` Stefan Monnier 2018-09-08 3:04 ` Paul Eggert 0 siblings, 1 reply; 28+ messages in thread From: Stefan Monnier @ 2018-09-08 1:54 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel >> I still wonder why it would be slower at all. > My guess is cache effects. My processor has a cache line size of 64 bytes, > so if objects are allocated in 32-byte chunks they won't straddle cache > boundaries and code will be less likely to thrash the cache. The difference of alignment is between multiples-of-8 and multiples-of-16, and allocation is done by the vectorlike code, so I think 32 byte objects aren't supposed to be much more likely to be aligned on 32 byte boundaries than on 32B + 16B. Hence, I don't find your argument very convincing. > I ran this benchmark in the 'lisp' subdirectory: > EMACSLOADPATH= perf stat -dd > '../src/emacs' -batch --no-site-file --no-site-lisp --eval '(setq > load-prefer-newer t)' -f batch-byte-compile org/org.el > > and am attaching the results for the 24-byte allocation (a bit slower) and > the 32-byte allocation (a bit faster), and they are in line with this guess. Those perf-stats also show improved I$ performance, which isn't explained by your suggested explanation. Similarly, they show a reduced number of instructions. IOW, I think there's something else at play than just the cache effects. >> it's not clear what the extra tags would be useful for. > Presumably to help performance elsewhere. Admittedly I'm blue-skying > a bit here. I think a 16 byte alignment could indeed be a good idea on 64bit systems (assuming we make the tags take advantage of it), but on 32bit systems where the memory is usually more constrained to start with, I think it would be a mistake. Stefan ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems 2018-09-08 1:54 ` Stefan Monnier @ 2018-09-08 3:04 ` Paul Eggert 2018-09-08 3:10 ` Stefan Monnier 0 siblings, 1 reply; 28+ messages in thread From: Paul Eggert @ 2018-09-08 3:04 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel Stefan Monnier wrote: > I think 32 byte objects aren't supposed to be much more likely to be > aligned on 32 byte boundaries than on 32B + 16B. True, I miscalculated. If the cache line size is 64 bytes and objects are allocated on 16-byte boundaries (the case now), the probability that a randomly-placed 24-byte marker (allocated as 32 bytes) will straddle into two cache lines is (2-1)/4, or 25%. Whereas if objects are allocated in 8-byte boundaries as you're suggesting, the probability that the same marker will straddle is (3-1)/8, which is still 25%. So for this particular case the straddling issue should be a wash. > Those perf-stats also show improved I$ performance, which isn't > explained by your suggested explanation. Similarly, they show a reduced > number of instructions. Yes, it could well be that the 32-byte allocation is faster than the 24 partly due to some reason other than d-cache effects. Although there is a smaller percentage of cache misses in the 32-byte version, it could be that this is because the 32-byte version uses simpler code that would be faster even if the cache miss rate were the same. ^ permalink raw reply [flat|nested] 28+ messages in thread
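The two straddling probabilities above can be checked by brute force. The sketch below uses hypothetical function names and assumes a simple model in which every allowed start offset within a cache line is equally likely; it counts the offsets from which an object would run past the end of the line.

```c
#include <assert.h>

/* Number of possible start offsets within a cache line of LINE bytes
   when objects start on multiples of STEP bytes.  */
unsigned
demo_candidate_offsets (unsigned line, unsigned step)
{
  return line / step;
}

/* Number of those start offsets from which an object occupying SIZE
   bytes extends past the end of the cache line.  */
unsigned
demo_straddling_offsets (unsigned line, unsigned step, unsigned size)
{
  unsigned count = 0;
  for (unsigned off = 0; off < line; off += step)
    if (off + size > line)
      count++;
  return count;
}
```

For a 64-byte line, a 32-byte allocation on 16-byte boundaries straddles from 1 of 4 offsets, and a 24-byte allocation on 8-byte boundaries straddles from 2 of 8, so both come out at 25% under this model.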
* Re: Lisp_Marker size on 32bit systems 2018-09-08 3:04 ` Paul Eggert @ 2018-09-08 3:10 ` Stefan Monnier 0 siblings, 0 replies; 28+ messages in thread From: Stefan Monnier @ 2018-09-08 3:10 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel >> Those perf-stats also show improved I$ performance, which isn't >> explained by your suggested explanation. Similarly, they show >> a reduced number of instructions. > > Yes, it could well be that the 32-byte allocation is faster than the 24 > partly due to some reason other than d-cache effects. Although there > is a smaller percentage of cache misses in the 32-byte version, it could be > that this is because the 32-byte version uses simpler code that would be > faster even if the cache miss rate were the same. That's my impression as well, but I'd be curious to know why that is. The only "obvious" advantage is that 32 is a power of 2 so you can use shift for multiplication/division, but that would only apply to things like indexing arrays of markers or computing the diff between two marker pointers. AFAIK we don't do any such operation. Stefan ^ permalink raw reply [flat|nested] 28+ messages in thread