* Lisp_Marker size on 32bit systems
@ 2018-09-06 0:41 Stefan Monnier
2018-09-06 6:51 ` Paul Eggert
0 siblings, 1 reply; 28+ messages in thread
From: Stefan Monnier @ 2018-09-06 0:41 UTC (permalink / raw)
To: emacs-devel
The new Lisp_Marker is larger than the old one on 32bit systems:
sizeof (struct Lisp_Marker)
used to be 24 (bytes) when we used Lisp_Misc, but it is now 32 (bytes) instead!
The reason seems to be that the vectorlike_header (which contains just
a simple int) occupies 8 bytes!
So those 8 bytes, plus 20 bytes of actual data, add up to 28 bytes,
which are rounded up to 32 for alignment purposes.
Why does vectorlike_header occupy 8 bytes? Because we use
union vectorlike_header
{
ptrdiff_t size;
/* Align the union so that there is no padding after it. */
Lisp_Object align;
GCALIGNED_UNION
};
where GCALIGNED_UNION forces alignment on a multiple of 8 and hence
a minimum size of 8 as well.
So, on 32bit hosts, our vectorlike_header carries 4 bytes of useful info
but occupies 8 bytes anyway.
This sucks.
This misfeature was introduced by the following commit:
commit b1573a97e17b518723ab3f906eb6d521caed196d
Author: Paul Eggert <eggert@cs.ucla.edu>
Date: Mon Nov 13 08:51:41 2017 -0800
Use alignas to fix GCALIGN-related bugs
Could we get this fixed, to reduce the overhead of our vectors on 32bit
hosts (including bringing Lisp_Marker back to 24 bytes)?
Stefan
* Re: Lisp_Marker size on 32bit systems
2018-09-06 0:41 Lisp_Marker size on 32bit systems Stefan Monnier
@ 2018-09-06 6:51 ` Paul Eggert
2018-09-06 12:17 ` Stefan Monnier
0 siblings, 1 reply; 28+ messages in thread
From: Paul Eggert @ 2018-09-06 6:51 UTC (permalink / raw)
To: Stefan Monnier, emacs-devel
Stefan Monnier wrote:
> used to be 24 (bytes) when we used Lisp_Misc, but it is now 32 (bytes) instead!
I'll take a look at it. I was hoping those 8 bytes wouldn't make enough
difference to worry about, but evidently not....
* Re: Lisp_Marker size on 32bit systems
2018-09-06 6:51 ` Paul Eggert
@ 2018-09-06 12:17 ` Stefan Monnier
2018-09-07 7:15 ` Paul Eggert
0 siblings, 1 reply; 28+ messages in thread
From: Stefan Monnier @ 2018-09-06 12:17 UTC (permalink / raw)
To: Paul Eggert; +Cc: emacs-devel
>> used to be 24 (bytes) when we used Lisp_Misc, but it is now 32
>> (bytes) instead!
> I'll take a look at it.
Thanks. AFAICT the only solution is to use the GCALIGNED_UNION trick in
each and every "real Lisp_Object struct" rather than once and for all in
vectorlike_header.
Maybe we should use a LISP_STRUCT macro like
#define LISP_STRUCT(name, fields) \
struct name { union { struct { fields } s; GCALIGNED_UNION; } u; }
> I was hoping those 8 bytes wouldn't make enough
> difference to worry about, but evidently not....
It's really the 4 extra padding bytes incurred by all vectorlikes that
annoy me. The resulting extra 8 bytes in markers is just a symptom ;-)
Stefan
* Re: Lisp_Marker size on 32bit systems
2018-09-06 12:17 ` Stefan Monnier
@ 2018-09-07 7:15 ` Paul Eggert
2018-09-07 8:05 ` Eli Zaretskii
2018-09-07 12:16 ` Stefan Monnier
0 siblings, 2 replies; 28+ messages in thread
From: Paul Eggert @ 2018-09-07 7:15 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
[-- Attachment #1: Type: text/plain, Size: 1176 bytes --]
Stefan Monnier wrote:
> Thanks. AFAICT the only solution is to use the GCALIGNED_UNION trick in
> each and every "real Lisp_Object struct" rather than once and forall in
> vectorlike_header.
The trick does need to move out of union vectorlike_header. However, the trick
is not needed for most of those structs, since they're allocated only by the GC
and are therefore already GC-aligned. The trick is needed only for structs that
C might allocate statically or on the stack, and whose addresses are tagged as
Lisp pointers. Just a few types do that, and I've noted them in the first
attached patch.
Although the first attached patch shrinks sizeof (struct Lisp_Marker) from 32 to
24 bytes on x86 as requested, allocate_pseudovector still *allocates* 32 bytes
for the struct, as it rounds the size up to the next multiple of alignof
(max_align_t), which is 16 on x86. It's not hard to change that to 8 (please see
2nd attached patch) but this causes a 20% CPU performance hit (!) to 'make
compile-always' on my platform (AMD Phenom II X4 910e circa 2010, Fedora 28
x86-64, gcc -m32 -march=native), so I didn't install and can't recommend the 2nd
attached patch.
[-- Attachment #2: 0001-Shrink-pseudovectors-a-bit.patch --]
[-- Type: text/x-patch, Size: 15465 bytes --]
From 2ccf72b1af5eef8746b9b6facb7c09e6258afb90 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Thu, 6 Sep 2018 19:17:14 -0700
Subject: [PATCH] Shrink pseudovectors a bit
sizeof (struct Lisp_Marker) was 32 on x86, where 24 would do.
Problem noted by Stefan Monnier in:
https://lists.gnu.org/r/emacs-devel/2018-09/msg00165.html
* src/bignum.h (struct Lisp_Bignum):
* src/frame.h (struct frame):
* src/lisp.h (struct Lisp_Vector, struct Lisp_Bool_Vector)
(struct Lisp_Char_Table, struct Lisp_Hash_Table)
(struct Lisp_Marker, struct Lisp_Overlay)
(struct Lisp_Misc_Ptr, struct Lisp_User_Ptr)
(struct Lisp_Finalizer, struct Lisp_Float)
(struct Lisp_Module_Function):
* src/process.h (struct Lisp_Process):
* src/termhooks.h (struct terminal):
* src/thread.h (struct thread_state, struct Lisp_Mutex)
(struct Lisp_CondVar):
* src/window.c (struct save_window_data):
* src/window.h (struct window):
* src/xterm.h (struct scroll_bar):
* src/xwidget.h (struct xwidget, struct xwidget_view):
Add GCALIGNED_STRUCT attribute.
* src/lisp.h (GCALIGNED_UNION_MEMBER): Renamed from GCALIGNED_UNION.
All uses changed.
(GCALIGNED_STRUCT_MEMBER, GCALIGNED_STRUCT, GCALIGNED): New macros.
All uses of open-coded GCALIGNED changed to use GCALIGNED.
(union vectorlike_header): No longer GC-aligned.
(PSEUDOVECSIZE): Yield 0 for pseudovectors without Lisp
objects that place a member before where the first Lisp object
member would be.
---
src/alloc.c | 8 +++--
src/bignum.h | 2 +-
src/fileio.c | 4 +--
src/frame.h | 2 +-
src/keymap.c | 4 +--
src/lisp.h | 90 ++++++++++++++++++++++++++++++-------------------
src/process.h | 2 +-
src/termhooks.h | 2 +-
src/thread.h | 6 ++--
src/window.c | 2 +-
src/window.h | 2 +-
src/xterm.h | 2 +-
src/xwidget.h | 4 +--
13 files changed, 76 insertions(+), 54 deletions(-)
diff --git a/src/alloc.c b/src/alloc.c
index 28ca7804ee..abb98a9eb6 100644
--- a/src/alloc.c
+++ b/src/alloc.c
@@ -641,9 +641,11 @@ buffer_memory_full (ptrdiff_t nbytes)
implement Lisp objects; since pseudovectors can contain any C type,
this is max_align_t. On recent GNU/Linux x86 and x86-64 this can
often waste up to 8 bytes, since alignof (max_align_t) is 16 but
- typical vectors need only an alignment of 8. However, it is not
- worth the hassle to avoid this waste. */
-enum { LISP_ALIGNMENT = alignof (union { max_align_t x; GCALIGNED_UNION }) };
+ typical vectors need only an alignment of 8. Although shrinking
+ the alignment to 8 would save memory, it cost a 20% hit to Emacs
+ CPU performance on Fedora 28 x86-64 when compiled with gcc -m32. */
+enum { LISP_ALIGNMENT = alignof (union { max_align_t x;
+ GCALIGNED_UNION_MEMBER }) };
verify (LISP_ALIGNMENT % GCALIGNMENT == 0);
/* True if malloc (N) is known to return storage suitably aligned for
diff --git a/src/bignum.h b/src/bignum.h
index 0e38c615ee..6551549343 100644
--- a/src/bignum.h
+++ b/src/bignum.h
@@ -39,7 +39,7 @@ struct Lisp_Bignum
{
union vectorlike_header header;
mpz_t value;
-};
+} GCALIGNED_STRUCT;
extern mpz_t mpz[4];
diff --git a/src/fileio.c b/src/fileio.c
index 66b2333317..5ca7c595f7 100644
--- a/src/fileio.c
+++ b/src/fileio.c
@@ -3394,9 +3394,9 @@ union read_non_regular
int fd;
ptrdiff_t inserted, trytry;
} s;
- GCALIGNED_UNION
+ GCALIGNED_UNION_MEMBER
};
-verify (alignof (union read_non_regular) % GCALIGNMENT == 0);
+verify (GCALIGNED (union read_non_regular));
static Lisp_Object
read_non_regular (Lisp_Object state)
diff --git a/src/frame.h b/src/frame.h
index a3bb633e57..ad7376a653 100644
--- a/src/frame.h
+++ b/src/frame.h
@@ -578,7 +578,7 @@ struct frame
enum ns_appearance_type ns_appearance;
bool_bf ns_transparent_titlebar;
#endif
-};
+} GCALIGNED_STRUCT;
/* Most code should use these functions to set Lisp fields in struct frame. */
diff --git a/src/keymap.c b/src/keymap.c
index 52db7b491f..79dce15a81 100644
--- a/src/keymap.c
+++ b/src/keymap.c
@@ -554,9 +554,9 @@ union map_keymap
Lisp_Object args;
void *data;
} s;
- GCALIGNED_UNION
+ GCALIGNED_UNION_MEMBER
};
-verify (alignof (union map_keymap) % GCALIGNMENT == 0);
+verify (GCALIGNED (union map_keymap));
static void
map_keymap_char_table_item (Lisp_Object args, Lisp_Object key, Lisp_Object val)
diff --git a/src/lisp.h b/src/lisp.h
index 78c25f97dc..7e365e8f47 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -229,7 +229,7 @@ extern bool suppress_checking EXTERNALLY_VISIBLE;
USE_LSB_TAG not only requires the least 3 bits of pointers returned by
malloc to be 0 but also needs to be able to impose a mult-of-8 alignment
on some non-GC Lisp_Objects, all of which are aligned via
- GCALIGNED_UNION at the end of a union. */
+ GCALIGNED_UNION_MEMBER, GCALIGNED_STRUCT_MEMBER, and GCALIGNED_STRUCT. */
enum Lisp_Bits
{
@@ -282,7 +282,35 @@ error !;
# define GCALIGNMENT 1
#endif
-#define GCALIGNED_UNION char alignas (GCALIGNMENT) gcaligned;
+/* If a struct is always allocated by the GC and is therefore always
+ GC-aligned, put GCALIGNED_STRUCT after its closing '}'; this can
+ help the compiler generate better code.
+
+ To cause a union to have alignment of at least GCALIGNMENT, put
+ GCALIGNED_UNION_MEMBER in its member list. Similarly for a struct
+ and GCALIGNED_STRUCT_MEMBER, although this may make the struct a
+ bit bigger on non-GCC platforms. Any struct using
+ GCALIGNED_STRUCT_MEMBER should also use GCALIGNED_STRUCT.
+
+ Although these macros are reasonably portable, they are not
+ guaranteed on non-GCC platforms, as C11 does not require support
+ for alignment to GCALIGNMENT and older compilers may ignore
+ alignment requests. For any type T where garbage collection
+ requires alignment, use verify (GCALIGNED (T)) to verify the
+ requirement on the current platform. Types need this check if
+ their objects can be allocated outside the garbage collector. For
+ example, struct Lisp_Symbol needs the check because of lispsym and
+ struct Lisp_Cons needs it because of STACK_CONS. */
+
+#define GCALIGNED_UNION_MEMBER char alignas (GCALIGNMENT) gcaligned;
+#if HAVE_STRUCT_ATTRIBUTE_ALIGNED
+# define GCALIGNED_STRUCT_MEMBER
+# define GCALIGNED_STRUCT __attribute__ ((aligned (GCALIGNMENT)))
+#else
+# define GCALIGNED_STRUCT_MEMBER GCALIGNED_UNION_MEMBER
+# define GCALIGNED_STRUCT
+#endif
+#define GCALIGNED(type) (alignof (type) % GCALIGNMENT == 0)
/* Lisp_Word is a scalar word suitable for holding a tagged pointer or
integer. Usually it is a pointer to a deliberately-incomplete type
@@ -751,10 +779,10 @@ struct Lisp_Symbol
/* Next symbol in obarray bucket, if the symbol is interned. */
struct Lisp_Symbol *next;
} s;
- GCALIGNED_UNION
+ GCALIGNED_UNION_MEMBER
} u;
};
-verify (alignof (struct Lisp_Symbol) % GCALIGNMENT == 0);
+verify (GCALIGNED (struct Lisp_Symbol));
/* Declare a Lisp-callable function. The MAXARGS parameter has the same
meaning as in the DEFUN macro, and is used to construct a prototype. */
@@ -843,7 +871,9 @@ typedef EMACS_UINT Lisp_Word_tag;
and PSEUDOVECTORP cast their pointers to union vectorlike_header *,
because when two such pointers potentially alias, a compiler won't
incorrectly reorder loads and stores to their size fields. See
- Bug#8546. */
+ Bug#8546. This union formerly contained more members, and there's
+ no compelling reason to change it to a struct merely because the
+ number of members has been reduced to one. */
union vectorlike_header
{
/* The main member contains various pieces of information:
@@ -866,20 +896,7 @@ union vectorlike_header
Current layout limits the pseudovectors to 63 PVEC_xxx subtypes,
4095 Lisp_Objects in GC-ed area and 4095 word-sized other slots. */
ptrdiff_t size;
- /* Align the union so that there is no padding after it.
- This is needed for the following reason:
- If the alignment constraint of Lisp_Object is greater than the size of
- vectorlike_header (e.g. with-wide-int), vectorlike objects which have
- 0 Lisp_Object fields and whose 1st field has a smaller alignment
- constraint than Lisp_Object may end up with their 1st field "before
- pseudovector index 0", in which case PSEUDOVECSIZE will return
- a "negative" number. We could fix PSEUDOVECSIZE, but it's easier to
- just force rounding up the size of vectorlike_header to the alignment
- of Lisp_Object. */
- Lisp_Object align;
- GCALIGNED_UNION
};
-verify (alignof (union vectorlike_header) % GCALIGNMENT == 0);
INLINE bool
(SYMBOLP) (Lisp_Object x)
@@ -1251,10 +1268,10 @@ struct Lisp_Cons
struct Lisp_Cons *chain;
} u;
} s;
- GCALIGNED_UNION
+ GCALIGNED_UNION_MEMBER
} u;
};
-verify (alignof (struct Lisp_Cons) % GCALIGNMENT == 0);
+verify (GCALIGNED (struct Lisp_Cons));
INLINE bool
(NILP) (Lisp_Object x)
@@ -1373,10 +1390,10 @@ struct Lisp_String
unsigned char *data;
} s;
struct Lisp_String *next;
- GCALIGNED_UNION
+ GCALIGNED_UNION_MEMBER
} u;
};
-verify (alignof (struct Lisp_String) % GCALIGNMENT == 0);
+verify (GCALIGNED (struct Lisp_String));
INLINE bool
STRINGP (Lisp_Object x)
@@ -1507,7 +1524,7 @@ struct Lisp_Vector
{
union vectorlike_header header;
Lisp_Object contents[FLEXIBLE_ARRAY_MEMBER];
- };
+ } GCALIGNED_STRUCT;
INLINE bool
(VECTORLIKEP) (Lisp_Object x)
@@ -1599,7 +1616,7 @@ struct Lisp_Bool_Vector
The bits are in little-endian order in the bytes, and
the bytes are in little-endian order in the words. */
bits_word data[FLEXIBLE_ARRAY_MEMBER];
- };
+ } GCALIGNED_STRUCT;
/* Some handy constants for calculating sizes
and offsets, mostly of vectorlike objects. */
@@ -1765,7 +1782,8 @@ memclear (void *p, ptrdiff_t nbytes)
ones that the GC needs to trace). */
#define PSEUDOVECSIZE(type, nonlispfield) \
- ((offsetof (type, nonlispfield) - header_size) / word_size)
+ (offsetof (type, nonlispfield) < header_size \
+ ? 0 : (offsetof (type, nonlispfield) - header_size) / word_size)
/* Compute A OP B, using the unsigned comparison operator OP. A and B
should be integer expressions. This is not the same as
@@ -1830,7 +1848,7 @@ struct Lisp_Char_Table
/* These hold additional data. It is a vector. */
Lisp_Object extras[FLEXIBLE_ARRAY_MEMBER];
- };
+ } GCALIGNED_STRUCT;
INLINE bool
CHAR_TABLE_P (Lisp_Object a)
@@ -1942,7 +1960,9 @@ struct Lisp_Subr
const char *symbol_name;
const char *intspec;
EMACS_INT doc;
- };
+ GCALIGNED_STRUCT_MEMBER
+ } GCALIGNED_STRUCT;
+verify (GCALIGNED (struct Lisp_Subr));
INLINE bool
SUBRP (Lisp_Object a)
@@ -2194,7 +2214,7 @@ struct Lisp_Hash_Table
/* Next weak hash table if this is a weak hash table. The head
of the list is in weak_hash_tables. */
struct Lisp_Hash_Table *next_weak;
-};
+} GCALIGNED_STRUCT;
INLINE bool
@@ -2313,7 +2333,7 @@ struct Lisp_Marker
used to implement the functionality of markers, but rather to (ab)use
markers as a cache for char<->byte mappings). */
ptrdiff_t bytepos;
-};
+} GCALIGNED_STRUCT;
/* START and END are markers in the overlay's buffer, and
PLIST is the overlay's property list. */
@@ -2335,13 +2355,13 @@ struct Lisp_Overlay
Lisp_Object end;
Lisp_Object plist;
struct Lisp_Overlay *next;
- };
+ } GCALIGNED_STRUCT;
struct Lisp_Misc_Ptr
{
union vectorlike_header header;
void *pointer;
- };
+ } GCALIGNED_STRUCT;
extern Lisp_Object make_misc_ptr (void *);
@@ -2388,7 +2408,7 @@ struct Lisp_User_Ptr
union vectorlike_header header;
void (*finalizer) (void *);
void *p;
-};
+} GCALIGNED_STRUCT;
#endif
/* A finalizer sentinel. */
@@ -2404,7 +2424,7 @@ struct Lisp_Finalizer
/* Circular list of all active weak references. */
struct Lisp_Finalizer *prev;
struct Lisp_Finalizer *next;
- };
+ } GCALIGNED_STRUCT;
INLINE bool
FINALIZERP (Lisp_Object x)
@@ -2616,7 +2636,7 @@ struct Lisp_Float
double data;
struct Lisp_Float *chain;
} u;
- };
+ } GCALIGNED_STRUCT;
INLINE bool
(FLOATP) (Lisp_Object x)
@@ -3946,7 +3966,7 @@ struct Lisp_Module_Function
ptrdiff_t min_arity, max_arity;
emacs_subr subr;
void *data;
-};
+} GCALIGNED_STRUCT;
INLINE bool
MODULE_FUNCTIONP (Lisp_Object o)
diff --git a/src/process.h b/src/process.h
index 6bc22146a7..3c6dd7b91f 100644
--- a/src/process.h
+++ b/src/process.h
@@ -203,7 +203,7 @@ struct Lisp_Process
bool_bf gnutls_p : 1;
bool_bf gnutls_complete_negotiation_p : 1;
#endif
-};
+ } GCALIGNED_STRUCT;
INLINE bool
PROCESSP (Lisp_Object a)
diff --git a/src/termhooks.h b/src/termhooks.h
index 8b5f648b43..211429169b 100644
--- a/src/termhooks.h
+++ b/src/termhooks.h
@@ -661,7 +661,7 @@ struct terminal
frames on the terminal when it calls this hook, so infinite
recursion is prevented. */
void (*delete_terminal_hook) (struct terminal *);
-};
+} GCALIGNED_STRUCT;
INLINE bool
TERMINALP (Lisp_Object a)
diff --git a/src/thread.h b/src/thread.h
index 8ecb00824d..28d8d864fb 100644
--- a/src/thread.h
+++ b/src/thread.h
@@ -184,7 +184,7 @@ struct thread_state
/* Threads are kept on a linked list. */
struct thread_state *next_thread;
-};
+} GCALIGNED_STRUCT;
INLINE bool
THREADP (Lisp_Object a)
@@ -231,7 +231,7 @@ struct Lisp_Mutex
/* The lower-level mutex object. */
lisp_mutex_t mutex;
-};
+} GCALIGNED_STRUCT;
INLINE bool
MUTEXP (Lisp_Object a)
@@ -265,7 +265,7 @@ struct Lisp_CondVar
/* The lower-level condition variable object. */
sys_cond_t cond;
-};
+} GCALIGNED_STRUCT;
INLINE bool
CONDVARP (Lisp_Object a)
diff --git a/src/window.c b/src/window.c
index d4fc5568a5..04de965680 100644
--- a/src/window.c
+++ b/src/window.c
@@ -6268,7 +6268,7 @@ struct save_window_data
/* These are currently unused. We need them as soon as we convert
to pixels. */
int frame_menu_bar_height, frame_tool_bar_height;
- };
+ } GCALIGNED_STRUCT;
/* This is saved as a Lisp_Vector. */
struct saved_window
diff --git a/src/window.h b/src/window.h
index 013083eb9a..cc0b6b6667 100644
--- a/src/window.h
+++ b/src/window.h
@@ -400,7 +400,7 @@ struct window
/* Z_BYTE - buffer position of the last glyph in the current matrix of W.
Should be nonnegative, and only valid if window_end_valid is true. */
ptrdiff_t window_end_bytepos;
- };
+ } GCALIGNED_STRUCT;
INLINE bool
WINDOWP (Lisp_Object a)
diff --git a/src/xterm.h b/src/xterm.h
index 1849a5c953..2ea8a93f8c 100644
--- a/src/xterm.h
+++ b/src/xterm.h
@@ -937,7 +937,7 @@ struct scroll_bar
/* True if the scroll bar is horizontal. */
bool horizontal;
-};
+} GCALIGNED_STRUCT;
/* Turning a lisp vector value into a pointer to a struct scroll_bar. */
#define XSCROLL_BAR(vec) ((struct scroll_bar *) XVECTOR (vec))
diff --git a/src/xwidget.h b/src/xwidget.h
index 89fc7ff458..c203d4f60c 100644
--- a/src/xwidget.h
+++ b/src/xwidget.h
@@ -61,7 +61,7 @@ struct xwidget
/* Kill silently if Emacs is exited. */
bool_bf kill_without_query : 1;
-};
+} GCALIGNED_STRUCT;
struct xwidget_view
{
@@ -88,7 +88,7 @@ struct xwidget_view
int clip_left;
long handler_id;
-};
+} GCALIGNED_STRUCT;
#endif
/* Test for xwidget pseudovector. */
--
2.17.1
[-- Attachment #3: marker24.diff --]
[-- Type: text/x-patch, Size: 1590 bytes --]
diff --git a/src/alloc.c b/src/alloc.c
index a0639fd577..cbeb51bbc9 100644
--- a/src/alloc.c
+++ b/src/alloc.c
@@ -638,13 +638,23 @@ buffer_memory_full (ptrdiff_t nbytes)
/* LISP_ALIGNMENT is the alignment of Lisp objects. It must be at
least GCALIGNMENT so that pointers can be tagged. It also must be
at least as strict as the alignment of all the C types used to
- implement Lisp objects; since pseudovectors can contain any C type,
- this is max_align_t. On recent GNU/Linux x86 and x86-64 this can
- often waste up to 8 bytes, since alignof (max_align_t) is 16 but
- typical vectors need only an alignment of 8. However, it is not
- worth the hassle to avoid this waste. */
-enum { LISP_ALIGNMENT = alignof (union { max_align_t x;
- GCALIGNED_UNION_MEMBER }) };
+ implement Lisp objects. This union contains all the C types whose
+ alignment contributes to LISP_ALIGNMENT. This is not an exhaustive
+ list of the types, just enough so that the answer works on all
+ practical Emacs targets. This union does not contain max_align_t,
+ because with recent GCC on x86 that has an alignment of 16, but
+ Emacs does not use any types requiring an alignment more than 8.
+ Emacs modules must respect the alignment limit here. */
+union Lisp_kitchen_sink
+{
+ double d;
+ intmax_t i;
+ uintmax_t u;
+ void (*f) (void);
+ void *p;
+ GCALIGNED_UNION_MEMBER
+};
+enum { LISP_ALIGNMENT = alignof (union Lisp_kitchen_sink) };
verify (LISP_ALIGNMENT % GCALIGNMENT == 0);
/* True if malloc (N) is known to return storage suitably aligned for
* Re: Lisp_Marker size on 32bit systems
2018-09-07 7:15 ` Paul Eggert
@ 2018-09-07 8:05 ` Eli Zaretskii
2018-09-07 13:45 ` Paul Eggert
2018-09-07 12:16 ` Stefan Monnier
1 sibling, 1 reply; 28+ messages in thread
From: Eli Zaretskii @ 2018-09-07 8:05 UTC (permalink / raw)
To: Paul Eggert; +Cc: monnier, emacs-devel
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Fri, 7 Sep 2018 00:15:57 -0700
> Cc: emacs-devel@gnu.org
>
> Stefan Monnier wrote:
>
> > Thanks. AFAICT the only solution is to use the GCALIGNED_UNION trick in
> > each and every "real Lisp_Object struct" rather than once and forall in
> > vectorlike_header.
>
> The trick does need to move out of union vectorlike_header. However, the trick
> is not needed for most of those structs, since they're allocated only by the GC
> and are therefore already GC-aligned. The trick is needed only for structs that
> C might allocate statically or on the stack, and whose addresses are tagged as
> Lisp pointers. Just a few types do that, and I've noted them in the first
> attached patch.
>
> Although the first attached patch shrinks sizeof (struct Lisp_Marker) from 32 to
> 24 bytes on x86 as requested, allocate_pseudovector still *allocates* 32 bytes
> for the struct, as it rounds the size up to the next multiple of alignof
> (max_align_t), which is 16 on x86. It's not hard to change that to 8 (please see
> 2nd attached patch) but this causes a 20% CPU performance hit (!) to 'make
> compile-always' on my platform (AMD Phenom II X4 910e circa 2010, Fedora 28
> x86-64, gcc -m32 -march=native), so I didn't install and can't recommend the 2nd
> attached patch.
The current master fails to build in the x86 32-bit configuration with
wide ints:
In file included from lisp.h:35:0,
from window.c:25:
../lib/verify.h:207:21: error: static assertion failed: "verify (header_size == sizeof (union vectorlike_header))"
# define _GL_VERIFY _Static_assert
^
../lib/verify.h:252:20: note: in expansion of macro '_GL_VERIFY'
# define verify(R) _GL_VERIFY (R, "verify (" #R ")")
^~~~~~~~~~
lisp.h:1630:1: note: in expansion of macro 'verify'
verify (header_size == sizeof (union vectorlike_header));
^~~~~~
Makefile:385: recipe for target `window.o' failed
* Re: Lisp_Marker size on 32bit systems
2018-09-07 7:15 ` Paul Eggert
2018-09-07 8:05 ` Eli Zaretskii
@ 2018-09-07 12:16 ` Stefan Monnier
2018-09-07 19:04 ` Paul Eggert
1 sibling, 1 reply; 28+ messages in thread
From: Stefan Monnier @ 2018-09-07 12:16 UTC (permalink / raw)
To: Paul Eggert; +Cc: emacs-devel
> Although the first attached patch shrinks sizeof (struct Lisp_Marker) from 32
> to 24 bytes on x86 as requested, allocate_pseudovector still *allocates* 32
> bytes for the struct, as it rounds the size up to the next multiple of
> alignof (max_align_t), which is 16 on x86. It's not hard to change that to
> 8 (please see 2nd attached patch) but this causes a 20% CPU performance hit
> (!) to 'make compile-always' on my platform (AMD Phenom II X4 910e circa
> 2010, Fedora 28 x86-64, gcc -m32 -march=native), so I didn't install and
> can't recommend the 2nd attached patch.
Where does this 20% slow down come from?
Stefan
* Re: Lisp_Marker size on 32bit systems
2018-09-07 8:05 ` Eli Zaretskii
@ 2018-09-07 13:45 ` Paul Eggert
2018-09-07 14:12 ` GDB and compiler-operations (was: Lisp_Marker size on 32bit systems) Stefan Monnier
2018-09-07 14:19 ` Lisp_Marker size on 32bit systems Eli Zaretskii
0 siblings, 2 replies; 28+ messages in thread
From: Paul Eggert @ 2018-09-07 13:45 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: monnier, emacs-devel
Eli Zaretskii wrote:
> The current master fails to build in the x86 32-bit configuration with
> wide ints:
>
> In file included from lisp.h:35:0,
> from window.c:25:
> ../lib/verify.h:207:21: error: static assertion failed: "verify (header_size == sizeof (union vectorlike_header))"
It works for me in that configuration in Fedora 28. I get the following values;
what do you get?
sizeof (ptrdiff_t) = 4
sizeof (union vectorlike_header) = 4
offsetof (struct Lisp_Vector, contents) = 4
offsetof (struct Lisp_Sub_Char_Table, depth) == 4
offsetof (struct Lisp_Sub_Char_Table, contents) == 12
If you're getting different values, it could be that the fact that the code ever
worked at all is just luck.
I am using gcc 8.1.1 20180712 (Red Hat 8.1.1-5), and configure this way
(because many modules don't work in 32-bit mode):
./configure --with-wide-int CC='gcc -m32 -march=native' --enable-gcc-warnings
--without-sound --without-dbus --without-file-notification --without-gconf
--without-gif --without-gsettings --without-imagemagick --without-rsvg
--with-x-toolkit=no --with-modules
* GDB and compiler-operations (was: Lisp_Marker size on 32bit systems)
2018-09-07 13:45 ` Paul Eggert
@ 2018-09-07 14:12 ` Stefan Monnier
2018-09-07 14:23 ` Eli Zaretskii
` (3 more replies)
2018-09-07 14:19 ` Lisp_Marker size on 32bit systems Eli Zaretskii
1 sibling, 4 replies; 28+ messages in thread
From: Stefan Monnier @ 2018-09-07 14:12 UTC (permalink / raw)
To: emacs-devel
> sizeof (ptrdiff_t) = 4
> sizeof (union vectorlike_header) = 4
> offsetof (struct Lisp_Vector, contents) = 4
> offsetof (struct Lisp_Sub_Char_Table, depth) == 4
> offsetof (struct Lisp_Sub_Char_Table, contents) == 12
BTW, after too many years learning to only use C functions and variables
in GDB (and not macros, CPP constants, or other compile-time-only
thingies), I only recently started to extend my GDB world.
Along the way I discovered that while `sizeof` works great, `offsetof`
gives me an error:
No symbol "offsetof" in current context.
any idea why this is (and how to fix it)?
Stefan
* Re: Lisp_Marker size on 32bit systems
2018-09-07 13:45 ` Paul Eggert
2018-09-07 14:12 ` GDB and compiler-operations (was: Lisp_Marker size on 32bit systems) Stefan Monnier
@ 2018-09-07 14:19 ` Eli Zaretskii
2018-09-07 16:27 ` Paul Eggert
1 sibling, 1 reply; 28+ messages in thread
From: Eli Zaretskii @ 2018-09-07 14:19 UTC (permalink / raw)
To: Paul Eggert; +Cc: monnier, emacs-devel
> Cc: monnier@IRO.UMontreal.CA, emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Fri, 7 Sep 2018 06:45:48 -0700
>
> > In file included from lisp.h:35:0,
> > from window.c:25:
> > ../lib/verify.h:207:21: error: static assertion failed: "verify (header_size == sizeof (union vectorlike_header))"
>
> It works for me in that configuration in Fedora 28. I get the following values;
> what do you get?
>
> sizeof (ptrdiff_t) = 4
> sizeof (union vectorlike_header) = 4
> offsetof (struct Lisp_Vector, contents) = 4
> offsetof (struct Lisp_Sub_Char_Table, depth) == 4
> offsetof (struct Lisp_Sub_Char_Table, contents) == 12
I obtained the below by building Emacs after commenting out the
offending 'verify':
(gdb) ptype union vectorlike_header
type = union vectorlike_header {
ptrdiff_t size;
}
(gdb) p sizeof(union vectorlike_header)
$3 = 4
(gdb) ptype /o struct Lisp_Vector
/* offset | size */ type = struct Lisp_Vector {
/* 0 | 4 */ union vectorlike_header {
/* 4 */ ptrdiff_t size;
/* total size (bytes): 4 */
} header;
/* XXX 4-byte hole */
/* 8 | 0 */ Lisp_Object contents[];
/* total size (bytes): 8 */
}
(gdb) p sizeof(ptrdiff_t)
$4 = 4
(gdb) ptype /o struct Lisp_Sub_Char_Table
/* offset | size */ type = struct Lisp_Sub_Char_Table {
/* 0 | 4 */ union vectorlike_header {
/* 4 */ ptrdiff_t size;
/* total size (bytes): 4 */
} header;
/* 4 | 4 */ int depth;
/* 8 | 4 */ int min_char;
/* XXX 4-byte hole */
/* 16 | 0 */ Lisp_Object contents[];
/* total size (bytes): 16 */
}
I think GCC aligns the Lisp_Object array within the structures because
a Lisp_Object is an 8-byte data type in this configuration.
* Re: GDB and compiler-operations (was: Lisp_Marker size on 32bit systems)
2018-09-07 14:12 ` GDB and compiler-operations (was: Lisp_Marker size on 32bit systems) Stefan Monnier
@ 2018-09-07 14:23 ` Eli Zaretskii
2018-09-07 15:16 ` GDB and compiler-operations Andreas Schwab
` (2 subsequent siblings)
3 siblings, 0 replies; 28+ messages in thread
From: Eli Zaretskii @ 2018-09-07 14:23 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Fri, 07 Sep 2018 10:12:06 -0400
>
> BTW, after too many years learning to only use C functions and variables
> in GDB (and not macros, CPP constants, or other compile-time-only
> thingies), I only recently started to extend my GDB world.
Good for you!
> Along the way I discovered that while `sizeof` works great, `offsetof`
> gives me an error:
>
> No symbol "offsetof" in current context.
AFAIK, GDB has special support for sizeof, but not for offsetof.
Maybe you should ask for a new GDB feature. Or you could use the
trick I used when Paul asked for values of offsets.
* Re: GDB and compiler-operations
2018-09-07 14:12 ` GDB and compiler-operations (was: Lisp_Marker size on 32bit systems) Stefan Monnier
2018-09-07 14:23 ` Eli Zaretskii
@ 2018-09-07 15:16 ` Andreas Schwab
2018-09-07 15:48 ` GDB and compiler-operations (was: Lisp_Marker size on 32bit systems) Paul Eggert
2018-09-07 19:59 ` Tom Tromey
3 siblings, 0 replies; 28+ messages in thread
From: Andreas Schwab @ 2018-09-07 15:16 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
On Sep 07 2018, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> Along the way I discovered that while `sizeof` works great, `offsetof`
> gives me an error:
>
> No symbol "offsetof" in current context.
>
> any idea why this is (and how to fix it)?
sizeof is a keyword, offsetof a macro.
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
* Re: GDB and compiler-operations (was: Lisp_Marker size on 32bit systems)
2018-09-07 14:12 ` GDB and compiler-operations (was: Lisp_Marker size on 32bit systems) Stefan Monnier
2018-09-07 14:23 ` Eli Zaretskii
2018-09-07 15:16 ` GDB and compiler-operations Andreas Schwab
@ 2018-09-07 15:48 ` Paul Eggert
2018-09-07 15:58 ` GDB and compiler-operations Stefan Monnier
2018-09-07 19:59 ` Tom Tromey
3 siblings, 1 reply; 28+ messages in thread
From: Paul Eggert @ 2018-09-07 15:48 UTC (permalink / raw)
To: Stefan Monnier, emacs-devel
On 09/07/2018 07:12 AM, Stefan Monnier wrote:
> No symbol "offsetof" in current context.
On a newer platform where GDB can see C macros (you really should enable
this if you can, by the way, it makes debugging easier), I see this:
(gdb) p offsetof (struct Lisp_Vector, contents)
No symbol "__builtin_offsetof" in current context.
So the problem for me is that GDB does not support __builtin_offsetof.
This is a known bug in GDB:
https://sourceware.org/bugzilla/show_bug.cgi?id=16240
* Re: GDB and compiler-operations
2018-09-07 15:48 ` GDB and compiler-operations (was: Lisp_Marker size on 32bit systems) Paul Eggert
@ 2018-09-07 15:58 ` Stefan Monnier
2018-09-07 17:11 ` Eli Zaretskii
2018-09-07 17:15 ` Paul Eggert
0 siblings, 2 replies; 28+ messages in thread
From: Stefan Monnier @ 2018-09-07 15:58 UTC (permalink / raw)
To: emacs-devel
> On a newer platform where GDB can see C macros (you really should enable
> this if you can, by the way, it makes debugging easier),
I'm on Debian testing, which doesn't strike me as old.
How new does it need to be?
> I see this:
>
> (gdb) p offsetof (struct Lisp_Vector, contents)
> No symbol "__builtin_offsetof" in current context.
Hmm... indeed that's the error I was seeing the other day, but today
I get the other one. IIRC this was the same machine, tho (and if not,
it was one running Debian testing or Debian stable, so nothing newer).
Stefan
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems
2018-09-07 14:19 ` Lisp_Marker size on 32bit systems Eli Zaretskii
@ 2018-09-07 16:27 ` Paul Eggert
2018-09-07 17:16 ` Eli Zaretskii
0 siblings, 1 reply; 28+ messages in thread
From: Paul Eggert @ 2018-09-07 16:27 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: monnier, emacs-devel
[-- Attachment #1: Type: text/plain, Size: 479 bytes --]
On 09/07/2018 07:19 AM, Eli Zaretskii wrote:
> I think GCC aligns the Lisp_Object array within the structures because
> a Lisp_Object is an 8-byte data type in this configuration.
That alignment is platform-dependent. On Fedora 28 configured
--with-wide-int and with gcc -m32, a Lisp_Object is 8 bytes but its
alignment is only 4 bytes. Apparently the alignment of 'long long' is 4
on Fedora 28 x86, but 8 on MS-Windows x86.
I installed the attached; please give it a try.
[-- Attachment #2: 0001-Fix-overenthusiastic-header-size-check.patch --]
[-- Type: text/x-patch, Size: 3876 bytes --]
From 8776b3ccc765bff54b0186cadeba7c0a6fc60779 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Fri, 7 Sep 2018 09:17:25 -0700
Subject: [PATCH] Fix overenthusiastic header size check
Problem reported by Eli Zaretskii in:
https://lists.gnu.org/r/emacs-devel/2018-09/msg00222.html
* doc/lispref/internals.texi (Garbage Collection):
Document vector sizes and slot counts more accurately.
* src/lisp.h: Omit header_size sanity check that was too picky.
Add some less-picky checks.
---
doc/lispref/internals.texi | 4 +++-
src/lisp.h | 26 +++++++++++++++++++-------
2 files changed, 22 insertions(+), 8 deletions(-)
diff --git a/doc/lispref/internals.texi b/doc/lispref/internals.texi
index 3fe28446ea..d42e2444e6 100644
--- a/doc/lispref/internals.texi
+++ b/doc/lispref/internals.texi
@@ -382,7 +382,7 @@ Garbage Collection
The total size of all string data in bytes.
@item vector-size
-Internal size of a vector header, i.e., @code{sizeof (struct Lisp_Vector)}.
+Size in bytes of a vector of length 1, including its header.
@item used-vectors
The number of vector headers allocated from the vector blocks.
@@ -392,6 +392,8 @@ Garbage Collection
@item used-slots
The number of slots in all used vectors.
+Slot counts might include some or all overhead from vector headers,
+depending on the platform.
@item free-slots
The number of free slots in all vector blocks.
diff --git a/src/lisp.h b/src/lisp.h
index 7e365e8f47..56623a75f7 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -1619,7 +1619,16 @@ struct Lisp_Bool_Vector
} GCALIGNED_STRUCT;
/* Some handy constants for calculating sizes
- and offsets, mostly of vectorlike objects. */
+ and offsets, mostly of vectorlike objects.
+
+ The garbage collector assumes that the initial part of any struct
+ that starts with a union vectorlike_header followed by N
+ Lisp_Objects (some possibly in arrays and/or a trailing flexible
+ array) will be laid out like a struct Lisp_Vector with N
+ Lisp_Objects. This assumption is true in practice on known Emacs
+ targets even though the C standard does not guarantee it. This
+ header contains a few sanity checks that should suffice to detect
+ violations of this assumption on plausible practical hosts. */
enum
{
@@ -1627,7 +1636,6 @@ enum
bool_header_size = offsetof (struct Lisp_Bool_Vector, data),
word_size = sizeof (Lisp_Object)
};
-verify (header_size == sizeof (union vectorlike_header));
/* The number of data words and bytes in a bool vector with SIZE bits. */
@@ -1989,6 +1997,13 @@ enum char_table_specials
SUB_CHAR_TABLE_OFFSET = PSEUDOVECSIZE (struct Lisp_Sub_Char_Table, contents)
};
+/* Sanity-check pseudovector layout. */
+verify (offsetof (struct Lisp_Char_Table, defalt) == header_size);
+verify (offsetof (struct Lisp_Char_Table, extras)
+ == header_size + CHAR_TABLE_STANDARD_SLOTS * sizeof (Lisp_Object));
+verify (offsetof (struct Lisp_Sub_Char_Table, contents)
+ == header_size + SUB_CHAR_TABLE_OFFSET * sizeof (Lisp_Object));
+
/* Return the number of "extra" slots in the char table CT. */
INLINE int
@@ -1998,11 +2013,6 @@ CHAR_TABLE_EXTRA_SLOTS (struct Lisp_Char_Table *ct)
- CHAR_TABLE_STANDARD_SLOTS);
}
-/* Make sure that sub char-table contents slot is where we think it is. */
-verify (offsetof (struct Lisp_Sub_Char_Table, contents)
- == (offsetof (struct Lisp_Vector, contents)
- + SUB_CHAR_TABLE_OFFSET * sizeof (Lisp_Object)));
-
/* Save and restore the instruction and environment pointers,
without affecting the signal mask. */
@@ -2216,6 +2226,8 @@ struct Lisp_Hash_Table
struct Lisp_Hash_Table *next_weak;
} GCALIGNED_STRUCT;
+/* Sanity-check pseudovector layout. */
+verify (offsetof (struct Lisp_Hash_Table, weak) == header_size);
INLINE bool
HASH_TABLE_P (Lisp_Object a)
--
2.17.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: GDB and compiler-operations
2018-09-07 15:58 ` GDB and compiler-operations Stefan Monnier
@ 2018-09-07 17:11 ` Eli Zaretskii
2018-09-07 17:15 ` Paul Eggert
1 sibling, 0 replies; 28+ messages in thread
From: Eli Zaretskii @ 2018-09-07 17:11 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Fri, 07 Sep 2018 11:58:06 -0400
>
> > (gdb) p offsetof (struct Lisp_Vector, contents)
> > No symbol "__builtin_offsetof" in current context.
>
> Hmm... indeed that's the error I was seeing the other day, but today
> I get the other one. IIRC this was the same machine, tho (and if not,
> it was one running Debian testing or Debian stable, so nothing newer).
It depends on whether the source was compiled with -g3 or not.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: GDB and compiler-operations
2018-09-07 15:58 ` GDB and compiler-operations Stefan Monnier
2018-09-07 17:11 ` Eli Zaretskii
@ 2018-09-07 17:15 ` Paul Eggert
1 sibling, 0 replies; 28+ messages in thread
From: Paul Eggert @ 2018-09-07 17:15 UTC (permalink / raw)
To: Stefan Monnier, emacs-devel
On 09/07/2018 08:58 AM, Stefan Monnier wrote:
>> On a newer platform where GDB can see C macros (you really should enable
>> this if you can, by the way, it makes debugging easier),
> I'm on Debian testing, which doesn't strike me as old.
> How new does it need to be?
Sorry, don't know offhand. But tool age doesn't appear to apply to you
(see below).
>
>> I see this:
>>
>> (gdb) p offsetof (struct Lisp_Vector, contents)
>> No symbol "__builtin_offsetof" in current context.
> Hmm... indeed that's the error I was seeing the other day, but today
> I get the other one. IIRC this was the same machine, tho (and if not,
> it was one running Debian testing or Debian stable, so nothing newer).
Most likely when you were getting the other message you compiled with
plain -g instead of -g3. './configure' tries to default to -g3 if
available but perhaps you overrode that, or it didn't work for you.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems
2018-09-07 16:27 ` Paul Eggert
@ 2018-09-07 17:16 ` Eli Zaretskii
2018-09-07 18:13 ` Paul Eggert
0 siblings, 1 reply; 28+ messages in thread
From: Eli Zaretskii @ 2018-09-07 17:16 UTC (permalink / raw)
To: Paul Eggert; +Cc: monnier, emacs-devel
> Cc: monnier@IRO.UMontreal.CA, emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Fri, 7 Sep 2018 09:27:56 -0700
>
> On 09/07/2018 07:19 AM, Eli Zaretskii wrote:
> > I think GCC aligns the Lisp_Object array within the structures because
> > a Lisp_Object is an 8-byte data type in this configuration.
>
> That alignment is platform-dependent. On Fedora 28 configured
> --with-wide-int and with gcc -m32, a Lisp_Object is 8 bytes but its
> alignment is only 4 bytes. Apparently the alignment of 'long long' is 4
> on Fedora 28 x86, but 8 on MS-Windows x86.
Isn't it strange, though? Why would that be platform dependent?
Could it be due to GCC version differences (mine is 7.3.0)?
> I installed the attached; please give it a try.
Builds fine, thanks.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems
2018-09-07 17:16 ` Eli Zaretskii
@ 2018-09-07 18:13 ` Paul Eggert
2018-09-07 18:32 ` Eli Zaretskii
0 siblings, 1 reply; 28+ messages in thread
From: Paul Eggert @ 2018-09-07 18:13 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: monnier, emacs-devel
On 09/07/2018 10:16 AM, Eli Zaretskii wrote:
> Why would that be platform dependent?
There are differences in GCC struct layout between MS-Windows and
GNU/Linux; see the GCC option -mms-bitfields. I expect it's the usual
story about platforms making different tradeoffs between backward
compatibility vs performance. I wasn't aware of the 'long long'
alignment issue until now, though.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems
2018-09-07 18:13 ` Paul Eggert
@ 2018-09-07 18:32 ` Eli Zaretskii
2018-09-07 19:05 ` Paul Eggert
0 siblings, 1 reply; 28+ messages in thread
From: Eli Zaretskii @ 2018-09-07 18:32 UTC (permalink / raw)
To: Paul Eggert; +Cc: monnier, emacs-devel
> Cc: monnier@IRO.UMontreal.CA, emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Fri, 7 Sep 2018 11:13:49 -0700
>
> There are differences in GCC struct layout between MS-Windows and
> GNU/Linux; see the GCC option -mms-bitfields.
"-mms-bitfields" is about something very different, and specifically
for compatibility with Microsoft compilers. I don't see how that
could affect long long.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems
2018-09-07 12:16 ` Stefan Monnier
@ 2018-09-07 19:04 ` Paul Eggert
2018-09-07 19:45 ` Stefan Monnier
0 siblings, 1 reply; 28+ messages in thread
From: Paul Eggert @ 2018-09-07 19:04 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
On 09/07/2018 05:16 AM, Stefan Monnier wrote:
> Where does this 20% slow down come from?
Ha! It was because the 16-byte alignment caused pure space to overflow,
which disabled GC. No wonder there was such a big performance difference.
I fixed that, and now the patch to shrink marker allocations from 32 to
24 bytes on x86 causes my standard benchmark (make compile-always) to
run only 1.0% slower, which is more reasonable. A microbenchmark of
running (make-marker) over and over again for 10,000,000 times runs 36%
slower (815 vs 597 ns for a single call). So it still looks like we
should be following GCC's max_align_t hint and using 16-byte alignment
on x86, even though this wastes memory.
Maybe we should be using 4 mark bits instead of 3?
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems
2018-09-07 18:32 ` Eli Zaretskii
@ 2018-09-07 19:05 ` Paul Eggert
2018-09-07 19:22 ` Eli Zaretskii
0 siblings, 1 reply; 28+ messages in thread
From: Paul Eggert @ 2018-09-07 19:05 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: monnier, emacs-devel
On 09/07/2018 11:32 AM, Eli Zaretskii wrote:
>> There are differences in GCC struct layout between MS-Windows and
>> GNU/Linux; see the GCC option -mms-bitfields.
> "-mms-bitfields" is about something very different, and specifically
> for compatibility with Microsoft compilers. I don't see how that
> could affect long long.
Sure, but the point is that GCC does not lay out structures identically
on MS-Windows vs GNU/Linux. If it makes an exception for bitfields it
may very well make an exception for long long.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems
2018-09-07 19:05 ` Paul Eggert
@ 2018-09-07 19:22 ` Eli Zaretskii
0 siblings, 0 replies; 28+ messages in thread
From: Eli Zaretskii @ 2018-09-07 19:22 UTC (permalink / raw)
To: Paul Eggert; +Cc: monnier, emacs-devel
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Fri, 7 Sep 2018 12:05:18 -0700
> Cc: monnier@IRO.UMontreal.CA, emacs-devel@gnu.org
>
> Sure, but the point is that GCC does not lay out structures identically
> on MS-Windows vs GNU/Linux. If it makes an exception for bitfields it
> may very well make an exception for long long.
Yes, the facts are unequivocal. I just don't understand why the
behavior differs.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems
2018-09-07 19:04 ` Paul Eggert
@ 2018-09-07 19:45 ` Stefan Monnier
2018-09-07 21:03 ` Paul Eggert
0 siblings, 1 reply; 28+ messages in thread
From: Stefan Monnier @ 2018-09-07 19:45 UTC (permalink / raw)
To: Paul Eggert; +Cc: emacs-devel
>> Where does this 20% slow down come from?
> Ha! It was because the 16-byte alignment caused pure space to overflow,
> which disabled GC. No wonder there was such a big performance difference.
OK, that makes more sense.
> I fixed that, and now the patch to shrink marker allocations from 32 to 24
> bytes on x86 causes my standard benchmark (make compile-always) to run only
> 1.0% slower, which is more reasonable. A microbenchmark of running
> (make-marker) over and over again for 10,000,000 times runs 36% slower (815
> vs 597 ns for a single call).
I still wonder why it would be slower at all.
> Maybe we should be using 4 mark bits instead of 3?
On 32bit systems, both cons cells and float cells use 8 bytes each, so
aligning on multiples of 16 would double their memory use.
And we currently have one free tag, so not only would using 4 tag bits
significantly increase memory use for those objects, but it's not clear
what the extra tags would be useful for. Also with the bignum support,
the pressure to maximize the size of our fixnums is much lower, so we
could even consider using fewer Lisp_Int tags if we feel like we need
more tags.
FWIW, IIUC XEmacs uses a 2bit tag which simply distinguishes between
Lisp_Int0, Lisp_Int1, Char, and other objects. Since we don't have
chars, that's like using a single-bit tag for us.
Maybe we should introduce some way to instrument
SYMBOLP/STRINGP/VECTORP/MARKERP/CONSP/... in order to try and figure out
which objects are more deserving of having their tag right there in the
Lisp_Object rather than just in the vectorlike_header.
Stefan
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: GDB and compiler-operations
2018-09-07 14:12 ` GDB and compiler-operations (was: Lisp_Marker size on 32bit systems) Stefan Monnier
` (2 preceding siblings ...)
2018-09-07 15:48 ` GDB and compiler-operations (was: Lisp_Marker size on 32bit systems) Paul Eggert
@ 2018-09-07 19:59 ` Tom Tromey
3 siblings, 0 replies; 28+ messages in thread
From: Tom Tromey @ 2018-09-07 19:59 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
>>>>> "Stefan" == Stefan Monnier <monnier@iro.umontreal.ca> writes:
Stefan> Along the way I discovered that while `sizeof` works great, `offsetof`
Stefan> gives me an error:
Stefan> No symbol "offsetof" in current context.
Stefan> any idea why this is (and how to fix it)?
Other people answered the why. To fix it you have two options.
One, define an offsetof macro:
(gdb) macro define offsetof(type, field) ((int) &(((type *) 0)->field))
The gdb C expression parser will automatically use macros you define
interactively.
Two, instead of using offsetof to inspect a type, upgrade to a newish
(8.1 or better) gdb and use "ptype/o":
(gdb) ptype/o struct Lisp_Vector
/* offset    |  size */  type = struct Lisp_Vector {
/*    0      |     8 */    union vectorlike_header {
/*                 8 */        ptrdiff_t size;
/*                 1 */        char gcaligned;
                               /* total size (bytes):    8 */
                           } header;
/*    8      |     0 */    Lisp_Object contents[];
                           /* total size (bytes):    8 */
                         }
The output here is modeled on the pahole utility.
If you can't upgrade gdb, there's a "pahole.py" script out there that
adds a pahole command to gdb instead. I can email it if you really
need it, but some distros installed it by default. So you could just
try "(gdb) pahole struct Lisp_Vector".
Tom
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems
2018-09-07 19:45 ` Stefan Monnier
@ 2018-09-07 21:03 ` Paul Eggert
2018-09-08 1:54 ` Stefan Monnier
0 siblings, 1 reply; 28+ messages in thread
From: Paul Eggert @ 2018-09-07 21:03 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
[-- Attachment #1: Type: text/plain, Size: 1053 bytes --]
Stefan Monnier wrote:
> I still wonder why it would be slower at all.
My guess is cache effects. My processor has a cache line size of 64 bytes, so if
objects are allocated in 32-byte chunks they won't straddle cache boundaries and
code will be less likely to thrash the cache. I ran this benchmark in the 'lisp'
subdirectory:
EMACSLOADPATH= perf stat -dd '../src/emacs' -batch --no-site-file --no-site-lisp \
  --eval '(setq load-prefer-newer t)' -f batch-byte-compile org/org.el
and am attaching the results for the 24-byte allocation (a bit slower) and the
32-byte allocation (a bit faster), and they are in line with this guess.
>> Maybe we should be using 4 mark bits instead of 3?
> On 32bit systems, both cons cells and float cells use 8 bytes each, so
> aligning on multiples of 16 would double their memory use.
We'd use two tags for both conses and float cells, so that shouldn't be a problem.
> it's not clear
> what the extra tags would be useful for.
Presumably to help performance elsewhere. Admittedly I'm blue-skying a bit here.
[-- Attachment #2: perf-stat-emacs-24.txt --]
[-- Type: text/plain, Size: 2140 bytes --]
Performance counter stats for '../src/emacs -batch --no-site-file --no-site-lisp --eval (setq load-prefer-newer t) -f batch-byte-compile org/org.el':
3710.809824 task-clock:u (msec) # 0.998 CPUs utilized
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
4,298 page-faults:u # 0.001 M/sec
9,414,411,487 cycles:u # 2.537 GHz (18.79%)
757,273,968 stalled-cycles-frontend:u # 8.04% frontend cycles idle (18.80%)
4,395,722,568 stalled-cycles-backend:u # 46.69% backend cycles idle (18.80%)
9,301,424,026 instructions:u # 0.99 insn per cycle
# 0.47 stalled cycles per insn (18.79%)
1,390,988,726 branches:u # 374.848 M/sec (18.76%)
67,045,930 branch-misses:u # 4.82% of all branches (18.74%)
5,219,768,450 L1-dcache-loads:u # 1406.639 M/sec (18.73%)
43,583,947 L1-dcache-load-misses:u # 0.83% of all L1-dcache hits (18.72%)
85,736,523 LLC-loads:u # 23.105 M/sec (18.73%)
14,635,030 LLC-load-misses:u # 17.07% of all LL-cache hits (18.73%)
2,429,036,804 L1-icache-loads:u # 654.584 M/sec (18.73%)
7,260,772 L1-icache-load-misses:u # 0.30% of all L1-icache hits (18.72%)
5,206,401,315 dTLB-loads:u # 1403.036 M/sec (18.72%)
4,878,637 dTLB-load-misses:u # 0.09% of all dTLB cache hits (18.72%)
2,418,273,074 iTLB-loads:u # 651.683 M/sec (18.75%)
2,946 iTLB-load-misses:u # 0.00% of all iTLB cache hits (18.77%)
3.718081199 seconds time elapsed
[-- Attachment #3: perf-stat-emacs-32.txt --]
[-- Type: text/plain, Size: 2140 bytes --]
Performance counter stats for '../src/emacs -batch --no-site-file --no-site-lisp --eval (setq load-prefer-newer t) -f batch-byte-compile org/org.el':
3643.107970 task-clock:u (msec) # 0.998 CPUs utilized
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
4,320 page-faults:u # 0.001 M/sec
9,235,689,147 cycles:u # 2.535 GHz (18.77%)
683,279,700 stalled-cycles-frontend:u # 7.40% frontend cycles idle (18.74%)
4,395,369,277 stalled-cycles-backend:u # 47.59% backend cycles idle (18.73%)
9,226,971,053 instructions:u # 1.00 insn per cycle
# 0.48 stalled cycles per insn (18.74%)
1,360,574,869 branches:u # 373.465 M/sec (18.75%)
68,352,263 branch-misses:u # 5.02% of all branches (18.75%)
5,125,592,214 L1-dcache-loads:u # 1406.928 M/sec (18.74%)
41,529,042 L1-dcache-load-misses:u # 0.81% of all L1-dcache hits (18.74%)
77,752,725 LLC-loads:u # 21.342 M/sec (18.74%)
14,778,615 LLC-load-misses:u # 19.01% of all LL-cache hits (18.75%)
2,394,079,664 L1-icache-loads:u # 657.153 M/sec (18.74%)
6,663,498 L1-icache-load-misses:u # 0.28% of all L1-icache hits (18.74%)
5,164,701,717 dTLB-loads:u # 1417.664 M/sec (18.74%)
4,443,995 dTLB-load-misses:u # 0.09% of all dTLB cache hits (18.76%)
2,345,758,968 iTLB-loads:u # 643.889 M/sec (18.78%)
3,369 iTLB-load-misses:u # 0.00% of all iTLB cache hits (18.79%)
3.650332037 seconds time elapsed
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems
2018-09-07 21:03 ` Paul Eggert
@ 2018-09-08 1:54 ` Stefan Monnier
2018-09-08 3:04 ` Paul Eggert
0 siblings, 1 reply; 28+ messages in thread
From: Stefan Monnier @ 2018-09-08 1:54 UTC (permalink / raw)
To: Paul Eggert; +Cc: emacs-devel
>> I still wonder why it would be slower at all.
> My guess is cache effects. My processor has a cache line size of 64 bytes,
> so if objects are allocated in 32-byte chunks they won't straddle cache
> boundaries and code will be less likely to thrash the cache.
The difference of alignment is between multiples-of-8 and
multiples-of-16, and allocation is done by the vectorlike code, so
I think 32 byte objects aren't supposed to be much more likely to be
aligned on 32 byte boundaries than on 32B + 16B.
Hence, I don't find your argument very convincing.
> I ran this benchmark in the 'lisp' subdirectory:
> EMACSLOADPATH= perf stat -dd
> '../src/emacs' -batch --no-site-file --no-site-lisp --eval '(setq
> load-prefer-newer t)' -f batch-byte-compile org/org.el
>
> and am attaching the results for the 24-byte allocation (a bit slower) and
> the 32-byte allocation (a bit faster), and they are in line with this guess.
Those perf-stats also show improved I$ performance, which isn't
explained by your suggested explanation. Similarly, they show a reduced
number of instructions.
IOW, I think there's something else at play than just the cache effects.
>> it's not clear what the extra tags would be useful for.
> Presumably to help performance elsewhere. Admittedly I'm blue-skying
> a bit here.
I think a 16 byte alignment could indeed be a good idea on 64bit systems
(assuming we make the tags take advantage of it), but on 32bit systems
where the memory is usually more constrained to start with, I think it
would be a mistake.
Stefan
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems
2018-09-08 1:54 ` Stefan Monnier
@ 2018-09-08 3:04 ` Paul Eggert
2018-09-08 3:10 ` Stefan Monnier
0 siblings, 1 reply; 28+ messages in thread
From: Paul Eggert @ 2018-09-08 3:04 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
Stefan Monnier wrote:
> I think 32 byte objects aren't supposed to be much more likely to be
> aligned on 32 byte boundaries than on 32B + 16B.
True, I miscalculated. If the cache line size is 64 bytes and objects are
allocated on 16-byte boundaries (the case now), the probability that a
randomly-placed 24-byte marker (allocated as 32 bytes) will straddle into two
cache lines is (2-1)/4, or 25%. Whereas if objects are allocated in 8-byte
boundaries as you're suggesting, the probability that the same marker will
straddle is (3-1)/8, which is still 25%. So for this particular case the
straddling issue should be a wash.
> Those perf-stats also show improved I$ performance, which isn't
> explained by your suggested explanation. Similarly, they show a reduced
> number of instructions.
Yes, it could well be that the 32-byte allocation is faster than the 24 partly
due to some reason other than d-cache effects. Although there is a smaller
percentage of cache misses in the 32-byte version, it could be that this is
because the 32-byte version uses simpler code that would be faster even if the
cache miss rate were the same.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Lisp_Marker size on 32bit systems
2018-09-08 3:04 ` Paul Eggert
@ 2018-09-08 3:10 ` Stefan Monnier
0 siblings, 0 replies; 28+ messages in thread
From: Stefan Monnier @ 2018-09-08 3:10 UTC (permalink / raw)
To: Paul Eggert; +Cc: emacs-devel
>> Those perf-stats also show improved I$ performance, which isn't
>> explained by your suggested explanation. Similarly, they show
>> a reduced number of instructions.
>
> Yes, it could well be that the 32-byte allocation is faster than the 24
> partly due to some reason other than d-cache effects. Although there
> is a smaller percentage of cache misses in the 32-byte version, it could be
> that this is because the 32-byte version uses simpler code that would be
> faster even if the cache miss rate were the same.
That's my impression as well, but I'd be curious to know why that is.
The only "obvious" advantage is that 32 is a power of 2 so you can use
shift for multiplication/division, but that would only apply to things
like indexing arrays of markers or computing the diff between two
marker pointers. AFAIK we don't do any such operation.
Stefan
^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2018-09-08 3:10 UTC | newest]
Thread overview: 28+ messages
2018-09-06 0:41 Lisp_Marker size on 32bit systems Stefan Monnier
2018-09-06 6:51 ` Paul Eggert
2018-09-06 12:17 ` Stefan Monnier
2018-09-07 7:15 ` Paul Eggert
2018-09-07 8:05 ` Eli Zaretskii
2018-09-07 13:45 ` Paul Eggert
2018-09-07 14:12 ` GDB and compiler-operations (was: Lisp_Marker size on 32bit systems) Stefan Monnier
2018-09-07 14:23 ` Eli Zaretskii
2018-09-07 15:16 ` GDB and compiler-operations Andreas Schwab
2018-09-07 15:48 ` GDB and compiler-operations (was: Lisp_Marker size on 32bit systems) Paul Eggert
2018-09-07 15:58 ` GDB and compiler-operations Stefan Monnier
2018-09-07 17:11 ` Eli Zaretskii
2018-09-07 17:15 ` Paul Eggert
2018-09-07 19:59 ` Tom Tromey
2018-09-07 14:19 ` Lisp_Marker size on 32bit systems Eli Zaretskii
2018-09-07 16:27 ` Paul Eggert
2018-09-07 17:16 ` Eli Zaretskii
2018-09-07 18:13 ` Paul Eggert
2018-09-07 18:32 ` Eli Zaretskii
2018-09-07 19:05 ` Paul Eggert
2018-09-07 19:22 ` Eli Zaretskii
2018-09-07 12:16 ` Stefan Monnier
2018-09-07 19:04 ` Paul Eggert
2018-09-07 19:45 ` Stefan Monnier
2018-09-07 21:03 ` Paul Eggert
2018-09-08 1:54 ` Stefan Monnier
2018-09-08 3:04 ` Paul Eggert
2018-09-08 3:10 ` Stefan Monnier