Stefan Monnier wrote:

> I still wonder why it would be slower at all.

My guess is cache effects. My processor has a cache line size of 64 bytes, so if objects are allocated in 32-byte chunks they won't straddle cache-line boundaries and the code will be less likely to thrash the cache. I ran this benchmark in the 'lisp' subdirectory:

  EMACSLOADPATH= perf stat -dd '../src/emacs' -batch --no-site-file --no-site-lisp --eval '(setq load-prefer-newer t)' -f batch-byte-compile org/org.el

and am attaching the results for the 24-byte allocation (a bit slower) and the 32-byte allocation (a bit faster). They are in line with this guess.

>> Maybe we should be using 4 mark bits instead of 3?
> On 32bit systems, both cons cells and float cells use 8 bytes each, so
> aligning on multiples of 16 would double their memory use.

We'd use two tags for both conses and float cells, so that shouldn't be a problem.

> it's not clear
> what the extra tags would be useful for.

Presumably to help performance elsewhere. Admittedly I'm blue-skying a bit here.