* [RFC] temporary Lisp_Strings @ 2014-09-02 12:55 Dmitry Antipov 2014-09-02 13:21 ` Andreas Schwab ` (2 more replies) 0 siblings, 3 replies; 38+ messages in thread From: Dmitry Antipov @ 2014-09-02 12:55 UTC (permalink / raw) To: Emacs development discussions [-- Attachment #1: Type: text/plain, Size: 297 bytes --] I'm thinking about temporary Lisp_Strings on C stack (allocated with alloca). Simple implementation (with no check whether it fits on stack) and a few use cases attached. Among others, the question is: is there a way to make sure that an address returned by alloca fits in Lisp_Object? Dmitry [-- Attachment #2: alloca_string.patch --] [-- Type: text/x-patch, Size: 2374 bytes --] === modified file 'src/fileio.c' --- src/fileio.c 2014-09-02 11:41:22 +0000 +++ src/fileio.c 2014-09-02 11:47:45 +0000 @@ -1179,11 +1179,11 @@ char newdir_utf8[MAX_UTF8_PATH]; filename_from_ansi (newdir, newdir_utf8); - tem = build_string (newdir_utf8); + tem = alloca_string (newdir_utf8); } else #endif - tem = build_string (newdir); + tem = alloca_string (newdir); newdirlen = SBYTES (tem); if (multibyte && !STRING_MULTIBYTE (tem)) { @@ -1215,7 +1215,7 @@ /* `getpwnam' may return a unibyte string, which will bite us since we expect the directory to be multibyte. */ - tem = build_string (newdir); + tem = alloca_string (newdir); newdirlen = SBYTES (tem); if (multibyte && !STRING_MULTIBYTE (tem)) { @@ -1249,7 +1249,7 @@ adir = NULL; else if (multibyte) { - Lisp_Object tem = build_string (adir); + Lisp_Object tem = alloca_string (adir); tem = DECODE_FILE (tem); newdirlen = SBYTES (tem); @@ -1350,7 +1350,7 @@ getcwd (adir, adir_size); if (multibyte) { - Lisp_Object tem = build_string (adir); + Lisp_Object tem = alloca_string (adir); tem = DECODE_FILE (tem); newdirlen = SBYTES (tem); === modified file 'src/lisp.h' --- src/lisp.h 2014-09-02 06:49:40 +0000 +++ src/lisp.h 2014-09-02 12:31:01 +0000 @@ -4459,6 +4459,25 @@ memcpy (alloca (SBYTES (string) + 1), \ SSDATA (string), SBYTES (string) + 1) +/* Create temporary Lisp_String. Use alloca and do not disturb GC. */ + +#define alloca_string(str) \ + ({ Lisp_Object string; \ + struct Lisp_String *s; \ + ptrdiff_t nchars, nbytes, size = strlen (str); \ + parse_str_as_multibyte ((const unsigned char *) str, \ + size, &nchars, &nbytes); \ + s = alloca (sizeof *s + nbytes + 1); \ + s->data = (unsigned char *) s + sizeof (*s); \ + memcpy (s->data, str, nbytes); \ + s->data[size] = '\0'; \ + s->intervals = NULL; \ + if (nbytes == nchars || nbytes != size) \ + s->size = size, s->size_byte = -1; \ + else \ + s->size = nchars, s->size_byte = nbytes; \ + XSETSTRING (string, s); string; }) + /* Set up the name of the machine we're running on. */ extern void init_system_name (void); ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC] temporary Lisp_Strings 2014-09-02 12:55 [RFC] temporary Lisp_Strings Dmitry Antipov @ 2014-09-02 13:21 ` Andreas Schwab 2014-09-02 13:47 ` Dmitry Antipov 2014-09-02 14:00 ` Stefan Monnier 2014-09-02 14:37 ` [RFC] temporary Lisp_Strings Paul Eggert 2 siblings, 1 reply; 38+ messages in thread From: Andreas Schwab @ 2014-09-02 13:21 UTC (permalink / raw) To: Dmitry Antipov; +Cc: Emacs development discussions Dmitry Antipov <dmantipov@yandex.ru> writes: > I'm thinking about temporary Lisp_Strings on C stack (allocated with alloca). > Simple implementation (with no check whether it fits on stack) and a few use > cases attached. Among others, the question is: is there a way to make sure > that an address returned by alloca fits in Lisp_Object? There is none. The stack could be in a completely different address region that may not be suitable in non-USE_LSB_TAG configurations. > +#define alloca_string(str) \ > + ({ Lisp_Object string; \ > + struct Lisp_String *s; \ > + ptrdiff_t nchars, nbytes, size = strlen (str); \ > + parse_str_as_multibyte ((const unsigned char *) str, \ > + size, &nchars, &nbytes); \ > + s = alloca (sizeof *s + nbytes + 1); \ > + s->data = (unsigned char *) s + sizeof (*s); \ > + memcpy (s->data, str, nbytes); \ > + s->data[size] = '\0'; \ > + s->intervals = NULL; \ > + if (nbytes == nchars || nbytes != size) \ > + s->size = size, s->size_byte = -1; \ > + else \ > + s->size = nchars, s->size_byte = nbytes; \ > + XSETSTRING (string, s); string; }) This will eventually fail with USE_LSB_TAG, since you don't enforce alignment. Andreas. -- Andreas Schwab, SUSE Labs, schwab@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different." ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC] temporary Lisp_Strings 2014-09-02 13:21 ` Andreas Schwab @ 2014-09-02 13:47 ` Dmitry Antipov 2014-09-02 14:14 ` Andreas Schwab 2014-09-02 15:00 ` Eli Zaretskii 0 siblings, 2 replies; 38+ messages in thread From: Dmitry Antipov @ 2014-09-02 13:47 UTC (permalink / raw) To: Andreas Schwab; +Cc: Emacs development discussions On 09/02/2014 05:21 PM, Andreas Schwab wrote: > There is none. The stack could be in a completely different address > region that may not be suitable in non-USE_LSB_TAG configurations. IIUC non-USE_LSB_TAG configurations are not widely used, so this is a tier-2 problem. >> +#define alloca_string(str) \ >> + ({ Lisp_Object string; \ >> + struct Lisp_String *s; \ >> + ptrdiff_t nchars, nbytes, size = strlen (str); \ >> + parse_str_as_multibyte ((const unsigned char *) str, \ >> + size, &nchars, &nbytes); \ >> + s = alloca (sizeof *s + nbytes + 1); \ >> + s->data = (unsigned char *) s + sizeof (*s); \ >> + memcpy (s->data, str, nbytes); \ >> + s->data[size] = '\0'; \ >> + s->intervals = NULL; \ >> + if (nbytes == nchars || nbytes != size) \ >> + s->size = size, s->size_byte = -1; \ >> + else \ >> + s->size = nchars, s->size_byte = nbytes; \ >> + XSETSTRING (string, s); string; }) > > This will eventually fail with USE_LSB_TAG, since you don't enforce > alignment. What about alignment guaranteed by alloca? IIUC compilers tends to align it enough to US_LSB_TAG: http://msdn.microsoft.com/en-us/library/x9sx5da1.aspx https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55945 Dmitry ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC] temporary Lisp_Strings 2014-09-02 13:47 ` Dmitry Antipov @ 2014-09-02 14:14 ` Andreas Schwab 2014-09-02 15:00 ` Eli Zaretskii 1 sibling, 0 replies; 38+ messages in thread From: Andreas Schwab @ 2014-09-02 14:14 UTC (permalink / raw) To: Dmitry Antipov; +Cc: Emacs development discussions Dmitry Antipov <dmantipov@yandex.ru> writes: > What about alignment guaranteed by alloca? You only get as much as the architecture requires. > IIUC compilers tends to align it enough to US_LSB_TAG: tends != guarantees. Andreas. -- Andreas Schwab, SUSE Labs, schwab@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different." ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC] temporary Lisp_Strings 2014-09-02 13:47 ` Dmitry Antipov 2014-09-02 14:14 ` Andreas Schwab @ 2014-09-02 15:00 ` Eli Zaretskii 1 sibling, 0 replies; 38+ messages in thread From: Eli Zaretskii @ 2014-09-02 15:00 UTC (permalink / raw) To: Dmitry Antipov; +Cc: schwab, emacs-devel > Date: Tue, 02 Sep 2014 17:47:15 +0400 > From: Dmitry Antipov <dmantipov@yandex.ru> > Cc: Emacs development discussions <emacs-devel@gnu.org> > > What about alignment guaranteed by alloca? IIUC compilers tends > to align it enough to US_LSB_TAG: > > http://msdn.microsoft.com/en-us/library/x9sx5da1.aspx This is not relevant, it talks about the Microsoft compiler. > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55945 This is only about x64, AFAICT. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC] temporary Lisp_Strings 2014-09-02 12:55 [RFC] temporary Lisp_Strings Dmitry Antipov 2014-09-02 13:21 ` Andreas Schwab @ 2014-09-02 14:00 ` Stefan Monnier 2014-09-02 15:13 ` Dmitry Antipov 2014-09-02 14:37 ` [RFC] temporary Lisp_Strings Paul Eggert 2 siblings, 1 reply; 38+ messages in thread From: Stefan Monnier @ 2014-09-02 14:00 UTC (permalink / raw) To: Dmitry Antipov; +Cc: Emacs development discussions > I'm thinking about temporary Lisp_Strings on C stack (allocated with alloca). > Simple implementation (with no check whether it fits on stack) and a few use > cases attached. Among others, the question is: is there a way to make sure > that an address returned by alloca fits in Lisp_Object? Have you made some preliminary measurements (on microbenchmarks) to try and see how much speed up we might gain? Given the cost of strlen and parse_str_as_multibyte, I'd expect that the best-case benefit might turn out to be rather small. Stefan ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC] temporary Lisp_Strings 2014-09-02 14:00 ` Stefan Monnier @ 2014-09-02 15:13 ` Dmitry Antipov 2014-09-02 17:32 ` Stefan Monnier 0 siblings, 1 reply; 38+ messages in thread From: Dmitry Antipov @ 2014-09-02 15:13 UTC (permalink / raw) To: Stefan Monnier; +Cc: Emacs development discussions [-- Attachment #1: Type: text/plain, Size: 1597 bytes --] On 09/02/2014 06:00 PM, Stefan Monnier wrote: > Have you made some preliminary measurements (on microbenchmarks) to try > and see how much speed up we might gain? Given the cost of strlen and > parse_str_as_multibyte, I'd expect that the best-case benefit might turn > out to be rather small. For the moment, I don't have an idea how to benchmark parse_str_as_multibyte in "near-to-real-use" conditions. But I guess that we can have some gain for "simple" (short, especially short unibyte) strings. That guess is based on the following results (note ~3x speedup in strcpy/strcat workload): $ gcc -Wall -O2 -o t-alloca t-alloca.c t-use.c $ $ /usr/bin/time ./t-alloca 100 100 Use malloc for allocation and sprintf for workload 2.23user 0.00system 0:02.23elapsed 100%CPU (0avgtext+0avgdata 1440maxresident)k 0inputs+0outputs (0major+59minor)pagefaults 0swaps $ $ /usr/bin/time ./t-alloca 100 100 x Use alloca for allocation and sprintf for workload 1.85user 0.00system 0:01.85elapsed 99%CPU (0avgtext+0avgdata 1436maxresident)k 0inputs+0outputs (0major+59minor)pagefaults 0swaps $ $ gcc -DFAST -Wall -O2 -o t-alloca t-alloca.c t-use.c $ $ /usr/bin/time ./t-alloca 100 100 Use malloc for allocation and strcpy/strcat for workload 0.56user 0.00system 0:00.56elapsed 99%CPU (0avgtext+0avgdata 1440maxresident)k 0inputs+0outputs (0major+59minor)pagefaults 0swaps $ $ /usr/bin/time ./t-alloca 100 100 x Use alloca for allocation and strcpy/strcat for workload 0.20user 0.00system 0:00.20elapsed 100%CPU (0avgtext+0avgdata 1440maxresident)k 0inputs+0outputs (0major+60minor)pagefaults 0swaps Dmitry [-- Attachment #2: t-alloca.c --] [-- Type: text/x-csrc, Size: 874 bytes --] #include <stdio.h> #include <stdlib.h> #include <string.h> #ifdef FAST #define FUNCNAME "strcpy/strcat" #else #define FUNCNAME "sprintf" #endif extern void use_with_malloc (char *, int, char *, int); extern void use_with_alloca (char *, int, char *, int); int main (int argc, char *argv[]) { char *p1, *p2; int i, len1, len2; if (argc >= 3) { len1 = atoi (argv[1]); len2 = atoi (argv[2]); p1 = malloc (len1 + 1); memset (p1, 'a', len1); p1[len1] = '\0'; p2 = malloc (len2 + 1); memset (p1, 'b', len2); p2[len2] = '\0'; printf ("Use %s for allocation and %s for workload\n", (argc == 3) ? "malloc" : "alloca", FUNCNAME); for (i = 0; i < 10000000; i++) { if (argc == 3) use_with_malloc (p1, len1, p2, len2); else use_with_alloca (p1, len1, p2, len2); } } return 0; } [-- Attachment #3: t-use.c --] [-- Type: text/x-csrc, Size: 499 bytes --] #include <stdio.h> #include <stdlib.h> #include <string.h> char *ptr; void use_with_malloc (char *p, int plen, char *q, int qlen) { ptr = malloc (plen + qlen + 1); #ifdef FAST strcpy (ptr, p); strcat (ptr, q); #else sprintf (ptr, "%s%s", p, q); #endif (void) ptr; free (ptr); } void use_with_alloca (char *p, int plen, char *q, int qlen) { ptr = alloca (plen + qlen + 1); #ifdef FAST strcpy (ptr, p); strcat (ptr, q); #else sprintf (ptr, "%s%s", p, q); #endif (void) ptr; } ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC] temporary Lisp_Strings 2014-09-02 15:13 ` Dmitry Antipov @ 2014-09-02 17:32 ` Stefan Monnier 2014-09-03 10:23 ` Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] Dmitry Antipov 0 siblings, 1 reply; 38+ messages in thread From: Stefan Monnier @ 2014-09-02 17:32 UTC (permalink / raw) To: Dmitry Antipov; +Cc: Emacs development discussions >> Have you made some preliminary measurements (on microbenchmarks) to try >> and see how much speed up we might gain? Given the cost of strlen and >> parse_str_as_multibyte, I'd expect that the best-case benefit might turn >> out to be rather small. > For the moment, I don't have an idea how to benchmark parse_str_as_multibyte > in "near-to-real-use" conditions. I was thinking of benchmarking build_string against alloca_string. Stefan ^ permalink raw reply [flat|nested] 38+ messages in thread
* Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-02 17:32 ` Stefan Monnier @ 2014-09-03 10:23 ` Dmitry Antipov 2014-09-03 13:14 ` Stefan Monnier 2014-09-03 14:39 ` Paul Eggert 0 siblings, 2 replies; 38+ messages in thread From: Dmitry Antipov @ 2014-09-03 10:23 UTC (permalink / raw) To: Emacs development discussions [-- Attachment #1: Type: text/plain, Size: 1098 bytes --] To whom it may concern, I decided to start from conses and vectors just because this is simpler than strings. My experimental stuff is attached, and typical benchmark output may be something like: Average of 100: stack allocation is 6.879408 times faster for conses Average of 1000: stack allocation is 7.390145 times faster for conses Average of 10000: stack allocation is 4.154356 times faster for conses Average of 100000: stack allocation is 2.417694 times faster for conses Average of 10: stack allocation is 11.535833 times faster for vectors Average of 100: stack allocation is 4.731400 times faster for vectors Average of 1000: stack allocation is 1.443268 times faster for vectors Note that this is sustained allocation speed; since we're unlikely to allocate large amounts of Lisp data on stack, real speedup may vary. Questions: 1) Should alloca_vector fallback to Fmake_vector for large vectors? 2) Is there a way to rewrite alloca_xxx macros to avoid statement expressions? IIUC this is not portable beyond gcc and compilers that mimics it (clang, Intel's icc). Dmitry [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: cons_vector_benchmark.patch --] [-- Type: text/x-patch; name="cons_vector_benchmark.patch", Size: 4044 bytes --] === modified file 'src/alloc.c' --- src/alloc.c 2014-08-29 07:29:47 +0000 +++ src/alloc.c 2014-09-03 09:41:26 +0000 @@ -7118,8 +7118,76 @@ file, line, msg); terminate_due_to_signal (SIGABRT, INT_MAX); } -#endif -\f + +/* Stress alloca with inconveniently-sized requests and check + whether all allocated areas may be used for Lisp_Object. */ + +static void +verify_alloca (void) +{ + int i; + enum { ALLOCA_CHECK_MAX = 256 }; + for (i = 0; i < ALLOCA_CHECK_MAX; i++) + { + char *ptr = alloca (i + 1); + eassert (valid_pointer_for_lisp_object (ptr)); + } +} + +#else /* not ENABLE_CHECKING */ + +#define verify_alloca() ((void) 0) + +#endif /* ENABLE_CHECKING */ + +DEFUN ("cons-benchmark", Fcons_benchmark, Scons_benchmark, 1, 1, 0, + doc: /* Benchmark cons allocation. */) + (Lisp_Object obj) +{ + double x, y; + ptrdiff_t i, max; + struct timespec ts; + + CHECK_NUMBER (obj); + max = XINT (obj); + + ts = current_timespec (); + for (i = 0, obj = Qnil; i < max; i++) + obj = Fcons (Qt, obj); + x = timespectod (timespec_sub (current_timespec (), ts)); + + ts = current_timespec (); + for (i = 0, obj = Qnil; i < max; i++) + obj = alloca_cons (Qt, obj); + y = timespectod (timespec_sub (current_timespec (), ts)); + + return Fcons (make_float (x), make_float (y)); +} + +DEFUN ("vector-benchmark", Fvector_benchmark, Svector_benchmark, 1, 1, 0, + doc: /* Benchmark vector allocation. */) + (Lisp_Object obj) +{ + double x, y; + ptrdiff_t i, max; + struct timespec ts; + + CHECK_NUMBER (obj); + max = XINT (obj); + + ts = current_timespec (); + for (i = 0, obj = Qnil; i < max; i++) + obj = Fmake_vector (make_number (i + 1), Qnil); + x = timespectod (timespec_sub (current_timespec (), ts)); + + ts = current_timespec (); + for (i = 0, obj = Qnil; i < max; i++) + obj = alloca_vector (i + 1, Qnil); + y = timespectod (timespec_sub (current_timespec (), ts)); + + return Fcons (make_float (x), make_float (y)); +} + /* Initialization. */ void @@ -7129,6 +7197,8 @@ purebeg = PUREBEG; pure_size = PURESIZE; + verify_alloca (); + #if GC_MARK_STACK || defined GC_MALLOC_CHECK mem_init (); Vdead = make_pure_string ("DEAD", 4, 4, 0); @@ -7284,6 +7354,9 @@ #if GC_MARK_STACK == GC_USE_GCPROS_CHECK_ZOMBIES defsubr (&Sgc_status); #endif + + defsubr (&Scons_benchmark); + defsubr (&Svector_benchmark); } /* When compiled with GCC, GDB might say "No enum type named === modified file 'src/lisp.h' --- src/lisp.h 2014-09-02 18:05:00 +0000 +++ src/lisp.h 2014-09-03 09:37:46 +0000 @@ -4533,6 +4533,37 @@ memory_full (SIZE_MAX); \ } while (false) +/* True if Lisp_Object may be placed at P. Used only + under ENABLE_CHECKING and optimized away otherwise. */ + +INLINE bool +valid_pointer_for_lisp_object (void *p) +{ + uintptr_t v = (uintptr_t) p; + return !(USE_LSB_TAG ? (v & ~VALMASK) : v >> VALBITS); +} + +/* Allocate Lisp_Cons on stack. */ + +#define alloca_cons(head, tail) \ + ({ struct Lisp_Cons *_c = alloca (sizeof *_c); \ + eassert (valid_pointer_for_lisp_object (_c)); \ + _c->car = (head), _c->u.cdr = (tail); \ + make_lisp_ptr (_c, Lisp_Cons); }) + +/* Allocate Lisp_Vector on stack, with respect to MAX_ALLOCA limit. */ + +#define alloca_vector(slots, init) \ + ({ struct Lisp_Vector *_v; \ + ptrdiff_t _i, _n = header_size + (slots) * word_size; \ + eassert (_n <= MAX_ALLOCA); \ + _v = alloca (_n); \ + eassert (valid_pointer_for_lisp_object (_v)); \ + _v->header.size = (slots); \ + for (_i = 0; _i < _v->header.size; _i++) \ + _v->contents[_i] = init; \ + make_lisp_ptr (_v, Lisp_Vectorlike); }) + /* Loop over all tails of a list, checking for cycles. FIXME: Make tortoise and n internal declarations. FIXME: Unroll the loop body so we don't need `n'. */ [-- Attachment #3: cons-vector-benchmark.el --] [-- Type: text/x-emacs-lisp, Size: 670 bytes --] (defun cons-benchmark-test () (interactive) (dolist (size '(100 1000 10000 100000)) (let ((avg 0.0)) (dotimes (i 1000) (let ((data (cons-benchmark size))) (setq avg (+ avg (/ (car data) (cdr data)))))) (message "Average of %d: stack allocation is %f times faster for conses" size (/ avg 1000.0))))) (defun vector-benchmark-test () (interactive) (dolist (size '(10 100 1000)) (let ((avg 0.0)) (dotimes (i 1000) (let ((data (vector-benchmark size))) (setq avg (+ avg (/ (car data) (cdr data)))))) (message "Average of %d: stack allocation is %f times faster for vectors" size (/ avg 1000.0))))) ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-03 10:23 ` Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] Dmitry Antipov @ 2014-09-03 13:14 ` Stefan Monnier 2014-09-03 14:39 ` Paul Eggert 1 sibling, 0 replies; 38+ messages in thread From: Stefan Monnier @ 2014-09-03 13:14 UTC (permalink / raw) To: Dmitry Antipov; +Cc: Emacs development discussions > 2) Is there a way to rewrite alloca_xxx macros to avoid statement > expressions? I guess the way to do it would be to replace obj = alloca_vector (i + 1, Qnil); with alloca_vector (obj, i + 1, Qnil); -- Stefan ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-03 10:23 ` Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] Dmitry Antipov 2014-09-03 13:14 ` Stefan Monnier @ 2014-09-03 14:39 ` Paul Eggert 2014-09-03 15:39 ` Dmitry Antipov 1 sibling, 1 reply; 38+ messages in thread From: Paul Eggert @ 2014-09-03 14:39 UTC (permalink / raw) To: Dmitry Antipov, Emacs development discussions Dmitry Antipov wrote: > 1) Should alloca_vector fallback to Fmake_vector for large vectors? Of course. It should be like SAFE_NALLOCA. > 2) Is there a way to rewrite alloca_xxx macros to avoid statement > expressions? IIUC this is not portable beyond gcc and compilers > that mimics it (clang, Intel's icc). Some thoughts: 1. It's OK to use statement expressions on compilers that support them, and fall back on slower technology on those that don't. 2. That being said, the GNU alloca documentation says "Do not use 'alloca' inside the arguments of a function call", i.e., that code must not do foo (alloca (n)), but instead must do (p = alloca (n), foo (p)). So it's not clear that trying to do these macros as expressions will work. 3. It's often better (i.e., faster and/or less memory-intensive) to use a block-scope object, which is reclaimed on block exit rather than function exit. This suggests that we should have two forms of cons stack allocation, one block-scope and one function-scope, depending on the lifetime that the caller wants. For a block-scope cons we can use a C99-style compound literal, which (unlike alloca) would be safe as an expression. (We're assuming C99 in the trunk now.) For a block-scope vector, if __STDC_NO_VLA__ is not defined we can declare a variable-length array, otherwise we can fall back on alloca and/or malloc. 4. Looking through the Emacs code now, I see that it's not disciplined about using SAFE_ALLOCA/SAFE_NALLOCA/etc. for unbounded arrays on the stack. I'll look into cleaning this up. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-03 14:39 ` Paul Eggert @ 2014-09-03 15:39 ` Dmitry Antipov 2014-09-03 16:42 ` Paul Eggert 0 siblings, 1 reply; 38+ messages in thread From: Dmitry Antipov @ 2014-09-03 15:39 UTC (permalink / raw) To: Paul Eggert; +Cc: Emacs development discussions On 09/03/2014 06:39 PM, Paul Eggert wrote: > 3. It's often better (i.e., faster and/or less memory-intensive) to use a block-scope object, > which is reclaimed on block exit rather than function exit. This suggests that we should have > two forms of cons stack allocation, one block-scope and one function-scope, depending on the > lifetime that the caller wants. For a block-scope cons we can use a C99-style compound literal, > which (unlike alloca) would be safe as an expression. (We're assuming C99 in the trunk now.) OK. So, for the simplest object (cons) we can have: 1) Function-scoped version using alloca (assuming statement expressions): #define alloca_cons(head, tail) \ ({ struct Lisp_Cons *_c = alloca (sizeof *_c); \ eassert (pointer_valid_for_lisp_object (_c)); \ _c->car = (head), _c->u.cdr = (tail); \ make_lisp_ptr (_c, Lisp_Cons); }) 2) Block-scoped version using implicit stack allocation (assuming statement expressions): #define scoped_cons(head, tail) \ ({ struct Lisp_Cons alignas (GCALIGNMENT) _c; \ _c.car = (head), _c.u.cdr = (tail); \ make_lisp_ptr (&_c, Lisp_Cons); }) 3) Block-scoped version assuming no statement expression but compound literals: #define scoped_cons_2(head, tail) \ make_lisp_ptr (&((struct Lisp_Cons) { head, { tail } }), Lisp_Cons) If we have 2), why we need 1) at all? 2) in a top-level function scope is an equivalent to 1), isn't it? In 3), how we can be sure that Lisp_Cons is properly aligned on stack? Dmitry ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-03 15:39 ` Dmitry Antipov @ 2014-09-03 16:42 ` Paul Eggert 2014-09-03 17:47 ` Stefan Monnier 2014-09-04 4:59 ` Dmitry Antipov 0 siblings, 2 replies; 38+ messages in thread From: Paul Eggert @ 2014-09-03 16:42 UTC (permalink / raw) To: Dmitry Antipov; +Cc: Emacs development discussions Dmitry Antipov wrote: > 1) Function-scoped version using alloca (assuming statement expressions): > ... > 2) Block-scoped version using implicit stack allocation (assuming > statement expressions): > ... > 3) Block-scoped version assuming no statement expression but compound > literals: > ... > If we have 2), why we need 1) at all? 2) in a top-level function scope > is an equivalent to 1), isn't it? Correct. We'd need (1) e.g. to build up a list in a loop, and have the list survive until function exit. But on further thought this is probably a dangerous feature, since it'll be too tempting to write unbounded loops. So let's not do (1). > In 3), how we can be sure that Lisp_Cons is properly aligned on stack? For GCC, we can define struct Lisp_Cons via 'struct __attribute__ ((aligned (GCALIGNMENT))) Lisp_Cons { ... };'. For compilers that don't support this syntax we can align the struct by hand, by using a character-array compound literal that's a bit too large, aligning the resulting pointer by hand, and then using the aligned pointer. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-03 16:42 ` Paul Eggert @ 2014-09-03 17:47 ` Stefan Monnier 2014-09-03 18:00 ` Paul Eggert 2014-09-04 4:59 ` Dmitry Antipov 1 sibling, 1 reply; 38+ messages in thread From: Stefan Monnier @ 2014-09-03 17:47 UTC (permalink / raw) To: Paul Eggert; +Cc: Dmitry Antipov, Emacs development discussions > Correct. We'd need (1) e.g. to build up a list in a loop, and have the list > survive until function exit. But on further thought this is > probably a dangerous feature, since it'll be too tempting to write unbounded > loops. So let's not do (1). Is (2) actually valid? I mean, are we allowed to refer (via a reference) to a variable that's in a block we already exited? Stefan ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-03 17:47 ` Stefan Monnier @ 2014-09-03 18:00 ` Paul Eggert 0 siblings, 0 replies; 38+ messages in thread From: Paul Eggert @ 2014-09-03 18:00 UTC (permalink / raw) To: Stefan Monnier; +Cc: Dmitry Antipov, Emacs development discussions Stefan Monnier wrote: > Is (2) actually valid? I mean, are we allowed to refer (via > a reference) to a variable that's in a block we already exited? You're right of course: (2) won't work. Thanks for catching that. (3) is better these days anyway, I expect, as it relies only on C99 features and we're already assuming C99 in the trunk. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-03 16:42 ` Paul Eggert 2014-09-03 17:47 ` Stefan Monnier @ 2014-09-04 4:59 ` Dmitry Antipov 2014-09-04 5:13 ` Paul Eggert 1 sibling, 1 reply; 38+ messages in thread From: Dmitry Antipov @ 2014-09-04 4:59 UTC (permalink / raw) To: Paul Eggert; +Cc: Emacs development discussions On 09/03/2014 08:42 PM, Paul Eggert wrote: > For GCC, we can define struct Lisp_Cons via 'struct __attribute__ ((aligned (GCALIGNMENT))) Lisp_Cons { ... };'. > For compilers that don't support this syntax we can align the struct by hand, by using a character-array compound > literal that's a bit too large, aligning the resulting pointer by hand, and then using the aligned pointer. Yes, that seems to work: Lisp_Object obj; struct Lisp_Cons *c = ((struct Lisp_Cons *) (((uintptr_t) (char [2 * sizeof (struct Lisp_Cons) - 1]) {} + (GCALIGNMENT - 1)) & ~(GCALIGNMENT - 1))); c->car = Qnil; c->u.cdr = Qnil; XSETCONS (obj, c); But I don't see how to fold this snippet into a macro which can be used as an rvalue, just like: Lisp_Object obj = scoped_cons (Qnil, Qnil); Dmitry ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-04 4:59 ` Dmitry Antipov @ 2014-09-04 5:13 ` Paul Eggert 2014-09-04 5:51 ` Dmitry Antipov 0 siblings, 1 reply; 38+ messages in thread From: Paul Eggert @ 2014-09-04 5:13 UTC (permalink / raw) To: Dmitry Antipov; +Cc: Emacs development discussions Dmitry Antipov wrote: > I don't see how to fold this snippet into a macro which can be used as > an rvalue, just like: How about something like this? The cool thing here is that GCC optimizes 'foo' away to a function that simply returns 0. typedef unsigned long uintptr_t; typedef uintptr_t Lisp_Object; struct __attribute__ ((aligned (8))) Lisp_Cons { Lisp_Object car; Lisp_Object cdr; }; #define BCONS(a, b) ((Lisp_Object) &(struct Lisp_Cons) { a, b } + 3) #define UNTAG(x) ((struct Lisp_Cons *) (uintptr_t) ((x) - 3)) int foo (void) { Lisp_Object a = BCONS (0, 0); Lisp_Object b = BCONS (0, a); return UNTAG (UNTAG (b) -> cdr) -> car; } ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-04 5:13 ` Paul Eggert @ 2014-09-04 5:51 ` Dmitry Antipov 2014-09-04 6:45 ` Paul Eggert 2014-09-04 13:11 ` Stefan Monnier 0 siblings, 2 replies; 38+ messages in thread From: Dmitry Antipov @ 2014-09-04 5:51 UTC (permalink / raw) To: Paul Eggert; +Cc: Emacs development discussions On 09/04/2014 09:13 AM, Paul Eggert wrote: > How about something like this? No, I'm talking about the case when we can't us __attribute__ ((aligned (X))). Because if we can, there is no problem at all: #define _GC_ALIGNED_ __attribute__ ((aligned (GCALIGNMENT))) struct _GC_ALIGNED_ Lisp_Cons { ...; }; #define scoped_cons(car, cdr) \ make_lisp_ptr (&((struct Lisp_Cons) { car, { cdr } }), Lisp_Cons) Dmitry ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-04 5:51 ` Dmitry Antipov @ 2014-09-04 6:45 ` Paul Eggert 2014-09-04 13:11 ` Stefan Monnier 1 sibling, 0 replies; 38+ messages in thread From: Paul Eggert @ 2014-09-04 6:45 UTC (permalink / raw) To: Dmitry Antipov; +Cc: Emacs development discussions Dmitry Antipov wrote: > I'm talking about the case when we can't us __attribute__ ((aligned (X))). How about something like this? typedef unsigned long uintptr_t; typedef uintptr_t Lisp_Object; struct Lisp_Cons { Lisp_Object car; Lisp_Object cdr; }; static Lisp_Object cons_init (void *result, Lisp_Object a, Lisp_Object b) { struct Lisp_Cons *p = (struct Lisp_Cons *) (((uintptr_t) result + 7) & -8); p->car = a; p->cdr = b; return (Lisp_Object) p + 3; } #define BCONS(a, b) cons_init ((char[sizeof (struct Lisp_Cons) + 7]){}, a, b) #define untag(x) ((struct Lisp_Cons *) (uintptr_t) ((x) - 3)) int foo (void) { Lisp_Object a = BCONS (0, 0); Lisp_Object b = BCONS (0, a); return untag (untag(b) -> cdr) -> car; } ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-04 5:51 ` Dmitry Antipov 2014-09-04 6:45 ` Paul Eggert @ 2014-09-04 13:11 ` Stefan Monnier 2014-09-04 13:37 ` Dmitry Antipov 1 sibling, 1 reply; 38+ messages in thread From: Stefan Monnier @ 2014-09-04 13:11 UTC (permalink / raw) To: Dmitry Antipov; +Cc: Paul Eggert, Emacs development discussions > No, I'm talking about the case when we can't us __attribute__ ((aligned (X))). Let's not worry about this case, we can/should punt to Fcons for them. Stefan ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-04 13:11 ` Stefan Monnier @ 2014-09-04 13:37 ` Dmitry Antipov 2014-09-04 14:46 ` Dmitry Antipov 0 siblings, 1 reply; 38+ messages in thread From: Dmitry Antipov @ 2014-09-04 13:37 UTC (permalink / raw) To: Stefan Monnier; +Cc: Paul Eggert, Emacs development discussions On 09/04/2014 05:11 PM, Stefan Monnier wrote: > Let's not worry about this case, we can/should punt to Fcons for them. Thanks to Paul, there is a C99-compilant solution: #if (__GNUC__ || __HP_cc || __HP_aCC || __IBMC__ \ || __IBMCPP__ || __ICC || 0x5110 <= __SUNPRO_C) /* have __attribute__ ((aligned (GCALIGNMENT))) for conses */ #define scoped_cons(car, cdr) \ make_lisp_ptr (&((struct Lisp_Cons) { car, { cdr } }), Lisp_Cons) #else /* another compiler, need an explicit alignment */ INLINE Lisp_Object scoped_cons_init (void *ptr, Lisp_Object x, Lisp_Object y) { struct Lisp_Cons *c = (struct Lisp_Cons *) (((uintptr_t) ptr + (GCALIGNMENT - 1)) & ~(GCALIGNMENT - 1)); c->car = x; c->u.cdr = y; return make_lisp_ptr (c, Lisp_Cons); } #define scoped_cons(car, cdr) \ scoped_cons_init ((char[sizeof (struct Lisp_Cons) \ + (GCALIGNMENT - 1)]) {}, (car), (cdr)) #endif /* compiler selection */ ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-04 13:37 ` Dmitry Antipov @ 2014-09-04 14:46 ` Dmitry Antipov 2014-09-04 16:03 ` Paul Eggert 0 siblings, 1 reply; 38+ messages in thread From: Dmitry Antipov @ 2014-09-04 14:46 UTC (permalink / raw) To: Paul Eggert; +Cc: Stefan Monnier, Emacs development discussions On 09/04/2014 05:37 PM, Dmitry Antipov wrote: > Thanks to Paul, there is a C99-compilant solution: As for the vectors, I don't know how to use VLAs in the way similar to alloca in the following attempt: INLINE Lisp_Object local_vector_init (uintptr_t addr, ptrdiff_t length, Lisp_Object init) { ptrdiff_t i; struct Lisp_Vector *v = (struct Lisp_Vector *) addr; v->header.size = length; for (i = 0; i < length; i++) v->contents[i] = init; return make_lisp_ptr (v, Lisp_Vectorlike); } #define local_vector(obj, length, init) \ (MAX_ALLOCA < (length) * word_size + header_size \ ? Fmake_vector (make_number (length), (init)) \ : (obj = XIL ((uintptr_t) alloca ((length) * word_size + header_size)), \ local_vector_init ((uintptr_t) XLI (obj), (length), (init)))) Users are expected to call: Lisp_Object obj = local_vector (obj, 10, Qnil); Somewhat similar may be implemented for strings. Dmitry ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-04 14:46 ` Dmitry Antipov @ 2014-09-04 16:03 ` Paul Eggert 2014-09-05 4:00 ` Dmitry Antipov 0 siblings, 1 reply; 38+ messages in thread From: Paul Eggert @ 2014-09-04 16:03 UTC (permalink / raw) To: Dmitry Antipov; +Cc: Stefan Monnier, Emacs development discussions [-- Attachment #1: Type: text/plain, Size: 267 bytes --] Dmitry Antipov wrote: > I don't know how to use VLAs in the way similar to alloca For vectors the macro needs to generate a series of declarations and statements rather than an expression. It's less clean, but it's good enough. Something like the attached, say. [-- Attachment #2: safe-vla.c --] [-- Type: text/x-csrc, Size: 1609 bytes --] /* Placeholders for what's in lisp.h already. */ #include <stddef.h> #include <stdbool.h> #include <inttypes.h> typedef long Lisp_Object; enum { word_size = sizeof (Lisp_Object) }; enum { MAX_ALLOCA = 16 * 1024 }; #define USE_SAFE_ALLOCA bool sa_must_free = false #define SAFE_FREE() (sa_must_free ? magic_freer () : (void) 0) #define min(a, b) ((a) < (b) ? (a) : (b)) extern void *xnmalloc (size_t, size_t); extern Lisp_Object make_save_memory (Lisp_Object *, ptrdiff_t); extern void magic_freer (void); extern void record_unwind_protect (void (*) (Lisp_Object), Lisp_Object); extern void free_save_value (Lisp_Object); extern _Noreturn void memory_full (size_t); /* The new macro. */ #ifdef __STDC_NO_VLA__ # define SAFE_LISP_ARRAY(name, elems) \ Lisp_Object *name; \ SAFE_ALLOCA_LISP (name, elems) #else # define SAFE_LISP_ARRAY(name, elems) \ ptrdiff_t name##n = elems; \ bool name##issmall = name##n <= MAX_ALLOCA / word_size; \ Lisp_Object name##vec[name##issmall && name##n ? name##n : 1]; \ Lisp_Object *name; \ if (name##issmall) \ name = name##vec; \ else \ { \ name = xnmalloc (name##n, word_size); \ record_unwind_protect (free_save_value, \ make_save_memory (name, name##n)); \ sa_must_free = true; \ } #endif /* Example use. */ Lisp_Object foo (ptrdiff_t n) { USE_SAFE_ALLOCA; SAFE_LISP_ARRAY (foo, n); for (ptrdiff_t i = 0; i < n; i++) foo[i] = i; Lisp_Object x = 0; for (ptrdiff_t i = 0; i < n; i++) x ^= foo[0]; SAFE_FREE (); return x; } ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-04 16:03 ` Paul Eggert @ 2014-09-05 4:00 ` Dmitry Antipov 2014-09-05 4:24 ` Stefan Monnier 2014-09-05 7:15 ` Paul Eggert 0 siblings, 2 replies; 38+ messages in thread From: Dmitry Antipov @ 2014-09-05 4:00 UTC (permalink / raw) To: Paul Eggert; +Cc: Stefan Monnier, Emacs development discussions [-- Attachment #1: Type: text/plain, Size: 803 bytes --] On 09/04/2014 08:03 PM, Paul Eggert wrote: > For vectors the macro needs to generate a series of declarations and > statements rather than an expression. It's less clean, but it's good > enough. Something like the attached, say. IMO this is a bit overengineered. In particular, the whole thing is to allocate short-lived objects which are usually small. This means that !issmall branch is unlikely to be taken but requires SAFE_xxx anyway. Moreover, I think we need a very special benchmark to see a difference between alloca and VLA (if any), thus using the latter at any cost doesn't worth an extra complexity. So, although I have no strong objections about your version, I'm voting for simpler and cleaner version with alloca/fallback to regular GC for vectors and strings. Stefan? Dmitry [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: local_objects.patch --] [-- Type: text/x-patch; name="local_objects.patch", Size: 6654 bytes --] === modified file 'src/alloc.c' --- src/alloc.c 2014-08-29 07:29:47 +0000 +++ src/alloc.c 2014-09-05 02:43:56 +0000 @@ -7118,8 +7118,29 @@ file, line, msg); terminate_due_to_signal (SIGABRT, INT_MAX); } -#endif -\f + +/* Stress alloca with inconveniently sized requests and check + whether all allocated areas may be used for Lisp_Object. */ + +NO_INLINE static void +verify_alloca (void) +{ + int i; + enum { ALLOCA_CHECK_MAX = 256 }; + /* Start from size of the smallest Lisp object. */ + for (i = sizeof (struct Lisp_Cons); i <= ALLOCA_CHECK_MAX; i++) + { + char *ptr = alloca (i); + eassert (pointer_valid_for_lisp_object (ptr)); + } +} + +#else /* not ENABLE_CHECKING */ + +#define verify_alloca() ((void) 0) + +#endif /* ENABLE_CHECKING */ + /* Initialization. */ void @@ -7129,6 +7150,8 @@ purebeg = PUREBEG; pure_size = PURESIZE; + verify_alloca (); + #if GC_MARK_STACK || defined GC_MALLOC_CHECK mem_init (); Vdead = make_pure_string ("DEAD", 4, 4, 0); === modified file 'src/character.h' --- src/character.h 2014-07-08 07:17:04 +0000 +++ src/character.h 2014-09-04 16:18:48 +0000 @@ -644,8 +644,6 @@ const unsigned char **, int *); extern int translate_char (Lisp_Object, int c); -extern void parse_str_as_multibyte (const unsigned char *, - ptrdiff_t, ptrdiff_t *, ptrdiff_t *); extern ptrdiff_t count_size_as_multibyte (const unsigned char *, ptrdiff_t); extern ptrdiff_t str_as_multibyte (unsigned char *, ptrdiff_t, ptrdiff_t, ptrdiff_t *); === modified file 'src/lisp.h' --- src/lisp.h 2014-09-02 18:05:00 +0000 +++ src/lisp.h 2014-09-05 02:45:18 +0000 @@ -298,6 +298,13 @@ # endif #endif +/* Stolen from gnulib. */ +#if (__GNUC__ || __HP_cc || __HP_aCC || __IBMC__ \ + || __IBMCPP__ || __ICC || 0x5110 <= __SUNPRO_C) +#define GCALIGNED __attribute__ ((aligned (GCALIGNMENT))) +#else +#define GCALIGNED /* empty */ +#endif /* Some operations are so commonly executed that they are implemented as macros, not functions, because otherwise runtime performance would @@ -1016,7 +1023,7 @@ typedef struct interval *INTERVAL; -struct Lisp_Cons +struct GCALIGNED Lisp_Cons { /* Car of this cons cell. */ Lisp_Object car; @@ -3622,6 +3629,10 @@ /* Defined in vm-limit.c. */ extern void memory_warnings (void *, void (*warnfun) (const char *)); +/* Defined in character.c. */ +extern void parse_str_as_multibyte (const unsigned char *, ptrdiff_t, + ptrdiff_t *, ptrdiff_t *); + /* Defined in alloc.c. */ extern void check_pure_size (void); extern void free_misc (Lisp_Object); @@ -4533,6 +4544,111 @@ memory_full (SIZE_MAX); \ } while (false) +/* Use the following functions to allocate temporary (function- + or block-scoped) conses, vectors, and strings. These objects + are not managed by GC, and passing them out of their scope + causes an immediate crash in GC. */ + +#if (__GNUC__ || __HP_cc || __HP_aCC || __IBMC__ \ + || __IBMCPP__ || __ICC || 0x5110 <= __SUNPRO_C) + +/* Allocate temporary block-scoped cons. This version assumes + that stack-allocated Lisp_Cons is always aligned properly. */ + +#define scoped_cons(car, cdr) \ + make_lisp_ptr (&((struct Lisp_Cons) { car, { cdr } }), Lisp_Cons) + +#else /* not __GNUC__ etc... */ + +/* Helper function for an alternate scoped cons, see below. */ + +INLINE Lisp_Object +scoped_cons_init (void *ptr, Lisp_Object x, Lisp_Object y) +{ + struct Lisp_Cons *c = (struct Lisp_Cons *) + (((uintptr_t) ptr + (GCALIGNMENT - 1)) & ~(GCALIGNMENT - 1)); + c->car = x; + c->u.cdr = y; + return make_lisp_ptr (c, Lisp_Cons); +} + +/* This version uses explicit alignment. */ + +#define scoped_cons(car, cdr) \ + scoped_cons_init ((char[sizeof (struct Lisp_Cons) \ + + (GCALIGNMENT - 1)]) {}, (car), (cdr)) + +#endif /* __GNUC__ etc... */ + +/* True if Lisp_Object may be placed at P. Used only + under ENABLE_CHECKING and optimized away otherwise. */ + +INLINE bool +pointer_valid_for_lisp_object (void *p) +{ + uintptr_t v = (uintptr_t) p; + return !(USE_LSB_TAG ? (v & ~VALMASK) : v >> VALBITS); +} + +/* Helper function for build_local_vector, see below. */ + +INLINE Lisp_Object +local_vector_init (uintptr_t addr, ptrdiff_t length, Lisp_Object init) +{ + ptrdiff_t i; + struct Lisp_Vector *v = (struct Lisp_Vector *) addr; + + eassert (pointer_valid_for_lisp_object (v)); + v->header.size = length; + for (i = 0; i < length; i++) + v->contents[i] = init; + return make_lisp_ptr (v, Lisp_Vectorlike); +} + +/* If size permits, create temporary function-scoped vector OBJ of + length SIZE, with each element being INIT. Otherwise create + regular GC-managed vector. */ + +#define build_local_vector(obj, size, init) \ + (MAX_ALLOCA < (size) * word_size + header_size \ + ? obj = Fmake_vector (make_number (size), (init)) \ + : (obj = XIL ((uintptr_t) alloca \ + ((size) * word_size + header_size)), \ + obj = local_vector_init ((uintptr_t) XLI (obj), (size), (init)))) + +/* Helper function for build_local_string, see below. */ + +INLINE Lisp_Object +local_string_init (uintptr_t addr, const char *data, ptrdiff_t size) +{ + ptrdiff_t nchars, nbytes; + struct Lisp_String *s = (struct Lisp_String *) addr; + + eassert (pointer_valid_for_lisp_object (s)); + parse_str_as_multibyte ((const unsigned char *) data, + size, &nchars, &nbytes); + s->data = (unsigned char *) (addr + sizeof *s); + s->intervals = NULL; + memcpy (s->data, data, size); + s->data[size] = '\0'; + if (size == nchars || size != nbytes) + s->size = size, s->size_byte = -1; + else + s->size = nchars, s->size_byte = nbytes; + return make_lisp_ptr (s, Lisp_String); +} + +/* If size permits, create temporary function-scoped string OBJ + with contents DATA of length NBYTES. Otherwise create regular + GC-managed string. */ + +#define build_local_string(obj, data, nbytes) \ + (MAX_ALLOCA < (nbytes) + sizeof (struct Lisp_String) \ + ? obj = make_string ((data), (nbytes)) \ + : (obj = XIL ((uintptr_t) alloca \ + ((nbytes) + sizeof (struct Lisp_String))), \ + obj = local_string_init ((uintptr_t) XLI (obj), data, nbytes))) + /* Loop over all tails of a list, checking for cycles. FIXME: Make tortoise and n internal declarations. FIXME: Unroll the loop body so we don't need `n'. */ ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-05 4:00 ` Dmitry Antipov @ 2014-09-05 4:24 ` Stefan Monnier 2014-09-05 9:28 ` Dmitry Antipov 2014-09-05 7:15 ` Paul Eggert 1 sibling, 1 reply; 38+ messages in thread From: Stefan Monnier @ 2014-09-05 4:24 UTC (permalink / raw) To: Dmitry Antipov; +Cc: Paul Eggert, Emacs development discussions > alloca/fallback to regular GC for vectors and strings. Stefan? I'm definitely in favor of a simple solution. So far I haven't seen compelling evidence that such alloca tricks are even worthwhile at all. Your benchmark for `cons' showed that there's some potential benefit, but then we have to figure out which Fcons calls can be replaced by alloca ones, and then assess whether the result is worth the effort (and the risk, since such alloca-allocated thingies need to be handled with care, making sure they can't escape to Elisp, and also using the stack more intensively increases the risk of crashing into the stack depth limit). Stefan ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-05 4:24 ` Stefan Monnier @ 2014-09-05 9:28 ` Dmitry Antipov 0 siblings, 0 replies; 38+ messages in thread From: Dmitry Antipov @ 2014-09-05 9:28 UTC (permalink / raw) To: Stefan Monnier; +Cc: Paul Eggert, Emacs development discussions On 09/05/2014 08:24 AM, Stefan Monnier wrote: > Your benchmark for `cons' showed that there's some potential benefit, > but then we have to figure out which Fcons calls can be replaced by > alloca ones, and then assess whether the result is worth the effort > (and the risk, since such alloca-allocated thingies need to be handled > with care, making sure they can't escape to Elisp, and also using the > stack more intensively increases the risk of crashing into the > stack depth limit). Surely this is a bit risky and requires careful looking how this object is used. On the other side, it's not too hard to debug; use of local object outside of its scope causes the very likely eassert/crash at next GC, and the fault address is very different from what we see with heap objects. Dmitry ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-05 4:00 ` Dmitry Antipov 2014-09-05 4:24 ` Stefan Monnier @ 2014-09-05 7:15 ` Paul Eggert 2014-09-05 9:16 ` Dmitry Antipov 2014-09-05 15:35 ` Stefan Monnier 1 sibling, 2 replies; 38+ messages in thread From: Paul Eggert @ 2014-09-05 7:15 UTC (permalink / raw) To: Dmitry Antipov; +Cc: Stefan Monnier, Emacs development discussions Dmitry Antipov wrote: > I'm voting for simpler and cleaner version with > alloca/fallback to regular GC for vectors and strings. I like the idea of simpler and cleaner, but your patch's implementations of build_local_vector and build_local_string use alloca expressions as function arguments, and the GNU API for alloca doesn't support that. You can work around the problem by using statement expressions, falling back on plain Fmake_vector and/or make_string for compilers that don't support statement expressions. To address Stefan's concern about alloca-allocated thingies escaping to Elisp, is there some way you can include a runtime test for that, when checking is enabled? It wouldn't have to catch every escapee, just a high percentage of them. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-05 7:15 ` Paul Eggert @ 2014-09-05 9:16 ` Dmitry Antipov 2014-09-05 14:35 ` Paul Eggert 2014-09-05 15:35 ` Stefan Monnier 1 sibling, 1 reply; 38+ messages in thread From: Dmitry Antipov @ 2014-09-05 9:16 UTC (permalink / raw) To: Paul Eggert; +Cc: Stefan Monnier, Emacs development discussions On 09/05/2014 11:15 AM, Paul Eggert wrote: > I like the idea of simpler and cleaner, but your patch's implementations of > build_local_vector and build_local_string use alloca expressions as function > arguments I don't do that - I'm using 1st arg (result) as a temporary variable, assuming that Lisp_Object is always wide enough to hold an address: #define build_local_vector(obj, size, init) \ (MAX_ALLOCA < (size) * word_size + header_size \ ? obj = Fmake_vector (make_number (size), (init)) \ : (obj = XIL ((uintptr_t) alloca \ here ((size) * word_size + header_size)), \ obj = local_vector_init ((uintptr_t) XLI (obj), (size), (init)))) IIUC the compiler should know about alloca limitations and so should not try to (mis)optimize the code above to: #define build_local_vector(obj, size, init) \ (MAX_ALLOCA < (size) * word_size + header_size \ ? obj = Fmake_vector (make_number (size), (init)) \ obj = local_vector_init \ ((uintptr_t) alloca ((size) * word_size + header_size), \ (size), (init))) Dmitry ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-05 9:16 ` Dmitry Antipov @ 2014-09-05 14:35 ` Paul Eggert 0 siblings, 0 replies; 38+ messages in thread From: Paul Eggert @ 2014-09-05 14:35 UTC (permalink / raw) To: Dmitry Antipov; +Cc: Emacs development discussions Dmitry Antipov wrote: > On 09/05/2014 11:15 AM, Paul Eggert wrote: >> implementations of >> build_local_vector and build_local_string use alloca expressions as >> function >> arguments > > I don't do that I'm afraid you do, as XIL is a function. This isn't a serious problem, though, as it can be worked around with a statement expression, or (now that I think of it) by using lisp_h_XIL instead of XIL. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-05 7:15 ` Paul Eggert 2014-09-05 9:16 ` Dmitry Antipov @ 2014-09-05 15:35 ` Stefan Monnier 2014-09-08 10:33 ` Dmitry Antipov 1 sibling, 1 reply; 38+ messages in thread From: Stefan Monnier @ 2014-09-05 15:35 UTC (permalink / raw) To: Paul Eggert; +Cc: Dmitry Antipov, Emacs development discussions > To address Stefan's concern about alloca-allocated thingies escaping to > Elisp, is there some way you can include a runtime test for that, when > checking is enabled? It wouldn't have to catch every escapee, just a high > percentage of them. I'd expect that such escaping would only occur in "unusual setups", as a result of later unrelated changes. E.g. an alloca_string object is passed to DECODE_FILE or to Fexpand_file_name, which will usually stay within C code or even if it escapes to Elisp its lifetime will not escape the stack discipline. And some times later we get random crashes in odd situations because someone figure a clever way to do god-knows-what (e.g. speed things up via a memoization) by storing those refs into some global table. Basically, this risks making Elisp's semantics more murky, so it has to come with a good performance story. Stefan ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-05 15:35 ` Stefan Monnier @ 2014-09-08 10:33 ` Dmitry Antipov 2014-09-08 12:01 ` Dmitry Antipov 2014-09-08 12:44 ` Stefan Monnier 0 siblings, 2 replies; 38+ messages in thread From: Dmitry Antipov @ 2014-09-08 10:33 UTC (permalink / raw) To: Stefan Monnier, Paul Eggert; +Cc: Emacs development discussions On 09/05/2014 07:35 PM, Stefan Monnier wrote: > I'd expect that such escaping would only occur in "unusual setups", as > a result of later unrelated changes. E.g. an alloca_string object is > passed to DECODE_FILE or to Fexpand_file_name, which will usually stay > within C code or even if it escapes to Elisp its lifetime will not > escape the stack discipline. > > And some times later we get random crashes in odd situations because > someone figure a clever way to do god-knows-what (e.g. speed things up > via a memoization) by storing those refs into some global table. Surely such an errors are possible. But if we use alloca in general, there is always a possibility for someone to misuse pointer returned from it, no matter whether that pointer is used for Lisp_Objects or not. Any developer good enough in C should be able to find, track and fix this class of errors, so I don't see a problem here. > Basically, this risks making Elisp's semantics more murky, so it has to > come with a good performance story. In somewhat corner cases where an allocation speed is really important, we can get a very good story. For example, using build_local_vector instead of Fmake_vector in Ffind_charset_region makes this function ~18x times faster (for reatively large buffers): (defun test-charset () (interactive) (let ((start (float-time)) (max (1- (buffer-size)))) (dotimes (i max) (find-charset-region (1+ max) (+ max 2))) (message "%fs elapsed" (- (float-time) start)))) Of course, no one should detect charsets by scanning buffer in such a way, but I found this example very illustrative. And I believe that no one can design an example where using alloca introduces a 18x slowdown :-). Dmitry ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-08 10:33 ` Dmitry Antipov @ 2014-09-08 12:01 ` Dmitry Antipov 2014-09-08 13:30 ` Stefan Monnier 2014-09-08 12:44 ` Stefan Monnier 1 sibling, 1 reply; 38+ messages in thread From: Dmitry Antipov @ 2014-09-08 12:01 UTC (permalink / raw) To: Stefan Monnier, Paul Eggert; +Cc: Emacs development discussions On 09/08/2014 02:33 PM, Dmitry Antipov wrote: >> Basically, this risks making Elisp's semantics more murky, so it has to >> come with a good performance story. Here is a "real-life" example with my favorite benchmark running over xdisp.c: (defun scroll-up-benchmark () (interactive) (let ((oldgc gcs-done) (oldtime (float-time))) (condition-case nil (while t (scroll-up) (redisplay)) (error (message "GCs: %d Elapsed time: %f seconds" (- gcs-done oldgc) (- (float-time) oldtime)))))) ==> 401 GCs, ~29.9s. With two-lines change: === modified file 'src/textprop.c' --- src/textprop.c 2014-09-07 07:04:01 +0000 +++ src/textprop.c 2014-09-08 11:49:52 +0000 @@ -1319,7 +1319,8 @@ markers). If OBJECT is a string, START and END are 0-based indices into it. */) (Lisp_Object start, Lisp_Object end, Lisp_Object property, Lisp_Object value, Lisp_Object object) { - Fadd_text_properties (start, end, list2 (property, value), object); + Lisp_Object val = scoped_cons (property, scoped_cons (value, Qnil)); + Fadd_text_properties (start, end, val, object); return Qnil; } it shows 399 GCs and ~29.6s. Here we just allocate 2 cons cells (32 bytes on a 64-bit system) on stack, and can see a measurable performance improvement. I believe that carefully used stack allocation can give a much more visible improvement for a wide range of Lisp workloads. Dmitry ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-08 12:01 ` Dmitry Antipov @ 2014-09-08 13:30 ` Stefan Monnier 0 siblings, 0 replies; 38+ messages in thread From: Stefan Monnier @ 2014-09-08 13:30 UTC (permalink / raw) To: Dmitry Antipov; +Cc: Paul Eggert, Emacs development discussions > ==> 401 GCs, ~29.9s. [...] > it shows 399 GCs and ~29.6s. 1% speedup is not earth shattering, but at least this time it's a real use-case. > Here we just allocate 2 cons cells (32 bytes on a 64-bit system) on stack, > and can see a measurable performance improvement. Indeed this Fadd_text_properties business sucks. We shouldn't allocate any cons cells at all when we set a single property. Stefan ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-08 10:33 ` Dmitry Antipov 2014-09-08 12:01 ` Dmitry Antipov @ 2014-09-08 12:44 ` Stefan Monnier 2014-09-08 13:30 ` Stefan Monnier 1 sibling, 1 reply; 38+ messages in thread From: Stefan Monnier @ 2014-09-08 12:44 UTC (permalink / raw) To: Dmitry Antipov; +Cc: Paul Eggert, Emacs development discussions > Surely such an errors are possible. But if we use alloca in general, > there is always a possibility for someone to misuse pointer returned > from it, no matter whether that pointer is used for Lisp_Objects or not. There's an important difference: an object that is accessible from the Elisp world should never have such "stack-allocation" constraint, because that is a kind of semantic constraint that is common in C but which doesn't exist in C. So as long as your Lips_Objects stay within the C world, it's indeed acceptable, since the C programmer presumably knows that memory management is a big dangerous mess; but as soon as your alloca'd object gets to the Elisp world this is not acceptable any more since it introduces a new kind of problem to this Elisp world. > Any developer good enough in C should be able to find, track and fix > this class of errors, so I don't see a problem here. Agreed, as long as the object doesn't get to Elisp. The tricky part here is that Elisp can be invoked at many different points, and we tend to keep adding new such points via new hooks. So the places where alloca can be used are probably somewhat rare. > Of course, no one should detect charsets by scanning buffer in such a way, IOW, this is not a good argument in favor of using alloca there. Stefan ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] 2014-09-08 12:44 ` Stefan Monnier @ 2014-09-08 13:30 ` Stefan Monnier 0 siblings, 0 replies; 38+ messages in thread From: Stefan Monnier @ 2014-09-08 13:30 UTC (permalink / raw) To: Dmitry Antipov; +Cc: Paul Eggert, Emacs development discussions > There's an important difference: an object that is accessible from the > Elisp world should never have such "stack-allocation" constraint, > because that is a kind of semantic constraint that is common in C but > which doesn't exist in C. ^^^ Elisp -- Stefan ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC] temporary Lisp_Strings 2014-09-02 12:55 [RFC] temporary Lisp_Strings Dmitry Antipov 2014-09-02 13:21 ` Andreas Schwab 2014-09-02 14:00 ` Stefan Monnier @ 2014-09-02 14:37 ` Paul Eggert 2014-09-02 15:24 ` Dmitry Antipov 2014-09-02 15:28 ` Dmitry Antipov 2 siblings, 2 replies; 38+ messages in thread From: Paul Eggert @ 2014-09-02 14:37 UTC (permalink / raw) To: Dmitry Antipov, Emacs development discussions What would the GC do when it sees such a string? Wouldn't this make the GC a bit less robust, as it couldn't diagnose bogus strings any more when GC_CHECK_MARKED_OBJECTS is defined? Like Stefan, I wonder how much performance benefit you're really gaining here. Presumably most of it is lack of pressure on the GC, and how do you measure that? I assume you're thinking of eventually doing this for conses etc. too? Dmitry Antipov wrote: > is there a way to make sure that an address returned by alloca fits in Lisp_Object? It should work on typical platforms, as alloca should return an address that is aligned well enough for Emacs, just as malloc does. Perhaps we may run into an oddball platform where alloca isn't suitably aligned, but if so we can simply allocate a few more bytes than needed and then align the pointers ourselves. For starters, though, I'd just assume that it's aligned. An Emacs built with ENABLE_CHECKING should verify any alignment issues already, as make_lisp_ptr checks for this. We don't need to worry about !USE_LSB_TAG on the trunk anymore, as support for abusing the high-order bits of pointers has been withdrawn on the trunk. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC] temporary Lisp_Strings 2014-09-02 14:37 ` [RFC] temporary Lisp_Strings Paul Eggert @ 2014-09-02 15:24 ` Dmitry Antipov 2014-09-02 15:28 ` Dmitry Antipov 1 sibling, 0 replies; 38+ messages in thread From: Dmitry Antipov @ 2014-09-02 15:24 UTC (permalink / raw) To: Paul Eggert; +Cc: Emacs development discussions On 09/02/2014 06:37 PM, Paul Eggert wrote: > What would the GC do when it sees such a string? In theory, such a strings are out of GC's scope: - If GC_MARK_STACK == GC_MAKE_GCPROS_NOOPS, such a string is never recorded in rb-tree and so GC should not recognize it as a collectable object; - If GC_MARK_STACK == GC_USE_GCPROS_AS_BEFORE: - If you GCPRO such a string, this is fatal error; - Otherwise GC just silently ignores it. > Wouldn't this make the GC a bit less robust, as it couldn't diagnose > bogus strings any more when GC_CHECK_MARKED_OBJECTS is defined? No ideas yet, this should be investigated. Dmitry ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC] temporary Lisp_Strings 2014-09-02 14:37 ` [RFC] temporary Lisp_Strings Paul Eggert 2014-09-02 15:24 ` Dmitry Antipov @ 2014-09-02 15:28 ` Dmitry Antipov 1 sibling, 0 replies; 38+ messages in thread From: Dmitry Antipov @ 2014-09-02 15:28 UTC (permalink / raw) To: Paul Eggert; +Cc: Emacs development discussions On 09/02/2014 06:37 PM, Paul Eggert wrote: > I assume you're thinking of eventually doing this for conses etc. too? For the beginning, it may be even better because conses doesn't need (over)complicated stuff like parse_str_as_multibyte. Dmitry ^ permalink raw reply [flat|nested] 38+ messages in thread
end of thread, other threads:[~2014-09-08 13:30 UTC | newest] Thread overview: 38+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-09-02 12:55 [RFC] temporary Lisp_Strings Dmitry Antipov 2014-09-02 13:21 ` Andreas Schwab 2014-09-02 13:47 ` Dmitry Antipov 2014-09-02 14:14 ` Andreas Schwab 2014-09-02 15:00 ` Eli Zaretskii 2014-09-02 14:00 ` Stefan Monnier 2014-09-02 15:13 ` Dmitry Antipov 2014-09-02 17:32 ` Stefan Monnier 2014-09-03 10:23 ` Benchmarking temporary Lisp objects [Was: Re: [RFC] temporary Lisp_Strings] Dmitry Antipov 2014-09-03 13:14 ` Stefan Monnier 2014-09-03 14:39 ` Paul Eggert 2014-09-03 15:39 ` Dmitry Antipov 2014-09-03 16:42 ` Paul Eggert 2014-09-03 17:47 ` Stefan Monnier 2014-09-03 18:00 ` Paul Eggert 2014-09-04 4:59 ` Dmitry Antipov 2014-09-04 5:13 ` Paul Eggert 2014-09-04 5:51 ` Dmitry Antipov 2014-09-04 6:45 ` Paul Eggert 2014-09-04 13:11 ` Stefan Monnier 2014-09-04 13:37 ` Dmitry Antipov 2014-09-04 14:46 ` Dmitry Antipov 2014-09-04 16:03 ` Paul Eggert 2014-09-05 4:00 ` Dmitry Antipov 2014-09-05 4:24 ` Stefan Monnier 2014-09-05 9:28 ` Dmitry Antipov 2014-09-05 7:15 ` Paul Eggert 2014-09-05 9:16 ` Dmitry Antipov 2014-09-05 14:35 ` Paul Eggert 2014-09-05 15:35 ` Stefan Monnier 2014-09-08 10:33 ` Dmitry Antipov 2014-09-08 12:01 ` Dmitry Antipov 2014-09-08 13:30 ` Stefan Monnier 2014-09-08 12:44 ` Stefan Monnier 2014-09-08 13:30 ` Stefan Monnier 2014-09-02 14:37 ` [RFC] temporary Lisp_Strings Paul Eggert 2014-09-02 15:24 ` Dmitry Antipov 2014-09-02 15:28 ` Dmitry Antipov
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).