From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Pip Cet Newsgroups: gmane.emacs.devel Subject: Re: [PATCH] add compiled regexp primitive lisp object Date: Thu, 01 Aug 2024 01:04:10 +0000 Message-ID: <87mslxxddk.fsf@protonmail.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="30991"; mail-complaints-to="usenet@ciao.gmane.io" Cc: "emacs-devel@gnu.org" To: Danny McClanahan Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Thu Aug 01 06:52:05 2024 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1sZNnJ-0007mP-26 for ged-emacs-devel@m.gmane-mx.org; Thu, 01 Aug 2024 06:52:05 +0200 Original-Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sZNmt-0003ma-UP; Thu, 01 Aug 2024 00:51:39 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sZKEw-0004tu-1U for emacs-devel@gnu.org; Wed, 31 Jul 2024 21:04:22 -0400 Original-Received: from mail-40131.protonmail.ch ([185.70.40.131]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sZKEr-00050B-PH for emacs-devel@gnu.org; Wed, 31 Jul 2024 21:04:21 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=protonmail.com; s=protonmail3; t=1722474254; x=1722733454; bh=+HlUBRkcoOAG4QC6SZ1WxTOen9KmgqvfN2mCBQCcELc=; h=Date:To:From:Cc:Subject:Message-ID:In-Reply-To:References: Feedback-ID:From:To:Cc:Date:Subject:Reply-To:Feedback-ID: Message-ID:BIMI-Selector; b=pP+HnmagmvLVyDrZ3djmn59AHMEpYhp03Kv6MYL5uk75XtVqXJ8BenuAQiPsQa0iR nGc38c5NsJfzXo/isBnAjCJXJZYfZgiabQhx0HWoeBpzhatHJ66OcJ+qbGr1ezTKOo 48EhtQpZIqgZ5B7yt13AAmZNVLWMKQ/dfy0S9Y5sEymK/pCTfV7saQDSY5Fz4R+i/j LU40SqRKnP0eWh1nk17wgPcFNmzzy7hOy+zAQ88teCDyVvX2UzxxGwyAWf/7VDaYxm Hz0Qnexjd1vM6iW979nBTuQ4U/lLCsw+JfTiCtJ5hwg1E5PbP/H9oCG4JdyDyCh85j 7dw7x3j/esY7Q== In-Reply-To: Feedback-ID: 112775352:user:proton X-Pm-Message-ID: f332bcd7ba812a0819579389a8348f4c723039c7 Received-SPF: pass client-ip=185.70.40.131; envelope-from=pipcet@protonmail.com; helo=mail-40131.protonmail.ch X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-Mailman-Approved-At: Thu, 01 Aug 2024 00:51:34 -0400 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Xref: news.gmane.io gmane.emacs.devel:322243 Archived-At: "Danny McClanahan" writes: > This is a first attempt at a lisp-level API for explicit regexp > compilation. I have provided the entire diff inline in this email > under the impression that this will make it easier to discuss the > specifics--I do apologize if diffs above a certain size should > always be attached as patch files in the future. That's an interesting idea. I'm not sure I fully understand the ramifications, so please forgive the stupid questions. > The result of this change is that pre-compiled regexp objects constructed= by > `make-regexp' will have the lifetime of standard lisp objects, instead of > being potentially invalidated and re-compiled upon every call to `string-= match'. That part I think I understand, but what data precisely is part of those regexp objects, and why? If I change case-fold-search, do I have to recompile my regexp or will its case sensitivity stick? > - make Lisp_Regexp purecopyable and pdumpable Is the purecopy part really necessary? It may be easier not to purecopy regexp objects for now... > However, precompiled regexp lisp objects are *not* automatically coerced = to > lisp strings, so any lisp code that expects to be able to e.g. > (concat my-regexp-var "asdf") will now signal an error if my-regexp-var i= s > converted into a precompiled regexp with the new `make-regexp' constructo= r. > I had to re-run autogen.sh to avoid segfaulting upon bootstrap after modi= fying > lisp.h (re-running ./configure alone didn't work). I suspect everyone els= e is > well aware of the ramifications of editing lisp.h enums, but wanted to ma= ke > sure that was clear. That's odd. I'm not sure I saw anything in the patch that explains that beh= avior. > I have tried to extend existing idioms where obvious, and split off helpe= r > methods to improve readability. I am very open to any style improvements > as well as architectural changes. I think I can mostly provide silly questions, so that's what I'll do. However, my first one should come before the notes on the patch: why does the new PVEC store a pointer to re_pattern_buffer rather than containing re_pattern_buffer directly? Doing that would simplify the code, particularly the dumping and purecopying part. Or is there no one-to-one correspondence between pvecs and re_pattern_buffers? > diff --git a/etc/emacs_lldb.py b/etc/emacs_lldb.py > index ba80d3431f3..2a1fd238b17 100644 > --- a/etc/emacs_lldb.py > +++ b/etc/emacs_lldb.py > @@ -69,6 +69,7 @@ class Lisp_Object: > "PVEC_MODULE_FUNCTION": "struct Lisp_Module_Function", > "PVEC_NATIVE_COMP_UNIT": "struct Lisp_Native_Comp_Unit", > "PVEC_SQLITE": "struct Lisp_Sqlite", > + "PVEC_REGEXP": "struct Lisp_Regexp", > "PVEC_COMPILED": "struct Lisp_Vector", > "PVEC_CHAR_TABLE": "struct Lisp_Vector", > "PVEC_SUB_CHAR_TABLE": "void", Can you also update .gdbinit? > diff --git a/src/alloc.c b/src/alloc.c > index 48b170b866f..856393b54df 100644 > --- a/src/alloc.c > +++ b/src/alloc.c > @@ -3467,6 +3467,16 @@ cleanup_vector (struct Lisp_Vector *vector) > =09 } > } > break; > + case PVEC_REGEXP: > + { > +=09struct Lisp_Regexp *r =3D PSEUDOVEC_STRUCT (vector, Lisp_Regexp); > +=09eassert (r->buffer->allocated > 0); > +=09eassert (r->buffer->used <=3D r->buffer->allocated); > +=09xfree (r->buffer->buffer); > +=09xfree (r->buffer->fastmap); > +=09xfree (r->buffer); > + } > + break; > case PVEC_OBARRAY: > { > =09struct Lisp_Obarray *o =3D PSEUDOVEC_STRUCT (vector, Lisp_Obarray); > @@ -5881,6 +5891,36 @@ make_pure_c_string (const char *data, ptrdiff_t nc= hars) > > static Lisp_Object purecopy (Lisp_Object obj); > > +static struct re_pattern_buffer * > +make_pure_re_pattern_buffer (const struct re_pattern_buffer *bufp) Is the purecopy code actually necessary? I'm impressed you added it, but is there a compelling reason to purecopy the new regexp objects? > +{ > + struct re_pattern_buffer *pure =3D pure_alloc (sizeof *pure, -1); I don't think -1 is correct here, as the result may well be unaligned, whic= h will crash on some architectures (but not x86). > diff --git a/src/data.c b/src/data.c > index d947d200870..d5fff6af6b9 100644 > --- a/src/data.c > +++ b/src/data.c > @@ -286,11 +286,13 @@ DEFUN ("cl-type-of", Fcl_type_of, Scl_type_of, 1, 1= , 0, > =09 return Qtreesit_node; > =09case PVEC_TS_COMPILED_QUERY: > =09 return Qtreesit_compiled_query; > - case PVEC_SQLITE: > - return Qsqlite; > - case PVEC_SUB_CHAR_TABLE: > - return Qsub_char_table; > - /* "Impossible" cases. */ > +=09case PVEC_SQLITE: > +=09 return Qsqlite; > +=09case PVEC_SUB_CHAR_TABLE: > +=09 return Qsub_char_table; > +=09case PVEC_REGEXP: > +=09 return Qregexp; > +=09/* "Impossible" cases. */ > =09case PVEC_MISC_PTR: > case PVEC_OTHER: > case PVEC_FREE: ; I think you changed some whitespace here, or maybe I'm missing what changed for PVEC_SQLITE etc. > diff --git a/src/pdumper.c b/src/pdumper.c > index 53bddf91f04..ae7fafc7d1c 100644 > --- a/src/pdumper.c > +++ b/src/pdumper.c > @@ -1528,6 +1528,36 @@ dump_emacs_reloc_copy_from_dump (struct dump_conte= xt *ctx, dump_off dump_offset, > dump_off_to_lisp (size))); > } > > +static void > +dump_remember_fixup_ptr_raw (struct dump_context *ctx, > +=09=09=09 dump_off dump_offset, > +=09=09=09 dump_off new_dump_offset); > + > +/* Add a byte range relocation that copies arbitrary bytes from the dump= . > + > + When the dump is loaded, Emacs copies SIZE bytes into DUMP_OFFSET fro= m > + dump. Dump bytes are loaded from SOURCE. */ > +static void > +dump_cold_bytes (struct dump_context *ctx, dump_off dump_offset, > +=09=09 void* source, dump_off size) > +{ > + eassert (size >=3D 0); > + eassert (size < (1 << EMACS_RELOC_LENGTH_BITS)); Why is this second eassert necessary? > + > + if (!ctx->flags.dump_object_contents) > + return; > + > + if (size =3D=3D 0) > + { > + eassert (source =3D=3D NULL); I don't think this eassert makes sense, to be honest. > + return; > + } > + eassert (source !=3D NULL); > + > + dump_remember_fixup_ptr_raw (ctx, dump_offset, ctx->offset); > + dump_write (ctx, source, size); > +} > + > /* Add an Emacs relocation that sets values to arbitrary bytes. > > When the dump is loaded, Emacs copies SIZE bytes from the > @@ -2125,6 +2155,74 @@ dump_marker (struct dump_context *ctx, const struc= t Lisp_Marker *marker) > return finish_dump_pvec (ctx, &out->header); > } > > +static dump_off > +dump_re_pattern_buffer (struct dump_context *ctx, struct re_pattern_buff= er *bufp) > +{ > +#if CHECK_STRUCTS && !defined (HASH_re_pattern_buffer_36714DF24A) > +# error "re_pattern_buffer changed. See CHECK_STRUCTS comment in config.= h." > +#endif > + struct re_pattern_buffer out; > + dump_object_start (ctx, &out, sizeof (out)); > + if (bufp->buffer) > + dump_field_fixup_later (ctx, &out, bufp, &bufp->buffer); Is bufp->buffer ever NULL? > + DUMP_FIELD_COPY (&out, bufp, allocated); > + DUMP_FIELD_COPY (&out, bufp, used); > + DUMP_FIELD_COPY (&out, bufp, charset_unibyte); > + if (bufp->fastmap) > + dump_field_fixup_later (ctx, &out, bufp, &bufp->fastmap); Same for bufp->fastmap. > + dump_field_lv (ctx, &out, bufp, &bufp->translate, WEIGHT_NORMAL); > + DUMP_FIELD_COPY (&out, bufp, re_nsub); > + DUMP_FIELD_COPY (&out, bufp, can_be_null); > + DUMP_FIELD_COPY (&out, bufp, regs_allocated); > + DUMP_FIELD_COPY (&out, bufp, fastmap_accurate); > + DUMP_FIELD_COPY (&out, bufp, used_syntax); > + DUMP_FIELD_COPY (&out, bufp, multibyte); > + DUMP_FIELD_COPY (&out, bufp, target_multibyte); > + dump_off offset =3D dump_object_finish (ctx, &out, sizeof (out)); > + if (bufp->buffer) > + { > + eassert (bufp->allocated > 0); > + if (bufp->allocated > DUMP_OFF_MAX - 1) > +=09error ("regex pattern buffer too large"); > + dump_off total_size =3D ptrdiff_t_to_dump_off (bufp->allocated); I'm not sure why this is called "total" size? > @@ -2135,6 +2233,7 @@ dump_interval_node (struct dump_context *ctx, struc= t itree_node *node) > dump_object_start (ctx, &out, sizeof (out)); > if (node->parent) > dump_field_fixup_later (ctx, &out, node, &node->parent); > + /* FIXME: should these both be &node->{left,right} instead of &node->p= arent? */ > if (node->left) > dump_field_fixup_later (ctx, &out, node, &node->parent); > if (node->right) Yes, I believe those should be node->left and node->right... > @@ -3462,11 +3563,11 @@ dump_cold_string (struct dump_context *ctx, Lisp_= Object string) > error ("string too large"); > dump_off total_size =3D ptrdiff_t_to_dump_off (SBYTES (string) + 1); > eassert (total_size > 0); > - dump_remember_fixup_ptr_raw > + dump_cold_bytes > (ctx, > string_offset + dump_offsetof (struct Lisp_String, u.s.data), > - ctx->offset); > - dump_write (ctx, XSTRING (string)->u.s.data, total_size); > + XSTRING (string)->u.s.data, > + total_size); > } > > static void I think this change is unrelated, and may have a small performance impact. > @@ -3475,12 +3576,12 @@ dump_cold_charset (struct dump_context *ctx, Lisp= _Object data) > /* Dump charset lookup tables. */ > int cs_i =3D XFIXNUM (XCAR (data)); > dump_off cs_dump_offset =3D dump_off_from_lisp (XCDR (data)); > - dump_remember_fixup_ptr_raw > + struct charset *cs =3D charset_table + cs_i; > + dump_cold_bytes > (ctx, > cs_dump_offset + dump_offsetof (struct charset, code_space_mask), > - ctx->offset); > - struct charset *cs =3D charset_table + cs_i; > - dump_write (ctx, cs->code_space_mask, 256); > + cs->code_space_mask, > + 256); > } > > static void Same here, and for the dump_cold_buffer case. > static void > diff --git a/src/print.c b/src/print.c > index 8f28b14e8b6..011b02d316f 100644 > --- a/src/print.c > +++ b/src/print.c > @@ -2084,6 +2084,31 @@ print_vectorlike_unreadable (Lisp_Object obj, Lisp= _Object printcharfun, > } > return; > > + case PVEC_REGEXP: > + { > +=09struct Lisp_Regexp *r =3D XREGEXP (obj); > +=09print_c_string ("# +=09print_object (r->pattern, printcharfun, escapeflag); > +=09int i =3D sprintf (buf, " nsub=3D%ld", r->buffer->re_nsub); > +=09strout (buf, i, i, printcharfun); > +=09print_c_string (" translate=3D", printcharfun); > +=09print_object (r->buffer->translate, printcharfun, escapeflag); > +=09print_c_string (" whitespace=3D", printcharfun); > +=09print_object (r->whitespace_regexp, printcharfun, escapeflag); > +=09print_c_string (" syntax_table=3D", printcharfun); > +=09print_object (r->syntax_table, printcharfun, escapeflag); > +=09if (r->posix) > +=09 { > +=09 print_c_string (" posix=3Dtrue", printcharfun); > +=09 } > +=09else > +=09 { > +=09 print_c_string (" posix=3Dfalse", printcharfun); > +=09 } > +=09print_c_string (">", printcharfun); > + } > + return; > + > case PVEC_OBARRAY: > { > =09struct Lisp_Obarray *o =3D XOBARRAY (obj); I'm not sure we should be using the a=3Db pattern here. Maybe one day we want to read such objects and it'd be easier to do that if the output syntax were more Lisp-like (in fact, I'd convert the whole thing to a hash table and print that...) > diff --git a/src/regex-emacs.c b/src/regex-emacs.c > index 92dbdbecbf1..113a5dd9ed1 100644 > --- a/src/regex-emacs.c > +++ b/src/regex-emacs.c > @@ -32,6 +32,7 @@ > #include "character.h" > #include "buffer.h" > #include "syntax.h" > +#include "charset.h" > #include "category.h" > #include "dispextern.h" > > @@ -468,6 +469,8 @@ print_fastmap (FILE *dest, char *fastmap) > bool was_a_range =3D false; > int i =3D 0; > > + /* FIXME: unify "1 << BYTEWIDTH" and "FASTMAP_SIZE" defines (both are > + the same value =3D 256). */ > while (i < (1 << BYTEWIDTH)) > { > if (fastmap[i++]) > @@ -5353,3 +5356,71 @@ re_compile_pattern (const char *pattern, ptrdiff_t= length, > return NULL; > return re_error_msgid[ret]; > } > + > +DEFUN ("make-regexp", Fmake_regexp, Smake_regexp, 1, 3, 0, > + doc: /* Compile a regexp object from string PATTERN. > + > +POSIX is non-nil if we want full backtracking (POSIX style) for > +this pattern. > +TRANSLATE is a translation table for ignoring case, or t to look up from > +the current buffer if `case-fold-search' is on, or nil for none. */) And that look-up happens at the time make-regexp is called, not when it is used, right? That might be surprising if case-fold-search changes, for example. > + (Lisp_Object pattern, Lisp_Object posix, Lisp_Object translate) > +{ > + const char *whitespace_regexp =3D STRINGP (Vsearch_spaces_regexp) ? > + SSDATA (Vsearch_spaces_regexp) : NULL; > + char *val; > + bool is_posix =3D !NILP (posix); > + struct Lisp_Regexp *p; > + struct re_pattern_buffer *bufp =3D NULL; > + > + if (EQ(translate, Qt)) > + translate =3D (!NILP (Vcase_fold_search) > +=09=09 ? BVAR (current_buffer, case_canon_table) > +=09=09 : Qnil); > + > + CHECK_STRING (pattern); > + > + bufp =3D xzalloc (sizeof (*bufp)); > + > + bufp->fastmap =3D xzalloc (FASTMAP_SIZE); > + bufp->translate =3D translate; > + bufp->multibyte =3D STRING_MULTIBYTE (pattern); > + bufp->charset_unibyte =3D charset_unibyte; > + > + val =3D (char *) re_compile_pattern (SSDATA (pattern), SBYTES (pattern= ), > +=09=09=09=09 is_posix, whitespace_regexp, bufp); > + > + if (val) > + { > + xfree (bufp->buffer); > + xfree (bufp->fastmap); > + xfree (bufp); > + xsignal1 (Qinvalid_regexp, build_string (val)); > + } > + > + p =3D ALLOCATE_PSEUDOVECTOR (struct Lisp_Regexp, syntax_table, > +=09=09=09 PVEC_REGEXP); > + p->pattern =3D pattern; > + p->whitespace_regexp =3D Vsearch_spaces_regexp; Why do we save whitespace_regexp, by the way? Technically, of course, Emacs strings are mutable, so someone might modify pattern and it would no longer correspond to the compiled regexp. Who'd do that, though? > + /* If the compiled pattern hard codes some of the contents of the > + syntax-table, it can only be reused with *this* syntax table. */ > + p->syntax_table =3D bufp->used_syntax ? BVAR (current_buffer, syntax_t= able) : Qt; > + p->posix =3D is_posix; > + p->buffer =3D bufp; > + return make_lisp_ptr (p, Lisp_Vectorlike); > +} > + > +DEFUN ("regexpp", Fregexpp, Sregexpp, 1, 1, 0, > + doc: /* Say whether OBJECT is a compiled regexp object. */) > + (Lisp_Object object) > +{ > + return REGEXP_P (object)? Qt: Qnil; > +} > + > +void syms_of_regexp (void) > +{ > + defsubr (&Smake_regexp); > + defsubr (&Sregexpp); > + DEFSYM (Qregexp, "regexp"); > + DEFSYM (Qregexpp, "regexpp"); > +} > diff --git a/src/search.c b/src/search.c > index 2ff8b0599c4..5710ff30005 100644 > --- a/src/search.c > +++ b/src/search.c > @@ -181,6 +181,8 @@ freeze_pattern (struct regexp_cache *searchbuf) > { > eassert (!searchbuf->busy); > record_unwind_protect_ptr (unfreeze_pattern, searchbuf); > + eassert (searchbuf !=3D NULL); > + assume (searchbuf); I believe "eassume" does precisely that. > searchbuf->busy =3D true; > } > > @@ -261,11 +263,13 @@ compile_pattern (Lisp_Object pattern, struct re_reg= isters *regp, > > =0C > static Lisp_Object > -looking_at_1 (Lisp_Object string, bool posix, bool modify_data) > +looking_at_1 (Lisp_Object regexp, bool posix, bool modify_data) > { > Lisp_Object val; > unsigned char *p1, *p2; > ptrdiff_t s1, s2; > + struct re_pattern_buffer *bufp =3D NULL; > + struct regexp_cache *cache_entry =3D NULL; > register ptrdiff_t i; > > if (running_asynch_code) > @@ -276,18 +280,22 @@ looking_at_1 (Lisp_Object string, bool posix, bool = modify_data) > set_char_table_extras (BVAR (current_buffer, case_canon_table), 2, > =09=09=09 BVAR (current_buffer, case_eqv_table)); > > - CHECK_STRING (string); > + if (REGEXP_P (regexp)) > + bufp =3D XREGEXP (regexp)->buffer; > + else > + CHECK_STRING (regexp); Maybe we should have CHECK_STRING_OR_REGEXP (regexp, &bufp), with an appropriate error symbol? > @@ -485,12 +511,21 @@ DEFUN ("posix-string-match", Fposix_string_match, S= posix_string_match, 2, 4, 0, > fast_string_match_internal (Lisp_Object regexp, Lisp_Object string, > =09=09=09 Lisp_Object table) Should we really be modifying the fast_* methods, which compile their patterns quite differently from the ordinary functions? That might cause surprising behavior. > @@ -3405,16 +3473,24 @@ DEFUN ("re--describe-compiled", Fre__describe_com= piled, Sre__describe_compiled, > If RAW is non-nil, just return the actual bytecode. */) > (Lisp_Object regexp, Lisp_Object raw) > { > - struct regexp_cache *cache_entry > - =3D compile_pattern (regexp, NULL, > - (!NILP (Vcase_fold_search) > - ? BVAR (current_buffer, case_canon_table) : Qnil= ), > - false, > - !NILP (BVAR (current_buffer, > - enable_multibyte_characters))); > + struct re_pattern_buffer *bufp; > + > + if (REGEXP_P (regexp)) > + bufp =3D XREGEXP (regexp)->buffer; > + else > + { > + struct regexp_cache *cache_entry > +=09=3D compile_pattern (regexp, NULL, > +=09=09=09 (!NILP (Vcase_fold_search) > +=09=09=09 ? BVAR (current_buffer, case_canon_table) : Qnil), > +=09=09=09 false, > +=09=09=09 !NILP (BVAR (current_buffer, > +=09=09=09=09=09enable_multibyte_characters))); > + bufp =3D &cache_entry->buf; > + } > + We never CHECK_STRING in re--describe-compiled, so (re--describe-compiled nil) currently appears to crash Emacs (without the patch). Can we fix that while we're in there? Pip