unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Matching regex case-sensitively in C strings?
@ 2022-11-07  5:41 Yuan Fu
  2022-11-07 11:52 ` Eli Zaretskii
  0 siblings, 1 reply; 16+ messages in thread
From: Yuan Fu @ 2022-11-07  5:41 UTC (permalink / raw)
  To: emacs-devel

Is there a function that matches regex case-sensitively in a C string? Ie, something like fast_c_string_match_ignore_case but case-sensitive.

Yuan


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Matching regex case-sensitively in C strings?
  2022-11-07  5:41 Matching regex case-sensitively in C strings? Yuan Fu
@ 2022-11-07 11:52 ` Eli Zaretskii
  2022-11-07 20:35   ` Yuan Fu
  0 siblings, 1 reply; 16+ messages in thread
From: Eli Zaretskii @ 2022-11-07 11:52 UTC (permalink / raw)
  To: Yuan Fu; +Cc: emacs-devel

> From: Yuan Fu <casouri@gmail.com>
> Date: Sun, 6 Nov 2022 21:41:42 -0800
> 
> Is there a function that matches regex case-sensitively in a C string? Ie, something like fast_c_string_match_ignore_case but case-sensitive.

I don't think so, but it should be trivial to add such a function if
you need it.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Matching regex case-sensitively in C strings?
  2022-11-07 11:52 ` Eli Zaretskii
@ 2022-11-07 20:35   ` Yuan Fu
  2022-11-08 10:18     ` Mattias Engdegård
  0 siblings, 1 reply; 16+ messages in thread
From: Yuan Fu @ 2022-11-07 20:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel



> On Nov 7, 2022, at 3:52 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Sun, 6 Nov 2022 21:41:42 -0800
>> 
>> Is there a function that matches regex case-sensitively in a C string? Ie, something like fast_c_string_match_ignore_case but case-sensitive.
> 
> I don't think so, but it should be trivial to add such a function if
> you need it.

Cool, I need some help understanding fast_c_string_match_ignore_case:

ptrdiff_t
fast_c_string_match_ignore_case (Lisp_Object regexp,
				 const char *string, ptrdiff_t len)
{
  regexp = string_make_unibyte (regexp);
  // Why do we need to unwind stack?
  specpdl_ref count = SPECPDL_INDEX ();
  struct regexp_cache *cache_entry
    = compile_pattern (regexp, 0, Vascii_canon_table, 0, 0);
  // What does freezing a pattern do?
  freeze_pattern (cache_entry);
  // What is re_match_object for? I see that it can be t, nil or a string.
  re_match_object = Qt;
  // 
  ptrdiff_t val = re_search (&cache_entry->buf, string, len, 0, len, 0);
  unbind_to (count, Qnil);
  return val;
}

Yuan


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Matching regex case-sensitively in C strings?
  2022-11-07 20:35   ` Yuan Fu
@ 2022-11-08 10:18     ` Mattias Engdegård
  2022-11-08 12:29       ` Eli Zaretskii
  2022-11-08 19:31       ` Yuan Fu
  0 siblings, 2 replies; 16+ messages in thread
From: Mattias Engdegård @ 2022-11-08 10:18 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Eli Zaretskii, emacs-devel

7 nov. 2022 kl. 21.35 skrev Yuan Fu <casouri@gmail.com>:

> fast_c_string_match_ignore_case (Lisp_Object regexp,
> 				 const char *string, ptrdiff_t len)
> {
>  regexp = string_make_unibyte (regexp);

This is expensive and not obviously correct when it makes a difference. Ie, no longer "fast", and may hide bugs.
Something should be done about that.

>  // Why do we need to unwind stack?
>  specpdl_ref count = SPECPDL_INDEX ();

Because freeze_pattern pushes an unwind-protect on the specpdl.

>  struct regexp_cache *cache_entry
>    = compile_pattern (regexp, 0, Vascii_canon_table, 0, 0);

`Vascii_canon_table` is what makes it case-insensitive; you want to use Qnil (but you probably already know that now).
Since this is the only thing that differs from your intended use, I suggest you generalise this subroutine with a boolean parameter.

>  // What does freezing a pattern do?
>  freeze_pattern (cache_entry);

It locks the compiled pattern record to make the regexp engine reentrant (but here it also seems to be used for GC purposes; not sure about that).

>  // What is re_match_object for? I see that it can be t, nil or a string.
>  re_match_object = Qt;

Described in regex-emacs.h:

> /* The string or buffer being matched.
>    It is used for looking up syntax properties.
> 
>    If the value is a Lisp string object, match text in that string; if
>    it's nil, match text in the current buffer; if it's t, match text
>    in a C string.
> 
>    This value is effectively another parameter to re_search_2 and
>    re_match_2.  No calls into Lisp or thread switches are allowed
>    before setting re_match_object and calling into the regex search
>    and match functions.  These functions capture the current value of
>    re_match_object into gl_state on entry.
> 






^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Matching regex case-sensitively in C strings?
  2022-11-08 10:18     ` Mattias Engdegård
@ 2022-11-08 12:29       ` Eli Zaretskii
  2022-11-08 19:31       ` Yuan Fu
  1 sibling, 0 replies; 16+ messages in thread
From: Eli Zaretskii @ 2022-11-08 12:29 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: casouri, emacs-devel

> From: Mattias Engdegård <mattiase@acm.org>
> Date: Tue, 8 Nov 2022 11:18:10 +0100
> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel <emacs-devel@gnu.org>
> 
> >  // What does freezing a pattern do?
> >  freeze_pattern (cache_entry);
> 
> It locks the compiled pattern record to make the regexp engine reentrant (but here it also seems to be used for GC purposes; not sure about that).

See shrink_regexp_cache, which is called by GC.  You don't want this
to happen to you when you are in the middle of a regexp search.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Matching regex case-sensitively in C strings?
  2022-11-08 10:18     ` Mattias Engdegård
  2022-11-08 12:29       ` Eli Zaretskii
@ 2022-11-08 19:31       ` Yuan Fu
  2022-11-08 19:37         ` Eli Zaretskii
  1 sibling, 1 reply; 16+ messages in thread
From: Yuan Fu @ 2022-11-08 19:31 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: Eli Zaretskii, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1994 bytes --]



> On Nov 8, 2022, at 2:18 AM, Mattias Engdegård <mattiase@acm.org> wrote:
> 
> 7 nov. 2022 kl. 21.35 skrev Yuan Fu <casouri@gmail.com>:
> 
>> fast_c_string_match_ignore_case (Lisp_Object regexp,
>> 				 const char *string, ptrdiff_t len)
>> {
>> regexp = string_make_unibyte (regexp);
> 
> This is expensive and not obviously correct when it makes a difference. Ie, no longer "fast", and may hide bugs.
> Something should be done about that.
> 
>> // Why do we need to unwind stack?
>> specpdl_ref count = SPECPDL_INDEX ();
> 
> Because freeze_pattern pushes an unwind-protect on the specpdl.
> 
>> struct regexp_cache *cache_entry
>>   = compile_pattern (regexp, 0, Vascii_canon_table, 0, 0);
> 
> `Vascii_canon_table` is what makes it case-insensitive; you want to use Qnil (but you probably already know that now).
> Since this is the only thing that differs from your intended use, I suggest you generalise this subroutine with a boolean parameter.
> 
>> // What does freezing a pattern do?
>> freeze_pattern (cache_entry);
> 
> It locks the compiled pattern record to make the regexp engine reentrant (but here it also seems to be used for GC purposes; not sure about that).
> 
>> // What is re_match_object for? I see that it can be t, nil or a string.
>> re_match_object = Qt;
> 
> Described in regex-emacs.h:
> 
>> /* The string or buffer being matched.
>>   It is used for looking up syntax properties.
>> 
>>   If the value is a Lisp string object, match text in that string; if
>>   it's nil, match text in the current buffer; if it's t, match text
>>   in a C string.
>> 
>>   This value is effectively another parameter to re_search_2 and
>>   re_match_2.  No calls into Lisp or thread switches are allowed
>>   before setting re_match_object and calling into the regex search
>>   and match functions.  These functions capture the current value of
>>   re_match_object into gl_state on entry.
>> 

Thanks! How about:

Yuan


[-- Attachment #2: c_string_match.diff --]
[-- Type: application/octet-stream, Size: 2112 bytes --]

diff --git a/src/lisp.h b/src/lisp.h
index 1e41e2064c9..d0b3f8f05a5 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -4772,6 +4772,8 @@ fast_string_match_ignore_case (Lisp_Object regexp, Lisp_Object string)
 
 extern ptrdiff_t fast_c_string_match_ignore_case (Lisp_Object, const char *,
 						  ptrdiff_t);
+extern ptrdiff_t fast_c_string_match (Lisp_Object, const char *, ptrdiff_t,
+				      bool);
 extern ptrdiff_t fast_looking_at (Lisp_Object, ptrdiff_t, ptrdiff_t,
                                   ptrdiff_t, ptrdiff_t, Lisp_Object);
 extern ptrdiff_t find_newline1 (ptrdiff_t, ptrdiff_t, ptrdiff_t, ptrdiff_t,
diff --git a/src/search.c b/src/search.c
index b5d6a442c0f..90856cf5c12 100644
--- a/src/search.c
+++ b/src/search.c
@@ -505,10 +505,29 @@ fast_string_match_internal (Lisp_Object regexp, Lisp_Object string,
 fast_c_string_match_ignore_case (Lisp_Object regexp,
 				 const char *string, ptrdiff_t len)
 {
+  return fast_c_string_match (regexp, string, len, true);
+}
+
+/* Match REGEXP against STRING, searching all of STRING and return the
+   index of the match, or negative on failure.  This does not clobber
+   the match data.  Ignore case when searching if IGNORE_CASE is true.
+
+   We assume that STRING contains single-byte characters.  */
+
+ptrdiff_t
+fast_c_string_match (Lisp_Object regepx,
+		     const char *string, ptrdiff_t len, bool ignore_case)
+{
+  /* FIXME: This is expensive and not obviously correct when it makes
+     a difference. I.e., no longer "fast", and may hide bugs.
+     Something should be done about this.  */
   regexp = string_make_unibyte (regexp);
+  /* Record specpdl index because freeze_pattern pushes an
+     unwind-protect on the specpdl.  */
   specpdl_ref count = SPECPDL_INDEX ();
+  Lisp_Object translate_table = ignore_case ? Vascii_canon_table : Qnil;
   struct regexp_cache *cache_entry
-    = compile_pattern (regexp, 0, Vascii_canon_table, 0, 0);
+    = compile_pattern (regexp, 0, translate_table, 0, 0);
   freeze_pattern (cache_entry);
   re_match_object = Qt;
   ptrdiff_t val = re_search (&cache_entry->buf, string, len, 0, len, 0);

[-- Attachment #3: Type: text/plain, Size: 2 bytes --]




^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: Matching regex case-sensitively in C strings?
  2022-11-08 19:31       ` Yuan Fu
@ 2022-11-08 19:37         ` Eli Zaretskii
  2022-11-08 20:59           ` Yuan Fu
  2022-11-09 10:33           ` Mattias Engdegård
  0 siblings, 2 replies; 16+ messages in thread
From: Eli Zaretskii @ 2022-11-08 19:37 UTC (permalink / raw)
  To: Yuan Fu; +Cc: mattiase, emacs-devel

> From: Yuan Fu <casouri@gmail.com>
> Date: Tue, 8 Nov 2022 11:31:35 -0800
> Cc: Eli Zaretskii <eliz@gnu.org>,
>  emacs-devel <emacs-devel@gnu.org>
> 
> +  Lisp_Object translate_table = ignore_case ? Vascii_canon_table : Qnil;

Please don't call this "translate_table", as it isn't.  The
documentation elsewhere calls this "canonicalize_table" or just
"canonicalize".

>  fast_c_string_match_ignore_case (Lisp_Object regexp,
>  				 const char *string, ptrdiff_t len)
>  {
> +  return fast_c_string_match (regexp, string, len, true);
> +}

I'm bothered that this function, which is supposed to be fast, will
now be slower due to an extra function call.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Matching regex case-sensitively in C strings?
  2022-11-08 19:37         ` Eli Zaretskii
@ 2022-11-08 20:59           ` Yuan Fu
  2022-11-09 10:53             ` Mattias Engdegård
  2022-11-09 10:33           ` Mattias Engdegård
  1 sibling, 1 reply; 16+ messages in thread
From: Yuan Fu @ 2022-11-08 20:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: mattiase, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 842 bytes --]



> On Nov 8, 2022, at 11:37 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Tue, 8 Nov 2022 11:31:35 -0800
>> Cc: Eli Zaretskii <eliz@gnu.org>,
>> emacs-devel <emacs-devel@gnu.org>
>> 
>> +  Lisp_Object translate_table = ignore_case ? Vascii_canon_table : Qnil;
> 
> Please don't call this "translate_table", as it isn't.  The
> documentation elsewhere calls this "canonicalize_table" or just
> "canonicalize".
> 
>> fast_c_string_match_ignore_case (Lisp_Object regexp,
>> 				 const char *string, ptrdiff_t len)
>> {
>> +  return fast_c_string_match (regexp, string, len, true);
>> +}
> 
> I'm bothered that this function, which is supposed to be fast, will
> now be slower due to an extra function call.

How about this:

Now c_string_match functions mirror their string_match counterparts.

Yuan


[-- Attachment #2: c_string_match.diff --]
[-- Type: application/octet-stream, Size: 2934 bytes --]

diff --git a/src/lisp.h b/src/lisp.h
index 1e41e2064c9..9f497d92270 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -4757,6 +4757,8 @@ XMODULE_FUNCTION (Lisp_Object o)
 extern void record_unwind_save_match_data (void);
 extern ptrdiff_t fast_string_match_internal (Lisp_Object, Lisp_Object,
 					     Lisp_Object);
+extern ptrdiff_t fast_c_string_match_internal (Lisp_Object, const char *,
+					       ptrdiff_t, Lisp_Object);
 
 INLINE ptrdiff_t
 fast_string_match (Lisp_Object regexp, Lisp_Object string)
@@ -4770,8 +4772,21 @@ fast_string_match_ignore_case (Lisp_Object regexp, Lisp_Object string)
   return fast_string_match_internal (regexp, string, Vascii_canon_table);
 }
 
-extern ptrdiff_t fast_c_string_match_ignore_case (Lisp_Object, const char *,
-						  ptrdiff_t);
+INLINE ptrdiff_t
+fast_c_string_match (Lisp_Object regexp,
+		     const char *string, ptrdiff_t len)
+{
+  return fast_c_string_match_internal (regexp, string, len, Qnil);
+}
+
+INLINE ptrdiff_t
+fast_c_string_match_ignore_case (Lisp_Object regexp,
+				 const char *string, ptrdiff_t len)
+{
+  return fast_c_string_match_internal (regexp, string, len,
+				       Vascii_canon_table);
+}
+
 extern ptrdiff_t fast_looking_at (Lisp_Object, ptrdiff_t, ptrdiff_t,
                                   ptrdiff_t, ptrdiff_t, Lisp_Object);
 extern ptrdiff_t find_newline1 (ptrdiff_t, ptrdiff_t, ptrdiff_t, ptrdiff_t,
diff --git a/src/search.c b/src/search.c
index b5d6a442c0f..f7a28202b23 100644
--- a/src/search.c
+++ b/src/search.c
@@ -496,19 +496,26 @@ fast_string_match_internal (Lisp_Object regexp, Lisp_Object string,
   return val;
 }
 
-/* Match REGEXP against STRING, searching all of STRING ignoring case,
-   and return the index of the match, or negative on failure.
-   This does not clobber the match data.
+/* Match REGEXP against STRING, searching all of STRING and return the
+   index of the match, or negative on failure.  This does not clobber
+   the match data.  Table is a canonicalize table.
+
    We assume that STRING contains single-byte characters.  */
 
 ptrdiff_t
-fast_c_string_match_ignore_case (Lisp_Object regexp,
-				 const char *string, ptrdiff_t len)
+fast_c_string_match_internal (Lisp_Object regexp,
+			      const char *string, ptrdiff_t len,
+			      Lisp_Object table)
 {
+  /* FIXME: This is expensive and not obviously correct when it makes
+     a difference. I.e., no longer "fast", and may hide bugs.
+     Something should be done about this.  */
   regexp = string_make_unibyte (regexp);
+  /* Record specpdl index because freeze_pattern pushes an
+     unwind-protect on the specpdl.  */
   specpdl_ref count = SPECPDL_INDEX ();
   struct regexp_cache *cache_entry
-    = compile_pattern (regexp, 0, Vascii_canon_table, 0, 0);
+    = compile_pattern (regexp, 0, table, 0, 0);
   freeze_pattern (cache_entry);
   re_match_object = Qt;
   ptrdiff_t val = re_search (&cache_entry->buf, string, len, 0, len, 0);

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: Matching regex case-sensitively in C strings?
  2022-11-08 19:37         ` Eli Zaretskii
  2022-11-08 20:59           ` Yuan Fu
@ 2022-11-09 10:33           ` Mattias Engdegård
  2022-11-09 13:06             ` Eli Zaretskii
  1 sibling, 1 reply; 16+ messages in thread
From: Mattias Engdegård @ 2022-11-09 10:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Yuan Fu, emacs-devel

8 nov. 2022 kl. 20.37 skrev Eli Zaretskii <eliz@gnu.org>:

> Please don't call this "translate_table", as it isn't.  The
> documentation elsewhere calls this "canonicalize_table" or just
> "canonicalize".

Agreed.

>> fast_c_string_match_ignore_case (Lisp_Object regexp,
>> 				 const char *string, ptrdiff_t len)
>> {
>> +  return fast_c_string_match (regexp, string, len, true);
>> +}
> 
> I'm bothered that this function, which is supposed to be fast, will
> now be slower due to an extra function call.

You are right to worry about it, but there is actually no need for concern here: it's a tail call (and the extra argument is last) so it should compile to an unconditional jump (and setting a register). You would be hard-pressed to measure the "cost" even if you looked at cycle counters.




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Matching regex case-sensitively in C strings?
  2022-11-08 20:59           ` Yuan Fu
@ 2022-11-09 10:53             ` Mattias Engdegård
  0 siblings, 0 replies; 16+ messages in thread
From: Mattias Engdegård @ 2022-11-09 10:53 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Eli Zaretskii, emacs-devel

8 nov. 2022 kl. 21.59 skrev Yuan Fu <casouri@gmail.com>:

> How about this:

Either would do as far as I'm concerned; there is no difference in performance (inlining buys us nothing here).

+/* Match REGEXP against STRING, searching all of STRING and return the
+   index of the match, or negative on failure.  This does not clobber
+   the match data.  Table is a canonicalize table.
                                                  ^
                                               or nil





^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Matching regex case-sensitively in C strings?
  2022-11-09 10:33           ` Mattias Engdegård
@ 2022-11-09 13:06             ` Eli Zaretskii
  2022-11-10  9:52               ` Mattias Engdegård
  0 siblings, 1 reply; 16+ messages in thread
From: Eli Zaretskii @ 2022-11-09 13:06 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: casouri, emacs-devel

> From: Mattias Engdegård <mattiase@acm.org>
> Date: Wed, 9 Nov 2022 11:33:09 +0100
> Cc: Yuan Fu <casouri@gmail.com>, emacs-devel@gnu.org
> 
> 8 nov. 2022 kl. 20.37 skrev Eli Zaretskii <eliz@gnu.org>:
> 
> >> fast_c_string_match_ignore_case (Lisp_Object regexp,
> >> 				 const char *string, ptrdiff_t len)
> >> {
> >> +  return fast_c_string_match (regexp, string, len, true);
> >> +}
> > 
> > I'm bothered that this function, which is supposed to be fast, will
> > now be slower due to an extra function call.
> 
> You are right to worry about it, but there is actually no need for
> concern here: it's a tail call (and the extra argument is last) so
> it should compile to an unconditional jump (and setting a register).

That's not what I see here, even with -O2.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Matching regex case-sensitively in C strings?
  2022-11-09 13:06             ` Eli Zaretskii
@ 2022-11-10  9:52               ` Mattias Engdegård
  2022-11-10 11:18                 ` Eli Zaretskii
  0 siblings, 1 reply; 16+ messages in thread
From: Mattias Engdegård @ 2022-11-10  9:52 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: casouri, emacs-devel

9 nov. 2022 kl. 14.06 skrev Eli Zaretskii <eliz@gnu.org>:

>> You are right to worry about it, but there is actually no need for
>> concern here: it's a tail call (and the extra argument is last) so
>> it should compile to an unconditional jump (and setting a register).
> 
> That's not what I see here, even with -O2.

Trying to read your mind, are you using 32-bit x86? Even that shouldn't be disastrous; it's just some stack manipulation. Hardly noticeable given that it's a regexp match that comes next.

I am definitely guilty of not thinking much of 32-bit x86 when coding for performance (haven't for a number years). Of course this doesn't mean we should allow it to become much worse than it needs to be.




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Matching regex case-sensitively in C strings?
  2022-11-10  9:52               ` Mattias Engdegård
@ 2022-11-10 11:18                 ` Eli Zaretskii
  2022-11-10 11:35                   ` Robert Pluim
  2022-11-10 14:20                   ` Mattias Engdegård
  0 siblings, 2 replies; 16+ messages in thread
From: Eli Zaretskii @ 2022-11-10 11:18 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: casouri, emacs-devel

> From: Mattias Engdegård <mattiase@acm.org>
> Date: Thu, 10 Nov 2022 10:52:40 +0100
> Cc: casouri@gmail.com, emacs-devel@gnu.org
> 
> 9 nov. 2022 kl. 14.06 skrev Eli Zaretskii <eliz@gnu.org>:
> 
> >> You are right to worry about it, but there is actually no need for
> >> concern here: it's a tail call (and the extra argument is last) so
> >> it should compile to an unconditional jump (and setting a register).
> > 
> > That's not what I see here, even with -O2.
> 
> Trying to read your mind, are you using 32-bit x86? Even that shouldn't be disastrous; it's just some stack manipulation. Hardly noticeable given that it's a regexp match that comes next.

I agree that the effect is probably minuscule, but it's still there.
And on some architectures it could be more than that.

Why not have two separate functions for these two jobs?  The code is
short, and some duplication is not an issue in this case.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Matching regex case-sensitively in C strings?
  2022-11-10 11:18                 ` Eli Zaretskii
@ 2022-11-10 11:35                   ` Robert Pluim
  2022-11-10 14:20                   ` Mattias Engdegård
  1 sibling, 0 replies; 16+ messages in thread
From: Robert Pluim @ 2022-11-10 11:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Mattias Engdegård, casouri, emacs-devel

>>>>> On Thu, 10 Nov 2022 13:18:56 +0200, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Mattias Engdegård <mattiase@acm.org>
    >> Date: Thu, 10 Nov 2022 10:52:40 +0100
    >> Cc: casouri@gmail.com, emacs-devel@gnu.org
    >> 
    >> 9 nov. 2022 kl. 14.06 skrev Eli Zaretskii <eliz@gnu.org>:
    >> 
    >> >> You are right to worry about it, but there is actually no need for
    >> >> concern here: it's a tail call (and the extra argument is last) so
    >> >> it should compile to an unconditional jump (and setting a register).
    >> > 
    >> > That's not what I see here, even with -O2.
    >> 
    >> Trying to read your mind, are you using 32-bit x86? Even that
    >> shouldn't be disastrous; it's just some stack manipulation. Hardly
    >> noticeable given that it's a regexp match that comes next.

    Eli> I agree that the effect is probably minuscule, but it's still there.
    Eli> And on some architectures it could be more than that.

    Eli> Why not have two separate functions for these two jobs?  The code is
    Eli> short, and some duplication is not an issue in this case.

Or one generic function and two #defines that pass in the appropriate
canonicalize table value.

Robert
-- 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Matching regex case-sensitively in C strings?
  2022-11-10 11:18                 ` Eli Zaretskii
  2022-11-10 11:35                   ` Robert Pluim
@ 2022-11-10 14:20                   ` Mattias Engdegård
  2022-11-10 22:25                     ` Yuan Fu
  1 sibling, 1 reply; 16+ messages in thread
From: Mattias Engdegård @ 2022-11-10 14:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: casouri, emacs-devel

10 nov. 2022 kl. 12.18 skrev Eli Zaretskii <eliz@gnu.org>:

> Why not have two separate functions for these two jobs?  The code is
> short, and some duplication is not an issue in this case.

Sure. Yuan Fu presented an alternative which used inline functions and I don't mind that approach either.




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Matching regex case-sensitively in C strings?
  2022-11-10 14:20                   ` Mattias Engdegård
@ 2022-11-10 22:25                     ` Yuan Fu
  0 siblings, 0 replies; 16+ messages in thread
From: Yuan Fu @ 2022-11-10 22:25 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: Eli Zaretskii, emacs-devel



> On Nov 10, 2022, at 6:20 AM, Mattias Engdegård <mattiase@acm.org> wrote:
> 
> 10 nov. 2022 kl. 12.18 skrev Eli Zaretskii <eliz@gnu.org>:
> 
>> Why not have two separate functions for these two jobs?  The code is
>> short, and some duplication is not an issue in this case.
> 
> Sure. Yuan Fu presented an alternative which used inline functions and I don't mind that approach either.
> 

Yeah, I pushed that version onto feature/tree-sitter.

Yuan


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-11-10 22:25 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-07  5:41 Matching regex case-sensitively in C strings? Yuan Fu
2022-11-07 11:52 ` Eli Zaretskii
2022-11-07 20:35   ` Yuan Fu
2022-11-08 10:18     ` Mattias Engdegård
2022-11-08 12:29       ` Eli Zaretskii
2022-11-08 19:31       ` Yuan Fu
2022-11-08 19:37         ` Eli Zaretskii
2022-11-08 20:59           ` Yuan Fu
2022-11-09 10:53             ` Mattias Engdegård
2022-11-09 10:33           ` Mattias Engdegård
2022-11-09 13:06             ` Eli Zaretskii
2022-11-10  9:52               ` Mattias Engdegård
2022-11-10 11:18                 ` Eli Zaretskii
2022-11-10 11:35                   ` Robert Pluim
2022-11-10 14:20                   ` Mattias Engdegård
2022-11-10 22:25                     ` Yuan Fu

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).