Making `replace-match' more efficient

unofficial mirror of emacs-devel@gnu.org 

* Making `replace-match' more efficient
@ 2021-10-07 18:33 Lars Ingebrigtsen
  2021-10-07 18:54 ` Eli Zaretskii
  0 siblings, 1 reply; 3+ messages in thread
From: Lars Ingebrigtsen @ 2021-10-07 18:33 UTC
  To: emacs-devel

We use `replace-match' a lot in our code (either directly or
indirectly), and I was poking at it to see whether there are any obvious
opportunities to make it faster, and immediately found a way to make
`replace-regexp-in-string' 10% faster just by checking whether the
replacement contains any backslashes or not.  (It's f2bd2386a79.)
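
As a rough illustration of the backslash check described above (this is
not the code from that commit, and `has_backslash' is a hypothetical
name), a standalone C sketch might look like the following.  It assumes
the replacement is available as a plain byte buffer with a known length;
when no backslash is present, there are no \N, \& or \\ sequences to
expand and the text can be used verbatim.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not an Emacs function: report whether the LEN
   bytes at TEXT contain a backslash.  A replacement without one has
   nothing to substitute and can be inserted as-is.  */
bool
has_backslash (const char *text, size_t len)
{
  return memchr (text, '\\', len) != NULL;
}
```

A caller would take the cheap literal path when this returns false and
fall back to the full substitution logic otherwise.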

So I poked at it some more, and thought that the exploratory patch below
seemed like an obvious way to make it generate less garbage, but the
performance impact was exactly zero, so I ditched that.

But it seems like there are still opportunities for making it faster
in many common cases.  For instance, the most popular REPLACE string is
"", which should be exploitable...

Anyway, I'm just posting this here in case it inspires somebody to
explore.

diff --git a/src/search.c b/src/search.c
index 08f1e9474f..5471e05c0f 100644
--- a/src/search.c
+++ b/src/search.c
@@ -2501,9 +2501,6 @@ DEFUN ("replace-match", Freplace_match, Sreplace_match, 1, 5, 0,
     {
       Lisp_Object before, after;
 
-      before = Fsubstring (string, make_fixnum (0), make_fixnum (sub_start));
-      after = Fsubstring (string, make_fixnum (sub_end), Qnil);
-
       /* Substitute parts of the match into NEWTEXT
 	 if desired.  */
       if (NILP (literal))
@@ -2598,6 +2595,30 @@ DEFUN ("replace-match", Freplace_match, Sreplace_match, 1, 5, 0,
       else if (case_action == cap_initial)
 	newtext = Fupcase_initials (newtext);
 
+      if (SBYTES (string) == SCHARS (string)
+	  && SBYTES (newtext) == SCHARS (newtext)
+	  && !string_intervals (string)
+	  && !string_intervals (newtext))
+	{
+	  char *conc;
+	  ptrdiff_t len = sub_start + SBYTES (newtext) +
+	    SBYTES (string) - sub_end;
+
+	  conc = xmalloc (len);
+	  memcpy (conc, SSDATA (string), sub_start);
+	  memcpy (conc + sub_start, SSDATA (newtext), SBYTES (newtext));
+	  memcpy (conc + sub_start + SBYTES (newtext),
+		  SSDATA (string) + sub_end,
+		  SBYTES (string) - sub_end);
+	  Lisp_Object result = make_unibyte_string (conc, len);
+
+	  xfree (conc);
+	  return result;
+	}
+      
+      before = Fsubstring (string, make_fixnum (0), make_fixnum (sub_start));
+      after = Fsubstring (string, make_fixnum (sub_end), Qnil);
+
       return concat3 (before, newtext, after);
     }
 

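Outside of Emacs, the three-memcpy idea in the patch above can be
sketched in plain C roughly as follows.  This is only an illustration
under stated assumptions: `splice_bytes' and its parameters are
hypothetical names, the inputs are treated as raw byte buffers, and
malloc/free stand in for the Lisp string allocation.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated, NUL-terminated copy
   of the STR_LEN bytes at STR in which the bytes in the range
   [SUB_START, SUB_END) are replaced by the REPL_LEN bytes at REPL.
   Returns NULL if the range is invalid or allocation fails.  The
   caller frees the result.  */
char *
splice_bytes (const char *str, size_t str_len,
	      size_t sub_start, size_t sub_end,
	      const char *repl, size_t repl_len)
{
  if (sub_start > sub_end || sub_end > str_len)
    return NULL;

  size_t tail_len = str_len - sub_end;
  size_t len = sub_start + repl_len + tail_len;
  char *out = malloc (len + 1);
  if (out == NULL)
    return NULL;

  /* Text before the match, the replacement, then text after the match.  */
  memcpy (out, str, sub_start);
  memcpy (out + sub_start, repl, repl_len);
  memcpy (out + sub_start + repl_len, str + sub_end, tail_len);
  out[len] = '\0';
  return out;
}
```

With an empty replacement (REPL_LEN of 0) the middle copy does nothing,
so the same routine also covers the common "" case mentioned earlier.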
-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





* Re: Making `replace-match' more efficient
  2021-10-07 18:33 Making `replace-match' more efficient Lars Ingebrigtsen
@ 2021-10-07 18:54 ` Eli Zaretskii
  2021-10-07 18:56   ` Lars Ingebrigtsen
  0 siblings, 1 reply; 3+ messages in thread
From: Eli Zaretskii @ 2021-10-07 18:54 UTC
  To: Lars Ingebrigtsen; +Cc: emacs-devel

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Date: Thu, 07 Oct 2021 20:33:00 +0200
> 
> +      if (SBYTES (string) == SCHARS (string)
> +	  && SBYTES (newtext) == SCHARS (newtext)

This is not enough to detect pure-ASCII strings, because such strings
are often unibyte.  Perhaps this is one reason why you didn't see
any significant performance gains.
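
To illustrate the point: in a unibyte string every byte counts as one
character, so the byte count equals the character count even when bytes
of 0x80 and above are present.  A direct test has to look at the bytes
themselves.  The sketch below is a standalone C illustration, not Emacs
code, and `all_ascii' is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: return true if each of the LEN bytes at DATA
   is below 0x80, i.e. the buffer is pure ASCII.  */
bool
all_ascii (const unsigned char *data, size_t len)
{
  for (size_t i = 0; i < len; i++)
    if (data[i] >= 0x80)
      return false;
  return true;
}
```

For example, the four bytes 'c', 'a', 'f', 0xE9 have equal byte and
character counts when treated as unibyte, yet the last byte is not ASCII.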





* Re: Making `replace-match' more efficient
  2021-10-07 18:54 ` Eli Zaretskii
@ 2021-10-07 18:56   ` Lars Ingebrigtsen
  0 siblings, 0 replies; 3+ messages in thread
From: Lars Ingebrigtsen @ 2021-10-07 18:56 UTC
  To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> +      if (SBYTES (string) == SCHARS (string)
>> +	  && SBYTES (newtext) == SCHARS (newtext)
>
> This is not enough to detect pure-ASCII strings, because such strings
> are often unibyte.  Perhaps this is one reason why you didn't see
> any significant performance gains.

It was just an exploratory patch.  I confirmed that this branch was
taken in my tests.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



