From: Paul Eggert <eggert@cs.ucla.edu>
To: Lars Ingebrigtsen <larsi@gnus.org>
Cc: 19208@debbugs.gnu.org, Phillip Lord <phillip.lord@newcastle.ac.uk>
Subject: bug#19208: replace-match unhelpful error message
Date: Sat, 3 Aug 2019 13:22:26 -0700 [thread overview]
Message-ID: <e5164f5d-38ee-660c-39b7-5bd564d587e3@cs.ucla.edu> (raw)
In-Reply-To: <87r3wn5kgw.fsf@newcastle.ac.uk>
[-- Attachment #1: Type: text/plain, Size: 1298 bytes --]
> No, that's just another sanity check -- num_regs is the max allowed
> number of sub-matches. (I've now added some comments to clarify.)
Unfortunately that patch messes up the sanity check, as the patched code allows
'sub' to be negative, or to be equal to search_regs.num_regs, and in either case
this results in a bad pointer.
> Yoda conditionals and a !...
Actually those were Leibniz conditionals, which are comparisons involving "<" or
"<=". The idea is that the conditionals' textual order reflects numeric order.
This is a common style in math when doing range checking, e.g., "0 <= i < n".
Yoda conditionals are expressions like "0 != x" which I agree are confusing.
I learned Leibniz conditionals from the late Val Schorre, who was *really* good
at programming and programming style: Knuth credits Schorre with the invention
of goto-less programming in the early 1960s. When I redid your patch as a
Leibniz conditional, I instantly spotted the bug that it introduced. You might
give Leibniz conditionals a try, as they help make code more readable and
reliable when checking for range errors or overflow.
While in the neighborhood I spotted some other rare possibilities for bad
behavior due to out-of-range indexes. I fixed the bugs I found by installing the
attached.
[-- Attachment #2: 0001-Fix-rare-undefined-behaviors-in-replace-match.patch --]
[-- Type: text/x-patch, Size: 8020 bytes --]
From 13fe8a27042b1539d43727e6df97c386c61c3053 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 3 Aug 2019 12:45:19 -0700
Subject: [PATCH] Fix rare undefined behaviors in replace-match
* src/search.c (Freplace_match): Simplify by caching search_regs
components. Fix sanity check for out-of-range subscripts;
it incorrectly allowed negative subscripts, subscripts
equal to search_regs.num_regs, and it had undefined
behavior for subscripts outside ptrdiff_t range.
Improve wording of newly-introduced replace-match diagnostic.
Rework use of opoint, to avoid setting point to an out-of-range value
in rare cases involving modification hooks.
---
src/search.c | 107 ++++++++++++++++++---------------------------------
1 file changed, 38 insertions(+), 69 deletions(-)
diff --git a/src/search.c b/src/search.c
index 0e2ae059e8..9b674a5810 100644
--- a/src/search.c
+++ b/src/search.c
@@ -2389,44 +2389,32 @@ since only regular expressions have distinguished subexpressions. */)
case_action = nochange; /* We tried an initialization */
/* but some C compilers blew it */
- if (search_regs.num_regs <= 0)
+ ptrdiff_t num_regs = search_regs.num_regs;
+ if (num_regs <= 0)
error ("`replace-match' called before any match found");
if (NILP (subexp))
sub = 0;
else
{
- CHECK_FIXNUM (subexp);
+ CHECK_RANGED_INTEGER (subexp, 0, num_regs - 1);
sub = XFIXNUM (subexp);
- /* Sanity check to see whether the subexp is larger than the
- allowed number of sub-regexps. */
- if (sub >= 0 && sub > search_regs.num_regs)
- args_out_of_range (subexp, make_fixnum (search_regs.num_regs));
}
- /* Check whether the subexpression to replace is greater than the
- number of subexpressions in the regexp. */
- if (sub > 0 && search_regs.start[sub] == -1)
- args_out_of_range (build_string ("Attempt to replace regexp subexpression that doesn't exist"),
- subexp);
+ ptrdiff_t sub_start = search_regs.start[sub];
+ ptrdiff_t sub_end = search_regs.end[sub];
+ eassert (sub_start <= sub_end);
- /* Sanity check to see whether the text to replace is present in the
- buffer/string. */
- if (NILP (string))
+ /* Check whether the text to replace is present in the buffer/string. */
+ if (! (NILP (string)
+ ? BEGV <= sub_start && sub_end <= ZV
+ : 0 <= sub_start && sub_end <= SCHARS (string)))
{
- if (search_regs.start[sub] < BEGV
- || search_regs.start[sub] > search_regs.end[sub]
- || search_regs.end[sub] > ZV)
- args_out_of_range (make_fixnum (search_regs.start[sub]),
- make_fixnum (search_regs.end[sub]));
- }
- else
- {
- if (search_regs.start[sub] < 0
- || search_regs.start[sub] > search_regs.end[sub]
- || search_regs.end[sub] > SCHARS (string))
- args_out_of_range (make_fixnum (search_regs.start[sub]),
- make_fixnum (search_regs.end[sub]));
+ if (sub_start < 0)
+ xsignal2 (Qerror,
+ build_string ("replace-match subexpression does not exist"),
+ subexp);
+ args_out_of_range (make_fixnum (sub_start), make_fixnum (sub_end));
}
if (NILP (fixedcase))
@@ -2434,8 +2422,8 @@ since only regular expressions have distinguished subexpressions. */)
/* Decide how to casify by examining the matched text. */
ptrdiff_t last;
- pos = search_regs.start[sub];
- last = search_regs.end[sub];
+ pos = sub_start;
+ last = sub_end;
if (NILP (string))
pos_byte = CHAR_TO_BYTE (pos);
@@ -2511,9 +2499,8 @@ since only regular expressions have distinguished subexpressions. */)
{
Lisp_Object before, after;
- before = Fsubstring (string, make_fixnum (0),
- make_fixnum (search_regs.start[sub]));
- after = Fsubstring (string, make_fixnum (search_regs.end[sub]), Qnil);
+ before = Fsubstring (string, make_fixnum (0), make_fixnum (sub_start));
+ after = Fsubstring (string, make_fixnum (sub_end), Qnil);
/* Substitute parts of the match into NEWTEXT
if desired. */
@@ -2542,12 +2529,12 @@ since only regular expressions have distinguished subexpressions. */)
if (c == '&')
{
- substart = search_regs.start[sub];
- subend = search_regs.end[sub];
+ substart = sub_start;
+ subend = sub_end;
}
else if (c >= '1' && c <= '9')
{
- if (c - '0' < search_regs.num_regs
+ if (c - '0' < num_regs
&& search_regs.start[c - '0'] >= 0)
{
substart = search_regs.start[c - '0'];
@@ -2612,13 +2599,8 @@ since only regular expressions have distinguished subexpressions. */)
return concat3 (before, newtext, after);
}
- /* Record point, then move (quietly) to the start of the match. */
- if (PT >= search_regs.end[sub])
- opoint = PT - ZV;
- else if (PT > search_regs.start[sub])
- opoint = search_regs.end[sub] - ZV;
- else
- opoint = PT;
+ /* Record point. A nonpositive OPOINT is actually an offset from ZV. */
+ opoint = PT <= sub_start ? PT : max (PT, sub_end) - ZV;
/* If we want non-literal replacement,
perform substitution on the replacement string. */
@@ -2687,7 +2669,7 @@ since only regular expressions have distinguished subexpressions. */)
if (c == '&')
idx = sub;
- else if (c >= '1' && c <= '9' && c - '0' < search_regs.num_regs)
+ else if ('1' <= c && c <= '9' && c - '0' < num_regs)
{
if (search_regs.start[c - '0'] >= 1)
idx = c - '0';
@@ -2745,25 +2727,11 @@ since only regular expressions have distinguished subexpressions. */)
xfree (substed);
}
- /* The functions below modify the buffer, so they could trigger
- various modification hooks (see signal_before_change and
- signal_after_change). If these hooks clobber the match data we
- error out since otherwise this will result in confusing bugs. */
- ptrdiff_t sub_start = search_regs.start[sub];
- ptrdiff_t sub_end = search_regs.end[sub];
- ptrdiff_t num_regs = search_regs.num_regs;
- newpoint = search_regs.start[sub] + SCHARS (newtext);
+ newpoint = sub_start + SCHARS (newtext);
+ ptrdiff_t newstart = sub_start == sub_end ? newpoint : sub_start;
/* Replace the old text with the new in the cleanest possible way. */
- replace_range (search_regs.start[sub], search_regs.end[sub],
- newtext, 1, 0, 1, 1);
- /* Update saved data to match adjustment made by replace_range. */
- {
- ptrdiff_t change = newpoint - sub_end;
- if (sub_start >= sub_end)
- sub_start += change;
- sub_end += change;
- }
+ replace_range (sub_start, sub_end, newtext, 1, 0, 1, true);
if (case_action == all_caps)
Fupcase_region (make_fixnum (search_regs.start[sub]),
@@ -2773,17 +2741,18 @@ since only regular expressions have distinguished subexpressions. */)
Fupcase_initials_region (make_fixnum (search_regs.start[sub]),
make_fixnum (newpoint));
- if (search_regs.start[sub] != sub_start
- || search_regs.end[sub] != sub_end
- || search_regs.num_regs != num_regs)
+ /* The replace_range etc. functions can trigger modification hooks
+ (see signal_before_change and signal_after_change). Try to error
+ out if these hooks clobber the match data since clobbering can
+ result in confusing bugs. Although this sanity check does not
+ catch all possible clobberings, it should catch many of them. */
+ if (! (search_regs.num_regs == num_regs
+ && search_regs.start[sub] == newstart
+ && search_regs.end[sub] == newpoint))
error ("Match data clobbered by buffer modification hooks");
- /* Put point back where it was in the text. */
- if (opoint <= 0)
- TEMP_SET_PT (opoint + ZV);
- else
- TEMP_SET_PT (opoint);
-
+ /* Put point back where it was in the text, if possible. */
+ TEMP_SET_PT (clip_to_bounds (BEGV, opoint + (opoint <= 0 ? ZV : 0), ZV));
/* Now move point "officially" to the start of the inserted replacement. */
move_if_not_intangible (newpoint);
--
2.17.1
next prev parent reply other threads:[~2019-08-03 20:22 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-11-28 17:00 bug#19208: replace-match unhelpful error message Phillip Lord
2019-08-03 14:01 ` Lars Ingebrigtsen
2019-08-03 20:22 ` Paul Eggert [this message]
2019-08-03 20:31 ` Lars Ingebrigtsen
2019-08-03 20:50 ` Paul Eggert
2019-08-04 9:09 ` Andreas Schwab
2019-08-03 23:59 ` Noam Postavsky
2019-08-04 9:02 ` Andreas Schwab
2019-08-04 14:12 ` Noam Postavsky
2019-08-04 15:13 ` Andreas Schwab
2019-08-04 17:41 ` Paul Eggert
2019-08-04 18:14 ` Andreas Schwab
2019-08-05 2:25 ` Richard Stallman
2019-08-04 11:47 ` Lars Ingebrigtsen
2019-08-04 9:18 ` Andreas Schwab
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e5164f5d-38ee-660c-39b7-5bd564d587e3@cs.ucla.edu \
--to=eggert@cs.ucla.edu \
--cc=19208@debbugs.gnu.org \
--cc=larsi@gnus.org \
--cc=phillip.lord@newcastle.ac.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this external index
https://git.savannah.gnu.org/cgit/emacs.git
https://git.savannah.gnu.org/cgit/emacs/org-mode.git
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.