From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Eli Zaretskii Newsgroups: gmane.emacs.bugs Subject: bug#23917: Please consider making Bug #23917 a blocker for 25.1 (was Re: org-capture: Capture template =?UTF-8?Q?=E2=80=98g=E2=80=99:?= Match data clobbered by buffer modification hooks) Date: Tue, 19 Jul 2016 18:35:45 +0300 Message-ID: <83a8hd1vzi.fsf@gnu.org> References: <87vb066ejv.fsf@linaro.org> <8360s67qcp.fsf@gnu.org> <87bn1yyaui.fsf@linaro.org> <87mvlhmv0x.fsf_-_@moondust.awandering> <837fcl5zs9.fsf@gnu.org> <87a8hgkwcb.fsf@linaro.org> <8360s42mcb.fsf@gnu.org> <87eg6rgmlg.fsf@gmail.com> <83lh0y24y6.fsf@gnu.org> <83eg6q1hbo.fsf@gnu.org> Reply-To: Eli Zaretskii NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Trace: ger.gmane.org 1468942693 7847 80.91.229.3 (19 Jul 2016 15:38:13 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Tue, 19 Jul 2016 15:38:13 +0000 (UTC) Cc: 23917@debbugs.gnu.org, rpluim@gmail.com, jwiegley@gmail.com, alex.bennee@linaro.org, nljlistbox2@gmail.com To: Stefan Monnier Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Tue Jul 19 17:38:02 2016 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1bPX5m-00015a-Pc for geb-bug-gnu-emacs@m.gmane.org; Tue, 19 Jul 2016 17:37:55 +0200 Original-Received: from localhost ([::1]:56750 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bPX5l-0002qO-KB for geb-bug-gnu-emacs@m.gmane.org; Tue, 19 Jul 2016 11:37:53 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:54296) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bPX59-0002Qt-EB for bug-gnu-emacs@gnu.org; Tue, 19 Jul 2016 11:37:21 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bPX52-0008CO-GV for bug-gnu-emacs@gnu.org; Tue, 19 Jul 2016 11:37:14 -0400 Original-Received: from debbugs.gnu.org ([208.118.235.43]:45115) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bPX4w-0008B8-6f; Tue, 19 Jul 2016 11:37:02 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1bPX4w-0001zL-2U; Tue, 19 Jul 2016 11:37:02 -0400 X-Loop: help-debbugs@gnu.org Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org, emacs-orgmode@gnu.org Resent-Date: Tue, 19 Jul 2016 15:37:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 23917 X-GNU-PR-Package: emacs,org-mode X-GNU-PR-Keywords: Original-Received: via spool by 23917-submit@debbugs.gnu.org id=B23917.14689425677568 (code B ref 23917); Tue, 19 Jul 2016 15:37:02 +0000 Original-Received: (at 23917) by debbugs.gnu.org; 19 Jul 2016 15:36:07 +0000 Original-Received: from localhost ([127.0.0.1]:57449 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bPX42-0001y0-VJ for submit@debbugs.gnu.org; Tue, 19 Jul 2016 11:36:07 -0400 Original-Received: from eggs.gnu.org ([208.118.235.92]:34768) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bPX41-0001xX-FY for 23917@debbugs.gnu.org; Tue, 19 Jul 2016 11:36:05 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bPX3s-0007wl-8z for 23917@debbugs.gnu.org; Tue, 19 Jul 2016 11:36:00 -0400 Original-Received: from fencepost.gnu.org ([2001:4830:134:3::e]:46620) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bPX3h-0007v4-Be; Tue, 19 Jul 2016 11:35:45 -0400 Original-Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:3429 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1bPX3g-0004Qp-A6; Tue, 19 Jul 2016 11:35:44 -0400 In-reply-to: (message from Stefan Monnier on Tue, 19 Jul 2016 00:48:19 -0400) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 208.118.235.43 X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.org gmane.emacs.bugs:121261 Archived-At: > From: Stefan Monnier > Cc: rpluim@gmail.com, 23917@debbugs.gnu.org, alex.bennee@linaro.org, jwiegley@gmail.com, nljlistbox2@gmail.com > Date: Tue, 19 Jul 2016 00:48:19 -0400 > > > The more general problem is when there's at least one more > > sub-expression, whose start and/or end are after the new EOB. > > Those sub-expression's data will be completely bogus after the > > adjustment, > > If they were after the EOB, they were already bogus to start with. I think we are mis-communicating. I mean the following scenario: Before call to replace_range in replace-match: |---------------------------|---|------|----| s1 e1 s2 e2 EOB (s1, e1, etc. are the start and end of the corresponding sub-expressions.) After the call to replace_range in replace-match: |---------|---|------|----| s1 e1 s2 e2 EOB IOW, the 1st sub-expression got replaced with a much shorter text, which made EOB be smaller than the original beginning and end of the 2nd sub-expression. There's nothing bogus with this, is there? The user will expect to get match-data adjusted as shown in the second diagram, and that's what she will really get -- unless there are buffer-modification hooks that use save-match-data. In the latter case, what the user will get instead is this: |---------|---|------|----| s1 EOB e1 s2 e2 and that is even before the adjustment code kicks in and makes "adjustments" with an incorrect adjustment value, which is computed as newpoint = search_regs.start[sub] + SCHARS (newtext); [...] ptrdiff_t change = newpoint - search_regs.end[sub]; and so will use the new EOB as search_regs.end[sub], instead of the correct original value of e1 from the first diagram above. IOW, the call to save-match-data in a buffer-modification hook _disrupts_ the normal operation of replace-match in this case, by indirectly sabotaging the adjustment of match data after the replacement. Am I missing something? > And in any case, that's what has been happening for ever and has > proved safe enough. So you are saying that if a bug has been happening "for ever", it doesn't have to be fixed? (I disagree about "safe enough": the amount of bug reports in our data base that are not reproducible and about whose reasons we have no clear idea is non-negligible, so we don't really know whether this particular issue caused trouble in the past or not.) > >> So I think a safe fix is to try and relax the check we added to > >> replace-match so it doesn't get all worked up when something ≥ EOB gets > >> changed to something else that's also ≥ EOB. > > And lose the other sub-expressions in a more general case? Really? > > I'm not sure what you mean by "losing sub-expressions". See above: the match data for any sub-expressions beyond the one that shrunk too much is now bogus. Thus "losing". > >> Or maybe instead of signaling an error, we could simply skip the "Adjust > >> search data for this change". > > That would still sweep the problem under the carpet, leaving the match > > data bogus, so I don't like doing that. > > Maybe I'm not 100% satisfied with the behavior either, but I don't think > it's a significant problem and I don't think it'd cause the crash we saw > in bug#23869. We don't only solve bugs that cause crashes, do we? When I debugged the crash in bug#23869, I found and tried to solve the root cause of it, not just the symptom that caused the assertion violation. I still think we should strive to solve the root cause. As for the significance of the problem, I hope you will reconsider this after reading the above description of the scenario I have in mind. (Or tell me where I am wrong.) > > The crash in bug#23869 was due to this: > > > > newpoint = search_regs.start[sub] + SCHARS (newtext); > > [...] > > /* Now move point "officially" to the start of the inserted replacement. */ > > move_if_not_intangible (newpoint); <<<<<<<<<<<<<<<<<<<<<<< > > > > because due to clobbering, newpoint became -1. > > Ah, I see. > > Then maybe another fix is to compute newpoint before we call > replace_range, so it uses search_regs.start[sub] before the > *-change-functions can mess it up. IOW: > > @@ -2726,9 +2726,9 @@ since only regular expressions have distinguished > subexpressions. */) > unsigned num_regs = search_regs.num_regs; > > /* Replace the old text with the new in the cleanest possible way. */ > + newpoint = search_regs.start[sub] + SCHARS (newtext); > replace_range (search_regs.start[sub], search_regs.end[sub], > newtext, 1, 0, 1); > - newpoint = search_regs.start[sub] + SCHARS (newtext); > > if (case_action == all_caps) > Fupcase_region (make_number (search_regs.start[sub]), > > Would that be sufficient to avoid the crash? Why not? To avoid the crash in that particular use case, yes (but it can be avoided even easier, by validating the value of newpoint before the call to move_if_not_intangible). But what about the root cause? It is a clear bug, and could even cause crashes, if one of the match-data end-point positions becomes negative as result of the bogus adjustment, and someone then uses those positions for something. What's worse, these bugs happen in programs that are completely valid on the Lisp level.