unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#14281: 24.3; replace-match leaves point at wrong place
@ 2013-04-27  1:19 Barry OReilly
  2013-05-09 19:07 ` Barry OReilly
  0 siblings, 1 reply; 12+ messages in thread
From: Barry OReilly @ 2013-04-27  1:19 UTC (permalink / raw)
  To: 14281

[-- Attachment #1: Type: text/plain, Size: 14822 bytes --]

I've been experiencing an intermittent and difficult to reproduce bug where
I use Evil (rev 42737279f1b75ec3731c3f3b0f2c465862159b40) to search and
replace in a region and when the bug occurs, replacements are made
throughout the buffer and point is left in the wrong place. Reproduction
comes in streaks: when I witness it once, I can do it again and again for a
while, then it stops reproducing at all. I made progress tracking the cause
down such that I can report some findings here. I didn't finish narrowing
down the problem because it stopped reproducing.

Description of events at the Lisp level:
   • Let my-old-string be a string which I've redacted for privacy. It is
14 non regex chars.
   • Let my-new-string be a redacted string. It is 16 chars, no backslashes.
   • Evil calls: (perform-replace my-old-string (replace-eval-replacement .
my-new-string) nil t nil nil nil 6605 6749)
   • perform-replace calls: (replace-match my-new-string t nil)
match-data=(6686 6700 [redacted filename])
      • Before the call to replace-match, point is 6700
      • After the call to replace-match, point is inexplicably 25

Since replace-match is implemented in C, I attached gdb to the running
Emacs. I find that when the variable newpoint is assigned by:
   newpoint = search_regs.start[sub] + SCHARS (newtext);
it is 9+16. I can see where 16 comes from, but not 9. sub is 0; I found
search_regs.start[0] has the correct value of 6686 for a time prior. I
continued to narrow down when search_regs.start[0] changes until the
problem stopped reproducing. Here is the stack trace where it goes wrong:

#6  0x00000000004fb13f in signal_after_change (charpos=6686, lendel=14,
lenins=16) at insdel.c:2058
#7  0x00000000004fcd87 in replace_range (from=6686, to=<value optimized
out>, new=<value optimized out>, prepare=<value optimized out>,
inherit=false, markers=true) at insdel.c:1427
#8  0x0000000000516cd9 in Freplace_match (newtext=158062897,
fixedcase=<value optimized out>, literal=<value optimized out>, string=1,
subexp=<value optimized out>) at search.c:2644
#9  0x0000000000544d8f in eval_sub (form=<value optimized out>) at
eval.c:2157
#10 0x000000000054511f in Fprogn (args=<value optimized out>) at eval.c:360
#11 0x0000000000545454 in funcall_lambda (fun=167311462, nargs=5,
arg_vector=0x7fff400cb870) at eval.c:3003
#12 0x000000000054686c in apply_lambda (fun=167311446, args=19819474) at
eval.c:2887
#13 0x0000000000544a84 in eval_sub (form=<value optimized out>) at
eval.c:2218
#14 0x000000000054504f in Fsetq (args=<value optimized out>) at eval.c:429
#15 0x0000000000544f55 in eval_sub (form=<value optimized out>) at
eval.c:2091
#16 0x000000000054511f in Fprogn (args=<value optimized out>) at eval.c:360
#17 0x0000000000544f55 in eval_sub (form=<value optimized out>) at
eval.c:2091
#18 0x0000000000544f55 in eval_sub (form=<value optimized out>) at
eval.c:2091
#19 0x000000000054511f in Fprogn (args=<value optimized out>) at eval.c:360
#20 0x0000000000544f55 in eval_sub (form=<value optimized out>) at
eval.c:2091
#21 0x0000000000544c63 in eval_sub (form=<value optimized out>) at
eval.c:2214
#22 0x000000000054511f in Fprogn (args=<value optimized out>) at eval.c:360
#23 0x0000000000547a08 in Fwhile (args=<value optimized out>) at eval.c:936
#24 0x0000000000544f55 in eval_sub (form=<value optimized out>) at
eval.c:2091
#25 0x0000000000546975 in Funwind_protect (args=145892294) at eval.c:1148
#26 0x0000000000544f55 in eval_sub (form=<value optimized out>) at
eval.c:2091
#27 0x000000000054511f in Fprogn (args=<value optimized out>) at eval.c:360
#28 0x0000000000547e97 in FletX (args=131856198) at eval.c:844
#29 0x0000000000544f55 in eval_sub (form=<value optimized out>) at
eval.c:2091
#30 0x000000000054511f in Fprogn (args=<value optimized out>) at eval.c:360
#31 0x0000000000545454 in funcall_lambda (fun=137050230, nargs=9,
arg_vector=0x7fff400cc2f0) at eval.c:3003
#32 0x0000000000545692 in Ffuncall (nargs=10, args=<value optimized out>)
at eval.c:2839
#33 0x0000000000577b1e in exec_byte_code (bytestr=6605, vector=38,
maxdepth=11914850, args_template=11914850, nargs=0, args=0x0) at
bytecode.c:900
#34 0x00000000005453d7 in funcall_lambda (fun=14279397, nargs=5,
arg_vector=0x7fff400cc4e8) at eval.c:3010
#35 0x0000000000545692 in Ffuncall (nargs=6, args=<value optimized out>) at
eval.c:2839
#36 0x0000000000545cc4 in Fapply (nargs=2, args=0x7fff400cc590) at
eval.c:2312
#37 0x0000000000545ee0 in apply1 (fn=97848594, arg=<value optimized out>)
at eval.c:2546
#38 0x0000000000541564 in Fcall_interactively (function=97848594,
record_flag=11914850, keys=11950037) at callint.c:377
#39 0x00000000005457f7 in Ffuncall (nargs=2, args=<value optimized out>) at
eval.c:2785
#40 0x0000000000577b1e in exec_byte_code (bytestr=6605, vector=33,
maxdepth=11914850, args_template=11914850, nargs=0, args=0x0) at
bytecode.c:900
#41 0x00000000005453d7 in funcall_lambda (fun=19430901, nargs=3,
arg_vector=0x7fff400cc8b0) at eval.c:3010
#42 0x000000000054686c in apply_lambda (fun=19430901, args=158068849) at
eval.c:2887
#43 0x0000000000544a84 in eval_sub (form=<value optimized out>) at
eval.c:2218
#44 0x0000000000546edd in Feval (form=169173046, lexical=<value optimized
out>) at eval.c:2005
#45 0x00000000005457e0 in Ffuncall (nargs=2, args=<value optimized out>) at
eval.c:2781
#46 0x0000000000577b1e in exec_byte_code (bytestr=6605, vector=33,
maxdepth=11914850, args_template=11914850, nargs=0, args=0x0) at
bytecode.c:900
#47 0x00000000005453d7 in funcall_lambda (fun=19550949, nargs=1,
arg_vector=0x7fff400ccd48) at eval.c:3010
#48 0x0000000000545692 in Ffuncall (nargs=2, args=<value optimized out>) at
eval.c:2839
#49 0x0000000000545e47 in Fapply (nargs=2, args=0x7fff400ccd40) at
eval.c:2259
#50 0x0000000000545ee0 in apply1 (fn=13903650, arg=<value optimized out>)
at eval.c:2546
#51 0x0000000000541564 in Fcall_interactively (function=13903650,
record_flag=11914850, keys=11950037) at callint.c:377
#52 0x00000000005457f7 in Ffuncall (nargs=4, args=<value optimized out>) at
eval.c:2785
#53 0x0000000000545a74 in call3 (fn=<value optimized out>, arg1=<value
optimized out>, arg2=11914850, arg3=11914898) at eval.c:2603
#54 0x00000000004e5402 in command_loop_1 () at keyboard.c:1587
#55 0x00000000005441a6 in internal_condition_case (bfun=0x4e5040
<command_loop_1>, handlers=11966498, hfun=0x4dc150 <cmd_error>) at
eval.c:1289
#56 0x00000000004dbc9a in command_loop_2 (ignore=<value optimized out>) at
keyboard.c:1168
#57 0x000000000054429a in internal_catch (tag=<value optimized out>,
func=0x4dbc80 <command_loop_2>, arg=11914850) at eval.c:1060
#58 0x00000000004dc3f0 in command_loop () at keyboard.c:1147
#59 recursive_edit_1 () at keyboard.c:779
#60 0x00000000004dc53a in Frecursive_edit () at keyboard.c:843
#61 0x00000000004d4146 in main (argc=<value optimized out>,
argv=0x7fff400cd538) at emacs.c:1528

Before replace_match's call to signal_after_change, search_regs.start[0] is
correct, after signal_after_change returns, it is incorrect.

Looking at the code, I see what I think are callbacks to Lisp related to
overlays. It's notable that Evil uses overlays to display the search and
replace as the user types the command in, showing the old string striked
out the old with the new string alongside. Is it possible the invoked hooks
could replace the match data and thus cause search_regs to have wrong
values upon return? Are there other ideas about what's going on? The next
time I witness the bug, I'll narrow down where in signal_after_change the
value becomes wrong.



In GNU Emacs 24.3.2 (x86_64-unknown-linux-gnu, GTK+ Version 2.10.4)
 of 2013-03-11 on psd15
Windowing system distributor `The X.Org Foundation', version 11.0.70101000
System Description:    Red Hat Enterprise Linux Client release 5.4 (Tikanga)

Configured using:
 `configure '--prefix=/[redacted]/user/boreilly/sw/emacs-24.3/rhel5-install'
 '--with-gif=no''

Important settings:
  value of $LANG: en_US.UTF-8
  value of $XMODIFIERS: @im=none
  locale-coding-system: utf-8-unix
  default enable-multibyte-characters: t

Major mode: Fundamental

Minor modes in effect:
  diff-auto-refine-mode: t
  shell-dirtrack-mode: t
  global-whitespace-mode: t
  global-cedet-m3-minor-mode: t
  global-semantic-mru-bookmark-mode: t
  global-semanticdb-minor-mode: t
  global-semantic-idle-scheduler-mode: t
  semantic-mode: t
  global-ede-mode: t
  evil-mode: t
  evil-local-mode: t
  global-undo-tree-mode: t
  undo-tree-mode: t
  show-paren-mode: t
  delete-selection-mode: t
  global-auto-revert-mode: t
  tooltip-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
' ' ' ' ' ' ' / h e r e SPC 1 3 <return> N N n ' '
' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ; j
k k k k k k k k k k k k k k k k k k k k k k k k k k
k k V y f j j j j j j <return> / n e w t e x t <return>
f j <return> k k V y f j <return> V y f j <return>
j j j k f j j j <return> f <return> ; ' ' ' j V y C-c
V y C-c V y / s u b <return> ' ' ' ' ' ' ' ' ' ' '
' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
' ' ' ' ' ' k k k k k k k k k k k k n 0 f <return>
j k f j j j j j j j H <return> f <return> f j <return>
j k W W W W h h h h h h h h h h h h h h h h h v l l
l l l l l l l l l l l l l l l l l l l l l l l l l l
l l l l l l l l l l l l l l l l l l l l l l l y M-x
r e p o r t - <tab> <return>

Recent messages:
Undo branch point!
Commands: d, s, x, u; f, o, 1, 2, m, v; ~, %; q to quit; ? for help. [5
times]
Mark set [7 times]
Commands: d, s, x, u; f, o, 1, 2, m, v; ~, %; q to quit; ? for help.
Mark set
Commands: d, s, x, u; f, o, 1, 2, m, v; ~, %; q to quit; ? for help. [5
times]
Mark set [2 times]
Commands: d, s, x, u; f, o, 1, 2, m, v; ~, %; q to quit; ? for help. [2
times]
Mark set
Commands: d, s, x, u; f, o, 1, 2, m, v; ~, %; q to quit; ? for help. [2
times]

Load-path shadows:
/redacted/user/boreilly/sw/cedet/lisp/speedbar/loaddefs hides
/redacted/user/boreilly/sw/cedet/lisp/eieio/loaddefs
/redacted/user/boreilly/sw/cedet/lisp/speedbar/loaddefs hides
/redacted/user/boreilly/sw/cedet/lisp/cedet/loaddefs
/redacted/user/boreilly/sw/cedet/lisp/speedbar/loaddefs hides
/redacted/user/boreilly/sw/emacs-24.3/rhel5-install/share/emacs/24.3/lisp/loaddefs
/redacted/user/boreilly/sw/cedet/lisp/eieio/eieio hides
/redacted/user/boreilly/sw/emacs-24.3/rhel5-install/share/emacs/24.3/lisp/emacs-lisp/eieio
/redacted/user/boreilly/sw/cedet/lisp/eieio/eieio-speedbar hides
/redacted/user/boreilly/sw/emacs-24.3/rhel5-install/share/emacs/24.3/lisp/emacs-lisp/eieio-speedbar
/redacted/user/boreilly/sw/cedet/lisp/eieio/eieio-opt hides
/redacted/user/boreilly/sw/emacs-24.3/rhel5-install/share/emacs/24.3/lisp/emacs-lisp/eieio-opt
/redacted/user/boreilly/sw/cedet/lisp/eieio/eieio-datadebug hides
/redacted/user/boreilly/sw/emacs-24.3/rhel5-install/share/emacs/24.3/lisp/emacs-lisp/eieio-datadebug
/redacted/user/boreilly/sw/cedet/lisp/eieio/eieio-custom hides
/redacted/user/boreilly/sw/emacs-24.3/rhel5-install/share/emacs/24.3/lisp/emacs-lisp/eieio-custom
/redacted/user/boreilly/sw/cedet/lisp/eieio/eieio-base hides
/redacted/user/boreilly/sw/emacs-24.3/rhel5-install/share/emacs/24.3/lisp/emacs-lisp/eieio-base
/redacted/user/boreilly/sw/cedet/lisp/eieio/chart hides
/redacted/user/boreilly/sw/emacs-24.3/rhel5-install/share/emacs/24.3/lisp/emacs-lisp/chart

Features:
(shadow sort mail-extr emacsbug message format-spec rfc822 mml mml-sec
mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils
mailheader sendmail rfc2047 rfc2045 ietf-drums mail-utils jka-compr
vc-hg ede/emacs semantic/bovine/el semantic/bovine/make
semantic/bovine/make-by make-mode diff-mode easy-mmode shell pcomplete
thingatpt semantic/analyze/complete semantic/db-typecache semantic/edit
ede/maven2 ede/lein2 ede/ant ede/java-root ede/jvm-base ede/linux
ede/make ede/dired dired ffap url-parse auth-source gnus-util mm-util
mail-prsvr password-cache url-vars find-file semantic/tag-write
ede/locate semantic/m3 semantic/tag-file semantic/db-file
semantic/adebug eieio-datadebug data-debug cedet-files semantic/bovine/c
semantic/decorate/include semantic/decorate/mode hideif
semantic/bovine/c-by semantic/bovine cc-langs cc-mode cc-fonts cc-guess
cc-menus cc-cmds cc-styles cc-align cc-engine cc-vars cc-defs my-config
semantic/lex-spp etags whitespace cus-start cus-load my-proj
ede/cpp-root hippie-exp comint ansi-color eassist derived cedet-m3
semantic/mru-bookmark semantic/db-mode semantic/idle semantic/bovine/gcc
semantic/dep semantic/ia semantic/analyze/refs semantic/senator
semantic/db-find semantic/db-ref semantic/decorate working fame pulse
cedet-devel-load warnings eieio-opt help-mode find-func srecode/map
srecode semantic/canned-configs semantic/ia-sb semantic/analyze
semantic/sort semantic/scope semantic/analyze/fcn semantic/db
semantic/ctxt semantic/format semantic/tag-ls semantic/find
semantic/util-modes semantic/util semantic semantic/tag semantic/lex
semantic/fw mode-local cedet-compat inversion ede/speedbar ede/files ede
ede/base ede/auto ede/source eieio-base eieio-speedbar speedbar sb-image
ezimage dframe easymenu eieio-custom wid-edit cedet eieio byte-opt
bytecomp byte-compile cconv eieio-core cedet-remove-builtin cl-macs gv
evil evil-integration evil-maps evil-commands evil-types evil-search
evil-ex evil-macros evil-repeat evil-states evil-core evil-common
windmove rect evil-digraphs evil-vars ring edmacro kmacro goto-chg
undo-tree diff rainbow-delimiters my-util advice help-fns advice-preload
electric paren delsel autorevert cl cl-lib time-date tooltip ediff-hook
vc-hooks lisp-float-type mwheel x-win x-dnd tool-bar dnd fontset image
regexp-opt fringe tabulated-list newcomment lisp-mode register page
menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock font-lock
syntax facemenu font-core frame cham georgian utf-8-lang misc-lang
vietnamese tibetan thai tai-viet lao korean japanese hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer loaddefs
button faces cus-face macroexp files text-properties overlay sha1 md5
base64 format env code-pages mule custom widget hashtable-print-readable
backquote make-network-process dbusbind dynamic-setting
system-font-setting font-render-setting move-toolbar gtk x-toolkit x
multi-tty emacs)

[-- Attachment #2: Type: text/html, Size: 35690 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#14281: 24.3; replace-match leaves point at wrong place
  2013-04-27  1:19 bug#14281: 24.3; replace-match leaves point at wrong place Barry OReilly
@ 2013-05-09 19:07 ` Barry OReilly
  2013-05-09 21:27   ` Stefan Monnier
  0 siblings, 1 reply; 12+ messages in thread
From: Barry OReilly @ 2013-05-09 19:07 UTC (permalink / raw)
  To: 14281

[-- Attachment #1: Type: text/plain, Size: 1374 bytes --]

I reproduced the issue again and investigated. Since I'm unsure how to
determine information about the Lisp system within GDB, I used debug
statements in C and Lisp to arrive at the following description of what
happens, additive to my previous description:

My after-change-functions, in order, are:
   semantic-change-function
   c-after-change
   jit-lock-after-change

search_regs.start[0] first takes on the incorrect value inside
c-after-change's call to save-match-data. Since c-after-change code seems
correct, I determined that the match-data function begins returning the
incorrect value during semantic-change-function. I am using CEDET r8557.

>> I suppose that caveat would pin the bug on one of the third party
>> packages I use. However, why couldn't Emacs save off the match-data
>> itself and restore it after the after-change-functions? Is there any
>> legit situation where a change hook would want to change the
>> match-data in effect after the change hook returns?

Stefan> There are many cases where an after-change-function won't use
regular
Stefan> expressions at all.

The answer doesn't seem to fit the question, so I'll rephrase: Why does
Emacs allow after-change-functions to change the match-data beyond its
scope? Or: why doesn't the signal_after_change C function do like
set-match-data instead of leaving it to client change hooks to do so?

[-- Attachment #2: Type: text/html, Size: 1511 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#14281: 24.3; replace-match leaves point at wrong place
  2013-05-09 19:07 ` Barry OReilly
@ 2013-05-09 21:27   ` Stefan Monnier
  2013-05-10 13:27     ` Barry OReilly
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Monnier @ 2013-05-09 21:27 UTC (permalink / raw)
  To: Barry OReilly; +Cc: 14281

> I reproduced the issue again and investigated. Since I'm unsure how to
> determine information about the Lisp system within GDB, I used debug
> statements in C and Lisp to arrive at the following description of what
> happens, additive to my previous description:

> My after-change-functions, in order, are:
>    semantic-change-function
>    c-after-change
>    jit-lock-after-change

> search_regs.start[0] first takes on the incorrect value inside
> c-after-change's call to save-match-data. Since c-after-change code seems
> correct, I determined that the match-data function begins returning the
> incorrect value during semantic-change-function. I am using CEDET r8557.

Not sure what CEDET r8557 does, but at least the CEDET code in Emacs's
trunk is pretty simple in this respect: semantic-change-function only
runs the semantic-change-functions hook, and grep seems to indicate that
this hook is never modified, so the whole thing should never get
anywhere near the match-data.

Note that there are other change functions than after-change-functions.
There's also before-change-functions and there are overlays's
modification-hooks.  You might like to check those as well.

>>> I suppose that caveat would pin the bug on one of the third party
>>> packages I use. However, why couldn't Emacs save off the match-data
>>> itself and restore it after the after-change-functions? Is there any
>>> legit situation where a change hook would want to change the
>>> match-data in effect after the change hook returns?

Stefan> There are many cases where an after-change-function won't use regular
Stefan> expressions at all.
> The answer doesn't seem to fit the question, so I'll rephrase: Why does
> Emacs allow after-change-functions to change the match-data beyond its
> scope? Or: why doesn't the signal_after_change C function do like
> set-match-data instead of leaving it to client change hooks to do so?

Because it wastes time in most cases (when after-change-functions
won't use regular expressions at all).

Maybe we could add code that saves just the (match-beginning 0) and
signals an error if it was not properly preserved.  This would still
require change-functions to save the match-data if they use it, but it
might catch the offenders earlier.


        Stefan





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#14281: 24.3; replace-match leaves point at wrong place
  2013-05-09 21:27   ` Stefan Monnier
@ 2013-05-10 13:27     ` Barry OReilly
  2013-05-10 18:19       ` Barry OReilly
  0 siblings, 1 reply; 12+ messages in thread
From: Barry OReilly @ 2013-05-10 13:27 UTC (permalink / raw)
  To: 14281

[-- Attachment #1: Type: text/plain, Size: 495 bytes --]

I had the session still open, so I narrowed the match-data change to
semantic-mrub-push before it stopped reproducing.

The trace: semantic-change-function
     calls (run-hook-with-args 'semantic-change-functions start end length)
     calls semantic-edits-change-function-handle-changes
     calls (run-hook-with-args 'semantic-edits-new-change-functions o)
     calls semantic-mru-bookmark-change-hook-fcn
     calls semantic-mrub-push

I've forwarded this bug report to cedet-semantic list.

[-- Attachment #2: Type: text/html, Size: 551 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#14281: 24.3; replace-match leaves point at wrong place
  2013-05-10 13:27     ` Barry OReilly
@ 2013-05-10 18:19       ` Barry OReilly
  2013-05-10 20:56         ` Stefan Monnier
  0 siblings, 1 reply; 12+ messages in thread
From: Barry OReilly @ 2013-05-10 18:19 UTC (permalink / raw)
  To: 14281


[-- Attachment #1.1: Type: text/plain, Size: 510 bytes --]

> Maybe we could add code that saves just the (match-beginning 0) and
> signals an error if it was not properly preserved.  This would still
> require change-functions to save the match-data if they use it, but it
> might catch the offenders earlier.

I made a crack at it. It was necessary to use save-match-data in a couple
of around advices and disable undo-tree to get some basic commands to work.
I imagine there would be more cases to resolve.

Are the changes in the .diff what you had in mind roughly?

[-- Attachment #1.2: Type: text/html, Size: 556 bytes --]

[-- Attachment #2: match-data-corrupted-error.diff --]
[-- Type: application/octet-stream, Size: 3492 bytes --]

diff --git a/src/insdel.c b/src/insdel.c
index 8029291..ff90a93 100644
--- a/src/insdel.c
+++ b/src/insdel.c
@@ -53,6 +53,9 @@ static Lisp_Object combine_after_change_buffer;
 
 Lisp_Object Qinhibit_modification_hooks;
 
+/* Error condition used when change hooks inappropriately change match-data. */
+static Lisp_Object Qmatch_data_corrupted;
+
 static void signal_before_change (ptrdiff_t, ptrdiff_t, ptrdiff_t *);
 
 /* Also used in marker.c to enable expensive marker checks.  */
@@ -2034,6 +2037,10 @@ signal_after_change (ptrdiff_t charpos, ptrdiff_t lendel, ptrdiff_t lenins)
 
   specbind (Qinhibit_modification_hooks, Qt);
 
+  /* Change hook functions are expected to use save-match-data if using regexps.
+     Check (match-beginning 0) to verify hooks do not corrupt match-data. */
+  Lisp_Object match_beginning = match_limit(make_number(0), 1);
+
   if (!NILP (Vafter_change_functions))
     {
       Lisp_Object args[4];
@@ -2053,6 +2060,9 @@ signal_after_change (ptrdiff_t charpos, ptrdiff_t lendel, ptrdiff_t lenins)
       XSETCDR (rvoe_arg, Qt);
     }
 
+  if( ! EQ(match_beginning, match_limit(make_number(0), 1)) )
+    xsignal1 (Qmatch_data_corrupted, build_string ("Match data corrupted in after-change-functions; consider using save-match-data"));
+
   if (buffer_has_overlays ())
     report_overlay_modification (make_number (charpos),
 				 make_number (charpos + lenins),
@@ -2061,12 +2071,18 @@ signal_after_change (ptrdiff_t charpos, ptrdiff_t lendel, ptrdiff_t lenins)
 				 make_number (charpos + lenins),
 				 make_number (lendel));
 
+  if( ! EQ(match_beginning, match_limit(make_number(0), 1)) )
+    xsignal1 (Qmatch_data_corrupted, build_string ("Match data corrupted in overlay modification hooks; consider using save-match-data"));
+
   /* After an insertion, call the text properties
      insert-behind-hooks or insert-in-front-hooks.  */
   if (lendel == 0)
     report_interval_modification (make_number (charpos),
 				  make_number (charpos + lenins));
 
+  if( ! EQ(match_beginning, match_limit(make_number(0), 1)) )
+    xsignal1 (Qmatch_data_corrupted, build_string ("Match data corrupted in text property insertion hooks; consider using save-match-data"));
+
   unbind_to (count, Qnil);
 }
 
@@ -2181,5 +2197,12 @@ as well as hooks attached to text properties and overlays.  */);
   inhibit_modification_hooks = 0;
   DEFSYM (Qinhibit_modification_hooks, "inhibit-modification-hooks");
 
+  DEFSYM (Qmatch_data_corrupted, "match-data-corrupted");
+
+  Fput (Qmatch_data_corrupted, Qerror_conditions,
+	listn (CONSTYPE_PURE, 2, Qmatch_data_corrupted, Qerror));
+  Fput (Qmatch_data_corrupted, Qerror_message,
+	build_pure_c_string ("Match data corrupted"));
+
   defsubr (&Scombine_after_change_execute);
 }
diff --git a/src/lisp.h b/src/lisp.h
index e2c24ee..8596fa0 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -3414,6 +3414,7 @@ extern bool check_existing (const char *);
 
 /* Defined in search.c.  */
 extern void shrink_regexp_cache (void);
+extern Lisp_Object match_limit (Lisp_Object, bool);
 extern void restore_search_regs (void);
 extern void record_unwind_save_match_data (void);
 struct re_registers;
diff --git a/src/search.c b/src/search.c
index ea36133..c5a50cb 100644
--- a/src/search.c
+++ b/src/search.c
@@ -2695,8 +2695,8 @@ since only regular expressions have distinguished subexpressions.  */)
 
   return Qnil;
 }
-\f
-static Lisp_Object
+
+Lisp_Object
 match_limit (Lisp_Object num, bool beginningp)
 {
   EMACS_INT n;

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* bug#14281: 24.3; replace-match leaves point at wrong place
  2013-05-10 18:19       ` Barry OReilly
@ 2013-05-10 20:56         ` Stefan Monnier
  2013-05-14 17:01           ` Barry OReilly
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Monnier @ 2013-05-10 20:56 UTC (permalink / raw)
  To: Barry OReilly; +Cc: 14281

>> Maybe we could add code that saves just the (match-beginning 0) and
>> signals an error if it was not properly preserved.  This would still
>> require change-functions to save the match-data if they use it, but it
>> might catch the offenders earlier.
> I made a crack at it.

Cool.

> It was necessary to use save-match-data in a couple of around advices
> and disable undo-tree to get some basic commands to work.  I imagine
> there would be more cases to resolve.

Can you show us the offenders?

> Are the changes in the .diff what you had in mind roughly?

Pretty much, yes (tho with spaces before open parens, staying with 80
columns, and maybe not using match_limit since it seems like it can
signal errors in corner cases).  Also I wouldn't bother defining a new
kind of error: it should always correspond to a coding mistake, so not
something which you might like to catch in a condition-case.

If there are "many offenders" (as you saw) or in case (worse) there are
legitimate cases where the test signals an error, we could introduce
a debug-match-data variable so only masochists help us debug.


        Stefan





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#14281: 24.3; replace-match leaves point at wrong place
  2013-05-10 20:56         ` Stefan Monnier
@ 2013-05-14 17:01           ` Barry OReilly
  2013-05-15 15:03             ` Barry OReilly
  0 siblings, 1 reply; 12+ messages in thread
From: Barry OReilly @ 2013-05-14 17:01 UTC (permalink / raw)
  Cc: 14281


[-- Attachment #1.1: Type: text/plain, Size: 2114 bytes --]

> Can you show us the offenders?

Offenders and status:
   semantic-change-function      Fixed in CEDET mainline
   evil-ex-search-update-pattern Fixed in Evil mainline
   evil-track-last-insertion     See below
   Undo tree                     Haven't investigated

I'm not so sure search_regs are saved and restored through save-match-data.
I have this advice:

(defadvice evil-track-last-insertion (around
my-advice-evil-track-last-insertion activate)
  (my-msg "DEBUG: 01 evil-track-last-insertion match-data=%s
match-beginning=%s" (match-data) (match-beginning 0))
  (save-match-data ad-do-it)
  (my-msg "DEBUG: 02 evil-track-last-insertion match-data=%s
match-beginning=%s" (match-data) (match-beginning 0))
  )

Note: my-msg is like message but prefixes the time.

When I inserted a new line in a .cc file, I get:

2013-05-14T12:51:39.010590 DEBUG: 01 evil-track-last-insertion
match-data=(#<marker at 1 in FileNameRedacted.cc> #<marker at 5 in
FileNameRedacted.cc> #<marker at 4 in FileNameRedacted.cc> #<marker at 5 in
FileNameRedacted.cc>) match-beginning=0
2013-05-14T12:51:39.020153 DEBUG: 02 evil-track-last-insertion
match-data=(#<marker at 1 in FileNameRedacted.cc> #<marker at 5 in
FileNameRedacted.cc> #<marker at 4 in FileNameRedacted.cc> #<marker at 5 in
FileNameRedacted.cc>) match-beginning=1
ad-handle-definition: `semantic-change-function' got redefined
2013-05-14T12:51:39.023610 DEBUG: 01 semantic-change-function
match-beginning=1
2013-05-14T12:51:39.025341 DEBUG: 02 semantic-change-function
match-beginning=1
2013-05-14T12:51:39.025525 DEBUG: 01 c-after-change match-beginning=1
2013-05-14T12:51:39.026920 DEBUG: 02 c-after-change match-beginning=1
2013-05-14T12:51:39.027110 DEBUG: 01 jit-lock-after-change match-beginning=1
2013-05-14T12:51:39.027319 DEBUG: 02 jit-lock-after-change match-beginning=1
newline: Match data corrupted in after-change-functions. Consider using
save-match-data

In this case, the change of match-beginning from 0 to 1 in
evil-track-last-insertion causes the error.

For reference, I am currently using the attached patch, which incorporates
your feedback.

[-- Attachment #1.2: Type: text/html, Size: 2323 bytes --]

[-- Attachment #2: check-match-beginning.diff --]
[-- Type: application/octet-stream, Size: 5325 bytes --]

diff --git a/src/insdel.c b/src/insdel.c
index 8029291..db2147f 100644
--- a/src/insdel.c
+++ b/src/insdel.c
@@ -28,6 +28,7 @@ along with GNU Emacs.  If not, see <http://www.gnu.org/licenses/>.  */
 #include "buffer.h"
 #include "window.h"
 #include "blockinput.h"
+#include "regex.h"
 #include "region-cache.h"
 
 static void insert_from_string_1 (Lisp_Object, ptrdiff_t, ptrdiff_t, ptrdiff_t,
@@ -2034,6 +2035,14 @@ signal_after_change (ptrdiff_t charpos, ptrdiff_t lendel, ptrdiff_t lenins)
 
   specbind (Qinhibit_modification_hooks, Qt);
 
+  /* Change hook functions are expected to use save-match-data if using regexps.
+     This checks match-beginning to detect cases of hooks corrupting match-data.
+     TODO: Implement in signal_before_change too
+   */
+  Lisp_Object match_beginning = 0 < search_regs.num_regs ?
+                                make_number (search_regs.start[0]) :
+                                Qnil;
+
   if (!NILP (Vafter_change_functions))
     {
       Lisp_Object args[4];
@@ -2053,6 +2062,12 @@ signal_after_change (ptrdiff_t charpos, ptrdiff_t lendel, ptrdiff_t lenins)
       XSETCDR (rvoe_arg, Qt);
     }
 
+  if ( ! EQ (match_beginning, 0 < search_regs.num_regs ?
+                              make_number (search_regs.start[0]) :
+                              Qnil))
+    error ("Match data corrupted in after-change-functions. "
+           "Consider using save-match-data");
+
   if (buffer_has_overlays ())
     report_overlay_modification (make_number (charpos),
 				 make_number (charpos + lenins),
@@ -2061,12 +2076,24 @@ signal_after_change (ptrdiff_t charpos, ptrdiff_t lendel, ptrdiff_t lenins)
 				 make_number (charpos + lenins),
 				 make_number (lendel));
 
+  if ( ! EQ (match_beginning, 0 < search_regs.num_regs ?
+                              make_number (search_regs.start[0]) :
+                              Qnil))
+    error ("Match data corrupted in overlay modification hooks. "
+           "Consider using save-match-data");
+
   /* After an insertion, call the text properties
      insert-behind-hooks or insert-in-front-hooks.  */
   if (lendel == 0)
     report_interval_modification (make_number (charpos),
 				  make_number (charpos + lenins));
 
+  if ( ! EQ (match_beginning, 0 < search_regs.num_regs ?
+                              make_number (search_regs.start[0]) :
+                              Qnil))
+    error ("Match data corrupted in text property insertion hooks. "
+           "Consider using save-match-data");
+
   unbind_to (count, Qnil);
 }
 
diff --git a/src/regex.h b/src/regex.h
index 8fe7ba1..b4833dc 100644
--- a/src/regex.h
+++ b/src/regex.h
@@ -429,6 +429,24 @@ struct re_registers
   regoff_t *end;
 };
 
+/* Every call to re_match, etc., must pass &search_regs as the regs
+   argument unless you can show it is unnecessary (i.e., if re_match
+   is certainly going to be called again before region-around-match
+   can be called).
+
+   Since the registers are now dynamically allocated, we need to make
+   sure not to refer to the Nth register before checking that it has
+   been allocated by checking search_regs.num_regs.
+
+   The regex code keeps track of whether it has allocated the search
+   buffer using bits in the re_pattern_buffer.  This means that whenever
+   you compile a new pattern, it completely forgets whether it has
+   allocated any registers, and will allocate new registers the next
+   time you call a searching or matching function.  Therefore, we need
+   to call re_set_registers after compiling a new pattern or after
+   setting the match registers, so that the regex functions will be
+   able to free or re-allocate it properly.  */
+extern struct re_registers search_regs;
 
 /* If `regs_allocated' is REGS_UNALLOCATED in the pattern buffer,
    `re_match_2' returns information about at least this many registers
diff --git a/src/search.c b/src/search.c
index ea36133..ffbea28 100644
--- a/src/search.c
+++ b/src/search.c
@@ -59,25 +59,7 @@ static struct regexp_cache searchbufs[REGEXP_CACHE_SIZE];
 /* The head of the linked list; points to the most recently used buffer.  */
 static struct regexp_cache *searchbuf_head;
 
-
-/* Every call to re_match, etc., must pass &search_regs as the regs
-   argument unless you can show it is unnecessary (i.e., if re_match
-   is certainly going to be called again before region-around-match
-   can be called).
-
-   Since the registers are now dynamically allocated, we need to make
-   sure not to refer to the Nth register before checking that it has
-   been allocated by checking search_regs.num_regs.
-
-   The regex code keeps track of whether it has allocated the search
-   buffer using bits in the re_pattern_buffer.  This means that whenever
-   you compile a new pattern, it completely forgets whether it has
-   allocated any registers, and will allocate new registers the next
-   time you call a searching or matching function.  Therefore, we need
-   to call re_set_registers after compiling a new pattern or after
-   setting the match registers, so that the regex functions will be
-   able to free or re-allocate it properly.  */
-static struct re_registers search_regs;
+struct re_registers search_regs;
 
 /* The buffer in which the last search was performed, or
    Qt if the last search was done in a string;

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* bug#14281: 24.3; replace-match leaves point at wrong place
  2013-05-14 17:01           ` Barry OReilly
@ 2013-05-15 15:03             ` Barry OReilly
  2013-05-15 19:18               ` Stefan Monnier
  0 siblings, 1 reply; 12+ messages in thread
From: Barry OReilly @ 2013-05-15 15:03 UTC (permalink / raw)
  Cc: 14281

[-- Attachment #1: Type: text/plain, Size: 1436 bytes --]

To clarify and expand on the previous:

(defadvice evil-track-last-insertion (around
my-advice-evil-track-last-insertion activate)
  (my-msg "DEBUG: 01a evil-track-last-insertion match-beginning=%s
match-end=%s match-data=%s" (match-beginning 0) (match-end 0) (match-data))
  (save-match-data ad-do-it)
  (my-msg "DEBUG: 01b evil-track-last-insertion match-beginning=%s
match-end=%s match-data=%s" (match-beginning 0) (match-end 0) (match-data))
  )

Causes output:

2013-05-15T09:15:21.604604 DEBUG: 01a evil-track-last-insertion
match-beginning=0 match-end=5 match-data=(#<marker at 1 in Redacted.cc>
#<marker at 5 in Redacted.cc> #<marker at 4 in Redacted.cc> #<marker at 5
in Redacted.cc>)
2013-05-15T09:15:21.613707 DEBUG: 01b evil-track-last-insertion
match-beginning=1 match-end=5 match-data=(#<marker at 1 in Redacted.cc>
#<marker at 5 in Redacted.cc> #<marker at 4 in Redacted.cc> #<marker at 5
in Redacted.cc>)

(match-beginning 0) evaluates to 0 while the first element of (match-data)
is a marker at 1. Is this discrepancy expected?

I determined the reason for the discrepancy is that when match-data creates
the markers, set-marker can adjust the position. This puts the markers
saved and restored through save-match-data out of sync with the
search_regs. I suspect this is a problem since replace-match assumes
search_regs don't change during change hooks. It certainly creates a
problem for the error checking approach.

[-- Attachment #2: Type: text/html, Size: 1574 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#14281: 24.3; replace-match leaves point at wrong place
  2013-05-15 15:03             ` Barry OReilly
@ 2013-05-15 19:18               ` Stefan Monnier
  2013-05-15 20:44                 ` Barry OReilly
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Monnier @ 2013-05-15 19:18 UTC (permalink / raw)
  To: Barry OReilly; +Cc: 14281

>   (my-msg "DEBUG: 01a evil-track-last-insertion match-beginning=%s
> match-end=%s match-data=%s" (match-beginning 0) (match-end 0) (match-data))
[...]
> 2013-05-15T09:15:21.604604 DEBUG: 01a evil-track-last-insertion
> match-beginning=0 match-end=5 match-data=(#<marker at 1 in Redacted.cc>

Hmm... I wonder how we get there:
match-beginning=0 would seem to indicate that the last successful match
was not made against a buffer but against a string, but in that case
(match-data) should not return markers but integers instead!


        Stefan





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#14281: 24.3; replace-match leaves point at wrong place
  2013-05-15 19:18               ` Stefan Monnier
@ 2013-05-15 20:44                 ` Barry OReilly
  2013-05-15 20:58                   ` Barry OReilly
  0 siblings, 1 reply; 12+ messages in thread
From: Barry OReilly @ 2013-05-15 20:44 UTC (permalink / raw)
  Cc: 14281

[-- Attachment #1: Type: text/plain, Size: 783 bytes --]

Because looking_at_1 sets search_regs and last_thing_searched by differing
criteria. This patch makes them consistent in my test case.

diff --git a/src/insdel.c b/src/insdel.c
index 8029291..db2147f 100644
--- a/src/insdel.c
+++ b/src/insdel.c

diff --git a/src/search.c b/src/search.c
index ea36133..60658c8 100644
--- a/src/search.c
+++ b/src/search.c
@@ -336,10 +323,9 @@ looking_at_1 (Lisp_Object string, bool posix)
          search_regs.end[i]
            = BYTE_TO_CHAR (search_regs.end[i] + BEGV_BYTE);
        }
-
-  /* Set last_thing_searched only when match data is changed.  */
-  if (NILP (Vinhibit_changing_match_data))
+    /* Set last_thing_searched only when match data is changed.  */
     XSETBUFFER (last_thing_searched, current_buffer);
+  }

   return val;
 }

[-- Attachment #2: Type: text/html, Size: 872 bytes --]

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* bug#14281: 24.3; replace-match leaves point at wrong place
  2013-05-15 20:44                 ` Barry OReilly
@ 2013-05-15 20:58                   ` Barry OReilly
  2013-05-21 21:49                     ` Stefan Monnier
  0 siblings, 1 reply; 12+ messages in thread
From: Barry OReilly @ 2013-05-15 20:58 UTC (permalink / raw)
  Cc: 14281

[-- Attachment #1: Type: text/plain, Size: 847 bytes --]

Sorry, didn't edit out the unrelated changes correctly.

diff --git a/src/search.c b/src/search.c
index ea36133..60658c8 100644
--- a/src/search.c
+++ b/src/search.c
@@ -328,6 +314,7 @@ looking_at_1 (Lisp_Object string, bool posix)

   val = (i >= 0 ? Qt : Qnil);
   if (NILP (Vinhibit_changing_match_data) && i >= 0)
+  {
     for (i = 0; i < search_regs.num_regs; i++)
       if (search_regs.start[i] >= 0)
        {
@@ -336,10 +323,9 @@ looking_at_1 (Lisp_Object string, bool posix)
          search_regs.end[i]
            = BYTE_TO_CHAR (search_regs.end[i] + BEGV_BYTE);
        }
-
-  /* Set last_thing_searched only when match data is changed.  */
-  if (NILP (Vinhibit_changing_match_data))
+    /* Set last_thing_searched only when match data is changed.  */
     XSETBUFFER (last_thing_searched, current_buffer);
+  }

   return val;
 }

[-- Attachment #2: Type: text/html, Size: 966 bytes --]

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* bug#14281: 24.3; replace-match leaves point at wrong place
  2013-05-15 20:58                   ` Barry OReilly
@ 2013-05-21 21:49                     ` Stefan Monnier
  0 siblings, 0 replies; 12+ messages in thread
From: Stefan Monnier @ 2013-05-21 21:49 UTC (permalink / raw)
  To: Barry OReilly; +Cc: 14281-done

> Sorry, didn't edit out the unrelated changes correctly.

Thank you very much for tracking down this nasty corner-case bug.
I installed your patch, which I think fixes the original problem, right?


        Stefan





^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2013-05-21 21:49 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-04-27  1:19 bug#14281: 24.3; replace-match leaves point at wrong place Barry OReilly
2013-05-09 19:07 ` Barry OReilly
2013-05-09 21:27   ` Stefan Monnier
2013-05-10 13:27     ` Barry OReilly
2013-05-10 18:19       ` Barry OReilly
2013-05-10 20:56         ` Stefan Monnier
2013-05-14 17:01           ` Barry OReilly
2013-05-15 15:03             ` Barry OReilly
2013-05-15 19:18               ` Stefan Monnier
2013-05-15 20:44                 ` Barry OReilly
2013-05-15 20:58                   ` Barry OReilly
2013-05-21 21:49                     ` Stefan Monnier

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).