unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#35802: Broken data loaded from uni-decomposition
@ 2019-05-19 19:46 Juri Linkov
  2019-06-06 17:07 ` npostavs
  0 siblings, 1 reply; 12+ messages in thread
From: Juri Linkov @ 2019-05-19 19:46 UTC (permalink / raw)
  To: 35802

While adding defcustoms for char-fold in bug#35689, I encountered a problem
where calling `(setq char-fold-table (char-fold-make-table))' garbled data
returned by `(unicode-property-table-internal 'decomposition)'.

This happens only with my customizations, so I tried to narrow down what
customization caused this, so here is the minimal test case:

0. emacs -Q

1. Eval:

(equal (progn (load "international/uni-decomposition.el" t t t t)
              (aref (cdr (assq 'decomposition char-code-property-alist)) 1024))
       (progn (let ((search-spaces-regexp "\\(\\s-\\|\n\\)+"))
                (load "international/uni-decomposition.el" t t t t))
              (aref (cdr (assq 'decomposition char-code-property-alist)) 1024)))
=> nil

But should return `t'.  I customized `search-whitespace-regexp'
(whose value isearch sets to `search-spaces-regexp') to a legitimate
value, but `unicode-property-table-internal' used in char-fold.el fails
to correctly load "uni-decomposition.el", thus breaking the char-fold search.





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#35802: Broken data loaded from uni-decomposition
  2019-05-19 19:46 bug#35802: Broken data loaded from uni-decomposition Juri Linkov
@ 2019-06-06 17:07 ` npostavs
  2019-06-06 20:41   ` Juri Linkov
  0 siblings, 1 reply; 12+ messages in thread
From: npostavs @ 2019-06-06 17:07 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 35802

Juri Linkov <juri@linkov.net> writes:

> But should return `t'.  I customized `search-whitespace-regexp'
> (whose value isearch sets to `search-spaces-regexp') to a legitimate
> value, but `unicode-property-table-internal' used in char-fold.el fails
> to correctly load "uni-decomposition.el", thus breaking the char-fold search.

The problem is that this messes up a search in find-auto-coding:

      (if (re-search-forward
           "[\r\n]\\([^\r\n]*\\)[ \t]*Local Variables:[ \t]*\\([^\r\n]*\\)[\r\n]"
           tail-end t)
          ...
          (let* ((prefix (regexp-quote (match-string 1)))
                 (suffix (regexp-quote (match-string 2)))

The space between "Local Variables" becomes "\\(\\s-\\|\n\\)+" which is
a problem because it adds a new capturing group, which means suffix gets
the wrong value.  Then we fail to find the ";; End:" line, and don't
apply the "coding: utf-8" setting.

So the value you chose isn't entirely legitimate, you should use a shy
group instead: 

(equal (progn (load "international/uni-decomposition.el" t t t t)
              (aref (cdr (assq 'decomposition char-code-property-alist)) 1024))
       (progn (let ((search-spaces-regexp "\\(?:\\s-\\|\n\\)+"))
                (load "international/uni-decomposition.el" t t t t))
              (aref (cdr (assq 'decomposition char-code-property-alist)) 1024)))
;=> t

And possibly let-binding search-spaces-regexp in find-auto-coding would
make sense (although, there's probably more places like this that might
break, not sure if we can ever hope to find them all).





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#35802: Broken data loaded from uni-decomposition
  2019-06-06 17:07 ` npostavs
@ 2019-06-06 20:41   ` Juri Linkov
  2019-06-11 14:18     ` npostavs
  0 siblings, 1 reply; 12+ messages in thread
From: Juri Linkov @ 2019-06-06 20:41 UTC (permalink / raw)
  To: npostavs; +Cc: 35802

>> But should return `t'.  I customized `search-whitespace-regexp'
>> (whose value isearch sets to `search-spaces-regexp') to a legitimate
>> value, but `unicode-property-table-internal' used in char-fold.el fails
>> to correctly load "uni-decomposition.el", thus breaking the char-fold search.
>
> The problem is that this messes up a search in find-auto-coding:

Thanks for finding this.

>       (if (re-search-forward
>            "[\r\n]\\([^\r\n]*\\)[ \t]*Local Variables:[ \t]*\\([^\r\n]*\\)[\r\n]"
>            tail-end t)
>           ...
>           (let* ((prefix (regexp-quote (match-string 1)))
>                  (suffix (regexp-quote (match-string 2)))
>
> The space between "Local Variables" becomes "\\(\\s-\\|\n\\)+" which is
> a problem because it adds a new capturing group, which means suffix gets
> the wrong value.  Then we fail to find the ";; End:" line, and don't
> apply the "coding: utf-8" setting.

When this feature is used in Isearch, the documented way to avoid this problem
is to replace the space with ‘[ ]’, i.e. to use

  "Local[ ]Variables:"

> So the value you chose isn't entirely legitimate, you should use a shy
> group instead:
>
> (equal (progn (load "international/uni-decomposition.el" t t t t)
>               (aref (cdr (assq 'decomposition char-code-property-alist)) 1024))
>        (progn (let ((search-spaces-regexp "\\(?:\\s-\\|\n\\)+"))
>                 (load "international/uni-decomposition.el" t t t t))
>               (aref (cdr (assq 'decomposition char-code-property-alist)) 1024)))
> ;=> t

Maybe this gotcha should be mentioned in the documentation of
search-spaces-regexp and search-whitespace-regexp?

> And possibly let-binding search-spaces-regexp in find-auto-coding would
> make sense (although, there's probably more places like this that might
> break, not sure if we can ever hope to find them all).

This is almost the same class of problems as wrapping re-search-forward
in save-match-data, so finding all places that affect matching elsewhere
will take time.





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#35802: Broken data loaded from uni-decomposition
  2019-06-06 20:41   ` Juri Linkov
@ 2019-06-11 14:18     ` npostavs
  2019-06-11 21:11       ` Juri Linkov
  0 siblings, 1 reply; 12+ messages in thread
From: npostavs @ 2019-06-11 14:18 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 35802, npostavs

Juri Linkov <juri@linkov.net> writes:

>> And possibly let-binding search-spaces-regexp in find-auto-coding would
>> make sense (although, there's probably more places like this that might
>> break, not sure if we can ever hope to find them all).
>
> This is almost the same class of problems as wrapping re-search-forward
> in save-match-data, so finding all places that affect matching elsewhere
> will take time.

Actually maybe it's just a matter of making isearch bind
search-spaces-regexp less widely.  I'm not quite following when the your
problem happens though.  Can you show a backtrace from your original
problem using

    (add-hook 'after-load-functions
              (lambda (f) (when (string-match-p "uni-decomposition" f)
                       (debug nil :search-spaces-regexp search-spaces-regexp))))





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#35802: Broken data loaded from uni-decomposition
  2019-06-11 14:18     ` npostavs
@ 2019-06-11 21:11       ` Juri Linkov
  2019-06-16  2:12         ` Noam Postavsky
  0 siblings, 1 reply; 12+ messages in thread
From: Juri Linkov @ 2019-06-11 21:11 UTC (permalink / raw)
  To: npostavs; +Cc: 35802

> Actually maybe it's just a matter of making isearch bind
> search-spaces-regexp less widely.  I'm not quite following when the your
> problem happens though.  Can you show a backtrace from your original
> problem using
>
>     (add-hook 'after-load-functions
>               (lambda (f) (when (string-match-p "uni-decomposition" f)
>                        (debug nil :search-spaces-regexp search-spaces-regexp))))

When I eval both the above and (setq search-whitespace-regexp "\\(\\s-\\|\n\\)+")
then debugger still shows that search-spaces-regexp is nil
(also note where search-spaces-regexp is let-bound to non-nil in the backtrace,
see also more info after the backtrace):

Debugger entered: (:search-spaces-regexp nil)
  (progn (debug nil :search-spaces-regexp search-spaces-regexp))
  (if (string-match-p "uni-decomposition" f) (progn (debug nil :search-spaces-regexp search-spaces-regexp)))
  (closure (t) (f) (if (string-match-p "uni-decomposition" f) (progn (debug nil :search-spaces-regexp search-spaces-regexp))))("emacs/lisp/international/uni-decomposition.el")
  run-hook-with-args((closure (t) (f) (if (string-match-p "uni-decomposition" f) (progn (debug nil :search-spaces-regexp search-spaces-regexp)))) "emacs/lisp/international/uni-decomposition.el")
  do-after-load-evaluation("emacs/lisp/international/uni-decomposition.el")
  load-with-code-conversion("emacs/lisp/international/uni-decomposition.el" "emacs/lisp/international/uni-decomposition.el" t t)
  unicode-property-table-internal(decomposition)
  char-fold-make-table()
  byte-code("\301 \20\301\207" [char-fold-table char-fold-make-table] 1)
  char-fold-to-regexp("a" (isearch-printing-char isearch-del-char))
  funcall(char-fold-to-regexp "a" (isearch-printing-char isearch-del-char))
  (if (functionp isearch-regexp-function) (funcall isearch-regexp-function string lax) (word-search-regexp string lax))
  (let ((lax (and (not bound) (isearch--lax-regexp-function-p)))) (if lax (progn (setq isearch-adjusted t))) (if (functionp isearch-regexp-function) (funcall isearch-regexp-function string lax) (word-search-regexp string lax)))
  (cond (isearch-regexp-function (let ((lax (and (not bound) (isearch--lax-regexp-function-p)))) (if lax (progn (setq isearch-adjusted t))) (if (functionp isearch-regexp-function) (funcall isearch-regexp-function string lax) (word-search-regexp string lax)))) (isearch-regexp string) (t (regexp-quote string)))
  (funcall (if isearch-forward #'re-search-forward #'re-search-backward) (cond (isearch-regexp-function (let ((lax (and (not bound) (isearch--lax-regexp-function-p)))) (if lax (progn (setq isearch-adjusted t))) (if (functionp isearch-regexp-function) (funcall isearch-regexp-function string lax) (word-search-regexp string lax)))) (isearch-regexp string) (t (regexp-quote string))) bound noerror count)
  (let ((search-spaces-regexp (if (cond (isearch-regexp isearch-regexp-lax-whitespace) (t isearch-lax-whitespace)) (progn search-whitespace-regexp)))) (funcall (if isearch-forward #'re-search-forward #'re-search-backward) (cond (isearch-regexp-function (let ((lax (and (not bound) (isearch--lax-regexp-function-p)))) (if lax (progn (setq isearch-adjusted t))) (if (functionp isearch-regexp-function) (funcall isearch-regexp-function string lax) (word-search-regexp string lax)))) (isearch-regexp string) (t (regexp-quote string))) bound noerror count))
  (closure (isearch-commands minibuffer-history-symbol t) (string &optional bound noerror count) (let ((search-spaces-regexp (if (cond (isearch-regexp isearch-regexp-lax-whitespace) (t isearch-lax-whitespace)) (progn search-whitespace-regexp)))) (funcall (if isearch-forward #'re-search-forward #'re-search-backward) (cond (isearch-regexp-function (let (...) (if lax ...) (if ... ... ...))) (isearch-regexp string) (t (regexp-quote string))) bound noerror count)))("a" nil t)
  funcall((closure (isearch-commands minibuffer-history-symbol t) (string &optional bound noerror count) (let ((search-spaces-regexp (if (cond (isearch-regexp isearch-regexp-lax-whitespace) (t isearch-lax-whitespace)) (progn search-whitespace-regexp)))) (funcall (if isearch-forward #'re-search-forward #'re-search-backward) (cond (isearch-regexp-function (let (...) (if lax ...) (if ... ... ...))) (isearch-regexp string) (t (regexp-quote string))) bound noerror count))) "a" nil t)
  (save-excursion (funcall func string bound noerror))
  (let* ((func (isearch-search-fun)) (pos1 (save-excursion (funcall func string bound noerror))) pos2) (if (and (with-no-warnings (char-table-p translation-table-for-input)) (multibyte-string-p string) (string-match-p "[^[:ascii:]]" string)) (progn (let ((translated (apply 'string (mapcar ... string))) match-data) (if translated (progn (let (...) (unwind-protect ... ...)) (if (and pos2 ...) (progn ... ...))))))) (if pos1 (progn (if (and multi-isearch-next-buffer-current-function (buffer-live-p multi-isearch-current-buffer)) (switch-to-buffer multi-isearch-current-buffer)) (goto-char pos1) pos1)))
  isearch-search-string("a" nil t)
  isearch-search()
  isearch-search-and-update()
  isearch-process-search-string("a" "a")
  isearch-process-search-char(97 1)
  isearch-printing-char(97 1)
  funcall-interactively(isearch-printing-char 97 1)
  call-interactively(isearch-printing-char nil nil)
  command-execute(isearch-printing-char)

But when I add (message "search-spaces-regexp: %S" search-spaces-regexp)
at the top level in char-fold.el, then after its autoload from isearch,
the *Message* buffer contains:

  search-spaces-regexp: "\\(\\s-\\|
  \\)+"





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#35802: Broken data loaded from uni-decomposition
  2019-06-11 21:11       ` Juri Linkov
@ 2019-06-16  2:12         ` Noam Postavsky
  2019-06-16 19:22           ` Juri Linkov
  0 siblings, 1 reply; 12+ messages in thread
From: Noam Postavsky @ 2019-06-16  2:12 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 35802

Juri Linkov <juri@linkov.net> writes:

>> Actually maybe it's just a matter of making isearch bind
>> search-spaces-regexp less widely.  I'm not quite following when the your
>> problem happens though.  Can you show a backtrace from your original
>> problem using
>>
>>     (add-hook 'after-load-functions
>>               (lambda (f) (when (string-match-p "uni-decomposition" f)
>>                        (debug nil :search-spaces-regexp search-spaces-regexp))))
>
> When I eval both the above and (setq search-whitespace-regexp "\\(\\s-\\|\n\\)+")
> then debugger still shows that search-spaces-regexp is nil
> (also note where search-spaces-regexp is let-bound to non-nil in the backtrace,
> see also more info after the backtrace):

Hmm, can you show exactly what's needed to trigger this?  I tried the
above after applying your patch in Bug#35802, but uni-decomposition was
never loaded.





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#35802: Broken data loaded from uni-decomposition
  2019-06-16  2:12         ` Noam Postavsky
@ 2019-06-16 19:22           ` Juri Linkov
  2019-06-21 11:16             ` Noam Postavsky
  0 siblings, 1 reply; 12+ messages in thread
From: Juri Linkov @ 2019-06-16 19:22 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: 35802

>>> Actually maybe it's just a matter of making isearch bind
>>> search-spaces-regexp less widely.  I'm not quite following when the your
>>> problem happens though.  Can you show a backtrace from your original
>>> problem using
>>>
>>>     (add-hook 'after-load-functions
>>>               (lambda (f) (when (string-match-p "uni-decomposition" f)
>>>                        (debug nil :search-spaces-regexp search-spaces-regexp))))
>>
>> When I eval both the above and (setq search-whitespace-regexp "\\(\\s-\\|\n\\)+")
>> then debugger still shows that search-spaces-regexp is nil
>> (also note where search-spaces-regexp is let-bound to non-nil in the backtrace,
>> see also more info after the backtrace):
>
> Hmm, can you show exactly what's needed to trigger this?  I tried the
> above after applying your patch in Bug#35802, but uni-decomposition was
> never loaded.

Here is a complete reproducible test case:

-1. Apply the following patch to master and recompile
 0. emacs -Q
 1. Eval:

   (progn
     (setq search-whitespace-regexp "\\(\\s-\\|\n\\)+")
     (add-hook 'after-load-functions
               (lambda (f) (when (string-match-p "uni-decomposition" f)
                             (debug nil :search-spaces-regexp search-spaces-regexp)))))

 2. Type C-s M-s ' a

diff --git a/lisp/char-fold.el b/lisp/char-fold.el
index d2fa7108bb..7b0e55bb11 100644
--- a/lisp/char-fold.el
+++ b/lisp/char-fold.el
@@ -28,7 +28,6 @@
   (defun char-fold-make-table ()
     (let* ((equiv (make-char-table 'char-fold-table))
            (equiv-multi (make-char-table 'char-fold-table))
-           (search-spaces-regexp nil)   ; workaround for bug#35802
            (table (unicode-property-table-internal 'decomposition)))
       (set-char-table-extra-slot equiv 0 equiv-multi)
 
@@ -141,6 +140,11 @@ char-fold-table
 
 Exceptionally for the space character (32), ALIST is ignored.")
 
+(progn
+  (message "search-spaces-regexp: %S" search-spaces-regexp)
+  ;; Emulate funcall from defcustom :set
+  (setq char-fold-table (char-fold-make-table)))
+
 (defun char-fold--make-space-string (n)
   "Return a string that matches N spaces."
   (format "\\(?:%s\\|%s\\)"





^ permalink raw reply related	[flat|nested] 12+ messages in thread

* bug#35802: Broken data loaded from uni-decomposition
  2019-06-16 19:22           ` Juri Linkov
@ 2019-06-21 11:16             ` Noam Postavsky
  2019-06-21 19:16               ` Juri Linkov
  0 siblings, 1 reply; 12+ messages in thread
From: Noam Postavsky @ 2019-06-21 11:16 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 35802

[-- Attachment #1: Type: text/plain, Size: 477 bytes --]


> --- a/lisp/char-fold.el
> +++ b/lisp/char-fold.el
> @@ -28,7 +28,6 @@
>    (defun char-fold-make-table ()
>      (let* ((equiv (make-char-table 'char-fold-table))
>             (equiv-multi (make-char-table 'char-fold-table))
> -           (search-spaces-regexp nil)   ; workaround for bug#35802

Ah, I guess this part explains why search-spaces-regexp turned up nil in
the debugger backtrace.  So I think adjusting isearch-search-fun-default
should be enough to fix this.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: patch --]
[-- Type: text/x-diff, Size: 3799 bytes --]

From 4f643ac01ff133e4ad088606c7faf03d2c6287b6 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Fri, 21 Jun 2019 07:09:44 -0400
Subject: [PATCH] Don't bind search-spaces-regexp around possible autoload
 (Bug#35802)

* lisp/isearch.el (isearch-search-fun-default): Move possible autoload
trigger outside let-binding of search-spaces-regexp.
* lisp/char-fold.el (char-fold-make-table): Remove no longer needed
workaround.
---
 lisp/char-fold.el |  1 -
 lisp/isearch.el   | 38 ++++++++++++++++++++++----------------
 2 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/lisp/char-fold.el b/lisp/char-fold.el
index 7a79873873..7b0e55bb11 100644
--- a/lisp/char-fold.el
+++ b/lisp/char-fold.el
@@ -28,7 +28,6 @@ (eval-and-compile
   (defun char-fold-make-table ()
     (let* ((equiv (make-char-table 'char-fold-table))
            (equiv-multi (make-char-table 'char-fold-table))
-           (search-spaces-regexp nil)   ; workaround for bug#35802
            (table (unicode-property-table-internal 'decomposition)))
       (set-char-table-extra-slot equiv 0 equiv-multi)
 
diff --git a/lisp/isearch.el b/lisp/isearch.el
index bb29c2914b..bfd2e776ec 100644
--- a/lisp/isearch.el
+++ b/lisp/isearch.el
@@ -3263,25 +3263,31 @@ (defun isearch--lax-regexp-function-p ()
 (defun isearch-search-fun-default ()
   "Return default functions to use for the search."
   (lambda (string &optional bound noerror count)
-    ;; Use lax versions to not fail at the end of the word while
-    ;; the user adds and removes characters in the search string
-    ;; (or when using nonincremental word isearch)
-    (let ((search-spaces-regexp (when (cond
-                                       (isearch-regexp isearch-regexp-lax-whitespace)
-                                       (t isearch-lax-whitespace))
+    (let (;; Evaluate this before binding `search-spaces-regexp' which
+          ;; can break all sorts of regexp searches.  In particular,
+          ;; calling `isearch-regexp-function' can trigger autoloading
+          ;; (Bug#35802).
+          (regexp
+           (cond (isearch-regexp-function
+                  (let ((lax (and (not bound)
+                                  (isearch--lax-regexp-function-p))))
+                    (when lax
+                      (setq isearch-adjusted t))
+                    (if (functionp isearch-regexp-function)
+                        (funcall isearch-regexp-function string lax)
+                      (word-search-regexp string lax))))
+                 (isearch-regexp string)
+                 (t (regexp-quote string))))
+          ;; Use lax versions to not fail at the end of the word while
+          ;; the user adds and removes characters in the search string
+          ;; (or when using nonincremental word isearch)
+          (search-spaces-regexp (when (if isearch-regexp
+                                          isearch-regexp-lax-whitespace
+                                        isearch-lax-whitespace)
                                   search-whitespace-regexp)))
       (funcall
        (if isearch-forward #'re-search-forward #'re-search-backward)
-       (cond (isearch-regexp-function
-              (let ((lax (and (not bound) (isearch--lax-regexp-function-p))))
-                (when lax
-                  (setq isearch-adjusted t))
-                (if (functionp isearch-regexp-function)
-                    (funcall isearch-regexp-function string lax)
-                  (word-search-regexp string lax))))
-             (isearch-regexp string)
-             (t (regexp-quote string)))
-       bound noerror count))))
+       regexp bound noerror count))))
 
 (defun isearch-search-string (string bound noerror)
   "Search for the first occurrence of STRING or its translation.
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* bug#35802: Broken data loaded from uni-decomposition
  2019-06-21 11:16             ` Noam Postavsky
@ 2019-06-21 19:16               ` Juri Linkov
  2019-06-22 22:35                 ` Noam Postavsky
  0 siblings, 1 reply; 12+ messages in thread
From: Juri Linkov @ 2019-06-21 19:16 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: 35802

> So I think adjusting isearch-search-fun-default should be enough
> to fix this.

Yes, hopefully the value of search-spaces-regexp is not needed at the
time of regexp generation, even though it's mentioned in the comments
of char-fold-to-regexp.

Are more changes required to avoid such problem in other error-prone
places and to make char-fold--test-bug-35802 still to pass, like using
"Local[ ]Variables:" in find-auto-coding?

Also maybe a warning about the need of using non-capturing groups should be
added to documentation of search-spaces-regexp, search-whitespace-regexp,
Info-search-whitespace-regexp?





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#35802: Broken data loaded from uni-decomposition
  2019-06-21 19:16               ` Juri Linkov
@ 2019-06-22 22:35                 ` Noam Postavsky
  2019-06-23 21:25                   ` Juri Linkov
  0 siblings, 1 reply; 12+ messages in thread
From: Noam Postavsky @ 2019-06-22 22:35 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 35802

[-- Attachment #1: Type: text/plain, Size: 862 bytes --]

tags 35802 + patch
quit

Juri Linkov <juri@linkov.net> writes:

>> So I think adjusting isearch-search-fun-default should be enough
>> to fix this.
>
> Yes, hopefully the value of search-spaces-regexp is not needed at the
> time of regexp generation, even though it's mentioned in the comments
> of char-fold-to-regexp.

Right, it only affects searching.

> Are more changes required to avoid such problem in other error-prone
> places and to make char-fold--test-bug-35802 still to pass, like using
> "Local[ ]Variables:" in find-auto-coding?

I don't think it's reasonable to start protecting all regexps which
might have whitespace in them.

> Also maybe a warning about the need of using non-capturing groups should be
> added to documentation of search-spaces-regexp, search-whitespace-regexp,
> Info-search-whitespace-regexp?

Right, I forgot about that.


[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 7238 bytes --]

From 9bc358511c1240f1c49c12cd84e34210d7cbc16b Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Fri, 21 Jun 2019 07:09:44 -0400
Subject: [PATCH] Don't bind search-spaces-regexp around possible autoload
 (Bug#35802)

* lisp/isearch.el (isearch-search-fun-default): Move possible autoload
trigger outside let-binding of search-spaces-regexp.
* lisp/char-fold.el (char-fold-make-table): Remove no longer needed
workaround.

* lisp/info.el (Info-search-whitespace-regexp):
* lisp/isearch.el (search-whitespace-regexp):
* src/search.c (syms_of_search) <search-spaces-regexp>: Add warning
about adding capturing groups to the value.

* test/lisp/char-fold-tests.el: Remove, binding search-spaces-regexp
to a different should be considered a bug.
---
 lisp/char-fold.el            |  1 -
 lisp/info.el                 |  4 +++-
 lisp/isearch.el              | 44 ++++++++++++++++++++++++++------------------
 src/search.c                 |  4 +++-
 test/lisp/char-fold-tests.el |  8 --------
 5 files changed, 32 insertions(+), 29 deletions(-)

diff --git a/lisp/char-fold.el b/lisp/char-fold.el
index 7a79873873..7b0e55bb11 100644
--- a/lisp/char-fold.el
+++ b/lisp/char-fold.el
@@ -28,7 +28,6 @@ (eval-and-compile
   (defun char-fold-make-table ()
     (let* ((equiv (make-char-table 'char-fold-table))
            (equiv-multi (make-char-table 'char-fold-table))
-           (search-spaces-regexp nil)   ; workaround for bug#35802
            (table (unicode-property-table-internal 'decomposition)))
       (set-char-table-extra-slot equiv 0 equiv-multi)
 
diff --git a/lisp/info.el b/lisp/info.el
index c211887a39..3203c5f171 100644
--- a/lisp/info.el
+++ b/lisp/info.el
@@ -343,7 +343,9 @@ (defcustom Info-search-whitespace-regexp "\\s-+"
 This applies to Info search for regular expressions.
 You might want to use something like \"[ \\t\\r\\n]+\" instead.
 In the Customization buffer, that is `[' followed by a space,
-a tab, a carriage return (control-M), a newline, and `]+'."
+a tab, a carriage return (control-M), a newline, and `]+'.  Don't
+add any capturing groups into this value; that can change the
+numbering of existing capture groups in unexpected ways."
   :type 'regexp
   :group 'info)
 
diff --git a/lisp/isearch.el b/lisp/isearch.el
index bb29c2914b..f150a3bba4 100644
--- a/lisp/isearch.el
+++ b/lisp/isearch.el
@@ -129,8 +129,10 @@ (defcustom search-whitespace-regexp (purecopy "\\s-+")
 then each space you type matches literally, against one space.
 
 You might want to use something like \"[ \\t\\r\\n]+\" instead.
-In the Customization buffer, that is `[' followed by a space,
-a tab, a carriage return (control-M), a newline, and `]+'."
+In the Customization buffer, that is `[' followed by a space, a
+tab, a carriage return (control-M), a newline, and `]+'.  Don't
+add any capturing groups into this value; that can change the
+numbering of existing capture groups in unexpected ways."
   :type '(choice (const :tag "Match Spaces Literally" nil)
 		 regexp)
   :version "24.3")
@@ -3263,25 +3265,31 @@ (defun isearch--lax-regexp-function-p ()
 (defun isearch-search-fun-default ()
   "Return default functions to use for the search."
   (lambda (string &optional bound noerror count)
-    ;; Use lax versions to not fail at the end of the word while
-    ;; the user adds and removes characters in the search string
-    ;; (or when using nonincremental word isearch)
-    (let ((search-spaces-regexp (when (cond
-                                       (isearch-regexp isearch-regexp-lax-whitespace)
-                                       (t isearch-lax-whitespace))
+    (let (;; Evaluate this before binding `search-spaces-regexp' which
+          ;; can break all sorts of regexp searches.  In particular,
+          ;; calling `isearch-regexp-function' can trigger autoloading
+          ;; (Bug#35802).
+          (regexp
+           (cond (isearch-regexp-function
+                  (let ((lax (and (not bound)
+                                  (isearch--lax-regexp-function-p))))
+                    (when lax
+                      (setq isearch-adjusted t))
+                    (if (functionp isearch-regexp-function)
+                        (funcall isearch-regexp-function string lax)
+                      (word-search-regexp string lax))))
+                 (isearch-regexp string)
+                 (t (regexp-quote string))))
+          ;; Use lax versions to not fail at the end of the word while
+          ;; the user adds and removes characters in the search string
+          ;; (or when using nonincremental word isearch)
+          (search-spaces-regexp (when (if isearch-regexp
+                                          isearch-regexp-lax-whitespace
+                                        isearch-lax-whitespace)
                                   search-whitespace-regexp)))
       (funcall
        (if isearch-forward #'re-search-forward #'re-search-backward)
-       (cond (isearch-regexp-function
-              (let ((lax (and (not bound) (isearch--lax-regexp-function-p))))
-                (when lax
-                  (setq isearch-adjusted t))
-                (if (functionp isearch-regexp-function)
-                    (funcall isearch-regexp-function string lax)
-                  (word-search-regexp string lax))))
-             (isearch-regexp string)
-             (t (regexp-quote string)))
-       bound noerror count))))
+       regexp bound noerror count))))
 
 (defun isearch-search-string (string bound noerror)
   "Search for the first occurrence of STRING or its translation.
diff --git a/src/search.c b/src/search.c
index 8a0f707b72..fa574959fb 100644
--- a/src/search.c
+++ b/src/search.c
@@ -3390,7 +3390,9 @@ syms_of_search (void)
 Some commands use this for user-specified regexps.
 Spaces that occur inside character classes or repetition operators
 or other such regexp constructs are not replaced with this.
-A value of nil (which is the normal value) means treat spaces literally.  */);
+A value of nil (which is the normal value) means treat spaces
+literally.  Note that a value with capturing groups can change the
+numbering of existing capture groups in unexpected ways.  */);
   Vsearch_spaces_regexp = Qnil;
 
   DEFSYM (Qinhibit_changing_match_data, "inhibit-changing-match-data");
diff --git a/test/lisp/char-fold-tests.el b/test/lisp/char-fold-tests.el
index 8a7414084b..3fde312a13 100644
--- a/test/lisp/char-fold-tests.el
+++ b/test/lisp/char-fold-tests.el
@@ -124,13 +124,5 @@ (ert-deftest char-fold--speed-test ()
         ;; Ensure it took less than a second.
         (should (< (- (time-to-seconds) time) 1))))))
 
-(ert-deftest char-fold--test-bug-35802 ()
-  (let* ((char-code-property-alist      ; initial value
-          (cons '(decomposition . "uni-decomposition.el")
-                char-code-property-alist))
-         (search-spaces-regexp "\\(\\s-\\|\n\\)+")
-         (char-fold-table (char-fold-make-table)))
-    (char-fold--test-match-exactly "ä" "ä")))
-
 (provide 'char-fold-tests)
 ;;; char-fold-tests.el ends here
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* bug#35802: Broken data loaded from uni-decomposition
  2019-06-22 22:35                 ` Noam Postavsky
@ 2019-06-23 21:25                   ` Juri Linkov
  2019-06-26  2:08                     ` Noam Postavsky
  0 siblings, 1 reply; 12+ messages in thread
From: Juri Linkov @ 2019-06-23 21:25 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: 35802

>> Yes, hopefully the value of search-spaces-regexp is not needed at the
>> time of regexp generation, even though it's mentioned in the comments
>> of char-fold-to-regexp.
>
> Right, it only affects searching.
>
>> Also maybe a warning about the need of using non-capturing groups should be
>> added to documentation of search-spaces-regexp, search-whitespace-regexp,
>> Info-search-whitespace-regexp?
>
> Right, I forgot about that.

Thanks, after your fix I could continue to finish bug#35689.





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#35802: Broken data loaded from uni-decomposition
  2019-06-23 21:25                   ` Juri Linkov
@ 2019-06-26  2:08                     ` Noam Postavsky
  0 siblings, 0 replies; 12+ messages in thread
From: Noam Postavsky @ 2019-06-26  2:08 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 35802

tags 35802 fixed
close 35802 27.1
quit

Juri Linkov <juri@linkov.net> writes:

> Thanks, after your fix I could continue to finish bug#35689.

Pushed to master.

648fdbbcec 2019-06-25T22:00:03-04:00 "Don't bind search-spaces-regexp around possible autoload (Bug#35802)"
https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=648fdbbcec159e6bfdb7cd06d32c59e8a17a055e






^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-06-26  2:08 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-19 19:46 bug#35802: Broken data loaded from uni-decomposition Juri Linkov
2019-06-06 17:07 ` npostavs
2019-06-06 20:41   ` Juri Linkov
2019-06-11 14:18     ` npostavs
2019-06-11 21:11       ` Juri Linkov
2019-06-16  2:12         ` Noam Postavsky
2019-06-16 19:22           ` Juri Linkov
2019-06-21 11:16             ` Noam Postavsky
2019-06-21 19:16               ` Juri Linkov
2019-06-22 22:35                 ` Noam Postavsky
2019-06-23 21:25                   ` Juri Linkov
2019-06-26  2:08                     ` Noam Postavsky

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).