unofficial mirror of bug-gnu-emacs@gnu.org 
From: Noam Postavsky <npostavs@gmail.com>
To: Juri Linkov <juri@linkov.net>
Cc: 35802@debbugs.gnu.org
Subject: bug#35802: Broken data loaded from uni-decomposition
Date: Sat, 22 Jun 2019 18:35:59 -0400
Message-ID: <871rzluu28.fsf@gmail.com>
In-Reply-To: <877e9e3jqv.fsf@mail.linkov.net> (Juri Linkov's message of "Fri,  21 Jun 2019 22:16:24 +0300")

[-- Attachment #1: Type: text/plain, Size: 862 bytes --]

tags 35802 + patch
quit

Juri Linkov <juri@linkov.net> writes:

>> So I think adjusting isearch-search-fun-default should be enough
>> to fix this.
>
> Yes, hopefully the value of search-spaces-regexp is not needed at the
> time of regexp generation, even though it's mentioned in the comments
> of char-fold-to-regexp.

Right, it only affects searching.
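
As a minimal sketch of why the reordering in the patch is enough (this is
an illustration, not part of the patch; `my-demo-flag' and the two
`my-demo-' functions are made-up names): `let' evaluates all of its
value forms before it establishes any binding, so a form listed next to
the special variable still sees the outer value.

```elisp
;; Hypothetical special variable standing in for `search-spaces-regexp'.
(defvar my-demo-flag nil)

(defun my-demo-read-flag ()
  "Return the current dynamic value of `my-demo-flag'."
  my-demo-flag)

(defun my-demo-let-order ()
  "Return the flag as seen by a `let' value form and by the `let' body."
  (let ((seen-by-init (my-demo-read-flag)) ; evaluated before any binding
        (my-demo-flag t))                  ; in effect only inside the body
    (list seen-by-init (my-demo-read-flag))))

(my-demo-let-order) ; => (nil t)
```

With `let*' and the special variable listed first, the value form would
run under the new binding instead, so the order of the clauses would
matter there.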

> Are more changes required to avoid such problem in other error-prone
> places and to make char-fold--test-bug-35802 still to pass, like using
> "Local[ ]Variables:" in find-auto-coding?

I don't think it's reasonable to start protecting all regexps which
might have whitespace in them.

> Also maybe a warning about the need of using non-capturing groups should be
> added to documentation of search-spaces-regexp, search-whitespace-regexp,
> Info-search-whitespace-regexp?

Right, I forgot about that.
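
For illustration, here is a rough sketch of the renumbering the new
docstring text warns about.  The helper `my-demo-group-1' is a made-up
name, and the sketch assumes that a bound `search-spaces-regexp' is
substituted for the literal space in the pattern, as described in its
docstring.

```elisp
(defun my-demo-group-1 (spaces-regexp)
  "Return group 1 of a fixed search run under SPACES-REGEXP.
SPACES-REGEXP is bound to `search-spaces-regexp' for the search.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    (insert "foo bar")
    (goto-char (point-min))
    (let ((search-spaces-regexp spaces-regexp))
      (when (re-search-forward "foo \\(bar\\)" nil t)
        (match-string 1)))))

;; Spaces treated literally: group 1 is the group written in the pattern.
(my-demo-group-1 nil)

;; Non-capturing (shy) group: the numbering of the pattern's own groups
;; is left alone.
(my-demo-group-1 "\\(?:\\s-\\|\n\\)+")

;; Capturing group, as in the removed test: the substituted group is
;; counted too, so group 1 should no longer be the group written in the
;; pattern.
(my-demo-group-1 "\\(\\s-\\|\n\\)+")
```

The first two calls should both give "bar"; the third is the case the
docstrings now warn against.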


[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 7238 bytes --]

From 9bc358511c1240f1c49c12cd84e34210d7cbc16b Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Fri, 21 Jun 2019 07:09:44 -0400
Subject: [PATCH] Don't bind search-spaces-regexp around possible autoload
 (Bug#35802)

* lisp/isearch.el (isearch-search-fun-default): Move possible autoload
trigger outside let-binding of search-spaces-regexp.
* lisp/char-fold.el (char-fold-make-table): Remove no longer needed
workaround.

* lisp/info.el (Info-search-whitespace-regexp):
* lisp/isearch.el (search-whitespace-regexp):
* src/search.c (syms_of_search) <search-spaces-regexp>: Add warning
about adding capturing groups to the value.

* test/lisp/char-fold-tests.el (char-fold--test-bug-35802): Remove;
binding search-spaces-regexp to a different value should be
considered a bug.
---
 lisp/char-fold.el            |  1 -
 lisp/info.el                 |  4 +++-
 lisp/isearch.el              | 44 ++++++++++++++++++++++++++------------------
 src/search.c                 |  4 +++-
 test/lisp/char-fold-tests.el |  8 --------
 5 files changed, 32 insertions(+), 29 deletions(-)

diff --git a/lisp/char-fold.el b/lisp/char-fold.el
index 7a79873873..7b0e55bb11 100644
--- a/lisp/char-fold.el
+++ b/lisp/char-fold.el
@@ -28,7 +28,6 @@ (eval-and-compile
   (defun char-fold-make-table ()
     (let* ((equiv (make-char-table 'char-fold-table))
            (equiv-multi (make-char-table 'char-fold-table))
-           (search-spaces-regexp nil)   ; workaround for bug#35802
            (table (unicode-property-table-internal 'decomposition)))
       (set-char-table-extra-slot equiv 0 equiv-multi)
 
diff --git a/lisp/info.el b/lisp/info.el
index c211887a39..3203c5f171 100644
--- a/lisp/info.el
+++ b/lisp/info.el
@@ -343,7 +343,9 @@ (defcustom Info-search-whitespace-regexp "\\s-+"
 This applies to Info search for regular expressions.
 You might want to use something like \"[ \\t\\r\\n]+\" instead.
 In the Customization buffer, that is `[' followed by a space,
-a tab, a carriage return (control-M), a newline, and `]+'."
+a tab, a carriage return (control-M), a newline, and `]+'.  Don't
+add any capturing groups into this value; that can change the
+numbering of existing capture groups in unexpected ways."
   :type 'regexp
   :group 'info)
 
diff --git a/lisp/isearch.el b/lisp/isearch.el
index bb29c2914b..f150a3bba4 100644
--- a/lisp/isearch.el
+++ b/lisp/isearch.el
@@ -129,8 +129,10 @@ (defcustom search-whitespace-regexp (purecopy "\\s-+")
 then each space you type matches literally, against one space.
 
 You might want to use something like \"[ \\t\\r\\n]+\" instead.
-In the Customization buffer, that is `[' followed by a space,
-a tab, a carriage return (control-M), a newline, and `]+'."
+In the Customization buffer, that is `[' followed by a space, a
+tab, a carriage return (control-M), a newline, and `]+'.  Don't
+add any capturing groups into this value; that can change the
+numbering of existing capture groups in unexpected ways."
   :type '(choice (const :tag "Match Spaces Literally" nil)
 		 regexp)
   :version "24.3")
@@ -3263,25 +3265,31 @@ (defun isearch--lax-regexp-function-p ()
 (defun isearch-search-fun-default ()
   "Return default functions to use for the search."
   (lambda (string &optional bound noerror count)
-    ;; Use lax versions to not fail at the end of the word while
-    ;; the user adds and removes characters in the search string
-    ;; (or when using nonincremental word isearch)
-    (let ((search-spaces-regexp (when (cond
-                                       (isearch-regexp isearch-regexp-lax-whitespace)
-                                       (t isearch-lax-whitespace))
+    (let (;; Evaluate this before binding `search-spaces-regexp' which
+          ;; can break all sorts of regexp searches.  In particular,
+          ;; calling `isearch-regexp-function' can trigger autoloading
+          ;; (Bug#35802).
+          (regexp
+           (cond (isearch-regexp-function
+                  (let ((lax (and (not bound)
+                                  (isearch--lax-regexp-function-p))))
+                    (when lax
+                      (setq isearch-adjusted t))
+                    (if (functionp isearch-regexp-function)
+                        (funcall isearch-regexp-function string lax)
+                      (word-search-regexp string lax))))
+                 (isearch-regexp string)
+                 (t (regexp-quote string))))
+          ;; Use lax versions to not fail at the end of the word while
+          ;; the user adds and removes characters in the search string
+          ;; (or when using nonincremental word isearch)
+          (search-spaces-regexp (when (if isearch-regexp
+                                          isearch-regexp-lax-whitespace
+                                        isearch-lax-whitespace)
                                   search-whitespace-regexp)))
       (funcall
        (if isearch-forward #'re-search-forward #'re-search-backward)
-       (cond (isearch-regexp-function
-              (let ((lax (and (not bound) (isearch--lax-regexp-function-p))))
-                (when lax
-                  (setq isearch-adjusted t))
-                (if (functionp isearch-regexp-function)
-                    (funcall isearch-regexp-function string lax)
-                  (word-search-regexp string lax))))
-             (isearch-regexp string)
-             (t (regexp-quote string)))
-       bound noerror count))))
+       regexp bound noerror count))))
 
 (defun isearch-search-string (string bound noerror)
   "Search for the first occurrence of STRING or its translation.
diff --git a/src/search.c b/src/search.c
index 8a0f707b72..fa574959fb 100644
--- a/src/search.c
+++ b/src/search.c
@@ -3390,7 +3390,9 @@ syms_of_search (void)
 Some commands use this for user-specified regexps.
 Spaces that occur inside character classes or repetition operators
 or other such regexp constructs are not replaced with this.
-A value of nil (which is the normal value) means treat spaces literally.  */);
+A value of nil (which is the normal value) means treat spaces
+literally.  Note that a value with capturing groups can change the
+numbering of existing capture groups in unexpected ways.  */);
   Vsearch_spaces_regexp = Qnil;
 
   DEFSYM (Qinhibit_changing_match_data, "inhibit-changing-match-data");
diff --git a/test/lisp/char-fold-tests.el b/test/lisp/char-fold-tests.el
index 8a7414084b..3fde312a13 100644
--- a/test/lisp/char-fold-tests.el
+++ b/test/lisp/char-fold-tests.el
@@ -124,13 +124,5 @@ (ert-deftest char-fold--speed-test ()
         ;; Ensure it took less than a second.
         (should (< (- (time-to-seconds) time) 1))))))
 
-(ert-deftest char-fold--test-bug-35802 ()
-  (let* ((char-code-property-alist      ; initial value
-          (cons '(decomposition . "uni-decomposition.el")
-                char-code-property-alist))
-         (search-spaces-regexp "\\(\\s-\\|\n\\)+")
-         (char-fold-table (char-fold-make-table)))
-    (char-fold--test-match-exactly "ä" "ä")))
-
 (provide 'char-fold-tests)
 ;;; char-fold-tests.el ends here
-- 
2.11.0


