From mboxrd@z Thu Jan 1 00:00:00 1970
From: Noam Postavsky
Newsgroups: gmane.emacs.bugs
Subject: bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
Date: Sun, 26 May 2019 18:17:55 -0400
Message-ID: <875zpw97xo.fsf@gmail.com>
References: <87ftujuvkd.fsf@zira.vinc17.org> <878sv7ff6j.fsf@gmail.com>
 <87pnoiegsh.fsf@gmail.com> <20190517213602.GA11777@zira.vinc17.org>
 <875zq8e6tw.fsf@gmail.com> <20190518144756.GA21327@zira.vinc17.org>
 <87r28vd2d5.fsf@gmail.com> <20190519001704.GA5467@zira.vinc17.org>
 <87k1embaqx.fsf@gmail.com> <87h89qb722.fsf@gmail.com>
 <87o93wam5s.fsf@gmail.com>
In-Reply-To: (Stefan Monnier's message of "Wed, 22 May 2019 18:37:44 -0400")
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.2 (gnu/linux)
Cc: Vincent Lefevre , 33887@debbugs.gnu.org
To: Stefan Monnier
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.org gmane.emacs.bugs:159799

--=-=-=
Content-Type: text/plain

Stefan Monnier writes:

> I pushed a patch which should fix the "lone >" problem without
> introducing any undue extra cost. It should also fix the "very long
> line" case.

Seems to pass my tests. Not sure if you missed the alternate fix I
proposed in https://debbugs.gnu.org/33887#94 or not. It does have the
disadvantage of leaving (car (syntax-ppss)) unreliable for any other
code which uses it.

Here's a patch against master that should cover the remaining cases
Vincent raised:

--=-=-=
Content-Type: text/plain
Content-Disposition: attachment; filename=0001-Fix-some-SGML-syntax-edge-cases-Bug-33887.patch
Content-Description: patch

>From 2ffdab0e86161396e3d2606949d1fcf93c58b592 Mon Sep 17 00:00:00 2001
From: Noam Postavsky
Date: Sun, 26 May 2019 11:07:14 -0400
Subject: [PATCH 1/2] Fix some SGML syntax edge cases (Bug#33887)

* lisp/textmodes/sgml-mode.el (sgml-syntax-propertize-rules): Handle
single and double quotes symmetrically. Don't skip quoted comment
enders.
* test/lisp/textmodes/sgml-mode-tests.el (sgml-tests--quotes-syntax):
Add more test cases.
(sgml-mode-quote-in-long-text): New test.
---
 lisp/textmodes/sgml-mode.el            |  5 +++-
 test/lisp/textmodes/sgml-mode-tests.el | 45 ++++++++++++++++++++++++++++------
 2 files changed, 42 insertions(+), 8 deletions(-)

diff --git a/lisp/textmodes/sgml-mode.el b/lisp/textmodes/sgml-mode.el
index 75f20722b0..1df7e78afc 100644
--- a/lisp/textmodes/sgml-mode.el
+++ b/lisp/textmodes/sgml-mode.el
@@ -363,9 +363,12 @@ (eval-and-compile
      ;; the resulting number of calls to syntax-ppss made it too slow
      ;; (bug#33887), so we're now careful to leave alone any pair
      ;; of quotes that doesn't hold a < or > char, which is the vast majority.
-     ("\\(?:\\(?1:\"\\)[^\"<>]*\\|\\(?1:'\\)[^'\"<>]*\\)"
+     ("\\([\"']\\)[^\"'<>]*"
       (1 (if (eq (char-after) (char-after (match-beginning 0)))
              (forward-char 1)
+           ;; Avoid skipping comment ender.
+           (when (eq (char-after) ?>)
+             (skip-chars-backward "-"))
            ;; Be careful to call `syntax-ppss' on a position before the one
            ;; we're going to change, so as not to need to flush the data we
            ;; just computed.
diff --git a/test/lisp/textmodes/sgml-mode-tests.el b/test/lisp/textmodes/sgml-mode-tests.el
index 1b8965e344..34d26480a4 100644
--- a/test/lisp/textmodes/sgml-mode-tests.el
+++ b/test/lisp/textmodes/sgml-mode-tests.el
@@ -161,15 +161,46 @@ (ert-deftest sgml-quote-works ()
       (should (string= "&&" (buffer-string))))))

 (ert-deftest sgml-tests--quotes-syntax ()
+  (dolist (str '("a\"b c'd"
+                 "a'b c\"d"
+                 "\"a'"
+                 "'a\""
+                 "\"a'\""
+                 "'a\"'"
+                 "a\"b c'd"
+                 "c>'d"
+                 ""
+                 ""
+                 ))
+    (with-temp-buffer
+      (sgml-mode)
+      (insert str)
+      (ert-info ((format "%S" str) :prefix "Test case: ")
+        ;; Check that last tag is parsed as a tag.
+        (should (= 1 (car (syntax-ppss (1- (point-max))))))
+        (should (= 0 (car (syntax-ppss (point-max)))))))))
+
+(ert-deftest sgml-mode-quote-in-long-text ()
   (with-temp-buffer
     (sgml-mode)
-    (insert "a\"b c'd")
-    (should (= 1 (car (syntax-ppss (1- (point-max))))))
-    (should (= 0 (car (syntax-ppss (point-max)))))
-    (erase-buffer)
-    (insert "c>d")
-    (should (= 1 (car (syntax-ppss (1- (point-max))))))
-    (should (= 0 (car (syntax-ppss (point-max)))))))
+    (insert ""
+            ;; `syntax-propertize-wholelines' extends chunk size based
+            ;; on line length, so newlines are significant!
+            (make-string syntax-propertize-chunk-size ?a) "\n"
+            "'"
+            (make-string syntax-propertize-chunk-size ?a) "\n"
+            "")
+    ;; If we just check (syntax-ppss (point-max)) immediately, then
+    ;; we'll end up propertizing the whole buffer in one chunk (so the
+    ;; test is useless). Simulate something more like what happens
+    ;; when the buffer is viewed normally.
+    (cl-loop for pos from (point-min) to (point-max)
+             by syntax-propertize-chunk-size
+             do (syntax-ppss pos))
+    (syntax-ppss (point-max))
+    ;; Check that last tag is parsed as a tag.
+    (should (= 1 (- (car (syntax-ppss (1- (point-max))))
+                    (car (syntax-ppss (point-max))))))))

 (provide 'sgml-mode-tests)
 ;;; sgml-mode-tests.el ends here
-- 
2.11.0

--=-=-=
Content-Type: text/plain

And about the highlighting of quoted text outside tags, we can just
disable fontification, while leaving the syntax code untouched:

--=-=-=
Content-Type: text/plain
Content-Disposition: attachment; filename=0002-Don-t-fontiy-text-outside-of-SGML-XML-tags-Bug-33887.patch
Content-Description: patch

>From a4a6008d96011e2517939cb8cb51624802a8c31e Mon Sep 17 00:00:00 2001
From: Noam Postavsky
Date: Sun, 26 May 2019 17:41:22 -0400
Subject: [PATCH 2/2] Don't fontify text outside of SGML/XML tags (Bug#33887)

* lisp/font-lock.el (font-lock-syntactic-face-function-default): New
function.
(font-lock-syntactic-face-function): Use it as default value.
* lisp/textmodes/sgml-mode.el (sgml-font-lock-syntactic-face): New
function.
(sgml-mode):
* lisp/nxml/nxml-mode.el (nxml-mode): Use it as
font-lock-syntactic-face-function value.
---
 lisp/font-lock.el           |  7 +++++--
 lisp/nxml/nxml-mode.el      |  4 +++-
 lisp/textmodes/sgml-mode.el | 11 +++++++++--
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/lisp/font-lock.el b/lisp/font-lock.el
index 3991a4ee8e..ddf1cbdb9f 100644
--- a/lisp/font-lock.el
+++ b/lisp/font-lock.el
@@ -527,9 +527,12 @@ (defvar font-lock-syntactically-fontified 0
 sometimes be slightly incorrect.")
 (make-variable-buffer-local 'font-lock-syntactically-fontified)

+(defun font-lock-syntactic-face-function-default (state)
+  "Default value for `font-lock-syntactic-face-function'."
+  (if (nth 3 state) font-lock-string-face font-lock-comment-face))
+
 (defvar font-lock-syntactic-face-function
-  (lambda (state)
-    (if (nth 3 state) font-lock-string-face font-lock-comment-face))
+  #'font-lock-syntactic-face-function-default
   "Function to determine which face to use when fontifying syntactically.
 The function is called with a single parameter (the state as returned by
 `parse-partial-sexp' at the beginning of the region to highlight) and
diff --git a/lisp/nxml/nxml-mode.el b/lisp/nxml/nxml-mode.el
index da01b2a342..05044d66df 100644
--- a/lisp/nxml/nxml-mode.el
+++ b/lisp/nxml/nxml-mode.el
@@ -551,7 +551,9 @@ (define-derived-mode nxml-mode text-mode "nXML"
           nil ; no special syntax table
           (font-lock-extend-region-functions . (nxml-extend-region))
           (jit-lock-contextually . t)
-          (font-lock-unfontify-region-function . nxml-unfontify-region)))
+          (font-lock-unfontify-region-function . nxml-unfontify-region)
+          (font-lock-syntactic-face-function
+           . sgml-font-lock-syntactic-face)))

   (with-demoted-errors (rng-nxml-mode-init)))

diff --git a/lisp/textmodes/sgml-mode.el b/lisp/textmodes/sgml-mode.el
index 1df7e78afc..225fe72a01 100644
--- a/lisp/textmodes/sgml-mode.el
+++ b/lisp/textmodes/sgml-mode.el
@@ -329,6 +329,11 @@ (defconst sgml-font-lock-keywords-2
 (defvar sgml-font-lock-keywords sgml-font-lock-keywords-1
   "Rules for highlighting SGML code. See also `sgml-tag-face-alist'.")

+(defun sgml-font-lock-syntactic-face (state)
+  "`font-lock-syntactic-face-function' for `sgml-mode'."
+  (and (nth 9 state) ;; Only use faces within tags.
+       (font-lock-syntactic-face-function-default state)))
+
 (defvar-local sgml--syntax-propertize-ppss nil)

 (defun sgml--syntax-propertize-ppss (pos)
@@ -573,7 +578,7 @@ (define-derived-mode sgml-mode text-mode '(sgml-xml-mode "XML" "SGML")
   ;; This is desirable because SGML discards a newline that appears
   ;; immediately after a start tag or immediately before an end tag.
   (setq-local paragraph-start (concat "[ \t]*$\\|\
-[ \t]*"))
+\[ \t]*"))
   (setq-local paragraph-separate (concat paragraph-start "$"))
   (setq-local adaptive-fill-regexp "[ \t]*")
   (add-hook 'fill-nobreak-predicate 'sgml-fill-nobreak nil t)
@@ -591,7 +596,9 @@ (define-derived-mode sgml-mode text-mode '(sgml-xml-mode "XML" "SGML")
   (setq font-lock-defaults '((sgml-font-lock-keywords
                               sgml-font-lock-keywords-1
                               sgml-font-lock-keywords-2)
-                             nil t))
+                             nil t nil
+                             (font-lock-syntactic-face-function
+                              . sgml-font-lock-syntactic-face)))
   (setq-local syntax-propertize-function #'sgml-syntax-propertize)
   (setq-local facemenu-add-face-function 'sgml-mode-facemenu-add-face-function)
   (setq-local sgml-xml-mode (sgml-xml-guess))
-- 
2.11.0

--=-=-=--