* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode @ 2018-12-27 10:13 Vincent Lefevre 2018-12-27 16:02 ` Eli Zaretskii ` (2 more replies) 0 siblings, 3 replies; 42+ messages in thread From: Vincent Lefevre @ 2018-12-27 10:13 UTC (permalink / raw) To: 33887 When I open a large XML file and immediately go to the end of the file with '<ESC> >', Emacs hangs for several seconds. For instance, on /usr/share/xml/iso-codes/iso_639-3.xml from iso-codes in Debian (a 1-MB file), it takes 5 seconds. On a 4-MB personal XML file, it takes 15 seconds. This is a regression: Emacs 25 did not hang at all. In GNU Emacs 26.1 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.24.2) of 2018-12-26, modified by Debian built on x86-ubc-01 Windowing system distributor 'The X.Org Foundation', version 11.0.12003000 System Description: Debian GNU/Linux buster/sid Recent messages: Loading /etc/emacs/site-start.d/50latex-cjk-common.el (source)...done Loading /etc/emacs/site-start.d/50latex-cjk-thai.el (source)...done Loading /etc/emacs/site-start.d/50maxima-emacs.el (source)...done Loading /etc/emacs/site-start.d/50psvn.el (source)...done Loading /etc/emacs/site-start.d/50python-docutils.el (source)...done Loading /etc/emacs/site-start.d/50texlive-lang-english.el (source)...done Loading /etc/emacs/site-start.d/50why3.el (source)...done Loading /home/vinc17/share/emacs/site-lisp/mutteditor.el (source)...done Loading time...done For information about GNU Emacs and the GNU system, type C-h C-a. 
Configured using: 'configure --build x86_64-linux-gnu --prefix=/usr --sharedstatedir=/var/lib --libexecdir=/usr/lib --localstatedir=/var/lib --infodir=/usr/share/info --mandir=/usr/share/man --enable-libsystemd --with-pop=yes --enable-locallisppath=/etc/emacs:/usr/local/share/emacs/26.1/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/26.1/site-lisp:/usr/share/emacs/site-lisp --with-sound=alsa --without-gconf --with-mailutils --build x86_64-linux-gnu --prefix=/usr --sharedstatedir=/var/lib --libexecdir=/usr/lib --localstatedir=/var/lib --infodir=/usr/share/info --mandir=/usr/share/man --enable-libsystemd --with-pop=yes --enable-locallisppath=/etc/emacs:/usr/local/share/emacs/26.1/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/26.1/site-lisp:/usr/share/emacs/site-lisp --with-sound=alsa --without-gconf --with-mailutils --with-x=yes --with-x-toolkit=gtk3 --with-toolkit-scroll-bars 'CFLAGS=-g -O2 -fdebug-prefix-map=/build/emacs-3ThesY/emacs-26.1+1=. -fstack-protector-strong -Wformat -Werror=format-security -Wall' 'CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2' LDFLAGS=-Wl,-z,relro' Configured features: XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND GPM DBUS GSETTINGS NOTIFY ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB TOOLKIT_SCROLL_BARS GTK3 X11 THREADS LIBSYSTEMD LCMS2 Important settings: value of $LC_COLLATE: POSIX value of $LC_CTYPE: en_US.UTF-8 value of $LC_TIME: en_DK value of $LANG: POSIX locale-coding-system: utf-8-unix Major mode: Lisp Interaction Minor modes in effect: display-time-mode: t show-paren-mode: t tooltip-mode: t global-eldoc-mode: t eldoc-mode: t electric-indent-mode: t mouse-wheel-mode: t menu-bar-mode: t file-name-shadow-mode: t global-font-lock-mode: t font-lock-mode: t blink-cursor-mode: t auto-composition-mode: t auto-encryption-mode: t auto-compression-mode: t column-number-mode: t line-number-mode: t transient-mark-mode: t Load-path shadows: /usr/share/emacs/site-lisp/llvm-3.5/tablegen-mode hides 
/usr/share/emacs/site-lisp/llvm-3.6/tablegen-mode /usr/share/emacs/site-lisp/llvm-3.5/llvm-mode hides /usr/share/emacs/site-lisp/llvm-3.6/llvm-mode /usr/share/emacs/site-lisp/llvm-3.5/emacs hides /usr/share/emacs/site-lisp/llvm-3.6/emacs /usr/share/emacs/site-lisp/llvm-3.5/tablegen-mode hides /usr/share/emacs/site-lisp/llvm-3.7/tablegen-mode /usr/share/emacs/site-lisp/llvm-3.5/llvm-mode hides /usr/share/emacs/site-lisp/llvm-3.7/llvm-mode /usr/share/emacs/site-lisp/llvm-3.5/emacs hides /usr/share/emacs/site-lisp/llvm-3.7/emacs /usr/share/emacs/site-lisp/llvm-3.5/tablegen-mode hides /usr/share/emacs/site-lisp/llvm-3.8/tablegen-mode /usr/share/emacs/site-lisp/llvm-3.5/llvm-mode hides /usr/share/emacs/site-lisp/llvm-3.8/llvm-mode /usr/share/emacs/site-lisp/llvm-3.5/emacs hides /usr/share/emacs/site-lisp/llvm-3.8/emacs /usr/share/emacs/site-lisp/llvm-3.5/tablegen-mode hides /usr/share/emacs/site-lisp/llvm-3.9/tablegen-mode /usr/share/emacs/site-lisp/llvm-3.5/llvm-mode hides /usr/share/emacs/site-lisp/llvm-3.9/llvm-mode /usr/share/emacs/site-lisp/llvm-3.5/emacs hides /usr/share/emacs/site-lisp/llvm-3.9/emacs /usr/share/emacs/site-lisp/llvm-3.5/tablegen-mode hides /usr/share/emacs/site-lisp/llvm-4.0/tablegen-mode /usr/share/emacs/site-lisp/llvm-3.5/llvm-mode hides /usr/share/emacs/site-lisp/llvm-4.0/llvm-mode /usr/share/emacs/site-lisp/llvm-3.5/emacs hides /usr/share/emacs/site-lisp/llvm-4.0/emacs /usr/share/emacs/site-lisp/rst hides /usr/share/emacs/26.1/lisp/textmodes/rst /usr/share/emacs/site-lisp/latex-cjk-thai/thai-word hides /usr/share/emacs/26.1/lisp/language/thai-word Features: (shadow sort mail-extr warnings emacsbug message rmc puny seq byte-opt gv bytecomp byte-compile cconv dired dired-loaddefs format-spec rfc822 mml easymenu mml-sec password-cache epa derived epg epg-config gnus-util rmail rmail-loaddefs mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils elec-pair 
time cus-start cus-load paren cc-styles cc-align cc-engine cc-vars cc-defs edmacro kmacro cl-loaddefs cl-lib time-date mule-util tooltip eldoc electric uniquify ediff-hook vc-hooks lisp-float-type mwheel term/x-win x-win term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe tabulated-list replace newcomment text-mode elisp-mode lisp-mode prog-mode register page menu-bar rfn-eshadow isearch timer select scroll-bar mouse jit-lock font-lock syntax facemenu font-core term/tty-colors frame cl-generic cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese composite charscript charprop case-table epa-hook jka-cmpr-hook help simple abbrev obarray minibuffer cl-preloaded nadvice loaddefs button faces cus-face macroexp files text-properties overlay sha1 md5 base64 format env code-pages mule custom widget hashtable-print-readable backquote dbusbind inotify lcms2 dynamic-setting system-font-setting font-render-setting move-toolbar gtk x-toolkit x multi-tty make-network-process emacs) Memory information: ((conses 16 118562 10618) (symbols 48 23199 1) (miscs 40 54 133) (strings 32 34944 2101) (string-bytes 1 946046) (vectors 16 15937) (vector-slots 8 510844 4784) (floats 8 56 97) (intervals 56 279 0) (buffers 992 12)) ^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode 2018-12-27 10:13 bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode Vincent Lefevre @ 2018-12-27 16:02 ` Eli Zaretskii 2018-12-27 16:39 ` Stefan Monnier 2019-01-17 22:57 ` Stefan Monnier 2019-01-08 22:11 ` Fernando Jascovich 2019-05-15 23:53 ` Noam Postavsky 2 siblings, 2 replies; 42+ messages in thread From: Eli Zaretskii @ 2018-12-27 16:02 UTC (permalink / raw) To: Vincent Lefevre, Stefan Monnier; +Cc: 33887 > From: Vincent Lefevre <vincent@vinc17.net> > Date: Thu, 27 Dec 2018 11:13:06 +0100 > > When I open a large XML file and immediately go to the end of the > file with '<ESC> >', Emacs hangs for several seconds. For instance, > on /usr/share/xml/iso-codes/iso_639-3.xml from iso-codes in Debian > (a 1-MB file), it takes 5 seconds. On a 4-MB personal XML file, it > takes 15 seconds. > > This is a regression: Emacs 25 did not hang at all. Confirmed, thanks. The profile (see below) blames syntax-ppss called by sgml-syntax-propertize, so I suspect commit 0055190, which added sgml-syntax-propertize-inside to sgml-syntax-propertize. CC'ing Stefan who made those changes. 
Here's the profile: - command-execute 532 77% - call-interactively 532 77% - funcall-interactively 522 75% - end-of-buffer 500 72% - recenter 496 71% - jit-lock-function 496 71% - jit-lock-fontify-now 496 71% - jit-lock--run-functions 496 71% - run-hook-wrapped 496 71% - #<compiled 0x200000000b3a7fd0> 496 71% - font-lock-fontify-region 496 71% - font-lock-default-fontify-region 496 71% - nxml-extend-region 496 71% - skip-syntax-forward 496 71% - internal--syntax-propertize 496 71% - syntax-propertize 496 71% - sgml-syntax-propertize 490 71% syntax-ppss 445 64% push-mark 1 0% - find-file 20 2% - find-file-noselect 20 2% - find-file-noselect-1 19 2% - after-find-file 17 2% - normal-mode 17 2% - set-auto-mode 17 2% - set-auto-mode-0 17 2% - xml-mode 17 2% - byte-code 14 2% - require 12 1% - byte-code 11 1% - require 10 1% - byte-code 9 1% - require 6 0% - byte-code 6 0% - cl-generic-define-method 4 0% - cl--generic-make-function 4 0% - cl--generic-make-next-function 4 0% - cl--generic-get-dispatcher 4 0% - byte-compile 3 0% byte-code 1 0% - #<compiled 0x200000000b325048> 1 0% byte-compile-top-level 1 0% - custom-declare-variable 1 0% - custom-initialize-reset 1 0% - eval 1 0% - funcall 1 0% - #<compiled 0x200000000b3c88b8> 1 0% - executable-find 1 0% locate-file 1 0% file-truename 1 0% - rng-nxml-mode-init 2 0% - rng-validate-mode 2 0% - rng-auto-set-schema 2 0% - rng-locate-schema-file 2 0% - rng-locate-schema-file-using 2 0% - rng-get-parsed-schema-locating-file 2 0% - rng-parse-schema-locating-file 1 0% - rng-parse-validate-file 1 0% - nxml-parse-instance 1 0% nxml-parse-instance-1 1 0% - file-truename 1 0% - file-truename 1 0% - file-truename 1 0% file-truename 1 0% - insert-file-contents 1 0% xml-find-file-coding-system 1 0% - execute-extended-command 1 0% - sit-for 1 0% redisplay 1 0% - minibuffer-complete 1 0% - completion-in-region 1 0% - completion--in-region 1 0% - #<compiled 0x2000000001b04c20> 1 0% - apply 1 0% - #<compiled 0x20000000013baac8> 1 0% - 
completion--in-region-1 1 0% - completion--do-completion 1 0% - completion-try-completion 1 0% - completion--nth-completion 1 0% - completion--some 1 0% - #<compiled 0x2000000001b0bd20> 1 0% - completion-basic-try-completion 1 0% - try-completion 1 0% completion-file-name-table 1 0% - byte-code 10 1% - read-extended-command 9 1% - completing-read 9 1% - completing-read-default 9 1% read-from-minibuffer 9 1% - find-file-read-args 1 0% - read-file-name 1 0% - read-file-name-default 1 0% - completing-read 1 0% - completing-read-default 1 0% - read-from-minibuffer 1 0% - redisplay_internal (C function) 1 0% find-image 1 0% - ... 158 22% Automatic GC 156 22% - macroexp--all-forms 1 0% - macroexp--expand-all 1 0% - #<compiled 0x2000000001375130> 1 0% - macroexp--all-forms 1 0% - macroexp--expand-all 1 0% - macroexp--all-forms 1 0% - macroexp--expand-all 1 0% - #<compiled 0x2000000001375130> 1 0% - macroexp--all-forms 1 0% - macroexp--expand-all 1 0% - #<compiled 0x2000000001375068> 1 0% - macroexp--all-forms 1 0% - macroexp--expand-all 1 0% - macroexp-macroexpand 1 0% - macroexpand 1 0% #<compiled 0x20000000013f0600> 1 0% - rng-compute-start-tag-open-deriv 1 0% - rng-element-get-child 1 0% - rng-compile 1 0% - apply 1 0% - rng-compile-group 1 0% - mapcar 1 0% - rng-compile 1 0% - apply 1 0% - rng-compile-attribute 1 0% - rng-compile 1 0% - apply 1 0% - rng-compile-ref 1 0% - rng-compile 1 0% - apply 1 0% - rng-compile-data 1 0% rng-compile-dt 1 0% ^ permalink raw reply [flat|nested] 42+ messages in thread
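A rough way to time the syntax-propertize cost that the profile above points at, without going through redisplay, might look like the sketch below. `my/time-syntax-propertize` is a hypothetical helper, not part of Emacs. It assumes that nxml-mode installs a syntax-propertize function (as the Emacs 26 profile suggests), and the figure it returns is not guaranteed to match the interactive delay.

```elisp
(require 'benchmark)

(defun my/time-syntax-propertize (file)
  "Return the seconds spent running `syntax-propertize' over all of FILE.
Hypothetical helper for illustration; FILE is visited in a temporary
buffer in `nxml-mode' and propertized from start to end in one call."
  (with-temp-buffer
    (insert-file-contents file)
    (nxml-mode)
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-TIME); keep ELAPSED.
    (car (benchmark-run 1 (syntax-propertize (point-max))))))
```

For instance, `(my/time-syntax-propertize "/usr/share/xml/iso-codes/iso_639-3.xml")` could be evaluated in Emacs 25 and Emacs 26 to compare the two.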
* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode 2018-12-27 16:02 ` Eli Zaretskii @ 2018-12-27 16:39 ` Stefan Monnier 2018-12-27 16:43 ` Eli Zaretskii 2019-01-17 22:57 ` Stefan Monnier 1 sibling, 1 reply; 42+ messages in thread From: Stefan Monnier @ 2018-12-27 16:39 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Vincent Lefevre, 33887

>> When I open a large XML file and immediately go to the end of the
>> file with '<ESC> >', Emacs hangs for several seconds. For instance,
>> on /usr/share/xml/iso-codes/iso_639-3.xml from iso-codes in Debian
>> (a 1-MB file), it takes 5 seconds. On a 4-MB personal XML file, it
>> takes 15 seconds.
>>
>> This is a regression: Emacs 25 did not hang at all.
>
> Confirmed, thanks.
>
> The profile (see below) blames syntax-ppss called by
> sgml-syntax-propertize, so I suspect commit 0055190, which added
> sgml-syntax-propertize-inside to sgml-syntax-propertize.

Sounds right, but I'm not sure what to do about this.
I do wonder why so much time is spent in syntax-ppss, which is
generally expected to be relatively fast.

Maybe sgml-syntax-propertize is called too often (I see it's mostly
called from skip-syntax-forward; maybe we should call syntax-propertize
explicitly beforehand with a more distant position so
sgml-syntax-propertize is called just once).
Stefan
^ permalink raw reply [flat|nested] 42+ messages in thread
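The idea floated in the message above (propertize up to a distant position once, then scan) can be sketched roughly as follows. `my/skip-syntax-forward-propertized` is a hypothetical name used only for illustration; this is not the change that was eventually applied, just a minimal picture of "call syntax-propertize explicitly beforehand".

```elisp
(defun my/skip-syntax-forward-propertized (syntax limit)
  "Like `skip-syntax-forward' with SYNTAX and LIMIT, but propertize first.
Hypothetical helper: it calls `syntax-propertize' once up to LIMIT so
that the scan itself does not have to trigger it piecemeal."
  ;; One explicit call covering the whole region we are about to scan.
  (syntax-propertize limit)
  ;; The scan now runs over text whose syntax properties are in place.
  (skip-syntax-forward syntax limit))
```

Like `skip-syntax-forward`, it returns the distance travelled, so a caller could substitute it without other changes.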
* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode 2018-12-27 16:39 ` Stefan Monnier @ 2018-12-27 16:43 ` Eli Zaretskii 2018-12-27 17:32 ` Stefan Monnier 0 siblings, 1 reply; 42+ messages in thread From: Eli Zaretskii @ 2018-12-27 16:43 UTC (permalink / raw) To: Stefan Monnier; +Cc: vincent, 33887

> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
> Cc: Vincent Lefevre <vincent@vinc17.net>, 33887@debbugs.gnu.org
> Date: Thu, 27 Dec 2018 11:39:06 -0500
>
> > The profile (see below) blames syntax-ppss called by
> > sgml-syntax-propertize, so I suspect commit 0055190, which added
> > sgml-syntax-propertize-inside to sgml-syntax-propertize.
>
> Sounds right, but I'm not sure what to do about this.
> I do wonder why so much time is spent in syntax-ppss, which is
> generally expected to be relatively fast.

Why was sgml-syntax-propertize-inside added? Is its effect an
absolute must, or merely a nice-to-have feature?

If the latter, perhaps a defcustom that could disable that call will
be an okay solution, at least as a stopgap?

^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode 2018-12-27 16:43 ` Eli Zaretskii @ 2018-12-27 17:32 ` Stefan Monnier 2018-12-27 17:47 ` Eli Zaretskii 2018-12-27 18:43 ` Vincent Lefevre 0 siblings, 2 replies; 42+ messages in thread From: Stefan Monnier @ 2018-12-27 17:32 UTC (permalink / raw) To: Eli Zaretskii; +Cc: vincent, 33887 > Why was sgml-syntax-propertize-inside added? Is its effect an > absolute must, or merely a nice-to-have feature? It's needed for correctness in the presence of <?...?> or <![CDATA[...]]> > If the latter, perhaps a defcustom that could disable that call will > be an okay solution, at least as a stopgap? I don't think it should be terribly expensive, so I'd rather first try and better understand the performance issue, Stefan ^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode 2018-12-27 17:32 ` Stefan Monnier @ 2018-12-27 17:47 ` Eli Zaretskii 2018-12-27 18:43 ` Vincent Lefevre 1 sibling, 0 replies; 42+ messages in thread From: Eli Zaretskii @ 2018-12-27 17:47 UTC (permalink / raw) To: Stefan Monnier; +Cc: vincent, 33887 > From: Stefan Monnier <monnier@IRO.UMontreal.CA> > Cc: vincent@vinc17.net, 33887@debbugs.gnu.org > Date: Thu, 27 Dec 2018 12:32:21 -0500 > > > If the latter, perhaps a defcustom that could disable that call will > > be an okay solution, at least as a stopgap? > > I don't think it should be terribly expensive, so I'd rather first try > and better understand the performance issue, Sure. I thought you already did ;-) ^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode 2018-12-27 17:32 ` Stefan Monnier 2018-12-27 17:47 ` Eli Zaretskii @ 2018-12-27 18:43 ` Vincent Lefevre 2018-12-28 17:18 ` Stefan Monnier 1 sibling, 1 reply; 42+ messages in thread From: Vincent Lefevre @ 2018-12-27 18:43 UTC (permalink / raw) To: Stefan Monnier; +Cc: 33887 On 2018-12-27 12:32:21 -0500, Stefan Monnier wrote: > > Why was sgml-syntax-propertize-inside added? Is its effect an > > absolute must, or merely a nice-to-have feature? > > It's needed for correctness in the presence of <?...?> or <![CDATA[...]]> I use both in some of my XML files and I have never found any issue with them. Or perhaps this is just for particular cases? -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode 2018-12-27 18:43 ` Vincent Lefevre @ 2018-12-28 17:18 ` Stefan Monnier 0 siblings, 0 replies; 42+ messages in thread From: Stefan Monnier @ 2018-12-28 17:18 UTC (permalink / raw) To: Vincent Lefevre; +Cc: 33887 >> > Why was sgml-syntax-propertize-inside added? Is its effect an >> > absolute must, or merely a nice-to-have feature? >> It's needed for correctness in the presence of <?...?> or <![CDATA[...]]> > I use both in some of my XML files and I have never found any issue > with them. Or perhaps this is just for particular cases? Yes, it only makes a real difference when the content of those things ends up confusing the parser (e.g. it looks like an unclosed tag, or things along these lines). Stefan ^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode 2018-12-27 16:02 ` Eli Zaretskii 2018-12-27 16:39 ` Stefan Monnier @ 2019-01-17 22:57 ` Stefan Monnier 1 sibling, 0 replies; 42+ messages in thread From: Stefan Monnier @ 2019-01-17 22:57 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Vincent Lefevre, 33887

> The profile (see below) blames syntax-ppss called by
> sgml-syntax-propertize, so I suspect commit 0055190, which added
> sgml-syntax-propertize-inside to sgml-syntax-propertize.

Hmm... actually, the syntax-ppss calls that take time are made directly
from within sgml-syntax-propertize rather than from within
sgml-syntax-propertize-inside, which doesn't even appear in your
profile (in my profile I get 8099 units of time in
sgml-syntax-propertize, of which 7611 in syntax-ppss and only 77 in
sgml-syntax-propertize-inside).

The problem seems to come from the following syntax propertize rule:

     ;; Double quotes outside of tags should not introduce strings.
     ;; Be careful to call `syntax-ppss' on a position before the one we're
     ;; going to change, so as not to need to flush the data we just computed.
     ("\"" (0 (if (prog1 (zerop (car (syntax-ppss (match-beginning 0))))
                    (goto-char (match-end 0)))
                  (string-to-syntax "."))))

If I comment it out, the delay is *much* smaller. The problem is that
" characters are quite common in XML files, so the regexp matches often,
and since we call syntax-ppss on every match, we end up calling it very
often.

I'm trying to figure out how to avoid calling syntax-ppss for every "
character. I'm thinking of looking at pairs of " chars and only doing
extra work if there's a < or > between the two.

Stefan

^ permalink raw reply [flat|nested] 42+ messages in thread
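To make the test in that rule concrete: `(car (syntax-ppss POS))` is the paren depth at POS, so when `<` and `>` have paired-delimiter syntax, a depth of zero means "outside any tag". The sketch below illustrates this in a temporary buffer. `my/tag-depths` and its hand-built syntax table are assumptions made for the illustration (the real modes set up their own tables); only `syntax-ppss`, `make-syntax-table` and `modify-syntax-entry` are actual Emacs functions.

```elisp
(defun my/tag-depths (text positions)
  "Return the paren depth reported by `syntax-ppss' at each of POSITIONS.
TEXT is inserted into a temporary buffer whose syntax table treats
< and > as a matching open/close pair.  Hypothetical helper."
  (with-temp-buffer
    (let ((table (make-syntax-table)))
      ;; Treat < and > like parentheses that match each other.
      (modify-syntax-entry ?< "(>" table)
      (modify-syntax-entry ?> ")<" table)
      (set-syntax-table table))
    (insert text)
    ;; Element 0 of the parse state is the depth in "parentheses".
    (mapcar (lambda (pos) (car (syntax-ppss pos))) positions)))
```

With the text `<a b> c`, positions 1, 3 and 6 fall before the tag, inside it, and after it, so the depths come out as 0, 1 and 0. The rule quoted above performs this kind of query once per quote character, which is where the time goes.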
* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode 2018-12-27 10:13 bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode Vincent Lefevre 2018-12-27 16:02 ` Eli Zaretskii @ 2019-01-08 22:11 ` Fernando Jascovich 2019-01-10 15:09 ` Eli Zaretskii 2019-05-15 23:53 ` Noam Postavsky 2 siblings, 1 reply; 42+ messages in thread From: Fernando Jascovich @ 2019-01-08 22:11 UTC (permalink / raw) To: 33887

Hi everyone, this is my first email to bug-gnu-emacs, so please let me
know if I am making some mistake.
For no special reason, I took this bug in order to get to know Emacs'
code.
Following and confirming the details of the bug, I found that the
performance issue is indeed introduced at commit 0055190174, but not
because of the introduction of `sgml-syntax-propertize-inside`.
The problem is with the last rule:
```
("\"" (0 (if (prog1 (zerop (car (syntax-ppss (match-beginning 0))))
               (goto-char (match-end 0)))
             (string-to-syntax "."))))
```
I can't see the real effect of this rule. I tested XML parsing without
this rule and it works fine, marking double quotes inside tags as
expected without this performance issue.
Do we need to target double quotes outside tags explicitly?

--
Fernando Jascovich
developer
m: +54 9 3548 63 9833
github: https://github.com/fernando-jascovich/
linkedin: https://www.linkedin.com/in/fernandojascovich/

^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode 2019-01-08 22:11 ` Fernando Jascovich @ 2019-01-10 15:09 ` Eli Zaretskii 2019-01-17 23:25 ` Stefan Monnier 0 siblings, 1 reply; 42+ messages in thread From: Eli Zaretskii @ 2019-01-10 15:09 UTC (permalink / raw) To: Fernando Jascovich, Stefan Monnier; +Cc: 33887

> From: Fernando Jascovich <fernando.ej@gmail.com>
> Date: Tue, 08 Jan 2019 19:11:02 -0300
>
> Hi everyone, this is my first email to bug-gnu-emacs, so please let me
> know if I am making some mistake.
> For no special reason, I took this bug in order to get to know Emacs'
> code.
> Following and confirming the details of the bug, I found that the
> performance issue is indeed introduced at commit 0055190174, but not
> because of the introduction of `sgml-syntax-propertize-inside`.
> The problem is with the last rule:
> ```
> ("\"" (0 (if (prog1 (zerop (car (syntax-ppss (match-beginning 0))))
>                (goto-char (match-end 0)))
>              (string-to-syntax "."))))
> ```
> I can't see the real effect of this rule. I tested XML parsing without
> this rule and it works fine, marking double quotes inside tags as
> expected without this performance issue.
> Do we need to target double quotes outside tags explicitly?

Stefan, any comments?

^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode 2019-01-10 15:09 ` Eli Zaretskii @ 2019-01-17 23:25 ` Stefan Monnier 0 siblings, 0 replies; 42+ messages in thread From: Stefan Monnier @ 2019-01-17 23:25 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Fernando Jascovich, 33887

>> From: Fernando Jascovich <fernando.ej@gmail.com>
>> Date: Tue, 08 Jan 2019 19:11:02 -0300
>>
>> Hi everyone, this is my first email to bug-gnu-emacs, so please let me
>> know if I am making some mistake.
>> For no special reason, I took this bug in order to get to know Emacs'
>> code.
>> Following and confirming the details of the bug, I found that the
>> performance issue is indeed introduced at commit 0055190174, but not
>> because of the introduction of `sgml-syntax-propertize-inside`.
>> The problem is with the last rule:
>> ```
>> ("\"" (0 (if (prog1 (zerop (car (syntax-ppss (match-beginning 0))))
>>                (goto-char (match-end 0)))
>>              (string-to-syntax "."))))
>> ```
>> I can't see the real effect of this rule. I tested XML parsing without
>> this rule and it works fine, marking double quotes inside tags as
>> expected without this performance issue.
>> Do we need to target double quotes outside tags explicitly?
>
> Stefan, any comments?

Yes, he's exactly right. I just pushed a patch to master which should
significantly reduce this delay.

Stefan

^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode 2018-12-27 10:13 bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode Vincent Lefevre 2018-12-27 16:02 ` Eli Zaretskii 2019-01-08 22:11 ` Fernando Jascovich @ 2019-05-15 23:53 ` Noam Postavsky 2019-05-16 10:54 ` Vincent Lefevre ` (2 more replies) 2 siblings, 3 replies; 42+ messages in thread From: Noam Postavsky @ 2019-05-15 23:53 UTC (permalink / raw) To: Vincent Lefevre; +Cc: 33887 Vincent Lefevre <vincent@vinc17.net> writes: > This is a regression: Emacs 25 did not hang at all. Should we backport Stefan's fix to emacs-26? Or specifically, backport [1: e7e92dc5d2], which is Stefan's fix on top of my fix for the loss-of-single-quote-fontification bug (Bug#35381). [1: e7e92dc5d2]: 2019-05-15 19:04:14 -0400 Fix merge of sgml-syntax-propertize-rules https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=e7e92dc5d24ac3bcde69732bab6a6c3c0d9de97b ^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode 2019-05-15 23:53 ` Noam Postavsky @ 2019-05-16 10:54 ` Vincent Lefevre 2019-05-16 12:15 ` Noam Postavsky 2019-05-16 14:01 ` Eli Zaretskii 2 siblings, 0 replies; 42+ messages in thread From: Vincent Lefevre @ 2019-05-16 10:54 UTC (permalink / raw) To: Noam Postavsky; +Cc: 33887 Hi, On 2019-05-15 19:53:08 -0400, Noam Postavsky wrote: > Vincent Lefevre <vincent@vinc17.net> writes: > > > This is a regression: Emacs 25 did not hang at all. > > Should we backport Stefan's fix to emacs-26? Or specifically, backport > [1: e7e92dc5d2], which is Stefan's fix on top of my fix for the > loss-of-single-quote-fontification bug (Bug#35381). > > [1: e7e92dc5d2]: 2019-05-15 19:04:14 -0400 > Fix merge of sgml-syntax-propertize-rules > https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=e7e92dc5d24ac3bcde69732bab6a6c3c0d9de97b It would be nice if this could be fixed quickly in emacs-26, hoping that it could be fixed in Debian before the next stable release. (I'm still using Emacs 25 because of this bug.) -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
From: Noam Postavsky @ 2019-05-16 12:15 UTC
To: Vincent Lefevre; +Cc: 33887

[-- Attachment #1: Type: text/plain, Size: 336 bytes --]

Noam Postavsky <npostavs@gmail.com> writes:

> [1: e7e92dc5d2]: 2019-05-15 19:04:14 -0400
>   Fix merge of sgml-syntax-propertize-rules
>   https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=e7e92dc5d24ac3bcde69732bab6a6c3c0d9de97b

Uh, I goofed that one, Stefan fixed it [2: 9a74e5666b]. The corrected
patch would be as follows:

[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 3164 bytes --]

From 2221c244ee01c4c336ec860cf52a1ef37111ff19 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Wed, 15 May 2019 18:51:30 -0400
Subject: [PATCH] Backport sgml-syntax-propertize-rules speedup (Bug#33887)

* lisp/textmodes/sgml-mode.el (sgml-syntax-propertize-rules): Reapply
2019-01-17 "* lisp/textmodes/sgml-mode.el: Try and fix bug#33887."
taking into account 2019-05-09 "Recognize single quote attribute
values in nxml and sgml (Bug#35381)" which means we have to handle
single quotes as well.
* test/lisp/textmodes/sgml-mode-tests.el (sgml-quote-works): New test.
---
 lisp/textmodes/sgml-mode.el            | 21 +++++++++++++++------
 test/lisp/textmodes/sgml-mode-tests.el |  7 +++++++
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/lisp/textmodes/sgml-mode.el b/lisp/textmodes/sgml-mode.el
index 128e58810e..1c307d12b0 100644
--- a/lisp/textmodes/sgml-mode.el
+++ b/lisp/textmodes/sgml-mode.el
@@ -347,12 +347,21 @@ sgml-font-lock-keywords
      ("--[ \t\n]*\\(>\\)" (1 "> b"))
      ("\\(<\\)[?!]" (1 (prog1 "|>"
                          (sgml-syntax-propertize-inside end))))
-     ;; Quotes outside of tags should not introduce strings.
-     ;; Be careful to call `syntax-ppss' on a position before the one we're
-     ;; going to change, so as not to need to flush the data we just computed.
-     ("[\"']" (0 (if (prog1 (zerop (car (syntax-ppss (match-beginning 0))))
-                       (goto-char (match-end 0)))
-                     (string-to-syntax ".")))))))
+     ;; Quotes outside of tags should not introduce strings which end up
+     ;; hiding tags. We used to test every quote and mark it as "."
+     ;; if it's outside of tags, but there are too many quotes and
+     ;; the resulting number of calls to syntax-ppss made it too slow
+     ;; (bug#33887), so we're now careful to leave alone any pair
+     ;; of quotes that doesn't hold a < or > char, which is the vast majority.
+     ("\\(?:\\(?1:\"\\)[^\"<>]*[<>\"]\\|\\(?1:'\\)[^'<>]*[<>']\\)"
+      (1 (unless (memq (char-before) '(?\' ?\"))
+           ;; Be careful to call `syntax-ppss' on a position before the one
+           ;; we're going to change, so as not to need to flush the data we
+           ;; just computed.
+           (if (prog1 (zerop (car (syntax-ppss (match-beginning 0))))
+                 (goto-char (1- (match-end 0))))
+               (string-to-syntax ".")))))
+     )))
 
 (defun sgml-syntax-propertize (start end)
   "Syntactic keywords for `sgml-mode'."
diff --git a/test/lisp/textmodes/sgml-mode-tests.el b/test/lisp/textmodes/sgml-mode-tests.el
index 7318a667b3..1c501abf38 100644
--- a/test/lisp/textmodes/sgml-mode-tests.el
+++ b/test/lisp/textmodes/sgml-mode-tests.el
@@ -130,5 +130,12 @@ sgml-with-content
     (sgml-delete-tag 1)
     (should (string= "Winter is comin'" (buffer-string)))))
 
+(ert-deftest sgml-tests--quotes-syntax ()
+  (with-temp-buffer
+    (sgml-mode)
+    (insert "a\"b <tag>c'd</tag>")
+    (should (= 1 (car (syntax-ppss (1- (point-max))))))
+    (should (= 0 (car (syntax-ppss (point-max)))))))
+
 (provide 'sgml-mode-tests)
 ;;; sgml-mode-tests.el ends here
-- 
2.11.0

[-- Attachment #3: Type: text/plain, Size: 215 bytes --]

[2: 9a74e5666b]: 2019-05-15 22:21:36 -0400
  * lisp/textmodes/sgml-mode.el (sgml-syntax-propertize-rules): Fix typo
  https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=9a74e5666b022098c63d0047c0df90c66e1aa64a

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
From: Vincent Lefevre @ 2019-05-17 21:36 UTC
To: Noam Postavsky; +Cc: 33887

On 2019-05-16 08:15:58 -0400, Noam Postavsky wrote:
> The corrected patch would be as follows:
[...]

I've tried the combination of

  ca14dd1d4628094dd33d5d94694dcf5f29e843b8
  7dab3ee7ab54b3c2e7bc24170376054786c01d6f

and this patch against Debian's current source package.

Emacs no longer hangs, but I get incorrect highlighting,
for instance on the following XML file.

<root>
  <!-- comment -->
  <a>"a'</a>
  <!-- comment -->
</root>

Highlighting starts to be wrong at the single-quote character.
I've attached a screenshot obtained with the -Q option.

Did I miss anything?

[-- Attachment #2: nxml.png --]
[-- Type: image/png, Size: 5294 bytes --]

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
From: Noam Postavsky @ 2019-05-18 4:15 UTC
To: Vincent Lefevre; +Cc: Stefan Monnier, 33887

[-- Attachment #1: Type: text/plain, Size: 634 bytes --]

Vincent Lefevre <vincent@vinc17.net> writes:

> I've tried the combination of
>
>   ca14dd1d4628094dd33d5d94694dcf5f29e843b8
>   7dab3ee7ab54b3c2e7bc24170376054786c01d6f
>
> and this patch against Debian's current source package.
>
> Emacs no longer hangs, but I get incorrect highlighting,
> for instance on the following XML file.
>
> <root>
>   <!-- comment -->
>   <a>"a'</a>
>   <!-- comment -->
> </root>
>
> Highlighting starts to be wrong at the single-quote character.
> I've attached a screenshot obtained with the -Q option.
>
> Did I miss anything?

Ah, I didn't get the mixed quote handling right. Here's the fix for
master:

[-- Attachment #2: patch --]
[-- Type: text/x-diff, Size: 2449 bytes --]

From 4677edd8dd65b5d956732821e78794f35b275418 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Sat, 18 May 2019 00:04:01 -0400
Subject: [PATCH] Fix Bug#33887 for mixed quote usage

* lisp/textmodes/sgml-mode.el (sgml-syntax-propertize-rules): Only
skip syntax-ppss for matched quotes.
* test/lisp/textmodes/sgml-mode-tests.el (sgml-tests--quotes-syntax):
Expand test.
---
 lisp/textmodes/sgml-mode.el            |  4 ++--
 test/lisp/textmodes/sgml-mode-tests.el | 17 ++++++++++++-----
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/lisp/textmodes/sgml-mode.el b/lisp/textmodes/sgml-mode.el
index 1b064fb825..e3cf56aa0e 100644
--- a/lisp/textmodes/sgml-mode.el
+++ b/lisp/textmodes/sgml-mode.el
@@ -345,8 +345,8 @@ sgml-font-lock-keywords
      ;; the resulting number of calls to syntax-ppss made it too slow
      ;; (bug#33887), so we're now careful to leave alone any pair
      ;; of quotes that doesn't hold a < or > char, which is the vast majority.
-     ("\\(?:\\(?1:\"\\)[^\"<>]*[<>\"]\\|\\(?1:'\\)[^'<>]*[<>']\\)"
-      (1 (unless (memq (char-before) '(?\' ?\"))
+     ("\\([\"']\\)[^<>\"']*[<>\"']"
+      (1 (unless (eq (char-after (match-beginning 1)) (char-before))
            ;; Be careful to call `syntax-ppss' on a position before the one
            ;; we're going to change, so as not to need to flush the data we
            ;; just computed.
diff --git a/test/lisp/textmodes/sgml-mode-tests.el b/test/lisp/textmodes/sgml-mode-tests.el
index a900e8dcf2..ffcc2cd840 100644
--- a/test/lisp/textmodes/sgml-mode-tests.el
+++ b/test/lisp/textmodes/sgml-mode-tests.el
@@ -161,11 +161,18 @@ sgml-with-content
     (should (string= "&&" (buffer-string))))))
 
 (ert-deftest sgml-tests--quotes-syntax ()
-  (with-temp-buffer
-    (sgml-mode)
-    (insert "a\"b <tag>c'd</tag>")
-    (should (= 1 (car (syntax-ppss (1- (point-max))))))
-    (should (= 0 (car (syntax-ppss (point-max)))))))
+  (dolist (str '("a\"b <t>c'd</t>"
+                 "a'b <t>c\"d</t>"
+                 "<t>\"a'</t>"
+                 "<t>'a\"</t>"
+                 "<t>\"a'\"</t>"
+                 "<t>'a\"'</t>"))
+    (with-temp-buffer
+      (sgml-mode)
+      (insert str)
+      ;; Check that last tag is parsed as a tag.
+      (should (= 1 (car (syntax-ppss (1- (point-max))))))
+      (should (= 0 (car (syntax-ppss (point-max))))))))
 
 (provide 'sgml-mode-tests)
 ;;; sgml-mode-tests.el ends here
-- 
2.11.0

[-- Attachment #3: Type: text/plain, Size: 47 bytes --]

And the corresponding patch against emacs-26:

[-- Attachment #4: patch --]
[-- Type: text/plain, Size: 3402 bytes --]

From 3a1a36b0b42772f35c70fb7e996ba8fed787e1c2 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Wed, 15 May 2019 18:51:30 -0400
Subject: [PATCH] Backport sgml-syntax-propertize-rules speedup (Bug#33887)

* lisp/textmodes/sgml-mode.el (sgml-syntax-propertize-rules): Reapply
2019-01-17 "* lisp/textmodes/sgml-mode.el: Try and fix bug#33887."
taking into account 2019-05-09 "Recognize single quote attribute
values in nxml and sgml (Bug#35381)" which means we have to handle
single quotes as well.
* test/lisp/textmodes/sgml-mode-tests.el (sgml-quote-works): New test.
---
 lisp/textmodes/sgml-mode.el            | 21 +++++++++++++++------
 test/lisp/textmodes/sgml-mode-tests.el | 14 ++++++++++++++
 2 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/lisp/textmodes/sgml-mode.el b/lisp/textmodes/sgml-mode.el
index 128e58810e..f8a37c3820 100644
--- a/lisp/textmodes/sgml-mode.el
+++ b/lisp/textmodes/sgml-mode.el
@@ -347,12 +347,21 @@ sgml-font-lock-keywords
      ("--[ \t\n]*\\(>\\)" (1 "> b"))
      ("\\(<\\)[?!]" (1 (prog1 "|>"
                          (sgml-syntax-propertize-inside end))))
-     ;; Quotes outside of tags should not introduce strings.
-     ;; Be careful to call `syntax-ppss' on a position before the one we're
-     ;; going to change, so as not to need to flush the data we just computed.
-     ("[\"']" (0 (if (prog1 (zerop (car (syntax-ppss (match-beginning 0))))
-                       (goto-char (match-end 0)))
-                     (string-to-syntax ".")))))))
+     ;; Quotes outside of tags should not introduce strings which end up
+     ;; hiding tags. We used to test every quote and mark it as "."
+     ;; if it's outside of tags, but there are too many quotes and
+     ;; the resulting number of calls to syntax-ppss made it too slow
+     ;; (bug#33887), so we're now careful to leave alone any pair
+     ;; of quotes that doesn't hold a < or > char, which is the vast majority.
+     ("\\([\"']\\)[^<>\"']*[<>\"']"
+      (1 (unless (eq (char-after (match-beginning 1)) (char-before))
+           ;; Be careful to call `syntax-ppss' on a position before the one
+           ;; we're going to change, so as not to need to flush the data we
+           ;; just computed.
+           (if (prog1 (zerop (car (syntax-ppss (match-beginning 0))))
+                 (goto-char (1- (match-end 0))))
+               (string-to-syntax ".")))))
+     )))
 
 (defun sgml-syntax-propertize (start end)
   "Syntactic keywords for `sgml-mode'."
diff --git a/test/lisp/textmodes/sgml-mode-tests.el b/test/lisp/textmodes/sgml-mode-tests.el
index 7318a667b3..8d0bb88163 100644
--- a/test/lisp/textmodes/sgml-mode-tests.el
+++ b/test/lisp/textmodes/sgml-mode-tests.el
@@ -130,5 +130,19 @@ sgml-with-content
     (sgml-delete-tag 1)
     (should (string= "Winter is comin'" (buffer-string)))))
 
+(ert-deftest sgml-tests--quotes-syntax ()
+  (dolist (str '("a\"b <t>c'd</t>"
+                 "a'b <t>c\"d</t>"
+                 "<t>\"a'</t>"
+                 "<t>'a\"</t>"
+                 "<t>\"a'\"</t>"
+                 "<t>'a\"'</t>"))
+    (with-temp-buffer
+      (sgml-mode)
+      (insert str)
+      ;; Check that last tag is parsed as a tag.
+      (should (= 1 (car (syntax-ppss (1- (point-max))))))
+      (should (= 0 (car (syntax-ppss (point-max))))))))
+
 (provide 'sgml-mode-tests)
 ;;; sgml-mode-tests.el ends here
-- 
2.11.0

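A minimal sketch of how the revised regexp sorts quotes into "leave alone" and "needs the syntax-ppss check", assuming only the pattern from the patch above. The helper name `my-quote-needs-check-p` is hypothetical and not part of sgml-mode.el, and it works on plain strings with `string-match` rather than through the real syntax-propertize machinery:

```elisp
;; Hypothetical helper, for illustration only (not in sgml-mode.el).
;; Returns non-nil when the first quote found in STR would still need
;; the `syntax-ppss' check, i.e. when the match does not end with the
;; same quote character that opened it.
(defun my-quote-needs-check-p (str)
  (let ((re "\\([\"']\\)[^<>\"']*[<>\"']"))
    (when (string-match re str)
      (let ((opener (aref str (match-beginning 1)))
            (closer (aref str (1- (match-end 0)))))
        (not (eq opener closer))))))

(my-quote-needs-check-p "\"abc\"")   ; matched pair: fast path
(my-quote-needs-check-p "\"a'</t>")  ; mixed quotes: needs the check
(my-quote-needs-check-p "c'd</t>")   ; quote runs into "<": needs the check
```

The comparison mirrors the patch's `(eq (char-after (match-beginning 1)) (char-before))` test, where point sits at the end of the match; only the string-based wrapper is an assumption.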
* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
From: Vincent Lefevre @ 2019-05-18 14:47 UTC
To: Noam Postavsky; +Cc: Stefan Monnier, 33887

There's still an issue. On the following XML file

<root>
  <a>text</a>
  <!-- ' -->
  <a>text</a>
</root>

the part after the comment <!-- ' --> is highlighted as a comment.

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
From: Vincent Lefevre @ 2019-05-18 14:55 UTC
To: Noam Postavsky; +Cc: Stefan Monnier, 33887

On 2019-05-18 16:47:56 +0200, Vincent Lefevre wrote:
> There's still an issue. On the following XML file
>
> <root>
>   <a>text</a>
>   <!-- ' -->
>   <a>text</a>
> </root>
>
> the part after the comment <!-- ' --> is highlighted as a comment.

And on the following XML file too:

<root>
  <!DOCTYPE root [
    <!ENTITY f SYSTEM "f.xml">
  ]>
  <a>ab'cd</a>
  <a>text</a>
</root>

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
From: Vincent Lefevre @ 2019-05-18 14:57 UTC
To: Noam Postavsky; +Cc: Stefan Monnier, 33887

On 2019-05-18 16:55:43 +0200, Vincent Lefevre wrote:
> And on the following XML file too:
>
> <root>
>   <!DOCTYPE root [
>     <!ENTITY f SYSTEM "f.xml">
>   ]>
>   <a>ab'cd</a>
>   <a>text</a>
> </root>

I actually meant

<!DOCTYPE root [
  <!ENTITY f SYSTEM "f.xml">
]>
<root>
  <a>ab'cd</a>
  <a>text</a>
</root>

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
From: Vincent Lefevre @ 2019-05-18 15:01 UTC
To: Noam Postavsky; +Cc: Stefan Monnier, 33887

And another one:

<root>
  <a>text</a>
  <!-- "don't" -->
  <a>text</a>
</root>

The second text is highlighted as a comment.

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
From: Noam Postavsky @ 2019-05-18 18:49 UTC
To: Vincent Lefevre; +Cc: Stefan Monnier, 33887

[-- Attachment #1: Type: text/plain, Size: 576 bytes --]

Vincent Lefevre <vincent@vinc17.net> writes:

> There's still an issue. On the following XML file
>
> <root>
>   <a>text</a>
>   <!-- ' -->
>   <a>text</a>
> </root>
>
> the part after the comment <!-- ' --> is highlighted as a comment.

> And another one:
>
> <root>
>   <a>text</a>
>   <!-- "don't" -->
>   <a>text</a>
> </root>
>
> The second text is highlighted as a comment.

Right, this is a collision between the syntax rules. The following
patch fixes it, though perhaps it would be better to just search for
the end of the comment in the ("\\(<\\)!--" (1 "< b")) rule instead?

[-- Attachment #2: patch --]
[-- Type: text/x-diff, Size: 2702 bytes --]

From a866e4f4b556fb4a346fa68c62296f10966690a1 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Sat, 18 May 2019 13:18:19 -0400
Subject: [PATCH] Fix sgml syntax handling of quotes in comments

* lisp/textmodes/sgml-mode.el (sgml-syntax-propertize-rules): Make
sure not to skip over comment ender when searching for quotes.
* test/lisp/textmodes/sgml-mode-tests.el (sgml-tests--quotes-syntax):
Add some more cases.
---
 lisp/textmodes/sgml-mode.el            | 11 ++++++++---
 test/lisp/textmodes/sgml-mode-tests.el | 16 +++++++++-------
 2 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/lisp/textmodes/sgml-mode.el b/lisp/textmodes/sgml-mode.el
index e3cf56aa0e..1af1d1eaef 100644
--- a/lisp/textmodes/sgml-mode.el
+++ b/lisp/textmodes/sgml-mode.el
@@ -350,9 +350,14 @@ sgml-font-lock-keywords
            ;; Be careful to call `syntax-ppss' on a position before the one
            ;; we're going to change, so as not to need to flush the data we
            ;; just computed.
-           (if (prog1 (zerop (car (syntax-ppss (match-beginning 0))))
-                 (goto-char (1- (match-end 0))))
-               (string-to-syntax ".")))))
+           (let ((ppss (syntax-ppss (match-beginning 0))))
+             (if (prog1 (zerop (car ppss)) ; Outside tag.
+                   (goto-char (1- (match-end 0)))
+                   ;; If we're in a comment, don't skip over comment
+                   ;; ender.
+                   (when (nth 4 ppss)
+                     (skip-chars-backward "- \t\n")))
+                 (string-to-syntax "."))))))
      )))
 
 (defun sgml-syntax-propertize (start end)
diff --git a/test/lisp/textmodes/sgml-mode-tests.el b/test/lisp/textmodes/sgml-mode-tests.el
index ffcc2cd840..7e1ddf4047 100644
--- a/test/lisp/textmodes/sgml-mode-tests.el
+++ b/test/lisp/textmodes/sgml-mode-tests.el
@@ -166,13 +166,15 @@ sgml-with-content
                  "<t>\"a'</t>"
                  "<t>'a\"</t>"
                  "<t>\"a'\"</t>"
-                 "<t>'a\"'</t>"))
-    (with-temp-buffer
-      (sgml-mode)
-      (insert str)
-      ;; Check that last tag is parsed as a tag.
-      (should (= 1 (car (syntax-ppss (1- (point-max))))))
-      (should (= 0 (car (syntax-ppss (point-max))))))))
+                 "<t>'a\"'</t>"
+                 "<t><!-- ' --></t>"
+                 "<t><!-- \" --></t>"))
+    (ert-info (str :prefix "Test string: ")
+      (sgml-with-content
+       str
+       ;; Check that last tag is parsed as a tag.
+       (should (= 1 (car (syntax-ppss (1- (point-max))))))
+       (should (= 0 (car (syntax-ppss (point-max)))))))))
 
 (provide 'sgml-mode-tests)
 ;;; sgml-mode-tests.el ends here
-- 
2.11.0

[-- Attachment #3: Type: text/plain, Size: 449 bytes --]

> <!DOCTYPE root [
>   <!ENTITY f SYSTEM "f.xml">
> ]>
> <root>
>   <a>ab'cd</a>
>   <a>text</a>
> </root>

This is a different issue, I think the problem is that
sgml-syntax-propertize-inside doesn't handle nesting in the DTD
definition <! [ <! ... > ]>. The patch below just avoids calling
sgml-syntax-propertize-inside on the prolog in nxml-mode (but the
problem remains in sgml-mode). Though you'll hit Bug#18871/23668 if
you try to edit the DTD.

[-- Attachment #4: patch --]
[-- Type: text/x-diff, Size: 2580 bytes --]

From 9a50fc38b537d570f739c428a57c66557152151b Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Sat, 18 May 2019 14:37:51 -0400
Subject: [PATCH] Don't sgml-syntax-propertize-inside XML prolog

* lisp/nxml/nxml-mode.el (nxml-syntax-propertize): New function.
(nxml-mode): Use it as the syntax-propertize-function.
* test/lisp/nxml/nxml-mode-tests.el
(nxml-mode-doctype-and-quote-syntax): New test.
---
 lisp/nxml/nxml-mode.el            | 16 +++++++++++++++-
 test/lisp/nxml/nxml-mode-tests.el |  8 ++++++++
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/lisp/nxml/nxml-mode.el b/lisp/nxml/nxml-mode.el
index ab035b927e..7c39c5023c 100644
--- a/lisp/nxml/nxml-mode.el
+++ b/lisp/nxml/nxml-mode.el
@@ -423,6 +423,20 @@ nxml-parent-document-set
     (when rng-validate-mode
       (rng-validate-while-idle (current-buffer)))))
 
+(defvar nxml-prolog-end) ;; nxml-rap.el
+(defun nxml-syntax-propertize (start end)
+  "Syntactic keywords for `nxml-mode'."
+  ;; Like `sgml-syntax-propertize', but skip prolog.
+  (setq start (max start nxml-prolog-end))
+  (if (>= start end)
+      (goto-char end)
+    (goto-char start)
+    (sgml-syntax-propertize-inside end)
+    (funcall
+     (syntax-propertize-rules sgml-syntax-propertize-rules)
+     start end)))
+
+
 (defvar tildify-space-string)
 (defvar tildify-foreach-region-function)
 
@@ -518,7 +532,7 @@ nxml-mode
       (nxml-with-invisible-motion
         (nxml-scan-prolog)))))
   (setq-local syntax-ppss-table sgml-tag-syntax-table)
-  (setq-local syntax-propertize-function #'sgml-syntax-propertize)
+  (setq-local syntax-propertize-function #'nxml-syntax-propertize)
   (add-hook 'change-major-mode-hook #'nxml-cleanup nil t)
 
   ;; Emacs 23 handles the encoding attribute on the xml declaration
diff --git a/test/lisp/nxml/nxml-mode-tests.el b/test/lisp/nxml/nxml-mode-tests.el
index 92744be619..2bbf92bc96 100644
--- a/test/lisp/nxml/nxml-mode-tests.el
+++ b/test/lisp/nxml/nxml-mode-tests.el
@@ -78,5 +78,13 @@ nxml-mode-tests-correctly-indented-string
       (should-not (equal (get-text-property squote-txt-pos 'face)
                          (get-text-property dquote-att-pos 'face))))))
 
+(ert-deftest nxml-mode-doctype-and-quote-syntax ()
+  (with-temp-buffer
+    (insert "<!DOCTYPE t [\n<!ENTITY f SYSTEM \"f.xml\">\n]>\n<t>'</t>")
+    (nxml-mode)
+    ;; Check that last tag is parsed as a tag.
+    (should (= 1 (car (syntax-ppss (1- (point-max))))))
+    (should (= 0 (car (syntax-ppss (point-max)))))))
+
 (provide 'nxml-mode-tests)
 ;;; nxml-mode-tests.el ends here
-- 
2.11.0

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
From: Vincent Lefevre @ 2019-05-19 0:17 UTC
To: Noam Postavsky; +Cc: Stefan Monnier, 33887

There's an issue with the following XML file:

<root>
  <a>don't</a>
  <a>text</a>
  <a>></a>
  <a>don't</a>
  <a>text</a>
</root>

where highlighting becomes wrong starting at the second '.

However, even though > is valid, I normally use &gt; instead.

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
From: Noam Postavsky @ 2019-05-19 17:43 UTC
To: Vincent Lefevre; +Cc: Stefan Monnier, 33887

Vincent Lefevre <vincent@vinc17.net> writes:

> There's an issue with the following XML file:
>
> <root>
>   <a>don't</a>
>   <a>text</a>
>   <a>></a>
>   <a>don't</a>
>   <a>text</a>
> </root>
>
> where highlighting becomes wrong starting at the second '.
>
> However, even though > is valid, I normally use &gt; instead.

Hmm, I can't see a way to handle this case without making the
syntax propertizing slow again. Stefan, any ideas?

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
From: Stefan Monnier @ 2019-05-19 18:48 UTC
To: Noam Postavsky; +Cc: Vincent Lefevre, 33887

> Hmm, I can't see a way to handle this case without making the
> syntax propertizing slow again. Stefan, any ideas?

Can you summarize the origin of the problem in his example?

        Stefan

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
From: Noam Postavsky @ 2019-05-19 19:03 UTC
To: Stefan Monnier; +Cc: Vincent Lefevre, 33887

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> Can you summarize the origin of the problem in his example?

    <t>>1</t>

(syntax-ppss) on the location of "1" in the above, gives (-1 ...). And
then (syntax-ppss) on the "/" will give (0 ...). So the syntax
propertize rule for quote use of (zerop (car (syntax-ppss))) no longer
works correctly to see whether it's inside or outside a tag.

">" outside of tags should be set to syntax ".", but I would assume that
adding a syntax-propertize rule which calls syntax-ppss for every ">"
(to check whether it's inside a tag or not) will be very slow, just like
calling it for every quote was.

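A minimal sketch of the depth problem described above, under the assumption that a scratch syntax table pairing "<" with ">" behaves closely enough to `sgml-tag-syntax-table` for this purpose. It calls `parse-partial-sexp` directly instead of `syntax-ppss`, so no mode setup or caching is involved:

```elisp
;; Illustration only: a scratch syntax table in which "<" and ">" are
;; paired delimiters, standing in for `sgml-tag-syntax-table'.
(with-temp-buffer
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?< "(>" table)
    (modify-syntax-entry ?> ")<" table)
    (set-syntax-table table)
    (insert "<t>>1</t>")
    ;; Element 0 of the parser state is the paren depth.
    (list
     ;; Depth just before the "1", i.e. after the stray ">".
     (car (parse-partial-sexp (point-min) 5))
     ;; Depth just before the "/", i.e. inside the closing tag.
     (car (parse-partial-sexp (point-min) 7)))))
```

If the parser behaves as the message describes, the first depth is negative and the second is zero, so a `zerop` test on the depth would report "outside a tag" while point is in fact inside `</t>`.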
* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
From: Stefan Monnier @ 2019-05-19 19:24 UTC
To: Noam Postavsky; +Cc: Vincent Lefevre, 33887

>> Can you summarize the origin of the problem in his example?
>
>     <t>>1</t>
>
> (syntax-ppss) on the location of "1" in the above, gives (-1 ...). And
> then (syntax-ppss) on the "/" will give (0 ...). So the syntax
> propertize rule for quote use of (zerop (car (syntax-ppss))) no longer
> works correctly to see whether it's inside or outside a tag.
>
> ">" outside of tags should be set to syntax ".", but I would assume that
> adding a syntax-propertize rule which calls syntax-ppss for every ">"
> (to check whether it's inside a tag or not) will be very slow, just like
> calling it for every quote was.

Oh, damn!
Hmm...

        Stefan

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode 2019-05-19 19:24 ` Stefan Monnier @ 2019-05-20 20:47 ` Noam Postavsky 2019-05-21 1:06 ` Vincent Lefevre 2019-05-22 22:37 ` Stefan Monnier 2019-05-22 21:44 ` Stefan Monnier 1 sibling, 2 replies; 42+ messages in thread From: Noam Postavsky @ 2019-05-20 20:47 UTC (permalink / raw) To: Stefan Monnier; +Cc: Vincent Lefevre, 33887 [-- Attachment #1: Type: text/plain, Size: 817 bytes --] > There's an issue with the following XML file, which does not have > any special character, except a single quote in the middle of the > text. > > <root> > <a>12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789'012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890 > </a> > </root> > > Note that the newline character before the </a> is important. Right, this is due to chunking by syntax-propertize. Here's the fix: [-- Attachment #2: patch --] [-- Type: text/plain, Size: 3469 bytes --] From 2025fa25f76fd8a2df46fca8807ca386372757d5 Mon Sep 17 00:00:00 2001 From: Noam Postavsky <npostavs@gmail.com> Date: Mon, 20 May 2019 16:04:24 -0400 Subject: [PATCH 1/2] Handle lone quote 500+ characters away from XML tag (Bug#33887) Because syntax-propertize works in small buffer chunks, the rule for finding quotes which don't contain angle brackets failed to trigger when the angle bracket was outside of the current chunk. * lisp/textmodes/sgml-mode.el (sgml-syntax-propertize-rules): Match quotes on lines with no other angle bracket or quote too (the syntax-propertize chunk is extended to cover whole lines). 
* test/lisp/nxml/nxml-mode-tests.el (nxml-mode-quote-in-long-text): New test. --- lisp/textmodes/sgml-mode.el | 9 +++++++-- test/lisp/nxml/nxml-mode-tests.el | 22 ++++++++++++++++++++++ 2 files changed, 29 insertions(+), 2 deletions(-) diff --git a/lisp/textmodes/sgml-mode.el b/lisp/textmodes/sgml-mode.el index 137745fbc1..b555db7b76 100644 --- a/lisp/textmodes/sgml-mode.el +++ b/lisp/textmodes/sgml-mode.el @@ -353,8 +353,13 @@ sgml-font-lock-keywords ;; the resulting number of calls to syntax-ppss made it too slow ;; (bug#33887), so we're now careful to leave alone any pair ;; of quotes that doesn't hold a < or > char, which is the vast majority. - ("\\([\"']\\)[^<>\"']*[<>\"']" - (1 (unless (eq (char-after (match-beginning 1)) (char-before)) + ;; We also check quotes which are unpaired to end of line, + ;; otherwise we miss the case where the quote might "contain" an + ;; angle bracket outside of the current syntax-propertize chunk + ;; (this relies on `syntax-propertize-wholelines' being enabled). + ("\\([\"']\\)[^<>\"']*\\([<>\"']\\|$\\)" + (1 (unless (eq (char-after (match-beginning 1)) + (char-after (match-beginning 2))) ;; Be careful to call `syntax-ppss' on a position before the one ;; we're going to change, so as not to need to flush the data we ;; just computed. diff --git a/test/lisp/nxml/nxml-mode-tests.el b/test/lisp/nxml/nxml-mode-tests.el index 2bbf92bc96..0916a1e652 100644 --- a/test/lisp/nxml/nxml-mode-tests.el +++ b/test/lisp/nxml/nxml-mode-tests.el @@ -86,5 +86,27 @@ nxml-mode-tests-correctly-indented-string (should (= 1 (car (syntax-ppss (1- (point-max)))))) (should (= 0 (car (syntax-ppss (point-max))))))) +(ert-deftest nxml-mode-quote-in-long-text () + (with-temp-buffer + (nxml-mode) + (insert "<t>" + ;; `syntax-propertize-wholelines' extends chunk size based + ;; on line length, so newlines are significant! 
+ (make-string syntax-propertize-chunk-size ?a) "\n" + "'" + (make-string syntax-propertize-chunk-size ?a) "\n" + "</t>") + ;; If we just check (syntax-ppss (point-max)) immediately, then + ;; we'll end up propertizing the whole buffer in one chunk (so the + ;; test is useless). Simulate something more like what happens + ;; when the buffer is viewed normally. + (cl-loop for pos from (point-min) to (point-max) + by syntax-propertize-chunk-size + do (syntax-ppss pos)) + (syntax-ppss (point-max)) + ;; Check that last tag is parsed as a tag. + (should (= 1 (- (car (syntax-ppss (1- (point-max)))) + (car (syntax-ppss (point-max)))))))) + (provide 'nxml-mode-tests) ;;; nxml-mode-tests.el ends here -- 2.11.0 [-- Attachment #3: Type: text/plain, Size: 896 bytes --] Note that you have to be sure to recompile sgml-mode.el AND nxml-mode.el after applying these patches, 'make' isn't smart enough to do it automatically (yes, I figured this out the hard way). >> <t>>1</t> >> >> (syntax-ppss) on the location of "1" in the above, gives (-1 ...). And >> then (syntax-ppss) on the "/" will give (0 ...). So the syntax >> propertize rule for quote use of (zerop (car (syntax-ppss))) no longer >> works correctly to see whether it's inside or outside a tag. >> >> ">" outside of tags should be set to syntax ".", but I would assume that >> adding a syntax-propertize rule which calls syntax-ppss for every ">" >> (to check whether it's inside a tag or not) will be very slow, just like >> calling it for every quote was. Oh, I figured it out, we can just look at (nth 9 ppss), because the list of open parens is still okay, regardless of unmatched close parens. 
[-- Attachment #4: patch --] [-- Type: text/plain, Size: 2400 bytes --] From d1520ab5b94d0f130955800ea11222a3702a5519 Mon Sep 17 00:00:00 2001 From: Noam Postavsky <npostavs@gmail.com> Date: Mon, 20 May 2019 16:29:04 -0400 Subject: [PATCH 2/2] Handle ">" outside SGML/XML tags (Bug#33887) * lisp/textmodes/sgml-mode.el (sgml-syntax-propertize-rules): Check the list of open parens rather than current depth, the latter is not reliable. * test/lisp/textmodes/sgml-mode-tests.el (sgml-tests--quotes-syntax): Extend test for this case. --- lisp/textmodes/sgml-mode.el | 4 +++- test/lisp/textmodes/sgml-mode-tests.el | 9 ++++++--- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/lisp/textmodes/sgml-mode.el b/lisp/textmodes/sgml-mode.el index b555db7b76..052201e5ee 100644 --- a/lisp/textmodes/sgml-mode.el +++ b/lisp/textmodes/sgml-mode.el @@ -364,7 +364,9 @@ sgml-font-lock-keywords ;; we're going to change, so as not to need to flush the data we ;; just computed. (let ((ppss (syntax-ppss (match-beginning 0)))) - (if (prog1 (zerop (car ppss)) ; Outside tag. + ;; Can't rely on depth (nth 0 ppss), because we don't + ;; mark ">" outside of tags. + (if (prog1 (null (nth 9 ppss)) ; Outside tag. (goto-char (1- (match-end 0))) ;; If we're in a comment, don't skip over comment ;; ender. diff --git a/test/lisp/textmodes/sgml-mode-tests.el b/test/lisp/textmodes/sgml-mode-tests.el index 09941fe6f1..d6913863d6 100644 --- a/test/lisp/textmodes/sgml-mode-tests.el +++ b/test/lisp/textmodes/sgml-mode-tests.el @@ -138,13 +138,16 @@ sgml-with-content "<t>\"a'\"</t>" "<t>'a\"'</t>" "<t><!-- ' --></t>" - "<t><!-- \" --></t>")) + "<t><!-- \" --></t>" + ;; Yes, ">" is technically valid outside tags! + "<t>>'</t>" + )) (ert-info (str :prefix "Test string: ") (sgml-with-content str ;; Check that last tag is parsed as a tag. 
-        (should (= 1 (car (syntax-ppss (1- (point-max))))))
-        (should (= 0 (car (syntax-ppss (point-max)))))))))
+        (should (= 1 (- (car (syntax-ppss (1- (point-max))))
+                        (car (syntax-ppss (point-max))))))))))
 
 (provide 'sgml-mode-tests)
 ;;; sgml-mode-tests.el ends here
-- 
2.11.0
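
A minimal sketch of the depth-versus-open-parens point made in the message above, for illustration only. It assumes a toy syntax table in which < and > are plain paren delimiters (the real sgml-mode table and its propertize rules are more involved), and the helper name `demo-tag-state' is hypothetical. It parses "<t>>1</t>" with `parse-partial-sexp' and returns the depth (nth 0) together with the list of open-paren positions (nth 9).

```elisp
;; Sketch only: `demo-tag-state' is a hypothetical helper, and the
;; syntax table below is a simplified stand-in for sgml-mode's.
(defun demo-tag-state (text pos)
  "Parse TEXT up to POS and return (DEPTH . OPEN-PAREN-POSITIONS)."
  (with-temp-buffer
    (let ((table (make-syntax-table)))
      ;; Treat < and > as a matching pair of parens.
      (modify-syntax-entry ?< "(>" table)
      (modify-syntax-entry ?> ")<" table)
      (set-syntax-table table))
    (insert text)
    (let ((state (parse-partial-sexp (point-min) pos)))
      (cons (nth 0 state) (nth 9 state)))))

;; Just before the "1": depth has gone negative, but no paren is open.
(demo-tag-state "<t>>1</t>" 5)   ; => (-1)
;; Just before the "/": depth is back to 0, yet one paren is open.
(demo-tag-state "<t>>1</t>" 7)   ; => (0 6)
```

In this toy setting a depth of zero no longer means "outside a tag" once a stray ">" has been seen, whereas an empty open-paren list still does.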
* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
  2019-05-20 20:47 ` Noam Postavsky
@ 2019-05-21 1:06 ` Vincent Lefevre
  2019-05-21 12:27 ` Noam Postavsky
  2019-05-22 22:37 ` Stefan Monnier
  1 sibling, 1 reply; 42+ messages in thread
From: Vincent Lefevre @ 2019-05-21 1:06 UTC
  To: Noam Postavsky; +Cc: Stefan Monnier, 33887

Thanks for the fixes.

Also I don't think that in a text node, the " and ' characters should
be interpreted for highlighting. In particular, ' is generally used
as an apostrophe, not as a quote. For instance, this looks strange:

  <a>This "shouldn't" and "can't" be right.</a>

These characters have no special meaning in a text node.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
  2019-05-21 1:06 ` Vincent Lefevre
@ 2019-05-21 12:27 ` Noam Postavsky
  2019-05-22 13:58 ` Stefan Monnier
  0 siblings, 1 reply; 42+ messages in thread
From: Noam Postavsky @ 2019-05-21 12:27 UTC
  To: Vincent Lefevre; +Cc: Stefan Monnier, 33887

Vincent Lefevre <vincent@vinc17.net> writes:

> Also I don't think that in a text node, the " and ' characters should
> be interpreted for highlighting. In particular, ' is generally used
> as an apostrophe, not as a quote. For instance, this looks strange:
>
>   <a>This "shouldn't" and "can't" be right.</a>
>
> These characters have no special meaning in a text node.

Hmm, right, it should be possible to fix the crossing quotes in the
above case, but even the simpler

    <a>"oops" 'oops'</a>

shows the same highlighting. This seems directly due to "we're now
careful to leave alone any pair of quotes that doesn't hold a < or >
char". So uh, Stefan, how was that supposed to work exactly?

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
  2019-05-21 12:27 ` Noam Postavsky
@ 2019-05-22 13:58 ` Stefan Monnier
  2019-05-22 15:44 ` Vincent Lefevre
  0 siblings, 1 reply; 42+ messages in thread
From: Stefan Monnier @ 2019-05-22 13:58 UTC
  To: Noam Postavsky; +Cc: Vincent Lefevre, 33887

> shows the same highlighting. This seems directly due to "we're now
> careful to leave alone any pair of quotes that doesn't hold a < or >
> char". So uh, Stefan, how was that supposed to work exactly?

Remember: when I wrote this, we only supported "..." and not '...'.


        Stefan

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
  2019-05-22 13:58 ` Stefan Monnier
@ 2019-05-22 15:44 ` Vincent Lefevre
  2019-05-22 16:01 ` Stefan Monnier
  0 siblings, 1 reply; 42+ messages in thread
From: Vincent Lefevre @ 2019-05-22 15:44 UTC
  To: Stefan Monnier; +Cc: Noam Postavsky, 33887

On 2019-05-22 09:58:54 -0400, Stefan Monnier wrote:
> > shows the same highlighting. This seems directly due to "we're now
> > careful to leave alone any pair of quotes that doesn't hold a < or >
> > char". So uh, Stefan, how was that supposed to work exactly?
>
> Remember: when I wrote this, we only supported "..." and not '...'.

I'm not sure what you mean by that, but the single quotes are not
the only issue. In general, you don't know the quoting rules in a
text node used by the underlying language (if any), even if you
have only double quotes. For instance, a text node may contain C
or shell code, which can be:

  "a string with \"double quotes\"..."

And one does not expect this to be interpreted as two pairs of
double-quoted text ("a string with \" and "..."). In short, you
should leave text nodes with no specific highlighting, as this
was the case with Emacs 25.

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
  2019-05-22 15:44 ` Vincent Lefevre
@ 2019-05-22 16:01 ` Stefan Monnier
  0 siblings, 0 replies; 42+ messages in thread
From: Stefan Monnier @ 2019-05-22 16:01 UTC
  To: Vincent Lefevre; +Cc: Noam Postavsky, 33887

> I'm not sure what you mean by that, but the single quotes are not
> the only issue.

No but it introduces problems a lot more often.

> In general, you don't know the quoting rules in a
> text node used by the underlying language (if any), even if you
> have only double quotes. For instance, a text node may contain C
> or shell code, which can be:
>
>   "a string with \"double quotes\"..."

Of course. But to the extent that it doesn't break the rest of the SGML
support, I think it was a pretty good tradeoff (and has arguably a more
often beneficial than harmful effect).

> And one does not expect this to be interpreted as two pairs of
> double-quoted text ("a string with \" and "..."). In short, you
> should leave text nodes with no specific highlighting, as this
> was the case with Emacs 25.

IIRC in Emacs-24 it was yet different. Basically, the focus should be to
handle tags correctly and what happens in the regular text between tags
is not nearly as important.


        Stefan

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
  2019-05-20 20:47 ` Noam Postavsky
  2019-05-21 1:06 ` Vincent Lefevre
@ 2019-05-22 22:37 ` Stefan Monnier
  2019-05-26 22:17 ` Noam Postavsky
  1 sibling, 1 reply; 42+ messages in thread
From: Stefan Monnier @ 2019-05-22 22:37 UTC
  To: Noam Postavsky; +Cc: Vincent Lefevre, 33887

> Right, this is due to chunking by syntax-propertize. Here's the fix:

I pushed a patch which should fix the "lone >" problem without
introducing any undue extra cost. It should also fix the "very long
line" case.


        Stefan

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
  2019-05-22 22:37 ` Stefan Monnier
@ 2019-05-26 22:17 ` Noam Postavsky
  2019-05-27 9:18 ` Vincent Lefevre
  0 siblings, 1 reply; 42+ messages in thread
From: Noam Postavsky @ 2019-05-26 22:17 UTC
  To: Stefan Monnier; +Cc: Vincent Lefevre, 33887

[-- Attachment #1: Type: text/plain, Size: 516 bytes --]

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> I pushed a patch which should fix the "lone >" problem without
> introducing any undue extra cost. It should also fix the "very long
> line" case.

Seems to pass my tests. Not sure if you missed the alternate fix I
proposed in https://debbugs.gnu.org/33887#94 or not. It does have the
disadvantage of leaving (car (syntax-ppss)) unreliable for any other
code which uses it.

Here's a patch against master that should cover the remaining cases
Vincent raised:

[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 4011 bytes --]

From 2ffdab0e86161396e3d2606949d1fcf93c58b592 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Sun, 26 May 2019 11:07:14 -0400
Subject: [PATCH 1/2] Fix some SGML syntax edge cases (Bug#33887)

* lisp/textmodes/sgml-mode.el (sgml-syntax-propertize-rules): Handle
single and double quotes symmetrically. Don't skip quoted comment
enders.
* test/lisp/textmodes/sgml-mode-tests.el (sgml-tests--quotes-syntax):
Add more test cases.
(sgml-mode-quote-in-long-text): New test.
---
 lisp/textmodes/sgml-mode.el            |  5 +++-
 test/lisp/textmodes/sgml-mode-tests.el | 45 ++++++++++++++++++++++++++++------
 2 files changed, 42 insertions(+), 8 deletions(-)

diff --git a/lisp/textmodes/sgml-mode.el b/lisp/textmodes/sgml-mode.el
index 75f20722b0..1df7e78afc 100644
--- a/lisp/textmodes/sgml-mode.el
+++ b/lisp/textmodes/sgml-mode.el
@@ -363,9 +363,12 @@ (eval-and-compile
      ;; the resulting number of calls to syntax-ppss made it too slow
      ;; (bug#33887), so we're now careful to leave alone any pair
      ;; of quotes that doesn't hold a < or > char, which is the vast majority.
-     ("\\(?:\\(?1:\"\\)[^\"<>]*\\|\\(?1:'\\)[^'\"<>]*\\)"
+     ("\\([\"']\\)[^\"'<>]*"
       (1 (if (eq (char-after) (char-after (match-beginning 0)))
              (forward-char 1)
+           ;; Avoid skipping comment ender.
+           (when (eq (char-after) ?>)
+             (skip-chars-backward "-"))
            ;; Be careful to call `syntax-ppss' on a position before the one
            ;; we're going to change, so as not to need to flush the data we
            ;; just computed.
diff --git a/test/lisp/textmodes/sgml-mode-tests.el b/test/lisp/textmodes/sgml-mode-tests.el
index 1b8965e344..34d26480a4 100644
--- a/test/lisp/textmodes/sgml-mode-tests.el
+++ b/test/lisp/textmodes/sgml-mode-tests.el
@@ -161,15 +161,46 @@ (ert-deftest sgml-quote-works ()
       (should (string= "&&" (buffer-string))))))
 
 (ert-deftest sgml-tests--quotes-syntax ()
+  (dolist (str '("a\"b <t>c'd</t>"
+                 "a'b <t>c\"d</t>"
+                 "<t>\"a'</t>"
+                 "<t>'a\"</t>"
+                 "<t>\"a'\"</t>"
+                 "<t>'a\"'</t>"
+                 "a\"b <tag>c'd</tag>"
+                 "<tag>c>'d</tag>"
+                 "<t><!-- \" --></t>"
+                 "<t><!-- ' --></t>"
+                 ))
+    (with-temp-buffer
+      (sgml-mode)
+      (insert str)
+      (ert-info ((format "%S" str) :prefix "Test case: ")
+        ;; Check that last tag is parsed as a tag.
+        (should (= 1 (car (syntax-ppss (1- (point-max))))))
+        (should (= 0 (car (syntax-ppss (point-max)))))))))
+
+(ert-deftest sgml-mode-quote-in-long-text ()
   (with-temp-buffer
     (sgml-mode)
-    (insert "a\"b <tag>c'd</tag>")
-    (should (= 1 (car (syntax-ppss (1- (point-max))))))
-    (should (= 0 (car (syntax-ppss (point-max)))))
-    (erase-buffer)
-    (insert "<tag>c>d</tag>")
-    (should (= 1 (car (syntax-ppss (1- (point-max))))))
-    (should (= 0 (car (syntax-ppss (point-max)))))))
+    (insert "<t>"
+            ;; `syntax-propertize-wholelines' extends chunk size based
+            ;; on line length, so newlines are significant!
+            (make-string syntax-propertize-chunk-size ?a) "\n"
+            "'"
+            (make-string syntax-propertize-chunk-size ?a) "\n"
+            "</t>")
+    ;; If we just check (syntax-ppss (point-max)) immediately, then
+    ;; we'll end up propertizing the whole buffer in one chunk (so the
+    ;; test is useless). Simulate something more like what happens
+    ;; when the buffer is viewed normally.
+    (cl-loop for pos from (point-min) to (point-max)
+             by syntax-propertize-chunk-size
+             do (syntax-ppss pos))
+    (syntax-ppss (point-max))
+    ;; Check that last tag is parsed as a tag.
+    (should (= 1 (- (car (syntax-ppss (1- (point-max))))
+                    (car (syntax-ppss (point-max))))))))
 
 (provide 'sgml-mode-tests)
 ;;; sgml-mode-tests.el ends here
-- 
2.11.0

[-- Attachment #3: Type: text/plain, Size: 134 bytes --]

And about the highlighting of quoted text outside tags, we can just
disable fontification, while leaving the syntax code untouched:

[-- Attachment #4: patch --]
[-- Type: text/plain, Size: 4141 bytes --]

From a4a6008d96011e2517939cb8cb51624802a8c31e Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Sun, 26 May 2019 17:41:22 -0400
Subject: [PATCH 2/2] Don't fontify text outside of SGML/XML tags (Bug#33887)

* lisp/font-lock.el (font-lock-syntactic-face-function-default): New
function.
(font-lock-syntactic-face-function): Use it as default value.
* lisp/textmodes/sgml-mode.el (sgml-font-lock-syntactic-face): New
function.
(sgml-mode):
* lisp/nxml/nxml-mode.el (nxml-mode): Use it as
font-lock-syntactic-face-function value.
---
 lisp/font-lock.el           |  7 +++++--
 lisp/nxml/nxml-mode.el      |  4 +++-
 lisp/textmodes/sgml-mode.el | 11 +++++++++--
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/lisp/font-lock.el b/lisp/font-lock.el
index 3991a4ee8e..ddf1cbdb9f 100644
--- a/lisp/font-lock.el
+++ b/lisp/font-lock.el
@@ -527,9 +527,12 @@ (defvar font-lock-syntactically-fontified 0
 sometimes be slightly incorrect.")
 (make-variable-buffer-local 'font-lock-syntactically-fontified)
 
+(defun font-lock-syntactic-face-function-default (state)
+  "Default value for `font-lock-syntactic-face-function'."
+  (if (nth 3 state) font-lock-string-face font-lock-comment-face))
+
 (defvar font-lock-syntactic-face-function
-  (lambda (state)
-    (if (nth 3 state) font-lock-string-face font-lock-comment-face))
+  #'font-lock-syntactic-face-function-default
   "Function to determine which face to use when fontifying syntactically.
 The function is called with a single parameter (the state as returned by
 `parse-partial-sexp' at the beginning of the region to highlight) and
diff --git a/lisp/nxml/nxml-mode.el b/lisp/nxml/nxml-mode.el
index da01b2a342..05044d66df 100644
--- a/lisp/nxml/nxml-mode.el
+++ b/lisp/nxml/nxml-mode.el
@@ -551,7 +551,9 @@ (define-derived-mode nxml-mode text-mode "nXML"
            nil ; no special syntax table
            (font-lock-extend-region-functions . (nxml-extend-region))
            (jit-lock-contextually . t)
-           (font-lock-unfontify-region-function . nxml-unfontify-region)))
+           (font-lock-unfontify-region-function . nxml-unfontify-region)
+           (font-lock-syntactic-face-function
+            . sgml-font-lock-syntactic-face)))
 
   (with-demoted-errors (rng-nxml-mode-init)))
 
diff --git a/lisp/textmodes/sgml-mode.el b/lisp/textmodes/sgml-mode.el
index 1df7e78afc..225fe72a01 100644
--- a/lisp/textmodes/sgml-mode.el
+++ b/lisp/textmodes/sgml-mode.el
@@ -329,6 +329,11 @@ (defconst sgml-font-lock-keywords-2
 (defvar sgml-font-lock-keywords sgml-font-lock-keywords-1
   "Rules for highlighting SGML code. See also `sgml-tag-face-alist'.")
 
+(defun sgml-font-lock-syntactic-face (state)
+  "`font-lock-syntactic-face-function' for `sgml-mode'."
+  (and (nth 9 state) ;; Only use faces within tags.
+       (font-lock-syntactic-face-function-default state)))
+
 (defvar-local sgml--syntax-propertize-ppss nil)
 
 (defun sgml--syntax-propertize-ppss (pos)
@@ -573,7 +578,7 @@ (define-derived-mode sgml-mode text-mode '(sgml-xml-mode "XML" "SGML")
   ;; This is desirable because SGML discards a newline that appears
   ;; immediately after a start tag or immediately before an end tag.
   (setq-local paragraph-start (concat "[ \t]*$\\|\
-[ \t]*</?\\(" sgml-name-re sgml-attrs-re "\\)?>"))
+\[ \t]*</?\\(" sgml-name-re sgml-attrs-re "\\)?>"))
   (setq-local paragraph-separate (concat paragraph-start "$"))
   (setq-local adaptive-fill-regexp "[ \t]*")
   (add-hook 'fill-nobreak-predicate 'sgml-fill-nobreak nil t)
@@ -591,7 +596,9 @@ (define-derived-mode sgml-mode text-mode '(sgml-xml-mode "XML" "SGML")
   (setq font-lock-defaults '((sgml-font-lock-keywords
                               sgml-font-lock-keywords-1
                               sgml-font-lock-keywords-2)
-                             nil t))
+                             nil t nil
+                             (font-lock-syntactic-face-function
+                              . sgml-font-lock-syntactic-face)))
   (setq-local syntax-propertize-function #'sgml-syntax-propertize)
   (setq-local facemenu-add-face-function 'sgml-mode-facemenu-add-face-function)
   (setq-local sgml-xml-mode (sgml-xml-guess))
-- 
2.11.0

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
  2019-05-26 22:17 ` Noam Postavsky
@ 2019-05-27 9:18 ` Vincent Lefevre
  2019-05-27 12:02 ` Noam Postavsky
  0 siblings, 1 reply; 42+ messages in thread
From: Vincent Lefevre @ 2019-05-27 9:18 UTC
  To: Noam Postavsky; +Cc: Stefan Monnier, 33887

On 2019-05-26 18:17:55 -0400, Noam Postavsky wrote:
> And about the highlighting of quoted text outside tags, we can just
> disable fontification, while leaving the syntax code untouched:
[...]

I've applied it with a minor change against Emacs 26 (context lines
for hunk #1 of sgml-mode.el are different), but the comments are
no longer highlighted as comments.

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
  2019-05-27 9:18 ` Vincent Lefevre
@ 2019-05-27 12:02 ` Noam Postavsky
  2019-05-29 0:30 ` Vincent Lefevre
  0 siblings, 1 reply; 42+ messages in thread
From: Noam Postavsky @ 2019-05-27 12:02 UTC
  To: Vincent Lefevre; +Cc: Stefan Monnier, 33887

[-- Attachment #1: Type: text/plain, Size: 864 bytes --]

Vincent Lefevre <vincent@vinc17.net> writes:

> On 2019-05-26 18:17:55 -0400, Noam Postavsky wrote:
>> And about the highlighting of quoted text outside tags, we can just
>> disable fontification, while leaving the syntax code untouched:
> [...]
>
> I've applied it with a minor change against Emacs 26 (context lines
> for hunk #1 of sgml-mode.el are different), but the comments are
> no longer highlighted as comments.

Ah, I guess reusing the default font-lock-syntactic-face-function
doesn't really make sense after all. So sgml-font-lock-syntactic-face
should be like this:

    (defun sgml-font-lock-syntactic-face (state)
      "`font-lock-syntactic-face-function' for `sgml-mode'."
      ;; Don't use string face outside of tags.
      (cond ((and (nth 9 state) (nth 3 state)) font-lock-string-face)
            ((nth 4 state) font-lock-comment-face)))

[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 4188 bytes --]

From 0c3e6a97f92dec31e7e186dae933c86700034089 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Sun, 26 May 2019 17:41:22 -0400
Subject: [PATCH] Don't fontify text outside of SGML/XML tags (Bug#33887)

* lisp/font-lock.el (font-lock-syntactic-face-function-default): New
function.
(font-lock-syntactic-face-function): Use it as default value.
* lisp/textmodes/sgml-mode.el (sgml-font-lock-syntactic-face): New
function.
(sgml-mode):
* lisp/nxml/nxml-mode.el (nxml-mode): Use it as
font-lock-syntactic-face-function value.
---
 lisp/font-lock.el           |  7 +++++--
 lisp/nxml/nxml-mode.el      |  4 +++-
 lisp/textmodes/sgml-mode.el | 12 ++++++++++--
 3 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/lisp/font-lock.el b/lisp/font-lock.el
index 3991a4ee8e..ddf1cbdb9f 100644
--- a/lisp/font-lock.el
+++ b/lisp/font-lock.el
@@ -527,9 +527,12 @@ (defvar font-lock-syntactically-fontified 0
 sometimes be slightly incorrect.")
 (make-variable-buffer-local 'font-lock-syntactically-fontified)
 
+(defun font-lock-syntactic-face-function-default (state)
+  "Default value for `font-lock-syntactic-face-function'."
+  (if (nth 3 state) font-lock-string-face font-lock-comment-face))
+
 (defvar font-lock-syntactic-face-function
-  (lambda (state)
-    (if (nth 3 state) font-lock-string-face font-lock-comment-face))
+  #'font-lock-syntactic-face-function-default
   "Function to determine which face to use when fontifying syntactically.
 The function is called with a single parameter (the state as returned by
 `parse-partial-sexp' at the beginning of the region to highlight) and
diff --git a/lisp/nxml/nxml-mode.el b/lisp/nxml/nxml-mode.el
index da01b2a342..05044d66df 100644
--- a/lisp/nxml/nxml-mode.el
+++ b/lisp/nxml/nxml-mode.el
@@ -551,7 +551,9 @@ (define-derived-mode nxml-mode text-mode "nXML"
            nil ; no special syntax table
            (font-lock-extend-region-functions . (nxml-extend-region))
            (jit-lock-contextually . t)
-           (font-lock-unfontify-region-function . nxml-unfontify-region)))
+           (font-lock-unfontify-region-function . nxml-unfontify-region)
+           (font-lock-syntactic-face-function
+            . sgml-font-lock-syntactic-face)))
 
   (with-demoted-errors (rng-nxml-mode-init)))
 
diff --git a/lisp/textmodes/sgml-mode.el b/lisp/textmodes/sgml-mode.el
index 1df7e78afc..da25665e62 100644
--- a/lisp/textmodes/sgml-mode.el
+++ b/lisp/textmodes/sgml-mode.el
@@ -329,6 +329,12 @@ (defconst sgml-font-lock-keywords-2
 (defvar sgml-font-lock-keywords sgml-font-lock-keywords-1
   "Rules for highlighting SGML code. See also `sgml-tag-face-alist'.")
 
+(defun sgml-font-lock-syntactic-face (state)
+  "`font-lock-syntactic-face-function' for `sgml-mode'."
+  ;; Don't use string face outside of tags.
+  (cond ((and (nth 9 state) (nth 3 state)) font-lock-string-face)
+        ((nth 4 state) font-lock-comment-face)))
+
 (defvar-local sgml--syntax-propertize-ppss nil)
 
 (defun sgml--syntax-propertize-ppss (pos)
@@ -573,7 +579,7 @@ (define-derived-mode sgml-mode text-mode '(sgml-xml-mode "XML" "SGML")
   ;; This is desirable because SGML discards a newline that appears
   ;; immediately after a start tag or immediately before an end tag.
   (setq-local paragraph-start (concat "[ \t]*$\\|\
-[ \t]*</?\\(" sgml-name-re sgml-attrs-re "\\)?>"))
+\[ \t]*</?\\(" sgml-name-re sgml-attrs-re "\\)?>"))
   (setq-local paragraph-separate (concat paragraph-start "$"))
   (setq-local adaptive-fill-regexp "[ \t]*")
   (add-hook 'fill-nobreak-predicate 'sgml-fill-nobreak nil t)
@@ -591,7 +597,9 @@ (define-derived-mode sgml-mode text-mode '(sgml-xml-mode "XML" "SGML")
   (setq font-lock-defaults '((sgml-font-lock-keywords
                               sgml-font-lock-keywords-1
                               sgml-font-lock-keywords-2)
-                             nil t))
+                             nil t nil
+                             (font-lock-syntactic-face-function
+                              . sgml-font-lock-syntactic-face)))
   (setq-local syntax-propertize-function #'sgml-syntax-propertize)
   (setq-local facemenu-add-face-function 'sgml-mode-facemenu-add-face-function)
   (setq-local sgml-xml-mode (sgml-xml-guess))
-- 
2.11.0
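
To illustrate why the first variant of sgml-font-lock-syntactic-face lost comment highlighting while the revised one keeps it, here is a small sketch that applies both to hand-built parser states. Everything named demo-* is hypothetical. The states are made-up 10-element lists with only slots 3 (in string), 4 (in comment) and 9 (open-paren positions) filled in, and faces are returned as quoted symbols so the snippet runs without font-lock loaded. It also assumes that a comment sitting between tags has no enclosing open paren, i.e. (nth 9 state) is nil there.

```elisp
;; Sketch only: all demo-* names are hypothetical.
(defun demo-state (in-string in-comment open-parens)
  "Build a fake 10-element parser state with only slots 3, 4 and 9 set."
  (list 0 nil nil in-string in-comment nil nil nil nil open-parens))

(defun demo-face-v1 (state)
  "First variant: any syntactic face, but only inside a tag."
  (and (nth 9 state)
       (if (nth 3 state) 'font-lock-string-face 'font-lock-comment-face)))

(defun demo-face-v2 (state)
  "Revised variant: strings only inside tags, comments everywhere."
  (cond ((and (nth 9 state) (nth 3 state)) 'font-lock-string-face)
        ((nth 4 state) 'font-lock-comment-face)))

(let ((attr-string  (demo-state ?\" nil '(1)))  ; quoted attribute value
      (text-string  (demo-state ?\" nil nil))   ; quoted text in a text node
      (text-comment (demo-state nil t nil)))    ; comment between tags
  (list (demo-face-v1 attr-string)  (demo-face-v2 attr-string)
        (demo-face-v1 text-string)  (demo-face-v2 text-string)
        (demo-face-v1 text-comment) (demo-face-v2 text-comment)))
;; => (font-lock-string-face font-lock-string-face nil nil
;;     nil font-lock-comment-face)
```

Under these assumptions both variants agree on strings; only the comment case differs, which matches the symptom reported for the first patch.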

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
  2019-05-27 12:02 ` Noam Postavsky
@ 2019-05-29 0:30 ` Vincent Lefevre
  2019-06-04 12:55 ` Noam Postavsky
  0 siblings, 1 reply; 42+ messages in thread
From: Vincent Lefevre @ 2019-05-29 0:30 UTC
  To: Noam Postavsky; +Cc: Stefan Monnier, 33887

Thanks. A last issue: a comment before the root element is not
highlighted. Example: in

<?xml version="1.0" encoding="utf-8"?>
<!-- comment -->
<root>
<!-- comment -->
</root>
<!-- comment -->

the first comment is not highlighted, but the other two comments are.

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
  2019-05-29 0:30 ` Vincent Lefevre
@ 2019-06-04 12:55 ` Noam Postavsky
  0 siblings, 0 replies; 42+ messages in thread
From: Noam Postavsky @ 2019-06-04 12:55 UTC
  To: Vincent Lefevre; +Cc: Stefan Monnier, 33887

tags 33887 fixed
close 33887 27.1
quit

Vincent Lefevre <vincent@vinc17.net> writes:

> Thanks. A last issue: a comment before the root element is not
> highlighted. Example: in
>
> <?xml version="1.0" encoding="utf-8"?>
> <!-- comment -->
> <root>
> <!-- comment -->
> </root>
> <!-- comment -->
>
> the first comment is not highlighted, but the other two comments are.

This was followed up in https://debbugs.gnu.org/32823#45

I'm pushing the current patches to master and closing this bug, as I
think all the issues here are resolved (if not, we can open new bugs).

e04f93e18a 2019-06-04T08:42:50-04:00 "Don't fontify text outside of SGML/XML tags (Bug#33887)"
https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=e04f93e18a8083d3a4930decc523c4f5d9a97c9e

438e4804d1 2019-06-04T08:42:50-04:00 "Fix some SGML syntax edge cases (Bug#33887)"
https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=438e4804d107720f526d0c7c367cbd029f264676

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
  2019-05-19 19:24 ` Stefan Monnier
  2019-05-20 20:47 ` Noam Postavsky
@ 2019-05-22 21:44 ` Stefan Monnier
  1 sibling, 0 replies; 42+ messages in thread
From: Stefan Monnier @ 2019-05-22 21:44 UTC
  To: Noam Postavsky; +Cc: Vincent Lefevre, 33887

>>     <t>>1</t>
> Oh, damn!

Hmm... Maybe the best way to detect this is using `parse-partial-sexp`
passing it a `targetdepth` of -1. The trick will be when/where to call
it so it's cheap enough.


        Stefan
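
A rough sketch of the `targetdepth` idea from the message above, for illustration only. It assumes a toy syntax table in which < and > are plain paren delimiters, not the real sgml-mode setup, and `demo-find-lone-close' is a hypothetical name. It says nothing about when or where such a call would be cheap enough, which is the open question raised in the message.

```elisp
;; Sketch only: `demo-find-lone-close' is a hypothetical helper, and the
;; syntax table below is a simplified stand-in for sgml-mode's.
(defun demo-find-lone-close (text)
  "Return the position of the first unmatched \">\" in TEXT, or nil."
  (with-temp-buffer
    (let ((table (make-syntax-table)))
      ;; Treat < and > as a matching pair of parens.
      (modify-syntax-entry ?< "(>" table)
      (modify-syntax-entry ?> ")<" table)
      (set-syntax-table table))
    (insert text)
    ;; A TARGETDEPTH of -1 stops the scan as soon as depth drops to -1.
    (let ((state (parse-partial-sexp (point-min) (point-max) -1)))
      (when (= (nth 0 state) -1)
        ;; Point is left just after the offending close paren.
        (1- (point))))))

(demo-find-lone-close "<t>>1</t>")  ; => 4
(demo-find-lone-close "<t>1</t>")   ; => nil
```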
* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode 2019-05-18 18:49 ` Noam Postavsky 2019-05-19 0:17 ` Vincent Lefevre @ 2019-05-20 11:47 ` Vincent Lefevre 1 sibling, 0 replies; 42+ messages in thread From: Vincent Lefevre @ 2019-05-20 11:47 UTC (permalink / raw) To: Noam Postavsky; +Cc: Stefan Monnier, 33887 There's an issue with the following XML file, which does not have any special character, except a single quote in the middle of the text. <root> <a>12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789'012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890 </a> </root> Note that the newline character before the </a> is important. -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 42+ messages in thread

* bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode
  2019-05-15 23:53 ` Noam Postavsky
  2019-05-16 10:54 ` Vincent Lefevre
  2019-05-16 12:15 ` Noam Postavsky
@ 2019-05-16 14:01 ` Eli Zaretskii
  2 siblings, 0 replies; 42+ messages in thread
From: Eli Zaretskii @ 2019-05-16 14:01 UTC
  To: Noam Postavsky; +Cc: vincent, 33887

> From: Noam Postavsky <npostavs@gmail.com>
> Date: Wed, 15 May 2019 19:53:08 -0400
> Cc: 33887@debbugs.gnu.org
>
> Vincent Lefevre <vincent@vinc17.net> writes:
>
> > This is a regression: Emacs 25 did not hang at all.
>
> Should we backport Stefan's fix to emacs-26? Or specifically, backport
> [1: e7e92dc5d2], which is Stefan's fix on top of my fix for the
> loss-of-single-quote-fontification bug (Bug#35381).
>
> [1: e7e92dc5d2]: 2019-05-15 19:04:14 -0400
>   Fix merge of sgml-syntax-propertize-rules
>   https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=e7e92dc5d24ac3bcde69732bab6a6c3c0d9de97b

I'd like to leave this fix on master for a while, so that we could
make sure it has no adverse consequences. Can we revisit this in a
month's time, say?