unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#7172: emacs 23.2; xml.el: xml-parse-file hangs when DOCTYPE element names contain _ (underscore)
@ 2010-10-07 17:07 Jose Marino
  2010-10-08  0:39 ` Glenn Morris
  2012-07-01 10:59 ` Chong Yidong
  0 siblings, 2 replies; 3+ messages in thread
From: Jose Marino @ 2010-10-07 17:07 UTC
  To: 7172

In a DOCTYPE declaration, whenever an ELEMENT declaration has an
underscore in its name, xml-parse-file makes Emacs unresponsive and
use 100% CPU. Emacs recovers with C-g, but no error is printed.

To reproduce this behavior I set up these two simple XML files:

------------ output --------------
$ cat example-good.xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE EXAMPLE [
    <!ELEMENT EXAMPLE EMPTY>
]>
<EXAMPLE>
</EXAMPLE>

$ cat example-bad.xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE EXAMPLE [
    <!ELEMENT EXAM_PLE EMPTY>
]>
<EXAM_PLE>
</EXAM_PLE>
------------ output --------------

Then from Emacs I evaluate:
(xml-parse-file "example-good.xml")
which, as expected, returns:
((EXAMPLE nil "
"))

But when I do the same for the other file:
(xml-parse-file "example-bad.xml")
No output is produced and Emacs becomes unresponsive.

Attaching strace to the running Emacs process shows:
brk(0x267b000)                          = 0x267b000
brk(0x269d000)                          = 0x269d000
brk(0x2637000)                          = 0x2637000
brk(0x2659000)                          = 0x2659000
brk(0x267b000)                          = 0x267b000
brk(0x269d000)                          = 0x269d000
brk(0x2637000)                          = 0x2637000
brk(0x2659000)                          = 0x2659000
brk(0x267b000)                          = 0x267b000
brk(0x269d000)                          = 0x269d000
brk(0x2637000)                          = 0x2637000
brk(0x2659000)                          = 0x2659000

These messages repeat over and over.

This behavior seems to be triggered by the underscore in the name
given in the DOCTYPE's ELEMENT declaration; whether the actual
element's name contains an underscore makes no difference. Thus, this
file also triggers the bug:
$ cat example-bad2.xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE EXAMPLE [
    <!ELEMENT EXAM_PLE EMPTY>
]>
<EXAMPLE>
</EXAMPLE>
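
The same comparison can be tried without files on disk. This is only a
minimal sketch: the names my-xml-parse-string and my-xml-good-doc are
made up for illustration and are not part of xml.el, while
xml-parse-region is the buffer-level entry point that xml.el provides.

```elisp
(require 'xml)

(defun my-xml-parse-string (xml-text)
  "Parse XML-TEXT in a temporary buffer and return the parse tree.
Hypothetical helper for illustration, not part of xml.el."
  (with-temp-buffer
    (insert xml-text)
    (xml-parse-region (point-min) (point-max))))

(defconst my-xml-good-doc
  "<?xml version=\"1.0\" encoding=\"utf-8\"?>
<!DOCTYPE EXAMPLE [
    <!ELEMENT EXAMPLE EMPTY>
]>
<EXAMPLE>
</EXAMPLE>
"
  "Contents of example-good.xml from this report.")

;; The good document parses normally.
(my-xml-parse-string my-xml-good-doc)

;; Same document as example-bad2.xml: only the ELEMENT name changes.
;; Left commented out because, on an affected Emacs, evaluating it is
;; expected to hang until interrupted with C-g.
;; (my-xml-parse-string
;;  (replace-regexp-in-string "ELEMENT EXAMPLE" "ELEMENT EXAM_PLE"
;;                            my-xml-good-doc))
```

Keeping the document in a string makes it easy to try other ELEMENT
names one at a time.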






* bug#7172: emacs 23.2; xml.el: xml-parse-file hangs when DOCTYPE element names contain _ (underscore)
From: Glenn Morris @ 2010-10-08  0:39 UTC
  To: Jose Marino; +Cc: 7172

Jose Marino wrote:

> Attaching strace to the running emacs process prints:
> brk(0x267b000)                          = 0x267b000

A much more useful thing to do in such cases is to
M-x toggle-debug-on-quit
beforehand, then interrupt Emacs with C-g when it hangs. Resulting backtrace:

Debugger entered--Lisp error: (quit)
  looking-at("<!ATTLIST[ 	\n\r]*\\([[:alpha:]:_][-[:digit:].[:alpha:]:_]*\\)[ 	\n\r]*\\(\\(?:[ 	\n\r]*[[:alpha:]:_][-[:digit:].[:alpha:]:_]*[ 	\n\r]*\\(?:CDATA\\|\\(?:ID\\|IDREF\\|IDREFS\\|ENTITY\\|ENTITIES\\|NMTOKEN\\|NMTOKENS\\)\\|\\(?:NOTATION[ 	\n\r]([ 	\n\r]*[[:alpha:]:_][-[:digit:].[:alpha:]:_]*\\(?:[ 	\n\r]*|[ 	\n\r]*[[:alpha:]:_][-[:digit:].[:alpha:]:_]*\\)*[ 	\n\r]*)\\)\\|\\(?:\\(?:NOTATION[ 	\n\r]([ 	\n\r]*[[:alpha:]:_][-[:digit:].[:alpha:]:_]*\\(?:[ 	\n\r]*|[ 	\n\r]*[[:alpha:]:_][-[:digit:].[:alpha:]:_]*\\)*[ 	\n\r]*)\\)\\|\\(?:([ 	\n\r]*[-[:digit:].[:alpha:]:_]+\\(?:[ 	\n\r]*|[ 	\n\r]*[-[:digit:].[:alpha:]:_]+\\)*[ 	\n\r])\\)\\)\\)[ 	\n\r]*\\(?:#REQUIRED\\|#IMPLIED\\|\\(?:#FIXED[ 	\n\r]\\)*\\(?:\"\\(?:[^&\"]\\|\\(?:&[[:alpha:]:_][-[:digit:].[:alpha:]:_]*;\\|\\(?:&#[0-9]+;\\|&#x[0-9a-fA-F]+;\\)\\)\\)*\"\\|'\\(?:[^&']\\|\\(?:&[[:alpha:]:_][-[:digit:].[:alpha:]:_]*;\\|\\(?:&#[0-9]+;\\|&#x[0-9a-fA-F]+;\\)\\)\\)*'\\)\\)\\)\\)*[ 	\n\r]*>")
  xml-parse-dtd(nil)
  xml-parse-tag(nil nil)
  xml-parse-tag(nil nil)
  xml-parse-region(1 116 #<buffer  *temp*> nil nil)
  xml-parse-file("example-bad.xml")


That certainly is a regexp.
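
For reference, M-x toggle-debug-on-quit toggles the variable
debug-on-quit, so the same setup can also be done from Lisp. A minimal
sketch, assuming the reproduction file is in the current directory;
the helper name my-xml-parse-with-quit-debug is made up for
illustration:

```elisp
(require 'xml)

(defun my-xml-parse-with-quit-debug (file)
  "Parse FILE with `xml-parse-file', entering the debugger on C-g.
Hypothetical helper for illustration, not part of xml.el."
  ;; With `debug-on-quit' bound to t, a C-g during the call enters the
  ;; debugger and shows a backtrace instead of quitting silently.
  (let ((debug-on-quit t))
    (xml-parse-file file)))
```

Evaluate (my-xml-parse-with-quit-debug "example-bad.xml") and press C-g
once Emacs stops responding. The let-binding keeps debug-on-quit
unchanged outside the call.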






* bug#7172: emacs 23.2; xml.el: xml-parse-file hangs when DOCTYPE element names contain _ (underscore)
From: Chong Yidong @ 2012-07-01 10:59 UTC
  To: Jose Marino; +Cc: 7172

Jose Marino <marinoj@astro.ufl.edu> writes:

> In a DOCTYPE construction, whenever there's an ELEMENT name with an
> underscore in its name, function xml-parse-file makes emacs become
> unresponsive and use 100% cpu. Emacs recovers nicely with C-g but no
> error is printed.
>
> $ cat example-bad.xml
> <?xml version="1.0" encoding="utf-8"?>
> <!DOCTYPE EXAMPLE [
>    <!ELEMENT EXAM_PLE EMPTY>
> ]>
> <EXAM_PLE>
> </EXAM_PLE>

This is fixed in trunk.  Thanks for the bug report.






