* bug#26533: 26.0.50; xml-parse-region's symbol-qname argument is ignored
From: Christopher Wellons @ 2017-04-16 12:36 UTC (permalink / raw)
  To: 26533


A bug was introduced in aea67018 that causes the special "symbol-qnames"
value for PARSE-NS to be ignored, as if it were nil. This information is
discarded by the change to xml-parse-attlist, so functions further down
the line see the argument as if it was set to nil.

Here's an example of the bug:

    (with-temp-buffer
      (insert "<root a:b='c'></root>")
      (let ((xml-default-ns ()))
        (xml-parse-region nil nil nil nil 'symbol-qnames)))

Prior to this commit (Emacs 25.1 and earlier) the result is:

    ((root ((b . "c"))))

After this commit:

    ((root ((a:b . "c"))))

This is the same as PARSE-NS being set to nil.
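
The expected behavior could be pinned down with a small ERT regression test. This is only a sketch: the helper and test names are hypothetical, and the expected value is the Emacs 25.1 result quoted above.

```elisp
(require 'ert)
(require 'xml)   ; load xml.el before `xml-default-ns' is let-bound

(defun bug26533-parse-string (string)
  "Parse STRING with `symbol-qnames' and an empty namespace alist.
Hypothetical helper that mirrors the snippet from the report."
  (with-temp-buffer
    (insert string)
    (let ((xml-default-ns ()))
      (xml-parse-region nil nil nil nil 'symbol-qnames))))

(ert-deftest bug26533-symbol-qnames-honored ()
  "The unknown prefix on the attribute is dropped, as in Emacs 25.1."
  (should (equal (bug26533-parse-string "<root a:b='c'></root>")
                 '((root ((b . "c")))))))
```

On a build affected by the bug, the comparison would fail because the attribute name comes back as the symbol a:b instead.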






From: David Engster @ 2017-04-17 15:33 UTC (permalink / raw)
  To: Christopher Wellons; +Cc: 26533-done

Christopher Wellons writes:
> A bug was introduced in aea67018 that causes the special "symbol-qnames"
> value for PARSE-NS to be ignored, as if it were nil. This information is
> discarded by the change to xml-parse-attlist, so functions further down
> the line see the argument as if it was set to nil.
>
> Here's an example of the bug:
>
>     (with-temp-buffer
>       (insert "<root a:b='c'></root>")
>       (let ((xml-default-ns ()))
>         (xml-parse-region nil nil nil nil 'symbol-qnames)))
>
> Prior to this commit (Emacs 25.1 and earlier) the result is:
>
>     ((root ((b . "c"))))
>
> After this commit:
>
>     ((root ((a:b . "c"))))
>
> This is the same as PARSE-NS being set to nil.

Thanks for the report.

You are right that the fix for bug #23440 was not correct. I have now
pushed a hopefully better version to master.

Note, however, that your test above has two problems. First, it's
invalid XML, since it uses an undeclared prefix (strictly, the parser
should signal an error, but I'm not eager to make the XML parser
stricter, as there's a lot of invalid XML in the wild). Second, I don't
understand why you let-bind `xml-default-ns' to nil. This breaks
namespace expansion, and it will do so for the whole Emacs session if
xml.el gets autoloaded while the binding is in effect.

-David






From: Christopher Wellons @ 2017-04-17 16:29 UTC (permalink / raw)
  To: David Engster; +Cc: 26533-done


Thanks, David! Your fix works fine as far as I can tell.

I'm using this trick in Elfeed (a syndication feed reader) as a fast
method to strip all namespaces from the XML as it's being parsed. As you
said, there's a lot of invalid XML in the wild. I've found it works a
lot better to ignore namespaces and strictness, instead extracting the
required information heuristically as long as it's reasonably close.
Otherwise there would be a whole lot more feeds that wouldn't work well,
or at all, in Elfeed.

I had noticed with symbol-qnames that xml-parse-region drops unknown
namespaces. Since this information comes from an alist, that seemed like
reasonable behavior and I assumed it was intentional -- though signaling
an error would also be reasonable. To tightly control which namespaces
are stripped, I bind xml-default-ns to my own alist for that call. This
feels like the natural and lispy way to use this function.

The file that binds xml-default-ns requires the xml package explicitly,
so there's no risk of it autoloading while it's bound. Though that's an
interesting consequence I hadn't considered before. I _have_ seen
similar issues with accept-process-output when arbitrary process events
are handled while the stack is in an unusual state.
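
A minimal sketch of that arrangement, assuming a hypothetical wrapper named my-xml-parse-region-with-ns (this is not Elfeed's actual code). The file requires xml at load time, so `xml-default-ns' is already defined before the wrapper let-binds it, and the binding only lasts for the one call:

```elisp
(require 'xml)  ; loaded up front, so nothing autoloads inside the `let'

(defun my-xml-parse-region-with-ns (ns-alist &optional beg end buffer)
  "Parse BUFFER from BEG to END with NS-ALIST as the namespace alist.
Hypothetical helper.  `xml-default-ns' is bound to NS-ALIST only for
the duration of this call; passing nil gives the parser no known
prefixes at all."
  (let ((xml-default-ns ns-alist))
    (xml-parse-region beg end buffer nil 'symbol-qnames)))

;; Usage: parse the current buffer with an empty namespace alist.
(with-temp-buffer
  (insert "<root a:b='c'></root>")
  (my-xml-parse-region-with-ns ()))
```

Because the binding is dynamic and xml.el is already loaded, the global value of `xml-default-ns' is untouched once the call returns.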




