From: Christopher Wellons
Newsgroups: gmane.emacs.bugs
Subject: bug#26533: 26.0.50; xml-parse-region's symbol-qname argument is ignored
Date: Mon, 17 Apr 2017 12:29:15 -0400
Message-ID: <87efwr576s.fsf@wellocc1-ares.jhuapl.edu>
References: <87fuh8bkco.fsf@tengu.zeus.nullprogram.com> <874lxncamk.fsf@engster.org>
In-Reply-To: <874lxncamk.fsf@engster.org>
To: David Engster
Cc: 26533-done@debbugs.gnu.org

Thanks, David! Your fix works fine as far as I can tell.

I'm using this trick in Elfeed (a syndication feed reader) as a fast method to strip all namespaces from the XML as it's being parsed. As you said, there's a lot of invalid XML in the wild. I've found it works a lot better to ignore namespaces and strictness, instead extracting the required information heuristically as long as it's reasonably close. Otherwise there would be a whole lot more feeds that wouldn't work well, or at all, in Elfeed.

I had noticed with symbol-qnames that xml-parse-region drops unknown namespaces. Since this information comes from an alist, that seemed like reasonable behavior and I assumed it was intentional -- though signaling an error would also be reasonable. To tightly control which namespaces are stripped, I bind xml-default-ns to my own alist for that call. This feels like the natural and lispy way to use this function.

The file that binds xml-default-ns requires the xml package explicitly, so there's no risk of it autoloading while it's bound. Though that's an interesting consequence I hadn't considered before. I _have_ seen similar issues with accept-process-output when arbitrary process events are handled while the stack is in an unusual state.
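
For readers following along, the xml-default-ns binding described above can be sketched roughly as follows. This is a minimal illustration, not Elfeed's actual code: the helper name my/xml-parse-string-stripping-ns, its NS-ALIST parameter, and the sample Atom snippet are all hypothetical, and the behavior assumes an Emacs that includes the fix discussed in this thread.

```elisp
;; Load xml.el explicitly so it cannot be autoloaded while
;; `xml-default-ns' is temporarily bound below.
(require 'xml)

(defun my/xml-parse-string-stripping-ns (string &optional ns-alist)
  "Parse STRING as XML, returning element names as plain symbols.
NS-ALIST is a (PREFIX . URI) alist bound to `xml-default-ns' for
the duration of the call.  Hypothetical helper, for illustration only."
  (with-temp-buffer
    (insert string)
    ;; The binding is dynamic, so it only affects this one call.
    (let ((xml-default-ns ns-alist))
      (xml-parse-region (point-min) (point-max) nil nil 'symbol-qnames))))

;; Example input (made up for this sketch): a tiny namespaced feed.
(defvar my/example-tree
  (my/xml-parse-string-stripping-ns
   "<atom:feed xmlns:atom=\"http://www.w3.org/2005/Atom\"><atom:title>Hi</atom:title></atom:feed>"))
```

With no alist supplied, every prefix is unknown to the parser, so an element such as atom:feed should come back as the plain symbol feed. Passing an alist instead means the prefixes it lists are looked up there rather than dropped.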