From: Lars Magne Ingebrigtsen
Newsgroups: gmane.emacs.devel
Subject: Re: Problems with xml-parse-string
Date: Fri, 24 Sep 2010 18:46:11 +0200
Organization: Programmerer Ingebrigtsen
References: <87zkvaiked.fsf@stupidchicken.com>
 <87vd5ymptn.fsf@stupidchicken.com> <87vd5x7ty2.fsf@stupidchicken.com>
 <87vd5wo48a.fsf@stupidchicken.com> <8739t03q2g.fsf@stupidchicken.com>
 <87k4mb2mfu.fsf@stupidchicken.com> <87pqw3nm4y.fsf@stupidchicken.com>
To: emacs-devel@gnu.org
Xref: news.gmane.org gmane.emacs.devel:130788

Chong Yidong writes:

>> The main difference between sxml and xml.el output is that it has the
>> weird and unnecessary "@" node for the attributes and that it wastes a
>> cons in the attributes, isn't it?
>
> The xml.el output always has an alist for attributes after each tag; if
> there are no attributes, the element after the tag name is nil.  In
> sxml, the `@' denotes an attribute list, which is omitted if no
> attributes exist.

Yes.  So it's yet another irregularity you have to check for.  To take a
concrete example: You want the src of the img node you have.

xml.el:

(cdr (assq 'src (cadr node)))

sxml.el:

(if (and (consp (cadr node))
         (eq (caadr node) '@))
    (cadr (assq 'src node)))

(And I'm not even sure that's correct.  It's probably not.  Which is my
point.)

libxml:

(cdr (assq :src (cdr node)))

(The difference between libxml and xml.c for attributes is minuscule.)

>> Other than that it has the same problem that xml.el has, in that text
>> nodes have to be special-cased, so you can't say assq or use simple
>> descent without testing.
>
> It is illogical to criticize sxml for wasting conses, while arguing for
> wrapping each text node in a cons.

No, it is not.  I'm sacrificing space for speed and regularity.  sxml is
wasting cons cells and adding slowdowns at the same time.

> Anyway, it is difficult to see how real the problem is without a
> concrete example.  Could you provide one?  I suspect that the real
> problem, if one exists, is Elisp's relatively weak support for list
> mapping and reduction; if that's the case, the correct solution is to
> pull in some of the relevant functions from the CL package.

Here's a pretty piece of code, chosen at random:

(defun nnrss-find-el (tag data &optional found-list)
  "Find the all matching elements in the data.
Careful with this on large documents!"
  (when (consp data)
    (dolist (bit data)
      (when (car-safe bit)
        (when (equal tag (car bit))
          ;; Old xml.el may return a list of string.
          (when (and (consp (caddr bit))
                     (stringp (caaddr bit)))
            (setcar (cddr bit) (caaddr bit)))
          (setq found-list
                (append found-list
                        (list bit))))
        (if (and (consp (car-safe (caddr bit)))
                 (not (stringp (caddr bit))))
            (setq found-list
                  (append found-list
                          (nnrss-find-el
                           tag (caddr bit))))
          (setq found-list
                (append found-list
                        (nnrss-find-el
                         tag (cddr bit))))))))
  found-list)

The horror!

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen
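
For comparison, here is a minimal sketch (not from the original message)
of the same kind of traversal, plus the attribute lookup discussed
earlier, over xml.el-style output.  It assumes every element has the
shape (TAG ATTR-ALIST CHILD...) and that text nodes are bare strings.
The names my-xml-attr and my-xml-find-elements are hypothetical, and
unlike nnrss-find-el it takes a single node rather than a list of nodes.
The consp test is the special-casing of text nodes that the thread is
arguing about.

```elisp
;; Hypothetical helpers for illustration; not part of xml.el or Gnus.
;; Assumes xml.el-style nodes: (TAG ATTR-ALIST CHILD...), where each
;; CHILD is either another node or a bare string (a text node).

(defun my-xml-attr (node attr)
  "Return the value of attribute ATTR of NODE, or nil if absent."
  (cdr (assq attr (cadr node))))

(defun my-xml-find-elements (tag node)
  "Return a list of all elements named TAG in NODE, in document order."
  (when (consp node)                    ; text nodes are strings; skip them
    (let ((found (and (eq (car node) tag) (list node))))
      (dolist (child (cddr node))       ; children follow the attribute alist
        (setq found (append found (my-xml-find-elements tag child))))
      found)))

;; Usage: collect the src of every img element in a small tree.
(let ((doc '(p nil "See "
               (img ((src . "a.png") (alt . "A")))
               (span nil (img ((src . "b.png")))))))
  (mapcar (lambda (img) (my-xml-attr img 'src))
          (my-xml-find-elements 'img doc)))
;; => ("a.png" "b.png")
```

Because the attribute alist is always present (possibly nil) in this
format, the attribute lookup needs no shape test; only the descent into
children has to check for strings.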