From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Wojciech Meyer
Newsgroups: gmane.emacs.devel
Subject: Re: Problems with xml-parse-string
Date: Thu, 23 Sep 2010 22:58:14 +0100
Message-ID: <87hbhggm15.fsf@gmail.com>
References: <87pqw6d7nz.fsf@stupidchicken.com>
	<87zkvaiked.fsf@stupidchicken.com>
	<87vd5ymptn.fsf@stupidchicken.com>
	<87vd5x7ty2.fsf@stupidchicken.com>
	<87vd5wo48a.fsf@stupidchicken.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Cc: Leo, emacs-devel@gnu.org
To: Chong Yidong
In-Reply-To: <87vd5wo48a.fsf@stupidchicken.com> (Chong Yidong's message of
	"Thu, 23 Sep 2010 11:43:17 -0400")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.50
List-Id: "Emacs development discussions."
Xref: news.gmane.org gmane.emacs.devel:130705

--=-=-=
Content-Type: text/plain

Chong Yidong writes:

> Leo writes:
>
>> On 2010-09-23 00:59 +0100, Stefan Monnier wrote:
>>> FWIW, while I haven't used sml.el much, the little bit I've used it was
>>> not particularly pleasant, partly because of the odd format.  I don't
>>> know how/why the xml.el was chosen and how much thought was put into
>>> it, but my experience with it is not 100% positive.
>>
>> That looks like my experience too.
>
> The main differences in the "new" format are (i) listing attributes as
> (:foo bar) inside the element list, rather than in an alist after the
> element name, (ii) listing text as (text "foo") rather than "foo", and
> (iii) the as-yet-unresolved issue with XML namespaces, which probably
> needs to be fixed in xml.c.
>
> Point (i) is a broken design choice, as I already pointed out.  As for
> (ii), it is a little nicer to take the cdr of each list member without
> checking for stringp.  If others think this is a really good change, I
> won't object, though it seems pretty trivial to me.  We can add an
> optional flag to the xml-* functions to toggle between the two
> representations.

This patch fixes all the problems above and gives an SXML-conforming
representation of the element tree.  Obviously we would also need to
patch `xml.el' and provide an interface for accessing tree elements.

Thanks,
Wojciech

--=-=-=
Content-Type: text/x-diff
Content-Disposition: inline; filename=0001-Support-for-SXML-AST.patch

>From 7db230f57fe9b7904d4d55e1fbe90a7522bd38a5 Mon Sep 17 00:00:00 2001
From: Wojciech Meyer
Date: Thu, 23 Sep 2010 22:45:32 +0100
Subject: [PATCH] Support for SXML AST.

* xml.c (make_dom): Make output to conform with SXML spec.
* ChangeLog: Add entry.
Signed-off-by: Wojciech Meyer
---
 src/ChangeLog |    5 ++++
 src/xml.c     |   60 ++++++++++++++++++++++++++++++++++++++++++++------------
 2 files changed, 52 insertions(+), 13 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index 2dd892f..b11d291 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,8 @@
+2010-09-23  Wojciech Meyer
+
+	* xml.c (make_dom): Make output to conform with
+	SXML spec.
+
 2010-09-22  Eli Zaretskii
 
 	* editfns.c (Fsubst_char_in_region, Ftranslate_region_internal)
diff --git a/src/xml.c b/src/xml.c
index 5829f1d..9dc0931 100644
--- a/src/xml.c
+++ b/src/xml.c
@@ -28,7 +28,8 @@ along with GNU Emacs.  If not, see .  */
 #include "lisp.h"
 #include "buffer.h"
 
-Lisp_Object make_dom (xmlNode *node)
+static Lisp_Object
+make_dom (xmlNode *node)
 {
   if (node->type == XML_ELEMENT_NODE)
     {
@@ -36,27 +37,60 @@ Lisp_Object make_dom (xmlNode *node)
       xmlNode *child;
       xmlAttr *property;
       Lisp_Object plist = Qnil;
+      int was_element_node = 0;
+
+      /* First add the attributes. */
 
-      /* First add the attributes. */
       property = node->properties;
-      while (property != NULL)
+
+      /* Don't add nil if no properties */
+      if (property != NULL)
         {
-          if (property->children &&
-              property->children->content)
+          /* Add special `@' node containing properties */
+          plist = Fcons(intern("@"),Qnil);
+
+          while (property != NULL)
             {
-              plist = Fcons (Fcons (intern (property->name),
-                                    build_string (property->children->content)),
-                             plist);
+              if (property->children &&
+                  property->children->content)
+                {
+                  plist =
+                    Fcons
+                    (Fcons (intern (property->name),
+                            Fcons (build_string (property->children->content),
+                                   Qnil)),
+                     plist);
+                }
+              property = property->next;
             }
-          property = property->next;
+          result = Fcons (Fnreverse (plist), result);
         }
-      result = Fcons (Fnreverse (plist), result);
-
       /* Then add the children of the node. */
+
       child = node->children;
+
+      /* First try to lookup for elements
+         if any found, prohibit adding any text elements */
+
       while (child != NULL)
         {
-          result = Fcons (make_dom (child), result);
+          if (child->type == XML_ELEMENT_NODE)
+            {
+              was_element_node = 1;
+              result = Fcons (make_dom (child), result);
+            }
+
+          child = child->next;
+        }
+
+      child = node->children;
+
+      while (!was_element_node && child != NULL)
+        {
+
+          if ( child->type == XML_TEXT_NODE )
+            result = Fcons (make_dom (child), result);
+
           child = child->next;
         }
 
@@ -73,7 +107,7 @@ Lisp_Object make_dom (xmlNode *node)
   return Qnil;
 }
 
-static Lisp_Object
+INLINE static Lisp_Object
 parse_string (Lisp_Object string, Lisp_Object base_url, int htmlp)
 {
   xmlDoc *doc;
-- 
1.7.0.4

--=-=-=--
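
To make the resulting shape easier to picture, here is a minimal standalone sketch of the two rules the patch applies: attributes are grouped under a leading `@' list, and text children are kept only when the element has no element children. It does not use libxml2 or Emacs internals. The `Attr' and `Node' structs and the names `emit', `sxml_render' and `sxml_demo' are hypothetical stand-ins invented for this illustration, and the output is printed text rather than Lisp objects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for libxml2's xmlAttr and xmlNode, reduced to
   the fields this sketch needs.  */
typedef struct Attr {
  const char *name, *value;
  const struct Attr *next;
} Attr;

typedef struct Node {
  const char *name;             /* element name; NULL marks a text node */
  const char *text;             /* content of a text node */
  const Attr *attrs;
  const struct Node *children, *next;
} Node;

/* Append S to the NUL-terminated string in OUT, never writing past CAP
   bytes (the result is truncated if it does not fit).  */
static void
emit (char *out, size_t cap, const char *s)
{
  size_t used = strlen (out);
  assert (used < cap);
  snprintf (out + used, cap - used, "%s", s);
}

/* Append NODE to OUT in SXML notation.  String contents are not
   escaped; this is only an illustration.  */
static void
sxml_render (const Node *node, char *out, size_t cap)
{
  const Attr *a;
  const Node *c;
  int has_element_child = 0;

  if (node->name == NULL)
    {
      emit (out, cap, "\"");
      emit (out, cap, node->text);
      emit (out, cap, "\"");
      return;
    }

  emit (out, cap, "(");
  emit (out, cap, node->name);

  /* Attributes go into a `@' list, omitted when there are none.  */
  if (node->attrs != NULL)
    {
      emit (out, cap, " (@");
      for (a = node->attrs; a != NULL; a = a->next)
        {
          emit (out, cap, " (");
          emit (out, cap, a->name);
          emit (out, cap, " \"");
          emit (out, cap, a->value);
          emit (out, cap, "\")");
        }
      emit (out, cap, ")");
    }

  /* As in the patch: if any child is an element, keep only element
     children; otherwise keep only the text children.  */
  for (c = node->children; c != NULL; c = c->next)
    if (c->name != NULL)
      has_element_child = 1;

  for (c = node->children; c != NULL; c = c->next)
    if ((c->name != NULL) == has_element_child)
      {
        emit (out, cap, " ");
        sxml_render (c, out, cap);
      }

  emit (out, cap, ")");
}

/* Render <p class="x"><b>hi</b>tail</p> into a static buffer.  */
static const char *
sxml_demo (void)
{
  static const Attr cls = { "class", "x", NULL };
  static const Node hi = { NULL, "hi", NULL, NULL, NULL };
  static const Node tail = { NULL, "tail", NULL, NULL, NULL };
  static const Node b = { "b", NULL, NULL, &hi, &tail };
  static const Node p = { "p", NULL, &cls, &b, NULL };
  static char buf[128];

  buf[0] = '\0';
  sxml_render (&p, buf, sizeof buf);
  return buf;
}
```

Calling `sxml_demo ()' yields the string (p (@ (class "x")) (b "hi")). The trailing text "tail" is dropped because <p> has an element child, which is the same trade-off the patch makes for mixed content.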