all messages for Emacs-related lists mirrored at yhetil.org
* Problems with xml-parse-string
@ 2010-09-14 18:11 Leo
  2010-09-14 18:24 ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 100+ messages in thread
From: Leo @ 2010-09-14 18:11 UTC (permalink / raw)
  To: emacs-devel

Hello,

xml-parse-string is already defined in xml.el. How about renaming these
two primitives to parse-html-string and parse-xml-string?

xml.el and the primitive xml-parse-string do not return the same
structure. But I don't know which one is better.

I use the primitive in a function like this

(defun test ()
  (xml-parse-string (buffer-substring (point) (point-max))))

Somehow it can return the symbol `buffer-substring' or `edebug-after'
after alternating between C-u C-M-x and C-M-x on that function. Unfortunately I
can't find a way to reliably reproduce it. Any wild guess where the
problem might be? Passing an empty string (xml-parse-string "") returns
#<objfwd to nil>, is this normal?

Leo




^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-14 18:11 Problems with xml-parse-string Leo
@ 2010-09-14 18:24 ` Lars Magne Ingebrigtsen
  2010-09-14 21:18   ` Stefan Monnier
  0 siblings, 1 reply; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-14 18:24 UTC (permalink / raw)
  To: emacs-devel

Leo <sdl.web@gmail.com> writes:

> xml-parse-string is already defined in xml.el. How about renaming these
> two primitives to parse-html-string and parse-xml-string?

Makes sense to me.

> Passing an empty string (xml-parse-string "") returns #<objfwd to
> nil>, is this normal?

No.  I'm checking in a fix now.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: Problems with xml-parse-string
  2010-09-14 18:24 ` Lars Magne Ingebrigtsen
@ 2010-09-14 21:18   ` Stefan Monnier
  2010-09-15  8:06     ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 100+ messages in thread
From: Stefan Monnier @ 2010-09-14 21:18 UTC (permalink / raw)
  To: emacs-devel

>> xml-parse-string is already defined in xml.el. How about renaming these
>> two primitives to parse-html-string and parse-xml-string?
> Makes sense to me.

If you're renaming them, then please add a "libxml-" prefix of some sort
while you're at it.


        Stefan




* Re: Problems with xml-parse-string
  2010-09-14 21:18   ` Stefan Monnier
@ 2010-09-15  8:06     ` Lars Magne Ingebrigtsen
  2010-09-15  8:51       ` Stefan Monnier
  0 siblings, 1 reply; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-15  8:06 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier <monnier@IRO.UMontreal.CA> writes:

> If you're renaming them, then please add a "libxml-" prefix of some sort
> while you're at it.

Isn't that being slightly over-specific?  I mean, it's conceivable that
we would switch out the library used for the parsing with something else
at some point.  Conceivable, if unlikely.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: Problems with xml-parse-string
  2010-09-15  8:06     ` Lars Magne Ingebrigtsen
@ 2010-09-15  8:51       ` Stefan Monnier
  2010-09-15  9:21         ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 100+ messages in thread
From: Stefan Monnier @ 2010-09-15  8:51 UTC (permalink / raw)
  To: emacs-devel

>> If you're renaming them, then please add a "libxml-" prefix of some sort
>> while you're at it.
> Isn't that being slightly over-specific?  I mean, it's conceivable that
> we would switch out the library used for the parsing with something else
> at some point.  Conceivable, if unlikely.

Yes, it's conceivable, but maybe the args and/or output would look
different anyway.


        Stefan




* Re: Problems with xml-parse-string
  2010-09-15  8:51       ` Stefan Monnier
@ 2010-09-15  9:21         ` Lars Magne Ingebrigtsen
  2010-09-15  9:54           ` Leo
  2010-09-21 23:00           ` Chong Yidong
  0 siblings, 2 replies; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-15  9:21 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier <monnier@IRO.UMontreal.CA> writes:

> Yes, it's conceivable, but maybe the args and/or output would look
> different anyway.

If we kept the name, then we'd keep the interface.  :-)

Anyway, it's not important to me.  If y'all want the names to be
`libxml-parse-html-string'/`libxml-parse-xml-string', that's fine by me.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: Problems with xml-parse-string
  2010-09-15  9:21         ` Lars Magne Ingebrigtsen
@ 2010-09-15  9:54           ` Leo
  2010-09-15 10:16             ` Julien Danjou
  2010-09-21 23:00           ` Chong Yidong
  1 sibling, 1 reply; 100+ messages in thread
From: Leo @ 2010-09-15  9:54 UTC (permalink / raw)
  To: emacs-devel

On 2010-09-15 10:21 +0100, Lars Magne Ingebrigtsen wrote:
> Anyway, it's not important to me. If y'all want the names to be
> `libxml-parse-html-string'/`libxml-parse-xml-string', that's fine by
> me.

I personally don't like the libxml- prefix.

Leo





* Re: Problems with xml-parse-string
  2010-09-15  9:54           ` Leo
@ 2010-09-15 10:16             ` Julien Danjou
  2010-09-15 15:58               ` Chad Brown
  0 siblings, 1 reply; 100+ messages in thread
From: Julien Danjou @ 2010-09-15 10:16 UTC (permalink / raw)
  To: emacs-devel


On Wed, Sep 15 2010, Leo wrote:

> I personally don't like the libxml- prefix.

Me neither.

-- 
Julien Danjou
// ᐰ <julien@danjou.info>   http://julien.danjou.info



* Re: Problems with xml-parse-string
  2010-09-15 10:16             ` Julien Danjou
@ 2010-09-15 15:58               ` Chad Brown
  0 siblings, 0 replies; 100+ messages in thread
From: Chad Brown @ 2010-09-15 15:58 UTC (permalink / raw)
  To: Emacs-Devel devel

> I personally don't like the libxml- prefix.

Adding the name of the optional external dependency allows us to
create a function that uses the external version if available and
falls back to an internal one if not.  I don't know if Emacs has a
convention for handling this, but I think we're likely to want one
soon (with libxml2 and GNUTLS being the first two users).

Is there already a convention for handling this that I don't know?

*Chad




* Re: Problems with xml-parse-string
  2010-09-15  9:21         ` Lars Magne Ingebrigtsen
  2010-09-15  9:54           ` Leo
@ 2010-09-21 23:00           ` Chong Yidong
  2010-09-21 23:24             ` Leo
  2010-09-22 23:48             ` Stefan Monnier
  1 sibling, 2 replies; 100+ messages in thread
From: Chong Yidong @ 2010-09-21 23:00 UTC (permalink / raw)
  To: emacs-devel; +Cc: Mark A. Hershberger

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Stefan Monnier <monnier@IRO.UMontreal.CA> writes:
>
>> Yes, it's conceivable, but maybe the args and/or output would look
>> different anyway.
>
> If we kept the name, then we'd keep the interface.  :-)
>
> Anyway, it's not important to me.  If y'all want the names to be
> `libxml-parse-html-string'/`libxml-parse-xml-string', that's fine by
> me.

A `libxml' prefix sounds a bit weird to me.  When we link to the GPM
library, the functions are `gpm-mouse-*' not `libgpm-mouse-*'.

I think the xml.el package needs to be re-written with the new libxml2
support in mind, so that it makes use of the libxml functions if they
are available and falls back on the Elisp parser.

From this point of view, the new functions should be called
xml-parse-string-internal or something like that.




* Re: Problems with xml-parse-string
  2010-09-21 23:00           ` Chong Yidong
@ 2010-09-21 23:24             ` Leo
  2010-09-22  2:26               ` Chong Yidong
  2010-09-22 23:48             ` Stefan Monnier
  1 sibling, 1 reply; 100+ messages in thread
From: Leo @ 2010-09-21 23:24 UTC (permalink / raw)
  To: emacs-devel

On 2010-09-22 00:00 +0100, Chong Yidong wrote:
> From this point of view, the new functions should be called
> xml-parse-string-internal or something like that.

The new function actually is quite usable. Its output is slightly
cleaner than that of xml.el.

Leo





* Re: Problems with xml-parse-string
  2010-09-21 23:24             ` Leo
@ 2010-09-22  2:26               ` Chong Yidong
  2010-09-22  3:15                 ` Chong Yidong
  0 siblings, 1 reply; 100+ messages in thread
From: Chong Yidong @ 2010-09-22  2:26 UTC (permalink / raw)
  To: Leo; +Cc: emacs-devel

Leo <sdl.web@gmail.com> writes:

> On 2010-09-22 00:00 +0100, Chong Yidong wrote:
>> From this point of view, the new functions should be called
>> xml-parse-string-internal or something like that.
>
> The new function actually is quite usable. Its output is slightly
> cleaner than that of xml.el.

No doubt, but that's not my point.  Ideally, Emacs should provide a
single interface for parsing XML.  What I have in mind is to rename
xml.el's xml-parse-string, which is an internal function, to xml--parse,
and provide something like this:

(defun xml-parse-string (str)
  ;; `fboundp' tests for a function definition; `boundp' tests variables.
  (if (fboundp 'xml-parse-string-internal)
      (xml-parse-string-internal str)
    (xml-parse-string-elisp str)))

with similar functions for the current xml-parse-region and
xml-parse-file.

This would solve our namespace clash problem, and be cleaner.  This
assumes that the output of the new libxml2 functions is similar (or can
be made similar) to that of the old elisp parser.
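
A minimal sketch of the dispatch described above, for illustration only.
The names `my-xml-native-parse' and `my-xml-elisp-parse' are hypothetical
stand-ins, not functions that exist in Emacs. Because the thing being
probed is a function rather than a variable, the sketch tests it with
`fboundp'.

```elisp
;; Sketch only: both parser names below are hypothetical.
(defun my-xml-elisp-parse (str)
  "Stand-in for a pure Emacs Lisp parser; tags its result for illustration."
  (list 'elisp str))

(defun my-xml-parse-string (str)
  "Parse STR with the native parser if it is defined, else fall back."
  (if (fboundp 'my-xml-native-parse)
      (my-xml-native-parse str)      ; used only when the primitive exists
    (my-xml-elisp-parse str)))       ; pure Lisp fallback

;; With no `my-xml-native-parse' defined, the fallback is used:
;; (my-xml-parse-string "<a/>")  =>  (elisp "<a/>")
```

Callers only ever see `my-xml-parse-string', so the choice of backend
stays an implementation detail, provided both backends return the same
tree format.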




* Re: Problems with xml-parse-string
  2010-09-22  2:26               ` Chong Yidong
@ 2010-09-22  3:15                 ` Chong Yidong
  2010-09-22  7:14                   ` Stefan Monnier
  2010-09-22 10:35                   ` Lars Magne Ingebrigtsen
  0 siblings, 2 replies; 100+ messages in thread
From: Chong Yidong @ 2010-09-22  3:15 UTC (permalink / raw)
  To: Leo; +Cc: emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

> No doubt, but that's not my point.  Ideally, Emacs should provide a
> single interface for parsing XML.

I've just checked in a change that renames the libxml functions to
xml-parse-html-string-internal and xml-parse-string-internal.  I've also
made the parse tree format identical to that of xml.el.

Now xml.el should be changed to make use of these functions.  I think we
should introduce xml-parse-file-internal as well, for the sake of
xml-parse-file; it doesn't make sense to load a file into Emacs if it's
just going to be passed to libxml2.




* Re: Problems with xml-parse-string
  2010-09-22  3:15                 ` Chong Yidong
@ 2010-09-22  7:14                   ` Stefan Monnier
  2010-09-22 10:35                   ` Lars Magne Ingebrigtsen
  1 sibling, 0 replies; 100+ messages in thread
From: Stefan Monnier @ 2010-09-22  7:14 UTC (permalink / raw)
  To: Chong Yidong; +Cc: Leo, emacs-devel

> xml-parse-file; it doesn't make sense to load a file into Emacs if it's
> just going to be passed to libxml2.

Except if it's accessed via file-name-handler-alist, of course.
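
A rough sketch of that check, under the assumption that a file-based
primitive would read the file itself and so bypass Emacs file handlers.
`my-xml-parse-file-native' and `my-xml-parse-buffer' are hypothetical
names; `find-file-name-handler' is the standard way to ask whether a
handler applies to a file name.

```elisp
(defun my-xml-file-route (file)
  "Return `buffer' if FILE is served by a file name handler, else `direct'."
  (if (find-file-name-handler file 'insert-file-contents)
      'buffer      ; remote, compressed, etc.: Emacs must read it
    'direct))      ; plain local file: could be handed to the library

(defun my-xml-parse-file (file)
  "Parse FILE, going through a buffer when a file name handler is involved."
  (if (eq (my-xml-file-route file) 'buffer)
      (with-temp-buffer
        (insert-file-contents file)        ; goes through the handler
        (my-xml-parse-buffer))             ; hypothetical buffer parser
    (my-xml-parse-file-native file)))      ; hypothetical file primitive
```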


        Stefan




* Re: Problems with xml-parse-string
  2010-09-22  3:15                 ` Chong Yidong
  2010-09-22  7:14                   ` Stefan Monnier
@ 2010-09-22 10:35                   ` Lars Magne Ingebrigtsen
  2010-09-22 10:58                     ` Lars Magne Ingebrigtsen
                                       ` (2 more replies)
  1 sibling, 3 replies; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-22 10:35 UTC (permalink / raw)
  To: emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

> I've just checked in a change that renames the libxml functions to
> xml-parse-html-string-internal and xml-parse-string-internal.  I've also
> made the parse tree format identical to that of xml.el.

Euhm.  Actually, I spent quite a bit of thought on the parse tree format
that the functions spit out to make it regular, easy and fast to deal
with.  The format that xml.el spits out is more arcane and less
comfortable to use.  You can't say `(assq 'img (cdr node))' and stuff to
get the image nodes, because the child nodes aren't a regular assoc
list (because of the string fields) and stuff.

So I protest the change.

Anyway, I was going to rework the functions to be `html-parse-buffer'
again, since in 99% of cases you have the html in a buffer, and putting
it in a string just to call that function is odd.

If you're concerned about previous users of xml.el being confused by the
change in parse tree format, I don't think you have to worry all that
much, because there are zero (0) users of the xml.el `xml-parse-string'
in the Emacs tree.

Please revert.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: Problems with xml-parse-string
  2010-09-22 10:35                   ` Lars Magne Ingebrigtsen
@ 2010-09-22 10:58                     ` Lars Magne Ingebrigtsen
  2010-09-22 11:00                     ` Leo
  2010-09-22 14:05                     ` Chong Yidong
  2 siblings, 0 replies; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-22 10:58 UTC (permalink / raw)
  To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> If you're concerned about previous users of xml.el being confused by the
> change in parse tree format, I don't think you have to worry all that
> much, because there are zero (0) users of the xml.el `xml-parse-string'
> in the Emacs tree.

That's not totally accurate.  While there are zero users of
`xml-parse-string', there's a couple of handfuls of users of
`xml-parse-region' and `xml-parse-file' in the tree.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: Problems with xml-parse-string
  2010-09-22 10:35                   ` Lars Magne Ingebrigtsen
  2010-09-22 10:58                     ` Lars Magne Ingebrigtsen
@ 2010-09-22 11:00                     ` Leo
  2010-09-22 11:09                       ` Lars Magne Ingebrigtsen
  2010-09-22 14:05                     ` Chong Yidong
  2 siblings, 1 reply; 100+ messages in thread
From: Leo @ 2010-09-22 11:00 UTC (permalink / raw)
  To: emacs-devel

On 2010-09-22 11:35 +0100, Lars Magne Ingebrigtsen wrote:
> Anyway, I was going to rework the functions to be `html-parse-buffer'
> again, since in 99% of cases you have the html in a buffer, and
> putting it in a string just to call that function is odd.

+2 for the -buffer variants. Maybe -region is better (more flexible).

I think we can leave xml.el alone for now except maybe renaming its
internal function xml-parse-string (which does no parsing). And there is
xmltok, another parser that does validation.

It'd be extremely useful for Emacs to have a solid and stable set of XML
APIs.

Leo





* Re: Problems with xml-parse-string
  2010-09-22 11:00                     ` Leo
@ 2010-09-22 11:09                       ` Lars Magne Ingebrigtsen
  2010-09-22 11:41                         ` Lars Magne Ingebrigtsen
  2010-09-22 12:45                         ` Leo
  0 siblings, 2 replies; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-22 11:09 UTC (permalink / raw)
  To: emacs-devel

Leo <sdl.web@gmail.com> writes:

> +2 for the -buffer variants. Maybe -region is better (more flexible).

True.

> It'd be extremely useful for Emacs to have a solid and stable set of XML
> APIs.

Having worked with a lot of DOM representations (recursive searches for
elements, etc), I think the one I picked for the libxml2 thing really is
the most pleasant (and fast) to work with.  :-)  But that would be my
opinion, wouldn't it?

There's very little if-ing necessary.  It's all very regular.  If the
cdr of anything is a list, you can descend into it, and everything
does have a cdr -- there are no atoms sprinkled here and there that you
have to special-case all over the place...
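
To make the contrast concrete, here is a small sketch that counts tags in
two hand-written trees. The first shape is an assumption inferred from
this discussion (attributes and text as conses such as (:src . "a.png")
and (text . "Foo")); the real output of the primitives may differ. The
second is the xml.el shape, (TAG ATTRIBUTE-ALIST CHILD...), with text as
bare strings. All `my-' names are made up for illustration.

```elisp
;; Hand-written trees; no parser is called here.
(defvar my-regular-tree
  '(div (:class . "thing")
        (text . "Foo")
        (p (img (:src . "a.png")))
        (img (:src . "b.png"))))

(defvar my-xml-el-tree
  '(div ((class . "thing"))
        "Foo"
        (p nil (img ((src . "a.png"))))
        (img ((src . "b.png")))))

(defun my-count-tag-regular (tag node)
  "Count elements named TAG below NODE in the all-conses shape."
  (let ((n 0))
    (dolist (elem (cdr node))
      (when (eq (car elem) tag) (setq n (1+ n)))
      (when (consp (cdr elem))             ; a list cdr means child nodes
        (setq n (+ n (my-count-tag-regular tag elem)))))
    n))

(defun my-count-tag-xml-el (tag node)
  "Count elements named TAG below NODE in the xml.el shape."
  (let ((n 0))
    (dolist (child (cddr node))            ; skip tag and attribute alist
      (unless (stringp child)              ; text is a bare string here
        (when (eq (car child) tag) (setq n (1+ n)))
        (setq n (+ n (my-count-tag-xml-el tag child)))))
    n))

;; (my-count-tag-regular 'img my-regular-tree)  =>  2
;; (my-count-tag-xml-el 'img my-xml-el-tree)    =>  2
```

Both walkers are short; the difference is that the second has to skip a
positional attribute slot and check for strings before taking `car'.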

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: Problems with xml-parse-string
  2010-09-22 11:09                       ` Lars Magne Ingebrigtsen
@ 2010-09-22 11:41                         ` Lars Magne Ingebrigtsen
  2010-09-22 11:55                           ` Wojciech Meyer
  2010-09-22 12:45                         ` Leo
  1 sibling, 1 reply; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-22 11:41 UTC (permalink / raw)
  To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> There's very little if-ing necessary.  It's all very regular.  If the
> cdr of anything is a list, you can descend into it, and everything
> does have a cdr -- there are no atoms sprinkled here and there that you
> have to special-case all over the place...

To take a real example: Here's how to find all image urls in a HTML
document:

(defun find-src (node)
  "Return the :src values of all img elements below NODE."
  (let (src)
    (dolist (elem (cdr node))
      (cond
       ((eq (car elem) 'img)
        (push (cdr (assq :src (cdr elem))) src))
       ((consp (cdr elem))
        (setq src (nconc (find-src elem) src)))))
    src))

(find-src (html-parse-string a))
=>
("http://feeds.feedburner.com/~r/Slashdot/slashdot/~4/ObrTJGt5o5g"
 "http://feedads.g.doubleclick.net/~at/dM87odHHTwmp1SgHU6CIgmtDAvA/1/di"
 "http://feedads.g.doubleclick.net/~at/dM87odHHTwmp1SgHU6CIgmtDAvA/0/di"
 "http://da.feedsportal.com/r/78871208090/u/49/f/530758/c/32909/s/234625720/a2.img"
 "http://slashdot.feedsportal.com/c/32909/f/530758/s/dfc1ab8/mf.gif"
 "http://a.fsdn.com/sd/twitter_icon_large.png"
 "http://a.fsdn.com/sd/facebook_icon_large.png")

It's, like, you need to know nothing about the DOM to traverse it, and
you can use the fast `assq' to get at what you want when not doing a
recursive descent...

And since I'm going to write a HTML renderer in Emacs Lisp, I think the
DOM should be as fast and as easy to work with as possible.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: Problems with xml-parse-string
  2010-09-22 11:41                         ` Lars Magne Ingebrigtsen
@ 2010-09-22 11:55                           ` Wojciech Meyer
  2010-09-22 12:09                             ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 100+ messages in thread
From: Wojciech Meyer @ 2010-09-22 11:55 UTC (permalink / raw)
  To: emacs-devel

On Wed, Sep 22, 2010 at 12:41 PM, Lars Magne Ingebrigtsen
<larsi@gnus.org> wrote:
> Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
>
>> There's very little if-ing necessary.  It's all very regular.  If the
>> cdr of anything is a list, you can descend into it, and everything
>> does have a cdr -- there are no atoms sprinkled here and there that you
>> have to special-case all over the place...

We should use SXML, the S-expression-based representation of XML, for
this and other places where we have XML.

http://okmij.org/ftp/Scheme/SXML.html

Wojciech




* Re: Problems with xml-parse-string
  2010-09-22 11:55                           ` Wojciech Meyer
@ 2010-09-22 12:09                             ` Lars Magne Ingebrigtsen
  2010-09-22 12:17                               ` Wojciech Meyer
  0 siblings, 1 reply; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-22 12:09 UTC (permalink / raw)
  To: emacs-devel

Wojciech Meyer <wojciech.meyer@googlemail.com> writes:

> We should use S-XML. S-expression based XML for this and other places
> where we have XML.

Why on earth would we shoot ourselves in our feet by converting XML to a
parse tree that looks like this?

     (*TOP*
       (@ (*NAMESPACES* 
            (HTML "http://www.w3.org/TR/REC-html40")))
       (RESERVATION
         (NAME (@ (HTML:CLASS "largeSansSerif"))
           "Layman, A")
         (SEAT (@ (HTML:CLASS "largeMonotype")
                  (CLASS "Y"))
            "33B")
         (HTML:A (@ (HREF "/cgi-bin/ResStatus"))
            "Check Status")
         (DEPARTURE "1997-05-24T07:55:00+1")))
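
For comparison, a sketch of what attribute and text access might look
like on a node in that shape, written in Emacs Lisp against a
hand-written fragment. `my-sxml-attr' and `my-sxml-text' are hypothetical
helpers, not part of any SXML library.

```elisp
;; Hand-written fragment in the SXML shape shown above.
(defvar my-sxml-node
  '(HTML:A (@ (HREF "/cgi-bin/ResStatus")) "Check Status"))

(defun my-sxml-attr (node name)
  "Return the value of attribute NAME on SXML-style NODE, or nil."
  (let ((attrs (cdr (assq '@ (cdr node)))))   ; contents of (@ ...), if any
    (cadr (assq name attrs))))

(defun my-sxml-text (node)
  "Return the bare string children of SXML-style NODE as a list."
  (let (strings)
    (dolist (child (cdr node))
      (when (stringp child) (push child strings)))
    (nreverse strings)))

;; (my-sxml-attr my-sxml-node 'HREF)  =>  "/cgi-bin/ResStatus"
;; (my-sxml-text my-sxml-node)        =>  ("Check Status")
```

Attributes sit one level deeper, inside the (@ ...) list, and text
appears as bare strings among the children.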


-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: Problems with xml-parse-string
  2010-09-22 12:09                             ` Lars Magne Ingebrigtsen
@ 2010-09-22 12:17                               ` Wojciech Meyer
  2010-09-22 12:18                                 ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 100+ messages in thread
From: Wojciech Meyer @ 2010-09-22 12:17 UTC (permalink / raw)
  To: emacs-devel

On Wed, Sep 22, 2010 at 1:09 PM, Lars Magne Ingebrigtsen <larsi@gnus.org> wrote:
> Wojciech Meyer <wojciech.meyer@googlemail.com> writes:
>
>> We should use S-XML. S-expression based XML for this and other places
>> where we have XML.
>
> Why on earth would we shoot ourselves in our feet by converting XML to a
> parse tree that looks like this?
>
>     (*TOP*
>       (@ (*NAMESPACES*
>            (HTML "http://www.w3.org/TR/REC-html40")))
>       (RESERVATION
>         (NAME (@ (HTML:CLASS "largeSansSerif"))
>           "Layman, A")
>         (SEAT (@ (HTML:CLASS "largeMonotype")
>                  (CLASS "Y"))
>            "33B")
>         (HTML:A (@ (HREF "/cgi-bin/ResStatus"))
>            "Check Status")
>         (DEPARTURE "1997-05-24T07:55:00+1")))

Why not? What's the problem here?

Having an Sxml framework will open new doors, and that's the appropriate
method of doing that (IMHO).

Personally, I think so because:
- it's a well-known standard for representing XML with s-expressions
- it would be easier to port existing tools (like some Scheme XPath
  query language) to Elisp
- it supports everything that XML supports

Wojciech




* Re: Problems with xml-parse-string
  2010-09-22 12:17                               ` Wojciech Meyer
@ 2010-09-22 12:18                                 ` Lars Magne Ingebrigtsen
  2010-09-22 12:20                                   ` Wojciech Meyer
  0 siblings, 1 reply; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-22 12:18 UTC (permalink / raw)
  To: emacs-devel

Wojciech Meyer <wojciech.meyer@googlemail.com> writes:

> Why not? What's the problem here?

It's fiddly to work with.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: Problems with xml-parse-string
  2010-09-22 12:18                                 ` Lars Magne Ingebrigtsen
@ 2010-09-22 12:20                                   ` Wojciech Meyer
  2010-09-22 12:26                                     ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 100+ messages in thread
From: Wojciech Meyer @ 2010-09-22 12:20 UTC (permalink / raw)
  To: emacs-devel

On Wed, Sep 22, 2010 at 1:18 PM, Lars Magne Ingebrigtsen <larsi@gnus.org> wrote:
> Wojciech Meyer <wojciech.meyer@googlemail.com> writes:
>
>> Why not? What's the problem here?
>
> It's fiddly to work with.

Can you elaborate?

Wojciech




* Re: Problems with xml-parse-string
  2010-09-22 12:20                                   ` Wojciech Meyer
@ 2010-09-22 12:26                                     ` Lars Magne Ingebrigtsen
  2010-09-22 12:34                                       ` Wojciech Meyer
  0 siblings, 1 reply; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-22 12:26 UTC (permalink / raw)
  To: emacs-devel

Wojciech Meyer <wojciech.meyer@googlemail.com> writes:

>> It's fiddly to work with.
>
> Can you elaborate?

Did you look at my code example?  Did you see how unfiddly it was?

In any case, if the problem here is that we have a different interface
for the xml parsing already in place -- I just added xml-parse-buffer to
xml.c as an afterthought, since it was trivial to do.  I don't care
about xml parsing per se.  My point was to have the sloppy real-world
html-parse-buffer interface available from Emacs, and we don't have any
equivalent Emacs Lisp code available (that I know of) that it has to be
compatible with.

So if we drop the xml-parse-* thing from xml.c, can we go back to
bikeshedding about important things like `C-d', and I can get on with
implementing the HTML renderer?  (With the DOM as I wanted it to be.)

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: Problems with xml-parse-string
  2010-09-22 12:26                                     ` Lars Magne Ingebrigtsen
@ 2010-09-22 12:34                                       ` Wojciech Meyer
  2010-09-22 12:46                                         ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 100+ messages in thread
From: Wojciech Meyer @ 2010-09-22 12:34 UTC (permalink / raw)
  To: emacs-devel

On Wed, Sep 22, 2010 at 1:26 PM, Lars Magne Ingebrigtsen <larsi@gnus.org> wrote:
> Wojciech Meyer <wojciech.meyer@googlemail.com> writes:
>
>>> It's fiddly to work with.
>>
>> Can you elaborate?
>
> Did you look at my code example?  Did you see how unfiddly it was?

It is data, it doesn't need to look good. Obviously you handle that
as an abstract data structure.

>
> In any case, if the problem here is that we have a different interface
> for the xml parsing already in place -- I just added xml-parse-buffer to
> xml.c as an afterthought, since it was trivial to do.  I don't care
> about xml parsing per se.  My point was to have the sloppy real-world

But I care, and many people here do care, because it is important to
get it right.

> html-parse-buffer interface available from Emacs, and we don't have any
> equivalent Emacs Lisp code available (that I know of) that it has to be
> compatible with.

Wojciech




* Re: Problems with xml-parse-string
  2010-09-22 11:09                       ` Lars Magne Ingebrigtsen
  2010-09-22 11:41                         ` Lars Magne Ingebrigtsen
@ 2010-09-22 12:45                         ` Leo
  2010-09-22 13:14                           ` Lars Magne Ingebrigtsen
  1 sibling, 1 reply; 100+ messages in thread
From: Leo @ 2010-09-22 12:45 UTC (permalink / raw)
  To: emacs-devel

On 2010-09-22 12:09 +0100, Lars Magne Ingebrigtsen wrote:
> Having worked with a lot of DOM representations (recursive searches
> for elements, etc), I think the one I picked for the libxml2 thing
> really is the most pleasant (and fast) to work with. :-) But that
> would be my opinion, wouldn't it?

Yeah I like that too. After seeing two of its outputs, I quickly noticed
the pattern and just created some short two-liners to access the tree.

Leo





* Re: Problems with xml-parse-string
  2010-09-22 12:34                                       ` Wojciech Meyer
@ 2010-09-22 12:46                                         ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-22 12:46 UTC (permalink / raw)
  To: emacs-devel

Wojciech Meyer <wojciech.meyer@googlemail.com> writes:

> It is data, it doesn't need to look good. Obviously you handle that
> as an abstract data structure.

I take it you're not a programmer?

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: Problems with xml-parse-string
  2010-09-22 12:45                         ` Leo
@ 2010-09-22 13:14                           ` Lars Magne Ingebrigtsen
  2010-09-22 14:07                             ` Chong Yidong
  0 siblings, 1 reply; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-22 13:14 UTC (permalink / raw)
  To: emacs-devel

Leo <sdl.web@gmail.com> writes:

> Yeah I like that too. After seeing two of its outputs, I quickly noticed
> the pattern and just created some short two-liners to access the tree.

Nice.  :-)

Perhaps Stefan's idea of just naming the functions `libxml-parse-*' was
the best to avoid the compatibility discussion.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: Problems with xml-parse-string
  2010-09-22 10:35                   ` Lars Magne Ingebrigtsen
  2010-09-22 10:58                     ` Lars Magne Ingebrigtsen
  2010-09-22 11:00                     ` Leo
@ 2010-09-22 14:05                     ` Chong Yidong
  2010-09-22 14:32                       ` Lars Magne Ingebrigtsen
  2 siblings, 1 reply; 100+ messages in thread
From: Chong Yidong @ 2010-09-22 14:05 UTC (permalink / raw)
  To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Euhm.  Actually, I spent quite a bit of thought on the parse tree format
> that the functions spit out to make it regular, easy and fast to deal
> with.  The format that xml.el spits out is more arcane and less
> comfortable to use.  You can't say `(assq 'img (cdr node))' and stuff to
> get the image nodes, because the child nodes aren't a regular assoc
> list (because of the string fields) and stuff.
>
> So I protest the change.
>
> If you're concerned about previous users of xml.el being confused by the
> change in parse tree format, I don't think you have to worry all that
> much, because there are zero (0) users of the xml.el `xml-parse-string'
> in the Emacs tree.

The issue isn't xml-parse-string, it's xml-parse-region and
xml-parse-file, which are used by several parts of Emacs including,
ahem, Gnus.  I don't want to have two parts of Emacs providing xml
parsing that provide slightly incompatible parse tree formats.

So either the new libxml functions have to provide the same format as
xml.el, or xml.el has to be changed to use the new format, breaking
existing uses.  I am amenable to the latter if the new format is so much
better than the old one that it's worth dealing with the backward
compatibility headaches.




* Re: Problems with xml-parse-string
  2010-09-22 13:14                           ` Lars Magne Ingebrigtsen
@ 2010-09-22 14:07                             ` Chong Yidong
  2010-09-22 15:04                               ` Eli Zaretskii
                                                 ` (2 more replies)
  0 siblings, 3 replies; 100+ messages in thread
From: Chong Yidong @ 2010-09-22 14:07 UTC (permalink / raw)
  To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Leo <sdl.web@gmail.com> writes:
>
>> Yeah I like that too. After seeing two of its outputs, I quickly noticed
>> the pattern and just created some short two-liners to access the tree.
>
> Nice.  :-)
>
> Perhaps Stefan's idea of just naming the functions `libxml-parse-*' was
> the best to avoid the compatibility discussion.

Having two xml parsers that provide slightly different outputs, even
though they do the exact same thing, is a problem regardless of what
their names are.



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-22 14:05                     ` Chong Yidong
@ 2010-09-22 14:32                       ` Lars Magne Ingebrigtsen
  2010-09-22 15:46                         ` Chong Yidong
  0 siblings, 1 reply; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-22 14:32 UTC (permalink / raw)
  To: emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

> So either the new libxml functions have to provide the same format as
> xml.el, or xml.el has to be changed to use the new format, breaking
> existing uses.  I am amenable to the latter if the new format is so much
> better than the old one that it's worth dealing with the backward
> compatibility headaches.

I'd volunteer for changing the callers to use the new format if you
change xml.el to do the same.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-22 14:07                             ` Chong Yidong
@ 2010-09-22 15:04                               ` Eli Zaretskii
  2010-09-22 23:59                               ` Stefan Monnier
  2010-09-23  2:16                               ` Kevin Rodgers
  2 siblings, 0 replies; 100+ messages in thread
From: Eli Zaretskii @ 2010-09-22 15:04 UTC (permalink / raw)
  To: Chong Yidong; +Cc: emacs-devel

> From: Chong Yidong <cyd@stupidchicken.com>
> Date: Wed, 22 Sep 2010 10:07:33 -0400
> 
> > Perhaps Stefan's idea of just naming the functions `libxml-parse-*' was
> > the best to avoid the compatibility discussion.
> 
> Having two xml parsers that provide slightly different outputs, even
> though they do the exact same thing, is a problem regardless of what
> their names are.

100% agreement.



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-22 14:32                       ` Lars Magne Ingebrigtsen
@ 2010-09-22 15:46                         ` Chong Yidong
  2010-09-22 16:12                           ` Lars Magne Ingebrigtsen
  2010-09-22 16:51                           ` Wojciech Meyer
  0 siblings, 2 replies; 100+ messages in thread
From: Chong Yidong @ 2010-09-22 15:46 UTC (permalink / raw)
  To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Chong Yidong <cyd@stupidchicken.com> writes:
>
>> So either the new libxml functions have to provide the same format as
>> xml.el, or xml.el has to be changed to use the new format, breaking
>> existing uses.  I am amenable to the latter if the new format is so much
>> better than the old one that it's worth dealing with the backward
>> compatibility headaches.
>
> I'd volunteer for changing the callers to use the new format if you
> change xml.el to do the same.

First let me clarify a technical detail.  In your new format,

 (catalog
   (text . "\n   ")
   (book
     (:type . "manual")
     (text . "\n      ")
     (title
       (text . "GNU Emacs manual"))
     (text . "\n   "))
   (text . "\n"))

seems to assume that element names never start with the colon character.
That is, there can never be an element named ":type".

The XML spec (http://www.w3.org/TR/2008/REC-xml-20081126/) seems to
indicate that element names are allowed to start with a colon; see the
definition of NameStartChar in section 2.3.

It looks like the new format would give ambiguous results in that case.



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-22 15:46                         ` Chong Yidong
@ 2010-09-22 16:12                           ` Lars Magne Ingebrigtsen
  2010-09-22 16:51                           ` Wojciech Meyer
  1 sibling, 0 replies; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-22 16:12 UTC (permalink / raw)
  To: emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

> First let me clarify a technical detail.  In your new format,

[...]

> seems to assume that element names never start with the colon character.
> That is, there can never be an element named ":type".
>
> The XML spec (http://www.w3.org/TR/2008/REC-xml-20081126/) seems to
> indicate that element names are allowed to start with a colon; see the
> definition of NameStartChar in section 2.3.
>
> It looks like the new format would give ambiguous results in that case.

True.

Like I said, I only wanted it for the HTML case, and the XML case was
just an afterthought.  And in HTML, there can be no :tags.

Looking at the output from xml.el and xml.c on two RSS feeds, the format
doesn't seem to be the biggest difference; the actual data is:

This is from the same RSS feed.  First the xml.el parser:

(pp xml (current-buffer))
((rdf:RDF
  ((xmlns:rdf . "http://www.w3.org/1999/02/22-rdf-syntax-ns#")
   (xmlns . "http://purl.org/rss/1.0/")
   (xmlns:taxo . "http://purl.org/rss/1.0/modules/taxonomy/")
   (xmlns:dc . "http://purl.org/dc/elements/1.1/")
   (xmlns:syn . "http://purl.org/rss/1.0/modules/syndication/")
   (xmlns:admin . "http://webns.net/mvcb/"))
  "\n  "
  (channel
   ((rdf:about . "http://blog.gmane.org/gmane.discuss"))
   "\n    "
   (title nil "gmane.discuss")
   "\n    "
   (link nil "http://blog.gmane.org/gmane.discuss")
   "\n    "
   (description nil
		(""))
   "\n    "
   (syn:updatePeriod nil "hourly")
   "\n    "
   (syn:updateFrequency nil "1")
   "\n    "
   (syn:updateBase nil "1901-01-01T00:00+00:00")
   "\n    "
   (items nil "\n      "
	  (rdf:Seq nil "\n        "
		   (rdf:li
		    ((rdf:resource . "http://permalink.gmane.org/gmane.discuss/13574"))
		    (""))
		   "\n        "
		   (rdf:li

Then the same thing from the xml.c parser:
                   
(pp nxml (current-buffer))
(RDF
 (text . "\n  ")
 (channel
  (:about . "http://blog.gmane.org/gmane.discuss")
  (text . "\n    ")
  (title
   (text . "gmane.discuss"))
  (text . "\n    ")
  (link
   (text . "http://blog.gmane.org/gmane.discuss"))
  (text . "\n    ")
  (description)
  (text . "\n    ")
  (updatePeriod
   (text . "hourly"))
  (text . "\n    ")
  (updateFrequency
   (text . "1"))
  (text . "\n    ")
  (updateBase
   (text . "1901-01-01T00:00+00:00"))
  (text . "\n    ")
  (items
   (text . "\n      ")
   (Seq
    (text . "\n        ")
    (li
     (:resource . "http://permalink.gmane.org/gmane.discuss/13574"))
    (text . "\n        ")
    (li

So more work is needed to turn the xml.c parser into something that's
compatible with what xml.el users expect.

Anyway, back to the format thing -- if we disregard the :tag issue
(i.e., find a work-around), then it would be pretty trivial to write a
function to convert the output from libxml-parse-xml-region into what
the xml.el package returns.  (Not to mention the nxml.el package, which
does the same as the xml.el package?)  It'd still be faster than the
pure Elisp version, and Gnus can call libxml-parse-html-region (as
planned) to render HTML as fast and convenient as possible.
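
A minimal sketch of such a conversion, assuming the new format exactly
as shown above: text as (text . STRING), attributes as (:NAME . STRING),
and child elements as (NAME ...).  The function name is hypothetical and
is not part of xml.c or xml.el.  The sketch cannot restore namespace
prefixes such as rdf:, since the xml.c output above has already dropped
them.

```elisp
;; Hypothetical helper; not part of xml.c or xml.el.
(defun my-libxml-to-xml-el (node)
  "Convert NODE from the new parse tree format to the xml.el format.
The result has the shape (NAME ATTRIBUTE-ALIST CHILD...)."
  (let (attrs children)
    (dolist (item (cdr node))
      (cond
       ;; Text: (text . STRING) becomes a bare string child.
       ((and (eq (car item) 'text) (stringp (cdr item)))
        (push (cdr item) children))
       ;; Attribute: (:name . STRING) becomes (name . STRING).
       ((stringp (cdr item))
        (push (cons (intern (substring (symbol-name (car item)) 1))
                    (cdr item))
              attrs))
       ;; Anything else is a child element; convert it recursively.
       (t (push (my-libxml-to-xml-el item) children))))
    (cons (car node) (cons (nreverse attrs) (nreverse children)))))
```

Note that xml-parse-region returns a list of top-level nodes, so a
caller wanting the same shape would wrap the result in `list'.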
    
-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-22 15:46                         ` Chong Yidong
  2010-09-22 16:12                           ` Lars Magne Ingebrigtsen
@ 2010-09-22 16:51                           ` Wojciech Meyer
  2010-09-22 18:06                             ` Chong Yidong
  2010-09-22 18:06                             ` Andy Wingo
  1 sibling, 2 replies; 100+ messages in thread
From: Wojciech Meyer @ 2010-09-22 16:51 UTC (permalink / raw)
  To: Chong Yidong; +Cc: emacs-devel

On Wed, Sep 22, 2010 at 4:46 PM, Chong Yidong <cyd@stupidchicken.com> wrote:
> Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
>
>> Chong Yidong <cyd@stupidchicken.com> writes:
>>
>>> So either the new libxml functions have to provide the same format as
>>> xml.el, or xml.el has to be changed to use the new format, breaking
>>> existing uses.  I am amenable to the latter if the new format is so much
>>> better than the old one that it's worth dealing with the backward
>>> compatibility headaches.
>>
>> I'd volunteer for changing the callers to use the new format if you
>> change xml.el to do the same.
>
> First let me clarify a technical detail.  In your new format,
>
>  (catalog
>   (text . "\n   ")
>   (book
>     (:type . "manual")
>     (text . "\n      ")
>     (title
>       (text . "GNU Emacs manual"))
>     (text . "\n   "))
>   (text . "\n"))
>
> seems to assume that element names never start with the colon character.
> That is, there can never be an element named ":type".
>
> The XML spec (http://www.w3.org/TR/2008/REC-xml-20081126/) seems to
> indicate that element names are allowed to start with a colon; see the
> definition of NameStartChar in section 2.3.
>
> It looks like the new format would give ambiguous results in that case.

My personal opinion is to stick with something that is standardized, namely
SXML, which handles all the cases, and to have one and only one uniform
representation.

If nobody is up to this, I would volunteer to implement the SXml backend.

Thanks,
Wojciech



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-22 16:51                           ` Wojciech Meyer
@ 2010-09-22 18:06                             ` Chong Yidong
  2010-09-22 18:14                               ` Edward O'Connor
  2010-09-22 18:06                             ` Andy Wingo
  1 sibling, 1 reply; 100+ messages in thread
From: Chong Yidong @ 2010-09-22 18:06 UTC (permalink / raw)
  To: Wojciech Meyer; +Cc: emacs-devel

Wojciech Meyer <wojciech.meyer@googlemail.com> writes:

> My personal opinion is to stick with something that is standardized -
> SXml that handles all the cases, and have one and only one uniform
> representation.
>
> If nobody is up to this, I would volunteer to implement the SXml
> backend.

This is not a bad idea in principle.  But how "standardized" is SXML?
It doesn't seem to be an official standard developed by the W3C or
another similar body.  If it's an unofficial standard, well, we have our own
unofficial standard---xml.el was released in 2000, and so predates SXML.



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-22 16:51                           ` Wojciech Meyer
  2010-09-22 18:06                             ` Chong Yidong
@ 2010-09-22 18:06                             ` Andy Wingo
  1 sibling, 0 replies; 100+ messages in thread
From: Andy Wingo @ 2010-09-22 18:06 UTC (permalink / raw)
  To: Wojciech Meyer; +Cc: Chong Yidong, emacs-devel

On Wed 22 Sep 2010 18:51, Wojciech Meyer <wojciech.meyer@googlemail.com> writes:

> My personal opinion is to stick with something that is standardized - SXml
> that handles all the cases, and have one and only one uniform
> representation.

Agreed; I'm surprised sxml hasn't come up in this conversation.

  http://ssax.sourceforge.net/

I would have thought also that htmlprag could be ported to elisp.

  http://www.neilvandyke.org/htmlprag/

These are idle thoughts, as I'm not going to hack on this, but as I
haven't seen them mentioned, just wanted to throw them out there.

Andy
-- 
http://wingolog.org/



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-22 18:06                             ` Chong Yidong
@ 2010-09-22 18:14                               ` Edward O'Connor
  2010-09-22 18:34                                 ` Leo
  0 siblings, 1 reply; 100+ messages in thread
From: Edward O'Connor @ 2010-09-22 18:14 UTC (permalink / raw)
  To: Chong Yidong; +Cc: Wojciech Meyer, emacs-devel

> This is not a bad idea in principle.  But how "standardized" is SXML?
> It doesn't seem to be an official standard developed by the W3C or
> another similar body.  If it's an unofficial standard, well, we have our own
> unofficial standard---xml.el was released in 2000, and so predates SXML.

More importantly, there's a fair amount of elisp out there that
expects the structure xml.el puts out. If you ask me [which, of
course, you didn't :)] the libxml stuff should be made to output the
exact same structure as xml.el already does.



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-22 18:14                               ` Edward O'Connor
@ 2010-09-22 18:34                                 ` Leo
  2010-09-22 18:41                                   ` Chong Yidong
  0 siblings, 1 reply; 100+ messages in thread
From: Leo @ 2010-09-22 18:34 UTC (permalink / raw)
  To: emacs-devel

On 2010-09-22 19:14 +0100, Edward O'Connor wrote:
> More importantly, there's a fair amount of elisp out there that
> expects the structure xml.el puts out. If you ask me [which, of
> course, you didn't :)] the libxml stuff should be made to output the
> exact same structure as xml.el already does.

I would think those packages use the public functions offered by xml.el
to do their work. The structure should remain internal.

Leo




^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-22 18:34                                 ` Leo
@ 2010-09-22 18:41                                   ` Chong Yidong
  2010-09-22 19:57                                     ` Wojciech Meyer
  0 siblings, 1 reply; 100+ messages in thread
From: Chong Yidong @ 2010-09-22 18:41 UTC (permalink / raw)
  To: Leo; +Cc: emacs-devel

Leo <sdl.web@gmail.com> writes:

> On 2010-09-22 19:14 +0100, Edward O'Connor wrote:
>> More importantly, there's a fair amount of elisp out there that
>> expects the structure xml.el puts out. If you ask me [which, of
>> course, you didn't :)] the libxml stuff should be made to output the
>> exact same structure as xml.el already does.
>
> I would think those packages use the public functions offered by xml.el
> to do their work. The structure should remain internal.

The public functions return the parse tree.  I don't think it's possible
to abstract that away.



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-22 18:41                                   ` Chong Yidong
@ 2010-09-22 19:57                                     ` Wojciech Meyer
  0 siblings, 0 replies; 100+ messages in thread
From: Wojciech Meyer @ 2010-09-22 19:57 UTC (permalink / raw)
  To: Chong Yidong; +Cc: Leo, emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

> Leo <sdl.web@gmail.com> writes:

>> I would think those packages use the public functions offered by xml.el
>> to do their work. The structure should remain internal.
>
> The public functions return the parse tree.  I don't think it's possible
> to abstract that away.

We could use one form exclusively and translate it to the other where
compatibility is needed.  The parse tree is purely a data structure,
which makes the transition easier, and both forms come from the same
source (a very generic one, since XML has to describe arbitrary data,
which is why it maps so well to SXML and to Lisp in general).  I don't
like the idea of keeping two representations of the same thing and
rewriting between them, though, but it is one way it could be done.

Andy Wingo <wingo@pobox.com> writes:

>   http://ssax.sourceforge.net/
> I would have thought also that htmlprag could be ported to elisp.
>   http://www.neilvandyke.org/htmlprag/

Yep, those are possible candidates for making it even easier than
writing it by hand. I don't have problems reusing C libraries at all.

Wojciech



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-21 23:00           ` Chong Yidong
  2010-09-21 23:24             ` Leo
@ 2010-09-22 23:48             ` Stefan Monnier
  1 sibling, 0 replies; 100+ messages in thread
From: Stefan Monnier @ 2010-09-22 23:48 UTC (permalink / raw)
  To: Chong Yidong; +Cc: Mark A. Hershberger, emacs-devel

> A `libxml' prefix sounds a bit weird to me.  When we link to the GPM
> library, the functions are `gpm-mouse-*' not `libgpm-mouse-*'.

libgpm is part of the GPM package and the only lib that gives access to
it AFAIK, whereas libxml2 is just one of many libraries that provide
access to XML data.


        Stefan



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-22 14:07                             ` Chong Yidong
  2010-09-22 15:04                               ` Eli Zaretskii
@ 2010-09-22 23:59                               ` Stefan Monnier
  2010-09-23  5:53                                 ` Leo
  2010-09-23  2:16                               ` Kevin Rodgers
  2 siblings, 1 reply; 100+ messages in thread
From: Stefan Monnier @ 2010-09-22 23:59 UTC (permalink / raw)
  To: Chong Yidong; +Cc: emacs-devel

>>> Yeah I like that too. After seeing two of its outputs, I quickly noticed
>>> the pattern and just created some short two-liners to access the tree.
>> Nice.  :-)
>> Perhaps Stefan's idea of just naming the functions `libxml-parse-*' was
>> the best to avoid the compatibility discussion.
> Having two xml parsers that provide slightly different outputs, even
> though they do the exact same thing, is a problem regardless of what
> their names are.

That's true, but that doesn't mean we have to settle on
xml.el's format.  We can also decide that it's a good opportunity to
improve the format.
FWIW, while I haven't used xml.el much, the little bit I've used it was
not particularly pleasant, partly because of the odd format.  I don't
know how/why the xml.el format was chosen and how much thought was put
into it, but my experience with it is not 100% positive.


        Stefan



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-22 14:07                             ` Chong Yidong
  2010-09-22 15:04                               ` Eli Zaretskii
  2010-09-22 23:59                               ` Stefan Monnier
@ 2010-09-23  2:16                               ` Kevin Rodgers
  2 siblings, 0 replies; 100+ messages in thread
From: Kevin Rodgers @ 2010-09-23  2:16 UTC (permalink / raw)
  To: emacs-devel

On 9/22/10 8:07 AM, Chong Yidong wrote:
> Lars Magne Ingebrigtsen<larsi@gnus.org>  writes:
>
>> Leo<sdl.web@gmail.com>  writes:
>>
>>> Yeah I like that too. After seeing two of its outputs, I quickly noticed
>>> the pattern and just created some short two-liners to access the tree.
>>
>> Nice.  :-)
>>
>> Perhaps Stefan's idea of just naming the functions `libxml-parse-*' was
>> the best to avoid the compatibility discussion.
>
> Having two xml parsers that provide slightly different outputs, even
> though they do the exact same thing, is a problem regardless of what
> their names are.

But having 2 xml parsers that provide radically different outputs would be good
for the fans of each format.

-- 
Kevin Rodgers
Denver, Colorado, USA




^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-22 23:59                               ` Stefan Monnier
@ 2010-09-23  5:53                                 ` Leo
  2010-09-23 15:43                                   ` Chong Yidong
  0 siblings, 1 reply; 100+ messages in thread
From: Leo @ 2010-09-23  5:53 UTC (permalink / raw)
  To: emacs-devel

On 2010-09-23 00:59 +0100, Stefan Monnier wrote:
> FWIW, while I haven't used xml.el much, the little bit I've used it was
> not particularly pleasant, partly because of the odd format. I don't
> know how/why the xml.el was chosen and how much thought was put into
> it, but my experience with it is not 100% positive.

That looks like my experience too.

Leo




^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-23  5:53                                 ` Leo
@ 2010-09-23 15:43                                   ` Chong Yidong
  2010-09-23 16:53                                     ` Leo
                                                       ` (2 more replies)
  0 siblings, 3 replies; 100+ messages in thread
From: Chong Yidong @ 2010-09-23 15:43 UTC (permalink / raw)
  To: Leo; +Cc: emacs-devel

Leo <sdl.web@gmail.com> writes:

> On 2010-09-23 00:59 +0100, Stefan Monnier wrote:
>> FWIW, while I haven't used xml.el much, the little bit I've used it was
>> not particularly pleasant, partly because of the odd format. I don't
>> know how/why the xml.el was chosen and how much thought was put into
>> it, but my experience with it is not 100% positive.
>
> That looks like my experience too.

The main differences in the "new" format are (i) listing attributes as
(:foo bar) inside the element list, rather than in an alist after the
element name, (ii) listing text as (text "foo") rather than "foo", and
(iii) the as-yet-unresolved issue with XML namespaces, which probably
needs to be fixed in xml.c.

Point (i) is a broken design choice, as I already pointed out.  As for
(ii), it is a little nicer to take the cdr of each list member without
checking for stringp.  If others think this is a really good change, I
won't object, though it seems pretty trivial to me.  We can add an
optional flag to the xml-* functions to toggle between the two
representations.



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-23 15:43                                   ` Chong Yidong
@ 2010-09-23 16:53                                     ` Leo
  2010-09-23 21:58                                     ` Wojciech Meyer
  2010-09-23 22:21                                     ` Lars Magne Ingebrigtsen
  2 siblings, 0 replies; 100+ messages in thread
From: Leo @ 2010-09-23 16:53 UTC (permalink / raw)
  To: Chong Yidong; +Cc: emacs-devel

On 2010-09-23 16:43 +0100, Chong Yidong wrote:
> Leo <sdl.web@gmail.com> writes:
>
>> On 2010-09-23 00:59 +0100, Stefan Monnier wrote:
>>> FWIW, while I haven't used xml.el much, the little bit I've used it was
>>> not particularly pleasant, partly because of the odd format. I don't
>>> know how/why the xml.el was chosen and how much thought was put into
>>> it, but my experience with it is not 100% positive.
>>
>> That looks like my experience too.
>
> The main differences in the "new" format are (i) listing attributes as
> (:foo bar) inside the element list, rather than in an alist after the
> element name, (ii) listing text as (text "foo") rather than "foo", and
> (iii) the as-yet-unresolved issue with XML namespaces, which probably
> needs to be fixed in xml.c.
>
> Point (i) is a broken design choice, as I already pointed out.  As for
> (ii), it is a little nicer to take the cdr of each list member without
> checking for stringp.  If others think this is a really good change, I
> won't object, though it seems pretty trivial to me.  We can add an
> optional flag to the xml-* functions to toggle between the two
> representations.

I don't mind either way, but I'd prefer that we settle on a single format.

Leo



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-23 15:43                                   ` Chong Yidong
  2010-09-23 16:53                                     ` Leo
@ 2010-09-23 21:58                                     ` Wojciech Meyer
  2010-09-23 22:21                                     ` Lars Magne Ingebrigtsen
  2 siblings, 0 replies; 100+ messages in thread
From: Wojciech Meyer @ 2010-09-23 21:58 UTC (permalink / raw)
  To: Chong Yidong; +Cc: Leo, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1369 bytes --]

Chong Yidong <cyd@stupidchicken.com> writes:

> Leo <sdl.web@gmail.com> writes:
>
>> On 2010-09-23 00:59 +0100, Stefan Monnier wrote:
>>> FWIW, while I haven't used xml.el much, the little bit I've used it was
>>> not particularly pleasant, partly because of the odd format. I don't
>>> know how/why the xml.el was chosen and how much thought was put into
>>> it, but my experience with it is not 100% positive.
>>
>> That looks like my experience too.
>
> The main differences in the "new" format are (i) listing attributes as
> (:foo bar) inside the element list, rather than in an alist after the
> element name, (ii) listing text as (text "foo") rather than "foo", and
> (iii) the as-yet-unresolved issue with XML namespaces, which probably
> needs to be fixed in xml.c.
>
> Point (i) is a broken design choice, as I already pointed out.  As for
> (ii), it is a little nicer to take the cdr of each list member without
> checking for stringp.  If others think this is a really good change, I
> won't object, though it seems pretty trivial to me.  We can add an
> optional flag to the xml-* functions to toggle between the two
> representations.

This patch fixes all the problems above and gives an SXML-conforming
representation of the element tree.

Obviously we would need to patch `xml.el', and provide an interface for
accessing tree elements.
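
As a rough illustration of what such an interface might look like, here
is a sketch of an attribute accessor.  It assumes the SXML shape the
patch builds: an optional (@ (NAME "VALUE") ...) list directly after
the element name.  The function name is hypothetical and appears
neither in the patch nor in xml.el.

```elisp
;; Hypothetical accessor; not part of the patch or of xml.el.
(defun my-sxml-attribute (node name)
  "Return the string value of attribute NAME in SXML NODE, or nil."
  (let ((maybe-attrs (cadr node)))
    (when (and (consp maybe-attrs)
               ;; `intern' mirrors how the patch creates the `@' symbol.
               (eq (car maybe-attrs) (intern "@")))
      (cadr (assq name (cdr maybe-attrs))))))

;; Example call on a hand-built SXML node:
(my-sxml-attribute
 (list 'book
       (list (intern "@") '(type "manual"))
       '(title "GNU Emacs manual"))
 'type)
```

Callers that go through accessors like this would not need to know
where the attribute list sits inside the node.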

Thanks,
Wojciech


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Support-for-SXML-AST.patch --]
[-- Type: text/x-diff, Size: 3263 bytes --]

From 7db230f57fe9b7904d4d55e1fbe90a7522bd38a5 Mon Sep 17 00:00:00 2001
From: Wojciech Meyer <wojciech.meyer@gmail.com>
Date: Thu, 23 Sep 2010 22:45:32 +0100
Subject: [PATCH] Support for SXML AST.

	* xml.c (make_dom): Make output to conform with
	  SXML spec.
	* ChangeLog: Add entry.

Signed-off-by: Wojciech Meyer <wojciech.meyer@gmail.com>
---
 src/ChangeLog |    5 ++++
 src/xml.c     |   60 ++++++++++++++++++++++++++++++++++++++++++++------------
 2 files changed, 52 insertions(+), 13 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index 2dd892f..b11d291 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,8 @@
+2010-09-23  Wojciech Meyer  <wojciech.meyer@gmail.com>
+
+	* xml.c (make_dom): Make output to conform with
+	  SXML spec.
+
 2010-09-22  Eli Zaretskii  <eliz@gnu.org>
 
 	* editfns.c (Fsubst_char_in_region, Ftranslate_region_internal)
diff --git a/src/xml.c b/src/xml.c
index 5829f1d..9dc0931 100644
--- a/src/xml.c
+++ b/src/xml.c
@@ -28,7 +28,8 @@ along with GNU Emacs.  If not, see <http://www.gnu.org/licenses/>.  */
 #include "lisp.h"
 #include "buffer.h"
 
-Lisp_Object make_dom (xmlNode *node)
+static Lisp_Object 
+make_dom (xmlNode *node)
 {
   if (node->type == XML_ELEMENT_NODE)
     {
@@ -36,27 +37,60 @@ Lisp_Object make_dom (xmlNode *node)
       xmlNode *child;
       xmlAttr *property;
       Lisp_Object plist = Qnil;
+      int was_element_node = 0;
+      
+      /* First add the attributes.  */
 
-      /* First add the attributes. */
       property = node->properties;
-      while (property != NULL)
+
+      /* Don't add nil if no properties */
+      if (property != NULL)
 	{
-	  if (property->children &&
-	      property->children->content)
+	  /* Add special `@' node containing properties */
+	  plist = Fcons(intern("@"),Qnil);
+
+	  while (property != NULL)
 	    {
-	      plist = Fcons (Fcons (intern (property->name),
-				    build_string (property->children->content)),
-			     plist);
+	      if (property->children &&
+		  property->children->content)
+		{
+		  plist = 
+		    Fcons 
+		    (Fcons (intern (property->name),
+			    Fcons (build_string (property->children->content), 
+				   Qnil)),
+		     plist);
+		}
+	      property = property->next;
 	    }
-	  property = property->next;
+	  result = Fcons (Fnreverse (plist), result);
 	}
-      result = Fcons (Fnreverse (plist), result);
-
       /* Then add the children of the node. */
+
       child = node->children;
+
+      /* First try to lookup for elements 
+	 if any found, prohibit adding any text elements */
+
       while (child != NULL)
 	{
-	  result = Fcons (make_dom (child), result);
+	  if (child->type == XML_ELEMENT_NODE)
+	    {
+	      was_element_node = 1;
+	      result = Fcons (make_dom (child), result);
+	    }
+
+	  child = child->next;
+	}
+      
+      child = node->children;
+      
+      while (!was_element_node && child != NULL)
+	{
+
+	  if ( child->type == XML_TEXT_NODE )
+	    result = Fcons (make_dom (child), result);
+
 	  child = child->next;
 	}
 
@@ -73,7 +107,7 @@ Lisp_Object make_dom (xmlNode *node)
     return Qnil;
 }
 
-static Lisp_Object
+INLINE static Lisp_Object
 parse_string (Lisp_Object string, Lisp_Object base_url, int htmlp)
 {
   xmlDoc *doc;
-- 
1.7.0.4


^ permalink raw reply related	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-23 15:43                                   ` Chong Yidong
  2010-09-23 16:53                                     ` Leo
  2010-09-23 21:58                                     ` Wojciech Meyer
@ 2010-09-23 22:21                                     ` Lars Magne Ingebrigtsen
  2010-09-24  0:04                                       ` Stefan Monnier
  2 siblings, 1 reply; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-23 22:21 UTC (permalink / raw)
  To: emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

> The main differences in the "new" format are (i) listing attributes as
> (:foo bar) inside the element list, rather than in an alist after the
> element name, (ii) listing text as (text "foo") rather than "foo", and
> (iii) the as-yet-unresolved issue with XML namespaces, which probably
> needs to be fixed in xml.c.
>
> Point (i) is a broken design choice, as I already pointed out.

Well, yes and no.  Attributes are (:foo . bar).  Nodes can (as you
pointed out, and I didn't know) have names like :foo in XML, but they
can't be (:foo . bar).  They'll always have a list as the cdr.  (The
list may be nil, but it's still a list.)
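
The distinction can be sketched as a pair of predicates, assuming the
format described here: attribute values are always strings, and the cdr
of an element is always a list.  The function names are hypothetical.

```elisp
;; Hypothetical predicates for entries in the new parse tree format.
(defun my-attribute-entry-p (entry)
  "Return non-nil if ENTRY is an attribute such as (:type . STRING)."
  (and (consp entry)
       (keywordp (car entry))
       (stringp (cdr entry))))

(defun my-element-entry-p (entry)
  "Return non-nil if ENTRY is an element, even one named like :type.
Text entries such as (text . STRING) are rejected because their cdr
is a string rather than a list."
  (and (consp entry)
       (symbolp (car entry))
       (listp (cdr entry))))
```

A plain (assq :type ...) lookup still returns whichever entry comes
first, so code that looks entries up by name alone would need a check
like this to tell the two apart.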

The main point of having the : before the attribute names is to
over-determine what we're looking at when we inspect the tree
visually, as we tend to do when we're trying to make heads or
tails of the crappy HTML and XML people give us.

As for the :foo node names, we can map them to anything else if
required.  Pick an invalid XML character -- any one will do, if this is
important.

> As for (ii), it is a little nicer to take the cdr of each list member
> without checking for stringp.  If others think this is a really good
> change, I won't object, though it seems pretty trivial to me.

It seems trivial, but as someone who's dealing with this stuff daily, I
assure you that it's really really annoying never being able to say
`assq' or just looping over the stuff without having the extra `if'
everywhere.  It's just annoying and makes the code unclear and
error-prone. 
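
To make the difference concrete, here is a sketch of looping over a
node's entries in each format and listing what kind of thing each entry
is.  The function names are hypothetical, and the example nodes carry
no attributes so the two results line up.

```elisp
;; Hypothetical helpers contrasting the two parse tree formats.
(defun my-entry-kinds-new (node)
  "List the car of every entry in NODE, given in the new format.
Every entry is a cons, so `car' is always safe."
  (mapcar #'car (cdr node)))

(defun my-entry-kinds-old (node)
  "List the kind of every child of NODE, given in the xml.el format.
Text children are bare strings, so each one needs a `stringp' check."
  (mapcar (lambda (child)
            (if (stringp child) 'text (car child)))
          (cddr node)))

;; The same fragment in both formats:
(my-entry-kinds-new '(p (text . "Hello ") (b (text . "world"))))
(my-entry-kinds-old '(p nil "Hello " (b nil "world")))
```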

> We can add an optional flag to the xml-* functions to toggle between
> the two representations.

Yes, that would be fine by me.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-23 22:21                                     ` Lars Magne Ingebrigtsen
@ 2010-09-24  0:04                                       ` Stefan Monnier
  2010-09-24  0:06                                         ` Lars Magne Ingebrigtsen
  2010-09-24 23:43                                         ` Problems with xml-parse-string Andrew W. Nosenko
  0 siblings, 2 replies; 100+ messages in thread
From: Stefan Monnier @ 2010-09-24  0:04 UTC (permalink / raw)
  To: emacs-devel

> As for the :foo node names, we can map them to anything else if
> required.  Pick an invalid XML character -- any one will do, if this is
> important.

How 'bout =foo ?


-- Stefan



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24  0:04                                       ` Stefan Monnier
@ 2010-09-24  0:06                                         ` Lars Magne Ingebrigtsen
  2010-09-24  1:09                                           ` Chong Yidong
  2010-09-24 23:43                                         ` Problems with xml-parse-string Andrew W. Nosenko
  1 sibling, 1 reply; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-24  0:06 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> As for the :foo node names, we can map them to anything else if
>> required.  Pick an invalid XML character -- any one will do, if this is
>> important.
>
> How 'bout =foo ?

Looks good to me.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24  0:06                                         ` Lars Magne Ingebrigtsen
@ 2010-09-24  1:09                                           ` Chong Yidong
  2010-09-24  2:46                                             ` David De La Harpe Golden
                                                               ` (2 more replies)
  0 siblings, 3 replies; 100+ messages in thread
From: Chong Yidong @ 2010-09-24  1:09 UTC (permalink / raw)
  To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
>>> As for the :foo node names, we can map them to anything else if
>>> required.  Pick an invalid XML character -- any one will do, if this is
>>> important.
>>
>> How 'bout =foo ?
>
> Looks good to me.

If we're going to make a clean break with the old xml.el parse tree
format, I think it makes more sense to go with sxml.  Is there any
technical reason not to?  (Aside from the obnoxious all caps, which we
can safely omit.)



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24  1:09                                           ` Chong Yidong
@ 2010-09-24  2:46                                             ` David De La Harpe Golden
  2010-09-24  5:38                                               ` David Kastrup
  2010-09-24  8:02                                               ` Eli Zaretskii
  2010-09-24 10:44                                             ` Wojciech Meyer
  2010-09-24 10:49                                             ` Lars Magne Ingebrigtsen
  2 siblings, 2 replies; 100+ messages in thread
From: David De La Harpe Golden @ 2010-09-24  2:46 UTC (permalink / raw)
  To: emacs-devel

On 24/09/10 02:09, Chong Yidong wrote:
> Lars Magne Ingebrigtsen<larsi@gnus.org>  writes:
>
>> Stefan Monnier<monnier@iro.umontreal.ca>  writes:
>>
>>>> As for the :foo node names, we can map them to anything else if
>>>> required.  Pick an invalid XML character -- any one will do, if this is
>>>> important.
>>>
>>> How 'bout =foo ?
>>
>> Looks good to me.
>
> If we're going to make a clean break with the old xml.el parse tree
> format, I think it makes more sense to go with sxml.  Is there any
> technical reason not to?

I'm not too convinced mapping xml element and attribute names to 
interned lisp symbols at all is particularly desirable.  Not that I 
personally use xml-... in emacs lisp much/ever, but I have used common 
lisp xml parsing in the past and seem to remember that using strings was 
overall less problematic than symbols (and it wasn't just down to case - 
common lisp only looks case insensitive), just generally easier to be 
non-lossy and non-cluttering-symbol-table-with-random-xml-crap. Read 
some transient xml message once? have some useless symbols hanging round 
forever (for small values of forever).

Yes, strings look messier for "manually keyed in" toy example sexp 
representations of xml, but I imagine the bulk of emacs lisp xml api 
usage in practice would be programmatic.

One could also accept symbols as a shortcut for lisp->xml and just use 
their symbol-names as strings, but output strings for xml->lisp.  I 
suppose you could offer the option of interning for xml->lisp output.

Over in common lisp land, the "xmls" supported (and extended a bit) by 
closure xml [1] used strings, avoiding symbols.

[1] http://common-lisp.net/project/cxml/xmls-compat.html



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24  2:46                                             ` David De La Harpe Golden
@ 2010-09-24  5:38                                               ` David Kastrup
  2010-09-24  8:02                                               ` Eli Zaretskii
  1 sibling, 0 replies; 100+ messages in thread
From: David Kastrup @ 2010-09-24  5:38 UTC (permalink / raw)
  To: emacs-devel

David De La Harpe Golden <david@harpegolden.net> writes:

> On 24/09/10 02:09, Chong Yidong wrote:
>> Lars Magne Ingebrigtsen<larsi@gnus.org>  writes:
>>
>>> Stefan Monnier<monnier@iro.umontreal.ca>  writes:
>>>
>>>>> As for the :foo node names, we can map them to anything else if
>>>>> required.  Pick an invalid XML character -- any one will do, if this is
>>>>> important.
>>>>
>>>> How 'bout =foo ?
>>>
>>> Looks good to me.
>>
>> If we're going to make a clean break with the old xml.el parse tree
>> format, I think it makes more sense to go with sxml.  Is there any
>> technical reason not to?
>
> I'm not too convinced mapping xml element and attribute names to
> interned lisp symbols at all is particularly desirable.  Not that I
> personally use xml-... in emacs lisp much/ever, but I have used common
> lisp xml parsing in the past and seem to remember that using strings
> was overall less problematic than symbols (and it wasn't just down to
> case - 
> common lisp only looks case insensitive), just generally easier to be
> non-lossy and non-cluttering-symbol-table-with-random-xml-crap. Read
> some transient xml message once? have some useless symbols hanging
> round forever (for small values of forever).

You can scan through interned symbols (or uninterned ones) much faster
than through strings because they compare EQ.
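
A small sketch of the difference, using made-up attribute lists (the
`my-' variables are only for illustration):

```elisp
;; Symbol keys: `assq' compares with `eq', a single pointer comparison.
(defvar my-symbol-attrs '((href . "a.html") (title . "A")))
;; String keys: `assoc' compares with `equal', which has to look at
;; the contents of the strings.
(defvar my-string-attrs '(("href" . "a.html") ("title" . "A")))

(cdr (assq 'title my-symbol-attrs))    ; => "A"
(cdr (assoc "title" my-string-attrs))  ; => "A"

;; `assq' cannot be used with string keys: a freshly made string is a
;; distinct object even when its contents match.
(assq (copy-sequence "title") my-string-attrs)  ; => nil
;; Interning the same name always yields the same symbol.
(eq 'title (intern "title"))                    ; => t
```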

-- 
David Kastrup




^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24  2:46                                             ` David De La Harpe Golden
  2010-09-24  5:38                                               ` David Kastrup
@ 2010-09-24  8:02                                               ` Eli Zaretskii
  2010-09-24 10:47                                                 ` Wojciech Meyer
  1 sibling, 1 reply; 100+ messages in thread
From: Eli Zaretskii @ 2010-09-24  8:02 UTC (permalink / raw)
  To: David De La Harpe Golden; +Cc: emacs-devel

> Date: Fri, 24 Sep 2010 03:46:31 +0100
> From: David De La Harpe Golden <david@harpegolden.net>
> 
> I'm not too convinced mapping xml element and attribute names to 
> interned lisp symbols at all is particularly desirable.  Not that I 
> personally use xml-... in emacs lisp much/ever, but I have used common 
> lisp xml parsing in the past and seem to remember that using strings was 
> overall less problematic than symbols (and it wasn't just down to case - 
> common lisp only looks case insensitive), just generally easier to be 
> non-lossy and non-cluttering-symbol-table-with-random-xml-crap. Read 
> some transient xml message once? have some useless symbols hanging round 
> forever (for small values of forever).

If polluting the global obarray is a concern, perhaps we could use a
private one for XML symbols.
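
A rough sketch of what a private obarray could look like.  The names
`my-xml-obarray' and `my-xml-intern' are hypothetical, and
`obarray-make' assumes a reasonably recent Emacs (older code would use
something like (make-vector 17 0) instead):

```elisp
(require 'obarray)

;; A private symbol table for names read from XML.
(defvar my-xml-obarray (obarray-make))

(defun my-xml-intern (name)
  "Intern the string NAME in `my-xml-obarray' and return the symbol."
  (intern name my-xml-obarray))

;; Same name, same private symbol, so `eq' comparisons still work.
(eq (my-xml-intern "item") (my-xml-intern "item"))   ; => t
;; The private symbol is distinct from the global one of the same name.
(eq (my-xml-intern "item") 'item)                    ; => nil
;; A name seen only in XML never reaches the global obarray.
(my-xml-intern "my-xml-transient-tag-name")
(intern-soft "my-xml-transient-tag-name")            ; => nil
```

One consequence of this design is that code matching on such symbols
has to obtain them through the same obarray, since a quoted 'item in
source code refers to the global symbol.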



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24  1:09                                           ` Chong Yidong
  2010-09-24  2:46                                             ` David De La Harpe Golden
@ 2010-09-24 10:44                                             ` Wojciech Meyer
  2010-09-24 10:49                                             ` Lars Magne Ingebrigtsen
  2 siblings, 0 replies; 100+ messages in thread
From: Wojciech Meyer @ 2010-09-24 10:44 UTC (permalink / raw)
  To: Chong Yidong; +Cc: emacs-devel

On Fri, Sep 24, 2010 at 2:09 AM, Chong Yidong <cyd@stupidchicken.com> wrote:
> Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
>
>> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>>
>>>> As for the :foo node names, we can map them to anything else if
>>>> required.  Pick an invalid XML character -- any one will do, if this is
>>>> important.
>>>
>>> How 'bout =foo ?
>>
>> Looks good to me.
>
> If we're going to make a clean break with the old xml.el parse tree
> format, I think it makes more sense to go with sxml.  Is there any
> technical reason not to?  (Aside from the obnoxious all caps, which we
> can safely omit.)

Thanks.
I've read all the other messages and here I'm proposing solutions for
both problems:
- only for names starting with ':' or any other character special to
  Emacs, we can generate a string instead of a symbol.  The drawback of
  this solution is that the output is no longer consistent and
  conforming to Sxml, but the advantage is that it handles all the
  corner cases, and I cannot think of a better solution.  We'd need
  special handling code abstracting access to it (but that's a good
  idea anyway, and it would be needed anyway with the
  `prefix|replacement' solution).  Another advantage of trying to keep
  symbols is that we can possibly eval code directly, without any
  transformation, and the data is more natural and expressive.  Also,
  there could be functions, xml-valid-sxmlp, xml-can-evalp, for testing
  various properties.

- for the problem with polluting obarrays with random interned symbols,
  David's solution would work.

Wojciech



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24  8:02                                               ` Eli Zaretskii
@ 2010-09-24 10:47                                                 ` Wojciech Meyer
  0 siblings, 0 replies; 100+ messages in thread
From: Wojciech Meyer @ 2010-09-24 10:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, David De La Harpe Golden

> If polluting the global obarray is a concern, perhaps we could use a
> private one for XML symbols.

Sorry Eli/David, I meant it was Eli's idea, and I also agree that
comparing symbols is much less costly than comparing strings.

Wojciech



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24  1:09                                           ` Chong Yidong
  2010-09-24  2:46                                             ` David De La Harpe Golden
  2010-09-24 10:44                                             ` Wojciech Meyer
@ 2010-09-24 10:49                                             ` Lars Magne Ingebrigtsen
  2010-09-24 15:25                                               ` Chong Yidong
  2 siblings, 1 reply; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-24 10:49 UTC (permalink / raw)
  To: emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

> If we're going to make a clean break with the old xml.el parse tree
> format, I think it makes more sense to go with sxml. 

The sxml format is even less regular than the xml.el format.  What would
the advantages of switching to sxml be?

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24 10:49                                             ` Lars Magne Ingebrigtsen
@ 2010-09-24 15:25                                               ` Chong Yidong
  2010-09-24 15:53                                                 ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 100+ messages in thread
From: Chong Yidong @ 2010-09-24 15:25 UTC (permalink / raw)
  To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Chong Yidong <cyd@stupidchicken.com> writes:
>
>> If we're going to make a clean break with the old xml.el parse tree
>> format, I think it makes more sense to go with sxml.
>
> The sxml format is even less regular than the xml.el format.  What would
> the advantages of switching to sxml be?

Could you elaborate?  I'm looking at sxml's Wikipedia page, which has an
example where

  <tag attr1="value1"
       attr2="value2">
    <nested>Text node</nested>
    <empty/>
  </tag>

maps to

  (tag (@ (attr1 "value1")
          (attr2 "value2"))
    (nested "Text node")
    (empty))

This seems pretty regular to me.



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24 15:25                                               ` Chong Yidong
@ 2010-09-24 15:53                                                 ` Lars Magne Ingebrigtsen
  2010-09-24 16:26                                                   ` Chong Yidong
  0 siblings, 1 reply; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-24 15:53 UTC (permalink / raw)
  To: emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

> Could you elaborate?  I'm looking at sxml's Wikipedia page, which has an
> example where
>
>   <tag attr1="value1"
>        attr2="value2">
>     <nested>Text node</nested>
>     <empty/>
>   </tag>
>
> maps to
>
>   (tag (@ (attr1 "value1")
>           (attr2 "value2"))
>     (nested "Text node")
>     (empty))
>
> This seems pretty regular to me.

The main difference between sxml and xml.el output is that it has the
weird and unnecessary "@" node for the attributes and that it wastes a
cons in the attributes, isn't it?

Other than that it has the same problem that xml.el has, in that text
nodes have to be special-cased, so you can't say assq or use simple
descent without testing.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24 15:53                                                 ` Lars Magne Ingebrigtsen
@ 2010-09-24 16:26                                                   ` Chong Yidong
  2010-09-24 16:46                                                     ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 100+ messages in thread
From: Chong Yidong @ 2010-09-24 16:26 UTC (permalink / raw)
  To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Chong Yidong <cyd@stupidchicken.com> writes:
>
>>   (tag (@ (attr1 "value1")
>>           (attr2 "value2"))
>>     (nested "Text node")
>>     (empty))
>>
>> This seems pretty regular to me.
>
> The main difference between sxml and xml.el output is that it has the
> weird and unnecessary "@" node for the attributes and that it wastes a
> cons in the attributes, isn't it?

The xml.el output always has an alist for attributes after each tag; if
there are no attributes, the element after the tag name is nil.  In
sxml, the `@' denotes an attribute list, which is omitted if no
attributes exist.

> Other than that it has the same problem that xml.el has, in that text
> nodes have to be special-cased, so you can't say assq or use simple
> descent without testing.

It is illogical to criticize sxml for wasting conses, while arguing for
wrapping each text node in a cons.

Anyway, it is difficult to see how real the problem is without a
concrete example.  Could you provide one?  I suspect that the real
problem, if one exists, is Elisp's relatively weak support for list
mapping and reduction; if that's the case, the correct solution is to
pull in some of the relevant functions from the CL package.



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24 16:26                                                   ` Chong Yidong
@ 2010-09-24 16:46                                                     ` Lars Magne Ingebrigtsen
  2010-09-24 17:34                                                       ` Wojciech Meyer
                                                                         ` (2 more replies)
  0 siblings, 3 replies; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-24 16:46 UTC (permalink / raw)
  To: emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

>> The main difference between sxml and xml.el output is that it has the
>> weird and unnecessary "@" node for the attributes and that it wastes a
>> cons in the attributes, isn't it?
>
> The xml.el output always has an alist for attributes after each tag; if
> there are no attributes, the element after the tag name is nil.  In
> sxml, the `@' denotes an attribute list, which is omitted if no
> attributes exist.

Yes.  So it's yet another irregularity you have to check for.

To take a concrete example: You want the src of the img node you have.

xml.el:  (cdr (assq 'img (cadr node)))
sxml.el: (if (and (consp (cadr node))
                  (eq (caadr node) '@))
             (cadr (assq 'img node)))

(And I'm not even sure that's correct.  It's probably not.  Which is my
point.)

libxml: (cdr (assq :img (cdr node)))

(The difference between libxml and xml.c for attributes is minuscule.)
             
>> Other than that it has the same problem that xml.el has, in that text
>> nodes have to be special-cased, so you can't say assq or use simple
>> descent without testing.
>
> It is illogical to criticize sxml for wasting conses, while arguing for
> wrapping each text node in a cons.

No, it is not.  I'm sacrificing space for speed and regularity.  sxml
is wasting cons cells and adding slowdowns at the same time.

> Anyway, it is difficult to see how real the problem is without a
> concrete example.  Could you provide one?  I suspect that the real
> problem, if one exists, is Elisp's relatively weak support for list
> mapping and reduction; if that's the case, the correct solution is to
> pull in some of the relevant functions from the CL package.

Here's a pretty piece of code, chosen at random:

(defun nnrss-find-el (tag data &optional found-list)
  "Find the all matching elements in the data.
Careful with this on large documents!"
  (when (consp data)
    (dolist (bit data)
      (when (car-safe bit)
	(when (equal tag (car bit))
	  ;; Old xml.el may return a list of string.
	  (when (and (consp (caddr bit))
		     (stringp (caaddr bit)))
	    (setcar (cddr bit) (caaddr bit)))
	  (setq found-list
		(append found-list
			(list bit))))
	(if (and (consp (car-safe (caddr bit)))
		 (not (stringp (caddr bit))))
	    (setq found-list
		  (append found-list
			  (nnrss-find-el
			   tag (caddr bit))))
	  (setq found-list
		(append found-list
			(nnrss-find-el
			 tag (cddr bit))))))))
  found-list)

The horror!  

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24 16:46                                                     ` Lars Magne Ingebrigtsen
@ 2010-09-24 17:34                                                       ` Wojciech Meyer
  2010-09-24 18:09                                                         ` Frank Schmitt
  2010-09-24 18:47                                                       ` Chong Yidong
  2010-09-25 14:42                                                       ` Andy Wingo
  2 siblings, 1 reply; 100+ messages in thread
From: Wojciech Meyer @ 2010-09-24 17:34 UTC (permalink / raw)
  To: emacs-devel

Sorry, I don't understand your point at all. Did you hear about
abstract data structures?

On 9/24/10, Lars Magne Ingebrigtsen <larsi@gnus.org> wrote:
> Chong Yidong <cyd@stupidchicken.com> writes:
>
>>> The main difference between sxml and xml.el output is that it has the
>>> weird and unnecessary "@" node for the attributes and that it wastes a
>>> cons in the attributes, isn't it?
>>
>> The xml.el output always has an alist for attributes after each tag; if
>> there are no attributes, the element after the tag name is nil.  In
>> sxml, the `@' denotes an attribute list, which is omitted if no
>> attributes exist.
>
> Yes.  So it's yet another irregularity you have to check for.
>
> To take a concrete example: You want the src of the img node you have.
>
> xml.el:  (cdr (assq 'img (cadr node)))
> sxml.el: (if (and (consp (cadr node))
>                   (eq (caadr node) '@))
>              (cadr (assq 'img node)))
>
> (And I'm not even sure that's correct.  It's probably not.  Which is my
> point.)
>
> libxml: (cdr (assq :img (cdr node)))
>
> (The difference between libxml and xml.c for attributes is minuscule.)
>
>>> Other than that it has the same problem that xml.el has, in that text
>>> nodes have to be special-cased, so you can't say assq or use simple
>>> descent without testing.
>>
>> It is illogical to criticize sxml for wasting conses, while arguing for
>> wrapping each text node in a cons.
>
> No, it is not.  I'm sacrificing space for speed and regularity.  sxml
> is wasting cons cells and adding slowdowns at the same time.
>
>> Anyway, it is difficult to see how real the problem is without a
>> concrete example.  Could you provide one?  I suspect that the real
>> problem, if one exists, is Elisp's relatively weak support for list
>> mapping and reduction; if that's the case, the correct solution is to
>> pull in some of the relevant functions from the CL package.
>
> Here's a pretty piece of code, chosen at random:
>
> (defun nnrss-find-el (tag data &optional found-list)
>   "Find the all matching elements in the data.
> Careful with this on large documents!"
>   (when (consp data)
>     (dolist (bit data)
>       (when (car-safe bit)
> 	(when (equal tag (car bit))
> 	  ;; Old xml.el may return a list of string.
> 	  (when (and (consp (caddr bit))
> 		     (stringp (caaddr bit)))
> 	    (setcar (cddr bit) (caaddr bit)))
> 	  (setq found-list
> 		(append found-list
> 			(list bit))))
> 	(if (and (consp (car-safe (caddr bit)))
> 		 (not (stringp (caddr bit))))
> 	    (setq found-list
> 		  (append found-list
> 			  (nnrss-find-el
> 			   tag (caddr bit))))
> 	  (setq found-list
> 		(append found-list
> 			(nnrss-find-el
> 			 tag (cddr bit))))))))
>   found-list)
>
> The horror!
>
> --
> (domestic pets only, the antidote for overdose, milk.)
>   larsi@gnus.org * Lars Magne Ingebrigtsen
>
>
>

-- 
Sent from my mobile device



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24 17:34                                                       ` Wojciech Meyer
@ 2010-09-24 18:09                                                         ` Frank Schmitt
  2010-09-24 18:21                                                           ` Ted Zlatanov
  2010-09-24 18:31                                                           ` Wojciech Meyer
  0 siblings, 2 replies; 100+ messages in thread
From: Frank Schmitt @ 2010-09-24 18:09 UTC (permalink / raw)
  To: emacs-devel

Wojciech Meyer <wojciech.meyer@googlemail.com> writes:

> Sorry, I don't understand your point at all. Did you hear about
> abstract data structures?

You've got a pretty unfriendly tone in your posts. Oh, and BTW: Did you
hear about quoting conventions in mail and newsgroups?

<Outlook style full-bottom-quote removed>

-- 
Have you ever considered how much text can fit in eighty columns?  Given that a
signature typically contains up to four lines of text, this space allows you to
attach a tremendous amount of valuable information to your messages.  Seize the
opportunity and don't waste your signature on bullshit that nobody cares about.




^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24 18:09                                                         ` Frank Schmitt
@ 2010-09-24 18:21                                                           ` Ted Zlatanov
  2010-09-24 18:31                                                           ` Wojciech Meyer
  1 sibling, 0 replies; 100+ messages in thread
From: Ted Zlatanov @ 2010-09-24 18:21 UTC (permalink / raw)
  To: emacs-devel

On Fri, 24 Sep 2010 20:09:45 +0200 Frank Schmitt <ich@frank-schmitt.net> wrote: 

FS> Wojciech Meyer <wojciech.meyer@googlemail.com> writes:
>> Sorry, I don't understand your point at all. Did you hear about
>> abstract data structures?

FS> You've got a pretty unfriendly tone in your posts. Oh, and BTW: Did you
FS> hear about quoting conventions in mail and newsgroups?

No, no, let Lars answer if he has heard about abstract data structures.
I'm curious also :)

Ted




^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24 18:09                                                         ` Frank Schmitt
  2010-09-24 18:21                                                           ` Ted Zlatanov
@ 2010-09-24 18:31                                                           ` Wojciech Meyer
  1 sibling, 0 replies; 100+ messages in thread
From: Wojciech Meyer @ 2010-09-24 18:31 UTC (permalink / raw)
  To: Frank Schmitt; +Cc: emacs-devel

Frank Schmitt <ich@frank-schmitt.net> writes:

> Wojciech Meyer <wojciech.meyer@googlemail.com> writes:
>
>> Sorry, I don't understand your point at all. Did you hear about
>> abstract data structures?
>
> You've got a pretty unfriendly tone in your posts. Oh, and BTW: Did you
> hear about quoting conventions in mail and newsgroups?
>
> <Outlook style full-bottom-quote removed>

Sorry about this; it was sent from my *mobile*, where I have no control
over the layout or citations; in fact I cannot see them. If you had read
carefully you would have seen `Sent from my mobile' underneath.

About your first complaint please see:

> Wojciech Meyer <wojciech.meyer@googlemail.com>  writes:

> > It is data, it doesn't need to look good. Obviously you handle that
> > as an abstract data structure.

> I take it you're not a programmer?

> --
> (domestic pets only, the antidote for overdose, milk.)
>  larsi@gnus.org * Lars Magne Ingebrigtsen

If you followed the thread, you will see I raised that once, and received
a very impolite reply. I hope that helps you understand the situation (80
columns fit in Gnus now.). In fact I always try to be polite, but see
for yourself.

If that's your point.

Anyway, I am giving up, because I am not in charge of changing it.  It is
OK to use the existing format; I don't really care (but I do care about
the quality of delivered code in Emacs). That's all.

Thanks,

Wojciech



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24 16:46                                                     ` Lars Magne Ingebrigtsen
  2010-09-24 17:34                                                       ` Wojciech Meyer
@ 2010-09-24 18:47                                                       ` Chong Yidong
  2010-09-24 18:53                                                         ` Chong Yidong
  2010-09-24 19:06                                                         ` Lars Magne Ingebrigtsen
  2010-09-25 14:42                                                       ` Andy Wingo
  2 siblings, 2 replies; 100+ messages in thread
From: Chong Yidong @ 2010-09-24 18:47 UTC (permalink / raw)
  To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Here's a pretty piece of code, chosen at random:
>
> (defun nnrss-find-el (tag data &optional found-list)
> ...
> The horror!

I think that code might just be crufty.  Here's an implementation,
assuming sxml format:

  (defun nnrss-find-el (tag data &optional found-list)
   "Find the all matching elements in the data.
  Careful with this on large documents!"
   (nreverse (nnrss-find-el-1 tag data)))

  (defun nnrss-find-el-1 (tag data &optional found-list)
    (and (consp data)
         (not (eq (car data) '@))
         (if (equal tag (car data))
             (push data found-list)
           (dolist (bit (cdr data))
             (setq found-list
                   (nnrss-find-el-1 tag bit found-list)))))
    found-list)

Doesn't seem too horrific.  Here's the example tree:

  (setq test-sxml-tree
        '(tag (@ (attr1 "value1")
                 (attr2 "value2"))
              "Free text"
              (foo "Text node 1")
              (bar "Text node 2")
              (baz
                "More free text"
                (foo "Text node 3")
                (bar "Text node 4"))))

> To take a concrete example: You want the src of the img node you have.
>
> xml.el:  (cdr (assq 'img (cadr node)))
> sxml.el: (if (and (consp (cadr node))
>                   (eq (caadr node) '@))
>              (cadr (assq 'img node)))
>
> (And I'm not even sure that's correct.  It's probably not.  Which is my
> point.)
>
> libxml: (cdr (assq :img (cdr node)))
>
> (The difference between libxml and xml.c for attributes is minuscule.)

I think this example is confused.  If you're scanning at top-level,
without descent, the code for all three cases is practically identical:
it's a simple assq.



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24 18:47                                                       ` Chong Yidong
@ 2010-09-24 18:53                                                         ` Chong Yidong
  2010-09-24 18:58                                                           ` Wojciech Meyer
  2010-09-24 19:06                                                         ` Lars Magne Ingebrigtsen
  1 sibling, 1 reply; 100+ messages in thread
From: Chong Yidong @ 2010-09-24 18:53 UTC (permalink / raw)
  To: emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

> I think that code might just be crufty.  Here's an implementation,
> assuming sxml format

Anyway, now that I think about it, the point that Leo made about public
functions is a good one.  If xml.el were to provide some public
functions for operations like "finding" or "flattening" or "removing
tags", the exact format becomes less important.
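
As a rough sketch of that idea: if callers only go through a couple of
accessors, a "find" operation can be written once without knowing the
layout.  All the `my-' names below are hypothetical, not existing
xml.el functions, and the accessors here assume the sxml layout from
the earlier example:

```elisp
;; Hypothetical accessors; only these know the concrete layout.
(defun my-node-tag (node)
  "Return the tag symbol of element NODE."
  (car node))

(defun my-node-children (node)
  "Return the child elements of NODE, skipping attributes and text."
  (let (children)
    (dolist (child (cdr node))
      (when (and (consp child) (not (eq (car child) '@)))
        (push child children)))
    (nreverse children)))

;; A format-independent operation written only in terms of the accessors.
(defun my-find-elements (tag node)
  "Return all descendants of NODE whose tag is TAG, in document order."
  (let (found)                          ; accumulated in reverse order
    (dolist (child (my-node-children node))
      (when (eq (my-node-tag child) tag)
        (push child found))
      (setq found (nconc (nreverse (my-find-elements tag child)) found)))
    (nreverse found)))

(my-find-elements
 'foo
 '(tag (@ (attr1 "value1"))
       "Free text"
       (foo "Text node 1")
       (bar "Text node 2")
       (baz "More free text" (foo "Text node 3"))))
;; => ((foo "Text node 1") (foo "Text node 3"))
```

Switching to another tree format would then only mean rewriting the two
accessors, not `my-find-elements' or its callers.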



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24 18:53                                                         ` Chong Yidong
@ 2010-09-24 18:58                                                           ` Wojciech Meyer
  0 siblings, 0 replies; 100+ messages in thread
From: Wojciech Meyer @ 2010-09-24 18:58 UTC (permalink / raw)
  To: Chong Yidong; +Cc: emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

> Chong Yidong <cyd@stupidchicken.com> writes:
>
>> I think that code might just be crufty.  Here's an implementation,
>> assuming sxml format
>
> Anyway, now that I think about it, the point that Leo made about public
> functions is a good one.  If xml.el were to provide some public
> functions for operations like "finding" or "flattening" or "removing
> tags", the exact format becomes less important.

Yes, I fully agree with it. If we could move away from the physical
representation of the data through some abstraction layer, it would be
easier to apply any modifications.

My 2 cents,
Wojciech



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24 18:47                                                       ` Chong Yidong
  2010-09-24 18:53                                                         ` Chong Yidong
@ 2010-09-24 19:06                                                         ` Lars Magne Ingebrigtsen
  2010-09-24 19:25                                                           ` Chong Yidong
  1 sibling, 1 reply; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-24 19:06 UTC (permalink / raw)
  To: emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

>> To take a concrete example: You want the src of the img node you have.
>>
>> xml.el:  (cdr (assq 'img (cadr node)))
>> sxml.el: (if (and (consp (cadr node))
>>                   (eq (caadr node) '@))
>>              (cadr (assq 'img node)))
>>
>> (And I'm not even sure that's correct.  It's probably not.  Which is my
>> point.)
>>
>> libxml: (cdr (assq :img (cdr node)))
>>
>> (The difference between libxml and xml.c for attributes is minuscule.)
>
> I think this example is confused.  If you're scanning at top-level,
> without descent, the code for all three cases is practically identical:
> it's a simple assq.

Sorry, "img" in the examples should be replaced with "src".

But I don't understand what you mean with "top-level".  This example was
about getting an attribute.
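
For reference, a minimal sketch of the three lookups with `src' in place
of `img'.  The node literals and the `demo-' names are assumptions made
up for illustration, based on the formats as described in this thread;
they are not output from the actual parsers:

```elisp
;; Hypothetical <img src="a.png"/> node in each representation.
(defvar demo-xml-el-node '(img ((src . "a.png"))))  ; xml.el: (NAME ATTR-ALIST . CHILDREN)
(defvar demo-sxml-node   '(img (@ (src "a.png"))))  ; sxml: optional (@ ...) attribute list
(defvar demo-libxml-node '(img (:src . "a.png")))   ; format proposed in this thread

(defun demo-xml-el-src (node)
  "Return the src attribute of NODE in xml.el format."
  (cdr (assq 'src (cadr node))))

(defun demo-sxml-src (node)
  "Return the src attribute of NODE in sxml format, or nil."
  (let ((attrs (cadr node)))
    ;; The attribute list is only present when the second element
    ;; is a list headed by the symbol @.
    (when (and (consp attrs) (eq (car attrs) '@))
      (cadr (assq 'src (cdr attrs))))))

(defun demo-libxml-src (node)
  "Return the src attribute of NODE in the proposed libxml format."
  (cdr (assq :src (cdr node))))
```

Each function returns "a.png" for its own node literal; only the sxml
one needs the extra check for the `@' marker.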

Chong Yidong <cyd@stupidchicken.com> writes:

> I think that code might just be crufty.  Here's an implementation,
> assuming sxml format:
>
>   (defun nnrss-find-el (tag data &optional found-list)
>    "Find the all matching elements in the data.
>   Careful with this on large documents!"
>    (nreverse (nnrss-find-el-1 tag data)))
>
>   (defun nnrss-find-el-1 (tag data &optional found-list)
>     (and (consp data)
>          (not (eq (car data) '@))
>          (if (equal tag (car data))
>              (push data found-list)
>            (dolist (bit (cdr data))
>              (setq found-list
>                    (nnrss-find-el-1 tag bit found-list)))))
>     found-list)

Here's the libxml version:

(defun nnrss-find-el (tag node)
  (let (result)
    (dolist (elem (cdr node))
      (when (eq (car elem) tag)
        (push elem result))
      (when (consp (cdr elem))
        (setq result (nconc (nnrss-find-el tag elem) result))))
    result))

(Of course, if we were allowed to use CL constructions, it would be
something like this (untested, because I have to run):

(defun nnrss-find-el (tag node)
  (loop for elem in (cdr node)
        if (eq (car elem) tag)
        collect elem
        when (consp (cdr elem))
        append (nnrss-find-el tag elem)))

Can't get much clearer than that, can it?)
        
> Doesn't seem too horrific.  Here's the example tree:

It's not too horrific.  It's just that it's not as nice as it could be.
And if you deal with these structures all the time, you get to recreate
the (and (consp data) (not (eq (car data) '@))) in every single branch
of every single little trivial function you write (and read).

>   (setq test-sxml-tree
>         '(tag (@ (attr1 "value1")
>                  (attr2 "value2"))
>               "Free text"
>               (foo "Text node 1")
>               (bar "Text node 2")
>               (baz
>                 "More free text"
>                 (foo "Text node 3")
>                 (bar "Text node 4"))))

(setq test-libxml-tree
      '(tag
	(:attr1 . "value1")
	(:attr2 . "value2")
	(text . "Free text")
	(foo (text . "Text node 1"))
	(bar (text . "Text node 2"))
	(baz
	 (text . "More free text")
	 (foo (text . "Text node 3"))
	 (bar (text . "Text node 4")))))
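
A usage sketch for the libxml-style function above, applied to this
tree.  The function is repeated under the hypothetical name
`demo-find-el' so the snippet stands alone, and the tree is copied into
`demo-libxml-tree'; neither name comes from Gnus:

```elisp
(defvar demo-libxml-tree
  '(tag
    (:attr1 . "value1")
    (:attr2 . "value2")
    (text . "Free text")
    (foo (text . "Text node 1"))
    (bar (text . "Text node 2"))
    (baz
     (text . "More free text")
     (foo (text . "Text node 3"))
     (bar (text . "Text node 4")))))

(defun demo-find-el (tag node)
  "Return all elements named TAG below NODE (libxml-style tree)."
  (let (result)
    (dolist (elem (cdr node))
      (when (eq (car elem) tag)
        (push elem result))
      ;; Attributes and text nodes are dotted pairs with a string cdr,
      ;; so only real elements are descended into.
      (when (consp (cdr elem))
        (setq result (nconc (demo-find-el tag elem) result))))
    result))

(demo-find-el 'foo demo-libxml-tree)
;; => ((foo (text . "Text node 3")) (foo (text . "Text node 1")))
```

The matches come back in reverse document order because of the
push/nconc accumulation; wrap the call in `reverse' if order matters.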
    
-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24 19:06                                                         ` Lars Magne Ingebrigtsen
@ 2010-09-24 19:25                                                           ` Chong Yidong
  2010-09-24 19:34                                                             ` Lars Magne Ingebrigtsen
  2010-09-24 22:01                                                             ` Stefan Monnier
  0 siblings, 2 replies; 100+ messages in thread
From: Chong Yidong @ 2010-09-24 19:25 UTC (permalink / raw)
  To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

>> Doesn't seem too horrific.
>
> It's not too horrific.  It's just that it's not as nice as it could
> be.  And if you deal with these structures all the time, you get to
> recreate the (and (consp data) (not (eq (car data) '@))) in every
> single branch of every single little trivial function you write (and
> read).

That's why you use `xml-node-children' and so forth.



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24 19:25                                                           ` Chong Yidong
@ 2010-09-24 19:34                                                             ` Lars Magne Ingebrigtsen
  2010-09-24 21:57                                                               ` Chong Yidong
  2010-09-24 22:01                                                             ` Stefan Monnier
  1 sibling, 1 reply; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-24 19:34 UTC (permalink / raw)
  To: emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

> That's why you use `xml-node-children' and so forth.

So your argument is now that it doesn't matter how crappy the
representation is, since we're going to teach all users of the data to
use small accessor functions that hide the horror? Instead of assq we're
going to use xml-find-attribute?

Well, that's dandy, if you manage to cover all the use cases, and you
manage to teach all users these accessors.

But that's not how it works. Somebody wants some data. It's in html. They
parse it.  They look at it. They use assq and are angry it breaks in
some cases.

(excuse typos. I'm writing from my mobile phone.)
 
-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24 19:34                                                             ` Lars Magne Ingebrigtsen
@ 2010-09-24 21:57                                                               ` Chong Yidong
  2010-09-25 13:11                                                                 ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 100+ messages in thread
From: Chong Yidong @ 2010-09-24 21:57 UTC (permalink / raw)
  To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> So your argument is now that it doesn't matter how crappy the
> representation is, since we're going to teach all users of the data to
> use small accessor functions that hide the horror? Instead of assq
> we're going to use xml-find-attribute?

You are overselling your case.  The representation you're pushing also
has its complexities.  For instance, you need to mangle symbol names to
get the symbols representing XML attributes.

I can see where your arguments are coming from, but I don't think enough
has been brought to the table to justify switching to a non-standard and
non-backward-compatible internal format.



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24 19:25                                                           ` Chong Yidong
  2010-09-24 19:34                                                             ` Lars Magne Ingebrigtsen
@ 2010-09-24 22:01                                                             ` Stefan Monnier
  2010-09-24 22:17                                                               ` Chong Yidong
  1 sibling, 1 reply; 100+ messages in thread
From: Stefan Monnier @ 2010-09-24 22:01 UTC (permalink / raw)
  To: Chong Yidong; +Cc: emacs-devel

>>> Doesn't seem too horrific.
>> It's not too horrific.  It's just that it's not as nice as it could
>> be.  And if you deal with these structures all the time, you get to
>> recreate the (and (consp data) (not (eq (car data) '@))) in every
>> single branch of every single little trivial function you write (and
>> read).
> That's why you use `xml-node-children' and so forth.

There's also the performance impact.  Elisp is slow, so it would only
work well if we push xml-node-children and friends down to C in such
a case.


        Stefan



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24 22:01                                                             ` Stefan Monnier
@ 2010-09-24 22:17                                                               ` Chong Yidong
  2010-09-25  0:25                                                                 ` Wojciech Meyer
  0 siblings, 1 reply; 100+ messages in thread
From: Chong Yidong @ 2010-09-24 22:17 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> There's also the performance impact.  Elisp is slow, so it would only
> work well if we push xml-node-children and friends down to C in such
> a case.

xml-node-children is cddr, so this is not the best example.
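
A small sketch of that point, assuming a hand-written node in xml.el's
(NAME ATTRIBUTE-ALIST . CHILDREN) format; `demo-xml-node' and
`demo-children-two-ways' are hypothetical names:

```elisp
(require 'xml)

;; Hypothetical node in xml.el's format: (NAME ATTRIBUTE-ALIST . CHILDREN).
(defvar demo-xml-node '(p ((class . "intro")) "Hello " (b nil "world")))

(defun demo-children-two-ways (node)
  "Return NODE's children via the accessor and via a bare `cddr'."
  (list (xml-node-children node) (cddr node)))
```

Both elements of the returned list are the same children list, so the
accessor adds no work of its own here.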

If what you're saying is that it might be good to push certain
(currently unspecified) xml-handling functions into the C level,
I agree.



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24  0:04                                       ` Stefan Monnier
  2010-09-24  0:06                                         ` Lars Magne Ingebrigtsen
@ 2010-09-24 23:43                                         ` Andrew W. Nosenko
  1 sibling, 0 replies; 100+ messages in thread
From: Andrew W. Nosenko @ 2010-09-24 23:43 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

On Fri, Sep 24, 2010 at 03:04, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>> As for the :foo node names, we can map them to anything else if
>> required.  Pick an invalid XML character -- any one will do, if this is
>> important.
>
> How 'bout =foo ?
>

Another way to make the problem disappear: prefix _every_ name, both
nodes and attributes.  Nodes get one character (e.g. '.' (dot)) and
attributes get another (e.g. the current ':' (colon)).

<node attr="val"/>    becomes ".node" and ":attr"
<:node :attr="val"/>    becomes ".:node" and "::attr"
<.node .attr="val"/>    becomes "..node" and ":.attr" (if a dot is
allowed as the first character of a name at all)

This avoids any ambiguity about whether the first character is part of
the original name or a "type sign" introduced by the engine: it is
always the type sign, and it is always introduced by the engine.
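
A minimal sketch of this scheme, assuming names arrive as strings from
the parser.  The three `demo-' functions are hypothetical and not part
of any existing library:

```elisp
(defun demo-node-symbol (name)
  "Return the symbol for XML element NAME, always prefixed with a dot."
  (intern (concat "." name)))

(defun demo-attr-symbol (name)
  "Return the symbol for XML attribute NAME, always prefixed with a colon."
  (intern (concat ":" name)))

(defun demo-original-name (symbol)
  "Strip the one-character type sign from SYMBOL and return the XML name."
  (substring (symbol-name symbol) 1))
```

Because exactly one character is always added, recovering the XML name
never has to guess: (demo-original-name (demo-attr-symbol ":attr"))
gives back ":attr".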

-- 
Andrew W. Nosenko <andrew.w.nosenko@gmail.com>



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24 22:17                                                               ` Chong Yidong
@ 2010-09-25  0:25                                                                 ` Wojciech Meyer
  0 siblings, 0 replies; 100+ messages in thread
From: Wojciech Meyer @ 2010-09-25  0:25 UTC (permalink / raw)
  To: Chong Yidong; +Cc: Stefan Monnier, emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

> If what you're saying is that it might be good to push certain
> (currently unspecified) xml-handling functions into the C level,
> I agree.

Here is a draft for the API in C - some of the functions could be lifted
to Elisp (please feel free to drop comments, strip, or add what I've 
missed):

access functions:
sxml-node-attributes
sxml-node-content
sxml-node-children
sxml-node-name
sxml-node-comment
sxml-node-descent

selection functions:
sxml-node-get-attribute-by-name
sxml-node-get-nodes-by-name

destructive structure updates:
sxml-node-strip-attributes
sxml-node-append
sxml-node-prepend
sxml-node-add-attribute
sxml-node-set-or-insert-attributes
sxml-node-set-or-insert-nodes
sxml-node-sort

cloning:
sxml-node-clone

mapping:
sxml-node-mapc (destructive)
sxml-node-map

collecting data:
sxml-node-flatten
sxml-node-filter-flatten
sxml-node-sparse-tree
sxml-node-leafs

visiting: (with some extra flags)
sxml-node-visit-depth 
sxml-node-visit-breadth 

searching:
sxml-node-find
sxml-node-search

various:
sxml-strict-sxmlp
sxml-no-attributesp
sxml-validp
sxml->xml

the rest would be done via Xpath. 

Some of the XML-specific things are missing (namespaces, top-level
headers, docs, line numbers, etc.).
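
To make the access functions concrete, here is a rough Elisp sketch of
three of them.  It only assumes the sxml shape used earlier in the
thread, (NAME [(@ ATTRS...)] CHILDREN...); the `demo-' names are
placeholders for illustration and not an existing API:

```elisp
(defun demo-sxml-has-attributes-p (node)
  "Return non-nil if NODE's second element is an (@ ...) attribute list."
  (let ((second (cadr node)))
    (and (consp second) (eq (car second) '@))))

(defun demo-sxml-node-name (node)
  "Return the element name of NODE."
  (car node))

(defun demo-sxml-node-attributes (node)
  "Return NODE's attributes as a list of (NAME VALUE), or nil."
  (when (demo-sxml-has-attributes-p node)
    (cdr (cadr node))))

(defun demo-sxml-node-children (node)
  "Return NODE's children, skipping the attribute list if present."
  (if (demo-sxml-has-attributes-p node)
      (cddr node)
    (cdr node)))
```

With these, callers never have to test for the `@' marker themselves,
which is the check that otherwise gets repeated in every function.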

An open question is how to parse buffers directly (skimming through the
manuals didn't give me any clue how to deal with custom buffers; AFAIU
it is not possible).

Another tricky question is how we interface with XPath.  Once we've
marshaled the tree to Lisp, we would need to either implement XPath in
Elisp (there is already an implementation:

http://www.emacswiki.org/emacs/xpath.el

), or marshal back and forth (which should IMHO be faster than XPath in
Elisp, despite the copying).

Thanks,
Wojciech



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24 21:57                                                               ` Chong Yidong
@ 2010-09-25 13:11                                                                 ` Lars Magne Ingebrigtsen
  2010-09-25 13:31                                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 100+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-25 13:11 UTC (permalink / raw)
  To: emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

> You are overselling your case.

I give up.  Since you're not listening to technical arguments, I
withdraw from the discussion of this.  Choose a format, any format, and
the HTML renderer I'm going to write is going to take the format that
libxml-parse-html-region returns, turn it into the sane format that can
actually be used to write easy-to-understand code, and use that for the
actual rendering.

Sure, everybody will have slower and more gc-ing reading of their HTML
emails, but I have no more energy to argue about this.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-25 13:11                                                                 ` Lars Magne Ingebrigtsen
@ 2010-09-25 13:31                                                                   ` Eli Zaretskii
  2010-09-25 13:56                                                                     ` David Kastrup
                                                                                       ` (2 more replies)
  0 siblings, 3 replies; 100+ messages in thread
From: Eli Zaretskii @ 2010-09-25 13:31 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: emacs-devel

> From: Lars Magne Ingebrigtsen <larsi@gnus.org>
> Date: Sat, 25 Sep 2010 15:11:40 +0200
> 
> Chong Yidong <cyd@stupidchicken.com> writes:
> 
> > You are overselling your case.
> 
> I give up.

That's a pity.

Chong, I'd suggest trusting Lars's instincts and experience a bit
more.  OTOH, if you indeed want to see valid technical arguments for
his suggestion, you should request the same from the opposite views.
We should either judge intuition against intuition or specific
arguments vs specific arguments.  I saw no practical arguments to back
up the other view, only academic ones.  That's unfair, IMO.



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-25 13:31                                                                   ` Eli Zaretskii
@ 2010-09-25 13:56                                                                     ` David Kastrup
  2010-09-25 13:59                                                                     ` Wojciech Meyer
  2010-09-25 15:00                                                                     ` Chong Yidong
  2 siblings, 0 replies; 100+ messages in thread
From: David Kastrup @ 2010-09-25 13:56 UTC (permalink / raw)
  To: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Lars Magne Ingebrigtsen <larsi@gnus.org>
>> Date: Sat, 25 Sep 2010 15:11:40 +0200
>> 
>> Chong Yidong <cyd@stupidchicken.com> writes:
>> 
>> > You are overselling your case.
>> 
>> I give up.
>
> That's a pity.
>
> Chong, I'd suggest trusting Lars's instincts and experience a bit
> more.  OTOH, if you indeed want to see valid technical arguments for
> his suggestion, you should request the same from the opposite views.
> We should either judge intuition against intuition or specific
> arguments vs specific arguments.  I saw no practical arguments to back
> up the other view, only academic ones.  That's unfair, IMO.

The basic argument was that existing code might be based on that form.
That is not entirely academic.  However, that existing code is not
likely to run unmodified in Emacs Lisp anyway.

-- 
David Kastrup




^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-25 13:31                                                                   ` Eli Zaretskii
  2010-09-25 13:56                                                                     ` David Kastrup
@ 2010-09-25 13:59                                                                     ` Wojciech Meyer
  2010-09-25 16:13                                                                       ` Eli Zaretskii
  2010-09-25 15:00                                                                     ` Chong Yidong
  2 siblings, 1 reply; 100+ messages in thread
From: Wojciech Meyer @ 2010-09-25 13:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Lars Magne Ingebrigtsen, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Lars Magne Ingebrigtsen <larsi@gnus.org>
>> Date: Sat, 25 Sep 2010 15:11:40 +0200
>> 
>> Chong Yidong <cyd@stupidchicken.com> writes:
>> 
>> > You are overselling your case.
>> 
>> I give up.
>
> That's a pity.
>
> Chong, I'd suggest trusting Lars's instincts and experience a bit
> more.  OTOH, if you indeed want to see valid technical arguments for
> his suggestion, you should request the same from the opposite views.
> We should either judge intuition against intuition or specific
> arguments vs specific arguments.  I saw no practical arguments to back
> up the other view, only academic ones.  That's unfair, IMO.

Saying, `I used Sxml with Lisp in a real project and I had no problems
with it at all' is enough? To emphasise, a real project, which means
working for food. How can be a better proof or technical reason. If I
wanted to transfer the data from an Elisp equipped with Sxml to another
one, what would be the easiest way? `prin1', obviously, and `read' on
the other side; in the case of Lars's format, I would have to transform
it back to XML, then re-parse and re-read it (which is not supported at
all yet, BTW). How about other, maybe `less practical', languages like
Mbase (the one I worked on)? Could I do the same with Scheme? Yes, I
could, if the format was Sxml.  There is no real reason for choosing
this format besides:

+ it is portable
+ it is very well specified and robust
+ it is a well-known standard
+ it will not surprise us with nitpicks

For Lars's format:
+ only Emacs would use it
+ we still don't know how to support everything in XML
+ nobody has produced a spec for it that would let us believe it is OK
to transform from XML and back to XML without losing information
+ there are some small problems with escaping `:' (but that applies to
Sxml too)
+ BTW: you cannot use a `text' property to store content, since it can
also be used as an attribute (just a trivial example).

So yes, those are the practical reasons again. To emphasise, I don't
mind this format or the other one (if anybody really asked me..); it
would just be wise to use something that is well specified rather than
reinventing the wheel.
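
A sketch of the `prin1'/`read' transfer mentioned above, run inside a
single Emacs.  The tree literal and the `demo-' names are made up for
illustration, and sending the string to another Lisp is not shown:

```elisp
;; Hypothetical sxml tree to be shipped elsewhere as text.
(defvar demo-sxml-tree
  '(tag (@ (attr1 "value1")) "Free text" (foo "Text node 1")))

(defun demo-serialize (tree)
  "Return TREE printed as a string that `read' can parse back."
  (prin1-to-string tree))

(defun demo-deserialize (string)
  "Read a tree back from STRING."
  (car (read-from-string string)))

(defun demo-round-trip (tree)
  "Print TREE and read it back, returning the reconstructed tree."
  (demo-deserialize (demo-serialize tree)))
```

The round trip yields a tree that is `equal' to the original, since it
contains only symbols, strings and lists.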

Wojciech



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-24 16:46                                                     ` Lars Magne Ingebrigtsen
  2010-09-24 17:34                                                       ` Wojciech Meyer
  2010-09-24 18:47                                                       ` Chong Yidong
@ 2010-09-25 14:42                                                       ` Andy Wingo
  2010-09-25 15:12                                                         ` Leo
  2010-09-25 15:21                                                         ` Leo
  2 siblings, 2 replies; 100+ messages in thread
From: Andy Wingo @ 2010-09-25 14:42 UTC (permalink / raw)
  To: emacs-devel

Hello,

On Fri 24 Sep 2010 18:46, Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> To take a concrete example: You want the src of the img node you have.
>
> xml.el:  (cdr (assq 'img (cadr node)))
> sxml.el: (if (and (consp (cadr node))
>                   (eq (caadr node) '@))
>              (cadr (assq 'img node)))

You should use something like sxml-match.

  http://www.gnu.org/software/guile/docs/master/guile.html/sxml_002dmatch.html#sxml_002dmatch

  (sxml-match node
    ((img (@ (src ,src)))
     src))

A bit verbose for this particular example, but it's the best, most
robust way to parse out values from xml-in-s-expressions.

Andy
-- 
http://wingolog.org/



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-25 13:31                                                                   ` Eli Zaretskii
  2010-09-25 13:56                                                                     ` David Kastrup
  2010-09-25 13:59                                                                     ` Wojciech Meyer
@ 2010-09-25 15:00                                                                     ` Chong Yidong
  2 siblings, 0 replies; 100+ messages in thread
From: Chong Yidong @ 2010-09-25 15:00 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Lars Magne Ingebrigtsen, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> > You are overselling your case.
>>
>> I give up.
>
> That's a pity.
>
> Chong, I'd suggest trusting Lars's instincts and experience a bit
> more.  OTOH, if you indeed want to see valid technical arguments for
> his suggestion, you should request the same from the opposite views.
> We should either judge intuition against intuition or specific
> arguments vs specific arguments.  I saw no practical arguments to back
> up the other view, only academic ones.  That's unfair, IMO.

Well, I'm sorry if this is unfair, but in such a situation---there are
numerous third-party packages requiring xml.el; a cursory search on
emacswiki showing five or six---the onus of proof is on the proponent of
the compatibility-breaking change.  I've looked at the three formats,
and the examples given; and maybe I'm just being dense, but I just don't
see sufficient advantage.

I'm open to adding a flag to the parse functions that toggles between
the old xml.el format and a new format; but the trouble is that if we're
going to offer a new alternative format, it becomes hard to justify
making that new format yet another non-standard one (Lars'), rather than
something other people are already using (sxml).  That's why I think
it's better to work on improving the accessor functions instead.



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-25 14:42                                                       ` Andy Wingo
@ 2010-09-25 15:12                                                         ` Leo
  2010-09-25 15:21                                                         ` Leo
  1 sibling, 0 replies; 100+ messages in thread
From: Leo @ 2010-09-25 15:12 UTC (permalink / raw)
  To: emacs-devel

On 2010-09-25 15:42 +0100, Andy Wingo wrote:
> You should use something like sxml-match.
>
>   http://www.gnu.org/software/guile/docs/master/guile.html/sxml_002dmatch.html#sxml_002dmatch
>
>   (sxml-match node
>     ((img (@ (src ,src)))
>      src))
>
> A bit verbose for this particular example, but it's the best, most
> robust way to parse out values from xml-in-s-expressions.

I like this approach very much. Thanks, Andy. That should shorten the
proposed APIs to just a few.

Leo




^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-25 14:42                                                       ` Andy Wingo
  2010-09-25 15:12                                                         ` Leo
@ 2010-09-25 15:21                                                         ` Leo
  2010-09-25 15:42                                                           ` Wojciech Meyer
  1 sibling, 1 reply; 100+ messages in thread
From: Leo @ 2010-09-25 15:21 UTC (permalink / raw)
  To: emacs-devel

On 2010-09-25 15:42 +0100, Andy Wingo wrote:
>   (sxml-match node
>     ((img (@ (src ,src)))
>      src))

Looks like `sexp-match' will be a good addition to Emacs.

Leo




^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-25 15:21                                                         ` Leo
@ 2010-09-25 15:42                                                           ` Wojciech Meyer
  2010-09-25 20:02                                                             ` Stefan Monnier
  0 siblings, 1 reply; 100+ messages in thread
From: Wojciech Meyer @ 2010-09-25 15:42 UTC (permalink / raw)
  To: Leo; +Cc: emacs-devel

Leo <sdl.web@gmail.com> writes:

> On 2010-09-25 15:42 +0100, Andy Wingo wrote:
>>   (sxml-match node
>>     ((img (@ (src ,src)))
>>      src))
>
> Looks like `sexp-match' will be a good addition to Emacs.
>

Yes, pattern matching in Elisp would be a very cool addition.

> Leo

Cheers;
Wojciech



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-25 13:59                                                                     ` Wojciech Meyer
@ 2010-09-25 16:13                                                                       ` Eli Zaretskii
  2010-09-25 16:46                                                                         ` Wojciech Meyer
  0 siblings, 1 reply; 100+ messages in thread
From: Eli Zaretskii @ 2010-09-25 16:13 UTC (permalink / raw)
  To: Wojciech Meyer; +Cc: larsi, emacs-devel

> From: Wojciech Meyer <wojciech.meyer@googlemail.com>
> Cc: Lars Magne Ingebrigtsen <larsi@gnus.org>,  emacs-devel@gnu.org
> Date: Sat, 25 Sep 2010 14:59:59 +0100
> 
> Saying, `I used Sxml with Lisp in a real project and I had no problems
> with it at all' is enough?

No.  You didn't show any real code, whereas Lars was requested to do
that.

> To emphasise, a real project, which means
> working for food. How can be a better proof or technical reason.

Didn't you ever see projects that were finished successfully although
their technical basis is questionable?  I see that every day.



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-25 16:13                                                                       ` Eli Zaretskii
@ 2010-09-25 16:46                                                                         ` Wojciech Meyer
  2010-09-25 22:29                                                                           ` Juanma Barranquero
  0 siblings, 1 reply; 100+ messages in thread
From: Wojciech Meyer @ 2010-09-25 16:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, Wojciech Meyer, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Wojciech Meyer <wojciech.meyer@googlemail.com>
>> Cc: Lars Magne Ingebrigtsen <larsi@gnus.org>,  emacs-devel@gnu.org
>> Date: Sat, 25 Sep 2010 14:59:59 +0100
>> 
>> Saying, `I used Sxml with Lisp in a real project and I had no problems
>> with it at all' is enough?
>
> No.  You didn't show any real code, whereas Lars was requested to do
> that.

OK. It has been said quite a few times, and some people did show code.

>
>> To emphasise, a real project, which means
>> working for food. How can be a better proof or technical reason.
>
> Didn't you ever see projects that were finished successfully although
> their technical basis is questionable?  I see that every day.

Yes, but that does not mean we want the technical side of Emacs to be
questionable, especially in a case where we can clearly do better
because the subject has been researched. Yes, I have seen many projects
like this, but that is not the point of the discussion...

Successful also means many things. (The Windows operating system is also
a good example: during its development it hardly used the knowledge
that had been building up for 3 decades, and that was clearly
a mistake, until Unix people came and invent NT..., but still Windows
has been succeeding from the beginning.)

If we still have the choice to change anything, and it is a well-known
and well-researched subject, we should always base the design decision
on the idiomatic approach, to avoid pitfalls in the future.

If you want a parallel example: somebody might say, let's replace XML
in the industry with SXML, since it is clearly less verbose and more
readable.
 
Wojciech



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-25 15:42                                                           ` Wojciech Meyer
@ 2010-09-25 20:02                                                             ` Stefan Monnier
  2010-09-25 20:32                                                               ` Leo
                                                                                 ` (2 more replies)
  0 siblings, 3 replies; 100+ messages in thread
From: Stefan Monnier @ 2010-09-25 20:02 UTC (permalink / raw)
  To: Wojciech Meyer; +Cc: Leo, emacs-devel

> Yes, pattern matching in Elisp would be a very cool addition.

The future has passed.  It's called `pcase'.
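
A minimal sketch of the earlier img/src example written with `pcase'
backquote patterns, assuming an sxml-style node; `demo-img-src' is a
hypothetical name:

```elisp
(require 'pcase)

(defun demo-img-src (node)
  "Return the src attribute of an sxml img NODE, or nil."
  (pcase node
    ;; Match (img (@ ATTRS...) CHILDREN...), binding ATTRS and
    ;; ignoring any children.
    (`(img (@ . ,attrs) . ,_)
     (cadr (assq 'src attrs)))))
```

For '(img (@ (src "a.png") (alt "x"))) this returns "a.png"; for any
node that does not match the pattern, `pcase' falls through and the
result is nil.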


        Stefan



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-25 20:02                                                             ` Stefan Monnier
@ 2010-09-25 20:32                                                               ` Leo
  2010-09-25 23:08                                                               ` Leo
  2010-09-26  3:48                                                               ` pcase.el (was: Problems with xml-parse-string) Ted Zlatanov
  2 siblings, 0 replies; 100+ messages in thread
From: Leo @ 2010-09-25 20:32 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Wojciech Meyer, emacs-devel

On 2010-09-25 21:02 +0100, Stefan Monnier wrote:
>> Yes, pattern matching in Elisp would be a very cool addition.
>
> The future has passed.  It's called `pcase'.

Excellent ;) Thanks.

>         Stefan

Leo



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-25 16:46                                                                         ` Wojciech Meyer
@ 2010-09-25 22:29                                                                           ` Juanma Barranquero
  0 siblings, 0 replies; 100+ messages in thread
From: Juanma Barranquero @ 2010-09-25 22:29 UTC (permalink / raw)
  To: Wojciech Meyer; +Cc: Eli Zaretskii, larsi, emacs-devel

On Sat, Sep 25, 2010 at 18:46, Wojciech Meyer
<wojciech.meyer@googlemail.com> wrote:

> until Unix people came and invent NT...,

I think you meant RSX-11 and VAX/VMS, not Unix...

    Juanma



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: Problems with xml-parse-string
  2010-09-25 20:02                                                             ` Stefan Monnier
  2010-09-25 20:32                                                               ` Leo
@ 2010-09-25 23:08                                                               ` Leo
  2010-09-26 21:55                                                                 ` Stefan Monnier
  2010-09-26  3:48                                                               ` pcase.el (was: Problems with xml-parse-string) Ted Zlatanov
  2 siblings, 1 reply; 100+ messages in thread
From: Leo @ 2010-09-25 23:08 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

On 2010-09-25 21:02 +0100, Stefan Monnier wrote:
>> Yes, pattern matching in Elisp would be a very cool addition.
>
> The future has passed.  It's called `pcase'.
>
>
>         Stefan

I am trying to use pcase.el with emacs-23 and getting an error:
(void-function plet*). Is that pcase-let*?

Leo




* pcase.el (was: Problems with xml-parse-string)
  2010-09-25 20:02                                                             ` Stefan Monnier
  2010-09-25 20:32                                                               ` Leo
  2010-09-25 23:08                                                               ` Leo
@ 2010-09-26  3:48                                                               ` Ted Zlatanov
  2010-09-26 22:06                                                                 ` pcase.el Stefan Monnier
  2 siblings, 1 reply; 100+ messages in thread
From: Ted Zlatanov @ 2010-09-26  3:48 UTC (permalink / raw)
  To: emacs-devel

On Sat, 25 Sep 2010 22:02:38 +0200 Stefan Monnier <monnier@iro.umontreal.ca> wrote: 

>> Yes, pattern matching in Elisp would be a very cool addition.
SM> The future has passed.  It's called `pcase'.

That seems really useful (lately I've been working with CLIPS, which is
a sort of idiot cousin to ML).  Are there docs or examples for pcase.el?
I don't see any usage in the Emacs lisp/ tree.

Thanks
Ted





* Re: Problems with xml-parse-string
  2010-09-25 23:08                                                               ` Leo
@ 2010-09-26 21:55                                                                 ` Stefan Monnier
  2010-09-26 23:34                                                                   ` Leo
  0 siblings, 1 reply; 100+ messages in thread
From: Stefan Monnier @ 2010-09-26 21:55 UTC (permalink / raw)
  To: Leo; +Cc: emacs-devel

>>> Yes, pattern matching in Elisp would be a very cool addition.
>> The future has passed.  It's called `pcase'.
> I am trying to use pcase.el with emacs-23 and getting an error:
> (void-function plet*). Is that pcase-let*?

Oops, yes, as you can tell, the pcase-let part of the code hasn't been
used much until now.


        Stefan
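
For readers following along, here is a minimal sketch of what `pcase-let*'
does once the `plet*' name is corrected.  It assumes a current Emacs where
pcase.el is available; the function name and variables are hypothetical and
only serve the illustration.

```elisp
(require 'pcase)

;; `pcase-let*' destructures each value against a pcase pattern and
;; binds sequentially, like `let*'.
(defun my-pcase-let-demo (pair)
  "Return the sum and difference of the two numbers in PAIR.
Hypothetical helper, used only to illustrate `pcase-let*'."
  (pcase-let* ((`(,a ,b) pair)                    ; bind A and B from a two-element list
               (`(,sum . ,diff) (cons (+ a b) (- a b)))) ; later bindings can use A and B
    (list sum diff)))

(my-pcase-let-demo '(5 3))
```

Each binding form pairs a pattern with a value, so later patterns can refer
to variables bound by earlier ones.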




* Re: pcase.el
  2010-09-26  3:48                                                               ` pcase.el (was: Problems with xml-parse-string) Ted Zlatanov
@ 2010-09-26 22:06                                                                 ` Stefan Monnier
  2010-09-27 16:59                                                                   ` pcase.el Leo
  2010-09-28 18:17                                                                   ` pcase.el Ted Zlatanov
  0 siblings, 2 replies; 100+ messages in thread
From: Stefan Monnier @ 2010-09-26 22:06 UTC (permalink / raw)
  To: Ted Zlatanov; +Cc: emacs-devel

>>> Yes, pattern matching in Elisp would be a very cool addition.
SM> The future has passed.  It's called `pcase'.

> That seems really useful (lately I've been working with CLIPS, which is
> a sort of idiot cousin to ML).  Are there docs or examples for pcase.el?
> I don't see any usage in the Emacs lisp/ tree.

The docstring of `pcase' is meant to be "complete", so you can start
with that and complain about the missing bits.  Don't bother complaining
if you're not familiar with ML-style pattern matching, tho (the
docstring is written under the assumption that this much is known).

There's a fairly good/extensive example in macroexp.el.


        Stefan




* Re: Problems with xml-parse-string
  2010-09-26 21:55                                                                 ` Stefan Monnier
@ 2010-09-26 23:34                                                                   ` Leo
  0 siblings, 0 replies; 100+ messages in thread
From: Leo @ 2010-09-26 23:34 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

On 2010-09-26 22:55 +0100, Stefan Monnier wrote:
>>>> Yes, pattern matching in Elisp would be a very cool addition.
>>> The future has passed.  It's called `pcase'.
>> I am trying to use pcase.el with emacs-23 and getting an error:
>> (void-function plet*). Is that pcase-let*?
>
> Oops, yes, as you can tell, the pcase-let part of the code hasn't been
> used much until now.

Thanks.

Leo




* Re: pcase.el
  2010-09-26 22:06                                                                 ` pcase.el Stefan Monnier
@ 2010-09-27 16:59                                                                   ` Leo
  2010-09-27 22:51                                                                     ` pcase.el Stefan Monnier
  2010-09-28 18:17                                                                   ` pcase.el Ted Zlatanov
  1 sibling, 1 reply; 100+ messages in thread
From: Leo @ 2010-09-27 16:59 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Ted Zlatanov, emacs-devel

On 2010-09-26 23:06 +0100, Stefan Monnier wrote:
>>>> Yes, pattern matching in Elisp would be a very cool addition.
> SM> The future has passed.  It's called `pcase'.
>
>> That seems really useful (lately I've been working with CLIPS, which is
>> a sort of idiot cousin to ML).  Are there docs or examples for pcase.el?
>> I don't see any usage in the Emacs lisp/ tree.
>
> The docstring of `pcase' is meant to be "complete", so you can start
> with that and complain about the missing bits.  Don't bother complaining
> if you're not familiar with ML-style pattern matching, tho (the
> docstring is written under the assumption that this much is known).
>
> There's a fairly good/extensive example in macroexp.el.
>
>
>         Stefan

There's also an erlang-style matcher for elisp here:
http://code.google.com/p/distel/.

Leo




* Re: pcase.el
  2010-09-27 16:59                                                                   ` pcase.el Leo
@ 2010-09-27 22:51                                                                     ` Stefan Monnier
  0 siblings, 0 replies; 100+ messages in thread
From: Stefan Monnier @ 2010-09-27 22:51 UTC (permalink / raw)
  To: Leo; +Cc: Ted Zlatanov, emacs-devel

SM> The future has passed.  It's called `pcase'.
[...]
> There's also an erlang-style matcher for elisp here:
> http://code.google.com/p/distel/.

Not to belittle that code, but its internal working is very different:
it does not preprocess that `case' into an efficient decision tree, but
instead naively tries each pattern in turn, in many cases performing the
same tests over and over again (even interpreting patterns at run-time
rather than precompiling them).  So pcase should show *much*
higher performance (100% untested claim).


        Stefan
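
As a rough illustration of the kind of `pcase' form the paragraph above has
in mind, here is a small sketch in which several clauses start with the same
structural test (the value must be a cons).  The function name and the
`add'/`sub'/`neg' tags are made up for the example, and nothing here
measures performance; the expansion can be inspected with `macroexpand-all'
to see how the clauses are compiled.

```elisp
(require 'pcase)

(defun my-classify (x)
  "Evaluate a tiny tagged arithmetic form X, or return `unknown'.
Hypothetical helper, used only to illustrate clauses with shared tests."
  (pcase x
    (`(add ,a ,b) (+ a b))   ; list of three elements headed by `add'
    (`(sub ,a ,b) (- a b))   ; same shape, different head symbol
    (`(neg ,a) (- a))        ; list of two elements headed by `neg'
    (_ 'unknown)))           ; anything else

(my-classify '(add 1 2))
```

A naive matcher would re-check "is X a cons?" once per clause; whether a
given pcase version shares that test is best checked in the expansion itself.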




* Re: pcase.el
  2010-09-26 22:06                                                                 ` pcase.el Stefan Monnier
  2010-09-27 16:59                                                                   ` pcase.el Leo
@ 2010-09-28 18:17                                                                   ` Ted Zlatanov
  1 sibling, 0 replies; 100+ messages in thread
From: Ted Zlatanov @ 2010-09-28 18:17 UTC (permalink / raw)
  To: emacs-devel

On Mon, 27 Sep 2010 00:06:13 +0200 Stefan Monnier <monnier@iro.umontreal.ca> wrote: 

SM> The docstring of `pcase' is meant to be "complete", so you can start
SM> with that and complain about the missing bits.  
...
SM> There's a fairly good/extensive example in macroexp.el.

That was helpful, thank you.  I played with it a bit:

(loop for form in '("hello string"
                    ("hello list")
                    30
                    symbol
                    (lambda () ("lambda"))
                    (what about me))
      collect `(,form
                "pcased is"
                ,(pcase form
                   (`(lambda . ,_)
                    "lambda function")
                   (`30 :thirty)
                   ((pred stringp) "that was a string")
                   (`(what . ,what-args)
                    (format "what with args %s" what-args))
                   ((pred listp) "that was a list")
                   (t (format "couldn't match \"%s\"" form)))))

I'll keep it in mind.  

The docstring should maybe mention the t pattern (to match anything).
It's functionally equivalent to _ but I think the SYMBOL pattern matches
it...  So what happens to the SYMBOL value bind?

Thanks
Ted
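
A minimal sketch of the SYMBOL binding asked about above, assuming the
behaviour described in the `pcase' docstring: a bare SYMBOL pattern matches
anything and binds the value to that symbol, while `_' matches without
binding.  The names `my-describe' and `other' are hypothetical.

```elisp
(require 'pcase)

(defun my-describe (form)
  "Describe FORM, using a bare SYMBOL pattern as the catch-all clause.
Hypothetical helper, used only to illustrate SYMBOL patterns."
  (pcase form
    ((pred stringp) "that was a string")
    (`(what . ,what-args)
     (format "what with args %s" what-args))
    ;; OTHER is an ordinary SYMBOL pattern: it matches any value and
    ;; makes that value available in the clause body.
    (other (format "fallback bound to %S" other))))

(my-describe 30)
```

With a SYMBOL pattern the matched value is usable in the body; with `_' the
clause matches the same inputs but binds nothing.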





