unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Refactoring xml.el namespace handling
@ 2004-03-01 15:59 Mark A. Hershberger
  2004-03-02 21:52 ` Stefan Monnier
  0 siblings, 1 reply; 14+ messages in thread
From: Mark A. Hershberger @ 2004-03-01 15:59 UTC (permalink / raw)



(Sent this earlier to just emacs-devel...)

In late September, I proposed changing the way xml.el handles
namespaces.  That message is appended below.

Because of a variety of things that happened between then and now, I
never submitted a patch to make these changes.

This patch differs slightly from the proposal in that namespaces are
interned after being concatenated with `:'.  Thus,

  <top xmlns="urn"/>

Is parsed into:

  (((:urn . "top")
   (((:http://www\.w3\.org/2000/xmlns/ . "") . "urn"))))

Or, without namespace aware parsing:

  ((top ((xmlns . "urn"))))

2004-02-29  Mark A. Hershberger  <mah@everybody.org>

	* xml.el (xml-maybe-do-ns): new function to handle namespace
	expansion of qnames.
	(xml-parse-region): Update to work with namespace parsing.
	(xml-parse-tag): Change namespace parsing so that it produces
	(:uri . "lname") instead of {uri}lname.  Avoid unneccessy
	interning.
	(xml-parse-attlist): Update to work with namespace parsing.


*** xml.el	14 Jul 2003 20:41:12 -0000	1.23
--- xml.el	1 Mar 2004 15:53:50 -0000
***************
*** 52,66 ****
  
  ;;; LIST FORMAT
  
! ;; The functions `xml-parse-file' and `xml-parse-tag' return a list with
! ;; the following format:
  ;;
  ;;    xml-list   ::= (node node ...)
! ;;    node       ::= (tag_name attribute-list . child_node_list)
  ;;    child_node_list ::= child_node child_node ...
  ;;    child_node ::= node | string
! ;;    tag_name   ::= string
! ;;    attribute_list ::= (("attribute" . "value") ("attribute" . "value") ...)
  ;;                       | nil
  ;;    string     ::= "..."
  ;;
--- 52,66 ----
  
  ;;; LIST FORMAT
  
! ;; The functions `xml-parse-file', `xml-parse-region' and
! ;; `xml-parse-tag' return a list with the following format:
  ;;
  ;;    xml-list   ::= (node node ...)
! ;;    node       ::= (qname attribute-list . child_node_list)
  ;;    child_node_list ::= child_node child_node ...
  ;;    child_node ::= node | string
! ;;    qname      ::= (:namespace-uri . "name") | "name"
! ;;    attribute_list ::= ((qname . "value") (qname . "value") ...)
  ;;                       | nil
  ;;    string     ::= "..."
  ;;
***************
*** 68,73 ****
--- 68,78 ----
  ;; Whitespace is preserved.  Fixme: There should be a tree-walker that
  ;; can remove it.
  
+ ;; TODO:
+ ;;  * xml:base, xml:space support
+ ;;  * more complete DOCTYPE parsing
+ ;;  * pi support
+ 
  ;;; Code:
  
  ;; Note that {buffer-substring,match-string}-no-properties were
***************
*** 214,220 ****
  		    (setq result (xml-parse-tag parse-dtd parse-ns))
  		    (cond
  		     ((null result))
! 		     ((listp (car result))
  		      (setq dtd (car result))
  		      (if (cdr result)	; possible leading comment
  			  (add-to-list 'xml (cdr result))))
--- 226,233 ----
  		    (setq result (xml-parse-tag parse-dtd parse-ns))
  		    (cond
  		     ((null result))
! 		     ((and (listp (car result))
!                            parse-dtd)
  		      (setq dtd (car result))
  		      (if (cdr result)	; possible leading comment
  			  (add-to-list 'xml (cdr result))))
***************
*** 225,230 ****
--- 238,274 ----
  	      (cons dtd (nreverse xml))
  	    (nreverse xml)))))))
  
+ (defun xml-maybe-do-ns (name default xml-ns)
+   "Perform any namespace expansion.  NAME is the name to perform the expansion on.
+ DEFAULT is the default namespace.  XML-NS is a cons of namespace
+ names to uris.  When namespace-aware parsing is off, then XML-NS
+ is nil.
+ 
+ During namespace-aware parsing, any name without a namespace is
+ put into the namespace identified by DEFAULT.  nil is used to
+ specify that the name shouldn't be given a namespace."
+   (if (consp xml-ns)
+       (let* ((splitup (split-string name ":"))
+              (lname (or (nth 1 splitup)
+                         (nth 0 splitup)))
+              (prefix (if (nth 1 splitup)
+                          (nth 0 splitup)
+                        default))
+              (ns  (progn 
+                      (if (and 
+                           (string-equal lname "xmlns")
+                           (not prefix))
+                          (cdr (assoc "xmlns" xml-ns))
+                        (cdr (assoc prefix xml-ns))))))
+         (if ns
+              (cons ns
+                    (if (and
+                         (string-equal lname "xmlns")
+                         (not prefix))
+                        ""
+                      lname))
+            lname))
+     (intern name)))
  
  (defun xml-parse-tag (&optional parse-dtd parse-ns)
    "Parse the tag at point.
***************
*** 237,248 ****
   - a pair : the first element is the DTD, the second is the node."
    (let ((xml-ns (if (consp parse-ns)
  		    parse-ns
! 		  (if parse-ns
  		      (list
! 		       ;; Default no namespace
! 		       (cons "" "")
  		       ;; We need to seed the xmlns namespace
! 		       (cons "xmlns" "http://www.w3.org/2000/xmlns/"))))))
      (cond
       ;; Processing instructions (like the <?xml version="1.0"?> tag at the
       ;; beginning of a document).
--- 281,292 ----
   - a pair : the first element is the DTD, the second is the node."
    (let ((xml-ns (if (consp parse-ns)
  		    parse-ns
! 		  (when parse-ns
  		      (list
! 		       ;; "xml" namespace
! 		       (cons "xml"   :http://www.w3.org/XML/1998/namespace)
  		       ;; We need to seed the xmlns namespace
! 		       (cons "xmlns" :http://www.w3.org/2000/xmlns/))))))
      (cond
       ;; Processing instructions (like the <?xml version="1.0"?> tag at the
       ;; beginning of a document).
***************
*** 278,338 ****
        (goto-char (match-end 1))
        (let* ((node-name (match-string 1))
  	     ;; Parse the attribute list.
! 	     (children (list (xml-parse-attlist) (intern node-name)))
  	     pos)
  
  	;; add the xmlns:* attrs to our cache
  	(when (consp xml-ns)
  	  (mapcar
  	   (lambda (attr)
! 	     (let* ((splitup (split-string (symbol-name (car attr)) ":"))
! 		    (prefix (nth 0 splitup))
! 		    (lname (nth 1 splitup)))
! 	       (when (string= "xmlns" prefix)
! 		 (setq xml-ns (append (list (cons (if lname
! 						      lname
! 						    "")
! 						  (cdr attr)))
! 				      xml-ns)))))
! 	   (car children))
! 
! 	  ;; expand element names
! 	  (let* ((splitup (split-string (symbol-name (cadr children)) ":"))
! 		 (lname (or (nth 1 splitup)
! 			    (nth 0 splitup)))
! 		 (prefix (if (nth 1 splitup)
! 			     (nth 0 splitup)
! 			   "")))
! 	    (setcdr children (list
! 			      (intern (concat "{"
! 					     (cdr (assoc-string prefix xml-ns))
! 					     "}" lname)))))
! 
! 	  ;; expand attribute names
! 	  (mapcar
! 	   (lambda (attr)
! 	     (let* ((splitup (split-string (symbol-name (car attr)) ":"))
! 		    (lname (or (nth 1 splitup)
! 			       (nth 0 splitup)))
! 		    (prefix (if (nth 1 splitup)
! 				(nth 0 splitup)
! 			      (caar xml-ns))))
! 
! 	       (setcar attr (intern (concat "{"
! 					    (cdr (assoc-string prefix xml-ns))
! 					    "}" lname)))))
  	   (car children)))
  
! 	;; is this an empty element ?
! 	(if (looking-at "/>")
! 	(progn
! 	  (forward-char 2)
! 	  (nreverse children))
  
! 	;; is this a valid start tag ?
! 	(if (eq (char-after) ?>)
! 	    (progn
! 	      (forward-char 1)
  	      ;;  Now check that we have the right end-tag. Note that this
  	      ;;  one might contain spaces after the tag name
  	      (let ((end (concat "</" node-name "\\s-*>")))
--- 322,354 ----
        (goto-char (match-end 1))
        (let* ((node-name (match-string 1))
  	     ;; Parse the attribute list.
! 	     (children (list (xml-parse-attlist xml-ns) node-name))
  	     pos)
  
  	;; add the xmlns:* attrs to our cache
  	(when (consp xml-ns)
  	  (mapcar
  	   (lambda (attr)
!              (when (and (listp (car attr))
!                         (eq :http://www.w3.org/2000/xmlns/
!                             (caar attr)))
!                (setq xml-ns (append (list (cons (cdar attr)
!                                                 (intern (concat ":" (cdr attr)))))
!                                     xml-ns))))
  	   (car children)))
  
!         ;; expand element names
!         (setcdr children (list (xml-maybe-do-ns (cadr children) "" xml-ns)))
  
! 	;; is this an empty element ?
!         (if (looking-at "/>")
!             (progn
!               (forward-char 2)
!               (nreverse children))
!         ;; is this a valid start tag ?
!         (if (eq (char-after) ?>)
!             (progn
!               (forward-char 1)
  	      ;;  Now check that we have the right end-tag. Note that this
  	      ;;  one might contain spaces after the tag name
  	      (let ((end (concat "</" node-name "\\s-*>")))
***************
*** 349,355 ****
  		    (setq pos (point))
  		    (search-forward "<")
  		    (forward-char -1)
! 		    (let ((string (buffer-substring pos (point)))
  			  (pos 0))
  
  		      ;; Clean up the string.  As per XML
--- 365,372 ----
  		    (setq pos (point))
  		    (search-forward "<")
  		    (forward-char -1)
! 		    (let ((string (buffer-substring
!                                    pos (point)))
  			  (pos 0))
  
  		      ;; Clean up the string.  As per XML
***************
*** 369,390 ****
  				      (cdr children))
  			      (cons string children))))))))
  
! 	      (goto-char (match-end 0))
  	      (nreverse children))
! 	  ;;  This was an invalid start tag
! 	  (error "XML: Invalid attribute list")))))
!      (t	;; This is not a tag.
        (error "XML: Invalid character")))))
  
! (defun xml-parse-attlist ()
!   "Return the attribute-list after point.Leave point at the first non-blank character after the tag."
    (let ((attlist ())
  	start-pos name)
      (skip-syntax-forward " ")
      (while (looking-at (eval-when-compile
  			 (concat "\\(" xml-name-regexp "\\)\\s-*=\\s-*")))
!       (setq name (intern (match-string 1)))
!       (goto-char (match-end 0))
  
        ;; See also: http://www.w3.org/TR/2000/REC-xml-20001006#AVNormalize
  
--- 386,410 ----
  				      (cdr children))
  			      (cons string children))))))))
  
!               (goto-char (match-end 0))
  	      (nreverse children))
!           ;;  This was an invalid Start tag
!           (error "XML: Invalid attribute list")))))
!      (t	;; This is not a tag
        (error "XML: Invalid character")))))
  
! 
! (defun xml-parse-attlist (&optional xml-ns)
!   "Return the attribute-list after point.  Leave point at the first non-blank character after the tag."
    (let ((attlist ())
  	start-pos name)
      (skip-syntax-forward " ")
      (while (looking-at (eval-when-compile
  			 (concat "\\(" xml-name-regexp "\\)\\s-*=\\s-*")))
!       (let ((name (match-string 1))
!             (end  (match-end 0)))
!         (setq qname (xml-maybe-do-ns name nil xml-ns))
!         (goto-char end))
  
        ;; See also: http://www.w3.org/TR/2000/REC-xml-20001006#AVNormalize
  
***************
*** 395,403 ****
  	(if (looking-at "'\\([^']*\\)'")
  	    (setq start-pos (match-beginning 0))
  	  (error "XML: Attribute values must be given between quotes")))
- 
        ;; Each attribute must be unique within a given element
!       (if (assoc name attlist)
  	  (error "XML: each attribute must be unique within an element"))
  
        ;; Multiple whitespace characters should be replaced with a single one
--- 415,422 ----
  	(if (looking-at "'\\([^']*\\)'")
  	    (setq start-pos (match-beginning 0))
  	  (error "XML: Attribute values must be given between quotes")))
        ;; Each attribute must be unique within a given element
!       (if (assoc qname attlist)
  	  (error "XML: each attribute must be unique within an element"))
  
        ;; Multiple whitespace characters should be replaced with a single one
***************
*** 405,415 ****
        (let ((string (match-string 1))
  	    (pos 0))
  	(replace-regexp-in-string "\\s-\\{2,\\}" " " string)
! 	(push (cons name (xml-substitute-special string)) attlist))
  
        (goto-char start-pos)
        (forward-sexp)			; we have string syntax
- 
        (skip-syntax-forward " "))
      (nreverse attlist)))
  
--- 424,433 ----
        (let ((string (match-string 1))
  	    (pos 0))
  	(replace-regexp-in-string "\\s-\\{2,\\}" " " string)
! 	(push (cons qname (xml-substitute-special string)) attlist))
  
        (goto-char start-pos)
        (forward-sexp)			; we have string syntax
        (skip-syntax-forward " "))
      (nreverse attlist)))
  
***************
*** 488,494 ****
  
  	   ;;  Translation of rule [45] of XML specifications
  	   ((looking-at
! 	     "<!ELEMENT\\s-+\\([[:alnum:].%;]+\\)\\s-+\\([^>]+\\)>")
  
  	    (setq element (intern (match-string 1))
  		  type    (match-string-no-properties 2))
--- 506,512 ----
  
  	   ;;  Translation of rule [45] of XML specifications
  	   ((looking-at
! 	     "<!ELEMENT\\s-+\\([[:alnum:].%;-]+\\)\\s-+\\([^>]+\\)>")
  
  	    (setq element (intern (match-string 1))
  		  type    (match-string-no-properties 2))
***************
*** 517,528 ****
  	    (goto-char end-pos))
  	   ((looking-at "<!--")
  	    (search-forward "-->"))
! 
  	   (t
! 	    (error "XML: Invalid DTD item")))
  
  	  ;;  Skip the end of the DTD
! 	  (search-forward ">"))))
      (nreverse dtd)))
  
  
--- 535,554 ----
  	    (goto-char end-pos))
  	   ((looking-at "<!--")
  	    (search-forward "-->"))
!            ((looking-at "<!ENTITY\\s-+\\([[:alnum:].%;-]+\\)\\s-+\\([^>]+\\)>")
!             ; Put the ENTITY in
!             (goto-char (match-end 0)))
!            ((looking-at "<!ATTLIST\\s+\\([[:alnum:].%;-]+\\)\\s+\\([^>]+\\)>")
!             ; Put in the ATTLIST
!             (goto-char (match-end 0)))
!            ((looking-at "<!NOTATION\\s+\\([[:alnum:].%;-]+\\)\\s+\\([^>]+\\)>")
!             ; Put in the NOTATION
!             (goto-char (match-end 0)))
  	   (t
! 	    (error "XML: Invalid DTD item"))))
  
  	  ;;  Skip the end of the DTD
! 	  (search-forward ">")))
      (nreverse dtd)))
  
  



-- 
Peace is only better than war if peace isn't hell, too.
    -- Walker Percy, "The Second Coming"


From:      Mark A. Hershberger
Subject:   More XML Parsing: Is anyone using CVS xml.el namespace processing?
Date: 	   Sun, 28 Sep 2003 12:31:48 -0500

Recently I had a short conversation with James Clark about the
structures xml.el produces.

I asked him what he thought about the current (CVS) namespace-aware
processing.  Based on his feedback, I plan to submit changes that
will return an incompatible structure to the one currently in CVS.

Currently, when xml.el encounters a bit of XML like:

    <ns:xml xmlns:ns="uri:namespace" ns:attr="value"/>

it produces:

    (({uri:namespace}xml 
     (({http://www\.w3\.org/2000/xmlns/}ns . "uri:namespace") 
      ({uri:namespace}attr . "value"))))

At the time that I wrote this, I saw some W3 docs where this style was
used and copied it.  Some people here asked me why I did this instead
of something like (uri:namespace . "xml"), but I forged ahead.

Now, after my conversation with Mr. Clark, I've been persuaded that I
was wrong.  At his suggestion, I'd like to change the above xml
representation produce the following:

    (((uri:namespace . "xml")
     ((((http://www\.w3\.org/2000/xmlns/ . "ns") . "uri:namespace") 
      ((uri:namespace . "attr") . "value")))))

As Mr. Clark said:

    ... there are typically not very many different namespace URIs, so
    keeping them in Emacs symbol table is not a problem; in the
    returned representation of the XML, the namespaces would be
    shared, but strings are mutable in Emacs, which is kind of ugly.

Where there is no namespace given:

    <xml attr="value">

It would produce the following:

    (("xml"
      (("attr" . "value"))))

Unless there are major objections, I'd like to repent of my previous
code and submit changes to produce the above.

Mark.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Refactoring xml.el namespace handling
  2004-03-01 15:59 Mark A. Hershberger
@ 2004-03-02 21:52 ` Stefan Monnier
  2004-03-03  6:35   ` Mark A. Hershberger
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Monnier @ 2004-03-02 21:52 UTC (permalink / raw)
  Cc: emacs-pretest-bug, emacs-devel

> This patch differs slightly from the proposal in that namespaces are
> interned after being concatenated with `:'.  Thus,

Could you send a patch relative to the CVS head version?
Your patch does not apply cleanly to the current code.


        Stefan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Refactoring xml.el namespace handling
  2004-03-02 21:52 ` Stefan Monnier
@ 2004-03-03  6:35   ` Mark A. Hershberger
  2004-03-04 17:50     ` Stefan Monnier
  0 siblings, 1 reply; 14+ messages in thread
From: Mark A. Hershberger @ 2004-03-03  6:35 UTC (permalink / raw)
  Cc: emacs-pretest-bug, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> Could you send a patch relative to the CVS head version?
> Your patch does not apply cleanly to the current code.

Argh.  My CVS skillz are lacking.

Now that I've gotten a good HEAD, here is a fresh diff.

2004-03-03  Mark A. Hershberger  <mah@everybody.org>

	* xml.el (xml-maybe-do-ns): new function to handle namespace
	parsing of both attribute and element names.
	(xml-ns-parse-ns-attrs, xml-ns-expand-el, xml-ns-expand-attr,
	xml-intern-attrlist): Removed in favor of xml-maybe-do-ns to avoid
	un-necessary intern-ing.
	(xml-parse-tag): Updated assumed namespaces.  Cleaned up namespace
	parsing.
	(xml-parse-attlist): Now does its own namespace parsing work.
	(xml-parse-dtd): Updated <!ELEMENT parsing to accept elements with
	hyphens.  Now skips ENTITY, ATTLIST, and NOTATION instead of
	barfing.


--- xml.el	2 Mar 2004 21:45:06 -0000	1.30
+++ xml.el	3 Mar 2004 06:17:00 -0000
@@ -52,15 +52,15 @@
 
 ;;; LIST FORMAT
 
-;; The functions `xml-parse-file' and `xml-parse-tag' return a list with
-;; the following format:
+;; The functions `xml-parse-file', `xml-parse-region' and
+;; `xml-parse-tag' return a list with the following format:
 ;;
 ;;    xml-list   ::= (node node ...)
-;;    node       ::= (tag_name attribute-list . child_node_list)
+;;    node       ::= (qname attribute-list . child_node_list)
 ;;    child_node_list ::= child_node child_node ...
 ;;    child_node ::= node | string
-;;    tag_name   ::= string
-;;    attribute_list ::= (("attribute" . "value") ("attribute" . "value") ...)
+;;    qname      ::= (:namespace-uri . "name") | "name"
+;;    attribute_list ::= ((qname . "value") (qname . "value") ...)
 ;;                       | nil
 ;;    string     ::= "..."
 ;;
@@ -68,6 +68,11 @@
 ;; Whitespace is preserved.  Fixme: There should be a tree-walker that
 ;; can remove it.
 
+;; TODO:
+;;  * xml:base, xml:space support
+;;  * more complete DOCTYPE parsing
+;;  * pi support
+
 ;;; Code:
 
 ;; Note that {buffer-substring,match-string}-no-properties were
@@ -230,72 +235,37 @@
 	      (cons dtd (nreverse xml))
 	    (nreverse xml)))))))
 
-(defun xml-ns-parse-ns-attrs (attr-list &optional xml-ns)
-  "Parse the namespace attributes and return a list of cons in the form:
-\(namespace . prefix)"
-
-  (mapcar
-   (lambda (attr)
-     (let* ((splitup (split-string (car attr) ":"))
-	    (prefix (nth 0 splitup))
-	    (lname (nth 1 splitup)))
-       (when (string= "xmlns" prefix)
-	 (push (cons (if lname
-			 lname
-		       "")
-		     (cdr attr))
-	       xml-ns)))) attr-list)
-  xml-ns)
-
-;; expand element names
-(defun xml-ns-expand-el (el xml-ns)
-  "Expand the XML elements from \"prefix:local-name\" to a cons in the form
-\"(namespace . local-name)\"."
-
-  (let* ((splitup (split-string el ":"))
-	 (lname (or (nth 1 splitup)
-		    (nth 0 splitup)))
-	 (prefix (if (nth 1 splitup)
-		     (nth 0 splitup)
-		   (if (string= lname "xmlns")
-		       "xmlns"
-		     "")))
-	 (ns (cdr (assoc-string prefix xml-ns))))
-    (if (string= "" ns)
-	lname
-      (cons (intern (concat ":" ns))
-	    lname))))
-
-;; expand attribute names
-(defun xml-ns-expand-attr (attr-list xml-ns)
-  "Expand the attribute list for a particular element from the form
-\"prefix:local-name\" to the form \"{namespace}:local-name\"."
-
-  (mapcar
-   (lambda (attr)
-     (let* ((splitup (split-string (car attr) ":"))
+(defun xml-maybe-do-ns (name default xml-ns)
+  "Perform any namespace expansion.  NAME is the name to perform the expansion on.
+DEFAULT is the default namespace.  XML-NS is a cons of namespace
+names to uris.  When namespace-aware parsing is off, then XML-NS
+is nil.
+
+During namespace-aware parsing, any name without a namespace is
+put into the namespace identified by DEFAULT.  nil is used to
+specify that the name shouldn't be given a namespace."
+  (if (consp xml-ns)
+      (let* ((splitup (split-string name ":"))
 	    (lname (or (nth 1 splitup)
 		       (nth 0 splitup)))
 	    (prefix (if (nth 1 splitup)
 			(nth 0 splitup)
-		      (if (string= (car attr) "xmlns")
-			  "xmlns"
-			"")))
-	    (ns (cdr (assoc-string prefix xml-ns))))
-       (setcar attr
-	       (if (string= "" ns)
-		   lname
-		 (cons (intern (concat ":" ns))
-		       lname)))))
-   attr-list)
-  attr-list)
-
-(defun xml-intern-attrlist (attr-list)
-  "Convert attribute names to symbols for backward compatibility."
-  (mapcar (lambda (attr)
-	    (setcar attr (intern (car attr))))
-	  attr-list)
-  attr-list)
+                       default))
+             (ns  (progn 
+                     (if (and 
+                          (string-equal lname "xmlns")
+                          (not prefix))
+                         (cdr (assoc "xmlns" xml-ns))
+                       (cdr (assoc prefix xml-ns))))))
+        (if ns
+             (cons ns
+                   (if (and
+                        (string-equal lname "xmlns")
+                        (not prefix))
+                       ""
+                     lname))
+           lname))
+    (intern name)))
 
 (defun xml-parse-tag (&optional parse-dtd parse-ns)
   "Parse the tag at point.
@@ -308,12 +278,12 @@
  - a pair : the first element is the DTD, the second is the node."
   (let ((xml-ns (if (consp parse-ns)
 		    parse-ns
-		  (if parse-ns
+		  (when parse-ns
 		      (list
-		       ;; Default no namespace
-		       (cons "" "")
+		       ;; "xml" namespace
+		       (cons "xml"   :http://www.w3.org/XML/1998/namespace)
 		       ;; We need to seed the xmlns namespace
-		       (cons "xmlns" "http://www.w3.org/2000/xmlns/"))))))
+		       (cons "xmlns" :http://www.w3.org/2000/xmlns/))))))
     (cond
      ;; Processing instructions (like the <?xml version="1.0"?> tag at the
      ;; beginning of a document).
@@ -350,19 +320,25 @@
 
       ;; Parse this node
       (let* ((node-name (match-string 1))
-	     (attr-list (xml-parse-attlist))
-	     (children (if  (consp xml-ns) ;; take care of namespace parsing
-			    (progn
-			      (setq xml-ns (xml-ns-parse-ns-attrs
-					    attr-list xml-ns))
-			      (list (xml-ns-expand-attr
-				     attr-list xml-ns)
-				    (xml-ns-expand-el
-				     node-name xml-ns)))
-			    (list (xml-intern-attrlist attr-list)
-				  (intern node-name))))
+             ;; Parse the attribute list.
+             (children (list (xml-parse-attlist xml-ns) node-name))
 	     pos)
 
+        ;; add the xmlns:* attrs to our cache
+        (when (consp xml-ns)
+          (mapcar
+           (lambda (attr)
+             (when (and (listp (car attr))
+                        (eq :http://www.w3.org/2000/xmlns/
+                            (caar attr)))
+               (setq xml-ns (append (list (cons (cdar attr)
+                                                (intern (concat ":" (cdr attr)))))
+                                    xml-ns))))
+           (car children)))
+
+        ;; expand element names
+        (setcdr children (list (xml-maybe-do-ns (cadr children) "" xml-ns)))
+
 	;; is this an empty element ?
 	(if (looking-at "/>")
 	(progn
@@ -416,7 +392,7 @@
      (t	;; This is not a tag.
       (error "XML: Invalid character")))))
 
-(defun xml-parse-attlist ()
+(defun xml-parse-attlist (&optional xml-ns)
   "Return the attribute-list after point.  Leave point at the
 first non-blank character after the tag."
   (let ((attlist ())
@@ -424,8 +400,9 @@
     (skip-syntax-forward " ")
     (while (looking-at (eval-when-compile
 			 (concat "\\(" xml-name-regexp "\\)\\s-*=\\s-*")))
-      (setq name (match-string 1))
-      (goto-char (match-end 0))
+      (setq end-pos (match-end 0))
+      (setq name (xml-maybe-do-ns (match-string 1) nil xml-ns))
+      (goto-char end-pos)
 
       ;; See also: http://www.w3.org/TR/2000/REC-xml-20001006#AVNormalize
 
@@ -527,7 +504,7 @@
 
 	   ;;  Translation of rule [45] of XML specifications
 	   ((looking-at
-	     "<!ELEMENT\\s-+\\([[:alnum:].%;]+\\)\\s-+\\([^>]+\\)>")
+	     "<!ELEMENT\\s-+\\([[:alnum:].%;-]+\\)\\s-+\\([^>]+\\)>")
 
 	    (setq element (match-string 1)
 		  type    (match-string-no-properties 2))
@@ -556,7 +533,15 @@
 	    (goto-char end-pos))
 	   ((looking-at "<!--")
 	    (search-forward "-->"))
-
+           ((looking-at "<!ENTITY\\s-+\\([[:alnum:].%;-]+\\)\\s-+\\([^>]+\\)>")
+            ; Put the ENTITY in
+            (goto-char (match-end 0)))
+           ((looking-at "<!ATTLIST\\s+\\([[:alnum:].%;-]+\\)\\s+\\([^>]+\\)>")
+            ; Put in the ATTLIST
+            (goto-char (match-end 0)))
+           ((looking-at "<!NOTATION\\s+\\([[:alnum:].%;-]+\\)\\s+\\([^>]+\\)>")
+            ; Put in the NOTATION
+            (goto-char (match-end 0)))
 	   (t
 	    (error "XML: Invalid DTD item")))
 


-- 
A choice between one man and a shovel, or a dozen men with teaspoons
is clear to me, and I'm sure it is clear to you also.
    -- Zimran Ahmed <http://www.winterspeak.com/>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Refactoring xml.el namespace handling
  2004-03-03  6:35   ` Mark A. Hershberger
@ 2004-03-04 17:50     ` Stefan Monnier
  2004-03-04 17:52       ` Stefan Monnier
  2004-03-08  1:33       ` Kim F. Storm
  0 siblings, 2 replies; 14+ messages in thread
From: Stefan Monnier @ 2004-03-04 17:50 UTC (permalink / raw)
  Cc: emacs-pretest-bug, emacs-devel

> Now that I've gotten a good HEAD, here is a fresh diff.

Thank you.  Here is a counter patch.  The main difference is that it uses
'http://foo/bar rather than :http://foo/bar so as to avoid an unnecessary
(concat ":" foo) and also so that (symbol-name foo) immediately returns
a usable URL.
It also cleans up a few elisp things (like replace mapcar->mapc->dolist,
and (append (list x) y) -> (cons x y), ...).
Things left:
- it seems that the new code returns either a TAG (a symbol) or (NS . TAG)
  where TAG is a string rather than a symbol.  Do I understand this right?
  Is that done on purpose?  It looks like a bad idea.
- in xml-parse-tag, you do

   (let (.. (children (list A B)) ...)
     ... (car children) ... (setcdr children ...) ...

  it would be better to do something slightly different so you don't need
  to extract the car of what you just built and you don't need to setcdr.
  Maybe something like

    (let (... (a A) (b B) ...)
      ... a ... (let ((children (list a (.. b ..)))) ...))

> 2004-03-03  Mark A. Hershberger  <mah@everybody.org>
>
> 	* xml.el (xml-maybe-do-ns): new function to handle namespace
> 	parsing of both attribute and element names.
> 	(xml-ns-parse-ns-attrs, xml-ns-expand-el, xml-ns-expand-attr,
> 	xml-intern-attrlist): Removed in favor of xml-maybe-do-ns to avoid
> 	un-necessary intern-ing.
> 	(xml-parse-tag): Updated assumed namespaces.  Cleaned up namespace
> 	parsing.
> 	(xml-parse-attlist): Now does its own namespace parsing work.
> 	(xml-parse-dtd): Updated <!ELEMENT parsing to accept elements with
> 	hyphens.  Now skips ENTITY, ATTLIST, and NOTATION instead of
> 	barfing.

ChangeLog text should use the present tense: it makes it easier to
write/read.


        Stefan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Refactoring xml.el namespace handling
  2004-03-04 17:50     ` Stefan Monnier
@ 2004-03-04 17:52       ` Stefan Monnier
  2004-03-08  1:33       ` Kim F. Storm
  1 sibling, 0 replies; 14+ messages in thread
From: Stefan Monnier @ 2004-03-04 17:52 UTC (permalink / raw)
  Cc: emacs-pretest-bug, emacs-devel

> Thank you.  Here is a counter patch.

Of course, the patch was missing, so here it is.


        Stefan


--- xml.el.~1.30.~	Tue Mar  2 22:11:10 2004
+++ xml.el	Thu Mar  4 12:40:28 2004
@@ -52,15 +52,15 @@
 
 ;;; LIST FORMAT
 
-;; The functions `xml-parse-file' and `xml-parse-tag' return a list with
-;; the following format:
+;; The functions `xml-parse-file', `xml-parse-region' and
+;; `xml-parse-tag' return a list with the following format:
 ;;
 ;;    xml-list   ::= (node node ...)
-;;    node       ::= (tag_name attribute-list . child_node_list)
+;;    node       ::= (qname attribute-list . child_node_list)
 ;;    child_node_list ::= child_node child_node ...
 ;;    child_node ::= node | string
-;;    tag_name   ::= string
-;;    attribute_list ::= (("attribute" . "value") ("attribute" . "value") ...)
+;;    qname      ::= (namespace-uri . "name") | "name"
+;;    attribute_list ::= ((qname . "value") (qname . "value") ...)
 ;;                       | nil
 ;;    string     ::= "..."
 ;;
@@ -68,6 +68,11 @@
 ;; Whitespace is preserved.  Fixme: There should be a tree-walker that
 ;; can remove it.
 
+;; TODO:
+;;  * xml:base, xml:space support
+;;  * more complete DOCTYPE parsing
+;;  * pi support
+
 ;;; Code:
 
 ;; Note that {buffer-substring,match-string}-no-properties were
@@ -230,72 +235,26 @@
 	      (cons dtd (nreverse xml))
 	    (nreverse xml)))))))
 
-(defun xml-ns-parse-ns-attrs (attr-list &optional xml-ns)
-  "Parse the namespace attributes and return a list of cons in the form:
-\(namespace . prefix)"
-
-  (mapcar
-   (lambda (attr)
-     (let* ((splitup (split-string (car attr) ":"))
-	    (prefix (nth 0 splitup))
-	    (lname (nth 1 splitup)))
-       (when (string= "xmlns" prefix)
-	 (push (cons (if lname
-			 lname
-		       "")
-		     (cdr attr))
-	       xml-ns)))) attr-list)
-  xml-ns)
-
-;; expand element names
-(defun xml-ns-expand-el (el xml-ns)
-  "Expand the XML elements from \"prefix:local-name\" to a cons in the form
-\"(namespace . local-name)\"."
-
-  (let* ((splitup (split-string el ":"))
-	 (lname (or (nth 1 splitup)
-		    (nth 0 splitup)))
-	 (prefix (if (nth 1 splitup)
-		     (nth 0 splitup)
-		   (if (string= lname "xmlns")
-		       "xmlns"
-		     "")))
-	 (ns (cdr (assoc-string prefix xml-ns))))
-    (if (string= "" ns)
-	lname
-      (cons (intern (concat ":" ns))
-	    lname))))
-
-;; expand attribute names
-(defun xml-ns-expand-attr (attr-list xml-ns)
-  "Expand the attribute list for a particular element from the form
-\"prefix:local-name\" to the form \"{namespace}:local-name\"."
-
-  (mapcar
-   (lambda (attr)
-     (let* ((splitup (split-string (car attr) ":"))
-	    (lname (or (nth 1 splitup)
-		       (nth 0 splitup)))
-	    (prefix (if (nth 1 splitup)
-			(nth 0 splitup)
-		      (if (string= (car attr) "xmlns")
-			  "xmlns"
-			"")))
-	    (ns (cdr (assoc-string prefix xml-ns))))
-       (setcar attr
-	       (if (string= "" ns)
-		   lname
-		 (cons (intern (concat ":" ns))
-		       lname)))))
-   attr-list)
-  attr-list)
-
-(defun xml-intern-attrlist (attr-list)
-  "Convert attribute names to symbols for backward compatibility."
-  (mapcar (lambda (attr)
-	    (setcar attr (intern (car attr))))
-	  attr-list)
-  attr-list)
+(defun xml-maybe-do-ns (name default xml-ns)
+  "Perform any namespace expansion.  NAME is the name to perform the expansion on.
+DEFAULT is the default namespace.  XML-NS is a cons of namespace
+names to uris.  When namespace-aware parsing is off, then XML-NS
+is nil.
+
+During namespace-aware parsing, any name without a namespace is
+put into the namespace identified by DEFAULT.  nil is used to
+specify that the name shouldn't be given a namespace."
+  (if (consp xml-ns)
+      (let* ((nsp (string-match ":" name))
+	     (lname (if nsp (substring name (match-end 0)) name))
+	     (prefix (if nsp (substring name 0 (match-beginning 0)) default))
+	     (special (and (string-equal lname "xmlns") (not prefix)))
+	     (ns (cdr (assoc (if special "xmlns" prefix)
+			     xml-ns))))
+	(if ns
+	    (cons ns (if special "" lname))
+	  lname))
+    (intern name)))
 
 (defun xml-parse-tag (&optional parse-dtd parse-ns)
   "Parse the tag at point.
@@ -310,10 +269,10 @@
 		    parse-ns
 		  (if parse-ns
 		      (list
-		       ;; Default no namespace
-		       (cons "" "")
+		       ;; "xml" namespace
+		       (cons "xml"   'http://www.w3.org/XML/1998/namespace)
 		       ;; We need to seed the xmlns namespace
-		       (cons "xmlns" "http://www.w3.org/2000/xmlns/"))))))
+		       (cons "xmlns" 'http://www.w3.org/2000/xmlns/))))))
     (cond
      ;; Processing instructions (like the <?xml version="1.0"?> tag at the
      ;; beginning of a document).
@@ -350,19 +309,22 @@
 
       ;; Parse this node
       (let* ((node-name (match-string 1))
-	     (attr-list (xml-parse-attlist))
-	     (children (if  (consp xml-ns) ;; take care of namespace parsing
-			    (progn
-			      (setq xml-ns (xml-ns-parse-ns-attrs
-					    attr-list xml-ns))
-			      (list (xml-ns-expand-attr
-				     attr-list xml-ns)
-				    (xml-ns-expand-el
-				     node-name xml-ns)))
-			    (list (xml-intern-attrlist attr-list)
-				  (intern node-name))))
+             ;; Parse the attribute list.
+             (children (list (xml-parse-attlist xml-ns) node-name))
 	     pos)
 
+        ;; add the xmlns:* attrs to our cache
+        (when (consp xml-ns)
+	  (dolist (attr (car children))
+	    (when (and (consp (car attr))
+		       (eq 'http://www.w3.org/2000/xmlns/
+			   (caar attr)))
+	      (push (cons (cdar attr) (intern (cdr attr)))
+		    xml-ns))))
+
+        ;; expand element names
+        (setcdr children (list (xml-maybe-do-ns (cadr children) "" xml-ns)))
+
 	;; is this an empty element ?
 	(if (looking-at "/>")
 	(progn
@@ -416,7 +378,7 @@
      (t	;; This is not a tag.
       (error "XML: Invalid character")))))
 
-(defun xml-parse-attlist ()
+(defun xml-parse-attlist (&optional xml-ns)
   "Return the attribute-list after point.  Leave point at the
 first non-blank character after the tag."
   (let ((attlist ())
@@ -424,8 +386,9 @@
     (skip-syntax-forward " ")
     (while (looking-at (eval-when-compile
 			 (concat "\\(" xml-name-regexp "\\)\\s-*=\\s-*")))
-      (setq name (match-string 1))
-      (goto-char (match-end 0))
+      (setq end-pos (match-end 0))
+      (setq name (xml-maybe-do-ns (match-string 1) nil xml-ns))
+      (goto-char end-pos)
 
       ;; See also: http://www.w3.org/TR/2000/REC-xml-20001006#AVNormalize
 
@@ -527,7 +490,7 @@
 
 	   ;;  Translation of rule [45] of XML specifications
 	   ((looking-at
-	     "<!ELEMENT\\s-+\\([[:alnum:].%;]+\\)\\s-+\\([^>]+\\)>")
+	     "<!ELEMENT\\s-+\\([[:alnum:].%;-]+\\)\\s-+\\([^>]+\\)>")
 
 	    (setq element (match-string 1)
 		  type    (match-string-no-properties 2))
@@ -556,7 +519,15 @@
 	    (goto-char end-pos))
 	   ((looking-at "<!--")
 	    (search-forward "-->"))
-
+           ((looking-at "<!ENTITY\\s-+\\([[:alnum:].%;-]+\\)\\s-+\\([^>]+\\)>")
+	    ;; Put the ENTITY in
+            (goto-char (match-end 0)))
+           ((looking-at "<!ATTLIST\\s+\\([[:alnum:].%;-]+\\)\\s+\\([^>]+\\)>")
+            ;; Put in the ATTLIST
+            (goto-char (match-end 0)))
+           ((looking-at "<!NOTATION\\s+\\([[:alnum:].%;-]+\\)\\s+\\([^>]+\\)>")
+            ;; Put in the NOTATION
+            (goto-char (match-end 0)))
 	   (t
 	    (error "XML: Invalid DTD item")))

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Refactoring xml.el namespace handling
@ 2004-03-05 19:03 Mark A. Hershberger
  2004-03-05 19:29 ` Stefan Monnier
  2004-04-14 18:36 ` Stefan Monnier
  0 siblings, 2 replies; 14+ messages in thread
From: Mark A. Hershberger @ 2004-03-05 19:03 UTC (permalink / raw)
  Cc: emacs-pretest-bug, emacs-devel


> Thank you.  Here is a counter patch.  The main difference is that it uses
> 'http://foo/bar rather than :http://foo/bar so as to avoid an unnecessary
> (concat ":" foo) and also so that (symbol-name foo) immediately returns
> a usable URL.

There needs to be a way of differentiating between the (unlikely)
namespace uri "nil" and "" (that is, no namespace), which is why I believe we
need to stick with (concat ":" foo).

Also, since I believe we need a prefix, using ':' means less work for
the programmer since :symbols are automatically interned.  (FWIW, it
was James Clark who gave me this idea of using ':' as the prefix.)

I confess that I'm not very moved by the argument of a usable URL
since namespace URIs needn't be usable URLs.  My understanding is
that they are essentially opaque IDs.

> It also cleans up a few elisp things (like replace mapcar->mapc->dolist,
> and (append (list x) y) -> (cons x y), ...).

Good.  Thanks.

> Things left:
> - it seems that the new code returns either a TAG (a symbol) or (NS . TAG)
>   where TAG is a string rather than a symbol.  Do I understand this right?
>   Is that done on purpose?  It looks like a bad idea.

It was done on purpose, but it is a bad idea. <foo xmlns=""/> should
parse into 

    ((: . foo) (((:http://www.w3.org/2000/xmlns/ . "") ""))).

Likewise, <foo xmlns="nil" a="b"/> should parse into

    ((:nil . foo) (((:http://www.w3.org/2000/xmlns/ "") "nil")
                   ((: . "a") "b")))

> - in xml-parse-tag, you do
>
>    (let (.. (children (list A B)) ...)
>      ... (car children) ... (setcdr children ...) ...
>
>   it would be better to do something slightly different so you don't
>   need to extract the car of what you just built and you don't need
>   to setcdr.

Excellent.  The patch itself was there to clean up some bad coding.
Fixing even more is better.

> ChangeLog text should use the present tense: it makes it easier to
> write/read.

Thanks.  I'll remember it for future reference.

I'm restricting this patch to namespace stuff.  I'll put the DTD
parsing updates in another patch where I hope to add an understanding
of <!ENTITYs

Mark.

2004-03-03  Mark A. Hershberger  <mah@everybody.org>

	* xml.el (xml-maybe-do-ns): New function to handle namespace
	parsing of both attribute and element names.
	(xml-ns-parse-ns-attrs, xml-ns-expand-el, xml-ns-expand-attr,
	xml-intern-attrlist): Remove in favor of xml-maybe-do-ns.
	(xml-parse-tag): Update assumed namespaces.  Clean up namespace
	parsing.
	(xml-parse-attlist): Make it do its own namespace parsing.

--- xml.el	2 Mar 2004 21:45:06 -0000	1.30
+++ xml.el	5 Mar 2004 18:55:21 -0000
@@ -52,15 +52,15 @@
 
 ;;; LIST FORMAT
 
-;; The functions `xml-parse-file' and `xml-parse-tag' return a list with
-;; the following format:
+;; The functions `xml-parse-file', `xml-parse-region' and
+;; `xml-parse-tag' return a list with the following format:
 ;;
 ;;    xml-list   ::= (node node ...)
-;;    node       ::= (tag_name attribute-list . child_node_list)
+;;    node       ::= (qname attribute-list . child_node_list)
 ;;    child_node_list ::= child_node child_node ...
 ;;    child_node ::= node | string
-;;    tag_name   ::= string
-;;    attribute_list ::= (("attribute" . "value") ("attribute" . "value") ...)
+;;    qname      ::= (:namespace-uri . "name") | "name"
+;;    attribute_list ::= ((qname . "value") (qname . "value") ...)
 ;;                       | nil
 ;;    string     ::= "..."
 ;;
@@ -68,6 +68,11 @@
 ;; Whitespace is preserved.  Fixme: There should be a tree-walker that
 ;; can remove it.
 
+;; TODO:
+;;  * xml:base, xml:space support
+;;  * more complete DOCTYPE parsing
+;;  * pi support
+
 ;;; Code:
 
 ;; Note that {buffer-substring,match-string}-no-properties were
@@ -230,72 +335,27 @@
 	      (cons dtd (nreverse xml))
 	    (nreverse xml)))))))
 
-(defun xml-ns-parse-ns-attrs (attr-list &optional xml-ns)
-  "Parse the namespace attributes and return a list of cons in the form:
-\(namespace . prefix)"
-
-  (mapcar
-   (lambda (attr)
-     (let* ((splitup (split-string (car attr) ":"))
-	    (prefix (nth 0 splitup))
-	    (lname (nth 1 splitup)))
-       (when (string= "xmlns" prefix)
-	 (push (cons (if lname
-			 lname
-		       "")
-		     (cdr attr))
-	       xml-ns)))) attr-list)
-  xml-ns)
-
-;; expand element names
-(defun xml-ns-expand-el (el xml-ns)
-  "Expand the XML elements from \"prefix:local-name\" to a cons in the form
-\"(namespace . local-name)\"."
-
-  (let* ((splitup (split-string el ":"))
-	 (lname (or (nth 1 splitup)
-		    (nth 0 splitup)))
-	 (prefix (if (nth 1 splitup)
-		     (nth 0 splitup)
-		   (if (string= lname "xmlns")
-		       "xmlns"
-		     "")))
-	 (ns (cdr (assoc-string prefix xml-ns))))
-    (if (string= "" ns)
-	lname
-      (cons (intern (concat ":" ns))
-	    lname))))
-
-;; expand attribute names
-(defun xml-ns-expand-attr (attr-list xml-ns)
-  "Expand the attribute list for a particular element from the form
-\"prefix:local-name\" to the form \"{namespace}:local-name\"."
-
-  (mapcar
-   (lambda (attr)
-     (let* ((splitup (split-string (car attr) ":"))
-	    (lname (or (nth 1 splitup)
-		       (nth 0 splitup)))
-	    (prefix (if (nth 1 splitup)
-			(nth 0 splitup)
-		      (if (string= (car attr) "xmlns")
-			  "xmlns"
-			"")))
-	    (ns (cdr (assoc-string prefix xml-ns))))
-       (setcar attr
-	       (if (string= "" ns)
-		   lname
-		 (cons (intern (concat ":" ns))
-		       lname)))))
-   attr-list)
-  attr-list)
-
-(defun xml-intern-attrlist (attr-list)
-  "Convert attribute names to symbols for backward compatibility."
-  (mapcar (lambda (attr)
-	    (setcar attr (intern (car attr))))
-	  attr-list)
-  attr-list)
+(defun xml-maybe-do-ns (name default xml-ns)
+  "Perform any namespace expansion.  NAME is the name to perform the expansion on.
+DEFAULT is the default namespace.  XML-NS is a cons of namespace
+names to uris.  When namespace-aware parsing is off, then XML-NS
+is nil.
+
+During namespace-aware parsing, any name without a namespace is
+put into the namespace identified by DEFAULT.  nil is used to
+specify that the name shouldn't be given a namespace."
+  (if (consp xml-ns)
+      (let* ((nsp (string-match ":" name))
+	     (lname (if nsp (substring name (match-end 0)) name))
+	     (prefix (if nsp (substring name 0 (match-beginning 0)) default))
+	     (special (and (string-equal lname "xmlns") (not prefix)))
+             ;; Setting default to nil will insure that there is not
+             ;; matching cons in xml-ns.  In which case we
+	     (ns (or (cdr (assoc (if special "xmlns" prefix)
+                                 xml-ns))
+                     :)))
+        (cons ns (if special "" lname)))
+    (intern name)))
 
 (defun xml-parse-tag (&optional parse-dtd parse-ns)
   "Parse the tag at point.
@@ -310,10 +370,12 @@
 		    parse-ns
 		  (if parse-ns
 		      (list
-		       ;; Default no namespace
-		       (cons "" "")
+                       ;; Default for empty prefix is no namespace
+                       (cons ""      :)
+		       ;; "xml" namespace
+		       (cons "xml"   :http://www.w3.org/XML/1998/namespace)
 		       ;; We need to seed the xmlns namespace
-		       (cons "xmlns" "http://www.w3.org/2000/xmlns/"))))))
+		       (cons "xmlns" :http://www.w3.org/2000/xmlns/))))))
     (cond
      ;; Processing instructions (like the <?xml version="1.0"?> tag at the
      ;; beginning of a document).
@@ -350,19 +412,23 @@
 
       ;; Parse this node
       (let* ((node-name (match-string 1))
-	     (attr-list (xml-parse-attlist))
-	     (children (if  (consp xml-ns) ;; take care of namespace parsing
-			    (progn
-			      (setq xml-ns (xml-ns-parse-ns-attrs
-					    attr-list xml-ns))
-			      (list (xml-ns-expand-attr
-				     attr-list xml-ns)
-				    (xml-ns-expand-el
-				     node-name xml-ns)))
-			    (list (xml-intern-attrlist attr-list)
-				  (intern node-name))))
-	     pos)
+             ;; Parse the attribute list.
+             (attrs (xml-parse-attlist xml-ns))
+             children pos)
+
+        ;; add the xmlns:* attrs to our cache
+        (when (consp xml-ns)
+	  (dolist (attr attrs)
+	    (when (and (consp (car attr))
+		       (eq :http://www.w3.org/2000/xmlns/
+			   (caar attr)))
+	      (push (cons (cdar attr) (intern (concat ":" (cdr attr))))
+		    xml-ns))))
+
+        ;; expand element names
+        (setq node-name (list (xml-maybe-do-ns node-name "" xml-ns)))
 
+        (setq children (list attrs node-name))
 	;; is this an empty element ?
 	(if (looking-at "/>")
 	(progn
@@ -416,7 +482,7 @@
      (t	;; This is not a tag.
       (error "XML: Invalid character")))))
 
-(defun xml-parse-attlist ()
+(defun xml-parse-attlist (&optional xml-ns)
   "Return the attribute-list after point.  Leave point at the
 first non-blank character after the tag."
   (let ((attlist ())
@@ -424,8 +490,9 @@
     (skip-syntax-forward " ")
     (while (looking-at (eval-when-compile
 			 (concat "\\(" xml-name-regexp "\\)\\s-*=\\s-*")))
-      (setq name (match-string 1))
-      (goto-char (match-end 0))
+      (setq end-pos (match-end 0))
+      (setq name (xml-maybe-do-ns (match-string 1) nil xml-ns))
+      (goto-char end-pos)
 
       ;; See also: http://www.w3.org/TR/2000/REC-xml-20001006#AVNormalize

-- 
A choice between one man and a shovel, or a dozen men with teaspoons
is clear to me, and I'm sure it is clear to you also.
    -- Zimran Ahmed <http://www.winterspeak.com/>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Refactoring xml.el namespace handling
  2004-03-05 19:03 Refactoring xml.el namespace handling Mark A. Hershberger
@ 2004-03-05 19:29 ` Stefan Monnier
  2004-04-14 18:36 ` Stefan Monnier
  1 sibling, 0 replies; 14+ messages in thread
From: Stefan Monnier @ 2004-03-05 19:29 UTC (permalink / raw)
  Cc: emacs-pretest-bug, emacs-devel

>> Thank you.  Here is a counter patch.  The main difference is that it uses
>> 'http://foo/bar rather than :http://foo/bar so as to avoid an unnecessary
>> (concat ":" foo) and also so that (symbol-name foo) immediately returns
>> a usable URL.

> There needs to be a way of differentiating between the (unlikely)
> namespace uri "nil" and "" (that is, no namespace), which is why I believe we
> need to stick with (concat ":" foo).

I must say that I do not know enough of those details to understand what
would be the ideal solution.  I don't see where the confusion between "nil"
and "" could come from, but I do see that using a symbol such as the one
returned by (intern "") can be somewhat inconvenient if you ever want to
print&read the lisp representation.

I would actually recommend to intern all those symbols in a separate obarray
rather than in the main one.  This way you could use nil for the "" case
without interfering with (intern "nil" xml-obarray).  You could also then
use (intern-soft foo xml-obarray) to check whether foo already exists
without getting nasty surprises when foo is the string "nil".
Finally it would avoid polluting the main obarray.

> Also, since I believe we need a prefix, using ':' means less work for
> the programmer since :symbols are automatically interned.  (FWIW, it
> was James Clark who gave me this idea of using ':' as the prefix.)

If a prefix is needed than indeed `:' is fine.  I just feel like there's got
to be a way to avoid it altogether.

> I confess that I'm not very moved by the argument of a usable URL
> since namespace URIs needn't be usable URLs.  My understanding is
> that they are essentially opaque IDs.

You're probably right.  It just seems cleaner without the `:'.

If someone else wants to install this patch, please do.  I don't have time
to look at it just now.


        Stefan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Refactoring xml.el namespace handling
  2004-03-04 17:50     ` Stefan Monnier
  2004-03-04 17:52       ` Stefan Monnier
@ 2004-03-08  1:33       ` Kim F. Storm
  2004-03-08  4:06         ` Mark A. Hershberger
  2004-03-12  9:57         ` Richard Stallman
  1 sibling, 2 replies; 14+ messages in thread
From: Kim F. Storm @ 2004-03-08  1:33 UTC (permalink / raw)
  Cc: Mark A. Hershberger, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> > Now that I've gotten a good HEAD, here is a fresh diff.
> 
> Thank you.  Here is a counter patch.  The main difference is that it uses
> 'http://foo/bar rather than :http://foo/bar so as to avoid an unnecessary
> (concat ":" foo) and also so that (symbol-name foo) immediately returns
> a usable URL.
> It also cleans up a few elisp things (like replace mapcar->mapc->dolist,
> and (append (list x) y) -> (cons x y), ...).
> Things left:
> - it seems that the new code returns either a TAG (a symbol) or (NS . TAG)
>   where TAG is a string rather than a symbol.  Do I understand this right?
>   Is that done on purpose?  It looks like a bad idea.

I don't know the code, but to me it seems that using strings rather
than symbols (in all places) generally would be preferable, as it
avoids pollution in the name space -- or maybe a local name space is
used here?  Or do symbols get garbage collected like strings do ?

-- 
Kim F. Storm <storm@cua.dk> http://www.cua.dk

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Refactoring xml.el namespace handling
  2004-03-08  1:33       ` Kim F. Storm
@ 2004-03-08  4:06         ` Mark A. Hershberger
  2004-03-12  9:57         ` Richard Stallman
  1 sibling, 0 replies; 14+ messages in thread
From: Mark A. Hershberger @ 2004-03-08  4:06 UTC (permalink / raw)
  Cc: Stefan Monnier, emacs-devel

storm@cua.dk (Kim F. Storm) writes:

> I don't know the code, but to me it seems that using strings rather
> than symbols (in all places) generally would be preferable, as it
> avoids pollution in the name space -- or maybe a local name space is
> used here?

After the discussion, I'd tend to agree: we should just use strings.
I was using symbols because that is what the old non-namespaced
xml.el used.

Mark.

-- 
A choice between one man and a shovel, or a dozen men with teaspoons
is clear to me, and I'm sure it is clear to you also.
    -- Zimran Ahmed <http://www.winterspeak.com/>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Refactoring xml.el namespace handling
  2004-03-08  1:33       ` Kim F. Storm
  2004-03-08  4:06         ` Mark A. Hershberger
@ 2004-03-12  9:57         ` Richard Stallman
  1 sibling, 0 replies; 14+ messages in thread
From: Richard Stallman @ 2004-03-12  9:57 UTC (permalink / raw)
  Cc: mah, monnier, emacs-devel

    I don't know the code, but to me it seems that using strings rather
    than symbols (in all places) generally would be preferable, as it
    avoids pollution in the name space -- or maybe a local name space is
    used here?  Or do symbols get garbage collected like strings do ?

If there is no crucial need for the uniquification that symbols have,
it is better to use strings.  If uniquification is what you want,
it is better to use symbols.  If you are using their values, then
use a separate obarray.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Refactoring xml.el namespace handling
  2004-03-05 19:03 Refactoring xml.el namespace handling Mark A. Hershberger
  2004-03-05 19:29 ` Stefan Monnier
@ 2004-04-14 18:36 ` Stefan Monnier
  2004-04-16  3:14   ` Katsumi Yamaoka
  1 sibling, 1 reply; 14+ messages in thread
From: Stefan Monnier @ 2004-04-14 18:36 UTC (permalink / raw)
  Cc: emacs-pretest-bug, emacs-devel

> I'm restricting this patch to namespace stuff.  I'll put the DTD
> parsing updates in another patch where I hope to add an understanding
> of <!ENTITYs

Installed.  Sorry about the delay.


        Stefan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Refactoring xml.el namespace handling
  2004-04-14 18:36 ` Stefan Monnier
@ 2004-04-16  3:14   ` Katsumi Yamaoka
  2004-04-16 20:05     ` Mark A. Hershberger
  0 siblings, 1 reply; 14+ messages in thread
From: Katsumi Yamaoka @ 2004-04-16  3:14 UTC (permalink / raw)
  Cc: emacs-w3m, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 140 bytes --]

Hi,

I sent a bug report to the emacs-pretest-bug list yesterday,
however it doesn't seem to be delivered.  So, I resend it to the
Author.


[-- Attachment #2: Type: message/rfc822, Size: 583 bytes --]

From: Katsumi Yamaoka <yamaoka@jpl.org>
Cc: emacs-w3m@namazu.org
Subject: xml-parse-tag bug
Date: Thu, 15 Apr 2004 10:37:01 +0900
Message-ID: <78y8oyuima.fsf@namazu.org>

Hi,

Here's a simple code to reproduce the bug:

(with-temp-buffer
  (insert "<rdf:RDF>\n</rdf:RDF>\n")
  (goto-char (point-min))
  (xml-parse-tag))

It occurs with the latest xml.el in the Emacs CVS head and I
cannot run some programs, e.g., several shimbun modules of the
emacs-w3m package.

Regards,
-- 
Katsumi Yamaoka <yamaoka@jpl.org>

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Emacs-pretest-bug mailing list
Emacs-pretest-bug@gnu.org
http://mail.gnu.org/mailman/listinfo/emacs-pretest-bug

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Refactoring xml.el namespace handling
  2004-04-16  3:14   ` Katsumi Yamaoka
@ 2004-04-16 20:05     ` Mark A. Hershberger
  2004-04-16 22:27       ` Stefan Monnier
  0 siblings, 1 reply; 14+ messages in thread
From: Mark A. Hershberger @ 2004-04-16 20:05 UTC (permalink / raw)
  Cc: emacs-w3m, emacs-devel

Katsumi Yamaoka <yamaoka@jpl.org> writes:

> (with-temp-buffer
>   (insert "<rdf:RDF>\n</rdf:RDF>\n")
>   (goto-char (point-min))
>   (xml-parse-tag))

For me this gives: (rdf:RDF nil "\n") which, I suppose, means that I
need to try to get some more patches committed.

In any case, here is a patch against current CVS that fixes the
problem:

2004-04-16  Mark A. Hershberger  <mah@everybody.org>

	* xml.el (xml-parse-tag): Avoid overwriting node-name.

--- xml.el.~1.31.~	2004-04-16 14:56:46.000000000 -0500
+++ xml.el	2004-04-16 15:01:22.000000000 -0500
@@ -325,10 +325,8 @@
 	      (push (cons (cdar attr) (intern (concat ":" (cdr attr))))
 		    xml-ns))))
 
-        ;; expand element names
-        (setq node-name (list (xml-maybe-do-ns node-name "" xml-ns)))
+        (setq children (list attrs (xml-maybe-do-ns node-name "" xml-ns)))
 
-        (setq children (list attrs node-name))
 	;; is this an empty element ?
 	(if (looking-at "/>")
 	(progn

-- 
A choice between one man and a shovel, or a dozen men with teaspoons
is clear to me, and I'm sure it is clear to you also.
    -- Zimran Ahmed <http://www.winterspeak.com/>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Refactoring xml.el namespace handling
  2004-04-16 20:05     ` Mark A. Hershberger
@ 2004-04-16 22:27       ` Stefan Monnier
  0 siblings, 0 replies; 14+ messages in thread
From: Stefan Monnier @ 2004-04-16 22:27 UTC (permalink / raw)
  Cc: Katsumi Yamaoka, emacs-w3m, emacs-devel

> 2004-04-16  Mark A. Hershberger  <mah@everybody.org>

> 	* xml.el (xml-parse-tag): Avoid overwriting node-name.

Applied,


        Stefan

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2004-04-16 22:27 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-03-05 19:03 Refactoring xml.el namespace handling Mark A. Hershberger
2004-03-05 19:29 ` Stefan Monnier
2004-04-14 18:36 ` Stefan Monnier
2004-04-16  3:14   ` Katsumi Yamaoka
2004-04-16 20:05     ` Mark A. Hershberger
2004-04-16 22:27       ` Stefan Monnier
  -- strict thread matches above, loose matches on Subject: below --
2004-03-01 15:59 Mark A. Hershberger
2004-03-02 21:52 ` Stefan Monnier
2004-03-03  6:35   ` Mark A. Hershberger
2004-03-04 17:50     ` Stefan Monnier
2004-03-04 17:52       ` Stefan Monnier
2004-03-08  1:33       ` Kim F. Storm
2004-03-08  4:06         ` Mark A. Hershberger
2004-03-12  9:57         ` Richard Stallman

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).