From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Andy Chambers Newsgroups: gmane.emacs.help Subject: sxpath for emacs Date: Thu, 16 Aug 2007 07:11:31 -0700 Organization: http://groups.google.com Message-ID: <1187273491.444716.253020@d55g2000hsg.googlegroups.com> NNTP-Posting-Host: lo.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" X-Trace: sea.gmane.org 1187275242 7888 80.91.229.12 (16 Aug 2007 14:40:42 GMT) X-Complaints-To: usenet@sea.gmane.org NNTP-Posting-Date: Thu, 16 Aug 2007 14:40:42 +0000 (UTC) To: help-gnu-emacs@gnu.org Original-X-From: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org Thu Aug 16 16:40:39 2007 Return-path: Envelope-to: geh-help-gnu-emacs@m.gmane.org Original-Received: from lists.gnu.org ([199.232.76.165]) by lo.gmane.org with esmtp (Exim 4.50) id 1ILgWZ-0003Zy-Nc for geh-help-gnu-emacs@m.gmane.org; Thu, 16 Aug 2007 16:40:36 +0200 Original-Received: from localhost ([127.0.0.1] helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1ILgWY-00028g-Sk for geh-help-gnu-emacs@m.gmane.org; Thu, 16 Aug 2007 10:40:34 -0400 Original-Path: shelby.stanford.edu!headwall.stanford.edu!newshub.sdsu.edu!postnews.google.com!d55g2000hsg.googlegroups.com!not-for-mail Original-Newsgroups: gnu.emacs.help Original-Lines: 417 Original-NNTP-Posting-Host: 217.155.237.94 Original-X-Trace: posting.google.com 1187273492 25218 127.0.0.1 (16 Aug 2007 14:11:32 GMT) Original-X-Complaints-To: groups-abuse@google.com Original-NNTP-Posting-Date: Thu, 16 Aug 2007 14:11:32 +0000 (UTC) User-Agent: G2/1.0 X-HTTP-UserAgent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6,gzip(gfe),gzip(gfe) Complaints-To: groups-abuse@google.com Injection-Info: d55g2000hsg.googlegroups.com; posting-host=217.155.237.94; posting-account=ps2QrAMAAAA6_jCuRt2JEIpn5Otqf_w0 Original-Xref: shelby.stanford.edu gnu.emacs.help:151042 X-BeenThere: help-gnu-emacs@gnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Users list for the GNU Emacs text editor List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Original-Sender: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org Errors-To: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.help:46615 Archived-At: Hi List, I've managed to port a decent subset of the excellent sxpath to common lisp (see recent list activity at http://common-lisp.net/mailman/listinfo/s-xml-devel for patches) and I'm trying to do the same for emacs lisp. I realize that there already exists an xpath implementation but I prefer the sxpath syntax and would have thought that most lispers would agree. Most of the low-level sxpath functions return closed over functions that are composed together by the sxpath function to create one function that converts one nodeset into another. So far, I've just wrapped each function that requires it, in a lexical- let which seems to make about half of the low-level test cases pass (although none of actual sxpath test cases work). Before now, I haven't made much use of the cl extensions so I wondered if anyone with a little more experience might be able to point out what's causing a few of the problems. The node-join recursion seems to be going wrong somewhere but I don't know more than that. --Andy ;;;; sxpath (require 'cl) (defun nodeset? (x) "Returns whether `x' is a nodelist (nil counts as a nodelist)" (or (and (listp x) (not (symbolp (car x)))) (null x))) (defun node-typeof? (crit) "Returns a function that tests whether the context node meets the specified criteria" (lexical-let ((crit crit)) #'(lambda (node) (case crit ((*) (and (listp node) (not (member (car node) '(:@ *PI*))))) ((*any*) t) ((*text*) (stringp node)) (t (and (listp node) (eq crit (car node)))))))) ; Curried equivalence converter-predicates (defun node-eq? (other) (lexical-let ((other other)) #'(lambda (node) (eq other node)))) (defun node-equal? (other) (lexical-let ((other other)) #'(lambda (node) (equal other node)))) (defun positive? (n) (< 0 n)) (defun negative? (n) (> 0 n)) (defun node-pos (n) "Select the n'th element of a Nodeset and return as a singular Nodeset Return an empty nodeset if the n'th element does not exist. If n is 1, selects the node at the head of the Nodeset, if exists. n can also be negative; in this case, the node is picked from the tail of the list." (lexical-let ((n n)) #'(lambda (nodeset) (cond ((not (nodeset? nodeset)) '()) ((null nodeset) nodeset) ((eql n 1) (list (car nodeset))) ((negative? n) (funcall (node-pos (+ n (length nodeset))) nodeset)) (t (funcall (node-pos (1- n)) (cdr nodeset))))))) (defun filter (pred?) "A filter applicator, which introduces a filtering context. The argument converter is considered a predicate, with null result meaning failure" (lexical-let ((pred? pred?)) #'(lambda (lst) (labels ((descend (lst res) (if (null lst) (reverse res) (let ((pred-result (funcall pred? (car lst)))) (descend (cdr lst) (if (and pred-result (not (null pred-result))) (cons (car lst) res) res)))))) (descend (if (nodeset? lst) lst (list lst)) '()))))) (defun take-until (pred?) "Given a converter-predicate and a nodeset, apply the predicate to each element of the nodeset, until the predicate yields anything but null. Return the elements of the input nodeset that have been processed till that moment (that is, which fail the predicate). take-until is a variation of the filter above; it passes elements of an ordered input set till (but not including) the first element that satisfies the predicate. The nodeset returned by (funcall (take-until (not pred)) nset) is a subset -- or to be more precise, a prefix -- of the nodeset returned by (funcall (filter pred) nset)" (lexical-let ((pred? pred?)) #'(lambda (lst) (labels ((descend (lst) (if (null lst) lst (let ((pred-result (funcall pred? (car lst)))) (if (and pred-result (not (null pred-result))) '() (cons (car lst) (descend (cdr lst)))))))) (descend (if (nodeset? lst) lst (list lst))))))) (defun take-after (pred?) "Given a converter-predicate and a nodeset, apply the predicate to each element of the nodeset, until the predicate yields anything but null. Return the elements of the input nodeset that have not been processed: that is, return the elements of nodeset that follow the first element that satisfied the predicate. take-after along with take-until partition an input nodeset into three parts: the first element that satisfies a predicate, all preceding elements, and all following elements." (lexical-let ((pred? pred?)) #'(lambda (lst) (labels ((descend (lst) (if (null lst) lst (let ((pred-result (funcall pred? (car lst)))) (if (and pred-result (not (null pred-result))) (cdr lst) (descend (cdr lst))))))) (descend (if (nodeset? lst) lst (list lst))))))) (defun map-union (proc lst) "Apply proc to each element of lst and return the list of results. If proc returns a nodeset, splice it into the result" (if (null lst) lst (let ((proc-res (funcall proc (car lst)))) (funcall (if (nodeset? proc-res) #'append #'cons) proc-res (map-union proc (cdr lst)))))) (defun node-reverse () "Reverses the order of nodes in the nodeset" #'(lambda (node-or-nodeset) (lexical-let ((node-or-nodeset node-or-nodeset)) (if (not (nodeset? node-or-nodeset)) (list node-or-nodeset) (reverse node-or-nodeset))))) (defun node-trace (title) "An identity converter. In addition, it prints out a node or nodeset it is applied to, prefixed with the title (useful for debugging)" #'(lambda (node-or-nodeset) (print "\n-->") (print title) (print " :") (pprint node-or-nodeset) node-or-nodeset)) ;; Converter combinators ;; ;; Combinators are higher-order functions that transmogrify a converter ;; or glue a sequence of converters into a single, non-trivial ;; converter. The goal is to arrive at converters that correspond to ;; XPath location paths. ;; From a different point of view, a combinator is a fixed, named ;; _pattern_ of applying converters. Given below is a complete set of ;; such patterns that together implement XPath location path ;; specification. As it turns out, all these combinators can be built ;; from a small number of basic blocks: regular functional composition, ;; map-union and filter applicators, and the nodeset union. (defun select-kids (test-pred?) "Given a Node, return an (ordered) subset, its children that satisfy test-pred? (actually a converter). When applied to a nodelist, select among children of all the nodes in the nodelist" (lexical-let ((test-pred? test-pred?)) #'(lambda (node) (cond ((null node) node) ((not (listp node)) '()) ((symbolp (car node)) (funcall (filter test-pred?) (cdr node))) (t (map-union (select-kids test-pred?) node)))))) (defun node-self (pred?) "Similar to select-kids but apply to the Node itself rather than to its children. The resulting Nodeset will contain either one component, or will be empty (if the Node failed the pred)." (filter pred?)) (defun node-join (&rest selectors) "Join the sequence of location steps or paths as described in the commentary above" (lexical-let ((selectors selectors)) #'(lambda (nodeset) (labels ((descend (nodeset selectors) (if (null selectors) nodeset (descend (if (nodeset? nodeset) (map-union (car selectors) nodeset) (funcall (car selectors) nodeset)) (cdr selectors))))) (descend nodeset selectors)))) (defun node-reduce (&rest converters) "A regular functional composition of converters." (lexical-let ((converters converters)) #'(lambda (nodeset) (labels ((descend (nodeset converters) (if (null converters) nodeset (descend (funcall (car converters) nodeset) (cdr converters))))) (descend nodeset converters))))) (defun node-or (&rest converters) " This combinator applies all converters to a given node and produces the union of their results. This combinator corresponds to a union, '|' operation for XPath location paths. (define (node-or . converters) (lambda (node-or-nodeset) (if (null? converters) node-or-nodeset (append ((car converters) node-or-nodeset) ((apply node-or (cdr converters)) node-or-nodeset)))))" (lexical-let ((converters converters)) #'(lambda (node-or-nodeset) (labels ((descend (result converters) (if (null converters) result (descend (append result (or (funcall (car converters) node-or-nodeset) '())) (cdr converters))))) (descend '() converters))))) (defun node-closure (test-pred?) " Select all _descendants_ of a node that satisfy a converter- predicate. This combinator is similar to select-kids but applies to grand... children as well. This combinator implements the \"descendant::\" XPath axis Conceptually, this combinator can be expressed as ;; (define (node-closure f) ;; (node-or ;; (select-kids f) ;; (node-reduce (select-kids (node-typeof? '*)) (node-closure f)))) This definition, as written, looks somewhat like a fixpoint, and it will run forever. It is obvious however that sooner or later (select-kids (node-typeof? '*)) will return an empty nodeset. At this point further iterations will no longer affect the result and can be stopped." (lexical-let ((test-pred? test-pred?)) #'(lambda (node) (labels ((descend (parent result) (if (null parent) result (descend (funcall (select-kids (node-typeof? '*)) parent) (append result (funcall (select-kids test-pred?) parent)))))) (descend node '()))))) (defun node-parent (rootnode) "(node-parent rootnode) yields a converter that returns a parent of a node it is applied to. If applied to a nodeset, it returns the list of parents of nodes in the nodeset. The rootnode does not have to be the root node of the whole SXML tree -- it may be a root node of a branch of interest. Given the notation of Philip Wadler's paper on semantics of XSLT, parent(x) = { y | y=subnode*(root), x=subnode(y) } Therefore, node-parent is not the fundamental converter: it can be expressed through the existing ones. Yet node-parent is a rather convenient converter. It corresponds to a parent:: axis of SXPath. Note that the parent:: axis can be used with an attribute node as well!" (lexical-let ((rootnode rootnode)) #'(lambda (node) (if (nodeset? node) (map-union (node-parent rootnode) node) (let ((pred (node-or (node-reduce (node-self (node-typeof? '*)) (select-kids (node-eq? node))) (node-join (select-kids (node-typeof? ':@)) (select-kids (node-eq? node)))))) (funcall (node-or (node-self pred) (node-closure pred)) rootnode)))))) (defun sxpath (path) "Evaluate an abbreviated SXPath path is a list. It is translated to the full sxpath according to the following rewriting rules... ; (sxpath '()) -> (node-join) ; (sxpath '(path-component ...)) -> (node-join ; (sxpath path-component) ; (sxpath '(...))) ; (sxpath '(//) -> (node-or ; (node-self (node-typeof? '*any*)) ; (node-closure (node-typeof? '*any*))) ; (sxpath '(equal? x)) -> (select-kids (node-equal? x)) ; (sxpath '(eq? x)) -> (select-kids (node-eq? x)) ; (sxpath ?symbol) -> (select-kids (node-typeof? ?symbol) ; (sxpath procedure) -> procedure ; (sxpath '(?symbol ...)) -> (sxpath '((?symbol) ...)) ; (sxpath '(path reducer ...)) -> ; (node-reduce (sxpath path) (sxpath reducer) ...) ; (sxpath number) -> (node-pos number) ; (sxpath path-filter) -> (filter (sxpath path-filter))" (lexical-let ((path path)) #'(lambda (nodeset) (labels ((descend (nodeset path) (cond ((null path) nodeset) ((nodeset? nodeset) (map-union (sxpath path) nodeset)) ((functionp (car path)) (descend (funcall (car path) nodeset) (cdr path))) ((eq '// (car path)) (descend (funcall (if (nodeset? nodeset) #'append #'cons) nodeset (funcall (node-closure (node-typeof? '*any*)) nodeset)) (cdr path))) ((symbolp (car path)) (descend (funcall (select-kids (node-typeof? (car path))) nodeset) (cdr path))) ((and (listp (car path)) (eq 'equal? (caar path))) (descend (funcall (select-kids (apply #'node-equal? (cdar path))) nodeset) (cdr path))) ((and (listp (car path)) (eq 'eq? (caar path))) (descend (funcall (select-kids (apply #'node-eq? (cdar path))) nodeset) (cdr path))) ((listp (car path)) (labels ((reducer (nodeset reducing-path) (cond ((null reducing-path) (descend nodeset (cdr path))) ((numberp (car reducing-path)) (reducer (funcall (node-pos (car reducing-path)) nodeset) (cdr reducing-path))) (t (reducer (funcall (filter (sxpath (car reducing-path))) nodeset) (cdr reducing-path)))))) (reducer (if (symbolp (caar path)) (funcall (select-kids (node-typeof? (caar path))) nodeset) (descend nodeset (caar path))) (cdar path)))) (t (error "Invalid path step: ~S" (car path)))))) (descend nodeset path)))))