From mboxrd@z Thu Jan 1 00:00:00 1970
From: rekado
Newsgroups: gmane.lisp.guile.bugs
Subject: bug#19478: [PATCH] Improve SXPath documentation
Date: Wed, 31 Dec 2014 17:45:40 +0100
Message-ID: <87fvbv7op7.fsf@mango.localdomain>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
To: 19478@debbugs.gnu.org
List-Id: "Bug reports for GUILE, GNU's Ubiquitous Extension Language"

--=-=-=
Content-Type: text/plain

Hi,

the SXPath documentation in the Guile manual is rather sparse. To
figure out how to use SXPath functions I had to look up the source code
at module/sxml/upstream/SXPath-old.scm. When I did, I noticed that
there are lots of useful comments that really should be part of the
documentation.

Attached is a patch that takes the comments from the sources and adds
them to the Texinfo sources. I chose to rewrite a few comments to make
them a little clearer and added some Texinfo markup, but most of the
documentation is unchanged from Oleg's comments in the source.

While these changes don't result in great SXML documentation on par
with the rest of the Guile documentation, I do think they make the
SXPath section a lot more useful.

This is the first time I wrote Texinfo documentation (it's not even
close to being the obstacle to contributing that some people on another
mailing list make it out to be), so I'm not sure I chose the most
appropriate markup in all cases.

There is also one instance where I think I did the right thing but the
results are wrong: I added a link to an example further down the page
but in my compiled version of the manual I end up far from the anchor
when I follow the link.
This is the section containing the reference:

  Similarly to XPath, SXPath defines full and abbreviated notations for
  location paths. In both cases, the abbreviated notation can be
  mechanically expanded into the full form by simple rewriting rules. In
  case of SXPath the corresponding rules are given in the documentation of
  the @code{sxpath} procedure. @xref{sxpath-procedure-docs,,SXPath
  procedure documentation}.
  ...

And here's the anchor:

  @anchor{sxpath-procedure-docs}
  @deffn {Scheme Procedure} sxpath path
  Evaluate an abbreviated SXPath.
  ...

I would appreciate it if you could take a look at my changes and
suggest improvements.

Cheers,
rekado

--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment;
 filename=0001-doc-Add-SXPath-documentation-from-sources.patch

>From 1fc011f3c8d25c3a14db2d7f5e9ecb6627bc9102 Mon Sep 17 00:00:00 2001
From: rekado
Date: Wed, 31 Dec 2014 16:48:32 +0100
Subject: [PATCH] doc: Add SXPath documentation from sources

* doc/ref/sxml.texi (SXPath): Add procedure documentation from sources.
---
 doc/ref/sxml.texi | 295 +++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 261 insertions(+), 34 deletions(-)

diff --git a/doc/ref/sxml.texi b/doc/ref/sxml.texi
index 75867f3..2f9cd8d 100644
--- a/doc/ref/sxml.texi
+++ b/doc/ref/sxml.texi
@@ -250,8 +250,8 @@ internal and external parsed entities, user-controlled handling of
 whitespace, and validation. This module therefore is intended to be a
 framework, a set of ``Lego blocks'' you can use to build a parser
 following any discipline and performing validation to any degree. As an
-example of the parser construction, this file includes a semi-validating
-SXML parser.
+example of the parser construction, the source file includes a
+semi-validating SXML parser.
 
 SSAX has a ``sequential'' feel of SAX yet a ``functional style'' of DOM.
 Like a SAX parser, the framework scans the document only once and
@@ -725,95 +725,322 @@ location path is a relative path applied to the root node.
 Similarly to XPath, SXPath defines full and abbreviated notations for
 location paths. In both cases, the abbreviated notation can be
 mechanically expanded into the full form by simple rewriting rules. In
-case of SXPath the corresponding rules are given as comments to a sxpath
-function, below. The regression test suite at the end of this file shows
-a representative sample of SXPaths in both notations, juxtaposed with
-the corresponding XPath expressions. Most of the samples are borrowed
+case of SXPath the corresponding rules are given in the documentation of
+the @code{sxpath} procedure. @xref{sxpath-procedure-docs,,SXPath
+procedure documentation}.
+
+The regression test suite at the end of the file SXPath-old.scm shows a
+representative sample of SXPaths in both notations, juxtaposed with the
+corresponding XPath expressions. Most of the samples are borrowed
 literally from the XPath specification, while the others are adjusted
-for our running example, tree1.
+for our running example, @code{tree1}.
+
+
+@subsubsection Basic converters and applicators
+
+A converter is a function mapping a nodeset (or a single node) to another
+nodeset. Its type can be represented like this:
+
+@smallexample
+ type Converter = Node|Nodeset -> Nodeset
+@end smallexample
+
+A converter can also play the role of a predicate: in that case, if a
+converter, applied to a node or a nodeset, yields a non-empty nodeset,
+the converter-predicate is deemed satisfied. Likewise, an empty nodeset
+is equivalent to @code{#f} in denoting failure.
 
-@subsubsection Usage
 @deffn {Scheme Procedure} nodeset? x
+Return @code{#t} if @var{x} is a nodeset.
 @end deffn
 
 @deffn {Scheme Procedure} node-typeof? crit
+This function implements a 'Node test' as defined in Sec. 2.3 of the
+XPath document. A node test is one of the components of a location
+step. It is also a converter-predicate in SXPath.
+
+The function @code{node-typeof?} takes a type criterion and returns a
+function, which, when applied to a node, will tell if the node satisfies
+the test.
+
+The criterion @var{crit} is a symbol, one of the following:
+
+@table @code
+@item id
+tests if the node has the right name (id)
+
+@item @@
+tests if the node is an <attributes-coll>
+
+@item *
+tests if the node is an <Element>
+
+@item *text*
+tests if the node is a text node
+
+@item *PI*
+tests if the node is a PI (processing instruction) node
+
+@item *any*
+@code{#t} for any type of node
+@end table
 @end deffn
 
 @deffn {Scheme Procedure} node-eq? other
+A curried equivalence converter predicate that takes a node @var{other}
+and returns a function that takes another node. The two nodes are
+compared using @code{eq?}.
 @end deffn
 
 @deffn {Scheme Procedure} node-equal? other
+A curried equivalence converter predicate that takes a node @var{other}
+and returns a function that takes another node. The two nodes are
+compared using @code{equal?}.
 @end deffn
 
 @deffn {Scheme Procedure} node-pos n
+Select the @var{n}'th element of a nodeset and return as a singular
+nodeset. If the @var{n}'th element does not exist, return an empty
+nodeset. If @var{n} is a negative number the node is picked from the
+tail of the list.
+
+@example
+((node-pos 1) nodeset) ; return the head of the nodeset (if exists)
+((node-pos 2) nodeset) ; return the node after that (if exists)
+((node-pos -1) nodeset) ; selects the last node of a non-empty nodeset
+((node-pos -2) nodeset) ; selects the last but one node, if exists.
+@end example
 @end deffn
 
 @deffn {Scheme Procedure} filter pred?
-@verbatim
- -- Scheme Procedure: filter pred list
-     Return all the elements of 2nd arg LIST that satisfy predicate
-     PRED. The list is not disordered - elements that appear in the
-     result list occur in the same order as they occur in the argument
-     list. The returned list may share a common tail with the argument
-     list. The dynamic order in which the various applications of pred
-     are made is not specified.
-
-          (filter even? '(0 7 8 8 43 -4)) => (0 8 8 -4)
-
-
-@end verbatim
+A filter applicator, which introduces a filtering context. The argument
+converter @var{pred?} is considered a predicate, with either @code{#f}
+or @code{nil} meaning failure.
 @end deffn
 
 @deffn {Scheme Procedure} take-until pred?
+@smallexample
+ take-until:: Converter -> Converter, or
+ take-until:: Pred -> Node|Nodeset -> Nodeset
+@end smallexample
+
+Given a converter-predicate @var{pred?} and a nodeset, apply the
+predicate to each element of the nodeset, until the predicate yields
+anything but @code{#f} or @code{nil}. Return the elements of the input
+nodeset that have been processed until that moment (that is, which fail
+the predicate).
+
+@code{take-until} is a variation of the @code{filter} above:
+@code{take-until} passes elements of an ordered input set up to (but not
+including) the first element that satisfies the predicate. The nodeset
+returned by @code{((take-until (not pred)) nset)} is a subset -- to be
+more precise, a prefix -- of the nodeset returned by @code{((filter
+pred) nset)}.
 @end deffn
 
 @deffn {Scheme Procedure} take-after pred?
+@smallexample
+ take-after:: Converter -> Converter, or
+ take-after:: Pred -> Node|Nodeset -> Nodeset
+@end smallexample
+
+Given a converter-predicate @var{pred?} and a nodeset, apply the
+predicate to each element of the nodeset, until the predicate yields
+anything but @code{#f} or @code{nil}. Return the elements of the input
+nodeset that have not been processed: that is, return the elements of
+the input nodeset that follow the first element that satisfied the
+predicate.
+
+@code{take-after} along with @code{take-until} partition an input
+nodeset into three parts: the first element that satisfies a predicate,
+all preceding elements and all following elements.
 @end deffn
 
 @deffn {Scheme Procedure} map-union proc lst
+Apply @var{proc} to each element of @var{lst} and return the list of results.
+If @var{proc} returns a nodeset, splice it into the result.
+
+From another point of view, @code{map-union} is a function
+@code{Converter->Converter}, which places an argument-converter in a joining
+context.
 @end deffn
 
 @deffn {Scheme Procedure} node-reverse node-or-nodeset
+@smallexample
+ node-reverse:: Converter, or
+ node-reverse:: Node|Nodeset -> Nodeset
+@end smallexample
+
+Reverses the order of nodes in the nodeset. This basic converter is
+needed to implement a reverse document order (see the XPath
+Recommendation).
 @end deffn
 
 @deffn {Scheme Procedure} node-trace title
+@smallexample
+ node-trace:: String -> Converter
+@end smallexample
+
+@code{(node-trace title)} is an identity converter. In addition it
+prints out the node or nodeset it is applied to, prefixed with the
+@var{title}. This converter is very useful for debugging.
 @end deffn
 
+@subsubsection Converter combinators
+
+Combinators are higher-order functions that transmogrify a converter or
+glue a sequence of converters into a single, non-trivial converter. The
+goal is to arrive at converters that correspond to XPath location paths.
+
+From a different point of view, a combinator is a fixed, named
+@emph{pattern} of applying converters. Given below is a complete set of
+such patterns that together implement XPath location path specification.
+As it turns out, all these combinators can be built from a small number
+of basic blocks: regular functional composition, @code{map-union} and
+@code{filter} applicators, and the nodeset union.
+
 @deffn {Scheme Procedure} select-kids test-pred?
+@code{select-kids} takes a converter (or a predicate) as an argument and
+returns another converter. The resulting converter applied to a nodeset
+returns an ordered subset of its children that satisfy the predicate
+@var{test-pred?}.
 @end deffn
 
 @deffn {Scheme Procedure} node-self pred?
-@verbatim
- -- Scheme Procedure: filter pred list
-     Return all the elements of 2nd arg LIST that satisfy predicate
-     PRED. The list is not disordered - elements that appear in the
-     result list occur in the same order as they occur in the argument
-     list. The returned list may share a common tail with the argument
-     list. The dynamic order in which the various applications of pred
-     are made is not specified.
-
-          (filter even? '(0 7 8 8 43 -4)) => (0 8 8 -4)
-
-
-@end verbatim
+Similar to @code{select-kids} except that the predicate @var{pred?} is
+applied to the node itself rather than to its children. The resulting
+nodeset will contain either one component, or will be empty if the node
+failed the predicate.
 @end deffn
 
 @deffn {Scheme Procedure} node-join . selectors
+@smallexample
+ node-join:: [LocPath] -> Node|Nodeset -> Nodeset, or
+ node-join:: [Converter] -> Converter
+@end smallexample
+
+Join the sequence of location steps or paths as described above.
 @end deffn
 
 @deffn {Scheme Procedure} node-reduce . converters
+@smallexample
+ node-reduce:: [LocPath] -> Node|Nodeset -> Nodeset, or
+ node-reduce:: [Converter] -> Converter
+@end smallexample
+
+A regular functional composition of converters. From a different point
+of view, @code{((apply node-reduce converters) nodeset)} is equivalent
+to @code{(foldl apply nodeset converters)}, i.e., folding, or reducing,
+a list of converters with the nodeset as a seed.
 @end deffn
 
 @deffn {Scheme Procedure} node-or . converters
+@smallexample
+ node-or:: [Converter] -> Converter
+@end smallexample
+
+This combinator applies all converters to a given node and produces the
+union of their results. This combinator corresponds to a union
+(@code{|} operation) for XPath location paths.
 @end deffn
 
 @deffn {Scheme Procedure} node-closure test-pred?
+@smallexample
+ node-closure:: Converter -> Converter
+@end smallexample
+
+Select all @emph{descendants} of a node that satisfy a
+converter-predicate @var{test-pred?}. This combinator is similar to
+@code{select-kids} but applies to grand... children as well. This
+combinator implements the @code{descendant::} XPath axis. Conceptually,
+this combinator can be expressed as
+
+@smallexample
+(define (node-closure f)
+  (node-or
+    (select-kids f)
+    (node-reduce (select-kids (node-typeof? '*)) (node-closure f))))
+@end smallexample
+
+This definition, as written, looks somewhat like a fixpoint, and it will
+run forever. It is obvious however that sooner or later
+@code{(select-kids (node-typeof? '*))} will return an empty nodeset. At
+this point further iterations will no longer affect the result and can
+be stopped.
 @end deffn
 
 @deffn {Scheme Procedure} node-parent rootnode
+@smallexample
+ node-parent:: RootNode -> Converter
+@end smallexample
+
+@code{(node-parent rootnode)} yields a converter that returns a parent
+of a node it is applied to. If applied to a nodeset, it returns the
+list of parents of nodes in the nodeset. The @var{rootnode} does not
+have to be the root node of the whole SXML tree -- it may be a root node
+of a branch of interest.
+
+Given the notation of Philip Wadler's paper on semantics of XSLT,
+
+@verbatim
+ parent(x) = { y | y=subnode*(root), x=subnode(y) }
+@end verbatim
+
+Therefore, @code{node-parent} is not the fundamental converter: it can
+be expressed through the existing ones. Yet @code{node-parent} is a
+rather convenient converter. It corresponds to a @code{parent::} axis
+of SXPath. Note that the @code{parent::} axis can be used with an
+attribute node as well.
 @end deffn
 
+@anchor{sxpath-procedure-docs}
 @deffn {Scheme Procedure} sxpath path
+Evaluate an abbreviated SXPath.
+
+@smallexample
+ sxpath:: AbbrPath -> Converter, or
+ sxpath:: AbbrPath -> Node|Nodeset -> Nodeset
+@end smallexample
+
+@var{path} is a list. It is translated to the full SXPath according to
+the following rewriting rules:
+
+@example
+(sxpath '())
+@result{} (node-join)
+
+(sxpath '(path-component ...))
+@result{} (node-join (sxpath1 path-component) (sxpath '(...)))
+
+(sxpath1 '//)
+@result{} (node-or
+            (node-self (node-typeof? '*any*))
+            (node-closure (node-typeof? '*any*)))
+
+(sxpath1 '(equal? x))
+@result{} (select-kids (node-equal? x))
+
+(sxpath1 '(eq? x))
+@result{} (select-kids (node-eq? x))
+
+(sxpath1 ?symbol)
+@result{} (select-kids (node-typeof? ?symbol))
+
+(sxpath1 procedure)
+@result{} procedure
+
+(sxpath1 '(?symbol ...))
+@result{} (sxpath1 '((?symbol) ...))
+
+(sxpath1 '(path reducer ...))
+@result{} (node-reduce (sxpath path) (sxpathr reducer) ...)
+
+(sxpathr number)
+@result{} (node-pos number)
+
+(sxpathr path-filter)
+@result{} (filter (sxpath path-filter))
+@end example
 @end deffn
 
 @node sxml ssax input-parse
-- 
2.1.0

--=-=-=--
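
To see the rewriting rules from the patch at work, here is a minimal
sketch in Guile Scheme. It assumes the procedures are imported from the
(sxml xpath) module; the sample tree and the names `tree`, `abbreviated`
and `expanded` are made up for illustration and do not come from the
patch or from the SXPath sources. It compares an abbreviated path with
the combinator form that the rule for a symbol path component suggests.

```scheme
;; Import only what the sketch needs; this also avoids shadowing the
;; core `filter' with the SXPath one.
(use-modules ((sxml xpath)
              #:select (sxpath node-join select-kids node-typeof?)))

;; Hypothetical sample document, not taken from the patch.
(define tree
  '(doc (section (@ (id "a")) (para "one") (para "two"))
        (section (@ (id "b")) (para "three"))))

;; Abbreviated notation: para children of the section children of a node.
(define abbreviated (sxpath '(section para)))

;; The same path written with the combinators directly, following the
;; rule (sxpath1 ?symbol) => (select-kids (node-typeof? ?symbol)).
(define expanded
  (node-join (select-kids (node-typeof? 'section))
             (select-kids (node-typeof? 'para))))

;; Both converters should select the same nodes.
(write (equal? (abbreviated tree) (expanded tree)))
(newline)

(write (abbreviated tree))
(newline)
;; => ((para "one") (para "two") (para "three"))
```

Writing the path out by hand like this is only meant as a check on one's
reading of the rules; in ordinary use the abbreviated `sxpath` form is
the shorter way to say the same thing.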