From: swedebugia
Subject: Re: Help with sxml simple parser for the quicklisp importer
Date: Wed, 23 Jan 2019 17:03:02 +0100
To: Ricardo Wurmus
Cc: guix-devel

On 2019-01-23 15:22, Ricardo Wurmus wrote:
> Hi,
>
>> (define (get-homepage name)
>>   "Get the latest meta release file. From the links in this we extract all
>> other information we need."
>>   (call-with-temporary-output-file
>>    (lambda (temp port)
>>      (and (url-fetch (homepage name) temp)
>>           (xml->sxml (get-string-all port))))))
>
> Aside: you don’t need to use “get-string-all”; “xml->sxml” can read
> directly from a port.
>
>> But it errors out with:
>>
>> sxml/simple.scm:143:4: In procedure loop:
>> Throw to key `parser-error' with args `(#
>> "[GIMatch] broken for " (END . head) " while expecting " END link)'.
>
> I fetched the document. Here’s the part that it barfs on:
>
> --8<---------------cut here---------------start------------->8---
> 1am | Quickdocs
> …
> --8<---------------cut here---------------end--------------->8---
>
> The second “link” tag opens but is never closed. This may be valid
> HTML, but it is not valid XML, which is what xml->sxml expects.

Thanks for the quick answer!

I will try to remove this line before handing it over to the parser.

-- 
Cheers
Swedebugia
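
A rough sketch of that plan, not the importer's actual code. It assumes the offending tag sits on a line of its own and simply skips every line containing "<link" before parsing. The helper name drop-link-lines and the sample page (including the link attributes) are made up for illustration; only xml->sxml and the page title come from the thread. It also follows the aside above and hands a port straight to xml->sxml.

```scheme
(use-modules (sxml simple)     ; xml->sxml
             (ice-9 rdelim))   ; read-line

;; Hypothetical helper: copy PORT into a string, skipping every line that
;; contains an opening "<link" tag.  Assumes each such tag is on its own line.
(define (drop-link-lines port)
  (call-with-output-string
    (lambda (out)
      (let loop ((line (read-line port)))
        (unless (eof-object? line)
          (unless (string-contains line "<link")
            (display line out)
            (newline out))
          (loop (read-line port)))))))

;; Made-up stand-in for the fetched page; the real markup may differ.
(define sample
  (string-append
   "<html>\n"
   "<head>\n"
   "<title>1am | Quickdocs</title>\n"
   "<link rel=\"stylesheet\" href=\"style.css\">\n"
   "</head>\n"
   "<body><p>hello</p></body>\n"
   "</html>\n"))

;; Strip the unclosed tag first ...
(define cleaned
  (call-with-input-string sample drop-link-lines))

;; ... then let xml->sxml read directly from a port (no get-string-all).
(define tree
  (call-with-input-string cleaned xml->sxml))
```

This only works around the one unclosed tag; other HTML-only constructs further down the page would still trip the XML parser, so a line filter like this is a stopgap rather than a general fix.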