From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:32771)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1dpXeo-0001YW-UQ for guix-patches@gnu.org;
	Wed, 06 Sep 2017 06:34:09 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1dpXek-0004Ku-7K for guix-patches@gnu.org;
	Wed, 06 Sep 2017 06:34:06 -0400
Received: from debbugs.gnu.org ([208.118.235.43]:43637)
	by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71)
	(envelope-from ) id 1dpXek-0004Kh-44 for guix-patches@gnu.org;
	Wed, 06 Sep 2017 06:34:02 -0400
Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2)
	(envelope-from ) id 1dpXej-0000Dr-Rt for guix-patches@gnu.org;
	Wed, 06 Sep 2017 06:34:01 -0400
Subject: [bug#28235] [PATCH 2/3] gnu: Add python-html5-parser, python2-html5-parser
References: <878ti7tsli.fsf@gnu.org> <8760dbtsf0.fsf@gnu.org> <874lsll4n0.fsf@fastmail.com>
From: Roel Janssen
Message-ID: <87zia8xglm.fsf@gnu.org>
In-reply-to: <874lsll4n0.fsf@fastmail.com>
Date: Wed, 06 Sep 2017 12:32:51 +0200
MIME-Version: 1.0
Content-Type: text/plain
Errors-To: guix-patches-bounces+kyle=kyleam.com@gnu.org
Sender: "Guix-patches"
To: Marius Bakke
Cc: 28235@debbugs.gnu.org

Marius Bakke writes:

> Roel Janssen writes:
>
>> * gnu/packages/python.scm (python-html5-parser): New variable.
>> (python2-html5-parser: New variable.
>> ---
>>  gnu/packages/python.scm | 29 +++++++++++++++++++++++++++++
>>  1 file changed, 29 insertions(+)
>>
>> diff --git a/gnu/packages/python.scm b/gnu/packages/python.scm
>> index 9bf46fb6f..8629228db 100644
>> --- a/gnu/packages/python.scm
>> +++ b/gnu/packages/python.scm
>> @@ -5868,6 +5868,35 @@ and written in Python.")
>>  (define-public python2-html5lib-0.9
>>    (package-with-python2 python-html5lib-0.9))
>>
>> +(define-public python-html5-parser
>> +  (package
>> +    (name "python-html5-parser")
>> +    (version "0.4.4")
>> +    (source (origin
>> +              (method url-fetch)
>> +              (uri (pypi-uri "html5-parser" version))
>> +              (sha256
>> +               (base32
>> +                "1d8sxhl41ffh7qlk7wlsy17xw6slzx5v1yna9s72wx5qrpaa3wxr"))))
>> +    (build-system python-build-system)
>> +    (native-inputs
>> +     `(("pkg-config" ,pkg-config)))
>> +    (inputs
>> +     `(("libxml2" ,libxml2)))
>> +    (propagated-inputs
>> +     `(("python-lxml" ,python-lxml)
>> +       ("python-beautifulsoup4" ,python-beautifulsoup4)))
>> +    (home-page "https://html5-parser.readthedocs.io")
>> +    (synopsis "Fast C-based HTML5 parsing for Python")
>> +    (description "This package provides a fast implementation of the HTML5
>> +parsing spec for Python. Parsing is done in C using a variant of the gumbo
>> +parser. The gumbo parse tree is then transformed into an lxml tree, also in
>> +C, yielding parse times that can be a thirtieth of the html5lib parse times.")
>> +    (license license:asl2.0)))
>
> The files 'src/as-libxml.[ch]' are GPL3. Everything else in this series LGTM!

Oh, it seems this is the case for as-python-tree.[ch], not as-libxml.[ch].
Good catch! I'll update the license list and push this patch series.

Thanks for your time.

Kind regards,
Roel Janssen
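
For illustration, the updated license field mentioned above might look roughly like the fragment below. This is only a sketch: the thread says "GPL3" without stating whether that means `license:gpl3` or `license:gpl3+`, so the variant used here is an assumption, and the pushed commit may differ.

```scheme
;; Hypothetical replacement for the single-license field in the
;; python-html5-parser package definition quoted above.
;; ASSUMPTION: license:gpl3 is a guess; the thread only says "GPL3".
(license (list license:asl2.0    ; the bulk of the package
               license:gpl3))    ; src/as-python-tree.[ch], per the review
```

In Guix, a package's license field can hold either a single license object or a list of them, so a list is the usual way to record that different files fall under different terms.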