From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ben Woodcroft Subject: [PATCHES] Add vsearch. Date: Wed, 30 Sep 2015 08:47:14 +1000 Message-ID: <560B14F2.7010103@uq.edu.au> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="------------040107020205090800040402" Return-path: Received: from eggs.gnu.org ([2001:4830:134:3::10]:41345) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Zh3gE-0003HQ-8E for guix-devel@gnu.org; Tue, 29 Sep 2015 18:47:27 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Zh3gA-0007ei-UB for guix-devel@gnu.org; Tue, 29 Sep 2015 18:47:26 -0400 Received: from mailhub1.soe.uq.edu.au ([130.102.132.208]:40762 helo=newmailhub.uq.edu.au) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Zh3gA-0007dv-Aw for guix-devel@gnu.org; Tue, 29 Sep 2015 18:47:22 -0400 Received: from smtp2.soe.uq.edu.au (smtp2.soe.uq.edu.au [10.138.113.41]) by newmailhub.uq.edu.au (8.14.5/8.14.5) with ESMTP id t8TMlH4X003966 for ; Wed, 30 Sep 2015 08:47:18 +1000 Received: from [192.168.1.101] ([103.25.181.216]) (authenticated bits=0) by smtp2.soe.uq.edu.au (8.14.5/8.14.5) with ESMTP id t8TMlFQd041586 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT) for ; Wed, 30 Sep 2015 08:47:17 +1000 List-Id: "Development of GNU Guix and the GNU System distribution." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org Sender: guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org To: "guix-devel@gnu.org" This is a multi-part message in MIME format. --------------040107020205090800040402 Content-Type: multipart/alternative; boundary="------------060205000407060504010602" --------------060205000407060504010602 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Apologies if this is a duplicate email. Excellent to see an open source competitor to usearch. 
Thanks in advance for review as usual. I'm not especially adept at using gcc's flags so perhaps some attention is warranted in the second patch's snippet. --------------060205000407060504010602 Content-Type: text/html; charset=utf-8 Content-Transfer-Encoding: 7bit
--------------060205000407060504010602-- --------------040107020205090800040402 Content-Type: text/x-patch; name="0001-gnu-Add-cityhash.patch" Content-Disposition: attachment; filename="0001-gnu-Add-cityhash.patch" Content-Transfer-Encoding: quoted-printable X-MIME-Autoconverted: from 8bit to quoted-printable by newmailhub.uq.edu.au id t8TMlH4X003966 >From e1789cbcfcf7dc6f1657f53bad04fca7180400cc Mon Sep 17 00:00:00 2001 From: Ben Woodcroft Date: Tue, 29 Sep 2015 22:10:33 +1000 Subject: [PATCH 1/2] gnu: Add cityhash. * gnu/packages/textutils.scm (cityhash): New variable. --- gnu/packages/textutils.scm | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/gnu/packages/textutils.scm b/gnu/packages/textutils.scm index 95a8ad1..5204297 100644 --- a/gnu/packages/textutils.scm +++ b/gnu/packages/textutils.scm @@ -1,6 +1,7 @@ ;;; GNU Guix --- Functional package management for GNU ;;; Copyright © 2015 Taylan Ulrich Bayırlı/Kammer ;;; Copyright © 2015 Ricardo Wurmus +;;; Copyright © 2015 Ben Woodcroft ;;; ;;; This file is part of GNU Guix. ;;; @@ -153,3 +154,26 @@ encoding, supporting Unicode version 7.0.") "libgtextutils is a text utilities library used by the fastx toolkit from the Hannon Lab.") (license license:agpl3+))) + +(define-public cityhash + (let ((commit "8af9b8c") + (revision "1")) + (package + (name "cityhash") + (version (string-append "1.1." revision "." commit)) + (source (origin + (method git-fetch) + (uri (git-reference + (url "https://github.com/google/cityhash.git") + (commit commit))) + (file-name (string-append name "-" version ".tar.gz")) + (sha256 + (base32 + "0n6skf5dv8yfl1ckax8dqhvsbslkwc9158zf2ims0xqdvzsahbi6")))) + (build-system gnu-build-system) + (home-page "https://github.com/google/cityhash") + (synopsis "A family of functions for strings") + (description + "CityHash provides hash functions for strings. 
The functions mix the +input bits thoroughly but are not suitable for cryptography.") + (license license:expat)))) -- 2.4.3 --------------040107020205090800040402 Content-Type: text/x-patch; name="0002-gnu-Add-vsearch.patch" Content-Disposition: attachment; filename="0002-gnu-Add-vsearch.patch" Content-Transfer-Encoding: 7bit >From c2470ec0681ccb18687bae5459247dc869fc8555 Mon Sep 17 00:00:00 2001 From: Ben Woodcroft Date: Tue, 29 Sep 2015 22:17:10 +1000 Subject: [PATCH 2/2] gnu: Add vsearch. * gnu/packages/bioinformatics.scm (vsearch): New variable. --- gnu/packages/bioinformatics.scm | 60 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 60 insertions(+) diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm index 8fc6142..f484efc 100644 --- a/gnu/packages/bioinformatics.scm +++ b/gnu/packages/bioinformatics.scm @@ -33,6 +33,7 @@ #:use-module (guix build-system ruby) #:use-module (guix build-system trivial) #:use-module (gnu packages) + #:use-module (gnu packages autotools) #:use-module (gnu packages algebra) #:use-module (gnu packages base) #:use-module (gnu packages boost) @@ -53,6 +54,7 @@ #:use-module (gnu packages statistics) #:use-module (gnu packages tbb) #:use-module (gnu packages textutils) + #:use-module (gnu packages tls) #:use-module (gnu packages vim) #:use-module (gnu packages web) #:use-module (gnu packages xml) @@ -2709,6 +2711,64 @@ data in the form of VCF files.") ;; at http://vcftools.sourceforge.net/license.html (license license:lgpl3))) +(define-public vsearch + (package + (name "vsearch") + (version "1.4.0") + (source + (origin + (method url-fetch) + (uri (string-append + "https://github.com/torognes/vsearch/archive/v" + version ".tar.gz")) + (file-name (string-append name "-" version ".tar.gz")) + (sha256 + (base32 + "16cr3wd59qkhb5p4yakphz2k4qd09s9kfgavyzzngal6qzqd4km2")) + (modules '((guix build utils))) + (snippet + '(begin + ;; Remove bundled cityhash + (substitute* "src/Makefile.am" + 
(((string-append "^AM_CXXFLAGS=-I\\$\\{srcdir\\}/cityhash" + " -O3 -mtune=native -Wall -Wsign-compare")) + (string-append "AM_CXXFLAGS=-lcityhash" + " -O3 -mtune=native -Wall -Wsign-compare")) + (("^__top_builddir__bin_vsearch_SOURCES = cityhash/city.h \\\\") + "__top_builddir__bin_vsearch_SOURCES = \\") + (("^cityhash/config.h \\\\") "\\") + (("^cityhash/city.cc \\\\") "\\")) + (substitute* "src/vsearch.h" + (("^\\#include \"cityhash/city.h\"") + "#include ")) + (delete-file-recursively "src/cityhash") + #t)))) + (build-system gnu-build-system) + (arguments + `(#:phases + (modify-phases %standard-phases + (add-before 'configure 'autogen + (lambda _ (zero? (system* "autoreconf" "-vif"))))))) + (inputs + `(("zlib" ,zlib) + ("bzip2" ,bzip2) + ("cityhash" ,cityhash))) + (native-inputs + `(("autoconf" ,autoconf) + ("automake" ,automake) + ("openssl" ,openssl))) + (synopsis "Sequence search tools for metagenomics") + (description + "VSEARCH supports DNA sequence searching, clustering, chimera detection, +dereplication, pairwise alignment, shuffling, subsampling, sorting and +masking. The tool takes advantage of parallelism in the form of SIMD +vectorization as well as multiple threads to perform accurate alignments at +high speed. VSEARCH uses an optimal global aligner (full dynamic programming +Needleman-Wunsch).") + (home-page "https://github.com/torognes/vsearch") + ;; dual licensed + (license (list license:gpl3 license:bsd-2)))) + (define-public bio-locus (package (name "bio-locus") -- 2.4.3 --------------040107020205090800040402--