unofficial mirror of guix-patches@gnu.org 
* [bug#47930] [PATCH] gnu: Add pbgzip.
@ 2021-04-21 12:26 Roel Janssen
  2021-04-21 21:44 ` Xinglu Chen
  2021-04-21 21:45 ` Xinglu Chen
  0 siblings, 2 replies; 10+ messages in thread
From: Roel Janssen @ 2021-04-21 12:26 UTC
  To: 47930

[-- Attachment #1: Type: text/plain, Size: 172 bytes --]

Hi Guix,

Here's a patch to add pbgzip.  Lint complains that there is no release
on the GitHub page, but there's nothing I can do about it.

Kind regards,
Roel Janssen



[-- Attachment #2: 0001-gnu-Add-pbgzip.patch --]
[-- Type: text/x-patch, Size: 3186 bytes --]

From 3d34e82ee67f5ee0b226de350f40d7f881169a56 Mon Sep 17 00:00:00 2001
From: Roel Janssen <roel@gnu.org>
Date: Wed, 21 Apr 2021 14:24:07 +0200
Subject: [PATCH] gnu: Add pbgzip.

* gnu/packages/bioinformatics.scm (pbgzip): New variable.
---
 gnu/packages/bioinformatics.scm | 42 ++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 31205c473a..35601378c2 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -3,7 +3,7 @@
 ;;; Copyright © 2015, 2016, 2017, 2018 Ben Woodcroft <donttrustben@gmail.com>
 ;;; Copyright © 2015, 2016, 2018, 2019, 2020 Pjotr Prins <pjotr.guix@thebird.nl>
 ;;; Copyright © 2015 Andreas Enge <andreas@enge.fr>
-;;; Copyright © 2016, 2020 Roel Janssen <roel@gnu.org>
+;;; Copyright © 2016, 2020, 2021 Roel Janssen <roel@gnu.org>
 ;;; Copyright © 2016, 2017, 2018, 2019, 2020, 2021 Efraim Flashner <efraim@flashner.co.il>
 ;;; Copyright © 2016, 2020 Marius Bakke <mbakke@fastmail.com>
 ;;; Copyright © 2016, 2018 Raoul Bonnal <ilpuccio.febo@gmail.com>
@@ -569,6 +569,46 @@ input and output BAMs must adhere to the PacBio BAM format specification.
 Non-PacBio BAMs will cause exceptions to be thrown.")
     (license license:bsd-3)))
 
+(define-public pbgzip
+  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
+    (package
+      (name "pbgzip")
+      (version (string-take commit 7))
+      (source (origin
+                (method git-fetch)
+                (uri (git-reference
+                      (url "https://github.com/nh13/pbgzip")
+                      (commit commit)))
+                (file-name (string-append name "-" version))
+                (sha256
+                 (base32
+                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
+      (build-system gnu-build-system)
+      (arguments
+       `(#:phases
+         (modify-phases %standard-phases
+           (add-after 'unpack 'autogen
+             (lambda _
+               (zero? (system* "sh" "autogen.sh")))))))
+      (native-inputs
+       `(("autoconf" ,autoconf)
+         ("automake" ,automake)))
+      (inputs
+       `(("zlib" ,zlib)))
+      (home-page "https://github.com/nh13/pbgzip")
+      (synopsis "Parallel Block GZIP")
+      (description "This package implements parallel block gzip.  For many
+formats, in particular genomics data formats, data are compressed in
+fixed-length blocks such that they can be easily indexed based on a (genomic)
+coordinate order, since typically each block is sorted according to this order.
+This allows for each block to be individually compressed (deflated), or more
+importantly, decompressed (inflated), with the latter enabling random retrieval
+of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
+to any particular format, but certain features are tailored to genomics data
+formats when enabled.  Parallel decompression is somewhat faster, but truly the
+speedup comes during compression.")
+      (license license:expat))))
+
 (define-public blasr-libcpp
   (package
     (name "blasr-libcpp")
-- 
2.31.1



* [bug#47930] [PATCH] gnu: Add pbgzip.
  2021-04-21 12:26 [bug#47930] [PATCH] gnu: Add pbgzip Roel Janssen
@ 2021-04-21 21:44 ` Xinglu Chen
  2021-04-21 21:45 ` Xinglu Chen
  1 sibling, 0 replies; 10+ messages in thread
From: Xinglu Chen @ 2021-04-21 21:44 UTC
  To: Roel Janssen, 47930

[-- Attachment #1: Type: text/plain, Size: 1257 bytes --]

On Wed, Apr 21 2021, Roel Janssen wrote:

> * gnu/packages/bioinformatics.scm (pbgzip): New variable.
>
> [...]
>  
> +(define-public pbgzip
> +  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
> +    (package
> +      (name "pbgzip")
> +      (version (string-take commit 7))

I think using (git-version VERSION REVISION COMMIT) is preferred.
Something like (git-version "0.0.0" "0" commit).

> +      (source (origin
> +                (method git-fetch)
> +                (uri (git-reference
> +                      (url "https://github.com/nh13/pbgzip")
> +                      (commit commit)))
> +                (file-name (string-append name "-" version))

Use (git-file-name name version).
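
For reference, a minimal sketch in plain Guile of the strings these two
helpers are expected to produce.  The `sketch-` procedures are
hypothetical stand-ins written for illustration; they mirror, as far as
can be told, the string layout used by (guix git-download), but they are
not the Guix definitions themselves.

```scheme
(use-modules (srfi srfi-13))            ; string-take

;; Hypothetical stand-in for git-version: VERSION-REVISION.COMMIT,
;; keeping only the first 7 characters of COMMIT.
(define (sketch-git-version version revision commit)
  (string-append version "-" revision "." (string-take commit 7)))

;; Hypothetical stand-in for git-file-name: NAME-VERSION-checkout.
(define (sketch-git-file-name name version)
  (string-append name "-" version "-checkout"))

(define commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974")
(define version (sketch-git-version "0.0.0" "0" commit))

(display version)                                  ; 0.0.0-0.2b09f97
(newline)
(display (sketch-git-file-name "pbgzip" version))  ; pbgzip-0.0.0-0.2b09f97-checkout
(newline)
```

Compared with (string-take commit 7) alone, a version of this shape
sorts correctly when the revision is bumped, and the "-checkout" suffix
makes it clear that the store item is a Git checkout rather than a
tarball.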

> +                (sha256
> +                 (base32
> +                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
> +      (build-system gnu-build-system)
> +      (arguments
> +       `(#:phases
> +         (modify-phases %standard-phases
> +           (add-after 'unpack 'autogen
> +             (lambda _
> +               (zero? (system* "sh" "autogen.sh")))))))

IIRC, phases don’t have to return #t, so you could remove ‘zero?’.

Builds fine, but I haven’t tested it.


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 861 bytes --]


* [bug#47930] [PATCH] gnu: Add pbgzip.
  2021-04-21 12:26 [bug#47930] [PATCH] gnu: Add pbgzip Roel Janssen
  2021-04-21 21:44 ` Xinglu Chen
@ 2021-04-21 21:45 ` Xinglu Chen
  2021-04-22 16:40   ` Maxime Devos
  1 sibling, 1 reply; 10+ messages in thread
From: Xinglu Chen @ 2021-04-21 21:45 UTC
  To: Roel Janssen, 47930

On Wed, Apr 21 2021, Roel Janssen wrote:

> * gnu/packages/bioinformatics.scm (pbgzip): New variable.
>
> [...]
>  
> +(define-public pbgzip
> +  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
> +    (package
> +      (name "pbgzip")
> +      (version (string-take commit 7))

I think using (git-version VERSION REVISION COMMIT) is preferred.
Something like (git-version "0.0.0" "0" commit).

> +      (source (origin
> +                (method git-fetch)
> +                (uri (git-reference
> +                      (url "https://github.com/nh13/pbgzip")
> +                      (commit commit)))
> +                (file-name (string-append name "-" version))

Use (git-file-name name version).

> +                (sha256
> +                 (base32
> +                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
> +      (build-system gnu-build-system)
> +      (arguments
> +       `(#:phases
> +         (modify-phases %standard-phases
> +           (add-after 'unpack 'autogen
> +             (lambda _
> +               (zero? (system* "sh" "autogen.sh")))))))

IIRC, phases don’t have to return #t, so you could remove ‘zero?’.

Builds fine, but I haven’t tested it.






* [bug#47930] [PATCH] gnu: Add pbgzip.
  2021-04-21 21:45 ` Xinglu Chen
@ 2021-04-22 16:40   ` Maxime Devos
  2021-04-29  7:29     ` Efraim Flashner
  0 siblings, 1 reply; 10+ messages in thread
From: Maxime Devos @ 2021-04-22 16:40 UTC
  To: Xinglu Chen, Roel Janssen, 47930

[-- Attachment #1: Type: text/plain, Size: 723 bytes --]

Xinglu Chen wrote on Wed 21-04-2021 at 23:45 [+0200]:
> On Wed, Apr 21 2021, Roel Janssen wrote:
> 
> > [...]
> > +      (arguments
> > +       `(#:phases
> > +         (modify-phases %standard-phases
> > +           (add-after 'unpack 'autogen
> > +             (lambda _
> > +               (zero? (system* "sh" "autogen.sh")))))))
> 
> IIRC, phases don’t have to return #t, so you could remove ‘zero?’.

Try running (system* "does-not-exist").  It will fail by returning
something non-zero.  If I recall how to call "invoke" correctly,
I would recommend (invoke "sh" "autogen.sh") here.  "invoke" raises
an exception when the command fails, instead of returning something.
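
A minimal sketch of the difference, in plain Guile.  `sketch-invoke` is a
hypothetical helper written only for this illustration; the real
`invoke` comes from (guix build utils) and raises its own error
condition.  Note that in Scheme every integer counts as true, so a bare
(system* ...) as the last expression of a phase cannot signal failure
through its return value.

```scheme
;; Hypothetical stand-in for `invoke': run PROGRAM with ARGS and raise
;; an error unless the exit status is zero.
(define (sketch-invoke program . args)
  (let ((status (apply system* program args)))
    (unless (zero? status)
      (error "command failed" program args status))
    #t))

;; system* reports failure only through its return value ...
(display (zero? (system* "sh" "-c" "exit 1")))   ; #f
(newline)

;; ... whereas the wrapper turns the failure into an exception.
(display (catch #t
           (lambda () (sketch-invoke "sh" "-c" "exit 1"))
           (lambda (key . rest) 'raised)))       ; raised
(newline)
```

This assumes an "sh" executable is available in PATH, as it is in the
build environment.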

Greetings,
Maxime.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 260 bytes --]


* [bug#47930] [PATCH] gnu: Add pbgzip.
  2021-04-22 16:40   ` Maxime Devos
@ 2021-04-29  7:29     ` Efraim Flashner
  2021-04-29 12:22       ` Roel Janssen
  0 siblings, 1 reply; 10+ messages in thread
From: Efraim Flashner @ 2021-04-29  7:29 UTC
  To: Maxime Devos; +Cc: Xinglu Chen, 47930, Roel Janssen

[-- Attachment #1: Type: text/plain, Size: 1116 bytes --]

On Thu, Apr 22, 2021 at 06:40:46PM +0200, Maxime Devos wrote:
> Xinglu Chen wrote on Wed 21-04-2021 at 23:45 [+0200]:
> > On Wed, Apr 21 2021, Roel Janssen wrote:
> > 
> > > [...]
> > > +      (arguments
> > > +       `(#:phases
> > > +         (modify-phases %standard-phases
> > > +           (add-after 'unpack 'autogen
> > > +             (lambda _
> > > +               (zero? (system* "sh" "autogen.sh")))))))
> > 
> > IIRC, phases don’t have to return #t, so you could remove ‘zero?’.
> 
> Try running (system* "does-not-exist").  It will fail by returning
> something non-zero.  If I recall how to call "invoke" correctly,
> I would recommend (invoke "sh" "autogen.sh") here.  "invoke" raises
> an exception when the command fails, instead of returning something.

While we're at it, can this phase replace 'bootstrap? It seems to me we
shouldn't need both phases.
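
An untested sketch of what that could look like, as a fragment of the
package's `arguments` field.  It assumes `invoke` is in scope in the
build-side code, as is normally the case with gnu-build-system.

```scheme
(arguments
 `(#:phases
   (modify-phases %standard-phases
     ;; Run the upstream script in place of the stock 'bootstrap phase
     ;; instead of adding a separate 'autogen phase next to it.
     (replace 'bootstrap
       (lambda _
         (invoke "sh" "autogen.sh"))))))
```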


-- 
Efraim Flashner   <efraim@flashner.co.il>   אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D  14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]


* [bug#47930] [PATCH] gnu: Add pbgzip.
  2021-04-29  7:29     ` Efraim Flashner
@ 2021-04-29 12:22       ` Roel Janssen
  2021-04-30  8:30         ` Xinglu Chen
  0 siblings, 1 reply; 10+ messages in thread
From: Roel Janssen @ 2021-04-29 12:22 UTC
  To: Efraim Flashner, Maxime Devos, Xinglu Chen; +Cc: 47930

[-- Attachment #1: Type: text/plain, Size: 1214 bytes --]

On 4/29/21 9:29 AM, Efraim Flashner wrote:
> On Thu, Apr 22, 2021 at 06:40:46PM +0200, Maxime Devos wrote:
>> Xinglu Chen wrote on Wed 21-04-2021 at 23:45 [+0200]:
>>> On Wed, Apr 21 2021, Roel Janssen wrote:
>>>
>>>> [...]
>>>> +      (arguments
>>>> +       `(#:phases
>>>> +         (modify-phases %standard-phases
>>>> +           (add-after 'unpack 'autogen
>>>> +             (lambda _
>>>> +               (zero? (system* "sh" "autogen.sh")))))))
>>> IIRC, phases don’t have to return #t, so you could remove ‘zero?’.
>> Try running (system* "does-not-exist").  It will fail by returning
>> something non-zero.  If I recall how to call "invoke" correctly,
>> I would recommend (invoke "sh" "autogen.sh") here.  "invoke" raises
>> an exception when the command fails, instead of returning something.
> While we're at it, can this phase replace 'bootstrap? It seems to me we
> shouldn't need both phases.
This indeed seems to be the best thing to do.  I attached a new patch.

I had to leave autoconf and automake in the native-inputs because
otherwise the command "aclocal" and "autom4te" couldn't be found.

Thanks all for the feedback!  I hope this new patch is fine.

Kind regards,
Roel Janssen


[-- Attachment #2: 0001-gnu-Add-pbgzip.patch --]
[-- Type: text/x-patch, Size: 2986 bytes --]

From b03f8d8926cdd6a28502f2bdc6db74854144f050 Mon Sep 17 00:00:00 2001
From: Roel Janssen <roel@gnu.org>
Date: Thu, 29 Apr 2021 14:18:30 +0200
Subject: [PATCH] gnu: Add pbgzip.

* gnu/packages/bioinformatics.scm (pbgzip): New variable.
---
 gnu/packages/bioinformatics.scm | 36 ++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 83ebfc2d8f..8c4d0fc649 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -3,7 +3,7 @@
 ;;; Copyright © 2015, 2016, 2017, 2018 Ben Woodcroft <donttrustben@gmail.com>
 ;;; Copyright © 2015, 2016, 2018, 2019, 2020 Pjotr Prins <pjotr.guix@thebird.nl>
 ;;; Copyright © 2015 Andreas Enge <andreas@enge.fr>
-;;; Copyright © 2016, 2020 Roel Janssen <roel@gnu.org>
+;;; Copyright © 2016, 2020, 2021 Roel Janssen <roel@gnu.org>
 ;;; Copyright © 2016, 2017, 2018, 2019, 2020, 2021 Efraim Flashner <efraim@flashner.co.il>
 ;;; Copyright © 2016, 2020 Marius Bakke <mbakke@fastmail.com>
 ;;; Copyright © 2016, 2018 Raoul Bonnal <ilpuccio.febo@gmail.com>
@@ -571,6 +571,40 @@ input and output BAMs must adhere to the PacBio BAM format specification.
 Non-PacBio BAMs will cause exceptions to be thrown.")
     (license license:bsd-3)))
 
+(define-public pbgzip
+  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
+    (package
+      (name "pbgzip")
+      (version (string-take commit 7))
+      (source (origin
+                (method git-fetch)
+                (uri (git-reference
+                      (url "https://github.com/nh13/pbgzip")
+                      (commit commit)))
+                (file-name (string-append name "-" version))
+                (sha256
+                 (base32
+                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
+      (build-system gnu-build-system)
+      (native-inputs
+       `(("autoconf" ,autoconf)
+         ("automake" ,automake)))
+      (inputs
+       `(("zlib" ,zlib)))
+      (home-page "https://github.com/nh13/pbgzip")
+      (synopsis "Parallel Block GZIP")
+      (description "This package implements parallel block gzip.  For many
+formats, in particular genomics data formats, data are compressed in
+fixed-length blocks such that they can be easily indexed based on a (genomic)
+coordinate order, since typically each block is sorted according to this order.
+This allows for each block to be individually compressed (deflated), or more
+importantly, decompressed (inflated), with the latter enabling random retrieval
+of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
+to any particular format, but certain features are tailored to genomics data
+formats when enabled.  Parallel decompression is somewhat faster, but truly the
+speedup comes during compression.")
+      (license license:expat))))
+
 (define-public blasr-libcpp
   (package
     (name "blasr-libcpp")
-- 
2.31.1



* [bug#47930] [PATCH] gnu: Add pbgzip.
  2021-04-29 12:22       ` Roel Janssen
@ 2021-04-30  8:30         ` Xinglu Chen
  2021-04-30 11:48           ` Roel Janssen
  0 siblings, 1 reply; 10+ messages in thread
From: Xinglu Chen @ 2021-04-30  8:30 UTC
  To: Roel Janssen, Efraim Flashner, Maxime Devos; +Cc: 47930

On Thu, Apr 29 2021, Roel Janssen wrote:

> +(define-public pbgzip
> +  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
> +    (package
> +      (name "pbgzip")
> +      (version (string-take commit 7))

Maybe you missed my previous suggestions?

  https://issues.guix.gnu.org/47930#2
  
> +      (source (origin
> +                (method git-fetch)
> +                (uri (git-reference
> +                      (url "https://github.com/nh13/pbgzip")
> +                      (commit commit)))
> +                (file-name (string-append name "-" version))
> +                (sha256
> +                 (base32
> +                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
> +      (build-system gnu-build-system)
> +      (native-inputs
> +       `(("autoconf" ,autoconf)
> +         ("automake" ,automake)))
> +      (inputs
> +       `(("zlib" ,zlib)))
> +      (home-page "https://github.com/nh13/pbgzip")
> +      (synopsis "Parallel Block GZIP")
> +      (description "This package implements parallel block gzip.  For many
> +formats, in particular genomics data formats, data are compressed in
> +fixed-length blocks such that they can be easily indexed based on a (genomic)
> +coordinate order, since typically each block is sorted according to this order.
> +This allows for each block to be individually compressed (deflated), or more
> +importantly, decompressed (inflated), with the latter enabling random retrieval
> +of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
> +to any particular format, but certain features are tailored to genomics data
> +formats when enabled.  Parallel decompression is somewhat faster, but truly the
                                                                     ^^^^^^^^^^^^^
> +speedup comes during compression.")
   ^^^^^^^

“but the true speedup” instead?






* [bug#47930] [PATCH] gnu: Add pbgzip.
  2021-04-30  8:30         ` Xinglu Chen
@ 2021-04-30 11:48           ` Roel Janssen
  2021-04-30 11:53             ` Efraim Flashner
  0 siblings, 1 reply; 10+ messages in thread
From: Roel Janssen @ 2021-04-30 11:48 UTC
  To: Xinglu Chen, Efraim Flashner, Maxime Devos; +Cc: 47930

[-- Attachment #1: Type: text/plain, Size: 2632 bytes --]

On Fri, 2021-04-30 at 10:30 +0200, Xinglu Chen wrote:
> On Thu, Apr 29 2021, Roel Janssen wrote:
> 
> > +(define-public pbgzip
> > +  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
> > +    (package
> > +      (name "pbgzip")
> > +      (version (string-take commit 7))
> 
> Maybe you missed my previous suggestions?
> 
>   https://issues.guix.gnu.org/47930#2
> 

I'm sorry, I forgot to apply them.
>   
> > +      (source (origin
> > +                (method git-fetch)
> > +                (uri (git-reference
> > +                      (url "https://github.com/nh13/pbgzip")
> > +                      (commit commit)))
> > +                (file-name (string-append name "-" version))
> > +                (sha256
> > +                 (base32
> > +                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
> > +      (build-system gnu-build-system)
> > +      (native-inputs
> > +       `(("autoconf" ,autoconf)
> > +         ("automake" ,automake)))
> > +      (inputs
> > +       `(("zlib" ,zlib)))
> > +      (home-page "https://github.com/nh13/pbgzip")
> > +      (synopsis "Parallel Block GZIP")
> > +      (description "This package implements parallel block gzip.  For many
> > +formats, in particular genomics data formats, data are compressed in
> > +fixed-length blocks such that they can be easily indexed based on a (genomic)
> > +coordinate order, since typically each block is sorted according to this order.
> > +This allows for each block to be individually compressed (deflated), or more
> > +importantly, decompressed (inflated), with the latter enabling random retrieval
> > +of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
> > +to any particular format, but certain features are tailored to genomics data
> > +formats when enabled.  Parallel decompression is somewhat faster, but truly the
>                                                                      ^^^^^^^^^^^^^
> > +speedup comes during compression.")
>    ^^^^^^^
> 
> “but the true speedup” instead?

Sure. I usually don't change descriptions as given by the creators of
the software, but I applied your suggestion.

Thank you for the elaborate suggestions!

I attached another version of the patch, which I hope is fine now. :)

Kind regards,
Roel Janssen



[-- Attachment #2: 0001-gnu-Add-pbgzip.patch --]
[-- Type: text/x-patch, Size: 2991 bytes --]

From 1af29f66980ba19740e05a27135f141e23b7fd3f Mon Sep 17 00:00:00 2001
From: Roel Janssen <roel@gnu.org>
Date: Fri, 30 Apr 2021 13:47:43 +0200
Subject: [PATCH] gnu: Add pbgzip.

* gnu/packages/bioinformatics.scm (pbgzip): New variable.
---
 gnu/packages/bioinformatics.scm | 36 ++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 83ebfc2d8f..cd2dae05d5 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -3,7 +3,7 @@
 ;;; Copyright © 2015, 2016, 2017, 2018 Ben Woodcroft <donttrustben@gmail.com>
 ;;; Copyright © 2015, 2016, 2018, 2019, 2020 Pjotr Prins <pjotr.guix@thebird.nl>
 ;;; Copyright © 2015 Andreas Enge <andreas@enge.fr>
-;;; Copyright © 2016, 2020 Roel Janssen <roel@gnu.org>
+;;; Copyright © 2016, 2020, 2021 Roel Janssen <roel@gnu.org>
 ;;; Copyright © 2016, 2017, 2018, 2019, 2020, 2021 Efraim Flashner <efraim@flashner.co.il>
 ;;; Copyright © 2016, 2020 Marius Bakke <mbakke@fastmail.com>
 ;;; Copyright © 2016, 2018 Raoul Bonnal <ilpuccio.febo@gmail.com>
@@ -571,6 +571,40 @@ input and output BAMs must adhere to the PacBio BAM format specification.
 Non-PacBio BAMs will cause exceptions to be thrown.")
     (license license:bsd-3)))
 
+(define-public pbgzip
+  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
+    (package
+      (name "pbgzip")
+      (version (git-version "0.0.0" "0" commit))
+      (source (origin
+                (method git-fetch)
+                (uri (git-reference
+                      (url "https://github.com/nh13/pbgzip")
+                      (commit commit)))
+                (file-name (git-file-name name version))
+                (sha256
+                 (base32
+                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
+      (build-system gnu-build-system)
+      (native-inputs
+       `(("autoconf" ,autoconf)
+         ("automake" ,automake)))
+      (inputs
+       `(("zlib" ,zlib)))
+      (home-page "https://github.com/nh13/pbgzip")
+      (synopsis "Parallel Block GZIP")
+      (description "This package implements parallel block gzip.  For many
+formats, in particular genomics data formats, data are compressed in
+fixed-length blocks such that they can be easily indexed based on a (genomic)
+coordinate order, since typically each block is sorted according to this order.
+This allows for each block to be individually compressed (deflated), or more
+importantly, decompressed (inflated), with the latter enabling random retrieval
+of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
+to any particular format, but certain features are tailored to genomics data
+formats when enabled.  Parallel decompression is somewhat faster, but the true
+speedup comes during compression.")
+      (license license:expat))))
+
 (define-public blasr-libcpp
   (package
     (name "blasr-libcpp")
-- 
2.31.1



* [bug#47930] [PATCH] gnu: Add pbgzip.
  2021-04-30 11:48           ` Roel Janssen
@ 2021-04-30 11:53             ` Efraim Flashner
  2021-04-30 16:47               ` bug#47930: " Roel Janssen
  0 siblings, 1 reply; 10+ messages in thread
From: Efraim Flashner @ 2021-04-30 11:53 UTC
  To: Roel Janssen; +Cc: Maxime Devos, Xinglu Chen, 47930

[-- Attachment #1: Type: text/plain, Size: 6447 bytes --]

On Fri, Apr 30, 2021 at 01:48:48PM +0200, Roel Janssen wrote:
> On Fri, 2021-04-30 at 10:30 +0200, Xinglu Chen wrote:
> > On Thu, Apr 29 2021, Roel Janssen wrote:
> > 
> > > +(define-public pbgzip
> > > +  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
> > > +    (package
> > > +      (name "pbgzip")
> > > +      (version (string-take commit 7))
> > 
> > Maybe you missed my previous suggestions?
> > 
> >   https://issues.guix.gnu.org/47930#2
> > 
> 
> I'm sorry, I forgot to apply them.
> >   
> > > +      (source (origin
> > > +                (method git-fetch)
> > > +                (uri (git-reference
> > > +                      (url "https://github.com/nh13/pbgzip")
> > > +                      (commit commit)))
> > > +                (file-name (string-append name "-" version))
> > > +                (sha256
> > > +                 (base32
> > > +                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
> > > +      (build-system gnu-build-system)
> > > +      (native-inputs
> > > +       `(("autoconf" ,autoconf)
> > > +         ("automake" ,automake)))
> > > +      (inputs
> > > +       `(("zlib" ,zlib)))
> > > +      (home-page "https://github.com/nh13/pbgzip")
> > > +      (synopsis "Parallel Block GZIP")
> > > +      (description "This package implements parallel block gzip.  For many
> > > +formats, in particular genomics data formats, data are compressed in
> > > +fixed-length blocks such that they can be easily indexed based on a (genomic)
> > > +coordinate order, since typically each block is sorted according to this order.
> > > +This allows for each block to be individually compressed (deflated), or more
> > > +importantly, decompressed (inflated), with the latter enabling random retrieval
> > > +of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
> > > +to any particular format, but certain features are tailored to genomics data
> > > +formats when enabled.  Parallel decompression is somewhat faster, but truly the
> >                                                                      ^^^^^^^^^^^^^
> > > +speedup comes during compression.")
> >    ^^^^^^^
> > 
> > “but the true speedup” instead?
> 
> Sure. I usually don't change descriptions as given by the creators of
> the software, but I applied your suggestion.
> 
> Thank you for the elaborate suggestions!
> 
> I attached another version of the patch, which I hope is fine now. :)
> 
> Kind regards,
> Roel Janssen
> 
> 

> From 1af29f66980ba19740e05a27135f141e23b7fd3f Mon Sep 17 00:00:00 2001
> From: Roel Janssen <roel@gnu.org>
> Date: Fri, 30 Apr 2021 13:47:43 +0200
> Subject: [PATCH] gnu: Add pbgzip.
> 
> * gnu/packages/bioinformatics.scm (pbgzip): New variable.
> ---
>  gnu/packages/bioinformatics.scm | 36 ++++++++++++++++++++++++++++++++-
>  1 file changed, 35 insertions(+), 1 deletion(-)
> 
> diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
> index 83ebfc2d8f..cd2dae05d5 100644
> --- a/gnu/packages/bioinformatics.scm
> +++ b/gnu/packages/bioinformatics.scm
> @@ -3,7 +3,7 @@
>  ;;; Copyright © 2015, 2016, 2017, 2018 Ben Woodcroft <donttrustben@gmail.com>
>  ;;; Copyright © 2015, 2016, 2018, 2019, 2020 Pjotr Prins <pjotr.guix@thebird.nl>
>  ;;; Copyright © 2015 Andreas Enge <andreas@enge.fr>
> -;;; Copyright © 2016, 2020 Roel Janssen <roel@gnu.org>
> +;;; Copyright © 2016, 2020, 2021 Roel Janssen <roel@gnu.org>
>  ;;; Copyright © 2016, 2017, 2018, 2019, 2020, 2021 Efraim Flashner <efraim@flashner.co.il>
>  ;;; Copyright © 2016, 2020 Marius Bakke <mbakke@fastmail.com>
>  ;;; Copyright © 2016, 2018 Raoul Bonnal <ilpuccio.febo@gmail.com>
> @@ -571,6 +571,40 @@ input and output BAMs must adhere to the PacBio BAM format specification.
>  Non-PacBio BAMs will cause exceptions to be thrown.")
>      (license license:bsd-3)))
>  
> +(define-public pbgzip
> +  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
> +    (package
> +      (name "pbgzip")
> +      (version (git-version "0.0.0" "0" commit))
> +      (source (origin
> +                (method git-fetch)
> +                (uri (git-reference
> +                      (url "https://github.com/nh13/pbgzip")
> +                      (commit commit)))
> +                (file-name (git-file-name name version))
> +                (sha256
> +                 (base32
> +                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
> +      (build-system gnu-build-system)
> +      (native-inputs
> +       `(("autoconf" ,autoconf)
> +         ("automake" ,automake)))
> +      (inputs
> +       `(("zlib" ,zlib)))
> +      (home-page "https://github.com/nh13/pbgzip")
> +      (synopsis "Parallel Block GZIP")
> +      (description "This package implements parallel block gzip.  For many
> +formats, in particular genomics data formats, data are compressed in

I wasn't sure about 'data are' vs 'data is' but I think data here is
plural, so 'data are' should be right.

> +fixed-length blocks such that they can be easily indexed based on a (genomic)
> +coordinate order, since typically each block is sorted according to this order.
> +This allows for each block to be individually compressed (deflated), or more
> +importantly, decompressed (inflated), with the latter enabling random retrieval
> +of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
> +to any particular format, but certain features are tailored to genomics data
> +formats when enabled.  Parallel decompression is somewhat faster, but the true
> +speedup comes during compression.")
> +      (license license:expat))))
> +
>  (define-public blasr-libcpp
>    (package
>      (name "blasr-libcpp")
> -- 
> 2.31.1
> 

Looks good to me!

-- 
Efraim Flashner   <efraim@flashner.co.il>   אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D  14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]


* bug#47930: [PATCH] gnu: Add pbgzip.
  2021-04-30 11:53             ` Efraim Flashner
@ 2021-04-30 16:47               ` Roel Janssen
  0 siblings, 0 replies; 10+ messages in thread
From: Roel Janssen @ 2021-04-30 16:47 UTC
  To: Efraim Flashner; +Cc: Xinglu Chen, 47930-done

On Fri, 2021-04-30 at 14:53 +0300, Efraim Flashner wrote:
> 
> 
> > From 1af29f66980ba19740e05a27135f141e23b7fd3f Mon Sep 17 00:00:00 2001
> > From: Roel Janssen <roel@gnu.org>
> > Date: Fri, 30 Apr 2021 13:47:43 +0200
> > Subject: [PATCH] gnu: Add pbgzip.
> > 
> > ...
> > +      (synopsis "Parallel Block GZIP")
> > +      (description "This package implements parallel block gzip.  For many
> > +formats, in particular genomics data formats, data are compressed in
> 
> I wasn't sure about 'data are' vs 'data is' but I think data here is
> plural, so 'data are' should be right.
> 
> > +fixed-length blocks such that they can be easily indexed based on a (genomic)
> > +coordinate order, since typically each block is sorted according to this order.
> > +This allows for each block to be individually compressed (deflated), or more
> > +importantly, decompressed (inflated), with the latter enabling random retrieval
> > +of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
> > +to any particular format, but certain features are tailored to genomics data
> > +formats when enabled.  Parallel decompression is somewhat faster, but the true
> > +speedup comes during compression.")
> > +      (license license:expat))))
> > +
> >  (define-public blasr-libcpp
> >    (package
> >      (name "blasr-libcpp")
> > -- 
> > 2.31.1
> > 
> 
> Looks good to me!
> 

Thank you Efraim, and thank you Xinglu Chen.
I pushed this patch.

Kind regards,
Roel Janssen






