unofficial mirror of guix-patches@gnu.org 
 help / color / mirror / code / Atom feed
* [bug#35880] [PATCH 0/7] Lzip support for 'guix publish' and 'guix substitute'
@ 2019-05-24 13:31 Ludovic Courtès
  2019-05-24 13:42 ` [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed' Ludovic Courtès
  0 siblings, 1 reply; 31+ messages in thread
From: Ludovic Courtès @ 2019-05-24 13:31 UTC (permalink / raw)
  To: 35880; +Cc: Pierre Neidhardt

Hi!

As a followup to Pierre’s work on (guix lzlib), these patches implement
‘lzip’ support for ‘guix substitute’ and ‘guix publish’.  With these,
you can now run:

  ./pre-inst-env guix publish -Clzip …

on one side, and on another machine:

  ./pre-inst-env guix-daemon --build-users-group=guixbuild

and from there the client machine should be able to fetch
lzip-compressed substitutes.

These patches do not address the transitioning issue that we discussed
earlier, where we have clients lacking lzip support talking to an
lzip-capable server.  As discussed earlier, clients will have to send
a special HTTP header, ‘X-Guix-Accept-Encoding’.

For the server-side, I’m still hesitating between implementing it in ‘guix
publish’ or simply running two instances of ‘guix publish’ side-by-side
(one gzip and one lzip) and letting nginx dispatch between the two.  :-)

Note that we’ll have to adjust our nginx mirror configs to take that
header into account!

Comments?

Ludo’.

Ludovic Courtès (7):
  lzlib: Add 'make-lzip-input-port/compressed'.
  utils: Test 'compressed-port' and 'decompressed-port' for both gzip
    and xz.
  utils: Support compression and decompression with lzip.
  publish: Add support for lzip.
  self: Add dependency on lzlib.
  gnu: guix: Add dependency on lzlib.
  lzlib: 'lzread!' never returns more than it was asked for.

 .dir-locals.el                      |   2 +
 doc/guix.texi                       |  25 +++++--
 gnu/packages/package-management.scm |   1 +
 guix/lzlib.scm                      | 101 ++++++++++++++++++++++------
 guix/scripts/publish.scm            |  84 +++++++++++++++++------
 guix/self.scm                       |  13 +++-
 guix/tests.scm                      |   1 +
 guix/utils.scm                      |  27 ++++++--
 tests/lzlib.scm                     |  10 +++
 tests/publish.scm                   |  36 ++++++++++
 tests/utils.scm                     |  62 +++++++++++------
 11 files changed, 284 insertions(+), 78 deletions(-)

-- 
2.21.0

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed'.
  2019-05-24 13:31 [bug#35880] [PATCH 0/7] Lzip support for 'guix publish' and 'guix substitute' Ludovic Courtès
@ 2019-05-24 13:42 ` Ludovic Courtès
  2019-05-24 13:42   ` [bug#35880] [PATCH 2/7] utils: Test 'compressed-port' and 'decompressed-port' for both gzip and xz Ludovic Courtès
                     ` (6 more replies)
  0 siblings, 7 replies; 31+ messages in thread
From: Ludovic Courtès @ 2019-05-24 13:42 UTC (permalink / raw)
  To: 35880; +Cc: Pierre Neidhardt

* guix/lzlib.scm (lzwrite!, make-lzip-input-port/compressed): New
procedures.
* tests/lzlib.scm ("make-lzip-input-port/compressed"): New test.
* guix/tests.scm (%seed): Export.
---
 guix/lzlib.scm  | 62 +++++++++++++++++++++++++++++++++++++++++++++++++
 guix/tests.scm  |  1 +
 tests/lzlib.scm | 10 ++++++++
 3 files changed, 73 insertions(+)

diff --git a/guix/lzlib.scm b/guix/lzlib.scm
index a6dac46049..48927c6262 100644
--- a/guix/lzlib.scm
+++ b/guix/lzlib.scm
@@ -1,5 +1,6 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2019 Pierre Neidhardt <mail@ambrevar.xyz>
+;;; Copyright © 2019 Ludovic Courtès <ludo@gnu.org>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -23,9 +24,11 @@
   #:use-module (ice-9 match)
   #:use-module (system foreign)
   #:use-module (guix config)
+  #:use-module (srfi srfi-11)
   #:export (lzlib-available?
             make-lzip-input-port
             make-lzip-output-port
+            make-lzip-input-port/compressed
             call-with-lzip-input-port
             call-with-lzip-output-port
             %default-member-length-limit
@@ -515,6 +518,23 @@ the end-of-stream has been reached."
         (loop rd)))
     read))
 
+(define (lzwrite! encoder source source-offset source-count
+                  target target-offset target-count)
+  "Write up to SOURCE-COUNT bytes from SOURCE to ENCODER, and read up to
+TARGET-COUNT bytes into TARGET at TARGET-OFFSET.  Return two values: the
+number of bytes read from SOURCE, and the number of bytes written to TARGET."
+  (define read
+    (if (< 0 (lz-compress-write-size encoder))
+        (match (lz-compress-write encoder source source-offset source-count)
+          (0 (lz-compress-finish encoder) 0)
+          (n n))
+        0))
+
+  (let loop ()
+    (match (lz-compress-read encoder target target-offset target-count)
+      (0       (loop))
+      (written (values read written)))))
+
 (define* (lzwrite encoder bv lz-port
                   #:optional (start 0) (count (bytevector-length bv)))
   "Write up to COUNT bytes from BV at offset START into LZ-PORT.  Return
@@ -597,6 +617,48 @@ port is closed."
                                     (lz-compress-close encoder)
                                     (close-port port))))
 
+(define* (make-lzip-input-port/compressed port
+                                          #:key
+                                          (level %default-compression-level))
+  "Return an input port that compresses data read from PORT, with the given LEVEL.
+PORT is automatically closed when the resulting port is closed."
+  (define encoder (apply lz-compress-open
+                         (car (assoc-ref %compression-levels level))))
+
+  (define input-buffer (make-bytevector 8192))
+  (define input-len 0)
+  (define input-offset 0)
+
+  (define input-eof? #f)
+
+  (define (read! bv start count)
+    (cond
+     (input-eof?
+      (lz-compress-read encoder bv start count))
+     ((= input-offset input-len)
+      (match (get-bytevector-n! port input-buffer 0
+                                (bytevector-length input-buffer))
+        ((? eof-object?)
+         (set! input-eof? #t)
+         (lz-compress-finish encoder))
+        (count
+         (set! input-offset 0)
+         (set! input-len count)))
+      (read! bv start count))
+     (else
+      (let-values (((read written)
+                    (lzwrite! encoder
+                              input-buffer input-offset
+                              (- input-len input-offset)
+                              bv start count)))
+        (set! input-offset (+ input-offset read))
+        written))))
+
+  (make-custom-binary-input-port "lzip-input/compressed"
+                                 read! #f #f
+                                 (lambda ()
+                                   (close-port port))))
+
 (define* (call-with-lzip-input-port port proc)
   "Call PROC with a port that wraps PORT and decompresses data read from it.
 PORT is closed upon completion."
diff --git a/guix/tests.scm b/guix/tests.scm
index 35ebf8464d..66d60e964e 100644
--- a/guix/tests.scm
+++ b/guix/tests.scm
@@ -33,6 +33,7 @@
   #:use-module (web uri)
   #:export (open-connection-for-tests
             with-external-store
+            %seed
             random-text
             random-bytevector
             file=?
diff --git a/tests/lzlib.scm b/tests/lzlib.scm
index cf53a9417d..543622bb45 100644
--- a/tests/lzlib.scm
+++ b/tests/lzlib.scm
@@ -108,4 +108,14 @@
 (test-assert* "Bytevector of size relative to Lzip internal buffers (1MiB+1)"
   (compress-and-decompress (random-bytevector (1+ (* 1024 1024)))))
 
+(test-assert "make-lzip-input-port/compressed"
+  (let* ((len        (pk 'len (+ 10 (random 4000 %seed))))
+         (data       (random-bytevector len))
+         (compressed (make-lzip-input-port/compressed
+                      (open-bytevector-input-port data)))
+         (result     (call-with-lzip-input-port compressed
+                                                get-bytevector-all)))
+    (pk (bytevector-length result) (bytevector-length data))
+    (bytevector=? result data)))
+
 (test-end)
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 2/7] utils: Test 'compressed-port' and 'decompressed-port' for both gzip and xz.
  2019-05-24 13:42 ` [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed' Ludovic Courtès
@ 2019-05-24 13:42   ` Ludovic Courtès
  2019-05-24 13:42   ` [bug#35880] [PATCH 3/7] utils: Support compression and decompression with lzip Ludovic Courtès
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 31+ messages in thread
From: Ludovic Courtès @ 2019-05-24 13:42 UTC (permalink / raw)
  To: 35880; +Cc: Pierre Neidhardt

* tests/utils.scm (test-compression/decompression): New procedure.
<top level>: Call it for both 'xz and 'gzip.
---
 tests/utils.scm | 61 +++++++++++++++++++++++++++++++------------------
 1 file changed, 39 insertions(+), 22 deletions(-)

diff --git a/tests/utils.scm b/tests/utils.scm
index 3015b21b23..7d55107fda 100644
--- a/tests/utils.scm
+++ b/tests/utils.scm
@@ -174,30 +174,47 @@
       (any (compose (negate zero?) cdr waitpid)
            pids))))
 
-(test-assert "compressed-port, decompressed-port, non-file"
-  (let ((data (call-with-input-file (search-path %load-path "guix.scm")
-                get-bytevector-all)))
-    (let*-values (((compressed pids1)
-                   (compressed-port 'xz (open-bytevector-input-port data)))
-                  ((decompressed pids2)
-                   (decompressed-port 'xz compressed)))
-      (and (every (compose zero? cdr waitpid)
-                  (append pids1 pids2))
-           (equal? (get-bytevector-all decompressed) data)))))
+(define (test-compression/decompression method run?)
+  "Test METHOD, a symbol such as 'gzip.  Call RUN? to determine whether to
+skip these tests."
+  (unless (run?) (test-skip 1))
+  (test-assert (format #f "compressed-port, decompressed-port, non-file [~a]"
+                       method)
+    (let ((data (call-with-input-file (search-path %load-path "guix.scm")
+                  get-bytevector-all)))
+      (let*-values (((compressed pids1)
+                     (compressed-port method (open-bytevector-input-port data)))
+                    ((decompressed pids2)
+                     (decompressed-port method compressed)))
+        (and (every (compose zero? cdr waitpid)
+                    (pk 'pids method (append pids1 pids2)))
+             (let ((result (get-bytevector-all decompressed)))
+               (pk 'len method
+                   (if (bytevector? result)
+                       (bytevector-length result)
+                       result)
+                   (bytevector-length data))
+               (equal? result data))))))
 
-(false-if-exception (delete-file temp-file))
-(test-assert "compressed-output-port + decompressed-port"
-  (let* ((file (search-path %load-path "guix/derivations.scm"))
-         (data (call-with-input-file file get-bytevector-all))
-         (port (open-file temp-file "w0b")))
-    (call-with-compressed-output-port 'xz port
-      (lambda (compressed)
-        (put-bytevector compressed data)))
-    (close-port port)
+  (false-if-exception (delete-file temp-file))
+  (unless (run?) (test-skip 1))
+  (test-assert (format #f "compressed-output-port + decompressed-port [~a]"
+                       method)
+    (let* ((file (search-path %load-path "guix/derivations.scm"))
+           (data (call-with-input-file file get-bytevector-all))
+           (port (open-file temp-file "w0b")))
+      (call-with-compressed-output-port method port
+        (lambda (compressed)
+          (put-bytevector compressed data)))
+      (close-port port)
 
-    (bytevector=? data
-                  (call-with-decompressed-port 'xz (open-file temp-file "r0b")
-                    get-bytevector-all))))
+      (bytevector=? data
+                    (call-with-decompressed-port method (open-file temp-file "r0b")
+                      get-bytevector-all)))))
+
+(for-each test-compression/decompression
+          '(gzip xz)
+          (list (const #t) (const #f)))
 
 ;; This is actually in (guix store).
 (test-equal "store-path-package-name"
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 3/7] utils: Support compression and decompression with lzip.
  2019-05-24 13:42 ` [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed' Ludovic Courtès
  2019-05-24 13:42   ` [bug#35880] [PATCH 2/7] utils: Test 'compressed-port' and 'decompressed-port' for both gzip and xz Ludovic Courtès
@ 2019-05-24 13:42   ` Ludovic Courtès
  2019-05-25 17:27     ` Pierre Neidhardt
  2019-05-24 13:42   ` [bug#35880] [PATCH 4/7] publish: Add support for lzip Ludovic Courtès
                     ` (4 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: Ludovic Courtès @ 2019-05-24 13:42 UTC (permalink / raw)
  To: 35880; +Cc: Pierre Neidhardt

* guix/utils.scm (lzip-port): New procedure.
(decompressed-port, compressed-port, compressed-output-port): Add 'lzip
case.
* tests/utils.scm <top level>: Call 'test-compression/decompression' for
'lzip as well.
---
 guix/utils.scm  | 27 ++++++++++++++++++++++-----
 tests/utils.scm |  5 +++--
 2 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/guix/utils.scm b/guix/utils.scm
index ed1a418cca..709cdf9353 100644
--- a/guix/utils.scm
+++ b/guix/utils.scm
@@ -1,5 +1,5 @@
 ;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2013, 2014, 2015 Mark H Weaver <mhw@netris.org>
 ;;; Copyright © 2014 Eric Bavier <bavier@member.fsf.org>
 ;;; Copyright © 2014 Ian Denhardt <ian@zenhack.net>
@@ -169,6 +169,17 @@ buffered data is lost."
               (close-port out)
               (loop in (cons child pids)))))))))
 
+(define (lzip-port proc port . args)
+  "Return the lzip port produced by calling PROC (a symbol) on PORT and ARGS.
+Raise an error if lzlib support is missing."
+  (let* ((lzlib       (false-if-exception (resolve-interface '(guix lzlib))))
+         (supported?  (and lzlib
+                           ((module-ref lzlib 'lzlib-available?)))))
+    (if supported?
+        (let ((make-port (module-ref lzlib proc)))
+          (values (make-port port) '()))
+        (error "lzip compression not supported" lzlib))))
+
 (define (decompressed-port compression input)
   "Return an input port where INPUT is decompressed according to COMPRESSION,
 a symbol such as 'xz."
@@ -177,17 +188,21 @@ a symbol such as 'xz."
     ('bzip2        (filtered-port `(,%bzip2 "-dc") input))
     ('xz           (filtered-port `(,%xz "-dc") input))
     ('gzip         (filtered-port `(,%gzip "-dc") input))
-    (else          (error "unsupported compression scheme" compression))))
+    ('lzip         (values (lzip-port 'make-lzip-input-port input)
+                           '()))
+    (_             (error "unsupported compression scheme" compression))))
 
 (define (compressed-port compression input)
-  "Return an input port where INPUT is decompressed according to COMPRESSION,
+  "Return an input port where INPUT is compressed according to COMPRESSION,
 a symbol such as 'xz."
   (match compression
     ((or #f 'none) (values input '()))
     ('bzip2        (filtered-port `(,%bzip2 "-c") input))
     ('xz           (filtered-port `(,%xz "-c") input))
     ('gzip         (filtered-port `(,%gzip "-c") input))
-    (else          (error "unsupported compression scheme" compression))))
+    ('lzip         (values (lzip-port 'make-lzip-input-port/compressed input)
+                           '()))
+    (_             (error "unsupported compression scheme" compression))))
 
 (define (call-with-decompressed-port compression port proc)
   "Call PROC with a wrapper around PORT, a file port, that decompresses data
@@ -244,7 +259,9 @@ program--e.g., '(\"--fast\")."
     ('bzip2        (filtered-output-port `(,%bzip2 "-c" ,@options) output))
     ('xz           (filtered-output-port `(,%xz "-c" ,@options) output))
     ('gzip         (filtered-output-port `(,%gzip "-c" ,@options) output))
-    (else          (error "unsupported compression scheme" compression))))
+    ('lzip         (values (lzip-port 'make-lzip-output-port output)
+                           '()))
+    (_             (error "unsupported compression scheme" compression))))
 
 (define* (call-with-compressed-output-port compression port proc
                                            #:key (options '()))
diff --git a/tests/utils.scm b/tests/utils.scm
index 7d55107fda..7c8f7c09d0 100644
--- a/tests/utils.scm
+++ b/tests/utils.scm
@@ -23,6 +23,7 @@
   #:use-module (guix utils)
   #:use-module ((guix store) #:select (%store-prefix store-path-package-name))
   #:use-module ((guix search-paths) #:select (string-tokenize*))
+  #:use-module ((guix lzlib) #:select (lzlib-available?))
   #:use-module (srfi srfi-1)
   #:use-module (srfi srfi-11)
   #:use-module (srfi srfi-64)
@@ -213,8 +214,8 @@ skip these tests."
                       get-bytevector-all)))))
 
 (for-each test-compression/decompression
-          '(gzip xz)
-          (list (const #t) (const #f)))
+          '(gzip xz lzip)
+          (list (const #t) (const #f) lzlib-available?))
 
 ;; This is actually in (guix store).
 (test-equal "store-path-package-name"
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 4/7] publish: Add support for lzip.
  2019-05-24 13:42 ` [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed' Ludovic Courtès
  2019-05-24 13:42   ` [bug#35880] [PATCH 2/7] utils: Test 'compressed-port' and 'decompressed-port' for both gzip and xz Ludovic Courtès
  2019-05-24 13:42   ` [bug#35880] [PATCH 3/7] utils: Support compression and decompression with lzip Ludovic Courtès
@ 2019-05-24 13:42   ` Ludovic Courtès
  2019-05-24 13:42   ` [bug#35880] [PATCH 5/7] self: Add dependency on lzlib Ludovic Courtès
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 31+ messages in thread
From: Ludovic Courtès @ 2019-05-24 13:42 UTC (permalink / raw)
  To: 35880; +Cc: Pierre Neidhardt

* guix/scripts/publish.scm (show-help, %options): Support '-C METHOD'
and '-C METHOD:LEVEL'.
(default-compression): New procedure.
(bake-narinfo+nar): Add lzip.
(nar-response-port): Likewise.
(string->compression-type): New procedure.
(make-request-handler): Generalize /nar/gzip handler to handle /nar/lzip
as well.
* tests/publish.scm ("/nar/lzip/*"): New test.
("/*.narinfo with lzip compression"): New test.
* doc/guix.texi (Invoking guix publish): Document it.
(Requirements): Mention lzlib.
---
 .dir-locals.el           |  2 +
 doc/guix.texi            | 25 +++++++++---
 guix/scripts/publish.scm | 84 +++++++++++++++++++++++++++++-----------
 tests/publish.scm        | 36 +++++++++++++++++
 4 files changed, 119 insertions(+), 28 deletions(-)

diff --git a/.dir-locals.el b/.dir-locals.el
index 550e06ef09..f1196fd781 100644
--- a/.dir-locals.el
+++ b/.dir-locals.el
@@ -53,6 +53,8 @@
    (eval . (put 'call-with-decompressed-port 'scheme-indent-function 2))
    (eval . (put 'call-with-gzip-input-port 'scheme-indent-function 1))
    (eval . (put 'call-with-gzip-output-port 'scheme-indent-function 1))
+   (eval . (put 'call-with-lzip-input-port 'scheme-indent-function 1))
+   (eval . (put 'call-with-lzip-output-port 'scheme-indent-function 1))
    (eval . (put 'signature-case 'scheme-indent-function 1))
    (eval . (put 'emacs-batch-eval 'scheme-indent-function 0))
    (eval . (put 'emacs-batch-edit-file 'scheme-indent-function 1))
diff --git a/doc/guix.texi b/doc/guix.texi
index f176bb0163..b0de5632e7 100644
--- a/doc/guix.texi
+++ b/doc/guix.texi
@@ -757,6 +757,11 @@ Support for build offloading (@pxref{Daemon Offload Setup}) and
 @uref{https://github.com/artyom-poptsov/guile-ssh, Guile-SSH},
 version 0.10.2 or later.
 
+@item
+When @url{https://www.nongnu.org/lzip/lzlib.html, lzlib} is available, lzlib
+substitutes can be used and @command{guix publish} can compress substitutes
+with lzlib.
+
 @item
 When @url{http://www.bzip.org, libbz2} is available,
 @command{guix-daemon} can use it to compress build logs.
@@ -9656,12 +9661,20 @@ accept connections from any interface.
 Change privileges to @var{user} as soon as possible---i.e., once the
 server socket is open and the signing key has been read.
 
-@item --compression[=@var{level}]
-@itemx -C [@var{level}]
-Compress data using the given @var{level}.  When @var{level} is zero,
-disable compression.  The range 1 to 9 corresponds to different gzip
-compression levels: 1 is the fastest, and 9 is the best (CPU-intensive).
-The default is 3.
+@item --compression[=@var{method}[:@var{level}]]
+@itemx -C [@var{method}[:@var{level}]]
+Compress data using the given @var{method} and @var{level}.  @var{method} is
+one of @code{lzip} and @code{gzip}; when @var{method} is omitted, @code{gzip}
+is used.
+
+When @var{level} is zero, disable compression.  The range 1 to 9 corresponds
+to different compression levels: 1 is the fastest, and 9 is the best
+(CPU-intensive).  The default is 3.
+
+Usually, @code{lzip} compresses noticeably better than @code{gzip} for a small
+increase in CPU usage; see
+@uref{https://nongnu.org/lzip/lzip_benchmark.html,benchmarks on the lzip Web
+page}.
 
 Unless @option{--cache} is used, compression occurs on the fly and
 the compressed streams are not
diff --git a/guix/scripts/publish.scm b/guix/scripts/publish.scm
index a236f3e45c..9e74d789ce 100644
--- a/guix/scripts/publish.scm
+++ b/guix/scripts/publish.scm
@@ -1,6 +1,6 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2015 David Thompson <davet@gnu.org>
-;;; Copyright © 2015, 2016, 2017, 2018 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2015, 2016, 2017, 2018, 2019 Ludovic Courtès <ludo@gnu.org>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -51,6 +51,7 @@
   #:use-module (guix store)
   #:use-module ((guix serialization) #:select (write-file))
   #:use-module (guix zlib)
+  #:autoload   (guix lzlib) (lzlib-available?)
   #:use-module (guix cache)
   #:use-module (guix ui)
   #:use-module (guix scripts)
@@ -74,8 +75,8 @@ Publish ~a over HTTP.\n") %store-directory)
   (display (G_ "
   -u, --user=USER        change privileges to USER as soon as possible"))
   (display (G_ "
-  -C, --compression[=LEVEL]
-                         compress archives at LEVEL"))
+  -C, --compression[=METHOD:LEVEL]
+                         compress archives with METHOD at LEVEL"))
   (display (G_ "
   -c, --cache=DIRECTORY  cache published items to DIRECTORY"))
   (display (G_ "
@@ -121,6 +122,9 @@ Publish ~a over HTTP.\n") %store-directory)
   ;; Since we compress on the fly, default to fast compression.
   (compression 'gzip 3))
 
+(define (default-compression type)
+  (compression type 3))
+
 (define (actual-compression item requested)
   "Return the actual compression used for ITEM, which may be %NO-COMPRESSION
 if ITEM is already compressed."
@@ -153,18 +157,28 @@ if ITEM is already compressed."
                             name)))))
         (option '(#\C "compression") #f #t
                 (lambda (opt name arg result)
-                  (match (if arg (string->number* arg) 3)
-                    (0
-                     (alist-cons 'compression %no-compression result))
-                    (level
-                     (if (zlib-available?)
-                         (alist-cons 'compression
-                                     (compression 'gzip level)
-                                     result)
-                         (begin
-                           (warning (G_ "zlib support is missing; \
-compression disabled~%"))
-                           result))))))
+                  (let* ((colon (string-index arg #\:))
+                         (type  (cond
+                                 (colon (string-take arg colon))
+                                 ((string->number arg) "gzip")
+                                 (else arg)))
+                         (level (if colon
+                                    (string->number*
+                                     (string-drop arg (+ 1 colon)))
+                                    (or (string->number arg) 3))))
+                    (match level
+                      (0
+                       (alist-cons 'compression %no-compression result))
+                      (level
+                       (match (string->compression-type type)
+                         ((? symbol? type)
+                          (alist-cons 'compression
+                                      (compression type level)
+                                      result))
+                         (_
+                          (warning (G_ "~a: unsupported compression type~%")
+                                   type)
+                          result)))))))
         (option '(#\c "cache") #t #f
                 (lambda (opt name arg result)
                   (alist-cons 'cache arg result)))
@@ -483,6 +497,13 @@ requested using POOL."
          #:level (compression-level compression)
          #:buffer-size (* 128 1024))
        (rename-file (string-append nar ".tmp") nar))
+      ('lzip
+       ;; Note: the file port gets closed along with the lzip port.
+       (call-with-lzip-output-port (open-output-file (string-append nar ".tmp"))
+         (lambda (port)
+           (write-file item port))
+         #:level (compression-level compression))
+       (rename-file (string-append nar ".tmp") nar))
       ('none
        ;; Cache nars even when compression is disabled so that we can
        ;; guarantee the TTL (see <https://bugs.gnu.org/28664>.)
@@ -687,6 +708,9 @@ example: \"/foo/bar\" yields '(\"foo\" \"bar\")."
      (make-gzip-output-port (response-port response)
                             #:level level
                             #:buffer-size (* 64 1024)))
+    (($ <compression> 'lzip level)
+     (make-lzip-output-port (response-port response)
+                            #:level level))
     (($ <compression> 'none)
      (response-port response))
     (#f
@@ -761,12 +785,23 @@ blocking."
   http-write
   (@@ (web server http) http-close))
 
+(define (string->compression-type string)
+  "Return a symbol denoting the compression method expressed by STRING; return
+#f if STRING doesn't match any supported method."
+  (match string
+    ("gzip" (and (zlib-available?) 'gzip))
+    ("lzip" (and (lzlib-available?) 'lzip))
+    (_      #f)))
+
 (define* (make-request-handler store
                                #:key
                                cache pool
                                narinfo-ttl
                                (nar-path "nar")
                                (compression %no-compression))
+  (define compression-type?
+    string->compression-type)
+
   (define nar-path?
     (let ((expected (split-and-decode-uri-path nar-path)))
       (cut equal? expected <>)))
@@ -815,13 +850,18 @@ blocking."
           ;; is restarted with different compression parameters.
 
           ;; /nar/gzip/<store-item>
-          ((components ... "gzip" store-item)
-           (if (and (nar-path? components) (zlib-available?))
-               (let ((compression (match compression
-                                    (($ <compression> 'gzip)
-                                     compression)
-                                    (_
-                                     %default-gzip-compression))))
+          ((components ... (? compression-type? type) store-item)
+           (if (nar-path? components)
+               (let* ((compression-type (string->compression-type type))
+                      (compression (match compression
+                                     (($ <compression> type)
+                                      (if (eq? type compression-type)
+                                          compression
+                                          (default-compression
+                                            compression-type)))
+                                     (_
+                                      (default-compression
+                                        compression-type)))))
                  (if cache
                      (render-nar/cached store cache request store-item
                                         #:ttl narinfo-ttl
diff --git a/tests/publish.scm b/tests/publish.scm
index 097ac036e0..10bc859695 100644
--- a/tests/publish.scm
+++ b/tests/publish.scm
@@ -36,6 +36,7 @@
   #:use-module (gcrypt pk-crypto)
   #:use-module ((guix pki) #:select (%public-key-file %private-key-file))
   #:use-module (guix zlib)
+  #:use-module (guix lzlib)
   #:use-module (web uri)
   #:use-module (web client)
   #:use-module (web response)
@@ -229,6 +230,19 @@ FileSize: ~a~%"
                (string-append "/nar/gzip/" (basename %item))))))
     (get-bytevector-n nar (bytevector-length %gzip-magic-bytes))))
 
+(unless (lzlib-available?)
+  (test-skip 1))
+(test-equal "/nar/lzip/*"
+  "bar"
+  (call-with-temporary-output-file
+   (lambda (temp port)
+     (let ((nar (http-get-port
+                 (publish-uri
+                  (string-append "/nar/lzip/" (basename %item))))))
+       (call-with-lzip-input-port nar
+         (cut restore-file <> temp)))
+     (call-with-input-file temp read-string))))
+
 (unless (zlib-available?)
   (test-skip 1))
 (test-equal "/*.narinfo with compression"
@@ -251,6 +265,28 @@ FileSize: ~a~%"
                   (_ #f)))
               (recutils->alist body)))))
 
+(unless (lzlib-available?)
+  (test-skip 1))
+(test-equal "/*.narinfo with lzip compression"
+  `(("StorePath" . ,%item)
+    ("URL" . ,(string-append "nar/lzip/" (basename %item)))
+    ("Compression" . "lzip"))
+  (let ((thread (with-separate-output-ports
+                 (call-with-new-thread
+                  (lambda ()
+                    (guix-publish "--port=6790" "-Clzip"))))))
+    (wait-until-ready 6790)
+    (let* ((url  (string-append "http://localhost:6790/"
+                                (store-path-hash-part %item) ".narinfo"))
+           (body (http-get-port url)))
+      (filter (lambda (item)
+                (match item
+                  (("Compression" . _) #t)
+                  (("StorePath" . _)  #t)
+                  (("URL" . _) #t)
+                  (_ #f)))
+              (recutils->alist body)))))
+
 (unless (zlib-available?)
   (test-skip 1))
 (test-equal "/*.narinfo for a compressed file"
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 5/7] self: Add dependency on lzlib.
  2019-05-24 13:42 ` [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed' Ludovic Courtès
                     ` (2 preceding siblings ...)
  2019-05-24 13:42   ` [bug#35880] [PATCH 4/7] publish: Add support for lzip Ludovic Courtès
@ 2019-05-24 13:42   ` Ludovic Courtès
  2019-05-24 13:42   ` [bug#35880] [PATCH 6/7] gnu: guix: " Ludovic Courtès
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 31+ messages in thread
From: Ludovic Courtès @ 2019-05-24 13:42 UTC (permalink / raw)
  To: 35880; +Cc: Pierre Neidhardt

* guix/self.scm (compiled-guix): Pass #:lzlib to 'make-config.scm'.
(make-config.scm): Add #:lzlib and honor it.
(specification->package): Add "lzlib".
---
 guix/self.scm | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/guix/self.scm b/guix/self.scm
index 6d7569ec19..17d289e486 100644
--- a/guix/self.scm
+++ b/guix/self.scm
@@ -57,6 +57,7 @@
       ("guile-gcrypt"  (ref '(gnu packages gnupg) 'guile-gcrypt))
       ("gnutls"     (ref '(gnu packages tls) 'gnutls))
       ("zlib"       (ref '(gnu packages compression) 'zlib))
+      ("lzlib"      (ref '(gnu packages compression) 'lzlib))
       ("gzip"       (ref '(gnu packages compression) 'gzip))
       ("bzip2"      (ref '(gnu packages compression) 'bzip2))
       ("xz"         (ref '(gnu packages compression) 'xz))
@@ -646,6 +647,7 @@ Info manual."
                         (guile-version (effective-version))
                         (guile-for-build (default-guile))
                         (zlib (specification->package "zlib"))
+                        (lzlib (specification->package "lzlib"))
                         (gzip (specification->package "gzip"))
                         (bzip2 (specification->package "bzip2"))
                         (xz (specification->package "xz"))
@@ -800,6 +802,7 @@ Info manual."
                  #:extra-modules
                  `(((guix config)
                     => ,(make-config.scm #:zlib zlib
+                                         #:lzlib lzlib
                                          #:gzip gzip
                                          #:bzip2 bzip2
                                          #:xz xz
@@ -897,7 +900,7 @@ Info manual."
                                       (variables rest ...))))))
     (variables %localstatedir %storedir %sysconfdir)))
 
-(define* (make-config.scm #:key zlib gzip xz bzip2
+(define* (make-config.scm #:key zlib lzlib gzip xz bzip2
                           (package-name "GNU Guix")
                           (package-version "0")
                           (bug-report-address "bug-guix@gnu.org")
@@ -919,7 +922,7 @@ Info manual."
                                %store-database-directory
                                %config-directory
                                %libz
-                               ;; TODO: %liblz
+                               %liblz
                                %gzip
                                %bzip2
                                %xz))
@@ -966,7 +969,11 @@ Info manual."
 
                    (define %libz
                      #+(and zlib
-                            (file-append zlib "/lib/libz"))))
+                            (file-append zlib "/lib/libz")))
+
+                   (define %liblz
+                     #+(and lzlib
+                            (file-append lzlib "/lib/liblz"))))
 
                ;; Guile 2.0 *requires* the 'define-module' to be at the
                ;; top-level or the 'toplevel-ref' in the resulting .go file are
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 6/7] gnu: guix: Add dependency on lzlib.
  2019-05-24 13:42 ` [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed' Ludovic Courtès
                     ` (3 preceding siblings ...)
  2019-05-24 13:42   ` [bug#35880] [PATCH 5/7] self: Add dependency on lzlib Ludovic Courtès
@ 2019-05-24 13:42   ` Ludovic Courtès
  2019-05-24 13:42   ` [bug#35880] [PATCH 7/7] lzlib: 'lzread!' never returns more than it was asked for Ludovic Courtès
  2019-05-25 17:24   ` [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed' Pierre Neidhardt
  6 siblings, 0 replies; 31+ messages in thread
From: Ludovic Courtès @ 2019-05-24 13:42 UTC (permalink / raw)
  To: 35880; +Cc: Pierre Neidhardt

* gnu/packages/package-management.scm (guix)[inputs]: Add LZLIB.
---
 gnu/packages/package-management.scm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gnu/packages/package-management.scm b/gnu/packages/package-management.scm
index a356a6dab7..985fa801b1 100644
--- a/gnu/packages/package-management.scm
+++ b/gnu/packages/package-management.scm
@@ -272,6 +272,7 @@
        `(("bzip2" ,bzip2)
          ("gzip" ,gzip)
          ("zlib" ,zlib)                           ;for 'guix publish'
+         ("lzlib" ,lzlib)            ;for 'guix publish' and 'guix substitute'
 
          ("sqlite" ,sqlite)
          ("libgcrypt" ,libgcrypt)
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 7/7] lzlib: 'lzread!' never returns more than it was asked for.
  2019-05-24 13:42 ` [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed' Ludovic Courtès
                     ` (4 preceding siblings ...)
  2019-05-24 13:42   ` [bug#35880] [PATCH 6/7] gnu: guix: " Ludovic Courtès
@ 2019-05-24 13:42   ` Ludovic Courtès
  2019-05-25 17:31     ` Pierre Neidhardt
  2019-05-25 17:24   ` [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed' Pierre Neidhardt
  6 siblings, 1 reply; 31+ messages in thread
From: Ludovic Courtès @ 2019-05-24 13:42 UTC (permalink / raw)
  To: 35880; +Cc: Pierre Neidhardt

Fixes a bug whereby 'lzread!' could return more than COUNT.

* guix/lzlib.scm (lzread!): Rewrite in a semi-functional style.
---
 guix/lzlib.scm | 39 +++++++++++++++++++--------------------
 1 file changed, 19 insertions(+), 20 deletions(-)

diff --git a/guix/lzlib.scm b/guix/lzlib.scm
index 48927c6262..0e384f6cb8 100644
--- a/guix/lzlib.scm
+++ b/guix/lzlib.scm
@@ -494,29 +494,28 @@ perhaps not yet read."
 
 \f
 ;; High level functions.
-(define* (lzread! decoder file-port bv
+
+(define* (lzread! decoder port bv
                   #:optional (start 0) (count (bytevector-length bv)))
-  "Read up to COUNT bytes from FILE-PORT into BV at offset START.  Return the
+  "Read up to COUNT bytes from PORT into BV at offset START.  Return the
 number of uncompressed bytes actually read; it is zero if COUNT is zero or if
 the end-of-stream has been reached."
-  ;; WARNING: Because we don't alternate between lz-reads and lz-writes, we can't
-  ;; process more than lz-decompress-write-size from the file-port.
-  (when (> count (lz-decompress-write-size decoder))
-    (set! count (lz-decompress-write-size decoder)))
-  (let ((file-bv (get-bytevector-n file-port count)))
-    (unless (eof-object? file-bv)
-      (lz-decompress-write decoder file-bv 0 (bytevector-length file-bv))))
-  (let ((read 0))
-    (let loop ((rd 0))
-      (if (< start (bytevector-length bv))
-          (begin
-            (set! rd (lz-decompress-read decoder bv start (- (bytevector-length bv) start)))
-            (set! start (+ start rd))
-            (set! read (+ read rd)))
-          (set! rd 0))
-      (unless (= rd 0)
-        (loop rd)))
-    read))
+  (define (feed-decoder! decoder)
+    ;; Feed DECODER with data read from PORT.
+    (match (get-bytevector-n port (lz-decompress-write-size decoder))
+      ((? eof-object? eof) eof)
+      (bv (lz-decompress-write decoder bv))))
+
+  (let loop ((read 0)
+             (start start))
+    (cond ((< read count)
+           (match (lz-decompress-read decoder bv start (- count read))
+             (0 (if (eof-object? (feed-decoder! decoder))
+                    read
+                    (loop read start)))
+             (n (loop (+ read n) (+ start n)))))
+          (else
+           read))))
 
 (define (lzwrite! encoder source source-offset source-count
                   target target-offset target-count)
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed'.
  2019-05-24 13:42 ` [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed' Ludovic Courtès
                     ` (5 preceding siblings ...)
  2019-05-24 13:42   ` [bug#35880] [PATCH 7/7] lzlib: 'lzread!' never returns more than it was asked for Ludovic Courtès
@ 2019-05-25 17:24   ` Pierre Neidhardt
  2019-05-26 19:51     ` Ludovic Courtès
  6 siblings, 1 reply; 31+ messages in thread
From: Pierre Neidhardt @ 2019-05-25 17:24 UTC (permalink / raw)
  To: Ludovic Courtès, 35880

[-- Attachment #1: Type: text/plain, Size: 1356 bytes --]

As an Lzip enthusiast, I have some questions ;)

I see you are using make-lzip-input-port/compressed in a subsequent
patch, but this does not map how it's done for gzip et al., the latter
being invoked via it's system command "gzip -c ...".  Why did you decide
to do it differently for lzip?

Much of the code induced by make-lzip-input-port/compressed seems to
repeat the lzread! / lzwrite business, maybe there is a way to factor
some of it?

> +(define (lzwrite! encoder source source-offset source-count
> +                  target target-offset target-count)
> +  "Write up to SOURCE-COUNT bytes from SOURCE to ENCODER, and read up to
> +TARGET-COUNT bytes into TARGET at TARGET-OFFSET.  Return two values: the
> +number of bytes read from SOURCE, and the number of bytes written to TARGET."
> +  (define read
> +    (if (< 0 (lz-compress-write-size encoder))
> +        (match (lz-compress-write encoder source source-offset source-count)
> +          (0 (lz-compress-finish encoder) 0)
> +          (n n))
> +        0))
> +
> +  (let loop ()
> +    (match (lz-compress-read encoder target target-offset target-count)
> +      (0       (loop))
> +      (written (values read written)))))

Why looping on 0?  If there is no byte to read, wouldn't this loop indefinitely?


-- 
Pierre Neidhardt
https://ambrevar.xyz/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 3/7] utils: Support compression and decompression with lzip.
  2019-05-24 13:42   ` [bug#35880] [PATCH 3/7] utils: Support compression and decompression with lzip Ludovic Courtès
@ 2019-05-25 17:27     ` Pierre Neidhardt
  2019-05-26 19:52       ` Ludovic Courtès
  0 siblings, 1 reply; 31+ messages in thread
From: Pierre Neidhardt @ 2019-05-25 17:27 UTC (permalink / raw)
  To: Ludovic Courtès, 35880

[-- Attachment #1: Type: text/plain, Size: 922 bytes --]

This is the part where you use make-lzip-input-port/compressed:

>  (define (compressed-port compression input)
> -  "Return an input port where INPUT is decompressed according to COMPRESSION,
> +  "Return an input port where INPUT is compressed according to COMPRESSION,
>  a symbol such as 'xz."
>    (match compression
>      ((or #f 'none) (values input '()))
>      ('bzip2        (filtered-port `(,%bzip2 "-c") input))
>      ('xz           (filtered-port `(,%xz "-c") input))
>      ('gzip         (filtered-port `(,%gzip "-c") input))
> -    (else          (error "unsupported compression scheme" compression))))
> +    ('lzip         (values (lzip-port 'make-lzip-input-port/compressed input)
> +                           '()))
> +    (_             (error "unsupported compression scheme" compression))))

So why not doing like for the others?


-- 
Pierre Neidhardt
https://ambrevar.xyz/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 7/7] lzlib: 'lzread!' never returns more than it was asked for.
  2019-05-24 13:42   ` [bug#35880] [PATCH 7/7] lzlib: 'lzread!' never returns more than it was asked for Ludovic Courtès
@ 2019-05-25 17:31     ` Pierre Neidhardt
  2019-05-26 19:54       ` Ludovic Courtès
  0 siblings, 1 reply; 31+ messages in thread
From: Pierre Neidhardt @ 2019-05-25 17:31 UTC (permalink / raw)
  To: Ludovic Courtès, 35880

[-- Attachment #1: Type: text/plain, Size: 405 bytes --]


> Fixes a bug whereby 'lzread!' could return more than COUNT.

Hmm... But why is this a bug?  lzread! returns the number of
_uncompressed_ bytes, while COUNT is number of _compressed_ bytes
written, so it's often expected that the former is more than the latter.

(By the way, nice rewrite, I like it much better than my C->Scheme
translation ;p)

-- 
Pierre Neidhardt
https://ambrevar.xyz/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed'.
  2019-05-25 17:24   ` [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed' Pierre Neidhardt
@ 2019-05-26 19:51     ` Ludovic Courtès
  2019-05-27 15:45       ` Ludovic Courtès
  0 siblings, 1 reply; 31+ messages in thread
From: Ludovic Courtès @ 2019-05-26 19:51 UTC (permalink / raw)
  To: Pierre Neidhardt; +Cc: 35880

Hi!

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

> As an Lzip enthusiast, I have some questions ;)
>
> I see you are using make-lzip-input-port/compressed in a subsequent
> patch, but this does not map how it's done for gzip et al., the latter
> being invoked via it's system command "gzip -c ...".  Why did you decide
> to do it differently for lzip?
>
> Much of the code induced by make-lzip-input-port/compressed seems to
> repeat the lzread! / lzwrite business, maybe there is a way to factor
> some of it?

Actually, ‘make-lzip-input-port/compressed’ exists solely so we can have
‘compressed-port’ for lzip, which in turn allows us to write tests.

It uses (guix lzlib) instead of invoking the ‘lzip’ command because we
can.  :-)  Invoking commands is not as nice, because it’s more
expensive, requires us to spawn an additional process when the input is
not a file port (e.g., it’s a string port), and it’s forking is not
possible in multi-threaded programs like ‘guix publish’.

>> +(define (lzwrite! encoder source source-offset source-count
>> +                  target target-offset target-count)
>> +  "Write up to SOURCE-COUNT bytes from SOURCE to ENCODER, and read up to
>> +TARGET-COUNT bytes into TARGET at TARGET-OFFSET.  Return two values: the
>> +number of bytes read from SOURCE, and the number of bytes written to TARGET."
>> +  (define read
>> +    (if (< 0 (lz-compress-write-size encoder))
>> +        (match (lz-compress-write encoder source source-offset source-count)
>> +          (0 (lz-compress-finish encoder) 0)
>> +          (n n))
>> +        0))
>> +
>> +  (let loop ()
>> +    (match (lz-compress-read encoder target target-offset target-count)
>> +      (0       (loop))
>> +      (written (values read written)))))
>
> Why looping on 0?  If there is no byte to read, wouldn't this loop indefinitely?

Hmm, good point.  The idea is that ‘lzwrite!’ should return 0 only on
end-of-file, but then the loop should include reading more from SOURCE.
I’ll follow up on this one.

Thanks!

Ludo’.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 3/7] utils: Support compression and decompression with lzip.
  2019-05-25 17:27     ` Pierre Neidhardt
@ 2019-05-26 19:52       ` Ludovic Courtès
  0 siblings, 0 replies; 31+ messages in thread
From: Ludovic Courtès @ 2019-05-26 19:52 UTC (permalink / raw)
  To: Pierre Neidhardt; +Cc: 35880

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

> This is the part where you use make-lzip-input-port/compressed:
>
>>  (define (compressed-port compression input)
>> -  "Return an input port where INPUT is decompressed according to COMPRESSION,
>> +  "Return an input port where INPUT is compressed according to COMPRESSION,
>>  a symbol such as 'xz."
>>    (match compression
>>      ((or #f 'none) (values input '()))
>>      ('bzip2        (filtered-port `(,%bzip2 "-c") input))
>>      ('xz           (filtered-port `(,%xz "-c") input))
>>      ('gzip         (filtered-port `(,%gzip "-c") input))
>> -    (else          (error "unsupported compression scheme" compression))))
>> +    ('lzip         (values (lzip-port 'make-lzip-input-port/compressed input)
>> +                           '()))
>> +    (_             (error "unsupported compression scheme" compression))))
>
> So why not doing like for the others?

See my other previous reply.  :-)

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 7/7] lzlib: 'lzread!' never returns more than it was asked for.
  2019-05-25 17:31     ` Pierre Neidhardt
@ 2019-05-26 19:54       ` Ludovic Courtès
  2019-05-26 20:57         ` Pierre Neidhardt
  0 siblings, 1 reply; 31+ messages in thread
From: Ludovic Courtès @ 2019-05-26 19:54 UTC (permalink / raw)
  To: Pierre Neidhardt; +Cc: 35880

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

>> Fixes a bug whereby 'lzread!' could return more than COUNT.
>
> Hmm... But why is this a bug?

Because then the ‘read!’ method of the custom binary input port could
return more than ‘count’, which is understandably not permitted.

> (By the way, nice rewrite, I like it much better than my C->Scheme
> translation ;p)

Heheh, I initially tried to minimize changes but I found it easier to
reason about this version.

Ludo’.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 7/7] lzlib: 'lzread!' never returns more than it was asked for.
  2019-05-26 19:54       ` Ludovic Courtès
@ 2019-05-26 20:57         ` Pierre Neidhardt
  2019-05-26 21:28           ` Ludovic Courtès
  0 siblings, 1 reply; 31+ messages in thread
From: Pierre Neidhardt @ 2019-05-26 20:57 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 35880

[-- Attachment #1: Type: text/plain, Size: 578 bytes --]

Ludovic Courtès <ludo@gnu.org> writes:

> Pierre Neidhardt <mail@ambrevar.xyz> skribis:
>
>>> Fixes a bug whereby 'lzread!' could return more than COUNT.
>>
>> Hmm... But why is this a bug?
>
> Because then the ‘read!’ method of the custom binary input port could
> return more than ‘count’, which is understandably not permitted.

That's the part where I'm a bit confused because we deal with compressed
data here.

So when we say "(read count)", does COUNT refer to the compressed or
uncompressed data?

-- 
Pierre Neidhardt
https://ambrevar.xyz/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 7/7] lzlib: 'lzread!' never returns more than it was asked for.
  2019-05-26 20:57         ` Pierre Neidhardt
@ 2019-05-26 21:28           ` Ludovic Courtès
  2019-05-27  7:00             ` Pierre Neidhardt
  0 siblings, 1 reply; 31+ messages in thread
From: Ludovic Courtès @ 2019-05-26 21:28 UTC (permalink / raw)
  To: Pierre Neidhardt; +Cc: 35880

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Pierre Neidhardt <mail@ambrevar.xyz> skribis:
>>
>>>> Fixes a bug whereby 'lzread!' could return more than COUNT.
>>>
>>> Hmm... But why is this a bug?
>>
>> Because then the ‘read!’ method of the custom binary input port could
>> return more than ‘count’, which is understandably not permitted.
>
> That's the part where I'm a bit confused because we deal with compressed
> data here.
>
> So when we say "(read count)", does COUNT refer to the compressed or
> uncompressed data?

We have this:

  (define* (make-lzip-input-port port)
    (define decoder (lz-decompress-open))

    (define (read! bv start count)
      (lzread! decoder port bv start count))

    (make-custom-binary-input-port "lzip-input" read! #f #f
                                   (lambda () …)))

Here ‘read!’ must return an integer between 1 and COUNT; it must return
0 if and only if the end-of-file is reached.

IOW, ‘lzread!’ must return the number of uncompressed bytes of BV that
it consumed, and that number is necessarily <= COUNT.

Does that make sense?

Thanks,
Ludo’.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 7/7] lzlib: 'lzread!' never returns more than it was asked for.
  2019-05-26 21:28           ` Ludovic Courtès
@ 2019-05-27  7:00             ` Pierre Neidhardt
  2019-05-27 10:00               ` Ludovic Courtès
  0 siblings, 1 reply; 31+ messages in thread
From: Pierre Neidhardt @ 2019-05-27  7:00 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 35880

[-- Attachment #1: Type: text/plain, Size: 711 bytes --]

It does make sense, but then don't we have the same issue with zlib.scm:

--8<---------------cut here---------------start------------->8---
(define gzread!
  (let ((proc (zlib-procedure int "gzread" (list '* '* unsigned-int))))
    (lambda* (gzfile bv #:optional (start 0) (count (bytevector-length bv)))
      "Read up to COUNT bytes from GZFILE into BV at offset START.  Return the
number of uncompressed bytes actually read; it is zero if COUNT is zero or if
the end-of-stream has been reached."
 ...
--8<---------------cut here---------------end--------------->8---

I initially tried to mimic zlib.scm and this part confused me a lot back then.

-- 
Pierre Neidhardt
https://ambrevar.xyz/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 7/7] lzlib: 'lzread!' never returns more than it was asked for.
  2019-05-27  7:00             ` Pierre Neidhardt
@ 2019-05-27 10:00               ` Ludovic Courtès
  0 siblings, 0 replies; 31+ messages in thread
From: Ludovic Courtès @ 2019-05-27 10:00 UTC (permalink / raw)
  To: Pierre Neidhardt; +Cc: 35880

Hi Pierre,

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

> It does make sense, but then don't we have the same issue with zlib.scm:

Which issue?

> (define gzread!
>   (let ((proc (zlib-procedure int "gzread" (list '* '* unsigned-int))))
>     (lambda* (gzfile bv #:optional (start 0) (count (bytevector-length bv)))
>       "Read up to COUNT bytes from GZFILE into BV at offset START.  Return the
> number of uncompressed bytes actually read; it is zero if COUNT is zero or if
> the end-of-stream has been reached."
>  ...
>
> I initially tried to mimic zlib.scm and this part confused me a lot back then.

There’s a key difference: the ‘gzread’ etc. API is high-level and easy
to use, but it wants a file descriptor to read from (thus a file port in
Scheme land.)

That’s enough for ‘guix publish’, which writes gzipped data to files,
but that’s not enough for ‘guix substitute’, which can read data from
non-file ports (e.g., chunked-encoding ports or TLS ports from the HTTP
client.)

HTH,
Ludo’.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed'.
  2019-05-26 19:51     ` Ludovic Courtès
@ 2019-05-27 15:45       ` Ludovic Courtès
  2019-05-27 16:24         ` Pierre Neidhardt
  0 siblings, 1 reply; 31+ messages in thread
From: Ludovic Courtès @ 2019-05-27 15:45 UTC (permalink / raw)
  To: Pierre Neidhardt; +Cc: 35880

Ludovic Courtès <ludo@gnu.org> skribis:

> Pierre Neidhardt <mail@ambrevar.xyz> skribis:

[...]

>>> +(define (lzwrite! encoder source source-offset source-count
>>> +                  target target-offset target-count)
>>> +  "Write up to SOURCE-COUNT bytes from SOURCE to ENCODER, and read up to
>>> +TARGET-COUNT bytes into TARGET at TARGET-OFFSET.  Return two values: the
>>> +number of bytes read from SOURCE, and the number of bytes written to TARGET."
>>> +  (define read
>>> +    (if (< 0 (lz-compress-write-size encoder))
>>> +        (match (lz-compress-write encoder source source-offset source-count)
>>> +          (0 (lz-compress-finish encoder) 0)
>>> +          (n n))
>>> +        0))
>>> +
>>> +  (let loop ()
>>> +    (match (lz-compress-read encoder target target-offset target-count)
>>> +      (0       (loop))
>>> +      (written (values read written)))))
>>
>> Why looping on 0?  If there is no byte to read, wouldn't this loop indefinitely?
>
> Hmm, good point.  The idea is that ‘lzwrite!’ should return 0 only on
> end-of-file, but then the loop should include reading more from SOURCE.
> I’ll follow up on this one.

I noticed that ‘lz-compress-read’ is documented to return a “strictly
positive integer”, so I’m changing it to this:

--8<---------------cut here---------------start------------->8---
(define (lzwrite! encoder source source-offset source-count
                  target target-offset target-count)
  "Write up to SOURCE-COUNT bytes from SOURCE to ENCODER, and read up to
TARGET-COUNT bytes into TARGET at TARGET-OFFSET.  Return two values: the
number of bytes read from SOURCE, and the non-zero number of bytes written to
TARGET."
  (define read
    (if (< 0 (lz-compress-write-size encoder))
        (match (lz-compress-write encoder source source-offset source-count)
          (0 (lz-compress-finish encoder) 0)
          (n n))
        0))

  (define written
    ;; Note: 'lz-compress-read' promises to return a non-zero integer.
    (lz-compress-read encoder target target-offset target-count))

  (values read written))
--8<---------------cut here---------------end--------------->8---

Let me know what you think!

Ludo’.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed'.
  2019-05-27 15:45       ` Ludovic Courtès
@ 2019-05-27 16:24         ` Pierre Neidhardt
  2019-05-27 20:53           ` bug#35880: " Ludovic Courtès
  0 siblings, 1 reply; 31+ messages in thread
From: Pierre Neidhardt @ 2019-05-27 16:24 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 35880

[-- Attachment #1: Type: text/plain, Size: 1058 bytes --]

Ludovic Courtès <ludo@gnu.org> writes:

> I noticed that ‘lz-compress-read’ is documented to return a “strictly
> positive integer”, so I’m changing it to this:

The docstring (that I wrote) says "non-negative positive integer."
"positive" is a typo (sorry about that), it should read "non-negative
integer" since the return value can be zero.

In general, lzlib's "reads" and "writes" don't give any guarantee about
the size of bytes that are actually processed.  You need to loop over
the calls until some condition is met, see the "finish(ed)" functions.

Here in particular, it's not clear that lz-compress-read is going to read
all the bytes in the encoder buffer.  Maybe that's OK for this
particular functions if we don't expect TARGET-COUNT to be reached.  See

  http://www.nongnu.org/lzip/manual/lzlib_manual.html#Buffering

That said, if the encoder buffer is not empty, I think lz-compress-read
should always return something >0.

Yup, lzlib is very low-level! :p

-- 
Pierre Neidhardt
https://ambrevar.xyz/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* bug#35880: [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed'.
  2019-05-27 16:24         ` Pierre Neidhardt
@ 2019-05-27 20:53           ` Ludovic Courtès
  2019-05-27 21:12             ` [bug#35880] " Pierre Neidhardt
  0 siblings, 1 reply; 31+ messages in thread
From: Ludovic Courtès @ 2019-05-27 20:53 UTC (permalink / raw)
  To: Pierre Neidhardt; +Cc: 35880-done

Hi!

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

> That said, if the encoder buffer is not empty, I think lz-compress-read
> should always return something >0.

Yes, probably.  The docstring for ‘lz-compress-read’ says:

    "Read up to COUNT bytes from the encoder stream, storing the results in LZFILE-BV.
  Return the number of uncompressed bytes written, a strictly positive integer."
                                                     ^~~~~~~~~~~~~~~~~

However, the lzlib manual doesn’t say that for ‘LZ_compress_read’ (info
"(lzlib) Compression functions").

But that’s OK: the ‘read!’ method in ‘make-lzip-input-port/compressed’
can just call ‘lzwrite!’ again with more data when that happens, so I’ve
done that.

And I pushed the whole thing! :-)

I think it’d be good to let people play with it in their personal
setups.

Next up: multi-compression support in ‘guix publish’ (possibly?) so we
can smoothly transition on our build farms.

Thanks!

Ludo’.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed'.
  2019-05-27 20:53           ` bug#35880: " Ludovic Courtès
@ 2019-05-27 21:12             ` Pierre Neidhardt
  2019-05-28  7:52               ` Ludovic Courtès
  0 siblings, 1 reply; 31+ messages in thread
From: Pierre Neidhardt @ 2019-05-27 21:12 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 35880-done

[-- Attachment #1: Type: text/plain, Size: 1476 bytes --]

Ludovic Courtès <ludo@gnu.org> writes:

>> That said, if the encoder buffer is not empty, I think lz-compress-read
>> should always return something >0.
>
> Yes, probably.  The docstring for ‘lz-compress-read’ says:

Oops, I read the docstring of lz-DEcompress-read.  My bad.

>     "Read up to COUNT bytes from the encoder stream, storing the results in LZFILE-BV.
>   Return the number of uncompressed bytes written, a strictly positive integer."
>                                                      ^~~~~~~~~~~~~~~~~

Bigger oops!  This comes from a copy-paste of the gzip docstring which I
forgot to update properly (I did for the decompression functions, but
not for the compression functions).  The docstrings should be fixed.

> But that’s OK: the ‘read!’ method in ‘make-lzip-input-port/compressed’
> can just call ‘lzwrite!’ again with more data when that happens, so I’ve
> done that.

This could work, but I've had some headaches on such assumptions
before.  Tests are very necessary here to validate those assumptions ;)

The thing is that we are not using lzlib as it is meant to be used
(i.e. with the finish* functions) because of the functional approach of
the binary ports which don't really play well with the procedural
approach of the C library.

> And I pushed the whole thing! :-)

Hurray!  Can't wait to say lz-compressed archives coming to Guix! :)

-- 
Pierre Neidhardt
https://ambrevar.xyz/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed'.
  2019-05-27 21:12             ` [bug#35880] " Pierre Neidhardt
@ 2019-05-28  7:52               ` Ludovic Courtès
  2019-05-28  8:46                 ` Pierre Neidhardt
  0 siblings, 1 reply; 31+ messages in thread
From: Ludovic Courtès @ 2019-05-28  7:52 UTC (permalink / raw)
  To: Pierre Neidhardt; +Cc: 35880-done

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>>> That said, if the encoder buffer is not empty, I think lz-compress-read
>>> should always return something >0.
>>
>> Yes, probably.  The docstring for ‘lz-compress-read’ says:
>
> Oops, I read the docstring of lz-DEcompress-read.  My bad.
>
>>     "Read up to COUNT bytes from the encoder stream, storing the results in LZFILE-BV.
>>   Return the number of uncompressed bytes written, a strictly positive integer."
>>                                                      ^~~~~~~~~~~~~~~~~
>
> Bigger oops!  This comes from a copy-paste of the gzip docstring which I
> forgot to update properly (I did for the decompression functions, but
> not for the compression functions).  The docstrings should be fixed.

I fixed this one in e13354a7ca5a0d5e28e02c4cfce6fecb1ab770e4.

>> But that’s OK: the ‘read!’ method in ‘make-lzip-input-port/compressed’
>> can just call ‘lzwrite!’ again with more data when that happens, so I’ve
>> done that.
>
> This could work, but I've had some headaches on such assumptions
> before.  Tests are very necessary here to validate those assumptions ;)

Definitely!

> The thing is that we are not using lzlib as it is meant to be used
> (i.e. with the finish* functions) because of the functional approach of
> the binary ports which don't really play well with the procedural
> approach of the C library.

I think we’re using it the way it’s meant to be used, roughly along the
lines of the examples of its manual (info "(lzlib) Examples").

(I/O ports are not very “functional”.)

>> And I pushed the whole thing! :-)
>
> Hurray!  Can't wait to say lz-compressed archives coming to Guix! :)

I’ve updated the ‘guix’ package so people can start using
‘guix publish -C lzip’ and fetch substitute from there.

Thanks for making it possible!

Ludo’.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed'.
  2019-05-28  7:52               ` Ludovic Courtès
@ 2019-05-28  8:46                 ` Pierre Neidhardt
  2019-05-28 13:47                   ` Ludovic Courtès
  0 siblings, 1 reply; 31+ messages in thread
From: Pierre Neidhardt @ 2019-05-28  8:46 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 35880-done

[-- Attachment #1: Type: text/plain, Size: 1618 bytes --]

Ludovic Courtès <ludo@gnu.org> writes:

>> Bigger oops!  This comes from a copy-paste of the gzip docstring which I
>> forgot to update properly (I did for the decompression functions, but
>> not for the compression functions).  The docstrings should be fixed.
>
> I fixed this one in e13354a7ca5a0d5e28e02c4cfce6fecb1ab770e4.

Thanks!

>> The thing is that we are not using lzlib as it is meant to be used
>> (i.e. with the finish* functions) because of the functional approach of
>> the binary ports which don't really play well with the procedural
>> approach of the C library.
>
> I think we’re using it the way it’s meant to be used, roughly along the
> lines of the examples of its manual (info "(lzlib) Examples").
>
> (I/O ports are not very “functional”.)

In the sense that we define  "read!" and "write!" functions which don't
allow us to call the "finish" functions properly.

So maybe we are following the doc too "roughly" :p

This has multiple implications, e.g. it impedes support for multiple
members, (which is OK as long as we don't accept archives produced by a
third-party) and more importantly it lifts the guarantee that the
library is going to work as per the documentation.

>>> And I pushed the whole thing! :-)
>>
>> Hurray!  Can't wait to say lz-compressed archives coming to Guix! :)
>
> I’ve updated the ‘guix’ package so people can start using
> ‘guix publish -C lzip’ and fetch substitute from there.
>
> Thanks for making it possible!

And thanks for doing most of the work! :D

-- 
Pierre Neidhardt
https://ambrevar.xyz/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed'.
  2019-05-28  8:46                 ` Pierre Neidhardt
@ 2019-05-28 13:47                   ` Ludovic Courtès
  2019-05-29 14:57                     ` Pierre Neidhardt
  0 siblings, 1 reply; 31+ messages in thread
From: Ludovic Courtès @ 2019-05-28 13:47 UTC (permalink / raw)
  To: Pierre Neidhardt; +Cc: 35880-done

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:

[...]

>>> The thing is that we are not using lzlib as it is meant to be used
>>> (i.e. with the finish* functions) because of the functional approach of
>>> the binary ports which don't really play well with the procedural
>>> approach of the C library.
>>
>> I think we’re using it the way it’s meant to be used, roughly along the
>> lines of the examples of its manual (info "(lzlib) Examples").
>>
>> (I/O ports are not very “functional”.)
>
> In the sense that we define  "read!" and "write!" functions which don't
> allow us to call the "finish" functions properly.
>
> So maybe we are following the doc too "roughly" :p
>
> This has multiple implications, e.g. it impedes support for multiple
> members, (which is OK as long as we don't accept archives produced by a
> third-party) and more importantly it lifts the guarantee that the
> library is going to work as per the documentation.

I’m not sure I follow.  I think ‘make-lzip-input-port/compressed’
corresponds to Example 2 in the manual (info "(lzlib) Examples"),
‘make-lzip-output-port’ corresponds to Example 1, and
‘make-lzip-input-port’ corresponds to Example 4 (with the exception that
‘lzread!’ doesn’t call ‘lz-decompress-finished?’, but it has other means
to tell whether we’re done processing input.)

What do you think is not done as documented?

Thanks,
Ludo’.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed'.
  2019-05-28 13:47                   ` Ludovic Courtès
@ 2019-05-29 14:57                     ` Pierre Neidhardt
  2019-05-31 20:54                       ` Ludovic Courtès
  0 siblings, 1 reply; 31+ messages in thread
From: Pierre Neidhardt @ 2019-05-29 14:57 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 35880-done

[-- Attachment #1: Type: text/plain, Size: 2181 bytes --]

Ludovic Courtès <ludo@gnu.org> writes:

> I’m not sure I follow.  I think ‘make-lzip-input-port/compressed’
> corresponds to Example 2 in the manual (info "(lzlib) Examples"),
> ‘make-lzip-output-port’ corresponds to Example 1, and
> ‘make-lzip-input-port’ corresponds to Example 4 (with the exception that
> ‘lzread!’ doesn’t call ‘lz-decompress-finished?’, but it has other means
> to tell whether we’re done processing input.)

Example 4 is:

     1) LZ_decompress_open
     2) go to step 5 if LZ_decompress_write_size returns 0
     3) LZ_decompress_write
     4) if no more data to write, call LZ_decompress_finish
     5) LZ_decompress_read
     5a) optionally, if LZ_decompress_member_finished returns 1, read
         final values for member with LZ_decompress_data_crc, etc.
     6) go back to step 2 until LZ_decompress_finished returns 1
     7) LZ_decompress_close

In `lzread!', we don't call lz-decompress-finished? nor do we loop on
lz-decompress-finished.

This only works for decompression of single-member archive, but the
documentation does not say that.

--8<---------------cut here---------------start------------->8---
    (match (get-bytevector-n port (lz-decompress-write-size decoder))
      ((? eof-object? eof) eof)
      (bv (lz-decompress-write decoder bv)))
--8<---------------cut here---------------end--------------->8---

In the above if lz-decompress-write-size returns 0, we won't be reading
anything (infinite loop?).  While I understand this should not happen in
practice, the documentation of the library does not give such guarantees.
Antonio told me that explicitly.

--8<---------------cut here---------------start------------->8---
           (match (lz-decompress-read decoder bv start (- count read))
             (0 (if (eof-object? (feed-decoder! decoder))
                    read
                    (loop read start)))
--8<---------------cut here---------------end--------------->8---

I'm not sure I understand the above: if we read nothing, then we try
again?  This might loop forever.

What do you think?

-- 
Pierre Neidhardt
https://ambrevar.xyz/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed'.
  2019-05-29 14:57                     ` Pierre Neidhardt
@ 2019-05-31 20:54                       ` Ludovic Courtès
  2019-06-01  6:02                         ` Pierre Neidhardt
  0 siblings, 1 reply; 31+ messages in thread
From: Ludovic Courtès @ 2019-05-31 20:54 UTC (permalink / raw)
  To: Pierre Neidhardt; +Cc: 35880-done

Hello,

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> I’m not sure I follow.  I think ‘make-lzip-input-port/compressed’
>> corresponds to Example 2 in the manual (info "(lzlib) Examples"),
>> ‘make-lzip-output-port’ corresponds to Example 1, and
>> ‘make-lzip-input-port’ corresponds to Example 4 (with the exception that
>> ‘lzread!’ doesn’t call ‘lz-decompress-finished?’, but it has other means
>> to tell whether we’re done processing input.)
>
> Example 4 is:
>
>      1) LZ_decompress_open
>      2) go to step 5 if LZ_decompress_write_size returns 0
>      3) LZ_decompress_write
>      4) if no more data to write, call LZ_decompress_finish
>      5) LZ_decompress_read
>      5a) optionally, if LZ_decompress_member_finished returns 1, read
>          final values for member with LZ_decompress_data_crc, etc.
>      6) go back to step 2 until LZ_decompress_finished returns 1
>      7) LZ_decompress_close
>
> In `lzread!', we don't call lz-decompress-finished? nor do we loop on
> lz-decompress-finished.

Indeed, we’re missing a call to ‘lz-decompress-finish’, so lzlib could
in theory think there’s still data coming, and so fail to produce more
output, possibly leading to an infinite loop.

I think the ‘lzread!’ loop should look like this (the tests still pass
with this):

--8<---------------cut here---------------start------------->8---
  (let loop ((read 0)
             (start start))
    (cond ((< read count)
           (match (lz-decompress-read decoder bv start (- count read))
             (0 (cond ((lz-decompress-finished? decoder)
                       read)
                      ((eof-object? (feed-decoder! decoder))
                       (lz-decompress-finish decoder)
                       (loop read start))
                      (else                       ;read again
                       (loop read start))))
             (n (loop (+ read n) (+ start n)))))
          (else
           read)))
--8<---------------cut here---------------end--------------->8---

That way, I believe all cases are correctly handled.

WDYT?

> This only works for decompression of single-member archive, but the
> documentation does not say that.
>
>     (match (get-bytevector-n port (lz-decompress-write-size decoder))
>       ((? eof-object? eof) eof)
>       (bv (lz-decompress-write decoder bv)))
>
>
> In the above if lz-decompress-write-size returns 0, we won't be reading
> anything (infinite loop?).

We’re calling ‘feed-decoder!’ iff ‘lz-decompress-read’ returned 0; when
that happens ‘lz-decompress-write-size’ must return a strictly positive
number.  Otherwise there’s an inconsistency.

>            (match (lz-decompress-read decoder bv start (- count read))
>              (0 (if (eof-object? (feed-decoder! decoder))
>                     read
>                     (loop read start)))
>
> I'm not sure I understand the above: if we read nothing, then we try
> again?

No: if we read *something*, we try again; if we read nothing, we return.

Thanks for your careful review, much appreciated!

Ludo’.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed'.
  2019-05-31 20:54                       ` Ludovic Courtès
@ 2019-06-01  6:02                         ` Pierre Neidhardt
  2019-06-01  9:41                           ` Ludovic Courtès
  0 siblings, 1 reply; 31+ messages in thread
From: Pierre Neidhardt @ 2019-06-01  6:02 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 35880-done

[-- Attachment #1: Type: text/plain, Size: 1561 bytes --]

Ludovic Courtès <ludo@gnu.org> writes:

> I think the ‘lzread!’ loop should look like this (the tests still pass
> with this):
>
> --8<---------------cut here---------------start------------->8---
>   (let loop ((read 0)
>              (start start))
>     (cond ((< read count)
>            (match (lz-decompress-read decoder bv start (- count read))
>              (0 (cond ((lz-decompress-finished? decoder)
>                        read)
>                       ((eof-object? (feed-decoder! decoder))
>                        (lz-decompress-finish decoder)
>                        (loop read start))
>                       (else                       ;read again
>                        (loop read start))))
>              (n (loop (+ read n) (+ start n)))))
>           (else
>            read)))
> --8<---------------cut here---------------end--------------->8---

Looks good to me!

>>            (match (lz-decompress-read decoder bv start (- count read))
>>              (0 (if (eof-object? (feed-decoder! decoder))
>>                     read
>>                     (loop read start)))
>>
>> I'm not sure I understand the above: if we read nothing, then we try
>> again?
>
> No: if we read *something*, we try again; if we read nothing, we return.

If we read nothing _and_ it is not an EOF (it can be an empty vector),
then we loop indefinitely, no?

> Thanks for your careful review, much appreciated!

You are welcome, thanks for your invaluable work!

-- 
Pierre Neidhardt
https://ambrevar.xyz/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed'.
  2019-06-01  6:02                         ` Pierre Neidhardt
@ 2019-06-01  9:41                           ` Ludovic Courtès
  2019-06-01  9:58                             ` Pierre Neidhardt
  0 siblings, 1 reply; 31+ messages in thread
From: Ludovic Courtès @ 2019-06-01  9:41 UTC (permalink / raw)
  To: Pierre Neidhardt; +Cc: 35880-done

Hi!

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> I think the ‘lzread!’ loop should look like this (the tests still pass
>> with this):
>>
>> --8<---------------cut here---------------start------------->8---
>>   (let loop ((read 0)
>>              (start start))
>>     (cond ((< read count)
>>            (match (lz-decompress-read decoder bv start (- count read))
>>              (0 (cond ((lz-decompress-finished? decoder)
>>                        read)
>>                       ((eof-object? (feed-decoder! decoder))
>>                        (lz-decompress-finish decoder)
>>                        (loop read start))
>>                       (else                       ;read again
>>                        (loop read start))))
>>              (n (loop (+ read n) (+ start n)))))
>>           (else
>>            read)))
>> --8<---------------cut here---------------end--------------->8---
>
> Looks good to me!

OK, committed!

>>>            (match (lz-decompress-read decoder bv start (- count read))
>>>              (0 (if (eof-object? (feed-decoder! decoder))
>>>                     read
>>>                     (loop read start)))
>>>
>>> I'm not sure I understand the above: if we read nothing, then we try
>>> again?
>>
>> No: if we read *something*, we try again; if we read nothing, we return.
>
> If we read nothing _and_ it is not an EOF (it can be an empty vector),
> then we loop indefinitely, no?

‘feed-decoder!’ cannot return an empty bytevector because
‘lz-decompress-write-size’ necessarily returns a strictly positive
integer at this point.

(Imperative programming is hard! :-))

Thanks,
Ludo’.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed'.
  2019-06-01  9:41                           ` Ludovic Courtès
@ 2019-06-01  9:58                             ` Pierre Neidhardt
  2019-06-01 12:21                               ` Ludovic Courtès
  0 siblings, 1 reply; 31+ messages in thread
From: Pierre Neidhardt @ 2019-06-01  9:58 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 35880-done

[-- Attachment #1: Type: text/plain, Size: 602 bytes --]

Ludovic Courtès <ludo@gnu.org> writes:

> ‘feed-decoder!’ cannot return an empty bytevector because
> ‘lz-decompress-write-size’ necessarily returns a strictly positive
> integer at this point.

I'm not sure that's true: if the buffer is full and the next
lz-decompress-read does not read anything, then the buffer will still be
full and lz-decompress-write-size will return 0.

The specs don't guarantee that lz-decompress-read will always read
something.

But that's the only assumption we are making I believe, and it's fair :)

-- 
Pierre Neidhardt
https://ambrevar.xyz/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed'.
  2019-06-01  9:58                             ` Pierre Neidhardt
@ 2019-06-01 12:21                               ` Ludovic Courtès
  0 siblings, 0 replies; 31+ messages in thread
From: Ludovic Courtès @ 2019-06-01 12:21 UTC (permalink / raw)
  To: Pierre Neidhardt; +Cc: 35880-done

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> ‘feed-decoder!’ cannot return an empty bytevector because
>> ‘lz-decompress-write-size’ necessarily returns a strictly positive
>> integer at this point.
>
> I'm not sure that's true: if the buffer is full and the next
> lz-decompress-read does not read anything, then the buffer will still be
> full and lz-decompress-write-size will return 0.

Hmm I don’t think that’s a reasonable scenario, otherwise it would mean
that you’re in a deadlock anyway (decoder buffer is already full yet it
does not contain enough data to actually decode anything.)

Dunno, I’m rather confident here but we’ll see if that causes any
troubles.

Thanks,
Ludo’.

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2019-06-01 12:31 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-05-24 13:31 [bug#35880] [PATCH 0/7] Lzip support for 'guix publish' and 'guix substitute' Ludovic Courtès
2019-05-24 13:42 ` [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed' Ludovic Courtès
2019-05-24 13:42   ` [bug#35880] [PATCH 2/7] utils: Test 'compressed-port' and 'decompressed-port' for both gzip and xz Ludovic Courtès
2019-05-24 13:42   ` [bug#35880] [PATCH 3/7] utils: Support compression and decompression with lzip Ludovic Courtès
2019-05-25 17:27     ` Pierre Neidhardt
2019-05-26 19:52       ` Ludovic Courtès
2019-05-24 13:42   ` [bug#35880] [PATCH 4/7] publish: Add support for lzip Ludovic Courtès
2019-05-24 13:42   ` [bug#35880] [PATCH 5/7] self: Add dependency on lzlib Ludovic Courtès
2019-05-24 13:42   ` [bug#35880] [PATCH 6/7] gnu: guix: " Ludovic Courtès
2019-05-24 13:42   ` [bug#35880] [PATCH 7/7] lzlib: 'lzread!' never returns more than it was asked for Ludovic Courtès
2019-05-25 17:31     ` Pierre Neidhardt
2019-05-26 19:54       ` Ludovic Courtès
2019-05-26 20:57         ` Pierre Neidhardt
2019-05-26 21:28           ` Ludovic Courtès
2019-05-27  7:00             ` Pierre Neidhardt
2019-05-27 10:00               ` Ludovic Courtès
2019-05-25 17:24   ` [bug#35880] [PATCH 1/7] lzlib: Add 'make-lzip-input-port/compressed' Pierre Neidhardt
2019-05-26 19:51     ` Ludovic Courtès
2019-05-27 15:45       ` Ludovic Courtès
2019-05-27 16:24         ` Pierre Neidhardt
2019-05-27 20:53           ` bug#35880: " Ludovic Courtès
2019-05-27 21:12             ` [bug#35880] " Pierre Neidhardt
2019-05-28  7:52               ` Ludovic Courtès
2019-05-28  8:46                 ` Pierre Neidhardt
2019-05-28 13:47                   ` Ludovic Courtès
2019-05-29 14:57                     ` Pierre Neidhardt
2019-05-31 20:54                       ` Ludovic Courtès
2019-06-01  6:02                         ` Pierre Neidhardt
2019-06-01  9:41                           ` Ludovic Courtès
2019-06-01  9:58                             ` Pierre Neidhardt
2019-06-01 12:21                               ` Ludovic Courtès

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).