* bug#61722: (guix cpio) produces corrupted archives when there are non-ASCII filenames
From: Maxim Cournoyer @ 2023-02-23 3:14 UTC
To: 61722
Hi,
It appears that the code we have to generate CPIO archives doesn't
properly handle non-ASCII characters in the names of the files to be
archived.
First, to make rpm usable on a Guix System:
--8<---------------cut here---------------start------------->8---
# mkdir /var/lib/rpm
# chown root:users /var/lib/rpm
# chmod g+rw /var/lib/rpm
--8<---------------cut here---------------end--------------->8---
Then, produce a problematic CPIO archive via 'guix pack -f rpm', which
uses (guix cpio):
--8<---------------cut here---------------start------------->8---
$ rpm_archive=$(guix pack -R -C none -f rpm nss-certs)
--8<---------------cut here---------------end--------------->8---
Notice that it cannot be installed:
--8<---------------cut here---------------start------------->8---
$ mkdir /tmp/nss-certs
# rpm --prefix=/tmp/nss-certs -i $rpm_archive
error: unpacking of archive failed: cpio: Bad magic
error: nss-certs-3.81-0.x86_64: install failed
--8<---------------cut here---------------end--------------->8---
Let's now inspect the CPIO archive itself:
--8<---------------cut here---------------start------------->8---
$ guix shell rpm cpio
[env]$ rpm2cpio $rpm_archive > nss-certs.cpio
[env]$ cpio -t < nss-certs.cpio |& grep -B3 junk
./gnu/store/1klwvqm3njp070h982ydcix1gzf2zmdl-nss-certs-3.81/etc/ssl/certs/9482e63a.0
./gnu/store/1klwvqm3njp070h982ydcix1gzf2zmdl-nss-certs-3.81/etc/ssl/certs/9846683b.0
./gnu/store/1klwvqm3njp070h982ydcix1gzf2zmdl-nss-certs-3.81/etc/ssl/certs/988a38cb.0
cpio: warning: skipped 248 bytes of junk
--
./gnu/store/1klwvqm3njp070h982ydcix1gzf2zmdl-nss-certs-3.81/etc/ssl/certs/Microsoft_RSA_Root_Certificate_Authority_2017.pem
./gnu/store/1klwvqm3njp070h982ydcix1gzf2zmdl-nss-certs-3.81/etc/ssl/certs/NAVER_Global_Root_Certification_Authority.pem
./gnu/store/1klwvqm3njp070h982ydcix1gzf2zmdl-nss-certs-3.81/etc/ssl/certs/NetLock_Arany_=Class_Gold=_Főtanúsítvány.
cpio: warning: skipped 4 bytes of junk
--8<---------------cut here---------------end--------------->8---
I haven't yet pinpointed what the problem is.
I could do with extra eyes :-).
--
Thanks,
Maxim
* bug#61722: [PATCH] cpio: Properly handle Unicode characters in file names.
From: Maxim Cournoyer @ 2023-02-24 4:54 UTC
To: 61722
Cc: Josselin Poiret, Tobias Geerinckx-Rice, Maxim Cournoyer,
Simon Tournier, Mathieu Othacehe, Ludovic Courtès,
Christopher Baines, Ricardo Wurmus
Fixes <https://issues.guix.gnu.org/61722>.
* guix/cpio.scm (file->cpio-header): Compute the file name length in bytes rather than in
characters.
(file->cpio-header*, special-file->cpio-header*): Likewise.
(write-cpio-archive): Likewise, and write the file name as UTF-8 bytes, not
textually, to avoid encoding it as ISO-8859-1.
---
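For illustration only (not part of the patch), here is a minimal Guile
sketch of the mismatch addressed here.  It uses the file name exactly as
it appears, truncated, in the 'cpio -t' output of the original report;
the variable names are made up for the example.

```scheme
(use-modules (rnrs bytevectors))

;; File name as listed (truncated) in the original report.
(define name "NetLock_Arany_=Class_Gold=_Főtanúsítvány.")

;; Length in characters, which is what the old code stored in #:name-size.
(define chars (string-length name))

;; Length in bytes once encoded as UTF-8, which is what ends up on disk.
(define bytes (bytevector-length (string->utf8 name)))

(format #t "characters:  ~a~%" chars)
(format #t "UTF-8 bytes: ~a~%" bytes)
(format #t "difference:  ~a~%" (- bytes chars))
```

The difference of 4 equals the number of non-ASCII characters in the
name, each of which takes two bytes in UTF-8.  That is consistent with
the 'skipped 4 bytes of junk' warning in the report, although the sketch
alone does not prove that this was the only cause.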
guix/cpio.scm | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/guix/cpio.scm b/guix/cpio.scm
index d4a7d5f1e0..8fd7552450 100644
--- a/guix/cpio.scm
+++ b/guix/cpio.scm
@@ -170,7 +170,8 @@ (define* (file->cpio-header file #:optional (file-name file)
#:size (stat:size st)
#:dev (stat:dev st)
#:rdev (stat:rdev st)
- #:name-size (string-length file-name))))
+ #:name-size (bytevector-length
+ (string->utf8 file-name)))))
(define* (file->cpio-header* file
#:optional (file-name file)
@@ -182,7 +183,8 @@ (define* (file->cpio-header* file
(make-cpio-header #:mode (stat:mode st)
#:nlink (stat:nlink st)
#:size (stat:size st)
- #:name-size (string-length file-name))))
+ #:name-size (bytevector-length
+ (string->utf8 file-name)))))
(define* (special-file->cpio-header* file
device-type
@@ -201,7 +203,8 @@ (define* (special-file->cpio-header* file
permission-bits)
#:nlink 1
#:rdev (device-number device-major device-minor)
- #:name-size (string-length file-name)))
+ #:name-size (bytevector-length
+ (string->utf8 file-name))))
(define %trailer
"TRAILER!!!")
@@ -237,7 +240,7 @@ (define (dump-file file)
;; We're padding the header + following file name + trailing zero, and
;; the header is 110 byte long.
- (write-padding (+ 110 1 (string-length file)) port)
+ (write-padding (+ 110 (bytevector-length (string->utf8 file)) 1) port)
(case (mode->type (cpio-header-mode header))
((regular)
@@ -246,7 +249,7 @@ (define (dump-file file)
(dump-port input port))))
((symlink)
(let ((target (readlink file)))
- (put-string port target)))
+ (put-bytevector port (string->utf8 target))))
((directory)
#t)
((block-special)
base-commit: c756c62cfdba8d4079be1ba9e370779b850f16b6
--
2.39.1
* bug#61722: [PATCH] cpio: Properly handle Unicode characters in file names.
From: Mark H Weaver @ 2023-02-24 11:46 UTC
To: Maxim Cournoyer, 61722
Cc: Josselin Poiret, Christopher Baines, Maxim Cournoyer,
Simon Tournier, Mathieu Othacehe, Ludovic Courtès,
Tobias Geerinckx-Rice, Ricardo Wurmus
Hi Maxim,
Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
> Fixes <https://issues.guix.gnu.org/61722>.
>
> * guix/cpio.scm (file->cpio-header): Compute the file name length in bytes rather than in
> characters.
> (file->cpio-header*, special-file->cpio-header*): Likewise.
> (write-cpio-archive): Likewise, and write the file name as UTF-8 bytes, not
> textually, to avoid encoding it as ISO-8859-1.
>
> ---
>
> guix/cpio.scm | 13 ++++++++-----
> 1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/guix/cpio.scm b/guix/cpio.scm
> index d4a7d5f1e0..8fd7552450 100644
> --- a/guix/cpio.scm
> +++ b/guix/cpio.scm
> @@ -170,7 +170,8 @@ (define* (file->cpio-header file #:optional (file-name file)
> #:size (stat:size st)
> #:dev (stat:dev st)
> #:rdev (stat:rdev st)
> - #:name-size (string-length file-name))))
> + #:name-size (bytevector-length
> + (string->utf8 file-name)))))
(string-utf8-length file-name) would produce the same result more
efficiently.
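A small sketch of that equivalence, assuming a Guile version that
provides string-utf8-length as a core binding; the sample strings and
the helper name are made up for illustration:

```scheme
(use-modules (rnrs bytevectors))

;; Hypothetical helper: the approach from the first patch, which allocates
;; a temporary bytevector just to measure it.
(define (utf8-length/alloc str)
  (bytevector-length (string->utf8 str)))

;; Arbitrary sample strings, not taken from the archive.
(define samples '("plain-ascii.pem" "Főtanúsítvány" ""))

;; Print both lengths side by side for each sample.
(for-each (lambda (str)
            (format #t "~a ~a~%"
                    (utf8-length/alloc str)
                    (string-utf8-length str)))
          samples)
```

Both columns should match for every string; string-utf8-length simply
does not need the intermediate bytevector.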
Regards,
Mark
* bug#61722: [PATCH v2] cpio: Properly handle Unicode characters in file names.
From: Maxim Cournoyer @ 2023-02-24 13:26 UTC
To: 61722
Cc: Josselin Poiret, Tobias Geerinckx-Rice, Maxim Cournoyer,
Simon Tournier, mhw, Ludovic Courtès, Christopher Baines,
Ricardo Wurmus, Mathieu Othacehe
Fixes <https://issues.guix.gnu.org/61722>.
* guix/cpio.scm (file->cpio-header): Compute the file name length in bytes rather than in
characters.
(file->cpio-header*, special-file->cpio-header*): Likewise.
(write-cpio-archive): Likewise, and write the file name as UTF-8 bytes, not
textually, to avoid encoding it as ISO-8859-1.
---
Changes in v2:
- Use string-utf8-length
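
As an illustration of why the padding computation must also count bytes
(a sketch, not the code in guix/cpio.scm): entries are aligned on 4-byte
boundaries after the 110-byte header, the file name and its trailing
zero.  The padding-for helper and the sample file name below are
hypothetical.

```scheme
;; Hypothetical helper: number of zero bytes needed to bring OFFSET up to
;; the next multiple of 4.
(define (padding-for offset)
  (modulo (- 4 (modulo offset 4)) 4))

;; Made-up file name with one non-ASCII character ("é" is 2 bytes in UTF-8).
(define name "é.txt")

;; Header (110 bytes) + name + trailing zero, counted both ways.
(define offset/chars (+ 110 (string-length name) 1))
(define offset/bytes (+ 110 (string-utf8-length name) 1))

(format #t "padding from characters: ~a~%" (padding-for offset/chars))
(format #t "padding from bytes:      ~a~%" (padding-for offset/bytes))
```

Counting characters yields no padding here, whereas the bytes actually
written call for 3, so the next entry would start misaligned.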
guix/cpio.scm | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/guix/cpio.scm b/guix/cpio.scm
index d4a7d5f1e0..876f61ea3c 100644
--- a/guix/cpio.scm
+++ b/guix/cpio.scm
@@ -170,7 +170,7 @@ (define* (file->cpio-header file #:optional (file-name file)
#:size (stat:size st)
#:dev (stat:dev st)
#:rdev (stat:rdev st)
- #:name-size (string-length file-name))))
+ #:name-size (string-utf8-length file-name))))
(define* (file->cpio-header* file
#:optional (file-name file)
@@ -182,7 +182,7 @@ (define* (file->cpio-header* file
(make-cpio-header #:mode (stat:mode st)
#:nlink (stat:nlink st)
#:size (stat:size st)
- #:name-size (string-length file-name))))
+ #:name-size (string-utf8-length file-name))))
(define* (special-file->cpio-header* file
device-type
@@ -201,7 +201,7 @@ (define* (special-file->cpio-header* file
permission-bits)
#:nlink 1
#:rdev (device-number device-major device-minor)
- #:name-size (string-length file-name)))
+ #:name-size (string-utf8-length file-name)))
(define %trailer
"TRAILER!!!")
@@ -237,7 +237,7 @@ (define (dump-file file)
;; We're padding the header + following file name + trailing zero, and
;; the header is 110 byte long.
- (write-padding (+ 110 1 (string-length file)) port)
+ (write-padding (+ 110 (string-utf8-length file) 1) port)
(case (mode->type (cpio-header-mode header))
((regular)
@@ -246,7 +246,7 @@ (define (dump-file file)
(dump-port input port))))
((symlink)
(let ((target (readlink file)))
- (put-string port target)))
+ (put-bytevector port (string->utf8 target))))
((directory)
#t)
((block-special)
base-commit: c756c62cfdba8d4079be1ba9e370779b850f16b6
--
2.39.1
* bug#61722: (guix cpio) produces corrupted archives when there are non-ASCII filenames
From: Maxim Cournoyer @ 2023-02-25 19:52 UTC
To: 61722-done
Cc: Josselin Poiret, Christopher Baines, Simon Tournier, mhw,
Ludovic Courtès, Tobias Geerinckx-Rice, Ricardo Wurmus,
Mathieu Othacehe
Hi,
Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
> Fixes <https://issues.guix.gnu.org/61722>.
>
> * guix/cpio.scm (file->cpio-header): Compute the file name length in bytes rather than in
> characters.
> (file->cpio-header*, special-file->cpio-header*): Likewise.
> (write-cpio-archive): Likewise, and write the file name as UTF-8 bytes, not
> textually, to avoid encoding it as ISO-8859-1.
Pushed to master.
Closing.
--
Thanks,
Maxim