From mboxrd@z Thu Jan 1 00:00:00 1970
From: ludo@gnu.org (Ludovic Courtès)
Subject: Re: Utf8 error
Date: Wed, 30 Jan 2013 23:23:38 +0100
Message-ID: <87ip6efgyd.fsf@gnu.org>
References: <201301302227.03563.andreas@enge.fr>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Received: from eggs.gnu.org ([208.118.235.92]:53104)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1U0g4X-0005uN-06 for bug-guix@gnu.org; Wed, 30 Jan 2013 17:24:01 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1U0g4V-0001Zo-PD
	for bug-guix@gnu.org; Wed, 30 Jan 2013 17:24:00 -0500
Received: from mail3-relais-sop.national.inria.fr ([192.134.164.104]:54418)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1U0g4V-0001Zg-IE for bug-guix@gnu.org; Wed, 30 Jan 2013 17:23:59 -0500
In-Reply-To: <201301302227.03563.andreas@enge.fr> (Andreas Enge's message of
	"Wed, 30 Jan 2013 22:27:03 +0100")
List-Id: Bug reports for GNU Guix
Errors-To: bug-guix-bounces+gcggb-bug-guix=m.gmane.org@gnu.org
Sender: bug-guix-bounces+gcggb-bug-guix=m.gmane.org@gnu.org
To: Andreas Enge
Cc: bug-guix@gnu.org

--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

Andreas Enge skribis:

>  385: 2 [process-stderr #]
>  170: 1 [read-string #]
> In unknown file:
>    ?: 0 [utf8->string #vu8(115 97 109 112 108 101 95 114 97 116 101 95 105

That’s because the build log contains a non-UTF-8 sequence, and
store.scm expects UTF-8 (for no good reason).

The attached patch removes that UTF-8 assumption.  Can you test whether
it fixes the problem?
--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline

diff --git a/guix/store.scm b/guix/store.scm
index 668bc9a..560e567 100644
--- a/guix/store.scm
+++ b/guix/store.scm
@@ -175,6 +175,14 @@
         (get-bytevector-n p (- 8 m)))
     str))
 
+(define (read-latin1-string p)
+  (let* ((len (read-int p))
+         (m   (modulo len 8))
+         (str (get-string-n p len)))
+    (or (zero? m)
+        (get-bytevector-n p (- 8 m)))
+    str))
+
 (define (write-string-list l p)
   (write-int (length l) p)
   (for-each (cut write-string <> p) l))
@@ -362,7 +370,11 @@ operate, should the disk become full.  Return a server object."
   "Read standard output and standard error from SERVER, writing it to
 CURRENT-BUILD-OUTPUT-PORT.  Return #t when SERVER is done sending data, and
 #f otherwise; in the latter case, the caller should call `process-stderr'
-again until #t is returned or an error is raised."
+again until #t is returned or an error is raised.
+
+Since the build process's output cannot be assumed to be UTF-8, we
+conservatively consider it to be Latin-1, thereby avoiding possible
+encoding conversion errors."
 
   (define p
     (nix-server-socket server))
@@ -375,18 +387,18 @@ again until #t is returned or an error is raised."
 
   (let ((k (read-int p)))
     (cond ((= k %stderr-write)
-           (read-string p)
+           (read-latin1-string p)
            #f)
           ((= k %stderr-read)
            (let ((len (read-int p)))
-             (read-string p)                      ; FIXME: what to do?
+             (read-latin1-string p)               ; FIXME: what to do?
              #f))
           ((= k %stderr-next)
-           (let ((s (read-string p)))
+           (let ((s (read-latin1-string p)))
              (display s (current-build-output-port))
              #f))
           ((= k %stderr-error)
-           (let ((error (read-string p))
+           (let ((error (read-latin1-string p))
                  (status (if (>= (nix-server-minor-version server) 8)
                              (read-int p)
                              1)))

--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

Thanks,
Ludo’.

--=-=-=--
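
To illustrate why treating the build log as Latin-1 is the conservative choice, here is a minimal Guile sketch. It is not part of the patch: `latin1->string` and the sample bytes are hypothetical names introduced only for this example. In Latin-1 every byte value 0..255 maps to the code point of the same number, so decoding cannot fail, whereas `utf8->string` is expected to raise an error on a byte sequence that is not valid UTF-8.

```scheme
(use-modules (rnrs bytevectors))

;; Hypothetical helper (not from store.scm): decode BV as ISO-8859-1.
;; Each byte becomes the character with the same code point, so any
;; byte sequence is accepted.
(define (latin1->string bv)
  (list->string (map integer->char (bytevector->u8-list bv))))

;; Sample data (an assumption for this example): "caf" followed by a
;; lone 0xE9 byte, which is "é" in Latin-1 but a truncated multi-byte
;; sequence in UTF-8.
(define bytes (u8-list->bytevector '(99 97 102 233)))

;; Always succeeds.
(define decoded (latin1->string bytes))

;; Expected to fail on this input; catch the error and report its key.
(define utf8-result
  (catch #t
    (lambda () (utf8->string bytes))
    (lambda (key . args)
      (format #t "utf8->string failed: ~a~%" key)
      #f)))
```

The trade-off is that text which really is UTF-8 gets shown as individual Latin-1 characters, which seems acceptable for a build log whose encoding is unknown.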