From: "Ludovic Courtès" <ludo@gnu.org>
To: Mark H Weaver <mhw@netris.org>
Cc: 35350@debbugs.gnu.org
Subject: bug#35350: Some compile output still leaks through with --verbosity=1
Date: Tue, 23 Apr 2019 12:12:34 +0200
Message-ID: <87imv5jai5.fsf@gnu.org>
In-Reply-To: <87ftq9silk.fsf@netris.org> (Mark H. Weaver's message of "Mon, 22 Apr 2019 19:52:28 -0400")
Hi Mark,
Mark H Weaver <mhw@netris.org> skribis:
> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Mark H Weaver <mhw@netris.org> skribis:
>>
>>> Sometimes when compiling a package with --verbosity=1, some parts of the
>>> compile output leak through. For example, see the transcript below.
>>
>> Weird.
>
> FWIW, a few observations, possibly relevant:
>
> (1) Each chunk of leaked output begins with 1 or 2 Unicode Replacement
> characters (U+FFFD). In the transcript I provided, the first leak
> began with 1 replacement char, and all later leaks began with 2.
>
> (2) The replacement characters are immediately followed by
> "@ build-log 30033 4096\n", and that string is also sprinkled
> throughout the leaked output, with approximately ~4060-4070
> characters of leaked output between each occurrence of
> "@ build-log 30033 4096\n".
Indeed. I managed to reproduce it while building modem-manager. I
strace’d ‘guix build’ with an additional ‘pk’¹ to see what happens, and
here’s what leads to the wrong “write(2, "�@ build-log…")” call:
--8<---------------cut here---------------start------------->8---
read(13, "gmlo\0\0\0\0", 8) = 8
read(13, "\27\20\0\0\0\0\0\0", 8) = 8
read(13, "@ build-log 22090 4096\n […] warning: \342\200", 4119) = 4119
read(13, "\0", 1) = 1
write(1, "\n", 1) = 1
write(1, ";;; (write 1008 <> #f 0)\n", 25) = 25
write(1, "\n", 1) = 1
write(1, ";;; (write 985 <> 22090 4096)\n", 30) = 30
write(1, "\n", 1) = 1
write(1, ";;; (write 1008 <> 22090 3111)\n", 31) = 31
write(1, "\n", 1) = 1
write(1, ";;; (write 1008 <> 22090 2103)\n", 31) = 31
write(1, "\n", 1) = 1
write(1, ";;; (write 1008 <> 22090 1095)\n", 31) = 31
write(1, "\n", 1) = 1
write(1, ";;; (write 88 <> 22090 87)\n", 27) = 27
write(2, "\r\33[K\\ 'build' phase", 19) = 19
[…]
write(2, "\r\33[K\\ 'build' phase", 19) = 19
write(1, "\n", 1) = 1
write(1, ";;; (write 1 <> #f 0)\n", 22) = 22
read(13, "gmlo\0\0\0\0", 8) = 8
read(13, "\27\20\0\0\0\0\0\0", 8) = 8
read(13, "@ build-log 22090 4096\n\230g_simple_async_result_take_error\342\200\231 is deprecated[…]", 4119) = 4119
read(13, "\0", 1) = 1
write(1, "\n", 1) = 1
write(1, ";;; (write 1008 <> #f 0)\n", 25) = 25
write(2, "\357\277\275@ build-log 22090 4096\n", 26) = 26
--8<---------------cut here---------------end--------------->8---
The third read(2) call here ends on a partial UTF-8 sequence for LEFT
SINGLE QUOTATION MARK: we get only the first two bytes of a three-byte
sequence.
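The numbers in the trace can be checked by hand. The sketch below is only an illustration in Python (the variable names are mine, not Guix's). It reads the second 8-byte word as a little-endian 64-bit length, assumes the daemon protocol pads strings to a multiple of 8 bytes, and confirms that the chunk ends on a truncated encoding of U+2018 whose last byte opens the next chunk's payload.

```python
import struct

# Bytes copied from the strace output above (octal escapes as strace prints them).
tag_word = b"gmlo\0\0\0\0"              # first read(13, ...)
length_word = b"\27\20\0\0\0\0\0\0"     # second read(13, ...)

(tag,) = struct.unpack("<Q", tag_word)        # little-endian 64-bit integer
(length,) = struct.unpack("<Q", length_word)

header = b"@ build-log 22090 4096\n"
payload_length = length - len(header)   # bytes of actual build output
padding = -length % 8                   # assumption: 8-byte alignment

# UTF-8 encoding of LEFT SINGLE QUOTATION MARK (U+2018).
quote = "\u2018".encode("utf-8")
tail_of_third_read = b"\342\200"        # last two bytes of the third read
start_of_next_payload = b"\230"         # first payload byte of the next chunk

print(hex(tag), length, payload_length, padding)
print(quote[:2] == tail_of_third_read, quote[2:] == start_of_next_payload)
```

So the 4119 bytes are the 23-byte “@ build-log 22090 4096\n” header plus 4096 bytes of output, and the 1-byte read that follows is consistent with padding.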
What happens is that ‘process-stderr’ in (guix store) gets that byte
string from the daemon, passes it through ‘read-maybe-utf8-string’,
which replaces the last two bytes with REPLACEMENT CHARACTER, which is
itself a 3-byte sequence.
Thus, we have this extra byte that’s being inserted. That confuses the
whole machinery since the build log was announced as being 4096 bytes
long, and it’s now 4097 bytes long.
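To make the off-by-one concrete, here is a minimal sketch in Python. Its lenient UTF-8 decoder merely stands in for ‘read-maybe-utf8-string’, on the assumption that both substitute a single U+FFFD for the truncated sequence; the short byte string is a stand-in for the end of the 4096-byte payload.

```python
# Tail of the payload as seen in the trace: text ending in the first two
# bytes (E2 80) of the three-byte UTF-8 encoding of U+2018.
raw = b"warning: \xe2\x80"

# Lenient decoding: the truncated sequence becomes one U+FFFD.
decoded = raw.decode("utf-8", errors="replace")

# Writing the string back out encodes U+FFFD as three bytes (EF BF BD).
reencoded = decoded.encode("utf-8")

growth = len(reencoded) - len(raw)
print(len(raw), len(reencoded), growth)
```

Two bytes go in and three come out, so a payload announced as 4096 bytes would reach the next layer as 4097 bytes.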
Internally, ‘build-event-output-port’ keeps the last byte of the
REPLACEMENT CHARACTER sequence in the ‘%fragments’ buffer.
Consequently, the “@ build-log” string that comes next doesn’t start on
a newline, and thus it is considered build output. Since the first byte
does not constitute a valid UTF-8 sequence, another REPLACEMENT
CHARACTER is inserted there when it gets printed.
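This can be reproduced outside Guix with a few lines of Python. It is a sketch of the byte arithmetic only, not of ‘build-event-output-port’ itself, and it assumes the stray byte is the last byte (BD) of the encoded REPLACEMENT CHARACTER.

```python
# Assumption: the one byte left in the buffer is the final byte of the
# UTF-8 encoding of U+FFFD (EF BF BD), as described above.
leftover = "\ufffd".encode("utf-8")[-1:]

next_header = b"@ build-log 22090 4096\n"
line = leftover + next_header

# The line no longer starts with "@ ", so it is not taken as an event.
looks_like_header = line.startswith(b"@ ")

# A lone continuation byte is invalid UTF-8, so printing the line with
# lenient decoding yields a fresh U+FFFD in front of the header text.
printed = line.decode("utf-8", errors="replace").encode("utf-8")

print(looks_like_header, printed, len(printed))
```

The result is 26 bytes long, which matches the write(2, "\357\277\275@ build-log 22090 4096\n", 26) call in the trace.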
So ‘build-event-output-port’ is working as expected. The problem is the
first layer of UTF-8 decoding that happens in ‘process-stderr’, in the
‘%stderr-next’ case. We would need to disable it, but only if the build
output port is ‘build-event-output-port’ (i.e., it’s capable of
interpreting “multiplexed build output” correctly).
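To illustrate why decoding has to come after demultiplexing, here is a small Python sketch. It is not the Guix change: ‘demultiplex’ is a hypothetical helper, the chunk contents and lengths are made up for the example, and an incremental decoder is swapped in to show one way of reassembling a character split across chunks once the byte counts have been honoured.

```python
import codecs

def demultiplex(chunks):
    """Yield payload bytes from raw chunks.

    Hypothetical helper: each chunk is assumed to hold exactly one
    "@ build-log PID LEN\n" header followed by LEN payload bytes.
    """
    for chunk in chunks:
        header, _, payload = chunk.partition(b"\n")
        _at, _event, _pid, length = header.split(b" ")
        # Byte counts are checked on the raw bytes, before any decoding.
        assert len(payload) == int(length)
        yield payload

# U+2018 (E2 80 98) split across two chunks, as in the trace.
# The lengths and the "g_foo" text are invented for this example.
chunks = [
    b"@ build-log 22090 11\nwarning: \xe2\x80",
    b"@ build-log 22090 6\n\x98g_foo",
]

# The incremental decoder keeps the partial sequence until it is completed.
decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
text = "".join(decoder.decode(payload) for payload in demultiplex(chunks))
text += decoder.decode(b"", final=True)

print(text)
```

Because the length check runs on untouched bytes, the announced counts stay valid, and the quotation mark comes out whole instead of as replacement characters.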
Thanks,
Ludo’.
¹ pk:
diff --git a/guix/status.scm b/guix/status.scm
index cbea4151f2..4dcbcb0c1f 100644
--- a/guix/status.scm
+++ b/guix/status.scm
@@ -717,6 +717,7 @@ The second return value is a thunk to retrieve the current state."
(pointer->bytevector ptr count)))
(define (write! bv offset count)
+ (pk 'write count '<> %build-output-pid %build-output-left)
(if %build-output-pid
(let ((keep (min count %build-output-left)))
(set! %build-output