From: Mark H Weaver
Newsgroups: gmane.lisp.guile.bugs
Subject: bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports
Date: Thu, 11 Jan 2018 16:55:38 -0500
Message-ID: <87a7xkxdph.fsf@netris.org>
References: <87zi5lrc3x.fsf@gnu.org> <87tvvtr9ge.fsf@gnu.org> <87fu7dptdn.fsf@igalia.com> <87o9m08nx2.fsf@gnu.org> <87fu7cf9wk.fsf@netris.org> <87po6gnm6y.fsf@gnu.org>
In-Reply-To: <87po6gnm6y.fsf@gnu.org>
To: ludo@gnu.org (Ludovic Courtès)
Cc: Andy Wingo, 30066@debbugs.gnu.org

ludo@gnu.org (Ludovic Courtès) writes:

> Mark H Weaver skribis:
>
>> ludo@gnu.org (Ludovic Courtès) writes:
>
> [...]
>
>>> +  if (SCM_UNBUFFEREDP (port) && (avail < max_buffer_size))
>>> +    {
>>> +      /* PORT is unbuffered.  Read as much as possible from PORT.  */
>>> +      size_t read;
>>> +
>>> +      bv = scm_c_make_bytevector (max_buffer_size);
>>> +      scm_port_buffer_take (buf, (scm_t_uint8 *) SCM_BYTEVECTOR_CONTENTS (bv),
>>> +                            avail, cur, avail);
>>> +
>>> +      read = scm_i_read_bytes (port, bv, avail,
>>> +                               SCM_BYTEVECTOR_LENGTH (bv) - avail);
>>
>> Here's the R6RS specification for 'get-bytevector-some':
>>
>>   "Reads from BINARY-INPUT-PORT, blocking as necessary, until bytes are
>>   available from BINARY-INPUT-PORT or until an end of file is reached.
>>   If bytes become available, 'get-bytevector-some' returns a freshly
>>   allocated bytevector containing the initial available bytes (at least
>>   one), and it updates BINARY-INPUT-PORT to point just past these
>>   bytes.  If no input bytes are seen before an end of file is reached,
>>   the end-of-file object is returned."
>>
>> By my reading of this, we should block only if necessary to ensure that
>> we return at least one byte (or EOF).  In other words, if we can return
>> at least one byte (or EOF), then we must not block, which means that we
>> must not initiate another 'read'.
>
> Indeed.  So perhaps the condition above should be changed to:
>
>   if (SCM_UNBUFFEREDP (port) && (avail == 0))
>
> ?

That won't work, because the earlier call to 'scm_fill_input' will have
already initiated a 'read' if the buffer was empty.  The read buffer
size will determine the maximum number of bytes read, which will be 1 in
the case of an unbuffered port.  So, at the point of this condition,
'avail == 0' will occur only if EOF was encountered, in which case you
must return EOF without attempting another 'read'.

In order to avoid unnecessary blocking, there must be only one 'read'
call, and it must be initiated only if the buffer was already empty.

So, in order to accomplish your goal here, I don't see how you can use
'scm_fill_input', unless you temporarily increase the size of the read
buffer beforehand.

Instead, I think you need to first check if the read buffer contains any
bytes.  If so, empty the buffer and return them.  If the buffer is
empty, the next thing to check is 'scm_port_buffer_has_eof_p'.  If it's
set, then clear that flag and return EOF.  Otherwise, if the buffer is
empty and 'scm_port_buffer_has_eof_p' is false, then you must do what
'scm_fill_input' would have done, except using your larger buffer
instead of the port's internal read buffer.  In particular, you must
first switch the port to "reading" mode, flushing the write buffer if
'rw_random' is set.

Also, I'd prefer to move this code to ports.c in order to avoid adding
more internal declarations to ports.h and changing more functions from
'static' to global functions.

>> Out of curiosity, is there a reason why you're using an unbuffered port
>> in your use case?
>
> It’s to implement redirect à la socat:
>
>   https://git.savannah.gnu.org/cgit/guix.git/commit/?id=17af5d51de7c40756a4a39d336f81681de2ba447

Why is an unbuffered port being used here?  Can we change it to a
buffered port?

      Mark
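
For illustration only, the control flow described above (return any
buffered bytes, else report a pending EOF, else issue exactly one read
into the larger buffer) might be sketched as the toy model below.
Everything in it is hypothetical: 'struct toy_port', 'toy_read',
'toy_get_some' and 'TOY_EOF' are stand-ins invented for this sketch and
are not Guile's port API.  The switch to "reading" mode and the
write-buffer flush are only noted in a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of a port.  Hypothetical; NOT Guile's real port structure.  */
struct toy_port
{
  unsigned char buf[8];       /* internal read buffer */
  size_t cur, end;            /* unread bytes are buf[cur..end) */
  int has_eof;                /* EOF was seen by an earlier read */
  const unsigned char *src;   /* stand-in for the underlying descriptor */
  size_t src_len, src_pos;
  int reads;                  /* number of read calls issued so far */
};

#define TOY_EOF ((size_t) -1)

/* Stand-in for read(2): copies up to N bytes, returns 0 at end of input.  */
size_t
toy_read (struct toy_port *p, unsigned char *dst, size_t n)
{
  size_t left = p->src_len - p->src_pos;
  if (n > left)
    n = left;
  if (n > 0)
    memcpy (dst, p->src + p->src_pos, n);
  p->src_pos += n;
  p->reads++;
  return n;
}

/* Store "some" bytes in OUT; return their count, or TOY_EOF.  */
size_t
toy_get_some (struct toy_port *p, unsigned char *out, size_t out_size)
{
  size_t avail = p->end - p->cur;
  size_t n;

  assert (out_size > 0);

  if (avail > 0)
    {
      /* 1. Bytes are already buffered: return them, do not read.  */
      if (avail > out_size)
        avail = out_size;
      memcpy (out, p->buf + p->cur, avail);
      p->cur += avail;
      return avail;
    }

  if (p->has_eof)
    {
      /* 2. A pending EOF: clear the flag and report it, do not read.  */
      p->has_eof = 0;
      return TOY_EOF;
    }

  /* 3. Empty buffer and no pending EOF: exactly one read, straight into
     the caller's larger buffer.  (A real port would first be switched
     to "reading" mode here, flushing its write buffer if needed.)  */
  n = toy_read (p, out, out_size);
  return n == 0 ? TOY_EOF : n;
}
```

In this model a call never issues more than one read, and it issues
none at all when either buffered bytes or a pending EOF can be returned
immediately.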