From: Andy Wingo
Newsgroups: gmane.lisp.guile.bugs
Subject: bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports
Date: Fri, 12 Jan 2018 10:01:11 +0100
Message-ID: <87373bpi20.fsf@igalia.com>
References: <87zi5lrc3x.fsf@gnu.org> <87tvvtr9ge.fsf@gnu.org> <87fu7dptdn.fsf@igalia.com> <87o9m08nx2.fsf@gnu.org> <87fu7cf9wk.fsf@netris.org> <87po6gnm6y.fsf@gnu.org> <87a7xkxdph.fsf@netris.org>
In-Reply-To: <87a7xkxdph.fsf@netris.org> (Mark H. Weaver's message of "Thu, 11 Jan 2018 16:55:38 -0500")
To: Mark H Weaver
Cc: Ludovic Courtès, 30066@debbugs.gnu.org

On Thu 11 Jan 2018 22:55, Mark H Weaver writes:

> ludo@gnu.org (Ludovic Courtès) writes:
>
>> Mark H Weaver writes:
>>
>>> ludo@gnu.org (Ludovic Courtès) writes:
>>
>> [...]
>>
>>>> +  if (SCM_UNBUFFEREDP (port) && (avail < max_buffer_size))
>>>> +    {
>>>> +      /* PORT is unbuffered.  Read as much as possible from PORT.  */
>>>> +      size_t read;
>>>> +
>>>> +      bv = scm_c_make_bytevector (max_buffer_size);
>>>> +      scm_port_buffer_take (buf, (scm_t_uint8 *) SCM_BYTEVECTOR_CONTENTS (bv),
>>>> +                            avail, cur, avail);
>>>> +
>>>> +      read = scm_i_read_bytes (port, bv, avail,
>>>> +                               SCM_BYTEVECTOR_LENGTH (bv) - avail);
>>>
>>> Here's the R6RS specification for 'get-bytevector-some':
>>>
>>>   "Reads from BINARY-INPUT-PORT, blocking as necessary, until bytes are
>>>   available from BINARY-INPUT-PORT or until an end of file is reached.
>>>   If bytes become available, 'get-bytevector-some' returns a freshly
>>>   allocated bytevector containing the initial available bytes (at least
>>>   one), and it updates BINARY-INPUT-PORT to point just past these
>>>   bytes.
>>>   If no input bytes are seen before an end of file is reached,
>>>   the end-of-file object is returned."
>>>
>>> By my reading of this, we should block only if necessary to ensure that
>>> we return at least one byte (or EOF).  In other words, if we can return
>>> at least one byte (or EOF), then we must not block, which means that we
>>> must not initiate another 'read'.
>>
>> Indeed.  So perhaps the condition above should be changed to:
>>
>>   if (SCM_UNBUFFEREDP (port) && (avail == 0))
>>
>> ?
>
> That won't work, because the earlier call to 'scm_fill_input' will have
> already initiated a 'read' if the buffer was empty.  The read buffer
> size will determine the maximum number of bytes read, which will be 1 in
> the case of an unbuffered port.  So, at the point of this condition,
> 'avail == 0' will occur only if EOF was encountered, in which case you
> must return EOF without attempting another 'read'.
>
> In order to avoid unnecessary blocking, there must be only one 'read'
> call, and it must be initiated only if the buffer was already empty.
>
> So, in order to accomplish your goal here, I don't see how you can use
> 'scm_fill_input', unless you temporarily increase the size of the read
> buffer beforehand.
>
> Instead, I think you need to first check if the read buffer contains any
> bytes.  If so, empty the buffer and return them.  If the buffer is
> empty, the next thing to check is 'scm_port_buffer_has_eof_p'.  If it's
> set, then clear that flag and return EOF.
>
> Otherwise, if the buffer is empty and 'scm_port_buffer_has_eof_p' is
> false, then you must do what 'scm_fill_input' would have done, except
> using your larger buffer instead of the port's internal read buffer.  In
> particular, you must first switch the port to "reading" mode, flushing
> the write buffer if 'rw_random' is set.
>
> Also, I'd prefer to move this code to ports.c in order to avoid adding
> more internal declarations to ports.h and changing more functions from
> 'static' to global functions.

I agree with Mark here -- thanks for the close review.

>>> Out of curiosity, is there a reason why you're using an unbuffered port
>>> in your use case?
>>
>> It’s to implement redirect à la socat:
>>
>>   https://git.savannah.gnu.org/cgit/guix.git/commit/?id=17af5d51de7c40756a4a39d336f81681de2ba447
>
> Why is an unbuffered port being used here?  Can we change it to a
> buffered port?

This was also a question I had!  If you make it a buffered port at 4096
bytes (for example), then get-bytevector-some works exactly like you
want it to, no?

Andy
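
For anyone following the thread, the three-way check Mark describes
(return buffered bytes if any, else report a pending EOF, else issue
exactly one read into the caller's larger buffer) can be sketched as a
toy model in plain C.  This is only an illustration under stated
assumptions, not the eventual patch: 'struct toy_port', 'toy_read',
'toy_get_some' and 'TOY_EOF' are made-up names rather than Guile
internals, the data source is simulated, and the switch to "reading"
mode is reduced to a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a port; none of these names are Guile's.  */
struct toy_port
{
  unsigned char buf[1];   /* read buffer of an "unbuffered" port */
  size_t buffered;        /* number of bytes currently held in BUF */
  int has_eof;            /* an earlier read already hit EOF */
  const char *src;        /* simulated underlying data source */
  size_t src_len;
  size_t src_pos;
  size_t chunk;           /* most bytes a single read will deliver */
  int reads;              /* count of underlying reads issued */
};

#define TOY_EOF ((long) -1)

/* Simulated read(2): deliver at most CHUNK bytes per call, 0 at EOF.  */
static size_t
toy_read (struct toy_port *p, unsigned char *dst, size_t max)
{
  size_t n = p->src_len - p->src_pos;

  if (n > p->chunk)
    n = p->chunk;
  if (n > max)
    n = max;
  memcpy (dst, p->src + p->src_pos, n);
  p->src_pos += n;
  p->reads++;
  return n;
}

/* Store some bytes in DST and return how many, or return TOY_EOF.
   Issues at most one underlying read, and only if nothing is buffered.  */
static long
toy_get_some (struct toy_port *p, unsigned char *dst, size_t max)
{
  size_t n;

  assert (max >= sizeof p->buf);

  /* 1. Bytes already buffered: hand them over without reading.  */
  if (p->buffered > 0)
    {
      n = p->buffered;
      memcpy (dst, p->buf, n);
      p->buffered = 0;
      return (long) n;
    }

  /* 2. EOF already pending: clear the flag and report it.  */
  if (p->has_eof)
    {
      p->has_eof = 0;
      return TOY_EOF;
    }

  /* 3. Nothing buffered: a real port would first switch to "reading"
     mode (flushing the write buffer if needed); then read once,
     straight into the caller's larger buffer.  */
  n = toy_read (p, dst, max);
  return n > 0 ? (long) n : TOY_EOF;
}
```

With a 12-byte source delivered in chunks of at most 8 bytes and a
4096-byte destination, successive calls return 8, then 4, then TOY_EOF,
for three underlying reads in total; a call made while a byte is
already buffered returns that byte and issues no read at all.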