From: Andy Wingo
Newsgroups: gmane.lisp.guile.devel
Subject: Re: wip-ports-refactor
Date: Tue, 12 Apr 2016 11:33:49 +0200
Message-ID: <87k2k3marm.fsf@pobox.com>
References: <87twjempnf.fsf@pobox.com>
In-Reply-To: <87twjempnf.fsf@pobox.com> (Andy Wingo's message of "Wed, 06 Apr 2016 22:46:28 +0200")
To: guile-devel@gnu.org

On Wed 06 Apr 2016 22:46, Andy Wingo writes:

> I have been working on a refactor to ports.

The status is that in wip-ports-refactor I have changed the internal
port implementation to always have buffers, and that those buffers are
always bytevectors (internally encapsulated in a scm_t_port_buffer
struct that has cursors into the buffer).  In that way we should be able
to access the port buffers from Scheme safely.

The end-game is to allow programs like Scheme's `read' to be
suspendable.  That means, whenever they would peek-char/read-char and
input is unavailable, the program would suspend to the scheduler by
aborting to a prompt, and resume the resulting continuation when input
becomes available.  Likewise for writing.

To do this, all port functions need to be implemented in Scheme, because
for a delimited continuation to be resumed, it has to only capture
Scheme activations, not C activations.  This is obviously a gnarly task.
It still makes sense to have C functions that work on ports -- and
specifically, that C have access to the port buffers.  But it would be
fine for C ports to call out to Scheme to fill their read buffers /
flush their write buffers.

So the near-term is to move the read/write/etc ptob methods to be Scheme
functions -- probably gsubr wrappers for now (for the existing port
types).

Then we need to start allowing I/O functions to be implemented in Scheme
-- in (ice-9 ports) or so.  But, you don't want Scheme code to have to
import (ice-9 ports).  You want existing code that uses read-char and so
on to become suspendable.  So, we will replace core I/O bindings in
boot-9 with imported bindings from (ice-9 ports).  That will also allow
us to trim the set of bindings defined in boot-9 itself (before (ice-9
ports) is loaded) to the minimum set that is needed to boot Guile.

So the plan is:

 1. Create (ice-9 ports) module

    - it will do load-extension to cause ports.c to define I/O routines

    - it exports all I/O routines that are exported by ports.c, and
      perhaps by other files as well

    - bindings from (ice-9 ports) are imported into boot-9, augmenting
      the minimal set of bindings defined in boot-9, and replacing the
      existing minimal bindings via set!

 2. Add Scheme interface to port buffers, make internal to (ice-9 ports)

    - this should allow I/O routines to get a port's read or write
      buffers, grovel in the bytes, update cursors, and call the read or
      write functions to fill or empty them

 3. Start rewriting I/O routines in Scheme

 4. Add/adapt a non-blocking interface

    - Currently port read/write functions are blocking.  Probably we
      should change their semantics to be nonblocking.  This would allow
      Guile to detect when to suspend a computation.

    - Nonblocking ports need an FD to select on; if they don't have one,
      a write or read that consumes 0 bytes indicates EOF

    - Existing blocking interfaces would be shimmed by "select"-ing on
      the port until it's writable in a loop

 5. Add "current read waiter" / "current write waiter" abstraction from
    the ethreads branch

    - These are parameters (dynamic bindings) that are procedures that
      define what to do when a read or write would block.  By default I
      think probably they should select in a loop to emulate blocking
      behavior.  They could be parameterized to suspend the computation
      to a scheduler though.

Finally there is a question about speed.  I expect that for buffered
ports, I/O from C will have a minimal slowdown.  For unbuffered ports,
the slowdown would be more, because the cost of filling and emptying
ports is higher with a call from C to Scheme (and then back, for
read/write functions actually implemented in C.)
But for Scheme, I expect that generally throughput goes up, as we will
be able to build flexible I/O routines that can access the buffer
directly, both because with this branch buffering is uniformly handled
in the generic port code, and also because Scheme avoids the Scheme->C
penalty in common cases.  We can provide compiler support for accessing
the port buffer, if needed, but hopefully we can avoid that.

Finally finally, there is still the question about locks.  I don't know
the answer here.  I think it's likely that we can have concurrent access
to port buffers without locks, but I suspect that anything that accesses
mutable port state should probably be protected by a lock -- but
probably not a re-entrant lock, because the operations called with that
lock wouldn't call out to any user code.  That means that read/write
functions from port implementations would have to bake in their own
threadsafety, but probably that's OK; for file ports, for example, the
threadsafety is baked in the kernel.  Atomic accessors are also a
possibility if there is still overhead.  I think also we could remove
all of the _unlocked functions from our API and from our internals in
that case, and just lock as appropriate, understanding that the perf
impact should be minimal.

Andy
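
To make steps 4 and 5 of the plan a bit more concrete, here is a rough
sketch in C of a blocking read shimmed over a nonblocking file
descriptor, with a pluggable "read waiter".  Every name in it (struct
port, struct port_buffer, read_waiter_fn, wait_by_polling,
fill_read_buffer, port_read_byte, port_init) is made up for illustration
and is not the wip-ports-refactor API.  It also assumes POSIX poll()
rather than select(), and it uses a C function pointer where the plan
calls for a Scheme parameter.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for a port buffer: storage plus two cursors. */
struct port_buffer {
  unsigned char bytes[4096];
  size_t cur;                   /* next byte to hand to the reader */
  size_t end;                   /* one past the last valid byte */
};

struct port;
typedef void (*read_waiter_fn) (struct port *);

/* Hypothetical port: a nonblocking fd, a read buffer, and a waiter. */
struct port {
  int fd;
  struct port_buffer read_buf;
  read_waiter_fn read_waiter;   /* called when a read would block */
};

/* Default waiter: emulate blocking by polling until the fd is readable. */
static void
wait_by_polling (struct port *p)
{
  struct pollfd pfd = { .fd = p->fd, .events = POLLIN, .revents = 0 };
  while (poll (&pfd, 1, -1) < 0 && errno == EINTR)
    ;
}

/* Refill the read buffer.  Returns the number of bytes now available,
   0 on EOF, or -1 on error.  When the read would block, control goes to
   the port's waiter, and the read is retried once the waiter returns. */
static long
fill_read_buffer (struct port *p)
{
  struct port_buffer *b = &p->read_buf;
  assert (b->cur <= b->end);
  for (;;)
    {
      ssize_t n = read (p->fd, b->bytes, sizeof b->bytes);
      if (n >= 0)
        {
          b->cur = 0;
          b->end = (size_t) n;
          return (long) n;
        }
      if (errno == EAGAIN || errno == EWOULDBLOCK)
        p->read_waiter (p);
      else if (errno != EINTR)
        return -1;
    }
}

/* Return the next byte from the port, or -1 at EOF or on error. */
static int
port_read_byte (struct port *p)
{
  struct port_buffer *b = &p->read_buf;
  if (b->cur == b->end && fill_read_buffer (p) <= 0)
    return -1;
  return b->bytes[b->cur++];
}

/* Put FD into nonblocking mode and set up an empty read buffer.  A NULL
   WAITER selects the polling default.  Returns 0 on success, -1 on error. */
static int
port_init (struct port *p, int fd, read_waiter_fn waiter)
{
  int flags = fcntl (fd, F_GETFL);
  if (flags < 0 || fcntl (fd, F_SETFL, flags | O_NONBLOCK) < 0)
    return -1;
  p->fd = fd;
  p->read_buf.cur = p->read_buf.end = 0;
  p->read_waiter = waiter ? waiter : wait_by_polling;
  return 0;
}
```

With the default waiter, port_read_byte behaves like a blocking read.  A
scheduler would instead install a waiter that suspends the computation
and resumes it once the fd is readable, which in Guile only works if the
frames between the waiter and the prompt are Scheme frames; that is the
reason for moving the I/O routines to Scheme in the first place.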