From: Andy Wingo
Newsgroups: gmane.lisp.guile.devel
Subject: Re: wip-ports-refactor
Date: Sun, 17 Apr 2016 10:49:53 +0200
Message-ID: <8760vgmxfy.fsf@pobox.com>
References: <87twjempnf.fsf@pobox.com> <87zisw9tju.fsf@gnu.org>
In-Reply-To: <87zisw9tju.fsf@gnu.org>
To: ludo@gnu.org (Ludovic Courtès)
Cc: guile-devel

Hi :)

On Thu 14 Apr 2016 16:03, ludo@gnu.org (Ludovic Courtès) writes:

> Andy Wingo writes:
>
>> I
>> have been working on a refactor to ports.  The goal is to have a
>> better concurrency story.  Let me tell that story then get down to the
>> details.
>
> In addition to concurrency and thread-safety, I’m very much interested
> in the impact of this change on the API (I’ve always found the port API
> in C to be really bad), on the flexibility it would provide, and on
> performance—‘read-char’ and ‘get-u8’ are currently prohibitively slow!

Yeah.  Of course improving the port internals is technically a breaking
change, but I think probably the set of people that have implemented
ports using the C API can be counted on two hands, and I hope to find
everyone and help them adapt :)

On the speed side, I think that considering read-char to be
prohibitively slow is an incorrect diagnosis.  First let's define a
helper:

  ;; Evaluate EXP in a loop with I counting from 0 up to N, and return
  ;; the last result.
  (define-syntax-rule (do-times n exp)
    (let lp ((i 0))
      (let ((res exp))
        (if (< i n)
            (lp (1+ i))
            res))))

I want to test four things.

  ;; 1. How long a loop up to 10 million takes (baseline measurement).
  (let ((port (open-input-string "s")))
    (do-times #e1e7 1))

  ;; 2. A call to a simple Scheme function.
  (define (foo port) 42)
  (let ((port (open-input-string "s")))
    (do-times #e1e7 (foo port)))

  ;; 3. A call to a port subr.
  (let ((port (open-input-string "s")))
    (do-times #e1e7 (port-line port)))

  ;; 4. A call to a port subr that touches the buffer.
  (let ((port (open-input-string "s")))
    (do-times #e1e7 (peek-char port)))

The results:

                    | baseline | foo    | port-line | peek-char
  ------------------+----------+--------+-----------+----------
  guile 2.0         | 0.269s   | 0.845s | 1.067s    | 1.280s
  guile master      | 0.058s   | 0.224s | 0.225s    | 0.433s
  wip-port-refactor | 0.058s   | 0.220s | 0.226s    | 0.375s

These were single measurements at the REPL on my i7-5600U, run with
--no-debug.  The results were fairly consistent.
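
For anyone wanting to repeat measurements like these from a script
rather than at the REPL, here is a rough sketch.  It is only an
illustration: `elapsed-seconds' and `time-peek-char' are hypothetical
names that do not appear in the message above, and the timing goes
through `get-internal-real-time', which is not necessarily how the
numbers in the table were taken.

```scheme
;; The do-times helper, repeated so that this sketch is self-contained.
(define-syntax-rule (do-times n exp)
  (let lp ((i 0))
    (let ((res exp))
      (if (< i n)
          (lp (1+ i))
          res))))

;; Hypothetical helper (an assumption, not from the original message):
;; call THUNK once and return the elapsed wall-clock time in seconds.
(define (elapsed-seconds thunk)
  (let ((start (get-internal-real-time)))
    (thunk)
    (/ (- (get-internal-real-time) start)
       (exact->inexact internal-time-units-per-second))))

;; Hypothetical wrapper around test 4: time peek-char on a string port,
;; with the iteration count N passed to do-times.
(define (time-peek-char n)
  (let ((port (open-input-string "s")))
    (elapsed-seconds
     (lambda ()
       (do-times n (peek-char port))))))
```

Calling (time-peek-char #e1e7) corresponds to test 4; the other three
tests can be wrapped the same way.  A single run is noisy, so it is
probably worth taking several and comparing them.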

Note that because this is a loop, Guile 2.2's compiler gets some
"unfair" advantages related to loop-invariant code motion and peeling;
but real parsers etc. written on top of read-char will also have loops,
so to a degree it's OK.

Conclusions:

  1. Guile 2.2 makes calling a subr just as cheap as calling a Scheme
     function.

  2. The overhead of using Guile 2.0 is much greater than the overhead
     of calling peek-char.

  3. peek-char is slower than other leaf functions in Guile 2.2 but only
     by 2x or so; I am sure it can be faster but I don't know by how
     much.

Consider that it has to:

  1. type-check the argument
  2. get the port buffer and cursors
  3. if there is enough data in the buffer to decode a char, do it;
     otherwise, take the slow path.

If we consider implementing this in Scheme, it might get slower than it
currently is in 2.2, because of the switch from C->C calls (internal to
ports.c and other C files) to Scheme->Scheme calls, probably with some
additional subr calls to get state from the port.  We might gain some of
that back by removing the lock; dunno.

It would be nice to be able to decode chars from UTF-8 or ISO-8859-1
ports from Scheme.  But we always have to be able to call out to iconv
too.  Mark has mused on making the port buffer always UTF-8, but I don't
quite see how this could work.  I guess you could have a second port
buffer for decoded UTF-8 chars, but that starts to look quite
complicated to me.

Anyway.  I think that given the huge performance window opened up to us
by the 2.0->2.2 switch, we should treat speed as important but not
primary -- when given a choice between speed and maintainability, or
speed and the ability to suspend a port, we shouldn't choose speed.

That said, the real way to make port operations fast is (1) to buffer
the port, and (2) to operate on the buffer directly instead of fetching
data octet-by-octet.  Exposing the port buffer to Scheme allows this
kind of punch-through optimization to be implemented where needed.

Cheers,

Andy
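
The contrast in point (2) above, between fetching data octet-by-octet
and working on a block of bytes directly, can be illustrated roughly as
follows.  This is a sketch under assumptions: it does not touch the
port buffer that the refactor would expose (that interface is not shown
in this message), and uses `get-bytevector-n' from (rnrs io ports) as a
stand-in for "operate on a chunk at once".  The procedure names
`sum-octet-by-octet' and `sum-by-chunks' are hypothetical.

```scheme
(use-modules (rnrs bytevectors)
             ((rnrs io ports)
              #:select (get-u8 get-bytevector-n
                        open-bytevector-input-port)))

;; Sum every octet in PORT, paying for one port call per octet.
(define (sum-octet-by-octet port)
  (let lp ((sum 0))
    (let ((b (get-u8 port)))
      (if (eof-object? b)
          sum
          (lp (+ sum b))))))

;; Sum every octet in PORT, paying for one port call per chunk of up
;; to CHUNK-SIZE octets and then walking the bytevector directly.
(define* (sum-by-chunks port #:optional (chunk-size 4096))
  (let lp ((sum 0))
    (let ((bv (get-bytevector-n port chunk-size)))
      (if (eof-object? bv)
          sum
          (let inner ((i 0) (sum sum))
            (if (< i (bytevector-length bv))
                (inner (1+ i) (+ sum (bytevector-u8-ref bv i)))
                (lp sum)))))))

;; Both procedures should agree on the same input.
(define example-bytes (u8-list->bytevector '(1 2 3 250)))
(define by-octet
  (sum-octet-by-octet (open-bytevector-input-port example-bytes)))
(define by-chunk
  (sum-by-chunks (open-bytevector-input-port example-bytes) 3))
```

Both traversals visit the same octets; the second simply makes far
fewer calls into the port machinery.  Whether that matters in practice
depends on the port and the workload, so it should be measured rather
than assumed.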