From: David Kastrup
Newsgroups: gmane.emacs.devel
Subject: Re: request for review: Doing direct file I/O in Emacs Lisp
Date: 16 May 2004 00:13:57 +0200
References: <87isf0vjtl.fsf@emptyhost.emptydomain.de>
Original-To: John Wiegley
Cc: emacs-devel@gnu.org
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3.50

John Wiegley writes:

> If I start a process with start-process (/usr/bin/cat) and redirect
> its output to a file, it is far slower than if I simply output the
> same data to a file (eshell/cat) -- even though the resulting
> "output" in both cases is the same. Why is receiving output via a
> process sentinel so slow?

Use a multiprocessor machine.

Part of the reason is the same one for which
process-adaptive-read-buffering exists: cat produces some output. As
soon as one line (or whatever unit) is full, the operating system
intervenes and schedules Emacs. Emacs processes that one line of
output, then checks whether it can run timers and so on. Once Emacs
has finally decided it can't do anything more useful, it puts itself
to sleep, reading on the pipe. Then the operating system wakes up cat
again, for another single line.

The main fault, in my opinion, lies with the "low-latency" operating
system that schedules the CPU away from the writing process as soon as
it has produced any output. That makes pipes pretty inefficient.

If you have a multiprocessor machine, a simple job like "cat" can
easily stuff the pipe completely while Emacs is processing the last
chunk. On a uniprocessor machine, this does not happen since cat does
not even get the tiny amount of CPU time necessary to fill the pipe.

The problem is that I/O using "select" is ready the moment a _single_
byte is available. Perhaps one would need some nicer system calls for
telling Linux "ok, wake me up immediately if any pipe is _full_.
And wake me up immediately if there is input on some of [list of
files]. Other than that, only wake me up if there is input and no
other process is wanting the CPU". So we'd need to tell the operating
system how urgently we want what amount of data on which input to be
scheduled for processing.

process-adaptive-read-buffering tries to fudge around this problem. I
have some feeling that it might still be buggy. I seem to remember
some inconclusive reports where larger delays occurred, maybe under
MS Windows.

Basically, reports indicate that even with
process-adaptive-read-buffering one is maybe 30% slower than if the
command gets started with a trailing "|dd obs=8k" pipeline, but still
considerably faster than if you don't use
process-adaptive-read-buffering. A mess.

Anyway, it might be worth profiling and optimizing Emacs for good
typical filter routine performance even when small data chunks are
involved. The more administrative overhead Emacs incurs on each
little wakeup, the slower we get.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum
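
The remark above that "select" reports a pipe as ready the moment a
single byte is available can be sketched roughly as follows. This is
only an illustration in Python, assuming a POSIX system; it is not
Emacs's own code, and the helper name readable_after is made up for
the example.

```python
import os
import select


def readable_after(n_bytes, timeout=0.0):
    """Write n_bytes into a fresh pipe and report whether select()
    considers the read end ready (hypothetical helper, POSIX only)."""
    read_fd, write_fd = os.pipe()
    try:
        if n_bytes:
            os.write(write_fd, b"x" * n_bytes)
        # select() returns the descriptors that can be read without
        # blocking; with timeout 0 it polls and returns at once.
        ready, _, _ = select.select([read_fd], [], [], timeout)
        return read_fd in ready
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(readable_after(0))     # empty pipe
    print(readable_after(1))     # a single byte written
    print(readable_after(4096))  # a larger chunk written
```

The reader is treated as runnable for one byte just as for a large
chunk, so a writer that loses the CPU after each small write ends up
feeding the reader many small reads instead of a few big ones.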