From: Jim Porter
Newsgroups: gmane.emacs.devel
Subject: Re: Native OS pipelines in eshell and Emacs
Date: Tue, 28 May 2024 09:33:19 -0700
Message-ID: <85a89224-b032-b083-9825-2ae215ae6301@gmail.com>
To: Spencer Baugh, emacs-devel@gnu.org
Cc: johnw@gnu.org, spwhitton@spwhitton.name, dmitry@gutov.dev

On 5/28/2024 7:42 AM, Spencer Baugh wrote:
> eshell "pipelines" operate by reading the data in from one process and
> writing it out to the next process. Thus the data flows from one
> process, to Emacs, and then to the next process.
>
> This differs from the native OS capability to make a pipe and pass one
> end down to one process as stdout, and the other end down to another
> process as stdin, which is more efficient.
>
> Has there been work before on supporting this in eshell and Emacs?

I've worked on this previously, and even put together a hacky sketch of
how it would work before abandoning it due to a bunch of complexities in
Eshell that make this infeasible (in my opinion, anyway). As the current
Eshell maintainer, I'd (softly) suggest you turn back now, unless you're
willing to go down a fairly deep rabbit hole.
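
A minimal sketch of the two data paths contrasted in the quoted
question, written in Python rather than Emacs Lisp. It is an
illustration only, not code from Eshell or Emacs: the function names
`relay_pipeline` and `native_pipeline` and the two toy child commands
are hypothetical. In the first function the parent reads the producer's
output and writes it to the consumer (roughly the role Emacs plays in
an Eshell pipeline); in the second, the read end of the OS pipe is
handed directly to the consumer as its stdin.

```python
import subprocess
import sys

# Hypothetical toy commands: a producer that prints one line and a
# consumer that upper-cases whatever arrives on its stdin.
PRODUCER = [sys.executable, "-c", "print('hello from producer')"]
CONSUMER = [sys.executable, "-c",
            "import sys; sys.stdout.write(sys.stdin.read().upper())"]


def relay_pipeline() -> bytes:
    """Parent sits in the middle: producer -> parent -> consumer."""
    producer = subprocess.Popen(PRODUCER, stdout=subprocess.PIPE)
    data, _ = producer.communicate()          # parent reads everything
    consumer = subprocess.Popen(CONSUMER, stdin=subprocess.PIPE,
                                stdout=subprocess.PIPE)
    out, _ = consumer.communicate(input=data)  # parent writes it back out
    return out


def native_pipeline() -> bytes:
    """OS pipe connects the children directly: producer -> consumer."""
    producer = subprocess.Popen(PRODUCER, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(CONSUMER, stdin=producer.stdout,
                                stdout=subprocess.PIPE)
    # Drop the parent's copy of the read end so the consumer is the
    # only reader and sees EOF when the producer exits.
    producer.stdout.close()
    out, _ = consumer.communicate()
    producer.wait()
    return out
```

Both functions return the same bytes; the difference is only whether
the intermediate data passes through the parent process.
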

I'll also note that the benefits here are somewhat reduced by
improvements to Eshell pipelines in Emacs 29. As of commit d7b89ea4077
(bug#56025), piped processes in Eshell no longer use PTYs for output,
which resulted in a ~35x improvement in my limited tests. (Still 5-10x
slower than in Bash though.) I didn't test this extensively at the time,
since the main goal was fixing incorrect behavior; the perf improvement
was just a nice bonus.

> Specifically, the new feature would be something like an :stdin argument
> to make-process which allows a make-pipe-process (or other process) to
> be passed as stdin, and grabs the output file descriptor from that
> process (what Emacs would normally read) and passes it down as stdin for
> the new process instead.

It's not quite as simple as that, I'm afraid. The C side is perfectly
reasonable I think, and would likely make some parts of Eshell easier to
manage, but there still needs to be some extra sorcery for Eshell.

Eshell commands can either be Lisp-based or they can be external
programs. That sounds simple, but it's not actually possible to
determine ahead of time which one Eshell will choose. Consider "cat".
The implementation of "cat random" that Eshell uses depends on your cwd:
if "random" is a regular file in your cwd, we use a Lisp implementation.
But if your cwd is /dev, then "random" is a character device file, and
the Lisp implementation replaces itself (*after* starting execution)
with the external program. This makes it a lot harder to determine how
to connect this command in a pipeline.

Another issue is Tramp. If Eshell runs each remote process as an
independent 'make-process' invocation as it does today, then we're stuck
with a whole lot of extra indirection, and any pipe (native or
otherwise) would be *local* instead of remote (where we want it). This
even applies to not-really-remote cases like sudo, which Eshell manages
via Tramp.
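
To make the first of these two issues concrete, here is a rough Python
sketch of a dispatch decision that depends on what the argument turns
out to be on disk. It is not Eshell's actual logic: the function name
`choose_cat_implementation`, the return labels, and the handling of a
missing file are all assumptions made for illustration.

```python
import os
import stat


def choose_cat_implementation(path: str) -> str:
    """Return "lisp" for a regular file, "external" for anything else.

    The answer depends on the file's type at the moment of the call,
    so the same relative name (e.g. "random") can give different
    results in different working directories.
    """
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Assumption for this sketch: defer to the external program
        # and let it report the error.
        return "external"
    return "lisp" if stat.S_ISREG(mode) else "external"
```

Because the result is only known once the path has been examined, a
caller wiring up a pipeline cannot know in advance whether this stage
will be an in-process function or a child process with file
descriptors of its own.
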

Both of these cases are worked around via extpipes: in the former, the
extpipe mandates that all connected commands are external programs, and
in the latter, it constructs an 'sh' invocation that runs the entire
pipeline as a unit on the remote host.

With enough work it might be possible to overcome some of these problems
for Eshell, but I haven't been able to produce a satisfactory design for
this that doesn't involve major incompatible changes.

It's a different strategy, but I wonder if improving the scheduling in
Emacs' process handling would get us close to "native" performance here?
See for a discussion of the issue and a WIP(?) fix.
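
As a rough illustration of the "run the entire pipeline as a unit"
idea behind extpipes, the following Python sketch joins several argv
lists into one `sh -c` invocation. It does not reflect how Eshell
builds its command; the helper name `build_sh_pipeline` is
hypothetical, and the quoting relies on Python's `shlex.join`.

```python
import shlex
from typing import List, Sequence


def build_sh_pipeline(commands: Sequence[Sequence[str]]) -> List[str]:
    """Return an argv that runs all COMMANDS as one shell pipeline.

    Each command is quoted separately, then the stages are joined
    with "|" so that the shell, not the caller, creates the pipes.
    """
    if not commands:
        raise ValueError("need at least one command")
    script = " | ".join(shlex.join(argv) for argv in commands)
    return ["sh", "-c", script]


# Example: two stages become a single process invocation.
argv = build_sh_pipeline([["echo", "hello world"], ["tr", "a-z", "A-Z"]])
# argv == ["sh", "-c", "echo 'hello world' | tr a-z A-Z"]
```

Since the result is a single command, it can be started with one
process launch wherever the shell runs, which is what keeps the pipes
on the same host as the commands.
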