Re: Native OS pipelines in eshell and Emacs

unofficial mirror of emacs-devel@gnu.org 

From: Spencer Baugh <sbaugh@janestreet.com>
To: emacs-devel@gnu.org
Subject: Re: Native OS pipelines in eshell and Emacs
Date: Tue, 28 May 2024 14:38:01 -0400
Message-ID: <ier7cfd6auu.fsf@janestreet.com>
In-Reply-To: 85a89224-b032-b083-9825-2ae215ae6301@gmail.com

Jim Porter <jporterbugs@gmail.com> writes:
> On 5/28/2024 7:42 AM, Spencer Baugh wrote:
>> eshell "pipelines" operate by reading the data in from one process and
>> writing it out to the next process.  Thus the data flows from one
>> process, to Emacs, and then to the next process.
>> This differs from the native OS capability to make a pipe and pass one
>> end down to one process as stdout, and the other end down to another
>> process as stdin, which is more efficient.
>> Has there been work before on supporting this in eshell and Emacs?
>
> I've worked on this previously, and even put together a hacky sketch
> of how it would work before abandoning it due to a bunch of
> complexities in Eshell that make this infeasible (in my opinion,
> anyway). As the current Eshell maintainer, I'd (softly) suggest you
> turn back now, unless you're willing to go down a fairly deep rabbit
> hole.

That is fair, but since a :stdin argument to make-process would be useful
for project.el anyway, I'm motivated to do this even if eshell won't
immediately benefit.

So I'm especially interested in anything you can share about the C side
of this.
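
For concreteness, the two data paths discussed above can be sketched
outside Emacs. This is a minimal Python illustration, not Emacs code and
not a proposed implementation: run_relayed copies every chunk through the
parent (roughly what eshell does today), while run_native hands the read
end of the pipe straight to the second child as its stdin (what a :stdin
argument would allow). The function names and the two toy commands are
assumptions made up for this example.

```python
import subprocess
import sys

# Toy commands (hypothetical): emit 100000 lines, and count lines on stdin.
PRODUCER = [sys.executable, "-c", "import sys; sys.stdout.write('x\\n' * 100000)"]
CONSUMER = [sys.executable, "-c", "import sys; print(sum(1 for _ in sys.stdin))"]


def run_relayed(producer_cmd, consumer_cmd):
    """Parent reads the producer's output and writes it to the consumer."""
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(
        consumer_cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE
    )
    # Every byte passes through the parent process here.  This simple loop
    # assumes the consumer's own output is small; a consumer that writes a
    # lot before reading all of its input could block.
    for chunk in iter(lambda: producer.stdout.read(65536), b""):
        consumer.stdin.write(chunk)
    consumer.stdin.close()
    producer.stdout.close()
    out = consumer.stdout.read()
    consumer.stdout.close()
    consumer.wait()
    producer.wait()
    return out


def run_native(producer_cmd, consumer_cmd):
    """The consumer inherits the read end of the pipe as its stdin."""
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(
        consumer_cmd, stdin=producer.stdout, stdout=subprocess.PIPE
    )
    # The parent drops its copy of the read end, so the data never passes
    # through it and the producer sees a broken pipe if the consumer exits.
    producer.stdout.close()
    out, _ = consumer.communicate()
    producer.wait()
    return out


if __name__ == "__main__":
    print(run_relayed(PRODUCER, CONSUMER))
    print(run_native(PRODUCER, CONSUMER))
```

Both functions return the same bytes; the difference is only in which
process moves the data.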

> I'll also note: the benefits here are also somewhat reduced by
> improvements to Eshell pipelines in Emacs 29. As of commit d7b89ea4077
> (bug#56025), piped processes in Eshell no longer use PTYs for output,
> which resulted in a ~35x improvement in my limited tests. (Still 5-10x
> slower than in Bash though.) I didn't test this extensively at the
> time though since the main goal was fixing incorrect behavior; the
> perf improvement was just a nice bonus.
>
>> Specifically, the new feature would be something like an :stdin argument
>> to make-process which allows a make-pipe-process (or other process) to
>> be passed as stdin, and grabs the output file descriptor from that
>> process (what Emacs would normally read) and passes it down as stdin for
>> the new process instead.
>
> It's not quite as simple as that, I'm afraid. The C side is perfectly
> reasonable I think, and would likely make some parts of Eshell easier
> to manage, but there still needs to be some extra sorcery for
> Eshell. Eshell commands can either be Lisp-based or they can be
> external programs. That sounds simple, but it's not actually possible
> to determine ahead of time which Eshell will choose.
>
> Consider "cat". The implementation of "cat random" that Eshell uses
> depends on your cwd: if "random" is a regular file in your cwd, we use
> a Lisp implementation. But if your cwd is /dev, then "random" is a
> character device file, and the Lisp implementation replaces itself
> (*after* starting execution) with the external program. This makes it
> a lot harder to determine how to connect this command in a pipeline.
>
> Another issue is Tramp. If Eshell runs each remote process as an
> independent 'make-process' invocation as it is today, then we're stuck
> with a whole lot of extra indirection, and any pipe (native or
> otherwise) would be *local* instead of remote (where we want it). This
> even applies to not-really-remote cases like sudo, which Eshell
> manages via Tramp.
>
> Both of these cases are worked around via extpipes: in the former, the
> extpipe mandates that all connected commands are external programs,
> and in the latter, it constructs an 'sh' invocation that runs the
> entire pipeline as a unit on the remote host.

Ah, those are indeed real and annoying concerns.

But if extpipe is able to mandate this, doesn't that mean there is some
way to get this information?  That is, we're able to detect "all the
commands are external programs" and "we're running on the local host".

In that case, we could start by using the native pipes only when both
those conditions are true.

Or, slightly more aggressively: whenever two adjacent commands in an
eshell pipeline are both external programs on the local host.
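
The more aggressive rule can be sketched as a per-link decision. This is
a minimal Python illustration under one big assumption: that "external"
and "local" are known for each command before the pipeline starts, which
is exactly what the "cat" example above calls into question. The Command
class and plan_links function are hypothetical names for this sketch, not
anything in eshell.

```python
from dataclasses import dataclass


@dataclass
class Command:
    name: str
    external: bool  # assumed known up front: external program, not Lisp
    local: bool     # assumed known up front: runs on the local host


def native_eligible(cmd):
    """A command can take part in a native pipe only if both flags hold."""
    return cmd.external and cmd.local


def plan_links(pipeline):
    """Return one boolean per '|' in the pipeline.

    True means the two neighbouring commands could share a native OS pipe;
    False means the data would still be relayed through Emacs.
    """
    return [
        native_eligible(left) and native_eligible(right)
        for left, right in zip(pipeline, pipeline[1:])
    ]


if __name__ == "__main__":
    pipeline = [
        Command("grep", external=True, local=True),
        Command("sort", external=True, local=True),
        Command("cat", external=False, local=True),  # Lisp implementation
        Command("wc", external=True, local=True),
    ]
    print(plan_links(pipeline))
```

For the example pipeline only the grep-to-sort link would use a native
pipe; the two links touching the Lisp "cat" would fall back to relaying.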

> With enough work it might be possible to overcome some of these
> problems for Eshell, but I haven't been able to produce a satisfactory
> design for this that doesn't involve major incompatible changes.
>
> It's a different strategy, but I wonder if improving the scheduling in
> Emacs' process handling would get us close to "native" performance
> here? See
> <https://tdodge.consulting/blog/eshell/background-output-thread> for a
> discussion of the issue and a WIP(?) fix.

Very interesting idea, but I personally am motivated to get performance
which is not just closer to shell, but equivalent to shell.  Right now I
only infrequently use eshell, because every time I write and wait for a
pipeline I think "I would have to wait less if I was in M-x shell", and
I'd like to never think that :)
