On Fri, Apr 21, 2023, 12:39 AM Lynn Winebarger <owinebar@gmail.com> wrote:
I'm not sure what the etiquette is here - I keep referring to Stefan's
effort on futur.el, so I've added him explicitly.

On Thu, Apr 20, 2023 at 10:36 AM Eli Zaretskii <eliz@gnu.org> wrote:
> > From: Lynn Winebarger <owinebar@gmail.com>
> > Date: Thu, 20 Apr 2023 10:19:11 -0400
> > Cc: emacs-devel@gnu.org
> >
> > > If you yield before issuing the system call, the system call will wait
> > > until you re-acquire the lock.  So how will this help?
> >
> > You're talking about yielding the system thread, I'm talking about
> > yielding the Lisp machine thread.
> No, I'm also talking about the Lisp machine thread.  The thread which
> calls insert-file-contents and runs the C code of insert-file-contents
> and of the subroutines called by insert-file-contents.

A lisp thread is the context (lisp machine state) observable
(intentionally) by lisp programs.  That is conceptually distinct from
the context tracked by the OS thread.  To paraphrase a great mind, the
identification of the two "is an implementation detail".  Meaning, it
is not normative with respect to the intended semantics of lisp.

> > Even though Lisp machine threads are implemented by mapping them to
> > the underlying system thread, the Lisp machine execution state is
> > kept coherent by the global (interpreter) lock.  Releasing the lock
> > is effectively yielding the Lisp machine thread.  The system thread
> > will yield implicitly if the read blocks.
> What you say here is not relevant to the issue at hand.

I don't know what you think is the issue at hand.  My question was
about when (or if) the lisp thread could yield during a blocking I/O
operation.  You appear to have interpreted that question differently
than I intended, so I attempted to be more explicit about what I meant
by "the lisp thread" and "yielding".

> > the read operation should either use some temporary buffer and copy
> > into the lisp buffer, or wait for data to be ready, then reacquire
> > the GIL before invoking the read syscall with a pointer into the
> > Lisp buffer object.
> Yes, and that's exactly where we will lose: most of the heavy
> processing cannot be done in a separate temporary buffer, because it
> calls functions from the Lisp machine, and those are written assuming
> nothing else is running in the Lisp machine concurrently.  For
> example, take the code which decodes the file's contents we have just
> read.  I encourage you to take a good look at that code (most of it is
> in coding.c) to appreciate the magnitude of the problem.

I believe you. I did mention that the global lock could be (evidently
*must be*) reacquired before the actual call to "read".  There's also
Tromey's comment on

OTOH, I asked the question in order to understand what it means to
give programmers control over asynchronous execution in controlled
ways.  The most basic kind of control I can think of is whether
functions called in that code are expected to behave synchronously or
asynchronously.  And what operation that could be done either
synchronously or asynchronously is more basic than reading text from a
file?  If something like insert-file-contents can't be performed
asynchronously (not in parallel, just with other code running while
waiting for IO), the scope of Stefan's effort is going to be very
limited.  Limited to the point of not being very interesting.
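
To make the hypothetical concrete, here is a minimal sketch in Python
(not Emacs C) of "wait for data with the lock released, then reacquire
it before the actual read".  Everything in it is an assumption made for
illustration: `global_lock` stands in for the global lock,
`insert_contents_yielding` is a made-up name, the buffer is a plain
bytearray, and it relies on POSIX pipes and select.  It says nothing
about decoding, which would still have to run with the lock held.

```python
import os
import select
import threading

# Hypothetical stand-in for the global (interpreter) lock.
global_lock = threading.Lock()

def insert_contents_yielding(fd, buffer, log):
    """Append everything readable from fd to buffer (a bytearray).

    The caller must hold global_lock.  The lock is dropped only while
    waiting for readiness, never while buffer is being modified.
    """
    while True:
        global_lock.release()            # "yield the Lisp thread"
        try:
            select.select([fd], [], [])  # block until readable or EOF
        finally:
            global_lock.acquire()        # back inside the "Lisp machine"
        chunk = os.read(fd, 4096)        # the read happens under the lock
        if not chunk:                    # empty read means end of file
            return
        buffer += chunk
        log.append(("io", len(chunk)))

def demo():
    r, w = os.pipe()
    buffer, log = bytearray(), []

    def other_lisp_thread():
        # Can only get the lock while the reader is waiting for data.
        with global_lock:
            log.append(("other", 0))
        os.write(w, b"hello")
        os.close(w)

    with global_lock:
        t = threading.Thread(target=other_lisp_thread)
        t.start()
        insert_contents_yielding(r, buffer, log)
    t.join()
    os.close(r)
    return bytes(buffer), log
```

Calling `demo()` shows the second thread running while the first one is
blocked waiting for input, with the buffer only ever touched under the
lock.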

It's also possible that something could be built on the existing data
structures that doesn't rely on more fine-grained locking.  I just
happened to be reading
today, after seeing this message, and it seemed to me highly relevant
to the problem of concurrent work on a text buffer.  Particularly the
comment "Experience has shown that merging is superior to locking".
Maybe the thing to do to enable asynchronous/concurrent/parallel work
would be to add a new type of buffer, built on the existing one, that
behaves something like a git repo/working copy of the text buffer,
that eventually merges edits, say by windows displaying the buffer
each pulling updates from all the "repos" with checked-out copies.
Then the model for support of asynchronous programming could
encapsulate specifying how to merge and/or handle merge failure.
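
A minimal sketch of what "merge instead of lock" could look like, in
Python rather than Emacs Lisp or C.  The names `Edit`, `MergeConflict`
and `merge` are hypothetical, the buffer is reduced to a plain string,
and the conflict rule (any two edits that overlap in the base text) is
only one assumption among many possible ones.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edit:
    start: int   # offset into the base text
    end: int     # exclusive end offset into the base text
    text: str    # replacement for base[start:end]

class MergeConflict(Exception):
    """Raised when two working copies changed overlapping regions."""

def merge(base, *edit_lists):
    """Combine edits recorded by several working copies of base.

    All offsets refer to the shared base text, so edits from different
    copies can be ordered and applied in a single pass.
    """
    edits = sorted((e for lst in edit_lists for e in lst),
                   key=lambda e: (e.start, e.end))
    out, pos = [], 0
    for e in edits:
        if e.start < pos:
            # This edit begins inside a region already replaced.
            raise MergeConflict(f"overlapping edit at offset {e.start}")
        out.append(base[pos:e.start])   # untouched text before the edit
        out.append(e.text)
        pos = e.end
    out.append(base[pos:])              # untouched tail
    return "".join(out)

# Two "checked-out copies" editing disjoint regions merge cleanly.
copy_a = [Edit(0, 5, "HELLO")]
copy_b = [Edit(6, 11, "emacs")]
merged = merge("hello world", copy_a, copy_b)
```

Specifying how to handle merge failure would then amount to deciding
what happens where this sketch raises `MergeConflict`.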

Maybe that would be too expensive, but at least at first, these
distributed buffers would only be used by programs using explicit
asynchronous programming.  Maybe that approach would even be helpful
in dealing with extremely large files or long lines.  We could call it
"merge-oriented programming". :-)

I forgot to mention the concurrent version of the buffer would need a
functional representation to avoid copying during the merge.
Something along the lines of Okasaki's purely functional strings,
except including all the other components of buffers - overlays, local
variables, and whatever else would be implicated.  I don't know if
this would require a complete reimplementation of buffers, or if the
current implementation could be tweaked to serve as an underlying
component of a zippered buffer.
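
As an illustration of "functional representation to avoid copying",
here is a minimal sketch of a persistent rope in Python.  It is not
Okasaki's structure and not anything in Emacs: `Leaf`, `Concat`,
`split` and `insert` are hypothetical names, there is no balancing, and
it covers text only, not overlays or local variables.  The point is
just that an edit builds a new version that shares the unchanged
subtrees with the old one.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Leaf:
    text: str

    @property
    def length(self):
        return len(self.text)

@dataclass(frozen=True)
class Concat:
    left: object    # Leaf or Concat
    right: object   # Leaf or Concat
    length: int     # cached total length of both sides

def concat(a, b):
    return Concat(a, b, a.length + b.length)

def split(rope, i):
    """Return (rope[:i], rope[i:]) without modifying rope."""
    if isinstance(rope, Leaf):
        return Leaf(rope.text[:i]), Leaf(rope.text[i:])
    if i <= rope.left.length:
        l, r = split(rope.left, i)
        return l, concat(r, rope.right)    # rope.right is shared
    l, r = split(rope.right, i - rope.left.length)
    return concat(rope.left, l), r         # rope.left is shared

def insert(rope, i, text):
    """Return a new rope with text inserted at offset i."""
    l, r = split(rope, i)
    return concat(concat(l, Leaf(text)), r)

def to_str(rope):
    if isinstance(rope, Leaf):
        return rope.text
    return to_str(rope.left) + to_str(rope.right)

base = concat(Leaf("hello "), Leaf("world"))
edited = insert(base, 6, "big ")
```

After the insert, `base` still reads "hello world", `edited` reads
"hello big world", and the "world" leaf is the same object in both
versions.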

> So the code which can run in parallel with another Lisp thread will be
> able to do only very simple jobs, and will also add overhead due to
> the need of copying stuff from temporary buffers to Lisp objects.

I'm not talking about running in parallel - there is still just one
lisp machine in this hypothetical.

> Of course, we could redesign and reimplement this stuff, but that's a
> lot of non-trivial work.  My assumption was that you are considering
> relatively lightweight changes on top of the existing code, not a
> complete redesign of how these primitives work.

I wasn't considering anything.  I asked a very limited question for
the purpose of giving some hard thought to the language problem Stefan
requested assistance on.  I didn't offer these elaborations because I
have any plans, only to respond to your question "How could it?".
That is a purely hypothetical question.
