From: Ihor Radchenko
To: Eli Zaretskii
Cc: Stefan Monnier, luangruo@yahoo.com, emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Re: Concurrency via isolated process/thread
Date: Sun, 09 Jul 2023 15:49:41 +0000
Message-ID: <87v8et6q5m.fsf@localhost>
In-Reply-To: <83cz119lxu.fsf@gnu.org>

Eli Zaretskii writes:

>> I now understand that the gap can be moved by the code that is not
>> actually writing text in buffer. However, I do not see how this is a
>> problem we need to care about more than about generic problem with
>> simultaneous write.
>
> Imagine a situation where we need to process XML or HTML, and since
> that's quite expensive, we want to do that in a thread.  What you are
> saying is that this will either be impossible/impractical to do from a
> thread, or will require to lock the entire buffer from access, because
> the above processing moves the gap.  If that is not a problem, I don't
> know what is, because there could be a lot of such scenarios, and they
> all will be either forbidden or very hard to implement.

In this particular example, the need to move the gap is there because
htmlReadMemory requires a contiguous memory segment as input. Obviously,
it requires blocking in the current implementation. Can it be done in an
async-safe way? We would need to memcpy the parsed buffer block. Or up
to 3 memcpy calls if we do not want to move the gap.

>> If a variable or object value is being written, we need to block it.
>> If a buffer object is being written (like when moving the gap or writing
>> text), we need to block it. And this blocking will generally pose a
>> problem only when multiple threads try to access the same object, which
>> is generally unlikely.
>
> My impression is that this is very likely, because of the many global
> objects in Emacs.

There are many objects, but each individual thread will use only a
subset of these objects. What are the odds that these subsets intersect
frequently? Not high, except for certain frequently used objects.
And we need to focus on identifying these likely-to-clash objects and
figuring out what to do with them.

> ... Moreover, if you intend to allow several threads
> using the same buffer (and I'm not yet sure whether you want that or
> not),

It would be nice if multiple threads could work with the same buffer in
read-only mode, maybe with a single main thread editing the buffer (and
pausing the async read-only threads while doing so).

Writing simultaneously is a much bigger ask.

> ... then the buffer-local variables of that buffer present the same
> problem as global variables.  Take the case-table or display-table,
> for example: those are buffer-local in many cases, but their changes
> will affect all the threads that work on the buffer.

And how frequently are case-table and display-table changed? AFAIK, not
frequently at all.

>> We need to ensure that simultaneous consing will never happen. AFAIU,
>> it should be ok if something that does not involve consing is running
>> at the same time with cons (correct me if I am wrong here).
>
> What do you do if some thread hits the memory-full condition?  The
> current handling includes GC.

Could you please explain a bit more about the situation you are
referring to? My statement above was about consing, not GC.

For GC, as I mentioned earlier, we can pause each thread once maybe_gc()
determines that GC is necessary, until all the threads are paused. Then
GC is executed and the threads continue.

>> 2. Redisplay cannot be asynchronous in a sense that it does not make
>>    sense that multiple threads, possibly working with different buffers
>>    and different points in those buffers, request redisplay
>>    simultaneously. Of course, it is impossible to display several places
>>    in a buffer at once.
>
> But what about different threads redisplaying different windows? is
> that allowed?  If not, here goes one more benefit of concurrent
> threads.

I think I need to elaborate on what I mean by "redisplay cannot be
asynchronous".
If an async thread wants to request redisplay, that should be possible.
But the redisplay itself must not be done by this same thread. Instead,
the thread will send a request saying that Emacs needs redisplay and
optionally block until that redisplay finishes (optionally, because
something like displaying a notification may not require waiting). The
redisplay requests will be processed separately.

Is Emacs display code even capable of redisplaying two different windows
at the same time?

> Also, that issue with prompting the user also needs some solution,
> otherwise the class of jobs that non-main threads can do will be even
> smaller.

We can implement reading input using an idea similar to the above, but
it will always block until the response arrives.

For non-blocking input, you said that it has been discussed. I do
vaguely recall such a discussion in the past and I even recall some
ideas about it, but it would be better if you could link to that
discussion, so that the participants of this thread can review the
previously proposed ideas.

>> Only a single `main-thread' should be allowed to modify frames,
>> window configurations, and generally trigger redisplay. And thread
>> that attempts to do such modifications must wait to become
>> `main-thread' first.
>
> What about changes to frame-parameters?  Those don't necessarily
> affect display.

But doesn't it depend on the graphic toolkit? I got the impression (from
Po Lu's replies) that graphic toolkits generally do not handle async
requests well.

>> This means that any code that is using things like
>> `save-window-excursion', `display-buffer', and other display-related
>> staff cannot run asynchronously.
>
> What about with-selected-window? also forbidden?

Yes. A given frame must always have a single window active, which is not
compatible with async threads. In addition, `with-selected-window'
triggers redisplay. In particular, it triggers redisplaying mode-lines.
It is a problem similar to async redisplay.
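
To make the redisplay-request idea above a bit more concrete, here is a
minimal sketch in C with pthreads. It is only an illustration under my
own assumptions, not Emacs code: `redisplay_queue', `rq_request',
`rq_service', and `fake_redisplay' are hypothetical names, and the
"redisplay" is a stub that just counts calls. Worker threads post a
request and optionally block; only the main thread ever runs redisplay.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; nothing here is an Emacs API.  */
struct redisplay_queue
{
  pthread_mutex_t lock;
  pthread_cond_t changed;   /* new request posted, or redisplay finished */
  unsigned long requested;  /* tickets handed out so far */
  unsigned long completed;  /* highest ticket covered by a redisplay */
  unsigned long redisplays; /* redisplay cycles actually run */
};

void
rq_init (struct redisplay_queue *q)
{
  pthread_mutex_init (&q->lock, NULL);
  pthread_cond_init (&q->changed, NULL);
  q->requested = q->completed = q->redisplays = 0;
}

/* Called from any worker thread.  Posts a request and returns its
   ticket.  If BLOCK is true, waits until a redisplay covering this
   ticket has finished.  */
unsigned long
rq_request (struct redisplay_queue *q, bool block)
{
  pthread_mutex_lock (&q->lock);
  unsigned long ticket = ++q->requested;
  pthread_cond_broadcast (&q->changed);
  while (block && q->completed < ticket)
    pthread_cond_wait (&q->changed, &q->lock);
  pthread_mutex_unlock (&q->lock);
  return ticket;
}

/* Called only from the main thread.  Waits for at least one pending
   request, runs a single redisplay for everything pending, then wakes
   the blocked requesters.  Returns the highest ticket served.  */
unsigned long
rq_service (struct redisplay_queue *q, void (*redisplay) (void))
{
  pthread_mutex_lock (&q->lock);
  while (q->requested == q->completed)
    pthread_cond_wait (&q->changed, &q->lock);
  unsigned long upto = q->requested;
  pthread_mutex_unlock (&q->lock);

  redisplay ();  /* runs without the lock, on the main thread only */

  pthread_mutex_lock (&q->lock);
  assert (upto > q->completed);
  q->completed = upto;
  q->redisplays++;
  pthread_cond_broadcast (&q->changed);
  pthread_mutex_unlock (&q->lock);
  return upto;
}

/* Stand-in for the real redisplay; it only counts invocations.  */
int frames_drawn;

void
fake_redisplay (void)
{
  frames_drawn++;
}
```

A thread that only shows a notification would call rq_request (q, false),
while a thread that needs the result on screen would pass true. Several
pending requests collapse into a single redisplay cycle in this sketch,
which seems reasonable given that only one redisplay can run at a time.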

>> Async threads will make an assumption that
>> (set-buffer "1") (goto-char 100) (set-buffer "2") (set-buffer "1")
>> (= (point) 100) invalid.
>
> If this is invalid, I don't see how one can write useful Lisp
> programs, except of we request Lisp to explicitly define critical
> sections.

Hmm. I realized that it is already invalid. At least, if `thread-yield'
is triggered somewhere between the `set-buffer' calls and another thread
happens to move point in buffer "1".

But I realize that something like

  (while (re-search-forward "foo" nil t)
    (with-current-buffer "bar" (insert (match-string 0))))

may be broken if point is moved when switching between "bar" and "foo".

Maybe the last PT, ZV, and BEGV should not be stored in the buffer
object upon switching away and should instead be recorded in a
thread-local ((buffer PT ZV BEGV) ...) alist. Then, the thread will set
PT, ZV, and BEGV from its local alist rather than by reading
buffer->... values.

>> > What if the main thread modifies buffer text, while one of the other
>> > threads wants to read from it?
>>
>> Reading and writing should be blocked while buffer is being modified.
>
> This will basically mean many/most threads will be blocked most of the
> time.  Lisp programs in Emacs read and write buffers a lot, and the
> notion of forcing a thread to work only on its own single set of
> buffers is quite a restriction, IMO.

But not the same buffers!

>> >> >> For example, `org-element-interpret-data' converts Org mode AST to
>> >> >> string. Just now, I tried it using AST of one of my large Org buffers.
>> >> >> It took 150seconds to complete, while blocking Emacs.
>> >> >
>> >> > It isn't side-effect-free, though.
>> >>
>> >> It is, just not declared so.
>> >
>> > No, it isn't.  For starters, it changes obarray.
>>
>> Do you mean `intern'? `intern-soft' would be equivalent there.
>
> "Equivalent" in what way?  AFAIU, the function does want to create a
> symbol when it doesn't already exist.

No.

  (intern (format "org-element-%s-interpreter" type))

is there just to retrieve the existing function symbol used for a given
AST element type.

  (interpret
   (let ((fun (intern-soft (format "org-element-%s-interpreter" type))))
     (if (and fun (fboundp fun)) fun (lambda (_ contents) contents))))

would also work.

To be clear, I do know how this function is designed to work. It may not
be de-facto pure, but that's just because nobody tried to ensure it - the
usefulness of pure declarations is questionable in Emacs now.

>> There will indeed be a lot of work to make the range of Lisp functions
>> available for async code large enough. But it does not have to be done
>> all at once.
>
> No, it doesn't.  But until we have enough of those functions
> available, one will be unable to write applications without
> implementing and debugging a lot of those new functions as part of the
> job.  It will make simple programming jobs much larger and more
> complicated, especially since it will require the programmers to
> understand very well the limitations and requirements of concurrent
> code programming, something Lisp programmers don't know very well, and
> rightfully so.

I disagree. If Emacs supports async threads, it does not mean that every
single piece of Elisp should be async-compatible. But if a programmer is
explicitly writing async code, it is natural to expect limitations.

-- 
Ihor Radchenko // yantar92,
Org mode contributor