From: Eli Zaretskii
To: Ihor Radchenko
Cc: luangruo@yahoo.com, emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Re: Concurrency via isolated process/thread
Date: Tue, 25 Jul 2023 14:29:45 +0300
Message-ID: <83cz0gqlee.fsf@gnu.org>
In-Reply-To: <87wmypi7s0.fsf@localhost> (message from Ihor Radchenko on Mon, 24 Jul 2023 16:38:55 +0000)

> From: Ihor Radchenko
> Cc: luangruo@yahoo.com, emacs-devel@gnu.org
> Date: Mon, 24 Jul 2023 16:38:55 +0000
>
> Eli Zaretskii writes:
>
> >> AFAIK, reading buffer does not require moving the gap.
> >
> > We've been through that: at least xml.c moves the gap to allow
> > external libraries access to buffer text as a single contiguous C
> > string. This is a capability I wouldn't want to lose, because it
> > might come in handy in future developments.
>
> move_gap_both should lock the buffer as it is buffer modification (of
> the buffer_text object).
> htmlReadMemory should also lock it because it
> expects constant string segment.

This basically means we will only allow access of a single thread to
the same buffer, see below.

> > I still don't quite see how this will work. Indirect buffers don't
> > introduce parallelism, and programs that modify indirect buffers
> > _know_ that the text of the base buffer will also be modified. By
> > contrast, a thread that has been preempted won't know and won't expect
> > that. It could, for example, keep buffer positions in simple
> > variables, not in markers; almost all Lisp programs do that, and use
> > markers only in very special situations.
>
> Any async thread should expect that current buffer might be modified.

That's impossible without rewriting a lot of code. And even after
that, how is a thread supposed to "expect" such changes, when they can
happen at any point in the Lisp program execution? What kind of
measures can a Lisp program take to "expect" that? The only one that
I could think of is to copy the entire buffer to another one, and work
on that. (Which is also not fool-proof.)

> Or lock the buffer for text modifications explicitly (the feature we
> should probably provide - something more enforcing compared to
> read-only-mode we have now).

Locking while accessing a buffer would in practice mean only one
thread can access a given buffer at the same time. Which is what I
suggested to begin with, but you said you didn't want to give up.

> > In addition, on the C level, some code computes pointers to buffer
> > text via BYTE_POS_ADDR, and then uses the pointer as any C program
> > would. If such a thread is suspended, and some other thread modifies
> > buffer text in the meantime, all those pointers will be invalid, and
> > we have bugs.
> > So it looks like, if we want to allow concurrent access
> > to buffers from several threads, we will have a lot of code rewriting
> > on our hands, and the rewritten code will be less efficient, because
> > it will have to always access buffer text via buffer positions and
> > macros like FETCH_BYTE and fetch_char_advance; access through char *
> > pointers will be lost forever.
>
> Not necessarily lost. We should provide facilities to prevent buffer
> from being modified ("write mutex").

That again means only one thread can access a given buffer; the rest
will be stuck waiting for the mutex.

> This problem is not limited to buffers - any low-level function that
> modifies C object struct must enforce the condition when other threads
> cannot modify the same object. For example SETCAR will have to mark the
> modified object non-writable first, set its car, and release the lock.
>
> So, any time we need a guarantee that an object remains unchanged, we
> should acquire object-specific write-preventing mutex.

So we will allow access to many/all objects to only one thread at a
time. How is this better than the current threads?

> Of course, such write locks should happen for short periods of time to
> be efficient.

How can this be done in practice? Suppose a Lisp program needs to
access some object, so it locks it. When will it be able to release
the lock, except after it is basically done? Because accessing an
object is not contiguous: you access it, then do something else, then
access it again, etc. -- and assume that the object will not change
between successive accesses. If you release the lock after each
individual access, that assumption will be false, and all bets are off
again.

> > So maybe we should take a step back and consider a restriction that
> > only one thread can access a buffer at any given time? WDYT?
>
> Buffers are so central in Emacs that I do not want to give up before we
> try our best.
But in practice, what you suggest instead does mean we must give up on
that, see above.

> Alternatively, I can try to look into other global states first and
> leave async buffer access to later. If we can get rid of the truly
> global states (which buffer point is not; given that each thread has an
> exclusive lock on its buffer), we can later come back to per-thread
> point and restriction.

That's up to you, although I don't see how the other objects are
different, as explained above.
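
To make the lock-granularity argument concrete, here is a minimal
sketch. It is in Python rather than Emacs C or Lisp, and everything in
it is an assumption made for illustration: the Buffer class, the helper
names and the timeout are hypothetical and do not correspond to any
Emacs API. It only shows what happens to a buffer position kept in a
plain variable when the lock is released between two accesses, compared
with holding the lock for the whole operation.

```python
import threading


class Buffer:
    """Toy stand-in for a buffer: some text guarded by a single lock."""

    def __init__(self, text):
        self.text = text
        self.lock = threading.Lock()


def try_insert_prefix(buf, prefix, timeout=0.05):
    """Writer: insert PREFIX at the start of the text if the lock is free.

    Returns False if the lock stayed busy for TIMEOUT seconds.
    """
    if not buf.lock.acquire(timeout=timeout):
        return False
    try:
        buf.text = prefix + buf.text
        return True
    finally:
        buf.lock.release()


def run_writer(buf, prefix):
    """Run the writer in another thread and wait for its result."""
    result = []
    worker = threading.Thread(
        target=lambda: result.append(try_insert_prefix(buf, prefix)))
    worker.start()
    worker.join()
    return result[0]


def read_word_lock_per_access(buf, word, prefix):
    """Take the lock for each access separately, releasing it in between."""
    with buf.lock:
        pos = buf.text.index(word)            # first access: remember a position
    wrote = run_writer(buf, prefix)           # lock is free, so the writer gets in
    with buf.lock:
        seen = buf.text[pos:pos + len(word)]  # second access: pos is now stale
    return wrote, seen


def read_word_lock_whole_operation(buf, word, prefix):
    """Hold the lock across both accesses; the writer is shut out meanwhile."""
    with buf.lock:
        pos = buf.text.index(word)
        wrote = run_writer(buf, prefix)       # writer times out waiting for us
        seen = buf.text[pos:pos + len(word)]
    return wrote, seen


print(read_word_lock_per_access(Buffer("hello world"), "world", ">> "))
print(read_word_lock_whole_operation(Buffer("hello world"), "world", ">> "))
```

With per-access locking the writer succeeds and the reader gets
"lo wo" where it expected "world". With the lock held throughout, the
reader is correct, but the writer cannot touch the buffer until the
reader is done (here it simply gives up after the timeout; a real
writer would block), which is the one-thread-per-buffer situation
again.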