From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: multi-threaded Emacs
Date: Sat, 29 Nov 2008 17:21:51 -0500
To: Giuseppe Scrivano
Cc: emacs-devel@gnu.org
References: <87abbiody1.fsf@master.homenet> <873ahant5l.fsf@master.homenet>
In-Reply-To: <873ahant5l.fsf@master.homenet> (Giuseppe Scrivano's message of "Sat, 29 Nov 2008 22:01:26 +0100")

>> - handling thread-specific data: gcprolist, mark_byte_stack, and things
>>   like that.  You've tackled this one.  I don't find your solution very
>>   elegant, but I can't think of a much better one, unless pthreads does
>>   provide some kind of thread-specific variables.  I guess the only
>>   improvement is to put those vars together, so we have an array of
>>   structs (that then contain gcprolist, and mark_byte_stack fields)
>>   rather than several arrays.  And then define `gcprolist' as a macro
>>   that expands into "thread_data[current_thread()].gcprolist", so as to
>>   reduce the amount of source-code changes.

> Yes, we can put thread-specific data together in a struct; what about
> buffer data?  `current_buffer' must be different for every thread so as
> many other global variables, should it be defined in the same
> thread_data struct?

Yes, some additional variables will need to be made thread-specific, indeed.
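
The struct-plus-macro arrangement quoted above could be sketched roughly
as follows.  This is only an illustration: `MAX_THREADS',
`current_thread_id', `push_gcpro' and the stand-in field types are
assumptions made for the example, not actual Emacs declarations; only
`thread_data', `current_thread()', `gcprolist' and `mark_byte_stack' come
from the discussion.

```c
#include <stddef.h>

/* Stand-ins for the real Emacs types; assumed for this sketch only.  */
struct gcpro { struct gcpro *next; };
struct byte_stack { struct byte_stack *next; };

#define MAX_THREADS 8   /* hypothetical fixed limit */

/* All per-thread interpreter state grouped in one struct, instead of
   one separate array per variable.  */
struct thread_data
{
  struct gcpro *gcprolist;
  struct byte_stack *mark_byte_stack;
};

struct thread_data thread_data[MAX_THREADS];

/* Hypothetical: index of the running thread, updated on each switch.  */
int current_thread_id;

int
current_thread (void)
{
  return current_thread_id;
}

/* The macros must come after the struct definition, otherwise the
   field names themselves would be expanded.  From here on, code that
   says `gcprolist' transparently reaches the current thread's slot,
   but the fields can no longer be named directly.  */
#define gcprolist (thread_data[current_thread ()].gcprolist)
#define mark_byte_stack (thread_data[current_thread ()].mark_byte_stack)

/* Legacy-style code compiles unchanged: it still names `gcprolist'.  */
void
push_gcpro (struct gcpro *g)
{
  g->next = gcprolist;
  gcprolist = g;
}
```

The point of the macro is that existing uses of `gcprolist' need no
source changes; the cost is one array index per access.
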
For some of them, we may be able to simply use dynamic-scoping (once this
is implemented).

>> - dynamic-scoping: your handling of specpdl is naive and
>>   fundamentally flawed.  For multithreading, we will have to completely
>>   change the implementation technique of dynamic-scoping.

> I know but it was the easier way to have quickly a working
> proof-of-concept as I concentrated my efforts mainly on the syntax.
> Do you have any idea about how dynamic scoping should be handled in a
> multi-threaded environment?

Currently, we basically use the following implementation:

   (defun get-var (sym)
     (symbol-value sym))
   (defun set-var (sym val)
     (setf (symbol-value sym) val))
   (defmacro let-var (sym val body)
     `(let ((oldval (get-var ,sym)))
        (set-var ,sym ,val)
        (unwind-protect
            ,body
          ;; Restore the value saved on entry.
          (set-var ,sym oldval))))

we could instead use something like

   (defun get-var (sym)
     (cdr (assq sym specpdl)))
   (defun set-var (sym val)
     (setcdr (assq sym specpdl) val))
   (defmacro let-var (sym val body)
     `(let ((oldpdl specpdl))
        (push (cons ,sym ,val) specpdl)
        (unwind-protect
            ,body
          (setq specpdl oldpdl))))

where specpdl is a per-thread variable.  Or

   (defun get-var (sym)
     (cdr (assq thread (symbol-value sym))))
   (defun set-var (sym val)
     (setcdr (assq thread (symbol-value sym)) val))
   (defmacro let-var (sym val body)
     `(let ((oldval (get-var ,sym)))
        (set-var ,sym ,val)
        (unwind-protect
            ,body
          ;; Restore the value saved on entry.
          (set-var ,sym oldval))))

This latter one might be the simplest: it basically adapts the notion of
buffer-local/frame-local/terminal-local to also include thread-local.
Currently, only one form of locality is supported at a time (a var can't
be both buffer-local and terminal-local), so this would need to be worked
out (frame-local and buffer-local was allowed in Emacs-21 but its behavior
was not clearly defined and had corner case bugs).

Clearly this can have drastic consequences w.r.t. the performance of
`get-var'.  And its interaction with buffer-local bindings needs to be
thought through.
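
To make the second variant above (a per-thread specpdl) more concrete,
here is a minimal C sketch under simplifying assumptions: symbols are
plain strings, values are ints, and every name in it (`specpdl_of_thread',
`find_binding', `push_binding', `pop_to_depth', the two limits) is
hypothetical and not taken from the Emacs sources.

```c
#include <stddef.h>
#include <string.h>

#define MAX_THREADS 8     /* hypothetical limit */
#define MAX_BINDINGS 64   /* hypothetical per-thread stack depth */

/* One dynamic binding: a (symbol . value) pair.  */
struct binding
{
  const char *sym;
  int val;
};

/* Each thread owns a private binding stack, playing the role of the
   per-thread `specpdl' alist.  */
struct specpdl
{
  struct binding stack[MAX_BINDINGS];
  int depth;
};

struct specpdl specpdl_of_thread[MAX_THREADS];

/* Return the innermost binding of SYM in THREAD, or NULL if there is
   none.  get-var reads its `val' field, set-var writes it.  */
struct binding *
find_binding (int thread, const char *sym)
{
  struct specpdl *pdl = &specpdl_of_thread[thread];
  for (int i = pdl->depth - 1; i >= 0; i--)
    if (strcmp (pdl->stack[i].sym, sym) == 0)
      return &pdl->stack[i];
  return NULL;
}

/* Entry of let-var: push a binding and return the previous depth,
   or -1 if the stack is full.  */
int
push_binding (int thread, const char *sym, int val)
{
  struct specpdl *pdl = &specpdl_of_thread[thread];
  int old_depth = pdl->depth;
  if (old_depth >= MAX_BINDINGS)
    return -1;
  pdl->stack[old_depth].sym = sym;
  pdl->stack[old_depth].val = val;
  pdl->depth = old_depth + 1;
  return old_depth;
}

/* Exit of let-var: the analogue of (setq specpdl oldpdl).  */
void
pop_to_depth (int thread, int old_depth)
{
  specpdl_of_thread[thread].depth = old_depth;
}
```

In this sketch a binding made by one thread is invisible to the others,
and each lookup is linear in the number of live bindings of that thread.
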
We mostly want to handle the default-directory case where a variable is
local to every buffer and can also be dynamically bound.

>> - synchronization to access all the global variables/objects.
>>   You haven't yet have time to tackle this (other than in `put', it
>>   seems), and it's going to be difficult.

> Why do you think that a global lock (or several ones for separate kind
> of data) will not be enough?  It is not easy but I think it can be done
> in a reasonable time without many troubles.

I'm not sure what you mean by "a global lock".  The question is not only
how many locks, but what they protect.  My proposal further down to start
with "only one thread active at a time" is what I'd call "a global lock".

>> - concurrent GC (later to be refined to parallel concurrent GC ;-).
>> - redisplay in its own thread (later to be refined to one redisplay
>>   thread per terminal, then per frame, then per window ;-).

> I think that this is the most difficult part, a new GC and how handle
> redisplay.

Actually, a new GC is not indispensable.  We will most likely start by
keeping the same GC and just stopping all threads when a GC is needed.

>> A first step will be to restrict the implementation such that there's no
>> parallelism: only one thread executes at any given time (i.e. have
>> a single lock and have all thread grab the lock before doing any actual
>> work).

> IMHO it is better to avoid this middle solution and try to solve
> directly the problem, it will not give real benefits and having only one
> thread executes at a given time can be done differently, like saving and
> restoring the thread call stack.

Experience shows that making a program concurrent (or even parallel)
requires a lot of work and can only be done step by step, and after each
step new opportunities appear.  I don't see my proposal as a "middle"
solution.  It's just one step out of many.
Before knowing what other steps are most needed, we will need experience
using those primitives in Elisp packages, so the most pressing step is to
provide the primitives in a robust way.

> Real threads will not suffer I/O bound operations and Emacs will be
> able to use more cores at the same time, if needed.

That can come later.  Parallelism is important, but it will require
changes in Elisp packages, so the first thing is to provide facilities so
that Elisp packages can start using concurrency.  Then we can worry about
taking advantage of parallelism.

If we want to get parallelism without changing Elisp packages, then I can
only see two places where we could do that:
- parallelise redisplay.
- make the GC concurrent (and/or parallel).
Both are pretty difficult.  Interestingly, they're also mostly orthogonal
to the issue of providing concurrency primitives to Elisp.


        Stefan