From: Helmut Eller
Newsgroups: gmane.emacs.devel
Subject: Re: SIGPROF + SIGCHLD and igc
Date: Sat, 28 Dec 2024 17:46:42 +0100
Message-ID: <87a5cfoivh.fsf@gmail.com>
References: <87o713wwsi.fsf@telefonica.net> <87ttaucub8.fsf@protonmail.com> <87pllicrpi.fsf@protonmail.com> <864j2u442i.fsf@gnu.org> <87a5ch5z1b.fsf@gmail.com> <87plld5pev.fsf@protonmail.com> <87ed1t6r34.fsf@gmail.com> <875xn46s6z.fsf@gmail.com> <86bjwwulnc.fsf@gnu.org> <877c7jlxsu.fsf@gmail.com> <86frm7sx4d.fsf@gnu.org>
In-Reply-To: <86frm7sx4d.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 28 Dec 2024 16:25:22 +0200")
To: Eli Zaretskii
Cc: pipcet@protonmail.com, gerd.moellmann@gmail.com, ofv@wanadoo.es, emacs-devel@gnu.org, acorallo@gnu.org

On Sat, Dec 28 2024, Eli Zaretskii wrote:

> I'm thinking about a situation where SIGPROF was delivered while it
> was blocked. In that case, it will be re-delivered once we unblock
> it.
>
> By contrast, if we avoid delivering SIGPROF in the first place, it
> will never be delivered until the next time SIGPROF is due.
>
> So imagine a function FUNC that conses some Lisp object.
> This calls
> into MPS, which blocks SIGPROF, takes the arena lock, then does its
> thing, then releases the lock and unblocks SIGPROF. If SIGPROF
> happened while MPS was working with SIGPROF blocked, then the moment
> SIGPROF is unblocked, the SIGPROF handler in the main thread will be
> called, and will have the opportunity to see that we were executing
> FUNC. By contrast, if the profiler thread avoided delivering SIGPROF
> because it saw the arena locked, the next time the profiler thread
> decides to deliver SIGPROF, execution could have already left FUNC,
> and thus FUNC will not be in the profile.
>
> I hope I made myself more clear this time.

I think I see what you mean. I imagine the profiler thread to be a
loop like

  while (true) {
    sleep ()
    ArenaEnter ()
    pthread_kill (SIGPROF, )
    wait ()
    ArenaLeave ()
  }

If the profiler thread blocks in ArenaEnter, then we are at the mercy
of the thread scheduler. The kernel may decide to let the main thread
run for a long time before running the profiler thread again. With
sigblock(), by contrast, we know exactly when the kernel will call the
SIGPROF handler. So sigblock() would be more predictable and accurate.

>> >> This variant might be bit easier to implement. The "while MPS does not
>> >> hold the lock" part can be implemented by claiming the lock in the
>> >> profiler thread like so:
>> >>
>> >>   mps_arena_t arena = global_igc->arena;
>> >>   ArenaEnter (arena);
>> >>   ... deliver SIGPROF part goes here ...
>> >>   ArenaLeave (arena);
>> >
>> > What happens if, when we call ArenaEnter, MPS already holds the arena
>> > lock?
>>
>> Since MPS holds the lock, it would run in a different thread.
>
> Yes, of course: we are talking about an implementation where the
> profiler thread is separate, so the above code, which AFAIU runs in
> the profiler thread, will be in a thread separate from the one where
> MPS runs.
>
>> So the profiler thread blocks until MPS releases the lock.
>>
>> ArenaEnter uses non-recursive locks.
>
> Hm...
> if ArenaEnter uses non-recursive locks, how come we get aborts
> if some code tries to lock the arena when it is already locked? IOW,
> how is this situation different from what we already saw several times
> in the crashes related to having SIGPROF delivered while MPS holds the
> arena lock?

I'm not sure what you expect instead. It's an error to claim a
non-recursive lock twice in the same thread. The fault handler claims
the lock. If the SIGPROF handler interrupts MPS while it's holding the
lock and then triggers a fault, then it claims the lock a second time.
It's no surprise to see crashes here.

>> During that time window, the lock is held by the profiler thread. The
>> SIGPROF handler runs in the main thread. If the main thread tries to
>> claim the lock, it will block until the profiler thread releases it.
>
> See above: I thought that such a situation triggers crashes. I'm
> probably missing something.

If two threads claim the same non-recursive lock concurrently, that is
not an error: the second thread simply blocks until the first one
releases the lock.

>> >> Regarding deadlocks: the profiler thread holds the lock while it waits.
>> >> So MPS should not be able to stop the profiler thread there.
>> >
>> > Which means we don't register the profiler thread with MPS, right?
>>
>> I'm not sure. It may not be safe to call ArenaEnter in non-registered
>> threads.
>
> But if we do register the thread, then MPS _will_ stop it, no?

Good point. But I think we are safe: to access the list of threads to
stop, MPS must first hold the arena lock.

Helmut