From: Bengt Richter
Subject: Re: fftwf tests running for 25+ hours - is this normal?
Date: Tue, 15 Oct 2019 22:11:50 -0700
Message-ID: <20191016051150.GB76890@PhantoNv4ArchGx.localdomain>
References: <87o8yi4qo6.fsf@gmail.com> <87o8yi4qo6.fsf@gmail.com> <20191015055259.GA1571@PhantoNv4ArchGx.localdomain> <87ftjt506x.fsf_-_@gmail.com>
In-Reply-To: <87ftjt506x.fsf_-_@gmail.com>
To: Chris Marusich
Cc: guix-devel@gnu.org, fftw@fftw.org, Matteo Frigo

On +2019-10-15 11:32:38 -0700, Chris Marusich wrote:
> Hi Bengt and Matteo,
>
> Bengt Richter writes:
>
> > Have you checked sensors for overheating that might induce CPU clock throttling?
>
> Actually, yes, I happened to be watching dmesg output at the time, and I
> did notice these messages (no similar messages have been printed since
> then, which was many hours ago):
>
> --8<---------------cut here---------------start------------->8---
> [180270.045081] mce: CPU1: Core temperature above threshold, cpu clock throttled (total events = 124733)
> [180270.045567] mce: CPU1: Core temperature/speed normal
> [180570.044352] mce: CPU1: Core temperature above threshold, cpu clock throttled (total events = 134902)
> [180570.044838] mce: CPU1: Core temperature/speed normal
> [180875.663432] mce: CPU1: Core temperature above threshold, cpu clock throttled (total events = 143897)
> [180875.663918] mce: CPU1: Core temperature/speed normal
> [181175.748616] mce: CPU1: Core temperature above threshold, cpu clock throttled (total events = 153503)
> [181175.749103] mce: CPU1: Core temperature/speed normal
> [181476.496915] mce: CPU1: Core temperature above threshold, cpu clock throttled (total events = 171377)
> [181476.497401] mce: CPU1: Core temperature/speed normal
> [181776.914264] mce: CPU1: Core temperature above threshold, cpu clock throttled (total events = 185828)
> [181776.914751] mce: CPU1: Core temperature/speed normal
> [182076.914391] mce: CPU1: Core temperature/speed normal
> [182377.322112] mce: CPU1: Core temperature above threshold, cpu clock throttled (total events = 231578)
> [182377.322598] mce: CPU1: Core temperature/speed normal
> --8<---------------cut here---------------end--------------->8---
>
> Is this throttling permanent, or is the throttling released after the
> temperature returns to normal?

In my case, throttling ceased as soon as temps normalized.
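
One thing the quoted log suggests: the "throttled" messages arrive roughly every 300 seconds, while the "total events" counter grows by thousands in between, so the log lines look like periodic reports rather than individual throttle events. A minimal sketch for pulling those deltas out of dmesg-style text follows. The function name throttle_deltas is made up for illustration, and it assumes the message wording is exactly as in the excerpt above.

```shell
# Hypothetical helper: reads dmesg-style text on stdin and, for each
# "cpu clock throttled" line after the first, prints the seconds elapsed
# since the previous such line and the growth of the "total events" counter.
# Assumes the message format shown in the quoted log.
throttle_deltas() {
  awk '
    /cpu clock throttled/ {
      # Kernel timestamp is the bracketed number, e.g. [180270.045081]
      if (!match($0, /\[ *[0-9]+\.[0-9]+\]/)) next
      ts = substr($0, RSTART + 1, RLENGTH - 2) + 0
      # Counter follows the 15-character prefix "total events = "
      if (!match($0, /total events = [0-9]+/)) next
      ev = substr($0, RSTART + 15, RLENGTH - 15) + 0
      if (seen) printf "%.0f s, +%d events\n", ts - prev_ts, ev - prev_ev
      prev_ts = ts; prev_ev = ev; seen = 1
    }
  '
}
```

Usage would be something like `dmesg | throttle_deltas` (reading dmesg may need root on some systems). Lines with a leading "> " quote prefix are handled too, since the patterns are not anchored to the start of the line.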
If it happens to me again, I will try

    lscpu|grep -i '^CPU m'

to see if the throttling is shown in the first number, which right now gets me:

    CPU MHz:             2697.781
    CPU max MHz:         3500.0000
    CPU min MHz:         800.0000

again:

    CPU MHz:             2102.444
    CPU max MHz:         3500.0000
    CPU min MHz:         800.0000

Yeah, so that first number is a current sample.
So I guess running at 800 MHz could make something take 3x or more longer.
...
Hm. I think I am going to see about dust-bunnies in the air ducts :)
...
-- 
Regards,
Bengt Richter
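
P.S. A rough sketch of the arithmetic behind the "3x or more" guess, using the lscpu numbers above. It assumes run time for CPU-bound work scales inversely with clock speed, which ignores memory and I/O effects, and the function name slowdown is just for illustration.

```shell
# Hypothetical helper: estimated slowdown factor when the clock drops
# from $1 MHz to $2 MHz, assuming run time is inversely proportional
# to clock frequency (a simplification for CPU-bound work).
slowdown() {
  awk -v from="$1" -v to="$2" 'BEGIN { printf "%.1fx\n", from / to }'
}

slowdown 3500 800       # max MHz vs. min MHz
slowdown 2697.781 800   # first sample above vs. min MHz
```

So under that assumption the factor would be somewhere around 3.4x to 4.4x if the clock were pinned at the minimum the whole time.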