From mboxrd@z Thu Jan 1 00:00:00 1970 From: Matteo Frigo Subject: Re: fftwf tests running for 25+ hours - is this normal? Date: Tue, 15 Oct 2019 10:12:31 -0400 Message-ID: <878spmumgg.fsf@fftw.org> References: <87o8yi4qo6.fsf@gmail.com> Mime-Version: 1.0 Content-Type: text/plain Return-path: Received: from eggs.gnu.org ([2001:470:142:3::10]:36063) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1iKNny-0006mB-8D for guix-devel@gnu.org; Tue, 15 Oct 2019 10:28:07 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1iKNnw-00005c-5z for guix-devel@gnu.org; Tue, 15 Oct 2019 10:28:05 -0400 Received: from [75.98.172.202] (port=46810 helo=fftw.org) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1iKNnu-0008RL-6d for guix-devel@gnu.org; Tue, 15 Oct 2019 10:28:03 -0400 In-Reply-To: <87o8yi4qo6.fsf@gmail.com> (Chris Marusich's message of "Mon, 14 Oct 2019 20:46:01 -0700") List-Id: "Development of GNU Guix and the GNU System distribution." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org Sender: "Guix-devel" To: Chris Marusich Cc: guix-devel@gnu.org, fftw@fftw.org Chris, thanks for reporting this issue. (By the way, did the test ever finish?) Most likely, the problem is in FFTW and not in your machine. Would you mind reporting the exact configure options (I care specifically about --enable-sse2, --enable-avx, etc.) and the exact command line that is taking 25h? The issue is probably related to the funny integer 34320. The FFT algorithm depends on the factorization of the size, 34320 in this case, which is 34320=2*2*2*2*3*5*11*13. FFTW tries to find the optimal way to decompose 34320 into small problems that it can solve. For example, it may reduce a DFT of size 34320 into 34320/13 FFTs of size 13 followed by 13 FFTs of size 34320/13. Or it may reduce a DFT of size 34320 into 34320/11 FFTs of size 11 followed by 11 FFTs of size 34320/11. Because there are many distinct prime factors, the number of factorizations is really large and FFTW is pretty much trying them all. The problem is even harder once you consider the fact that some sizes are odd, and thus they don't naturally match the vector length of SSE/AVX instructions. FFTW has some heuristics to limit the search space in cases like this, but these heuristics were implemented in a simpler world (before AVX) and maybe they are not adequate any longer, so I want to look into this.