From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail From: Eli Zaretskii Newsgroups: gmane.emacs.devel Subject: Re: Emacs build fails [MSYS2/MINGW64] Date: Mon, 29 Apr 2019 17:53:27 +0300 Message-ID: <83imuwual4.fsf@gnu.org> References: <1443630449.1251981.1556465864402@mail.libero.it> <6790fa0c-6544-df50-b4f1-a77ca0daea1f@cs.ucla.edu> <1118604098.1569315.1556538785729@mail.libero.it> Injection-Info: blaine.gmane.org; posting-host="blaine.gmane.org:195.159.176.226"; logging-data="208648"; mail-complaints-to="usenet@blaine.gmane.org" Cc: eggert@cs.ucla.edu, angelo.g0@libero.it, emacs-devel@gnu.org To: martin rudalics Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Mon Apr 29 16:54:17 2019 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([209.51.188.17]) by blaine.gmane.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:256) (Exim 4.89) (envelope-from ) id 1hL7fb-000s52-Pz for ged-emacs-devel@m.gmane.org; Mon, 29 Apr 2019 16:54:16 +0200 Original-Received: from localhost ([127.0.0.1]:58721 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hL7fa-0004qJ-65 for ged-emacs-devel@m.gmane.org; Mon, 29 Apr 2019 10:54:14 -0400 Original-Received: from eggs.gnu.org ([209.51.188.92]:50379) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hL7ez-0004py-Kt for emacs-devel@gnu.org; Mon, 29 Apr 2019 10:53:39 -0400 Original-Received: from fencepost.gnu.org ([2001:470:142:3::e]:55764) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hL7ey-0006Wp-SE; Mon, 29 Apr 2019 10:53:36 -0400 Original-Received: from [176.228.60.248] (port=4043 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1hL7ey-0003uO-8O; Mon, 29 Apr 2019 10:53:36 -0400 In-reply-to: (message from martin rudalics on Mon, 29 Apr 2019 14:49:42 +0200) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.org gmane.emacs.devel:236028 Archived-At: > From: martin rudalics > Date: Mon, 29 Apr 2019 14:49:42 +0200 > Cc: emacs-devel@gnu.org > > > It seems that it is the optimization flag which causes the issue. Using "-O0" on place of "-O2" or "-O3" in my script fixes the issue... > > Same here. See also Bug#35410 for different behaviors of O3 and O0 > builds with gcc 7.3.0 32-bit builds using MSYS2. So GCC 7.x might also have problems with 'cold'... I can confirm that building with MinGW GCC 8.2.0 and -O2 causes crashes in several places during the bootstrap. The crashes that I saw are different from what Angelo reported, and are also different from the report in bug#35409, but I guess this is a kind of Heisenbug, so I'm not surprised to see it pop in different places on different systems. In one case the cause of the crash was a wrong-type-argument error where the offending object was a pseudo-vector of the type PVEC_FREE, i.e. it was already put on the free list; this caused a print.c function to barf trying to display the object. Clearly, some bad code was involved, and I suspect that bad code came from GCC, not from us. Just applying the simple patch below allows the build to run flawlessly to completion: --- src/conf_post.h~0 2019-04-28 09:39:12.000000000 +0300 +++ src/conf_post.h 2019-04-29 08:08:30.619991000 +0300 @@ -225,7 +225,7 @@ extern char *emacs_getenv_TZ (void); extern int emacs_setenv_TZ (char const *); -#if __has_attribute (cold) +#if __has_attribute_cold && !defined(__MINGW32__) # define ATTRIBUTE_COLD __attribute__ ((cold)) #else # define ATTRIBUTE_COLD This could be a MinGW-specific bug, but then again, it could be that other platforms are just lucky, or maybe not enough people tried to build the master branch with optimizations lately. Stepping back a little, I'm not sure we want this ATTRIBUTE_COLD business in Emacs, on any platform. At the very least, it causes unintended consequences, see, for example, the comments by Florian Weimer in this GCC bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88793 Also, the GCC manual indicates that marking a function as 'cold' causes that function to be optimized for size, and we already know that GCC 8.x has at least one bug with -Os used together with -O2, which seems to be exactly what happens when 'cold' is being used in an otherwise -O2 build. Moreover, I think we have way too many functions marked with this attribute. Functions like wrong_type_argument or Fthrow or Fabort_recursive_edit or Fsignal or error or Ftop_level -- do they really fit the below description? The 'cold' attribute on functions is used to inform the compiler that the function is unlikely to be executed. The function is optimized for size rather than speed and on many targets it is placed into a special subsection of the text section so all cold functions appear close together, improving code locality of non-cold parts of program. The paths leading to calls of cold functions within code are marked as unlikely by the branch prediction mechanism. It is thus useful to mark functions used to handle unlikely conditions, such as 'perror', as cold to improve optimization of hot functions that do call marked functions in rare occasions. I don't think the above functions are called "rarely" enough in Emacs to justify marking them as 'cold'. Stuff like emacs_abort or memory_full, maybe; but functions that jump to top-level are not in that class, IMO, not in Emacs Lisp. The problem I see with marking too many functions 'cold' is that then any path leading to those functions is also considered "unlikely", and who knows what that does to our code? "nm -A" on an Emacs executable compiled with ATTRIBUTE_COLD shows no less than 2400(!) local symbols with the ".cold" suffix, whereas a "normal" executable has just 7 of them. And what are the benefits we get for all this risky business? IMO, we should simply avoid this feature. But if we leave it in the sources, we should disable it for MinGW, it seems.