From: Nicolas Bértolo
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] [WIP] Port feature/native-comp to Windows.
Date: Sat, 9 May 2020 12:28:29 -0300
References: <5eb5b953.1c69fb81.a67ce.a764@mx.google.com> <83lfm1hc91.fsf@gnu.org>
In-Reply-To: <83lfm1hc91.fsf@gnu.org>
To: Eli Zaretskii
Cc: emacs-devel@gnu.org

>> As I said above other architectures or compilers should work, but it may be necessary to change the code
>> that generates calls to setjmp(), since there are many ways to do it. An option would be to copy what the
>> setjmp.h header does, but I do not think it is a good idea. The proper fix would be to use autoconf to detect it
>> somehow.

> Could you elaborate why this is an issue, and what exactly are the
> details that need to be adapted to a different setjmp implementation?

> Also, did you try compiling the modified code with the 32-bit MinGW64
> compiler?

I haven't tried to compile it with the 32-bit compiler.

There are many ways to call setjmp() on Windows. It depends on the
architecture, whether the Universal CRT is used, whether SEH is enabled,
the compiler version, etc.

#define setjmp(BUF) _setjmp3((BUF), NULL)
#define setjmp(BUF) __mingw_setjmp((BUF))
#define setjmp(BUF) _setjmp((BUF), __builtin_sponentry())
#define setjmp(BUF) _setjmp((BUF), mingw_getsp())
#define setjmp(BUF) _setjmp((BUF), __builtin_frame_address (0))
#define setjmp(BUF) _setjmp((BUF), NULL)

_setjmp may be a call to a function or a macro:

#define _setjmp __intrinsic_setjmp
#define _setjmp __intrinsic_setjmpex
#define _setjmp _setjmpex

This is nicely abstracted through a macro in the setjmp.h header.
See https://sourceforge.net/p/mingw-w64/mingw-w64/ci/master/tree/mingw-w64-headers/crt/setjmp.h

On my machine (Windows 10, GCC 10.0 64 bits) the version that works is:

#define setjmp(BUF) _setjmp((BUF), __builtin_frame_address (0))

where _setjmp is a call to the function.

libgccjit does not implement a preprocessor, so we need to create a
function call to the proper function with the proper arguments for the
system. To do this it is necessary to know what function to call and
what arguments to give it. An option would be to copy the logic from
the Mingw64 header. I don't like this for two reasons:

- The header may change as Microsoft adds more stuff to its C runtime or
  something else is discovered through reverse engineering. This would
  lead to weird bugs when Emacs is compiled with one version of setjmp()
  but generates calls to it in a different style.
- There may be licensing issues?

>> Another issue is that the “emacs_dir” environment variable needs to be set quite early in the initialization
>> process. I do not know enough about the Emacs internals to make the proper changes for that, so I just
>> added a dirty hack.

> Why is this a problem for the native-compile version?

The load_pdump() function calls it. I haven't found out why yet.

>> There is a remaining issue involving the environment in which emacs runs. The libgccjit likes to run in a
>> Mingw64 environment, so it can find the assembler, linker, etc.

> What do you mean by Mingw64 environment? Do you mean the MSYS
> environment, i.e. the one that uses Bash and can run Unix shell
> scripts? If so, why is that needed? Compiler passes are native
> Windows programs, not MSYS programs. Is this something special to how
> libgccjit was ported to MinGW?

> This should be fixed, but I don't think I understand enough to propose
> the way of fixing it.

> I think describing the environment and the need for having it in this
> case will be a significant first step towards resolving the problems.

I tried copying the assembler and linker (as.exe and ld.exe) into the
folder where emacs.exe lives. It is necessary to add that folder to
PATH; that is the first issue I found. Doing that was enough to make it
work up to the point where the linker needs to find the Windows
libraries. If I remove the MSYS installation folder then it fails with
these errors:

libgccjit.so: error: error invoking gcc driver

-or-

ld: cannot find dllcrt2.o: No such file or directory
ld: cannot find crtbegin.o: No such file or directory
ld: cannot find -lmingw32
ld: cannot find -lgcc_s
ld: cannot find -lgcc
ld: cannot find -lmoldname
ld: cannot find -lmingwex
ld: cannot find -lmsvcrt
ld: cannot find -lpthread
ld: cannot find -ladvapi32
ld: cannot find -lshell32
ld: cannot find -luser32
ld: cannot find -lkernel32
ld: cannot find -lmingw32
ld: cannot find -lgcc_s
ld: cannot find -lgcc
ld: cannot find -lmoldname
ld: cannot find -lmingwex
ld: cannot find -lmsvcrt
ld: cannot find crtend.o: No such file or directory

You are right when you say that they are native Windows programs. They
don't need a "pseudo-unix" environment like I said previously. But they
need some support files from the MSYS installation. I haven't figured
out which ones yet.

>> Subject: [PATCH 4/6] Handle LISP_WORDS_ARE_POINTERS and
>> CHECK_LISP_OBJECT_TYPE.

> Is this specific to MS-Windows? If so, what is the MS-Windows
> specific aspects of native compilation that require this?

This is partially specific to Windows. I had trouble compiling it with
the `--enable-check-lisp-object-type` configure option, so I had to add
support for it.

One aspect that is specific to Windows is that sizeof(void*) !=
sizeof(long) even if WIDE_EMACS_INT is not defined. The code assumed
that sizeof(Lisp_Word) == sizeof(long) if WIDE_EMACS_INT was not defined.
I fixed this by adding many types that represent the Lisp_* family and
changing the code to use these instead of long and long long.

>> Subject: [PATCH 5/6] Remove a layer of indirection for access to pure storage.

> Same questions here.

This one is definitely not Windows specific. There was a bug that caused
PURE_P() to be implemented incorrectly in the generated code.

It defined a variable `pure_reloc` of type `void**` that was supposed to
store a pointer to a pointer to pure storage. It was initialized to
`(EMACS_INT**)&pure`. This expression evaluates to a pointer of type
`EMACS_INT**` that points to the start of pure_storage. Since the
generated code dereferenced this pointer, it was implementing PURE_P() as

bool
PURE_P(void* ptr)
{
  return ((uintptr_t) ptr - (uintptr_t)pure[0]) <= PURESIZE;
}

In my tests `pure[0]` == 2.

This bug caused the native compiler to crash by calling
pure_write_error(). It is strange that this was not detected in
GNU/Linux. I conjecture that all Lisp objects are allocated in addresses
higher than `pure` in GNU/Linux and that they are higher than `pure[0]`
too. I am not sure though.

This is not the case in Windows. It is possible to have Lisp objects
that are allocated below `pure`.

Nicolas

On Sat, 9 May 2020 at 3:08, Eli Zaretskii wrote:
>
> > Date: Fri, 8 May 2020 16:55:59 -0300
> > From: Nicolas Bertolo
> >
> > I have ported the feature/native-comp branch to Windows. I have tested my changes in Windows 10 x64 with
> > Mingw64 GCC 10.0. Other architectures or compilers should work, but it may be necessary to adjust the
> > code a little bit.
>
> Great news, thank you for working on this.
>
> > As I said above other architectures or compilers should work, but it may be necessary to change the code
> > that generates calls to setjmp(), since there are many ways to do it. An option would be to copy what the
> > setjmp.h header does, but I do not think it is a good idea.
> > The proper fix would be to use autoconf to detect it
> > somehow.
>
> Could you elaborate why this is an issue, and what exactly are the
> details that need to be adapted to a different setjmp implementation?
>
> Also, did you try compiling the modified code with the 32-bit MinGW64
> compiler?
>
> > Another issue is that the “emacs_dir” environment variable needs to be set quite early in the initialization
> > process. I do not know enough about the Emacs internals to make the proper changes for that, so I just
> > added a dirty hack.
>
> Why is this a problem for the native-compile version?
>
> > There is a remaining issue involving the environment in which emacs runs. The libgccjit likes to run in a
> > Mingw64 environment, so it can find the assembler, linker, etc.
>
> What do you mean by Mingw64 environment? Do you mean the MSYS
> environment, i.e. the one that uses Bash and can run Unix shell
> scripts? If so, why is that needed? Compiler passes are native
> Windows programs, not MSYS programs. Is this something special to how
> libgccjit was ported to MinGW?
>
> > This is means that Emacs needs to run in a pseudo-Unix
> > environment. I don’t like this since this environment would be
> > propagated to other processes launched by Emacs, and they may not
> > like this.
>
> This should be fixed, but I don't think I understand enough to propose
> the way of fixing it.
>
> > I have thought about a simple fix to this but I haven’t implemented it yet. The Emacs process that needs to
> > run in a Mingw64 environment is actually the subprocess that performs the compilation, not the main
> > process that runs the editor. So my idea is to run this subprocess through a small script that setups the
> > environment that libgccjit expects without polluting the Emacs environment.
>
> I think describing the environment and the need for having it in this
> case will be a significant first step towards resolving the problems.
>
> > Subject: [PATCH 4/6] Handle LISP_WORDS_ARE_POINTERS and
> > CHECK_LISP_OBJECT_TYPE.
> >
> > * src/comp.c: Introduce the Lisp_X, Lisp_Word, and Lisp_Word_tag
> > types. These types are used instead of long or long long. Use
> > emacs_int_type and emacs_uint_types where appropriate.
> > (emit_coerce): Add special logic that handles the case when
> > Lisp_Object is a struct. This is necessary for handling the
> > --enable-check-lisp-object-type configure option.
> >
> > * src/lisp.h: Since libgccjit does not support opaque unions, change
> > Lisp_X to be struct. This is done to ensure that the same types are
> > used in the same binary. It is probably unnecessary since only a
> > pointer to it is used.
>
> Is this specific to MS-Windows? If so, what is the MS-Windows
> specific aspects of native compilation that require this?
>
> > Subject: [PATCH 5/6] Remove a layer of indirection for access to pure storage.
> >
> > * src/comp.c: Taking the address of an array is the same as casting it
> > to a pointer. Therefore, the C expression `(EMACS_INT **) &pure` is in
> > fact adding a layer of indirection that is not necessary. The fix is
> > to cast the `pure` array to a pointer and store that in a void pointer
> > that is part of the compiled shared library.
>
> Same questions here.
>
> Thanks.