From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.ciao.gmane.io!not-for-mail From: Andrea Corallo Newsgroups: gmane.emacs.bugs Subject: bug#41615: [feature/native-comp] Dump prettier C code. Date: Sun, 31 May 2020 16:57:40 +0000 Message-ID: References: Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Injection-Info: ciao.gmane.io; posting-host="ciao.gmane.io:159.69.161.202"; logging-data="31912"; mail-complaints-to="usenet@ciao.gmane.io" User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) Cc: 41615@debbugs.gnu.org To: Nicolas =?UTF-8?Q?B=C3=A9rtolo?= Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Sun May 31 19:02:18 2020 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1jfRLl-0008F6-2p for geb-bug-gnu-emacs@m.gmane-mx.org; Sun, 31 May 2020 19:02:17 +0200 Original-Received: from localhost ([::1]:43214 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jfRLk-00022A-2j for geb-bug-gnu-emacs@m.gmane-mx.org; Sun, 31 May 2020 13:02:16 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:57324) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jfRHd-0005WE-TF for bug-gnu-emacs@gnu.org; Sun, 31 May 2020 12:58:01 -0400 Original-Received: from debbugs.gnu.org ([209.51.188.43]:50383) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jfRHd-0004RW-J0 for bug-gnu-emacs@gnu.org; Sun, 31 May 2020 12:58:01 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1jfRHd-000207-IP for bug-gnu-emacs@gnu.org; Sun, 31 May 2020 12:58:01 -0400 X-Loop: help-debbugs@gnu.org Resent-From: Andrea Corallo Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 31 May 2020 16:58:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 41615 X-GNU-PR-Package: emacs Original-Received: via spool by 41615-submit@debbugs.gnu.org id=B41615.15909442657669 (code B ref 41615); Sun, 31 May 2020 16:58:01 +0000 Original-Received: (at 41615) by debbugs.gnu.org; 31 May 2020 16:57:45 +0000 Original-Received: from localhost ([127.0.0.1]:33696 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jfRHN-0001zd-9L for submit@debbugs.gnu.org; Sun, 31 May 2020 12:57:45 -0400 Original-Received: from mx.sdf.org ([205.166.94.20]:52219) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jfRHK-0001zT-NE for 41615@debbugs.gnu.org; Sun, 31 May 2020 12:57:43 -0400 Original-Received: from sdf.org (ma.sdf.org [205.166.94.33]) by mx.sdf.org (8.15.2/8.14.5) with ESMTPS id 04VGveku009373 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256 bits) verified NO); Sun, 31 May 2020 16:57:41 GMT Original-Received: (from akrl@localhost) by sdf.org (8.15.2/8.12.8/Submit) id 04VGveJJ001942; Sun, 31 May 2020 16:57:40 GMT In-Reply-To: ("Nicolas =?UTF-8?Q?B=C3=A9rtolo?="'s message of "Sun, 31 May 2020 12:54:23 -0300") X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.io gmane.emacs.bugs:181298 Archived-At: --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Nicolas B=C3=A9rtolo writes: >> I like this considerably less :) > > Ok, let's say goodbye to this patch. > >> It introduces quite some complexity and the same advantage in >> debuggability can be achieved with something like the attached 8 line >> patch (untested). > > Sounds good, I haven't tested it either. > >> Generally speaking I want to try to keep our back-end as simple as we >> manage to. > > I initially wrote this patch chasing the reason for slow compile times. I= think > that a 10k line C file should be compiled much faster than what gccjit ac= hieves. > I thought that "uncommon" (for C) ways of doing thing were causing gccjit= to get > stuck trying to optimize them hard, until it gave up. I thought that fill= ing the > static data using memcpy() and constant strings would help GCC recognize = this as > a constant initialization and hopefully just store a completely initializ= ed copy > in memory. > > I found that GCC would inline memcpy() and the static initialization woul= d turn > into a very long unrolled loop with SSE instructions. I tested this with = -O3 > only in gccjit to force maximum optimization. I found this super strange > considering that -ftree-loop-distribute-patterns is enabled at -O3 and it= should > recognize the naive_memcpy() function as an implementation of memcpy() an= d issue > calls to libc's implementation. Instead, it was inlining and unrolling it. Ok you confirm the suspects I wrote in the other mail! I've used your patch as a base, apart for minors here and there I've stripped out the definitions of bzero and memcpy. I believe bzero is unnecessary given these are static allocated. For memcpy we can just use the standard library implementation given elns are linked to it. The other advantage is that doing this way (here at least) memcpy is not inlined also at speed 3, so we don't trap in the optimizer issue! All summed is even a little faster than the stock patch and closer to the one with the specific GCC blob support. Let me know if you like the attached and if does the job for you too. Thanks Andrea --=20 akrl@sdf.org --=-=-= Content-Type: text/x-diff; charset=utf-8 Content-Disposition: attachment; filename=0001-Cut-down-compile-time-emitting-static-data-as-string.patch Content-Transfer-Encoding: quoted-printable >From 3efb2808d415f723ade4a0f9f61738e1a707156c Mon Sep 17 00:00:00 2001 From: =3D?UTF-8?q?Nicol=3DC3=3DA1s=3D20B=3DC3=3DA9rtolo?=3D Date: Wed, 20 May 2020 00:34:32 -0300 Subject: [PATCH] * Cut down compile-time emitting static data as string literals MIME-Version: 1.0 Content-Type: text/plain; charset=3DUTF-8 Content-Transfer-Encoding: 8bit This change drastically reduce compile time. Apparently GCC optimizer does not scale up well at all for long sequences of assignments into a single array. Nicol=C3=A1s B=C3=A9rtolo Andrea Corallo * src/comp.c (gcc_jit_context_new_string_literal) (gcc_jit_block_add_assignment_op): New imports. (comp_t): New 'size_t_type' 'memcpy' fields. (emit_static_object): Define static objects using string literals and memcpy. (define_memcpy): New function. (Fcomp__init_ctxt): Define 'size_t_type' and 'memcpy'. --- src/comp.c | 120 ++++++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 105 insertions(+), 15 deletions(-) diff --git a/src/comp.c b/src/comp.c index f288fc2551..81c4d2fe32 100644 --- a/src/comp.c +++ b/src/comp.c @@ -46,6 +46,7 @@ # include "w32common.h" =20 #undef gcc_jit_block_add_assignment +#undef gcc_jit_block_add_assignment_op #undef gcc_jit_block_add_comment #undef gcc_jit_block_add_eval #undef gcc_jit_block_end_with_conditional @@ -75,6 +76,7 @@ #undef gcc_jit_context_new_rvalue_from_int #undef gcc_jit_context_new_rvalue_from_long #undef gcc_jit_context_new_rvalue_from_ptr +#undef gcc_jit_context_new_string_literal #undef gcc_jit_context_new_struct_type #undef gcc_jit_context_new_unary_op #undef gcc_jit_context_new_union_type @@ -164,6 +166,8 @@ DEF_DLL_FN (gcc_jit_rvalue *, gcc_jit_context_new_rvalu= e_from_long, (gcc_jit_context *ctxt, gcc_jit_type *numeric_type, long value= )); DEF_DLL_FN (gcc_jit_rvalue *, gcc_jit_context_new_rvalue_from_ptr, (gcc_jit_context *ctxt, gcc_jit_type *pointer_type, void *valu= e)); +DEF_DLL_FN (gcc_jit_rvalue *, gcc_jit_context_new_string_literal, + (gcc_jit_context *ctxt, const char *value)); DEF_DLL_FN (gcc_jit_rvalue *, gcc_jit_context_new_unary_op, (gcc_jit_context *ctxt, gcc_jit_location *loc, enum gcc_jit_unary_op op, gcc_jit_type *result_type, @@ -197,6 +201,10 @@ DEF_DLL_FN (gcc_jit_type *, gcc_jit_type_get_pointer, = (gcc_jit_type *type)); DEF_DLL_FN (void, gcc_jit_block_add_assignment, (gcc_jit_block *block, gcc_jit_location *loc, gcc_jit_lvalue *= lvalue, gcc_jit_rvalue *rvalue)); +DEF_DLL_FN (void, gcc_jit_block_add_assignment_op, + (gcc_jit_block *block, gcc_jit_location *loc, + gcc_jit_lvalue *lvalue, enum gcc_jit_binary_op op, + gcc_jit_rvalue *rvalue)); DEF_DLL_FN (void, gcc_jit_block_add_eval, (gcc_jit_block *block, gcc_jit_location *loc, gcc_jit_rvalue *rvalue)); @@ -239,6 +247,7 @@ init_gccjit_functions (void) =20 /* In alphabetical order */ LOAD_DLL_FN (library, gcc_jit_block_add_assignment); + LOAD_DLL_FN (library, gcc_jit_block_add_assignment_op); LOAD_DLL_FN (library, gcc_jit_block_add_comment); LOAD_DLL_FN (library, gcc_jit_block_add_eval); LOAD_DLL_FN (library, gcc_jit_block_end_with_conditional); @@ -268,6 +277,7 @@ init_gccjit_functions (void) LOAD_DLL_FN (library, gcc_jit_context_new_rvalue_from_int); LOAD_DLL_FN (library, gcc_jit_context_new_rvalue_from_long); LOAD_DLL_FN (library, gcc_jit_context_new_rvalue_from_ptr); + LOAD_DLL_FN (library, gcc_jit_context_new_string_literal); LOAD_DLL_FN (library, gcc_jit_context_new_struct_type); LOAD_DLL_FN (library, gcc_jit_context_new_unary_op); LOAD_DLL_FN (library, gcc_jit_context_new_union_type); @@ -296,6 +306,7 @@ init_gccjit_functions (void) =20 /* In alphabetical order */ #define gcc_jit_block_add_assignment fn_gcc_jit_block_add_assignment +#define gcc_jit_block_add_assignment_op fn_gcc_jit_block_add_assignment_op #define gcc_jit_block_add_comment fn_gcc_jit_block_add_comment #define gcc_jit_block_add_eval fn_gcc_jit_block_add_eval #define gcc_jit_block_end_with_conditional fn_gcc_jit_block_end_with_condi= tional @@ -325,6 +336,7 @@ #define gcc_jit_context_new_param fn_gcc_jit_context_ne= w_param #define gcc_jit_context_new_rvalue_from_int fn_gcc_jit_context_new_rvalue_= from_int #define gcc_jit_context_new_rvalue_from_long fn_gcc_jit_context_new_rvalue= _from_long #define gcc_jit_context_new_rvalue_from_ptr fn_gcc_jit_context_new_rvalue_= from_ptr +#define gcc_jit_context_new_string_literal fn_gcc_jit_context_new_string_l= iteral #define gcc_jit_context_new_struct_type fn_gcc_jit_context_new_struct_type #define gcc_jit_context_new_unary_op fn_gcc_jit_context_new_unary_op #define gcc_jit_context_new_union_type fn_gcc_jit_context_new_union_type @@ -462,6 +474,7 @@ #define F_RELOC_MAX_SIZE 1500 gcc_jit_type *char_ptr_type; gcc_jit_type *ptrdiff_type; gcc_jit_type *uintptr_type; + gcc_jit_type *size_t_type; #if LISP_WORDS_ARE_POINTERS gcc_jit_type *lisp_X; #endif @@ -548,6 +561,7 @@ #define F_RELOC_MAX_SIZE 1500 gcc_jit_rvalue *data_relocs_ephemeral; /* Synthesized struct holding func relocs. */ gcc_jit_lvalue *func_relocs; + gcc_jit_function *memcpy; Lisp_Object d_default_idx; Lisp_Object d_impure_idx; Lisp_Object d_ephemeral_idx; @@ -2347,7 +2361,7 @@ emit_static_object (const char *name, Lisp_Object obj) /* libgccjit has no support for initialized static data. The mechanism below is certainly not aesthetic but I assume the bottl= e neck in terms of performance at load time will still be the reader. - NOTE: we can not relay on libgccjit even for valid NULL terminated C + NOTE: we can not rely on libgccjit even for valid NULL terminated C strings cause of this funny bug that will affect all pre gcc10 era gc= cs: https://gcc.gnu.org/ml/jit/2019-q3/msg00013.html */ =20 @@ -2405,22 +2419,78 @@ emit_static_object (const char *name, Lisp_Object o= bj) gcc_jit_lvalue *arr =3D gcc_jit_lvalue_access_field (data_struct, NULL, fields[1]); =20 - for (ptrdiff_t i =3D 0; i < len; i++, p++) + gcc_jit_lvalue *ptrvar =3D gcc_jit_function_new_local (f, NULL, + comp.char_ptr_type, + "ptr"); + + gcc_jit_block_add_assignment ( + block, + NULL, + ptrvar, + gcc_jit_lvalue_get_address ( + gcc_jit_context_new_array_access ( + comp.ctxt, + NULL, + gcc_jit_lvalue_as_rvalue (arr), + gcc_jit_context_new_rvalue_from_int (comp.ctxt, comp.int_type, 0)), + NULL)); + + for (ptrdiff_t i =3D 0; i < len;) { - gcc_jit_block_add_assignment ( - block, - NULL, - gcc_jit_context_new_array_access ( - comp.ctxt, - NULL, - gcc_jit_lvalue_as_rvalue (arr), - gcc_jit_context_new_rvalue_from_int (comp.ctxt, - comp.ptrdiff_type, - i)), - gcc_jit_context_new_rvalue_from_int (comp.ctxt, - comp.char_type, - *p)); + /* We can't use string literals longer that 200 bytes because + they cause a crash in older versions of gccjit. + https://gcc.gnu.org/ml/jit/2019-q3/msg00013.html. */ + char str[200]; + strncpy (str, p, 200); + str[199] =3D 0; + uintptr_t l =3D strlen (str); + + if (l !=3D 0) + { + p +=3D l; + i +=3D l; + + gcc_jit_rvalue *args[3] + =3D {gcc_jit_lvalue_as_rvalue (ptrvar), + gcc_jit_context_new_string_literal (comp.ctxt, str), + gcc_jit_context_new_rvalue_from_int (comp.ctxt, + comp.size_t_type, + l)}; + + gcc_jit_block_add_eval (block, NULL, + gcc_jit_context_new_call (comp.ctxt, NUL= L, + comp.memcpy, + ARRAYELTS (arg= s), + args)); + gcc_jit_block_add_assignment (block, NULL, ptrvar, + gcc_jit_lvalue_get_address ( + gcc_jit_context_new_array_access (comp.ctxt, NULL, + gcc_jit_lvalue_as_rvalue (ptrvar), + gcc_jit_context_new_rvalue_from_int (comp.ctxt, + comp.uintptr_type, + l)), + NULL)); + } + else + { + /* If strlen returned 0 that means that the static object + contains a NULL byte. In that case just move over to the + next block. We can rely on the byte being zero because + of the previous call to bzero and because the dynamic + linker cleared it. */ + p++; + i++; + gcc_jit_block_add_assignment ( + block, NULL, ptrvar, + gcc_jit_lvalue_get_address ( + gcc_jit_context_new_array_access ( + comp.ctxt, NULL, gcc_jit_lvalue_as_rvalue (ptrvar), + gcc_jit_context_new_rvalue_from_int (comp.ctxt, + comp.uintptr_type, 1)= ), + NULL)); + } } + gcc_jit_block_add_assignment ( block, NULL, @@ -2766,6 +2836,21 @@ define_jmp_buf (void) 1, &field); } =20 +static void +define_memcpy (void) +{ + + gcc_jit_param *params[] =3D + { gcc_jit_context_new_param (comp.ctxt, NULL, comp.void_ptr_type, "des= t"), + gcc_jit_context_new_param (comp.ctxt, NULL, comp.void_ptr_type, "src= "), + gcc_jit_context_new_param (comp.ctxt, NULL, comp.size_t_type, "n") }; + + comp.memcpy =3D + gcc_jit_context_new_function (comp.ctxt, NULL, GCC_JIT_FUNCTION_IMPORT= ED, + comp.void_ptr_type, "memcpy", + ARRAYELTS (params), params, false); +} + /* struct handler definition */ =20 static void @@ -3772,6 +3857,9 @@ DEFUN ("comp--init-ctxt", Fcomp__init_ctxt, Scomp__in= it_ctxt, comp.uintptr_type =3D gcc_jit_context_get_int_type (comp.ctxt, sizeof (void *), false); + comp.size_t_type =3D gcc_jit_context_get_int_type (comp.ctxt, + sizeof (size_t), + false); =20 comp.exported_funcs_h =3D CALLN (Fmake_hash_table, QCtest, Qequal); /* @@ -3780,6 +3868,8 @@ DEFUN ("comp--init-ctxt", Fcomp__init_ctxt, Scomp__in= it_ctxt, */ comp.imported_funcs_h =3D CALLN (Fmake_hash_table); =20 + define_memcpy (); + /* Define data structures. */ =20 define_lisp_cons (); --=20 2.17.1 --=-=-=--