From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: =?utf-8?Q?Gerd_M=C3=B6llmann?= <gerd.moellmann@gmail.com>
Newsgroups: gmane.emacs.devel
Subject: Re: MPS: Please check if scratch/igc builds with native compilation
Date: Tue, 21 May 2024 19:06:42 +0200
Message-ID: <m2jzjnumal.fsf@pro2.fritz.box>
References: <m24jar8duf.fsf@pro2.fritz.box> <yp1wmnn14t9.fsf@fencepost.gnu.org>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214";
	logging-data="17121"; mail-complaints-to="usenet@ciao.gmane.io"
User-Agent: Gnus/5.13 (Gnus v5.13)
Cc: Emacs Devel <emacs-devel@gnu.org>,  Eli Zaretskii <eliz@gnu.org>,
 Helmut Eller <eller.helmut@gmail.com>
To: Andrea Corallo <acorallo@gnu.org>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Tue May 21 19:07:38 2024
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane-mx.org
Original-Received: from lists.gnu.org ([209.51.188.17])
	by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
	(Exim 4.92)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org>)
	id 1s9Sxe-0004Ei-KZ
	for ged-emacs-devel@m.gmane-mx.org; Tue, 21 May 2024 19:07:38 +0200
Original-Received: from localhost ([::1] helo=lists1p.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.90_1)
	(envelope-from <emacs-devel-bounces@gnu.org>)
	id 1s9SxB-00052H-5I; Tue, 21 May 2024 13:07:09 -0400
Original-Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <gerd.moellmann@gmail.com>)
 id 1s9Swv-00050X-VO
 for emacs-devel@gnu.org; Tue, 21 May 2024 13:06:56 -0400
Original-Received: from mail-lf1-x12a.google.com ([2a00:1450:4864:20::12a])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 (Exim 4.90_1) (envelope-from <gerd.moellmann@gmail.com>)
 id 1s9Swq-0005PA-HW; Tue, 21 May 2024 13:06:51 -0400
Original-Received: by mail-lf1-x12a.google.com with SMTP id
 2adb3069b0e04-5210684cee6so5734043e87.0; 
 Tue, 21 May 2024 10:06:46 -0700 (PDT)
Original-Received: from pro2.fritz.box (pd9e36251.dip0.t-ipconnect.de. [217.227.98.81])
 by smtp.gmail.com with ESMTPSA id
 a640c23a62f3a-a5a50365669sm1396446066b.193.2024.05.21.10.06.42
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Tue, 21 May 2024 10:06:43 -0700 (PDT)
In-Reply-To: <yp1wmnn14t9.fsf@fencepost.gnu.org> (Andrea Corallo's message of
 "Tue, 21 May 2024 12:57:06 -0400")
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-devel>,
 <mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <https://lists.gnu.org/archive/html/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-devel>,
 <mailto:emacs-devel-request@gnu.org?subject=subscribe>
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org
Xref: news.gmane.io gmane.emacs.devel:319447
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/319447>

Here's something about my debugging attempts so far:

I'm throwing in the towel wrt native compilation with MPS on macOS.
Which makes it a failure for me.

The situation is as follows:

When building with native compilation with --enable-checking=all, I am
observing errors of the form

  igc.c:345: Emacs fatal error: assertion failed: h->obj_type != IGC_OBJ_FWD

when compiling Lisp files, for example

  ELC+ELN  ../lisp/international/mule-cmds.elc
  ELC+ELN  ../lisp/files.elc

Which file triggers the error is not predictable, and the error is not
reproducible when running under LLDB, with or without ASLR.

To debug this, I changed the check in igc.c to not assert, but print
the PID, and enter an endless loop sleeping. This makes it possible to
attach to the process with LLDB.
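
A minimal sketch of that trick, for illustration only. The names
wait_for_debugger, debugger_attached and CHECK_OR_WAIT are hypothetical
and this is not the actual change made in igc.c:

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical flag.  Set it to 1 from the debugger to leave the loop. */
static volatile int debugger_attached = 0;

/* Report the failed check together with the PID, then sleep until
   someone attaches a debugger and sets debugger_attached.
   Returns the number of one-second sleeps that were performed.  */
static unsigned
wait_for_debugger (const char *file, int line, const char *what)
{
  unsigned naps = 0;
  fprintf (stderr, "%s:%d: check failed: %s (pid %ld)\n",
           file, line, what, (long) getpid ());
  fflush (stderr);
  while (!debugger_attached)
    {
      sleep (1);
      naps++;
    }
  return naps;
}

/* Use in place of an assertion: do not abort, wait instead.  */
#define CHECK_OR_WAIT(cond)                                 \
  do                                                        \
    {                                                       \
      if (!(cond))                                          \
        wait_for_debugger (__FILE__, __LINE__, #cond);      \
    }                                                       \
  while (0)
```

With something like this in place, one would attach with
"lldb -p <pid>" using the printed PID, inspect the state, and, if
desired, resume with "expr debugger_attached = 1" followed by
"continue".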

In all cases I investigated in this way, I'm seeing a pattern: A
function in the Emacs core is called from a native-compiled function.
Simplified, things look like this:

  /* In some .eln */
  Lisp_Object d_reloc[100];

  Lisp_Object some_native_compiled_lisp_function ()
  {
    Lisp_Object frame[2];
    frame[0] = d_reloc[17]; // some symbol
    frame[1] = ...
    f_reloc->funcall (2, frame);
  }

where f_reloc is a large struct with function pointer members for the
functions being called from the .eln. Its details don't matter here. We
then land in Ffuncall in the Emacs core, and the first element of its
args vector, a symbol, is found to be forwarded, which leads to the
assertion.

d_reloc in the .eln is scanned in igc.c, and the symbol being referenced
from the control stack, in frame[], or from a register should pin it,
one would assume. So how come Ffuncall in Emacs receives an invalid
symbol?

I've checked that d_reloc is indeed scanned by fix_comp_unit. The
check gives me reasonable confidence that this "should work". But as
an alternative, I also made all the things like d_reloc in the .elns
ambiguous roots, so that they cannot possibly be moved, if all works as
expected.

- No change, it still asserts in the same way.

- Changing optimization levels - no change.
- Changing from arm64 to x86_64 - no change.
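
For readers less familiar with moving collectors, the expectation behind
the ambiguous-root experiment can be illustrated with a toy model. This
is not MPS code and not how igc.c works; all names (toy_collect, struct
obj, OBJ_FWD) are made up. It only shows the invariant assumed above: an
object referenced from an ambiguous root stays in place, while an object
referenced only from exact roots may be moved, leaving a forwarding
marker behind, with the exact root updated to the new address.

```c
#include <stdbool.h>
#include <stddef.h>

enum obj_type { OBJ_SYMBOL, OBJ_FWD };

struct obj
{
  enum obj_type type;
  struct obj *forward;   /* Meaningful only when type == OBJ_FWD.  */
};

/* Toy "collection": move every object in FROM to TO unless an
   ambiguous root points at it, then fix up the exact roots.  */
static void
toy_collect (struct obj *from, struct obj *to, size_t n,
             struct obj **ambig_roots, size_t n_ambig,
             struct obj **exact_roots, size_t n_exact)
{
  for (size_t i = 0; i < n; i++)
    {
      bool pinned = false;
      for (size_t r = 0; r < n_ambig; r++)
        if (ambig_roots[r] == &from[i])
          pinned = true;
      if (pinned)
        continue;               /* Pinned objects must not move.  */
      to[i] = from[i];          /* Copy to the new location...  */
      from[i].type = OBJ_FWD;   /* ...and leave a forwarding marker.  */
      from[i].forward = &to[i];
    }

  /* Exact roots are rewritten to point at the new copies.  */
  for (size_t r = 0; r < n_exact; r++)
    if (exact_roots[r] != NULL && exact_roots[r]->type == OBJ_FWD)
      exact_roots[r] = exact_roots[r]->forward;
}
```

In this model, nothing that follows a scanned root can end up looking at
an OBJ_FWD object, which is why seeing one in Ffuncall's args is so
surprising.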