* MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
@ 2024-07-14  4:12 Gerd Möllmann
  2024-07-14  5:30 ` Pip Cet
  0 siblings, 1 reply; 27+ messages in thread
From: Gerd Möllmann @ 2024-07-14  4:12 UTC (permalink / raw)
  To: Emacs Devel; +Cc: Pip Cet, Helmut Eller

I'm seeing this assertion sometimes in an Emacs built with
--enable-checking=igc_debug,igc_check_fwd, where sometimes means it can
take days of using/running Emacs, or it can take a couple of hours.
This is macOS 14, arm64. I'm linking with -lmps-debug.

The assertion means that we likely have a reference somewhere that
isn't traced. Because it isn't traced, the reference isn't changed to
point to the new location when the object being referenced is copied
to a new address in memory. Instead, it points to kind of a tombstone
that is left behind when the object is moved.
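
The mechanism described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the names toy_obj, toy_move, toy_fix and toy_is_stale are hypothetical and are not taken from igc.c or MPS; only the idea of a forwarding marker corresponds to IGC_OBJ_FWD.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object kinds; TOY_OBJ_FWD plays the role of a tombstone.  */
enum toy_type { TOY_OBJ_CONS, TOY_OBJ_FWD };

struct toy_obj
{
  enum toy_type type;
  struct toy_obj *new_location;	/* meaningful only for TOY_OBJ_FWD */
  int payload;
};

/* Copy OLD to TO and leave a forwarding marker behind at the old address.  */
static struct toy_obj *
toy_move (struct toy_obj *old, struct toy_obj *to)
{
  assert (old != NULL && to != NULL);
  *to = *old;
  old->type = TOY_OBJ_FWD;
  old->new_location = to;
  return to;
}

/* What happens to a traced reference: it is updated to the new address.
   An untraced reference never passes through this function.  */
static void
toy_fix (struct toy_obj **ref)
{
  if ((*ref)->type == TOY_OBJ_FWD)
    *ref = (*ref)->new_location;
}

/* Return 1 if REF still points at a tombstone, which is the condition
   the failing assertion reports.  */
static int
toy_is_stale (const struct toy_obj *ref)
{
  return ref->type == TOY_OBJ_FWD;
}
```

In this model, a reference that was passed to toy_fix after the move ends up at the copy, while one that was skipped still points at the tombstone and makes toy_is_stale return 1.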

Alas, I haven't been able to debug this. One problem is that I
can't reproduce it easily, the other is that it is either not
happening or happening much less often when building with -O0, and
without -O0 I can't see much here.

This is just to let people know of the problem. If you find a recipe
how to reproduce this, please let me know. Or better yet, debug it :-).



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-14  4:12 MPS: assertion failed: header_type (h) != IGC_OBJ_FWD Gerd Möllmann
@ 2024-07-14  5:30 ` Pip Cet
  2024-07-14  7:00   ` Gerd Möllmann
  0 siblings, 1 reply; 27+ messages in thread
From: Pip Cet @ 2024-07-14  5:30 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Emacs Devel, Helmut Eller

On Sunday, July 14th, 2024 at 04:12, Gerd Möllmann <gerd.moellmann@gmail.com> wrote:
> I'm seeing this assertion sometimes in an Emacs built with
> --enable-checking=igc_debug,igc_check_fwd, where sometimes means it can
> take days of using/running Emacs, or it can take a couple of hours.
> This is macOS 14, arm64. I'm linking with -lmps-debug.

That means lldb and no core dumps, right? (I've had to work with lldb to make the Android port work with mps, and it's decidedly Not My GDB).

> The assertion means that we likely have a reference somewhere that
> isn't traced. Because it isn't traced, the reference isn't changed to
> point to the new location when the object being referenced is copied
> to a new address in memory. Instead, it points to kind of a tombstone
> that is left behind when the object is moved.

My approach would be to try to capture it in a debugger, then follow the forwarding pointer and find out what kind of object the pointer should be referring to.

I'm thinking, though, about how to increase pressure to flush out such bugs. Here are some ideas:

1. scan xmalloc'd memory for pointers that refer to MPS-managed objects.
2. reduce generation sizes and increase the number of generations, making it more likely objects will be copied.
3. trigger GC regularly while allocating objects
4. Hack MPS to do something.
5. Keep a log of forwarded objects and their old/new pointers

(1) seems the most complete approach but relies on unusual pointer representations to reduce the number of false positives (and even then, they might be pinned objects and then the pointer is okay...).

(2) is easy to do, but impacts performance. (3) is easy to do, but impacts performance a lot. I'd prefer avoiding (4), and while (5) is doable it's probably unnecessary: you can just read the forwarding pointer if it's still there.

> Alas, I haven't been able to debug this. One problem is that I
> can't reproduce it easily, the other is that it is either not
> happening or happening much less often when building with -O0, and
> without -O0 I can't see much here.

I assume it's being accessed from Lisp or in an exception handler? Which optimization options are you using?

> This is just to let people know of the problem. If you find a recipe
> how to reproduce this, please let me know. Or better yet, debug it :-).

Well, it's possible, but quite unlikely, that it is the handlerlist_sentinel thing or the Lisp_Mutex->name thing (both fixed). Much more likely it's another issue.

Pip




* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-14  5:30 ` Pip Cet
@ 2024-07-14  7:00   ` Gerd Möllmann
  2024-07-14  7:08     ` Gerd Möllmann
  2024-07-16 13:02     ` Gerd Möllmann
  0 siblings, 2 replies; 27+ messages in thread
From: Gerd Möllmann @ 2024-07-14  7:00 UTC (permalink / raw)
  To: Pip Cet; +Cc: Emacs Devel, Helmut Eller

Pip Cet <pipcet@protonmail.com> writes:

> On Sunday, July 14th, 2024 at 04:12, Gerd Möllmann <gerd.moellmann@gmail.com> wrote:
>> I'm seeing this assertion sometimes in an Emacs built with
>> --enable-checking=igc_debug,igc_check_fwd, where sometimes means it can
>> take days of using/running Emacs, or it can take a couple of hours.
>> This is macOS 14, arm64. I'm linking with -lmps-debug.
>
> That means lldb and no core dumps, right? (I've had to work with lldb
> to make the Android port work with mps, and it's decidedly Not My
> GDB).

Yes, no GDB for macOS/arm64. I mean LLDB is okay for the usual
debugging, but in this case hm. But I don't know if GDB would do better,
either.

(One can produce core dumps, BTW, but it's complicated, requires signing
and entitlements, and is not really helpful debugging-wise, and the
dumps are several GB large.)

What I did is enter a loop with sleep(3) instead of asserting, so that
one can attach to the process. That works, but with optimizations I
can't see much.
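
A minimal sketch of such a wait loop, for illustration only. The flag debugger_attached and the function wait_for_debugger are hypothetical names, not code from the branch, and the polling interval is a parameter because the thread does not say which one was used.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical flag: set it to 1 from the debugger to leave the loop,
   for example with "expr debugger_attached = 1" in LLDB.  */
static volatile int debugger_attached = 0;

/* Hypothetical replacement for aborting on a failed check: park the
   process so that a debugger can be attached to it.  Returns the
   number of polling intervals that elapsed before the flag was set.  */
static unsigned
wait_for_debugger (unsigned seconds_per_poll)
{
  unsigned polls = 0;
  assert (seconds_per_poll > 0);
  while (!debugger_attached)
    {
      sleep (seconds_per_poll);
      ++polls;
    }
  return polls;
}
```

The flag is volatile so that an optimizing compiler re-reads it on every iteration, which matters here because the problem only shows up in optimized builds.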

>> The assertion means that we likely have a reference somewhere that
>> isn't traced. Because it isn't traced, the reference isn't changed to
>> point to the new location when the object being referenced is copied
>> to a new address in memory. Instead, it points to kind of a tombstone
>> that is left behind when the object is moved.
>
> My approach would be to try to capture it in a debugger, then follow
> the forwarding pointer and find out what kind of object the pointer
> should be referring to.

Exactly. If it's reproducible, one can also remember the hash, and stop
when objects with that hash are allocated, to see where they are stored.
That worked pretty well in the past, but in this case led to nothing.

> I'm thinking, though, about how to increase pressure to flush out such
> bugs. Here are some ideas:
>
> 1. scan xmalloc'd memory for pointers that refer to MPS-managed objects.
> 2. reduce generation sizes and increase the number of generations, making it more likely objects will be copied.
> 3. trigger GC regularly while allocating objects
> 4. Hack MPS to do something.
> 5. Keep a log of forwarded objects and their old/new pointers
>
(2) and (3) I've tried but no new findings, with (3) as far as it went
without rendering Emacs unbearable interactively.

Also tried making specpdl and byte stack ambiguous roots, in their
entirety. Slowness but still asserted once, so I guess I can exclude
these two, with some probability.

> (1) seems the most complete approach but relies on unusual pointer
> representations to reduce the number of false positives (and even
> then, they might be pinned objects and then the pointer is okay...).
>
> (2) is easy to do, but impacts performance. (3) is easy to do, but
> impacts performance a lot. I'd prefer avoiding (4), and while (5) is
> doable it's probably unnecessary: you can just read the forwarding
> pointer if it's still there.
>
>> Alas, I haven't been able to debug this. One problem is that I
>> can't reproduce it easily, the other is that it is either not
>> happening or happening much less often when building with -O0, and
>> without -O0 I can't see much here.
>
> I assume it's being accessed from Lisp or in an exception handler?
> Which optimization options are you using?

Just -O.

igc_check_fwd runs in places like XCONS and XSYMBOL etc. so it detects
the problem as soon as possible, on the Lisp side. I'd say most of the
time it's a cons that's the problem, and sometimes a symbol. Maybe it's
a cons containing a symbol, or something like that, where sometimes the
cons isn't copied yet, and the symbol is, or something. Hard to tell.

>> This is just to let people know of the problem. If you find a recipe
>> how to reproduce this, please let me know. Or better yet, debug it :-).
>
> Well, it's possible, but quite unlikely, that it is the
> handlerlist_sentinel thing or the Lisp_Mutex->name thing (both fixed).
> Much more likely it's another issue.

Let's see, I've just transferred your latest commits. Like I mentioned,
it can take days for the thing to surface.




* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-14  7:00   ` Gerd Möllmann
@ 2024-07-14  7:08     ` Gerd Möllmann
  2024-07-16 13:02     ` Gerd Möllmann
  1 sibling, 0 replies; 27+ messages in thread
From: Gerd Möllmann @ 2024-07-14  7:08 UTC (permalink / raw)
  To: Pip Cet; +Cc: Emacs Devel, Helmut Eller

[-- Attachment #1: Type: text/plain, Size: 356 bytes --]

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Yes, no GDB for macOS/arm64. I mean LLDB is okay for the usual
> debugging, but in this case hm. But I don't know if GDB would do better,
> either.

BTW, I'm using the attached hack for LLDB debugging. It makes source
buffers read-only so that one can use single-key commands and such.
FWIW.


[-- Attachment #2: lldbx.el --]
[-- Type: application/emacs-lisp, Size: 7165 bytes --]


* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-14  7:00   ` Gerd Möllmann
  2024-07-14  7:08     ` Gerd Möllmann
@ 2024-07-16 13:02     ` Gerd Möllmann
  2024-07-16 13:38       ` Eli Zaretskii
  1 sibling, 1 reply; 27+ messages in thread
From: Gerd Möllmann @ 2024-07-16 13:02 UTC (permalink / raw)
  To: Pip Cet; +Cc: Emacs Devel, Helmut Eller

[-- Attachment #1: Type: text/plain, Size: 746 bytes --]

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

>> Well, it's possible, but quite unlikely, that it is the
>> handlerlist_sentinel thing or the Lisp_Mutex->name thing (both fixed).
>> Much more likely it's another issue.
>
> Let's see, I've just transferred your latest commits. Like I mentioned,
> it can take days for the thing to surface.

The handlerlist_sentinel didn't help, BTW, but I had another idea today.
The function scan_ambig assumes that references are aligned on word
boundaries (8 bytes here). I haven't checked (and I'm too lazy to check
:-)), but that assumption doesn't have to be true. Or, to say the least,
I didn't make sure the assumption holds.

I'm running with this in my branch only, for now.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: scan_ambig --]
[-- Type: text/x-patch, Size: 2658 bytes --]

From b912464c360d0f66ab472f96521dfb4f48d904f5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Gerd=20M=C3=B6llmann?= <gerd@gnu.org>
Date: Tue, 16 Jul 2024 14:38:26 +0200
Subject: [PATCH] Possibly wrong alignment assumption of 8 in scan_ambig

---
 src/igc.c | 69 ++++++++++++++++++++++++++++---------------------------
 1 file changed, 35 insertions(+), 34 deletions(-)

diff --git a/src/igc.c b/src/igc.c
index 4d20529d7a8..725799bbce4 100644
--- a/src/igc.c
+++ b/src/igc.c
@@ -1311,46 +1311,47 @@ scan_ambig (mps_ss_t ss, void *start, void *end, void *closure)
   MPS_SCAN_BEGIN (ss)
   {
     for (mps_word_t *p = start; p < (mps_word_t *) end; ++p)
-      {
-	mps_word_t word = *p;
-	mps_word_t tag = word & IGC_TAG_MASK;
-
-	/* If the references in the object being scanned are
-	   ambiguous then MPS_FIX2() does not update the
-	   reference (because it can't know if it's a
-	   genuine reference). The MPS handles an ambiguous
-	   reference by pinning the block pointed to so that
-	   it cannot move. */
-	mps_addr_t ref = (mps_addr_t) word;
-	mps_res_t res = MPS_FIX12 (ss, &ref);
-	if (res != MPS_RES_OK)
-	  return res;
-
-	switch (tag)
-	  {
-	  case Lisp_Int0:
-	  case Lisp_Int1:
-	  case Lisp_Type_Unused0:
-	    break;
+      for (size_t off = 0; off <= 4; off += 4)
+	{
+	  mps_word_t word = *(mps_word_t *) ((char *)p + off);
+	  mps_word_t tag = word & IGC_TAG_MASK;
+
+	  /* If the references in the object being scanned are
+	     ambiguous then MPS_FIX2() does not update the
+	     reference (because it can't know if it's a
+	     genuine reference). The MPS handles an ambiguous
+	     reference by pinning the block pointed to so that
+	     it cannot move. */
+	  mps_addr_t ref = (mps_addr_t) word;
+	  mps_res_t res = MPS_FIX12 (ss, &ref);
+	  if (res != MPS_RES_OK)
+	    return res;
 
-	  case Lisp_Symbol:
+	  switch (tag)
 	    {
-	      ptrdiff_t off = word ^ tag;
-	      ref = (mps_addr_t) ((char *) lispsym + off);
+	    case Lisp_Int0:
+	    case Lisp_Int1:
+	    case Lisp_Type_Unused0:
+	      break;
+
+	    case Lisp_Symbol:
+	      {
+		ptrdiff_t off = word ^ tag;
+		ref = (mps_addr_t) ((char *) lispsym + off);
+		res = MPS_FIX12 (ss, &ref);
+		if (res != MPS_RES_OK)
+		  return res;
+	      }
+	      break;
+
+	    default:
+	      ref = (mps_addr_t) (word ^ tag);
 	      res = MPS_FIX12 (ss, &ref);
 	      if (res != MPS_RES_OK)
 		return res;
+	      break;
 	    }
-	    break;
-
-	  default:
-	    ref = (mps_addr_t) (word ^ tag);
-	    res = MPS_FIX12 (ss, &ref);
-	    if (res != MPS_RES_OK)
-	      return res;
-	    break;
-	  }
-      }
+	}
   }
   MPS_SCAN_END (ss);
   return MPS_RES_OK;
-- 
2.45.2



* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-16 13:02     ` Gerd Möllmann
@ 2024-07-16 13:38       ` Eli Zaretskii
  2024-07-16 13:47         ` Gerd Möllmann
  2024-07-16 15:49         ` Paul Eggert
  0 siblings, 2 replies; 27+ messages in thread
From: Eli Zaretskii @ 2024-07-16 13:38 UTC (permalink / raw)
  To: Gerd Möllmann, Paul Eggert; +Cc: pipcet, emacs-devel, eller.helmut

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Emacs Devel <emacs-devel@gnu.org>,  Helmut Eller <eller.helmut@gmail.com>
> Date: Tue, 16 Jul 2024 15:02:24 +0200
> 
> The handlerlist_sentinel didn't help, BTW, but I had another idea today.
> The function scan_ambig assumes that references are aligned on word
> boundaries (8 bytes here). I haven't checked (and I'm too lazy to check
> :-)), but that assumption doesn't have to be true.

I think it _is_ true.  At least the original allocation code in
alloc.c made sure it was true.  Paul, am I right?




* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-16 13:38       ` Eli Zaretskii
@ 2024-07-16 13:47         ` Gerd Möllmann
  2024-07-16 14:11           ` Eli Zaretskii
  2024-07-16 14:19           ` Helmut Eller
  2024-07-16 15:49         ` Paul Eggert
  1 sibling, 2 replies; 27+ messages in thread
From: Gerd Möllmann @ 2024-07-16 13:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Paul Eggert, pipcet, emacs-devel, eller.helmut

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: Emacs Devel <emacs-devel@gnu.org>,  Helmut Eller <eller.helmut@gmail.com>
>> Date: Tue, 16 Jul 2024 15:02:24 +0200
>> 
>> The handlerlist_sentinel didn't help, BTW, but I had another idea today.
>> The function scan_ambig assumes that references are aligned on word
>> boundaries (8 bytes here). I haven't checked (and I'm too lazy to check
>> :-)), but that assumption doesn't have to be true.
>
> I think it _is_ true.  At least the original allocation code in
> alloc.c made sure it was true.  Paul, am I right?

That's probably a misunderstanding. I'm thinking about a block of memory
containing references, and the alignment of these references, not the
alignment of the block.

Example with sizeof(int) = 4, and sizeof(void *) = 8

  struct x
  {
    int x;
    struct Lisp_Symbol *s;
  };

What about the offset of x::s?




* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-16 13:47         ` Gerd Möllmann
@ 2024-07-16 14:11           ` Eli Zaretskii
  2024-07-16 14:39             ` Gerd Möllmann
  2024-07-16 14:19           ` Helmut Eller
  1 sibling, 1 reply; 27+ messages in thread
From: Eli Zaretskii @ 2024-07-16 14:11 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: eggert, pipcet, emacs-devel, eller.helmut

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Paul Eggert <eggert@cs.ucla.edu>,  pipcet@protonmail.com,
>   emacs-devel@gnu.org,  eller.helmut@gmail.com
> Date: Tue, 16 Jul 2024 15:47:26 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> >> Cc: Emacs Devel <emacs-devel@gnu.org>,  Helmut Eller <eller.helmut@gmail.com>
> >> Date: Tue, 16 Jul 2024 15:02:24 +0200
> >> 
> >> The handlerlist_sentinel didn't help, BTW, but I had another idea today.
> >> The function scan_ambig assumes that references are aligned on word
> >> boundaries (8 bytes here). I haven't checked (and I'm too lazy to check
> >> :-)), but that assumption doesn't have to be true.
> >
> > I think it _is_ true.  At least the original allocation code in
> > alloc.c made sure it was true.  Paul, am I right?
> 
> That's probably a misunderstanding. I'm thinking about a block of memory
> containing references, and the alignment of these references, not the
> alignment of the block.
> 
> Example with sizeof(int) = 4, and sizeof(void *) = 8
> 
>   struct x
>   {
>     int x;
>     struct Lisp_Symbol *s;
>   };
> 
> What about the offset of x::s?

The pointer or the Lisp_Symbol struct it points to?




* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-16 13:47         ` Gerd Möllmann
  2024-07-16 14:11           ` Eli Zaretskii
@ 2024-07-16 14:19           ` Helmut Eller
  2024-07-16 14:48             ` Gerd Möllmann
  1 sibling, 1 reply; 27+ messages in thread
From: Helmut Eller @ 2024-07-16 14:19 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, Paul Eggert, pipcet, emacs-devel

On Tue, Jul 16 2024, Gerd Möllmann wrote:

> That's probably a misunderstanding. I'm thinking about a block of memory
> containing references, and the alignment of these references, not the
> alignment of the block.
>
> Example with sizeof(int) = 4, and sizeof(void *) = 8
>
>   struct x
>   {
>     int x;
>     struct Lisp_Symbol *s;
>   };
>
> What about the offset of x::s?

That's not a problem because offsetof(struct x, s) = 8. There are 4
bytes padding after x.  A problem could be if an unaligned void* is cast
to a struct Lisp_Symbol*; let's hope that nobody does that.
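
The padding argument can be checked at compile time. The following is a sketch, not Emacs code: struct x is the example from the thread, struct Lisp_Symbol is left opaque, and the helper functions offset_of_s and padding_after_x are hypothetical names added for illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

struct Lisp_Symbol;		/* opaque here; only the pointer type matters */

struct x
{
  int x;
  struct Lisp_Symbol *s;
};

/* In standard C (without packing extensions) a member is placed at an
   offset that is a multiple of its own alignment requirement.  */
static_assert (offsetof (struct x, s) % alignof (struct Lisp_Symbol *) == 0,
	       "member s is naturally aligned");

/* Byte offset of member s within struct x.  */
static size_t
offset_of_s (void)
{
  return offsetof (struct x, s);
}

/* Number of padding bytes the compiler inserted between x and s.  */
static size_t
padding_after_x (void)
{
  return offsetof (struct x, s) - sizeof (int);
}
```

On an ABI with 4-byte int and 8-byte pointers these helpers would be expected to return 8 and 4 respectively; the static assertion itself does not depend on those sizes.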




* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-16 14:11           ` Eli Zaretskii
@ 2024-07-16 14:39             ` Gerd Möllmann
  2024-07-16 15:21               ` Eli Zaretskii
  0 siblings, 1 reply; 27+ messages in thread
From: Gerd Möllmann @ 2024-07-16 14:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, pipcet, emacs-devel, eller.helmut

Eli Zaretskii <eliz@gnu.org> writes:

>> Example with sizeof(int) = 4, and sizeof(void *) = 8
>> 
>>   struct x
>>   {
>>     int x;
>>     struct Lisp_Symbol *s;
>>   };
>> 
>> What about the offset of x::s?
>
> The pointer or the Lisp_Symbol struct it points to?

What is offsetof (struct x, s)? Is it guaranteed to be 8 in this case?

But it's not limited to struct members. The question is similar for
control stacks, and anything allocated via malloc. Given an address
range [start, end) to scan with scan_ambig, where can references be
found?
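
To make the question concrete, here is a toy scanner, written as a sketch under stated assumptions. count_candidates and scan_with_reference_at are hypothetical helpers, not the real scan_ambig, and the "reference" is just a distinctive integer value. It compares how many candidate words a scan sees when it steps through a range with a given stride.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count the positions in [start, end), stepping by STRIDE bytes, at
   which a full word equal to TARGET can be read.  memcpy is used so
   that reads at unaligned positions are well defined.  */
static size_t
count_candidates (const void *start, const void *end, size_t stride,
		  uintptr_t target)
{
  const unsigned char *s = start;
  const unsigned char *e = end;
  size_t len = (size_t) (e - s);
  size_t hits = 0;
  assert (stride > 0);
  for (size_t off = 0; off + sizeof target <= len; off += stride)
    {
      uintptr_t word;
      memcpy (&word, s + off, sizeof word);
      if (word == target)
	++hits;
    }
  return hits;
}

/* Store TARGET at byte offset OFFSET of a zeroed 32-byte buffer, then
   scan the buffer with the given STRIDE.  */
static size_t
scan_with_reference_at (size_t offset, size_t stride, uintptr_t target)
{
  _Alignas (uintptr_t) unsigned char buf[32] = { 0 };
  assert (offset + sizeof target <= sizeof buf);
  memcpy (buf + offset, &target, sizeof target);
  return count_candidates (buf, buf + sizeof buf, stride, target);
}
```

With a stride of 8, a value stored at offset 8 is seen but one stored at offset 4 is not; a stride of 4 sees both. Whether any references in Emacs actually sit at such offsets is exactly the open question here, and this sketch does not answer it.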




* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-16 14:19           ` Helmut Eller
@ 2024-07-16 14:48             ` Gerd Möllmann
  2024-07-16 15:22               ` Eli Zaretskii
                                 ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Gerd Möllmann @ 2024-07-16 14:48 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, Paul Eggert, pipcet, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Tue, Jul 16 2024, Gerd Möllmann wrote:
>
>> That's probably a misunderstanding. I'm thinking about a block of memory
>> containing references, and the alignment of these references, not the
>> alignment of the block.
>>
>> Example with sizeof(int) = 4, and sizeof(void *) = 8
>>
>>   struct x
>>   {
>>     int x;
>>     struct Lisp_Symbol *s;
>>   };
>>
>> What about the offset of x::s?
>
> That's not a problem because offsetof(struct x, s) = 8. There are 4
> bytes padding after x.  A problem could be if an unaligned void* is cast
> to a struct Lisp_Symbol*; let's hope that nobody does that.

If you're right, and the same holds for the control stack and anything
else a malloc'd block can contain, then we're safe.

Since I don't know that for a fact, it's an invalid assumption for me, ATM.

Let's see if it continues crashing.





* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-16 14:39             ` Gerd Möllmann
@ 2024-07-16 15:21               ` Eli Zaretskii
  2024-07-16 16:54                 ` Gerd Möllmann
  0 siblings, 1 reply; 27+ messages in thread
From: Eli Zaretskii @ 2024-07-16 15:21 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: eggert, pipcet, emacs-devel, eller.helmut

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: eggert@cs.ucla.edu,  pipcet@protonmail.com,  emacs-devel@gnu.org,
>   eller.helmut@gmail.com
> Date: Tue, 16 Jul 2024 16:39:46 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> Example with sizeof(int) = 4, and sizeof(void *) = 8
> >> 
> >>   struct x
> >>   {
> >>     int x;
> >>     struct Lisp_Symbol *s;
> >>   };
> >> 
> >> What about the offset of x::s?
> >
> > The pointer or the Lisp_Symbol struct it points to?
> 
> What is offsetof (struct x, s)? Is it guaranteed to be 8 in this case?

Yes, AFAIK.

> But it's not limited to struct members. The question is similar for
> control stacks, and anything allocated via malloc. Given an address
> range [start, end) to scan with scan_ambig, where can references be
> found?

AFAIU, each memory block allocated by malloc is guaranteed to have the
alignment of the largest fundamental data type on the platform.  So
yes, 8-byte alignment is guaranteed in this case.  (See also the
max_align_t data type introduced by C11.)
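
A small sketch of how the malloc guarantee can be observed, using only standard C11 facilities. is_aligned and malloc_is_max_aligned are hypothetical helper names. Note that this checks the start address of the block only, not the placement of references stored inside it.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Return 1 if P is aligned to ALIGNMENT bytes (a power of two).  */
static int
is_aligned (const void *p, size_t alignment)
{
  assert (alignment != 0 && (alignment & (alignment - 1)) == 0);
  return ((uintptr_t) p & (alignment - 1)) == 0;
}

/* Allocate a block large enough to hold a max_align_t and report
   whether malloc returned an address suitably aligned for it.
   Returns -1 if the allocation itself fails.  */
static int
malloc_is_max_aligned (void)
{
  void *block = malloc (sizeof (max_align_t));
  if (!block)
    return -1;
  int ok = is_aligned (block, alignof (max_align_t));
  free (block);
  return ok;
}
```

The block size is chosen to be at least sizeof (max_align_t) so that the check does not depend on how an implementation treats very small allocations.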




* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-16 14:48             ` Gerd Möllmann
@ 2024-07-16 15:22               ` Eli Zaretskii
  2024-07-16 16:13               ` Pip Cet
  2024-07-17 19:47               ` Gerd Möllmann
  2 siblings, 0 replies; 27+ messages in thread
From: Eli Zaretskii @ 2024-07-16 15:22 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: eller.helmut, eggert, pipcet, emacs-devel

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>,  Paul Eggert <eggert@cs.ucla.edu>,
>   pipcet@protonmail.com,  emacs-devel@gnu.org
> Date: Tue, 16 Jul 2024 16:48:38 +0200
> 
> Helmut Eller <eller.helmut@gmail.com> writes:
> 
> > On Tue, Jul 16 2024, Gerd Möllmann wrote:
> >
> >> That's probably a misunderstanding. I'm thinking about a block of memory
> >> containing references, and the alignment of these references, not the
> >> alignment of the block.
> >>
> >> Example with sizeof(int) = 4, and sizeof(void *) = 8
> >>
> >>   struct x
> >>   {
> >>     int x;
> >>     struct Lisp_Symbol *s;
> >>   };
> >>
> >> What about the offset of x::s?
> >
> > That's not a problem because offsetof(struct x, s) = 8. There are 4
> > bytes padding after x.  A problem could be if an unaligned void* is cast
> > to a struct Lisp_Symbol*; let's hope that nobody does that.
> 
> If you're right, and the same holds for the control stack and anything
> else a malloc'd block can contain, then we're safe.
> 
> Since I don't know that for a fact, it's an invalid assumption for me, ATM.

I think it's a fact.




* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-16 13:38       ` Eli Zaretskii
  2024-07-16 13:47         ` Gerd Möllmann
@ 2024-07-16 15:49         ` Paul Eggert
  1 sibling, 0 replies; 27+ messages in thread
From: Paul Eggert @ 2024-07-16 15:49 UTC (permalink / raw)
  To: Eli Zaretskii, Gerd Möllmann; +Cc: pipcet, emacs-devel, eller.helmut

On 2024-07-16 06:38, Eli Zaretskii wrote:
>> From: Gerd Möllmann<gerd.moellmann@gmail.com>
>> Cc: Emacs Devel<emacs-devel@gnu.org>,  Helmut Eller<eller.helmut@gmail.com>
>> The function scan_ambig assumes that references are aligned on word
>> boundaries (8 bytes here). I haven't checked (and I'm too lazy to check
>> :-)), but that assumption doesn't have to be true.
> I think it _is_ true.  At least the original allocation code in
> alloc.c made sure it was true.  Paul, am I right?

You're right. In lisp.h, struct Lisp_Symbol is declared with a 
GCALIGNED_UNION_MEMBER, which means that struct Lisp_Symbol must be on 
an address that is a multiple of 8. This is verified statically by 
lisp.h's line "verify (GCALIGNED (struct Lisp_Symbol));". Although there 
could be a bug in Emacs that would use pointer arithmetic to misalign a 
struct Lisp_Symbol, that would be the same class of bug that would 
misalign any other type.

GCALIGNED_UNION_MEMBER is implemented via GNU C and/or C11 primitives if 
available. Although this could be ineffective on older compilers, the 
"verify (GCALIGNED (struct Lisp_Symbol));" would catch any older 
compiler that happened to not align struct Lisp_Symbol properly. In 
practice we've never run into any such compiler and are never likely to 
in the future, so long as we insist only on an alignment of 8.
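
The technique can be sketched with plain C11 primitives. This is an illustration only: TOY_GCALIGNMENT, struct toy_symbol and toy_symbol_alignment are hypothetical stand-ins, and the real macros in lisp.h may be defined differently.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define TOY_GCALIGNMENT 8	/* hypothetical stand-in for the alignment of 8 */

/* Hypothetical struct: a union member carrying an explicit alignment
   forces the alignment of the whole struct to at least
   TOY_GCALIGNMENT.  */
struct toy_symbol
{
  union
  {
    struct
    {
      void *name;
      int flags;
    } s;
    alignas (TOY_GCALIGNMENT) char gcaligned;
  } u;
};

/* Compile-time check in the spirit of "verify (GCALIGNED (...))":
   the build fails if the compiler did not honor the alignment.  */
static_assert (alignof (struct toy_symbol) % TOY_GCALIGNMENT == 0,
	       "toy_symbol must be aligned to a multiple of 8");

/* Alignment the compiler actually chose for struct toy_symbol.  */
static size_t
toy_symbol_alignment (void)
{
  return alignof (struct toy_symbol);
}
```

Because the check is static, a compiler that ignored the alignment request would be caught at build time rather than at run time.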




* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-16 14:48             ` Gerd Möllmann
  2024-07-16 15:22               ` Eli Zaretskii
@ 2024-07-16 16:13               ` Pip Cet
  2024-07-16 16:47                 ` Gerd Möllmann
  2024-07-17  7:51                 ` Andrea Corallo
  2024-07-17 19:47               ` Gerd Möllmann
  2 siblings, 2 replies; 27+ messages in thread
From: Pip Cet @ 2024-07-16 16:13 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Helmut Eller, Eli Zaretskii, Paul Eggert, emacs-devel

On Tuesday, July 16th, 2024 at 14:48, Gerd Möllmann <gerd.moellmann@gmail.com> wrote:
> Helmut Eller <eller.helmut@gmail.com> writes:
> 
> > On Tue, Jul 16 2024, Gerd Möllmann wrote:
> > 
> > > That's probably a misunderstanding. I'm thinking about a block of memory
> > > containing references, and the alignment of these references, not the
> > > alignment of the block.
> > > 
> > > Example with sizeof(int) = 4, and sizeof(void *) = 8
> > > 
> > > >   struct x
> > > >   {
> > > >     int x;
> > > >     struct Lisp_Symbol *s;
> > > >   };
> > > 
> > > What about the offset of x::s?
> > 
> > That's not a problem because offsetof(struct x, s) = 8. There are 4
> > bytes padding after x. A problem could be if an unaligned void* is cast
> > to a struct Lisp_Symbol*; let's hope that nobody does that.
> 
> 
> If you're right, and the same holds for the control stack and anything
> else a malloc'd block can contain, then we're safe.

I'm pretty sure we are, though interior pointers might cause a problem.

Any leads on where the crash happens yet? I've found a breakpoint on wrong_type_argument helpful, since we usually hit that when memory moves and the old pointer doesn't point to an object of the right type.

By the way, I'm done with the code for making base == client pointers and giving (almost) every object a header. Since it's a major change and can't really be split that well, I'm not sure yet how to install it, though it needs to be cleaned up still in any case... But when I do install it, it will require rebuilding of all .eln files, or there will be weird segfaults. (I guess we could bump the ABI constant in the nativecomp code to avoid that).

Pip




* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-16 16:13               ` Pip Cet
@ 2024-07-16 16:47                 ` Gerd Möllmann
  2024-07-17  7:51                 ` Andrea Corallo
  1 sibling, 0 replies; 27+ messages in thread
From: Gerd Möllmann @ 2024-07-16 16:47 UTC (permalink / raw)
  To: Pip Cet; +Cc: Helmut Eller, Eli Zaretskii, Paul Eggert, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> Any leads on where the crash happens yet? I've found a breakpoint on
> wrong_type_argument helpful, since we usually hit that when memory
> moves and the old pointer doesn't point to an object of the right
> type.

Nothing giving the slightest clue. And it's all over the place when it
happens. No pattern I can recognize.

> By the way, I'm done with the code for making base == client pointers
> and giving (almost) every object a header. Since it's a major change
> and can't really be split that well, I'm not sure yet how to install
> it, though it needs to be cleaned up still in any case... But when I
> do install it, it will require rebuilding of all .eln files, or there
> will be weird segfaults. (I guess we could bump the ABI constant in
> the nativecomp code to avoid that).

I guess that's meanwhile something for Eli. I'm pretty ruthless in these
regards :-). Move fast, and so on.




* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-16 15:21               ` Eli Zaretskii
@ 2024-07-16 16:54                 ` Gerd Möllmann
  0 siblings, 0 replies; 27+ messages in thread
From: Gerd Möllmann @ 2024-07-16 16:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, pipcet, emacs-devel, eller.helmut

Eli Zaretskii <eliz@gnu.org> writes:

>> But it's not limited to struct members. The question is similar for
>> control stacks, and anything allocated via malloc. Given an address
>> range [start, end) to scan with scan_ambig, where can references be
>> found?
>
> AFAIU, each memory block allocated by malloc is guaranteed to have the
> alignment of the largest fundamental data type on the platform.  So
> yes, 8-byte alignment is guaranteed in this case.  (See also the
> max_align_t data type introduced by C11.)

Please note that I'm not talking about the alignment of allocated
memory, but of the references contained in that memory. Anyway, I'm
following my gut right now. Let's see.





* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-16 16:13               ` Pip Cet
  2024-07-16 16:47                 ` Gerd Möllmann
@ 2024-07-17  7:51                 ` Andrea Corallo
  1 sibling, 0 replies; 27+ messages in thread
From: Andrea Corallo @ 2024-07-17  7:51 UTC (permalink / raw)
  To: Pip Cet
  Cc: Gerd Möllmann, Helmut Eller, Eli Zaretskii, Paul Eggert,
	emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> On Tuesday, July 16th, 2024 at 14:48, Gerd Möllmann <gerd.moellmann@gmail.com> wrote:
>> Helmut Eller <eller.helmut@gmail.com> writes:
>> 
>> > On Tue, Jul 16 2024, Gerd Möllmann wrote:
>> > 
>> > > That's probably a misunderstanding. I'm thinking about a block of memory
>> > > containing references, and the alignment of these references, not the
>> > > alignment of the block.
>> > > 
>> > > Example with sizeof(int) = 4, and sizeof(void *) = 8
>> > > 
>> > >   struct x
>> > >   {
>> > >     int x;
>> > >     struct Lisp_Symbol *s;
>> > >   };
>> > > 
>> > > What about the offset of x::s?
>> > 
>> > That's not a problem because offsetof(struct x, s) = 8. There are 4
>> > bytes padding after x. A problem could be if an unaligned void* is cast
>> > to a struct Lisp_Symbol*; let's hope that nobody does that.
>> 
>> 
>> If you're right, and the same holds for the control stack and anything
>> else a malloc'd block can contain, then we're safe.
>
> I'm pretty sure we are, though interior pointers might cause a problem.
>
> Any leads on where the crash happens yet? I've found a breakpoint on wrong_type_argument helpful, since we usually hit that when memory moves and the old pointer doesn't point to an object of the right type.
>
> By the way, I'm done with the code for making base == client pointers
> and giving (almost) every object a header. Since it's a major change
> and can't really be split that well, I'm not sure yet how to install
> it, though it needs to be cleaned up still in any case... But when I
> do install it, it will require rebuilding of all .eln files, or there
> will be weird segfaults. (I guess we could bump the ABI constant in
> the nativecomp code to avoid that).

Yep, I confirm updating ABI_VERSION is the right thing to do in order to
be sure we don't use stale elns when an incompatible change is made.

  Andrea




* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-16 14:48             ` Gerd Möllmann
  2024-07-16 15:22               ` Eli Zaretskii
  2024-07-16 16:13               ` Pip Cet
@ 2024-07-17 19:47               ` Gerd Möllmann
  2024-07-18 15:08                 ` Gerd Möllmann
  2 siblings, 1 reply; 27+ messages in thread
From: Gerd Möllmann @ 2024-07-17 19:47 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, Paul Eggert, pipcet, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Let's see if it continues crashing.

It does :-(.




* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-17 19:47               ` Gerd Möllmann
@ 2024-07-18 15:08                 ` Gerd Möllmann
  2024-07-18 16:05                   ` Pip Cet
  2024-07-18 19:06                   ` Andrea Corallo
  0 siblings, 2 replies; 27+ messages in thread
From: Gerd Möllmann @ 2024-07-18 15:08 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, Paul Eggert, pipcet, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
>> Let's see if it continues crashing.
>
> It does :-(.

Next idea.

I wonder if __builtin_unwind_init changes things. The different
behaviour when compiled with -O0 and -O, the fact that problems happen
in "random" places, which makes me suspect a basic problem, and so on -
that would all fit if MPS doesn't see something in a register. Maybe.

I'm now running with a patch that adds that to MPS' stack scanning code.

@@ -39,6 +39,8 @@ void StackHot(void **stackOut);
 /* STACK_CONTEXT_BEGIN -- save context */
 
 #define STACK_CONTEXT_BEGIN(arena) \
+  BEGIN \
+  __builtin_unwind_init(); \
   BEGIN \
     StackContextStruct _sc; \
     STACK_CONTEXT_SAVE(&_sc); \
@@ -51,6 +53,7 @@ void StackHot(void **stackOut);
 /* STACK_CONTEXT_END -- clear context */
 
 #define STACK_CONTEXT_END(arena) \
+    END; \
     END; \
     AVER(arena->stackWarm != NULL); \
     arena->stackWarm = NULL; \

If someone observes strange phenomena similar to mine on macOS/arm64,
maybe also give it a try.

Let's see how long igc survives this time :-).




* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-18 15:08                 ` Gerd Möllmann
@ 2024-07-18 16:05                   ` Pip Cet
  2024-07-18 16:33                     ` Gerd Möllmann
  2024-07-18 19:06                   ` Andrea Corallo
  1 sibling, 1 reply; 27+ messages in thread
From: Pip Cet @ 2024-07-18 16:05 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Helmut Eller, Eli Zaretskii, Paul Eggert, emacs-devel

On Thursday, July 18th, 2024 at 15:08, Gerd Möllmann <gerd.moellmann@gmail.com> wrote:
> Gerd Möllmann gerd.moellmann@gmail.com writes:
> 
> > Gerd Möllmann gerd.moellmann@gmail.com writes:
> > 
> > > Let's see if it continues crashing.
> > 
> > It does :-(.
> 
> Next idea.
> 
> I wonder if __builtin_unwind_init changes things. The different
> behaviour when compiled with -O0 and -O, the fact that problems happen
> in "random" places, which makes me suspect a basic problem, and so on -
> that would all fit if MPS doesn't see something in a register. Maybe.

I believe MPS relies (or used to rely) on setjmp() (or _setjmp()) to save registers, and there was a comment in the source code about how that mechanism might be unreliable with implementations which attempt to encrypt the jump buffer to avoid some forms of data injection attacks. I don't know whether macOS does that.

https://opensource.apple.com/source/libplatform/libplatform-254.80.2/src/setjmp/arm64/ appears to suggest some kind of "munging" happens, but it doesn't appear to affect x19-x28...

Also, LLVM apparently implements __builtin_setjmp(), and I'm not sure whether we might end up using that one. Can you run the compiler with -E to see what the preprocessor produces?
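
The general technique of using setjmp() to get register contents into
scannable memory can be sketched as follows. This is an illustration
under stated assumptions, not MPS's actual code: the names
visit_saved_registers, word_visitor and count_word are hypothetical. The
sketch also shows why mangling matters: the scanner only sees whatever
bytes the C library chose to store in the jmp_buf, so any entry stored
in mangled form no longer looks like the original pointer.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <string.h>

/* Callback invoked for every word-sized slot of the saved context.  */
typedef void (*word_visitor) (void *word, void *arg);

/* Hypothetical helper: save the calling context with setjmp, then hand
   each word of the jmp_buf to VISIT as a potential (ambiguous)
   reference.  Returns the number of words visited.  */
static size_t
visit_saved_registers (word_visitor visit, void *arg)
{
  jmp_buf regs;
  size_t n = sizeof regs / sizeof (void *);
  size_t i;

  assert (visit != NULL);
  /* Zero the buffer first so slots setjmp does not write are defined.  */
  memset (regs, 0, sizeof regs);
  (void) setjmp (regs);

  for (i = 0; i < n; i++)
    {
      void *word;
      memcpy (&word, (const char *) regs + i * sizeof word, sizeof word);
      visit (word, arg);
    }
  return n;
}

/* Example visitor: just count the words seen.  */
static void
count_word (void *word, void *arg)
{
  (void) word;
  ++*(size_t *) arg;
}
```

A real collector would test each word against its managed address
ranges instead of counting. Which registers end up in the jmp_buf, and
in what form, is platform-specific.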

> If someone observes strange phenomena similar to mine on macOS/arm64,
> maybe also give it a try.
> 
> Let's see how long igc survives this time :-).

Fingers crossed!

Pip




* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-18 16:05                   ` Pip Cet
@ 2024-07-18 16:33                     ` Gerd Möllmann
  0 siblings, 0 replies; 27+ messages in thread
From: Gerd Möllmann @ 2024-07-18 16:33 UTC (permalink / raw)
  To: Pip Cet; +Cc: Helmut Eller, Eli Zaretskii, Paul Eggert, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

>> I wonder if __builtin_unwind_init changes things. The different
>> behaviour when compiled with -O0 and -O, the fact that problems happen
>> in "random" places, which makes me suspect a basic problem, and so on -
>> that would all fit if MPS doesn't see something in a register. Maybe.
>
> I believe MPS relies (or used to rely) on setjmp() (or _setjmp()) to
> save registers, and there was a comment in the source code about how
> that mechanism might be unreliable with implementations which attempt
> to encrypt the jump buffer to avoid some forms of data injection
> attacks. I don't know whether macOS does that.

The man page is silent about that. But it could well be. I've also read
hints, without anything concrete, as usual :-/.
>
> https://opensource.apple.com/source/libplatform/libplatform-254.80.2/src/setjmp/arm64/ appears to suggest some kind of "munging" happens, but it doesn't appear to affect x19-x28...
>
> Also, LLVM apparently implements __builtin_setjmp(), and I'm not sure
> whether we might end up using that one. Can you run the compiler with
> -E to see what the preprocessor produces?

I think MPS uses _setjmp on macOS, not the builtin.

>> If someone observes strange phenomena similar to mine on macOS/arm64,
>> maybe also give it a try.
>> 
>> Let's see how long igc survives this time :-).
>
> Fingers crossed!

👍




* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-18 15:08                 ` Gerd Möllmann
  2024-07-18 16:05                   ` Pip Cet
@ 2024-07-18 19:06                   ` Andrea Corallo
  2024-07-18 19:33                     ` Gerd Möllmann
  1 sibling, 1 reply; 27+ messages in thread
From: Andrea Corallo @ 2024-07-18 19:06 UTC (permalink / raw)
  To: Gerd Möllmann
  Cc: Helmut Eller, Eli Zaretskii, Paul Eggert, pipcet, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
>> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>>
>>> Let's see if it continues crashing.
>>
>> It does :-(.
>
> Next idea.
>
> I wonder if __builtin_unwind_init changes things. The different
> behaviour when compiled with -O0 and -O, the fact that problems happen
> in "random" places, which makes me suspect a basic problem, and so on -
> that would all fit if MPS doesn't see something in a register. Maybe.
>
> I'm now running with a patch that adds that to MPS' stack scanning code.
>
> @@ -39,6 +39,8 @@ void StackHot(void **stackOut);
>  /* STACK_CONTEXT_BEGIN -- save context */
>  
>  #define STACK_CONTEXT_BEGIN(arena) \
> +  BEGIN \
> +  __builtin_unwind_init(); \
>    BEGIN \
>      StackContextStruct _sc; \
>      STACK_CONTEXT_SAVE(&_sc); \
> @@ -51,6 +53,7 @@ void StackHot(void **stackOut);
>  /* STACK_CONTEXT_END -- clear context */
>  
>  #define STACK_CONTEXT_END(arena) \
> +    END; \
>      END; \
>      AVER(arena->stackWarm != NULL); \
>      arena->stackWarm = NULL; \
>
> If someone observes strange phenomena similar to mine on macOS/arm64,
> maybe also give it a try.
>
> Let's see how long igc survives this time :-).

Hi Gerd,

if you want to use '__builtin_unwind_init', be aware that this GCC bug
I found some time ago [1] might make the builtin ineffective.  It
might not affect your generated code, but in case you need it, you can
see how we work around it in 'flush_stack_call_func'.

Regards

  Andrea


[1] <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115132>
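
A rough sketch of the kind of workaround referred to above, written
from memory of the general pattern and not copied from Emacs: spill the
callee-saved registers with __builtin_unwind_init, make an ordinary
call, and place an empty asm statement after the call so the compiler
is less likely to turn it into a sibling (tail) call. All names below
(stack_fn, call_with_stack_end, flush_registers_and_call, count_call)
are hypothetical, and the code assumes a GCC- or Clang-compatible
compiler that provides these builtins.

```c
#include <assert.h>
#include <stddef.h>

/* Callback that receives an address near the current end of the stack.  */
typedef void (*stack_fn) (void *stack_end, void *arg);

/* Kept out of line so that it gets its own frame below the caller's.  */
static __attribute__ ((noinline)) void
call_with_stack_end (stack_fn fn, void *arg)
{
  void *stack_end = __builtin_frame_address (0);
  fn (stack_end, arg);
}

/* Spill callee-saved registers to the stack, then call FN.  */
static void
flush_registers_and_call (stack_fn fn, void *arg)
{
  assert (fn != NULL);
  __builtin_unwind_init ();
  call_with_stack_end (fn, arg);
  /* The empty asm after the call is meant to keep the call above from
     being emitted as a sibling call, which is the situation the GCC
     bug report describes.  */
  __asm__ volatile ("");
}

/* Example callback: count how often it was invoked.  */
static void
count_call (void *stack_end, void *arg)
{
  (void) stack_end;
  ++*(int *) arg;
}
```

In a collector, the callback would scan from stack_end up to the
recorded stack base. Whether the spilled registers actually land in
that range should be checked in the generated code for the platform.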




* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-18 19:06                   ` Andrea Corallo
@ 2024-07-18 19:33                     ` Gerd Möllmann
  2024-07-19  4:38                       ` Gerd Möllmann
  0 siblings, 1 reply; 27+ messages in thread
From: Gerd Möllmann @ 2024-07-18 19:33 UTC (permalink / raw)
  To: Andrea Corallo
  Cc: Helmut Eller, Eli Zaretskii, Paul Eggert, pipcet, emacs-devel

Andrea Corallo <acorallo@gnu.org> writes:

>> Let's see how long igc survives this time :-).
>
> Hi Gerd,
>
> if you want to use '__builtin_unwind_init', be aware that this GCC bug
> I found some time ago [1] might make the builtin ineffective.  It
> might not affect your generated code, but in case you need it, you can
> see how we work around it in 'flush_stack_call_func'.
>
> Regards

Thanks, that could become important at some point on platforms other
than macOS. AFAIK, one cannot build Emacs with GCC on newer versions of
macOS because the SDK is incompatible with GCC, so GCC is out of the
picture.

I haven't yet heard of anyone on other platforms having problems of
the sort I have here on macOS. But maybe that comes up later.






* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-18 19:33                     ` Gerd Möllmann
@ 2024-07-19  4:38                       ` Gerd Möllmann
  2024-07-23  0:36                         ` Pip Cet
  0 siblings, 1 reply; 27+ messages in thread
From: Gerd Möllmann @ 2024-07-19  4:38 UTC (permalink / raw)
  To: Andrea Corallo
  Cc: Helmut Eller, Eli Zaretskii, Paul Eggert, pipcet, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Andrea Corallo <acorallo@gnu.org> writes:
>
>>> Let's see how long igc survives this time :-).
>>
>> Hi Gerd,
>>
>> if you want to use '__builtin_unwind_init', be aware that this GCC bug
>> I found some time ago [1] might make the builtin ineffective.  It
>> might not affect your generated code, but in case you need it, you can
>> see how we work around it in 'flush_stack_call_func'.
>>
>> Regards
>
> Thanks, that could become important at some point on platforms other
> than macOS. AFAIK, one cannot build Emacs with GCC on newer versions of
> macOS because the SDK is incompatible with GCC, so GCC is out of the
> picture.
>
> I haven't yet heard of anyone on other platforms having problems of
> the sort I have here on macOS. But maybe that comes up later.

It didn't help, and no more ideas at the moment :-(.




* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-19  4:38                       ` Gerd Möllmann
@ 2024-07-23  0:36                         ` Pip Cet
  2024-07-23  3:31                           ` Gerd Möllmann
  0 siblings, 1 reply; 27+ messages in thread
From: Pip Cet @ 2024-07-23  0:36 UTC (permalink / raw)
  To: Gerd Möllmann
  Cc: Andrea Corallo, Helmut Eller, Eli Zaretskii, Paul Eggert,
	emacs-devel

On Friday, July 19th, 2024 at 04:38, Gerd Möllmann <gerd.moellmann@gmail.com> wrote:
> Gerd Möllmann gerd.moellmann@gmail.com writes:
> 
> > Andrea Corallo acorallo@gnu.org writes:
> > 
> > > > Let's see how long igc survives this time :-).
> > > 
> > > Hi Gerd,
> > > 
> > > if you want to use '__builtin_unwind_init', be aware that this GCC bug
> > > I found some time ago [1] might make the builtin ineffective. It
> > > might not affect your generated code, but in case you need it, you can
> > > see how we work around it in 'flush_stack_call_func'.
> > > 
> > > Regards
> > 
> > Thanks, that could become important at some point on platforms other
> > than macOS. AFAIK, one cannot build Emacs with GCC on newer versions of
> > macOS because the SDK is incompatible with GCC, so GCC is out of the
> > picture.
> > 
> > I haven't yet heard of anyone on other platforms having problems of
> > the sort I have here on macOS. But maybe that comes up later.
> 
> 
> It didn't help, and no more ideas at the moment :-(.

Can you try compiling with -fno-omit-frame-pointer? I just spent entirely too much time tracing down a bug in my build to a missing option (of the same name) to x86_64 gcc. The frame pointer is stored in a mangled format by setjmp() on both darwin and glibc systems, and that caused weird problems (and since gcc generates different code with and without "-g", I had to do that without proper debugger support...)

Anyway, I think both architectures "allow" using the frame pointer register, so we're probably going to have to enforce that option, which will limit us to clang and gcc compilers unless someone figures out the configure magic...

Pip




* Re: MPS: assertion failed: header_type (h) != IGC_OBJ_FWD
  2024-07-23  0:36                         ` Pip Cet
@ 2024-07-23  3:31                           ` Gerd Möllmann
  0 siblings, 0 replies; 27+ messages in thread
From: Gerd Möllmann @ 2024-07-23  3:31 UTC (permalink / raw)
  To: Pip Cet
  Cc: Andrea Corallo, Helmut Eller, Eli Zaretskii, Paul Eggert,
	emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> On Friday, July 19th, 2024 at 04:38, Gerd Möllmann <gerd.moellmann@gmail.com> wrote:
>> Gerd Möllmann gerd.moellmann@gmail.com writes:
>> 
>> > Andrea Corallo acorallo@gnu.org writes:
>> > 
>> > > > Let's see how long igc survives this time :-).
>> > > 
>> > > Hi Gerd,
>> > > 
>> > > if you want to use '__builtin_unwind_init', be aware that this GCC bug
>> > > I found some time ago [1] might make the builtin ineffective. It
>> > > might not affect your generated code, but in case you need it, you can
>> > > see how we work around it in 'flush_stack_call_func'.
>> > > 
>> > > Regards
>> > 
>> > Thanks, that could become important at some point on platforms other
>> > than macOS. AFAIK, one cannot build Emacs with GCC on newer versions of
>> > macOS because the SDK is incompatible with GCC, so GCC is out of the
>> > picture.
>> > 
>> > I haven't yet heard of anyone on other platforms having problems of
>> > the sort I have here on macOS. But maybe that comes up later.
>> 
>> 
>> It didn't help, and no more ideas at the moment :-(.
>
> Can you try compiling with -fno-omit-frame-pointer? I just spent
> entirely too much time tracing down a bug in my build to a missing
> option (of the same name) to x86_64 gcc. The frame pointer is stored
> in a mangled format by setjmp() on both darwin and glibc systems, and
> that caused weird problems (and since gcc generates different code
> with and without "-g", I had to do that without proper debugger
> support...)

Thanks for letting me know! That indeed sounds like a candidate.

I wonder what __builtin_unwind_init does with the frame pointer. Using
it (in MPS itself) definitely has some effect on my system, although it
apparently is not a complete fix. But maybe there's more than one bug.
Hmm.

> Anyway, I think both architectures "allow" using the frame pointer
> register, so we're probably going to have to enforce that option,
> which will limit us to clang and gcc compilers unless someone figures
> out the configure magic...

The macOS arm ABI says

  The ARM standard delegates certain decisions to platform designers.
  Apple platforms adhere to the following choices:

  The platforms reserve register x18. Don’t use this register.

  The frame pointer register (x29) must always address a valid frame
  record. Some functions — such as leaf functions or tail calls — may opt
  not to create an entry in this list. As a result, stack traces are
  always meaningful, even without debug information.

I'll try -fno-omit-frame-pointer anyway. Maybe I'm misinterpreting
that paragraph.





Thread overview: 27+ messages
2024-07-14  4:12 MPS: assertion failed: header_type (h) != IGC_OBJ_FWD Gerd Möllmann
2024-07-14  5:30 ` Pip Cet
2024-07-14  7:00   ` Gerd Möllmann
2024-07-14  7:08     ` Gerd Möllmann
2024-07-16 13:02     ` Gerd Möllmann
2024-07-16 13:38       ` Eli Zaretskii
2024-07-16 13:47         ` Gerd Möllmann
2024-07-16 14:11           ` Eli Zaretskii
2024-07-16 14:39             ` Gerd Möllmann
2024-07-16 15:21               ` Eli Zaretskii
2024-07-16 16:54                 ` Gerd Möllmann
2024-07-16 14:19           ` Helmut Eller
2024-07-16 14:48             ` Gerd Möllmann
2024-07-16 15:22               ` Eli Zaretskii
2024-07-16 16:13               ` Pip Cet
2024-07-16 16:47                 ` Gerd Möllmann
2024-07-17  7:51                 ` Andrea Corallo
2024-07-17 19:47               ` Gerd Möllmann
2024-07-18 15:08                 ` Gerd Möllmann
2024-07-18 16:05                   ` Pip Cet
2024-07-18 16:33                     ` Gerd Möllmann
2024-07-18 19:06                   ` Andrea Corallo
2024-07-18 19:33                     ` Gerd Möllmann
2024-07-19  4:38                       ` Gerd Möllmann
2024-07-23  0:36                         ` Pip Cet
2024-07-23  3:31                           ` Gerd Möllmann
2024-07-16 15:49         ` Paul Eggert
