* MPS: Loaded pdump
@ 2024-05-09 10:52 Gerd Möllmann
  2024-05-09 11:00 ` Eli Zaretskii
                   ` (2 more replies)
  0 siblings, 3 replies; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-09 10:52 UTC (permalink / raw)
  To: Helmut Eller, Eli Zaretskii; +Cc: Emacs Devel

I feel more and more that the handling of the loaded pdump with MPS as
it is now is not sustainable, and would like to ask you for your ideas.

What we do now is make the hot part of the dump an ambig root. I don't
remember the exact numbers, but I think that's about 18 Mb of root. This
has at least these problems, from my POV:

- It is very large, and every time MPS scans roots, and that is all the
  time, the world is stopped until it has finished. That's not good for
  pause times.

- The root is ambiguous, so everything found in it is pinned in memory.
  
- The root is unstructured. We can't scan exactly, and so can't do
  anything special for pointers to non-MPS memory that Lisp objects
  have. This leads to some horrible workarounds.

I knew that early on, but I thought maybe we could get away with it. But
now the latest workaround for compilation units makes me think this
won't fly.

So, what to do now? I think we should consider what Helmut also
mentioned some mails ago: copy the object graph in the dump to MPS
memory. Everything else looks almost not worth it to me.

This has of course also consequences:

- copying 18 Mb of hot objects + 12 Mb or so of leaf objects to MPS
  could be slow. No idea if it is. That could impact startup time (not
  important to me at all, but people have different preferences).

- copying the graph requires that the copying functions know the layout
  of Lisp objects so that the functions can exchange references in the
  old graph to the corresponding ones in the new graph. I'm getting
  exhausted already from thinking of writing such functions, and we
  don't have C++ templates to help.

- AFAIK, but see admin/igc.org, there is no good way of allocating
  objects in an old generation, so they will maybe take some time to
  wander to an older generation.
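
A minimal sketch of the graph-copying idea from the list above, under
loud assumptions: the object type is a toy cons-like cell, not a real
Lisp object, malloc stands in for an MPS allocation, and all names
(toy_obj, toy_copy_graph) are hypothetical. It only illustrates why the
copier has to know where the references are: each one must be replaced
by a reference to the corresponding copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy object with two references (NULL means "no reference").
   Hypothetical; real Lisp objects come in many layouts.  */
struct toy_obj
{
  struct toy_obj *car, *cdr;
  struct toy_obj *forward;   /* Copy in the new graph, or NULL.  */
  int payload;
};

/* Copy the graph reachable from OLD and return the copy of OLD.
   The forwarding pointer makes sure each object is copied once,
   so sharing and cycles are preserved.  */
static struct toy_obj *
toy_copy_graph (struct toy_obj *old)
{
  if (old == NULL)
    return NULL;
  if (old->forward != NULL)
    return old->forward;

  /* malloc is only a stand-in for allocating from MPS.  */
  struct toy_obj *copy = malloc (sizeof *copy);
  assert (copy != NULL);
  copy->payload = old->payload;
  copy->forward = NULL;

  /* Record the copy before following references, so that a cycle
     back to OLD finds it instead of recursing forever.  */
  old->forward = copy;
  copy->car = toy_copy_graph (old->car);
  copy->cdr = toy_copy_graph (old->cdr);
  return copy;
}
```

The recursion keeps the sketch short; a graph as deep as a real dump
would more likely need an explicit work list.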

Enough rambling.

Ideas, opinions, ...?







* Re: MPS: Loaded pdump
  2024-05-09 10:52 MPS: Loaded pdump Gerd Möllmann
@ 2024-05-09 11:00 ` Eli Zaretskii
  2024-05-09 11:20   ` Gerd Möllmann
  2024-05-09 12:28 ` Helmut Eller
  2024-05-09 13:38 ` Helmut Eller
  2 siblings, 1 reply; 62+ messages in thread
From: Eli Zaretskii @ 2024-05-09 11:00 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: eller.helmut, emacs-devel

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Emacs Devel <emacs-devel@gnu.org>
> Date: Thu, 09 May 2024 12:52:03 +0200
> 
> I feel more and more that the handling of the loaded pdump with MPS as
> it is now is not sustainable, and would like to ask you for your ideas.
> 
> What we do now is make the hot part of the dump an ambig root.

Is this specific to native-compilation, or is it not?

If it isn't, why does it come up only now?  We've been running
MPS-built Emacs for a week at least; if that produces problems, can
you tell what problems and how can that be reproduced?

IOW, I'd like to understand better why we need to make the hot part of
the dump an ambiguous root before we consider solutions.

Thanks.




* Re: MPS: Loaded pdump
  2024-05-09 11:00 ` Eli Zaretskii
@ 2024-05-09 11:20   ` Gerd Möllmann
  0 siblings, 0 replies; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-09 11:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eller.helmut, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: Emacs Devel <emacs-devel@gnu.org>
>> Date: Thu, 09 May 2024 12:52:03 +0200
>> 
>> I feel more and more that the handling of the loaded pdump with MPS as
>> it is now is not sustainable, and would like to ask you for your ideas.
>> 
>> What we do now is make the hot part of the dump an ambig root.
>
> Is this specific to native-compilation, or is it not?

It's not specific to native comp.

> If it isn't, why does it come up only now?  We've been running
> MPS-built Emacs for a week at least; if that produces problems, can
> you tell what problems and how can that be reproduced?

As I said, I thought we could get away with the root, and that finding
an alternative would be an optimization to be done later. (I think
admin/igc.org has some ideas regarding the loaded pdump from some time
ago.)

What I wrote are fundamental problems, so they can't be reproduced.

> IOW, I'd like to understand better why we need to make the hot part of
> the dump an ambiguous root before we consider solutions.

I think this mail I sent has the answer:

  Helmut Eller <eller.helmut@gmail.com> writes:

  >> @Helmut: Did we already talk about what the problem with the frame in
  >> the loaded pdump could be? Sorry that I don't remember.
  >
  > I never heard of that before.

  As the famous philosopher Manuel Manousakis says: Katastrophe!
  🙂

  Okay, I think one can understand this best when I try to describe what
  the pdumper does. Let's start with generating a pdump. I'll try to leave
  out as many details as I can.

  When we create a pdump, we start by allocating 3 big memory blocks which
  I'll call H (hot), C (cold), and R (relocs).

  We then traverse the graph of live objects like the old GC, starting
  from known roots. Each newly encountered object is copied to H or C in
  binary form. C is used for leaf objects like strings and floats, H for
  the rest.

  The copying of objects is done by invoking type-specific functions,
  for example dump_float, dump_vector, etc.

  We cannot use memcpy for the copying because we need more information
  when the pdump is loaded, namely relocation information, which goes to
  R. Relocation is necessary because both Emacs' DATA segment as well as
  H/C may end up at different addresses in a new process.

  Relocation info is recorded in R, and tells us where in the copied
  objects Lisp_Objects or pointers are that need patching when loaded.

  At the end, we write H, C, R to one big file.

  Good. Now let's load that file. We mmap the whole file and now have H',
  C', R' in the new process. C' and R' are good to go (C' holds leaf
  objects). H' is patched according to the reloc info that is in R'.

  At the end of the relocation H' is ready to use. Some additional setup
  and initializations, and we are good to go. I won't describe these.

  Thing is that H' now contains real Lisp objects of basically all types.
  Lisp objects contain references, so I make H' an ambig root.

  So far so good, but some Lisp objects contain not only references to
  other Lisp objects but also pointers to malloc'd memory. Not initially,
  in the dump, but during their lifetime.

  And finally we have reached face_cache.

  If initial_frame is an object in H', fix_frame won't be called for it.
  It cannot because the dump is not part of the MPS memory, and is instead
  traced ambiguously as part of the big blob H'.

  Does that make any sense?
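
The relocation step described above can be sketched roughly as follows.
This is not the pdumper's actual format: the relocation entry type, the
convention that a patched word holds a blob-relative offset, and all
names (toy_reloc, toy_apply_relocs) are assumptions made for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical relocation entry: the word at OFFSET inside the blob
   holds a blob-relative offset that must become an absolute address.  */
struct toy_reloc
{
  size_t offset;
};

/* Patch every word named by RELOCS so that it points into BLOB at the
   address where BLOB happens to be mapped in this process.  */
static void
toy_apply_relocs (unsigned char *blob, size_t blob_size,
                  const struct toy_reloc *relocs, size_t nrelocs)
{
  for (size_t i = 0; i < nrelocs; i++)
    {
      size_t at = relocs[i].offset;
      assert (at + sizeof (uintptr_t) <= blob_size);

      uintptr_t value;
      memcpy (&value, blob + at, sizeof value);  /* blob-relative offset */
      assert (value < blob_size);
      value += (uintptr_t) blob;                 /* rebase to the mapping */
      memcpy (blob + at, &value, sizeof value);
    }
}
```

Words that no relocation entry names are left alone, which is why a
plain memcpy of the objects is not enough at dump time: the table has
to say which words are references.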







* Re: MPS: Loaded pdump
  2024-05-09 10:52 MPS: Loaded pdump Gerd Möllmann
  2024-05-09 11:00 ` Eli Zaretskii
@ 2024-05-09 12:28 ` Helmut Eller
  2024-05-09 13:37   ` Gerd Möllmann
  2024-05-09 13:38 ` Helmut Eller
  2 siblings, 1 reply; 62+ messages in thread
From: Helmut Eller @ 2024-05-09 12:28 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, Emacs Devel

On Thu, May 09 2024, Gerd Möllmann wrote:

> I feel more and more that the handling of the loaded pdump with MPS as
> it is now is not sustainable, and would like to ask you for your ideas.
>
> What we do now is make the hot part of the dump an ambig root. I don't
> remember the exact numbers, but I think that's about 18 Mb of root. This
> has at least these problems, from my POV:
>
> - It is very large, and every time MPS scans roots, and that is all the
>   time, the world is stopped until it has finished. That's not good for
>   pause times.

Maybe it would be better with the MPS_RM_PROT option.  Do you know which
of the telemetry events could be used to measure this?

I recorded telemetry for the nbody benchmark.  It seems that there are
16 GC flips.  From one TraceFlipBegin event to the next TraceFlipEnd
takes about 32 million cycles (I think the timestamps are cycles, as
returned by rdtsc).  If we assume the clock frequency is 2.5 GHz that
would be about 13 milliseconds per GC flip.  But I don't know what the
events actually mean and whether this includes scanning the pdump.
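
The estimate above is plain arithmetic, sketched here with a
hypothetical helper (cycles_to_ms is not part of Emacs or MPS). It
assumes the timestamps really are cycle counts and that the clock runs
at a constant 2.5 GHz, neither of which is established here.

```c
#include <assert.h>

/* Convert a cycle count to milliseconds for a clock running at HZ
   cycles per second.  */
static double
cycles_to_ms (double cycles, double hz)
{
  assert (hz > 0);
  return cycles / hz * 1000.0;
}
```

With 32e6 cycles and 2.5e9 Hz this gives 12.8 ms, which is the "about
13 milliseconds" quoted above.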

> This has of course also consequences:
>
> - copying 18 Mb of hot objects + 12 Mb or so of leaf objects to MPS
>   could be slow. No idea if it is. That could impact startup time (not
>   important to me at all, but people have different preferences).

It would certainly be interesting to know how long it takes.

> - copying the graph requires that the copying functions know the layout
>   of Lisp objects so that the functions can exchange references in the
>   old graph to the corresponding ones in the new graph. I'm getting
>   exhausted already from thinking of writing such functions, and we
>   don't have C++ templates to help.

Do we need to know the layout or can we get away with just knowing the
size and the relocation information that the pdump already has?

> - AFAIK, but see admin/igc.org, there is no good way of allocating
>   objects in an old generation, so they will maybe take some time to
>   wander to an older generation.

There is this ramp allocation pattern but it's not exactly what we need.

> Enough rambling.
>
> Ideas, opinions, ...?

Does Open Dylan use MPS in some way to dump/load a large amount of
state?




* Re: MPS: Loaded pdump
  2024-05-09 12:28 ` Helmut Eller
@ 2024-05-09 13:37   ` Gerd Möllmann
  2024-05-09 16:10     ` Helmut Eller
  0 siblings, 1 reply; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-09 13:37 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, Emacs Devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Thu, May 09 2024, Gerd Möllmann wrote:
>
>> I feel more and more that the handling of the loaded pdump with MPS as
>> it is now is not sustainable, and would like to ask you for your ideas.
>>
>> What we do now is make the hot part of the dump an ambig root. I don't
>> remember the exact numbers, but I think that's about 18 Mb of root. This
>> has at least these problems, from my POV:
>>
>> - It is very large, and every time MPS scans roots, and that is all the
>>   time, the world is stopped until it has finished. That's not good for
>>   pause times.
>
> Maybe it would be better with the MPS_RM_PROT option.  Do you know which
> of the telemetry events could be used to measure this?

No idea. I've only looked at telemetry to see if it had something
helping me debug things. I didn't see anything obvious doing that at the
time. Maybe you could just add the PROT and see what the difference is?

> I recorded telemetry for the nbody benchmark.  It seems that there are
> 16 GC flips.  From one TraceFlipBegin event to the next TraceFlipEnd
> takes about 32 million cycles (I think the timestamps are cycles, as
> returned by rdtsc).  If we assume the clock frequency is 2.5 GHz that
> would be about 13 milliseconds per GC flip.  But I don't know what the
> events actually mean and whether this includes scanning the pdump.

In "Old Design" it says

  7.4.3. The flip phase

  .phase.flip: The roots (see design.mps.root) are scanned. This has to be
  an atomic action as far as the mutator is concerned, so all threads are
  suspended for the duration.

which probably means that between flips all roots are scanned. Unless
there is something "new" meanwhile.

Does telemetry show something concerning root?

I initially thought we could get away with the root, not least
because Emacs as an interactive program probably doesn't require a big
throughput. When I run it with MPS, what I observe doesn't prove this
immediately wrong at least.

But the horrible workarounds, like the last one for the CUs and the
half a dozen before it, cover only the first 90%, and the remaining 90%
are still out there. That's so horrible. And on top of all that, I don't
have debug info for elns ;-).

And right now I remembered pure space. What about pure space... Not now.

>> This has of course also consequences:
>>
>> - copying 18 Mb of hot objects + 12 Mb or so of leaf objects to MPS
>>   could be slow. No idea if it is. That could impact startup time (not
>>   important to me at all, but people have different preferences).
>
> It would certainly be interesting to know how long it takes.

I could add a DEFUN that allocates a mixture of objects. I think
vectors of different sizes alone probably suffice. I don't think the
type of object makes a difference.

>> - copying the graph requires that the copying functions know the layout
>>   of Lisp objects so that the functions can exchange references in the
>>   old graph to the corresponding ones in the new graph. I'm getting
>>   exhausted already from thinking of writing such functions, and we
>>   don't have C++ templates to help.
>
> Do we need to know the layout or can we get away with just knowing the
> size and the relocation information that the pdump already has?

Does it have the size? I wondered that a while back, considering the
marking in the old GC where I got the impression that there is maybe
more info in the pdump.

>> - AFAIK, but see admin/igc.org, there is no good way of allocating
>>   objects in an old generation, so they will maybe take some time to
>>   wander to an older generation.
>
> There is this ramp allocation pattern but it's not exactly what we
> need.

Yeah ;-)

>> Enough rambling.
>>
>> Ideas, opinions, ...?
>
> Does Open Dylan use MPS in some way to dump/load a large amount of
> state?

No idea. Given that Dylan in a way came out of CL, I always assumed it
had the same image-based development model. But I've never used Dylan
myself. I'll try to find out.




* Re: MPS: Loaded pdump
  2024-05-09 10:52 MPS: Loaded pdump Gerd Möllmann
  2024-05-09 11:00 ` Eli Zaretskii
  2024-05-09 12:28 ` Helmut Eller
@ 2024-05-09 13:38 ` Helmut Eller
  2024-05-09 14:18   ` Gerd Möllmann
  2 siblings, 1 reply; 62+ messages in thread
From: Helmut Eller @ 2024-05-09 13:38 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, Emacs Devel

On Thu, May 09 2024, Gerd Möllmann wrote:

> - The root is unstructured. We can't scan exactly, and so can't do
>   anything special for pointers to non-MPS memory that Lisp objects
>   have. This leads to some horrible workarounds.

The pdumper could put the different types (cons, symbol, string,
vectorlike) in different sections.  Then we could probably scan it
exactly.




* Re: MPS: Loaded pdump
  2024-05-09 13:38 ` Helmut Eller
@ 2024-05-09 14:18   ` Gerd Möllmann
  2024-05-09 15:01     ` Helmut Eller
  0 siblings, 1 reply; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-09 14:18 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, Emacs Devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Thu, May 09 2024, Gerd Möllmann wrote:
>
>> - The root is unstructured. We can't scan exactly, and so can't do
>>   anything special for pointers to non-MPS memory that Lisp objects
>>   have. This leads to some horrible workarounds.
>
> The pdumper could put the different types (cons, symbol, string,
> vectorlike) in different sections.  Then we could probably scan it
> exactly.

Either that, or we could dump the igc_header with the objects. Don't
know how difficult that would be. My gut tells me introducing more
sections than the 3 we have could be more work.
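
To make the header idea concrete, here is a minimal sketch of walking a
region in which every object is preceded by a small header. The header
layout shown is invented for the example and is not the real
igc_header; toy_header, toy_type and toy_walk are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum toy_type { TOY_CONS, TOY_STRING, TOY_VECTOR, TOY_NTYPES };

/* Hypothetical per-object header; NOT the real igc_header layout.  */
struct toy_header
{
  uint32_t type;     /* One of enum toy_type.  */
  uint32_t nbytes;   /* Object size including header, multiple of 8.  */
};

/* Walk the objects in [START, START + SIZE), counting them by type.
   Return the number of objects, or -1 if the region is malformed.  */
static long
toy_walk (const unsigned char *start, size_t size,
          size_t counts[TOY_NTYPES])
{
  long n = 0;
  size_t pos = 0;
  assert (start != NULL || size == 0);
  while (pos < size)
    {
      struct toy_header h;
      if (size - pos < sizeof h)
        return -1;
      memcpy (&h, start + pos, sizeof h);
      if (h.type >= TOY_NTYPES || h.nbytes < sizeof h
          || h.nbytes % 8 != 0 || h.nbytes > size - pos)
        return -1;
      counts[h.type]++;   /* An exact scanner would dispatch on type here.  */
      pos += h.nbytes;
      n++;
    }
  return n;
}
```

Because the type and size travel with each object, the walker needs no
per-type sections in the dump to find object boundaries.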




* Re: MPS: Loaded pdump
  2024-05-09 14:18   ` Gerd Möllmann
@ 2024-05-09 15:01     ` Helmut Eller
  2024-05-09 15:07       ` Gerd Möllmann
  2024-05-09 18:24       ` Gerd Möllmann
  0 siblings, 2 replies; 62+ messages in thread
From: Helmut Eller @ 2024-05-09 15:01 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, Emacs Devel

On Thu, May 09 2024, Gerd Möllmann wrote:

> Helmut Eller <eller.helmut@gmail.com> writes:
>
>> On Thu, May 09 2024, Gerd Möllmann wrote:
>>
>>> - The root is unstructured. We can't scan exactly, and so can't do
>>>   anything special for pointers to non-MPS memory that Lisp objects
>>>   have. This leads to some horrible workarounds.
>>
>> The pdumper could put the different types (cons, symbol, string,
>> vectorlike) in different sections.  Then we could probably scan it
>> exactly.
>
> Either that, or we could dump the igc_header with the objects. Don't
> know how difficult that would be. My gut tells me introducing more
> sections than the 3 we have could be more work.

Had forgotten about that; with the additional advantage that the
existing scan code would work right away.




* Re: MPS: Loaded pdump
  2024-05-09 15:01     ` Helmut Eller
@ 2024-05-09 15:07       ` Gerd Möllmann
  2024-05-10  7:59         ` Gerd Möllmann
  2024-05-09 18:24       ` Gerd Möllmann
  1 sibling, 1 reply; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-09 15:07 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, Emacs Devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Thu, May 09 2024, Gerd Möllmann wrote:
>
>> Helmut Eller <eller.helmut@gmail.com> writes:
>>
>>> On Thu, May 09 2024, Gerd Möllmann wrote:
>>>
>>>> - The root is unstructured. We can't scan exactly, and so can't do
>>>>   anything special for pointers to non-MPS memory that Lisp objects
>>>>   have. This leads to some horrible workarounds.
>>>
>>> The pdumper could put the different types (cons, symbol, string,
>>> vectorlike) in different sections.  Then we could probably scan it
>>> exactly.
>>
>> Either that, or we could dump the igc_header with the objects. Don't
>> know how difficult that would be. My gut tells me introducing more
>> sections than the 3 we have could be more work.
>
> Had forgotten about that; with the additional advantage that the
> existing scan code would work right away.

Yes.

I have now pushed an igc--alloc-vectors that we could use to measure
something. The open question is what arguments to call it with.

DEFUN ("igc--alloc-vectors", Figc__alloc_vectors, Sigc__alloc_vectors,
       1, 1, 0, doc: /* Allocate vectors from MPS according to SPEC.
SPEC is a list of conses (N . SIZE).  N is the number of vectors and
SIZE is the SIZE of the vectors to allocate. Allocations happen with
MPS arena in parked state. */)

Sorry for breaking the build!




* Re: MPS: Loaded pdump
  2024-05-09 13:37   ` Gerd Möllmann
@ 2024-05-09 16:10     ` Helmut Eller
  2024-05-09 16:43       ` Gerd Möllmann
  0 siblings, 1 reply; 62+ messages in thread
From: Helmut Eller @ 2024-05-09 16:10 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, Emacs Devel

On Thu, May 09 2024, Gerd Möllmann wrote:

> No idea. I've only looked at telemetry to see if it had something
> helping me debug things. I didn't see anything obvious doing that at the
> time. Maybe you could just add the PROT and see what the difference is?

There is a RootScan event. I don't know what the best way would be to
figure out the duration from that.  I simply used the difference to the
next RootScan event.  This query

select root, label, avg(delta), min(delta), max(delta)
from (select root,
             (select I.string
              from EVENT_Intern AS I, EVENT_Label AS L
              where I.stringId = L.stringId and R.root = L.address) as label,
             time,
             time - lag(time) over (order by time) as delta
      from EVENT_RootScan R)
group by root
order by avg(delta) desc
limit 5;

prints:

root             label        avg(delta)     min(delta)  max(delta)
---------------  -----------  -------------  ----------  ----------
139759443775512  (null)       1958814868.6   1889481499  2809291196
139759443778736  "pdump root  35339296.0625  34952012    36923554  
139759443779272  (null)       960551.25      926604      1113924   
139759443778864  (null)       328300.75      274522      414676    
139759443776656  (null)       306938.6875    199500      411674    

I'm not sure this makes any sense.  The first line could be for a root
that is always scanned as the last per flip; that's why the delta is
larger than for the pdump root.

So (/ 35e6 2.5e9) would give again 14 milliseconds.  Can that be?
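
For readers who prefer code to SQL, the delta computation in the query
can be sketched as below. The event struct and function name are
hypothetical, and the events are assumed to be sorted by time. Note
that lag(time) takes the timestamp of the preceding row, so each
interval runs from the previous RootScan event to the current one;
whether that interval is the scan of the current root or of the
previous one depends on when MPS logs the event, which the query alone
does not settle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of one RootScan telemetry row.  */
struct toy_event
{
  uint64_t root;
  uint64_t time;
};

/* Average of (time - previous event's time) over the events of ROOT.
   EV must be sorted by time.  The first event has no predecessor and
   contributes no delta, like the NULL that lag() yields in SQL.
   Return 0 if ROOT has no delta at all.  */
static double
toy_avg_delta (const struct toy_event *ev, size_t n, uint64_t root)
{
  double sum = 0;
  size_t count = 0;
  for (size_t i = 1; i < n; i++)
    if (ev[i].root == root)
      {
        assert (ev[i].time >= ev[i - 1].time);
        sum += (double) (ev[i].time - ev[i - 1].time);
        count++;
      }
  return count ? sum / count : 0.0;
}
```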

The MPS_RM_PROT flag didn't make much difference, beyond the noise
that's already there.




* Re: MPS: Loaded pdump
  2024-05-09 16:10     ` Helmut Eller
@ 2024-05-09 16:43       ` Gerd Möllmann
  2024-05-09 17:57         ` Helmut Eller
  0 siblings, 1 reply; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-09 16:43 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, Emacs Devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Thu, May 09 2024, Gerd Möllmann wrote:
>
>> No idea. I've only looked at telemetry to see if it had something
>> helping me debug things. I didn't see anything obvious doing that at the
>> time. Maybe you could just add the PROT and see what the difference is?
>
> There is a RootScan event. I don't know what the best way would be to
> figure out the duration from that.  I simply used the difference to the
> next RootScan event.  This query
>
> select root,label,avg(delta),min(delta),max(delta) from (select root,
> (select I.string from  EVENT_Intern AS I,  EVENT_Label AS L where
> I.stringId = L.stringId and R.root=L.address) as label,time, time -
> lag(time) over(order by time) as delta from EVENT_RootScan R) group by
> root order by avg(delta) desc limit 5;
>
> prints:
>
> root             label        avg(delta)     min(delta)  max(delta)
> ---------------  -----------  -------------  ----------  ----------
> 139759443775512  (null)       1958814868.6   1889481499  2809291196
> 139759443778736  "pdump root  35339296.0625  34952012    36923554  
> 139759443779272  (null)       960551.25      926604      1113924   
> 139759443778864  (null)       328300.75      274522      414676    
> 139759443776656  (null)       306938.6875    199500      411674    
>
> I'm not sure this makes any sense.  The first line could be for a root
> that is always scanned as the last per flip; that's why the delta is
> larger than for the pdump root.
>
> So (/ 35e6 2.5e9) would give again 14 milliseconds.  Can that be?

This is what I understood:

Bahnhof. No :-)

I think EVENT_Label is a table you made that maps addresses of roots to
a label. The delta is the number of ticks (most likely it is ticks)
between events, averaged. And the ticks divided by your CPU speed = time
taken for a pdump root scan = 14 ms, which is about what a flip takes.
Right?

I find that at least not completely implausible. Maybe scanning the
pdump root dominates how long a flip takes. Could be.

> The MPS_RM_PROT flag didn't make much difference, beyond the noise
> that's already there.

Ok pity, then we can probably forget about that idea. For the moment.




* Re: MPS: Loaded pdump
  2024-05-09 16:43       ` Gerd Möllmann
@ 2024-05-09 17:57         ` Helmut Eller
  2024-05-09 18:10           ` Gerd Möllmann
  0 siblings, 1 reply; 62+ messages in thread
From: Helmut Eller @ 2024-05-09 17:57 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, Emacs Devel

>> The MPS_RM_PROT flag didn't make much difference, beyond the noise
>> that's already there.
>
> Ok pity, then we can probably forget about that idea. For the moment.

Pity indeed.  But it's actually documented:

  10.5. Root modes ...
  Note
  
  The MPS does not currently perform either of these optimizations, so
  root modes have no effect. These features may be added in a future
  release.




* Re: MPS: Loaded pdump
  2024-05-09 17:57         ` Helmut Eller
@ 2024-05-09 18:10           ` Gerd Möllmann
  0 siblings, 0 replies; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-09 18:10 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, Emacs Devel

Helmut Eller <eller.helmut@gmail.com> writes:

>>> The MPS_RM_PROT flag didn't make much difference, beyond the noise
>>> that's already there.
>>
>> Ok pity, then we can probably forget about that idea. For the moment.
>
> Pity indeed.  But it's actually documented:
>
>   10.5. Root modes ...
>   Note
>   
>   The MPS does not currently perform either of these optimizations, so
>   root modes have no effect. These features may be added in a future
>   release.

I can't believe it. How often did I read past that? I'm getting old.
Thanks!




* Re: MPS: Loaded pdump
  2024-05-09 15:01     ` Helmut Eller
  2024-05-09 15:07       ` Gerd Möllmann
@ 2024-05-09 18:24       ` Gerd Möllmann
  2024-05-09 18:35         ` Gerd Möllmann
  1 sibling, 1 reply; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-09 18:24 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, Emacs Devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Thu, May 09 2024, Gerd Möllmann wrote:
>
>> Helmut Eller <eller.helmut@gmail.com> writes:
>>
>>> On Thu, May 09 2024, Gerd Möllmann wrote:
>>>
>>>> - The root is unstructured. We can't scan exactly, and so can't do
>>>>   anything special for pointers to non-MPS memory that Lisp objects
>>>>   have. This leads to some horrible workarounds.
>>>
>>> The pdumper could put the different types (cons, symbol, string,
>>> vectorlike) in different sections.  Then we could probably scan it
>>> exactly.
>>
>> Either that, or we could dump the igc_header with the objects. Don't
>> know how difficult that would be. My gut tells me introducing more
>> sections than the 3 we have could be more work.
>
> Had forgotten about that; with the additional advantage that the
> existing scan code would work right away.

But this also looks interesting:

  struct dump_header
  {
    ...
    /* "Relocation" table we abuse to hold information about the
       location and type of each lisp object in the dump.  We need for
       pdumper_object_type and ultimately for conservative GC
       correctness.  */
    struct dump_table_locator object_starts;

I think I'll read the pdumper code a bit more in the next days.




* Re: MPS: Loaded pdump
  2024-05-09 18:24       ` Gerd Möllmann
@ 2024-05-09 18:35         ` Gerd Möllmann
  0 siblings, 0 replies; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-09 18:35 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, Emacs Devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Helmut Eller <eller.helmut@gmail.com> writes:
>
>> On Thu, May 09 2024, Gerd Möllmann wrote:
>>
>>> Helmut Eller <eller.helmut@gmail.com> writes:
>>>
>>>> On Thu, May 09 2024, Gerd Möllmann wrote:
>>>>
>>>>> - The root is unstructured. We can't scan exactly, and so can't do
>>>>>   anything special for pointers to non-MPS memory that Lisp objects
>>>>>   have. This leads to some horrible workarounds.
>>>>
>>>> The pdumper could put the different types (cons, symbol, string,
>>>> vectorlike) in different sections.  Then we could probably scan it
>>>> exactly.
>>>
>>> Either that, or we could dump the igc_header with the objects. Don't
>>> know how difficult that would be. My gut tells me introducing more
>>> sections than the 3 we have could be more work.
>>
>> Had forgotten about that; with the additional advantage that the
>> existing scan code would work right away.
>
> But this looks also interesting
>
>   struct dump_header
>   {
>     ...
>     /* "Relocation" table we abuse to hold information about the
>        location and type of each lisp object in the dump.  We need for
>        pdumper_object_type and ultimately for conservative GC
>        correctness.  */
>     struct dump_table_locator object_starts;
>
> I think I'll read the pdumper code a bit more in the next days.

Scratch that. A bit of LSP, and one can see it's not what I thought it
was. It doesn't always have the information what kind of Lisp object is
somewhere in the dump. AFAIU, it is for conservative marking, to be able
to tell the GC if a reference is to an object start or not.
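
As an illustration of that last point only: answering "is this offset
the start of an object?" needs nothing more than a sorted list of start
offsets. The sketch below assumes a flat, ascending array of 32-bit
offsets and a binary search; that representation and the name
toy_is_object_start are made up here and are not taken from pdumper.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if OFFSET is one of the N object start offsets in
   STARTS, which must be sorted in ascending order.  */
static bool
toy_is_object_start (const uint32_t *starts, size_t n, uint32_t offset)
{
  size_t lo = 0, hi = n;
  assert (starts != NULL || n == 0);

  /* Find the first entry that is >= OFFSET.  */
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (starts[mid] < offset)
        lo = mid + 1;
      else
        hi = mid;
    }
  return lo < n && starts[lo] == offset;
}
```

Such a table says where objects begin but, as noted above, not
necessarily what type each one has, so it does not by itself enable
exact scanning.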




* Re: MPS: Loaded pdump
  2024-05-09 15:07       ` Gerd Möllmann
@ 2024-05-10  7:59         ` Gerd Möllmann
  2024-05-10  8:09           ` Helmut Eller
  0 siblings, 1 reply; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-10  7:59 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, Emacs Devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> I have now pushed an igc--alloc-vectors that we could use to measure
> something. The open question is what arguments to call it with.
>
> DEFUN ("igc--alloc-vectors", Figc__alloc_vectors, Sigc__alloc_vectors,
>        1, 1, 0, doc: /* Allocate vectors from MPS according to SPEC.
> SPEC is a list of conses (N . SIZE).  N is the number of vectors and
> SIZE is the SIZE of the vectors to allocate. Allocations happen with
> MPS arena in parked state. */)
>
> Sorry for breaking the build!

Moin Helmut,

could you please check if I'm doing something wrong? I do 

  (benchmark-run 1 (igc--alloc-vectors '((400000 . 4))))
=> (0.06476799999999999 0 0.0)

in a debug build (-lmps-debug).

That allocates 400K vectors of size 4, which is on my machine (* 400000
(+ 8 8 (* 4 8))) = ca. 19 Mb. It's in a parked arena, so GCs don't run,
but anyway, that's a bit faster than I thought, so I'd like to ask you
for confirmation.
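
The size estimate above, written out as a small helper. It assumes
8-byte words and, following the expression above, two header words per
vector plus one word per slot; what exactly the two header words are is
not spelled out in this message, and toy_vectors_bytes is a
hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for NVECTORS vectors of NSLOTS slots each, assuming
   8-byte words and two header words per vector.  */
static size_t
toy_vectors_bytes (size_t nvectors, size_t nslots)
{
  const size_t word = 8;
  const size_t per_vector = 2 * word + nslots * word;
  /* Guard against overflow of the multiplication below.  */
  assert (nvectors == 0 || per_vector <= (size_t) -1 / nvectors);
  return nvectors * per_vector;
}
```

For 400000 vectors of 4 slots this is 400000 * 48 = 19200000 bytes,
the "ca. 19 Mb" above.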





* Re: MPS: Loaded pdump
  2024-05-10  7:59         ` Gerd Möllmann
@ 2024-05-10  8:09           ` Helmut Eller
  2024-05-10  8:35             ` Gerd Möllmann
  2024-05-10 10:25             ` Eli Zaretskii
  0 siblings, 2 replies; 62+ messages in thread
From: Helmut Eller @ 2024-05-10  8:09 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, Emacs Devel

On Fri, May 10 2024, Gerd Möllmann wrote:

>> Sorry for breaking the build!
>
> Moin Helmut,
>
> could you please check if I'm doing something wrong? I do 
>
>   (benchmark-run 1 (igc--alloc-vectors '((400000 . 4))))
> => (0.06476799999999999 0 0.0)
>
> in a debug build (-lmps-debug).
>
> That allocates 400K vectors of size 4, which is on my machine (* 400000
> (+ 8 8 (* 4 8))) = ca. 19 Mb. It's in a parked arena, so GCs don't run,
> but anyway, that's a bit faster than I thought, so I'd like to ask you
> for confirmation.

It's a bit slower here and varies somewhat:
(0.117823281 0 0.0)
(0.074480941 0 0.0)
(0.119195628 0 0.0)
(0.113130772 0 0.0)
(0.122865761 0 0.0)
(0.21220993000000002 0 0.0)

The last one triggered SIGSEGV messages in gdb.




* Re: MPS: Loaded pdump
  2024-05-10  8:09           ` Helmut Eller
@ 2024-05-10  8:35             ` Gerd Möllmann
  2024-05-10  8:51               ` Helmut Eller
  2024-05-10 10:25             ` Eli Zaretskii
  1 sibling, 1 reply; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-10  8:35 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, Emacs Devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Fri, May 10 2024, Gerd Möllmann wrote:
>
>>> Sorry for breaking the build!
>>
>> Moin Helmut,
>>
>> could you please check if I'm doing something wrong? I do 
>>
>>   (benchmark-run 1 (igc--alloc-vectors '((400000 . 4))))
>> => (0.06476799999999999 0 0.0)
>>
>> in a debug build (-lmps-debug).
>>
>> That allocates 400K vectors of size 4, which is on my machine (* 400000
>> (+ 8 8 (* 4 8))) = ca. 19 Mb. It's in a parked arena, so GCs don't run,
>> but anyway, that's a bit faster than I thought, so I'd like to ask you
>> for confirmation.
>
> It's a bit slower and varies a bit :
> (0.117823281 0 0.0)
> (0.074480941 0 0.0)
> (0.119195628 0 0.0)
> (0.113130772 0 0.0)
> (0.122865761 0 0.0)
> (0.21220993000000002 0 0.0)
>
> The last one triggered SIGSEGV messages in gdb.

Thanks!

So, I'd give a higher probability now to the assumption that the time
taken for allocations when copying from a dump to MPS would not be a
deal breaker. At least for me.

Wrt SEGV, mine just survived

  (benchmark-run 1000 (igc--alloc-vectors '((400000 . 4))))
  (363.912468 0 0.0)



* Re: MPS: Loaded pdump
  2024-05-10  8:35             ` Gerd Möllmann
@ 2024-05-10  8:51               ` Helmut Eller
  2024-05-10  8:54                 ` Gerd Möllmann
  0 siblings, 1 reply; 62+ messages in thread
From: Helmut Eller @ 2024-05-10  8:51 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, Emacs Devel

On Fri, May 10 2024, Gerd Möllmann wrote:

> Wrt to SEGV, mine just survived
>
>   (benchmark-run 1000 (igc--alloc-vectors '((400000 . 4))))
>   (363.912468 0 0.0)

It did survive here too, it just triggered memory barriers; I configured
gdb so that it prints something and this may be slow.  But apparently
even a parked arena handles memory barriers.



* Re: MPS: Loaded pdump
  2024-05-10  8:51               ` Helmut Eller
@ 2024-05-10  8:54                 ` Gerd Möllmann
  0 siblings, 0 replies; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-10  8:54 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, Emacs Devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Fri, May 10 2024, Gerd Möllmann wrote:
>
>> Wrt to SEGV, mine just survived
>>
>>   (benchmark-run 1000 (igc--alloc-vectors '((400000 . 4))))
>>   (363.912468 0 0.0)
>
> It did survive here too, it just triggered memory barriers; I configured
> gdb so that it prints something and this may be slow.  But apparently
> even a parked arena handles memory barriers.

Ah that's interesting! And a good sign.



* Re: MPS: Loaded pdump
  2024-05-10  8:09           ` Helmut Eller
  2024-05-10  8:35             ` Gerd Möllmann
@ 2024-05-10 10:25             ` Eli Zaretskii
  2024-05-10 11:31               ` Gerd Möllmann
  1 sibling, 1 reply; 62+ messages in thread
From: Eli Zaretskii @ 2024-05-10 10:25 UTC (permalink / raw)
  To: Helmut Eller; +Cc: gerd.moellmann, emacs-devel

> From: Helmut Eller <eller.helmut@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>,  Emacs Devel <emacs-devel@gnu.org>
> Date: Fri, 10 May 2024 10:09:18 +0200
> 
> On Fri, May 10 2024, Gerd Möllmann wrote:
> 
> >> Sorry for breaking the build!
> >
> > Moin Helmut,
> >
> > could you please check if I'm doing something wrong? I do 
> >
> >   (benchmark-run 1 (igc--alloc-vectors '((400000 . 4))))
> > => (0.06476799999999999 0 0.0)
> >
> > in a debug build (-lmps-debug).
> >
> > That allocates 400K vectors of size 4, which is on my machine (* 400000
> > (+ 8 8 (* 4 8))) = ca. 19 Mb. It's in a parked arena, so GCs don't run,
> > but anyway, that's a bit faster than I thought, so I'd like to ask you
> > for confirmation.
> 
> It's a bit slower and varies a bit :
> (0.117823281 0 0.0)
> (0.074480941 0 0.0)
> (0.119195628 0 0.0)
> (0.113130772 0 0.0)
> (0.122865761 0 0.0)
> (0.21220993000000002 0 0.0)
> 
> The last one triggered SIGSEGV messages in gdb.

Here's what I get (in a 32-bit build, so about 11MB per run):

  (0.011463000000000001 0 0.0)
  (0.012748 0 0.0)
  (0.013849 0 0.0)
  (0.011302 0 0.0)
  (0.013317 0 0.0)
  (0.012459 0 0.0)
  (0.013537 0 0.0)
  (0.012501 0 0.0)
  (0.012622000000000001 0 0.0)

How come the MS-Windows build is so much faster?  Am I missing
something?



* Re: MPS: Loaded pdump
  2024-05-10 10:25             ` Eli Zaretskii
@ 2024-05-10 11:31               ` Gerd Möllmann
  2024-05-10 12:52                 ` Gerd Möllmann
  0 siblings, 1 reply; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-10 11:31 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Helmut Eller, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> Here's what I get (in a 32-bit build, so about 11MB per run):
>
>   (0.011463000000000001 0 0.0)
>   (0.012748 0 0.0)
>   (0.013849 0 0.0)
>   (0.011302 0 0.0)
>   (0.013317 0 0.0)
>   (0.012459 0 0.0)
>   (0.013537 0 0.0)
>   (0.012501 0 0.0)
>   (0.012622000000000001 0 0.0)
>
> How come the MS-Windows build is so much faster?  Am I missing
> something?

Thanks!

I had used the debug build. In an optimized build with -lmps, I get

  (benchmark-run 1 (igc--alloc-vectors '((400000 . 4))))
  (0.016177999999999998 0 0.0)

When I project that to 32 bits, that would be 

  (* (/ 11.0 19) 0.016) = 0.009
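
A rough sketch of that arithmetic in C, for illustration only. The
helper names are made up, the per-vector layout (one igc header word,
one vector header word, four slots) is taken from the 64-bit figure
quoted above, and the linear scaling with allocated size is an
assumption, not a measurement:

```c
#include <assert.h>
#include <stddef.h>

/* Bytes used by COUNT vectors of NSLOTS slots each, assuming one igc
   header word, one vector header word and NSLOTS slot words, each
   WORD bytes wide.  */
static size_t
vectors_bytes (size_t count, size_t nslots, size_t word)
{
  return count * (word + word + nslots * word);
}

/* Scale a measured time linearly from FROM_MB to TO_MB of allocation.  */
static double
project_time (double seconds, double from_mb, double to_mb)
{
  assert (from_mb > 0);
  return seconds * (to_mb / from_mb);
}
```

With 400000 vectors of 4 slots and 8-byte words, vectors_bytes gives
19200000 bytes, and project_time (0.016, 19.0, 11.0) gives roughly
0.0093 seconds.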



* Re: MPS: Loaded pdump
  2024-05-10 11:31               ` Gerd Möllmann
@ 2024-05-10 12:52                 ` Gerd Möllmann
  2024-05-10 13:37                   ` Helmut Eller
  2024-05-13  9:11                   ` Gerd Möllmann
  0 siblings, 2 replies; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-10 12:52 UTC (permalink / raw)
  To: Eli Zaretskii, Helmut Eller; +Cc: emacs-devel

Ok, I think I'll try to get rid of the pdump root. Maybe I'll regret it,
and maybe I'll fail, but I feel the root is not the right thing to have.
(I'll do it in a branch from the branch in the fork here.)

Step 1:

Make the loaded dump traversable by dumping igc_headers.

How easy or not that is is hard to say. On one hand, there seems to be
dump_object_begin, which looks promising, but I already know one case
(hash tables), where extending that function to write igc_headers might
not be sufficient. And where there is one exception, there are more. And
flags like pack_objects could interfere, and so on.

Downside: igc_header is then even more set in stone. Removing the header
for conses becomes even more work. One would need a new section in the
dump, just for conses.

Step 1.5:

When (1) is done, one could make the dump an exact root and use
dflt_scan on it. (That's better than an ambig root, but I don't want the
root because the world is stopped when roots are scanned.)

Step 2:

Walk through the dump and make copies of dumped objects in MPS memory.
Record a mapping from dumped -> MPS object. Then walk through the copied
objects and replace references to dumped objects.

This is a bit hand-wavy. At some point when a dump is loaded, things in
Emacs are set up to refer to objects in the dump. We will have to do our
copy before that and somehow make sure the copy is used instead of the
dump.

Unmap the dump, remove the root.
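
To make step 2 a bit less hand-wavy, here is a minimal two-pass
sketch in C. Everything in it is a toy assumption: struct obj stands
in for a Lisp object, malloc stands in for allocation from an MPS
pool, and the mapping is a linear search where a real implementation
would want a hash table. It ignores the pdumper relocation phases
entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy object: a payload plus up to two references to other objects.  */
struct obj
{
  int payload;
  struct obj *ref[2];
};

/* Mapping from dumped object to its copy.  */
struct map
{
  struct obj **from, **to;
  size_t n;
};

static struct obj *
map_lookup (const struct map *m, struct obj *old)
{
  for (size_t i = 0; i < m->n; i++)
    if (m->from[i] == old)
      return m->to[i];
  return NULL;
}

/* Copy the N objects in DUMPED and return an array of the copies.  */
static struct obj **
copy_graph (struct obj *dumped, size_t n)
{
  struct obj **olds = malloc (n * sizeof *olds);
  struct obj **copies = malloc (n * sizeof *copies);
  assert (olds != NULL && copies != NULL);

  /* Pass 1: copy each object verbatim and record dumped -> copy.  */
  for (size_t i = 0; i < n; i++)
    {
      olds[i] = &dumped[i];
      copies[i] = malloc (sizeof **copies);
      assert (copies[i] != NULL);
      *copies[i] = dumped[i];
    }

  /* Pass 2: replace references into the dump with references to copies.  */
  struct map m = { olds, copies, n };
  for (size_t i = 0; i < n; i++)
    for (int k = 0; k < 2; k++)
      if (copies[i]->ref[k] != NULL)
        {
          struct obj *new_ref = map_lookup (&m, copies[i]->ref[k]);
          assert (new_ref != NULL);
          copies[i]->ref[k] = new_ref;
        }

  free (olds);
  return copies;
}
```

The second pass is where the real work would be: it needs to know
where the references are in every object type.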

WDYT?



* Re: MPS: Loaded pdump
  2024-05-10 12:52                 ` Gerd Möllmann
@ 2024-05-10 13:37                   ` Helmut Eller
  2024-05-10 13:59                     ` Gerd Möllmann
  2024-05-13  9:11                   ` Gerd Möllmann
  1 sibling, 1 reply; 62+ messages in thread
From: Helmut Eller @ 2024-05-10 13:37 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, emacs-devel

On Fri, May 10 2024, Gerd Möllmann wrote:

> Ok, I think I'll try to get rid of the pdump root. Maybe I'll regret it,
> and maybe I'll fail, but I feel the root is not the right thing to have.
> (I'll do it in a branch from the branch in the fork here.)
>
> Step 1:
>
> Make the loaded dump traversable by dumping igc_headers.
>
> How easy or not that is is hard to say. On one hand, there seems to be
> dump_object_begin, which looks promising, but I already know one case
> (hash tables), where extending that function to write igc_headers might
> not be sufficient. And where there is one exception, there are more. And flags
> like pack_objects could interfere, and so on.

I feared that hash tables could be difficult; about the flags, I don't
know enough.

> Downside: igc_header is then even more set in stone. Removing the header
> for conses becomes even more work. One would need a new section in the
> dump, just for conses.

In the long run, I'd like to have header-less conses.  On a typical
heap, there are more conses than any other type.  It seems like the
easiest (only?) way to get that with MPS is to put them in their own
pool.  Maybe we could keep igc-headers in the pdump, if we need to copy
one object at a time anyway.

> Step 1.5:
>
> When (1) is done, one could make the dump an exact root and use
> dflt_scan on it. (That's better than an ambig root, but I don't want the
> root because the world is stopped when roots are scanned.)

And perhaps it helps to fix those native comp problems.

> Step 2:
>
> Walk through the dump and make copies of dumped objects in MPS memory.
> Record a mapping from dumped -> MPS object. Then walk through the copied
> objects and replace references to dumped objects.
>
> This is a bit hand-wavy. At some point when a dump is loaded, things in
> Emacs are set up to refer to objects in the dump. We will have to do our
> copy before that and somehow make sure the copy is used instead of the
> dump.

I hope that I understand a bit more about the pdumper at that time.  At
the moment some things, like those different relocation phases, look
scarily complicated to me.



* Re: MPS: Loaded pdump
  2024-05-10 13:37                   ` Helmut Eller
@ 2024-05-10 13:59                     ` Gerd Möllmann
  2024-05-10 14:31                       ` Helmut Eller
  0 siblings, 1 reply; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-10 13:59 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

>> How easy or not that is is hard to say. On one hand, there seems to be
>> dump_object_begin, which looks promising, but I already know one case
>> (hash tables), where extending that function to write igc_headers might
>> not be sufficient. And where there is one exception, there are more. And flags
>> like pack_objects could interfere, and so on.
>
> I feared that hash tables could be difficult; about the flags, I don't
> know enough.

I think I have an idea how to do it. Maybe the easiest way would be to
make the key/value vectors MPS objects first. They have to be anyway for
the weak table support.

>> Downside: igc_header is then even more set in stone. Removing the header
>> for conses becomes even more work. One would need a new section in the
>> dump, just for conses.
>
> In the long run, I'd like to have header-less conses.  On a typical
> heap, there are more conses than any other type.  It seems like the
> easiest (only?) way to get that with MPS is to put them in their own
> pool.  Maybe we could keep igc-headers in the pdump, if we need to copy
> one object at a time anyway.

Something like that. It will be a lot of work to remove the header from
conses. We then couldn't use the address-independent hash, which again
hits us with hash tables. We'd have to use MPS' location dependency
mechanism... :-/. And sxhash-eq gets a problem with a moving collector.
Bad idea to add that.

>> Step 1.5:
>>
>> When (1) is done, one could make the dump an exact root and use
>> dflt_scan on it. (That's better than an ambig root, but I don't want the
>> root because the world is stopped when roots are scanned.)
>
> And perhaps helps to fix those native comp problems.

Yeah, maybe. I've sent a mail to the jit mailing list now. Maybe they
know how to get debug info on macOS. I definitely don't want to wade
through the arm64 assembler code.

Does a native comp build of igc work on Debian, BTW?

>> Step 2:
>>
>> Walk through the dump and make copies of dumped objects in MPS memory.
>> Record a mapping from dumped -> MPS object. Then walk through the copied
>> objects and replace references to dumped objects.
>>
>> This is a bit hand-wavy. At some point when a dump is loaded, things in
>> Emacs are set up to refer to objects in the dump. We will have to do our
>> copy before that and somehow make sure the copy is used instead of the
>> dump.
>
> I hope that I understand a bit more about the pdumper at that time.  At
> the moment some things, like those different relocation phases, look
> scarily complicated to me.

👍 



* Re: MPS: Loaded pdump
  2024-05-10 13:59                     ` Gerd Möllmann
@ 2024-05-10 14:31                       ` Helmut Eller
  2024-05-10 14:36                         ` Gerd Möllmann
  0 siblings, 1 reply; 62+ messages in thread
From: Helmut Eller @ 2024-05-10 14:31 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, emacs-devel

On Fri, May 10 2024, Gerd Möllmann wrote:

> Something like that. It will be a lot of work to remove the header from
> conses. We then couldn't use the address-independent hash, which again
> hits us with hash tables. We'd have to use MPS' location dependency
> mechanism... :-/. And sxhash-eq gets a problem with a moving collector.
> Bad idea to add that.

If sxhash-eq isn't used much, we could declare it obsolete or simply say
that it doesn't work with igc.

> Does a native comp build of igc work on Debian, BTW?

AFAICT, yes.  It builds and passes the comp-tests.  Also with your
recent extra checks enabled.



* Re: MPS: Loaded pdump
  2024-05-10 14:31                       ` Helmut Eller
@ 2024-05-10 14:36                         ` Gerd Möllmann
  0 siblings, 0 replies; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-10 14:36 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Fri, May 10 2024, Gerd Möllmann wrote:
>
>> Something like that. It will be a lot of work to remove the header from
>> conses. We then couldn't use the address-independent hash, which again
>> hits us with hash tables. We'd have to use MPS' location dependency
>> mechanism... :-/. And sxhash-eq gets a problem with a moving collector.
>> Bad idea to add that.
>
> If sxhash-eq isn't used much, we could declare it obsolete or simply say
> that it doesn't work with igc.

Yes.

>
>> Does a native comp build of igc work on Debian, BTW?
>
> AFAICT, yes.  It builds and passes the comp-tests.  Also with your
> recent extra checks enabled.

Thanks. I feared that it might be platform dependent.



* Re: MPS: Loaded pdump
  2024-05-10 12:52                 ` Gerd Möllmann
  2024-05-10 13:37                   ` Helmut Eller
@ 2024-05-13  9:11                   ` Gerd Möllmann
  2024-05-14  8:23                     ` Gerd Möllmann
  1 sibling, 1 reply; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-13  9:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Helmut Eller, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Ok, I think I'll try to get rid of the pdump root. Maybe I'll regret it,
> and maybe I'll fail, but I feel the root is not the right thing to have.
> (I'll do it in a branch from the branch in the fork here.)
>
> Step 1:
>
> Make the loaded dump traversable by dumping igc_headers.
>
> How easy or not that is is hard to say. On one hand, there seems to be
> dump_object_begin, which looks promising, but I already know one case
> (hash tables), where extending that function to write igc_headers might
> not be sufficient. And where there is one exception, there are more. And
> flags like pack_objects could interfere, and so on.
>
> Downside: igc_header is then even more set in stone. Removing the header
> for conses becomes even more work. One would need a new section in the
> dump, just for conses.
>
> Step 1.5:
>
> When (1) is done, one could make the dump an exact root and use
> dflt_scan on it. (That's better than an ambig root, but I don't want the
> root because the world is stopped when roots are scanned.)
>
> Step 2:
>
> Walk through the dump and make copies of dumped objects in MPS memory.
> Record a mapping from dumped -> MPS object. Then walk through the copied
> objects and replace references to dumped objects.
>
> This is a bit hand-wavy. At some point when a dump is loaded, things in
> Emacs are set up to refer to objects in the dump. We will have to do our
> copy before that and somehow make sure the copy is used instead of the
> dump.

An update on what's going on:

I now have a branch in my fork that builds with and without MPS, where
the MPS dump contains igc headers and the non-MPS dump doesn't.

Next step will be to ensure that the hot section of the dump is indeed
traversable as I want, and fix what's wrong. So we are at step 1.25 or
so.

Things are a bit slower right now for procrastination reasons. The
pdumper is boring as hell :-/. Did I mention that code generation would
be a nice thing? Anyway.



* Re: MPS: Loaded pdump
  2024-05-13  9:11                   ` Gerd Möllmann
@ 2024-05-14  8:23                     ` Gerd Möllmann
  2024-05-14 14:22                       ` Helmut Eller
  2024-05-16  4:25                       ` Gerd Möllmann
  0 siblings, 2 replies; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-14  8:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Helmut Eller, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
>> Ok, I think I'll try to get rid of the pdump root. Maybe I'll regret it,
>> and maybe I'll fail, but I feel the root is not the right thing to have.
>> (I'll do it in a branch from the branch in the fork here.)
>>
>> Step 1:
>>
>> Make the loaded dump traversable by dumping igc_headers.
>>
>> How easy or not that is is hard to say. On one hand, there seems to be
>> dump_object_begin, which looks promising, but I already know one case
>> (hash tables), where extending that function to write igc_headers might
>> not be sufficient. And where there is one exception, there are more. And
>> flags like pack_objects could interfere, and so on.
>>
>> Downside: igc_header is then even more set in stone. Removing the header
>> for conses becomes even more work. One would need a new section in the
>> dump, just for conses.
>>
>> Step 1.5:
>>
>> When (1) is done, one could make the dump an exact root and use
>> dflt_scan on it. (That's better than an ambig root, but I don't want the
>> root because the world is stopped when roots are scanned.)
>>
>> Step 2:
>>
>> Walk through the dump and make copies of dumped objects in MPS memory.
>> Record a mapping from dumped -> MPS object. Then walk through the copied
>> objects and replace references to dumped objects.
>>
>> This is a bit hand-wavy. At some point when a dump is loaded, things in
>> Emacs are set up to refer to objects in the dump. We will have to do our
>> copy before that and somehow make sure the copy is used instead of the
>> dump.
>
> An update on what's going on:
>
> I now have a branch in my fork that builds with and without MPS, where
> the MPS dump contains igc headers and the non-MPS dump doesn't.
>
> Next step will be to ensure that the hot section of the dump is indeed
> traversable as I want, and fix what's wrong. So we are at step 1.25 or
> so.
>
> Things are a bit slower right now for procrastination reasons. The
> pdumper is boring as hell :-/. Did I mention that code generation would
> be a nice thing? Anyway.

I've now pushed something like step 1.45.

Random notes:

- The dump for MPS now contains the start offsets of igc objects. The
  existing object_starts relocs cannot be used because they are for Lisp
  objects only.

- Each igc object in the dump has an igc_header. The function
  pdumper_visit_object_starts can be used to traverse them in a loaded
  dump. I chose this interface because it cannot be made sure that igc
  objects occupy a contiguous region in the dump, at least not with
  surgery on the pdumper.

- Obarrays are probably not yet handled right. I couldn't bring myself to
  do this yet. As you know my fork doesn't have obarrays, but uses CL
  packages which use hash tables. Any takers?

- There are a number of igc_obj_types of the form IGC_OBJ_DUMPED_xy. I'm
  not sure what to do with these.

- I saw that igc_header::pvec_type is used for something, and should
  probably tell you that it was meant as a debugging aid (for the case of
  hitting IGC_OBJ_FWDs, so that one can more easily see what type was
  forwarded). This could also be seen from the vectorlike header. I think
  at some point we should remove pvec_type in favor of more hash bits.
  Whatever, not so important.

- I commented out from .gitignore the .patch files. This is because
  ignoring .patch files disables some very handy Magit functionality wrt
  patch handling.

Otherwise this is not yet used. I checked with an MPS and non-MPS
build (with checking=all).

Taking the next step will be difficult. Copying objects from the dump to
MPS and fixing references from one graph to the other requires writing a
gazillion functions to do that. At least at the moment that's something
that exhausts me just by thinking of it.

Whatever. Happy photosynthesizing on a sunny day!



* Re: MPS: Loaded pdump
  2024-05-14  8:23                     ` Gerd Möllmann
@ 2024-05-14 14:22                       ` Helmut Eller
  2024-05-14 15:46                         ` Gerd Möllmann
  2024-05-16  4:25                       ` Gerd Möllmann
  1 sibling, 1 reply; 62+ messages in thread
From: Helmut Eller @ 2024-05-14 14:22 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, emacs-devel

On Tue, May 14 2024, Gerd Möllmann wrote:

> Random notes:
>
> - The dump for MPS now contains the start offsets of igc objects. The
>   existing object_starts relocs cannot be used because they are for Lisp
>   objects only.
>
> - Each igc object in the dump has an igc_header. The function
>   pdumper_visit_object_starts can be used to traverse them in a loaded
>   dump. I chose this interface because it cannot be made sure that igc
>   objects occupy a contiguous region in the dump, at least not with
>   surgery on the pdumper.

What are those objects that make the region non-contiguous? 

> - Obarrays are probably not yet handled right. I couldn't bring myself to
>   do this yet. As you know my fork doesn't have obarrays, but uses CL
>   packages which use hash tables. Any takers?

There seem to be 24 obarrays in my branch.  Any idea how I could test
that those are not working?

> Otherwise this is not yet used. I checked with an MPS and non-MPS
> build (with checking=all).

So at the moment the dump is still an ambiguous root.  And because it is
not a contiguous region of igc-objects it's not easy to scan it exactly,
right?

I think I would try one of these before going to the copying step:
 a) make the hot section an exact root
 b) figure out the minimal set of roots (something like
    the inverse of dump_metadata_for_pdumper)

a) would be a good test to make sure that the igc-headers are correct
and b) will probably be needed for the next step anyway.

> Taking the next step will be difficult. Copying objects from the dump to
> MPS and fixing references from one graph to the other requires writing a
> gazillion functions to do that. At least at the moment that's something
> that exhausts me just by thinking of it.

MPS knows how to copy objects.  Can't we somehow reuse that?  Perhaps
not, if you want to patch/relocate and copy in one step but if you do it
in two steps it should work, no?



* Re: MPS: Loaded pdump
  2024-05-14 14:22                       ` Helmut Eller
@ 2024-05-14 15:46                         ` Gerd Möllmann
  2024-05-14 17:49                           ` Eli Zaretskii
  0 siblings, 1 reply; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-14 15:46 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Tue, May 14 2024, Gerd Möllmann wrote:
>
>> Random notes:
>>
>> - The dump for MPS now contains the start offsets of igc objects. The
>>   existing object_starts relocs cannot be used because they are for Lisp
>>   objects only.
>>
>> - Each igc object in the dump has an igc_header. The function
>>   pdumper_visit_object_starts can be used to traverse them in a loaded
>>   dump. I chose this interface because it cannot be made sure that igc
>>   objects occupy a contiguous region in the dump, at least not with
>>   surgery on the pdumper.
>
> What are those objects that make the region non-contiguous?

It's more that nothing in pdumper enforces objects to be contiguous. If
O1 and O2 are pointers to adjacent objects in the dump, then nothing
enforces that O1 + O1.size == O2. I didn't want to assume that is the
case if nothing makes it sure.
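
A small C sketch of the difference, under loud assumptions:
toy_header is a stand-in for igc_header with only a size and a type,
and toy_visit_object_starts is a hypothetical name, not the signature
of pdumper_visit_object_starts. Visiting by recorded start offsets
works whether or not there are gaps; walking by size only works if
the contiguity check holds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for igc_header; only the size matters here.  */
struct toy_header
{
  unsigned size;   /* total object size in bytes, header included */
  unsigned type;
};

typedef void (*toy_visit_fn) (const unsigned char *start, void *closure);

/* Call FN on each of the N recorded object starts in the dump at BASE.  */
static void
toy_visit_object_starts (const unsigned char *base, const size_t *starts,
                         size_t n, toy_visit_fn fn, void *closure)
{
  assert (fn != NULL);
  for (size_t i = 0; i < n; i++)
    fn (base + starts[i], closure);
}

/* Return 1 if O1 + O1.size == O2 holds for all adjacent pairs.  */
static int
toy_starts_are_contiguous (const unsigned char *base, const size_t *starts,
                           size_t n)
{
  for (size_t i = 0; i + 1 < n; i++)
    {
      struct toy_header h;
      memcpy (&h, base + starts[i], sizeof h);
      if (starts[i] + h.size != starts[i + 1])
        return 0;
    }
  return 1;
}

/* Example visitor: add up the sizes found in the headers.  */
static void
toy_sum_sizes (const unsigned char *start, void *closure)
{
  struct toy_header h;
  memcpy (&h, start, sizeof h);
  *(size_t *) closure += h.size;
}
```

If some non-object data sits between two objects, the visitor still
sees every object, while the contiguity check fails.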

>> - Obarrays are probably not yet handled right. I couldn't bring myself to
>>   do this yet. As you know my fork doesn't have obarrays, but uses CL
>>   packages which use hash tables. Any takers?
>
> There seem to be 24 obarrays in my branch.  Any idea how I could test
> that those are not working?

You could put a breakpoint on the dump_start_object in dump_obarray.
Then see if dump_start_object decides to put a header or not. If it
doesn't, then we probably have to do something like in dump_hash_table.

>> Otherwise this is not yet used. I checked with an MPS and non-MPS
>> build (with checking=all).
>
> So at the moment the dump is still an ambiguous root.  And because it is
> not a contiguous region of igc-objects it's not easy to scan it exactly,
> right?

It's still an ambiguous root, yes.

One could make it an exact root by, for instance, calling dflt_scan
on the area from object start address to end address. Or one could
extract the part of scanning a single object from dflt_scan (the inside
of the loop), maybe that's nicer.

> I think I would try one of these before going to the copying step:
>  a) make the hot section an exact root
>  b) figure out the minimal set of roots (something like
>     the inverse of dump_metadata_for_pdumper)
>
> a) would be a good test to make sure that the igc-headers are correct

Definitely!

> and b) will probably be needed for the next step anyway.

That would be interesting if one could extract something useful out of
the pdumper metadata, indeed! A difficulty could be that the pdumper is
only interested in Lisp objects, AFAIU, which is a subset of what we
have in MPS.

>> Taking the next step will be difficult. Copying objects from the dump to
>> MPS and fixing references from one graph to the other requires writing a
>> gazillion functions to do that. At least at the moment that's something
>> that exhausts me just by thinking of it.
>
> MPS knows how to copy objects.  Can't we somehow reuse that?  Perhaps
> not, if you want to patch/relocate and copy in one step but if you do it
> in two steps it should work, no?

I think MPS doesn't know very much actually. It basically has only what
the 4 functions in the object format give it. Iterate over objects with
skip, scan something with scan, insert padding, and mark something as
forwarded. That's basically it. It doesn't know anything about the
objects themselves.
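
To show how little that is, here is a toy version of those four
operations in C. This is not the MPS API: the names, the object
layout and the signatures are all made up for illustration, and real
MPS formats are set up differently. It only shows that a collector
with skip, scan, pad and fwd can move raw bytes and update
references, without knowing anything else about the object.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_type { TOY_DATA, TOY_FWD, TOY_PAD };

/* Toy object: a header plus one reference.  In a forwarded object,
   REF holds the new address instead.  */
struct toy_obj
{
  enum toy_type type;
  size_t size;              /* size in bytes, header included */
  struct toy_obj *ref;
};

typedef void (*toy_fix_fn) (struct toy_obj **slot);

/* skip: return the address just past the object at P.  */
static void *
toy_skip (void *p)
{
  return (char *) p + ((struct toy_obj *) p)->size;
}

/* scan: report every reference slot of the object at P to FIX.  */
static void
toy_scan (void *p, toy_fix_fn fix)
{
  struct toy_obj *o = p;
  if (o->type == TOY_DATA && o->ref != NULL)
    fix (&o->ref);
}

/* pad: fill SIZE bytes at ADDR with a dummy object.  */
static void
toy_pad (void *addr, size_t size)
{
  struct toy_obj *o = addr;
  assert (size >= sizeof *o);
  o->type = TOY_PAD;
  o->size = size;
  o->ref = NULL;
}

/* fwd: mark OLD as moved to NEW_ADDR; the size stays so skip still works.  */
static void
toy_fwd (void *old, void *new_addr)
{
  struct toy_obj *o = old;
  o->type = TOY_FWD;
  o->ref = new_addr;
}

/* What a copying collector can do with the above: move the raw bytes
   and leave a forwarding marker behind ...  */
static void *
toy_move (void *old, void *to)
{
  assert (((struct toy_obj *) old)->type == TOY_DATA);
  size_t size = (char *) toy_skip (old) - (char *) old;
  memcpy (to, old, size);
  toy_fwd (old, to);
  return to;
}

/* ... and update a reference that points to a forwarded object.  */
static void
toy_fix_ref (struct toy_obj **slot)
{
  if ((*slot)->type == TOY_FWD)
    *slot = (*slot)->ref;
}
```

Note that the move is a plain byte copy. Anything beyond that, such
as knowing which fields are references, comes from the scan function
that the client supplies.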



* Re: MPS: Loaded pdump
  2024-05-14 15:46                         ` Gerd Möllmann
@ 2024-05-14 17:49                           ` Eli Zaretskii
  2024-05-14 18:10                             ` Gerd Möllmann
  0 siblings, 1 reply; 62+ messages in thread
From: Eli Zaretskii @ 2024-05-14 17:49 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: eller.helmut, emacs-devel

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>,  emacs-devel@gnu.org
> Date: Tue, 14 May 2024 17:46:49 +0200
> 
> It's more that nothing in pdumper enforces objects to be contiguous. If
> O1 and O2 are pointers to adjacent objects in the dump, then nothing
> enforces that O1 + O1.size == O2. I didn't want to assume that is the
> case if nothing makes it sure.

Isn't that how pdumper.c works?  After dumping an object, it moves
pointer to immediately after it.  No?



* Re: MPS: Loaded pdump
  2024-05-14 17:49                           ` Eli Zaretskii
@ 2024-05-14 18:10                             ` Gerd Möllmann
  0 siblings, 0 replies; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-14 18:10 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eller.helmut, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>,  emacs-devel@gnu.org
>> Date: Tue, 14 May 2024 17:46:49 +0200
>> 
>> It's more that nothing in pdumper enforces objects to be contiguous. If
>> O1 and O2 are pointers to adjacent objects in the dump, then nothing
>> enforces that O1 + O1.size == O2. I didn't want to assume that is the
>> case if nothing makes it sure.
>
> Isn't that how pdumper.c works?  After dumping an object, it moves
> pointer to immediately after it.  No?

By pointer, do you mean dump_context::offset, which is sort of a current
write location in dump_context::buf? Yes, that's the convention. But
something different is happening:

Take dump_symbol. This first writes the Lisp_Symbol structure to the
dump. If appropriate, this is immediately followed by a
Lisp_Buffer_Local_Value that is written to the dump. The blv is not a
Lisp object.

Similar things are also done for hash tables, obarrays (I think) and
maybe others.



* Re: MPS: Loaded pdump
  2024-05-14  8:23                     ` Gerd Möllmann
  2024-05-14 14:22                       ` Helmut Eller
@ 2024-05-16  4:25                       ` Gerd Möllmann
  2024-05-16  8:36                         ` Helmut Eller
  2024-05-16 14:09                         ` Helmut Eller
  1 sibling, 2 replies; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-16  4:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Helmut Eller, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
>> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>>
>>> Ok, I think I'll try to get rid of the pdump root. Maybe I'll regret it,
>>> and maybe I'll fail, but I feel the root is not the right thing to have.
>>> (I'll do it in a branch from the branch in the fork here.)
>>>
>>> Step 1:
>>>
>>> Make the loaded dump traversable by dumping igc_headers.
>>>
>>> How easy or not that is is hard to say. On one hand, there seems to be
>>> dump_object_begin, which looks promising, but I already know one case
>>> (hash tables), where extending that function to write igc_headers might
>>> not be sufficient. And where there is one exception, there are more. And
>>> flags like pack_objects could interfere, and so on.
>>>
>>> Downside: igc_header is then even more set in stone. Removing the header
>>> for conses becomes even more work. One would need a new section in the
>>> dump, just for conses.
>>>
>>> Step 1.5:
>>>
>>> When (1) is done, one could make the dump an exact root and use
>>> dflt_scan on it. (That's better than an ambig root, but I don't want the
>>> root because the world is stopped when roots are scanned.)
>>>
>>> Step 2:
>>>
>>> Walk through the dump and make copies of dumped objects in MPS memory.
>>> Record a mapping from dumped -> MPS object. Then walk through the copied
>>> objects and replace references to dumped objects.
>>>
>>> This is a bit hand-wavy. At some point when a dump is loaded, things in
>>> Emacs are set up to refer to objects in the dump. We will have to do our
>>> copy before that and somehow make sure the copy is used instead of the
>>> dump.
>>
>> An update on what's going on:
>>
>> I now have a branch in my fork that builds with and without MPS, where
>> the MPS dump contains igc headers and the non-MPS dump doesn't.
>>
>> Next step will be to ensure that the hot section of the dump is indeed
>> traversable as I want, and fix what's wrong. So we are at step 1.25 or
>> so.
>>
>> Things are a bit slower right now for procrastination reasons. The
>> pdumper is boring as hell :-/. Did I mention that code generation would
>> be a nice thing? Anyway.
>
> I've now pushed something like step 1.45.
>
> Random notes:
>
> - The dump for MPS now contains the start offsets of igc objects. The
>   existing object_starts relocs cannot be used because they are for Lisp
>   objects only.
>
> - Each igc object in the dump has an igc_header. The function
>   pdumper_visit_object_starts can be used to traverse them in a loaded
>   dump. I chose this interface because it cannot be made sure that igc
>   objects occupy a contiguous region in the dump, at least not without
>   surgery on the pdumper.
>
> - Obarrays are probably not yet handled right. I couldn't bring myself to do
>   this yet. As you know my fork doesn't have obarrays, but uses CL
>   packages which use hash tables. Any takers?
>
> - There are a number of igc_obj_types of the form IGC_OBJ_DUMPED_xy. I'm
>   not sure what to do with these.
>
> - I saw that igc_header::pvec_type is used for something, and should
>   probably tell that it was meant as a debugging aid (for the case of
>   hitting IGC_OBJ_FWDs, so that one can more easily see what type was
>   forwarded). This could also be seen from the vectorlike header. I
>   think at some point we should remove pvec_type in favor of more hash
>   bits. Whatever, not so important.
>
> - I commented out from .gitignore the .patch files. This is because
>   ignoring .patch files disables some very handy Magit functionality wrt
>   patch handling.
>
> Otherwise this is not yet used. I checked with an MPS and non-MPS
> build (with checking=all).
>
> Taking the next step will be difficult. Copying objects from the dump to
> MPS and fixing references from one graph to the other requires writing a
> gazillion functions to do that. At least at the moment that's something
> that exhausts me just by thinking of it.
>
> Whatever. Happy photosynthesizing on a sunny day!

I've transferred more from my fork to GNU, something like step 2.1828.

The dump is still an ambig root, objects are copied to MPS when a dump
is loaded but references in the new graph in MPS are not yet mirrored.
The infrastructure for that is there, but a felt gazillion small
functions need to be written, one for each type :-/.
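
To illustrate the copy-then-mirror idea described above, here is a
minimal, self-contained sketch. It is not the code in igc.c: the toy
struct node, copy_and_mirror, mirror_ref and mirror_demo are
hypothetical names, malloc stands in for an MPS allocation, and the
old -> new mapping is a plain index computation that only works because
all toy objects have the same size. Real Lisp objects would need one
such fix-up function per type, which is the "gazillion" part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical toy object: a payload plus two references.  */
struct node
{
  int value;
  struct node *ref[2];
};

/* If REF points into the "dump" array DUMP[0..N), return the
   corresponding copy; otherwise return REF unchanged.  */
static struct node *
mirror_ref (struct node *ref, struct node *dump, size_t n,
	    struct node **copies)
{
  uintptr_t p = (uintptr_t) ref;
  uintptr_t lo = (uintptr_t) dump;
  uintptr_t hi = (uintptr_t) (dump + n);
  if (ref == NULL || p < lo || p >= hi)
    return ref;
  return copies[(p - lo) / sizeof *dump];
}

/* Pass 1: copy every object; the copies still refer to the dump.
   Pass 2: redirect references in the copies to the new graph.  */
static void
copy_and_mirror (struct node *dump, size_t n, struct node **copies)
{
  for (size_t i = 0; i < n; i++)
    {
      /* malloc is only a stand-in for allocating in MPS memory.  */
      copies[i] = malloc (sizeof *copies[i]);
      assert (copies[i] != NULL);
      *copies[i] = dump[i];
    }
  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < 2; j++)
      copies[i]->ref[j] = mirror_ref (copies[i]->ref[j], dump, n, copies);
}

/* Build a 3-object cycle with one reference to an object outside the
   dump, mirror it, and return 1 if the copies only refer to copies
   (or to the outside object).  */
int
mirror_demo (void)
{
  static struct node outside = { 99, { NULL, NULL } };
  struct node dump[3] = { { 0 }, { 1 }, { 2 } };
  dump[0].ref[0] = &dump[1];
  dump[0].ref[1] = &outside;
  dump[1].ref[0] = &dump[2];
  dump[2].ref[0] = &dump[0];

  struct node *copies[3];
  copy_and_mirror (dump, 3, copies);

  int ok = (copies[0]->ref[0] == copies[1]
	    && copies[0]->ref[1] == &outside
	    && copies[1]->ref[0] == copies[2]
	    && copies[1]->ref[1] == NULL
	    && copies[2]->ref[0] == copies[0]
	    && copies[2]->value == 2);
  for (size_t i = 0; i < 3; i++)
    free (copies[i]);
  return ok;
}
```

The two passes are kept separate on purpose: a reference can only be
redirected once its target has been copied, and with cycles in the graph
that is only guaranteed after the first pass has finished.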



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: MPS: Loaded pdump
  2024-05-16  4:25                       ` Gerd Möllmann
@ 2024-05-16  8:36                         ` Helmut Eller
  2024-05-16  8:46                           ` Gerd Möllmann
  2024-05-16  9:01                           ` Gerd Möllmann
  2024-05-16 14:09                         ` Helmut Eller
  1 sibling, 2 replies; 62+ messages in thread
From: Helmut Eller @ 2024-05-16  8:36 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 836 bytes --]

On Thu, May 16 2024, Gerd Möllmann wrote:

> The dump is still an ambig root, objects are copied to MPS when a dump
> is loaded but references in the new graph in MPS are not yet mirrored.
> The infrastructure for that is there, but a felt gazillion small
> functions need to be written, one for each type :-/.

I'd like to submit some patches:

1) Actually implement igc_realloc_ambig; this is needed on X.

2) Make igc-info a bit more useful by reporting the different pvec
   types separately.

3) Some minor refactoring for the pdump code

4) Code to register the dump as exact root.  This will break some
   things.  Not surprising of course.  E.g.
   (progn
     (view-hello-file)
     (dotimes (_ 5) (redisplay) (igc--collect) (forward-line)))

    Not sure if now is a good time to make this change.
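
For readers who want the gist of patch 4 without reading the diff: the
attached register_pdump_roots_1 combines adjacent objects into one root
instead of creating a root per object. The following is a minimal,
self-contained sketch of that coalescing step only. The names
(struct range, struct coalesce_ctx, visit, finish, coalesce_demo) and
the plain integer addresses are hypothetical, and recording a range in
an array stands in for the root_create_exact call in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open address range [start, end).  */
struct range
{
  size_t start, end;
};

struct coalesce_ctx
{
  size_t lo, hi;	/* only objects starting in [lo, hi) count */
  int have_cur;		/* is CUR valid? */
  struct range cur;	/* range being accumulated */
  struct range out[16];	/* emitted ranges, one per would-be root */
  size_t nout;
};

/* Emit the range accumulated so far, if any.  */
static void
emit (struct coalesce_ctx *c)
{
  if (c->have_cur)
    {
      assert (c->nout < 16);
      c->out[c->nout++] = c->cur;
    }
}

/* Visit one object.  Objects are assumed to be visited in address
   order, as in the patch.  */
static void
visit (struct coalesce_ctx *c, size_t start, size_t size)
{
  if (start < c->lo || c->hi <= start)
    return;			/* outside the hot section */
  if (c->have_cur && c->cur.end == start)
    c->cur.end = start + size;	/* adjacent: extend current range */
  else
    {
      emit (c);			/* gap: close the previous range */
      c->cur.start = start;
      c->cur.end = start + size;
      c->have_cur = 1;
    }
}

/* Emit the last pending range.  */
static void
finish (struct coalesce_ctx *c)
{
  emit (c);
  c->have_cur = 0;
}

/* Feed six made-up objects through the coalescer, copy the resulting
   ranges to OUT, and return how many there are.  */
size_t
coalesce_demo (struct range *out)
{
  struct coalesce_ctx c = { .lo = 100, .hi = 400 };
  visit (&c, 100, 16);
  visit (&c, 116, 8);
  visit (&c, 124, 8);
  visit (&c, 200, 32);
  visit (&c, 232, 8);
  visit (&c, 500, 8);		/* starts outside [lo, hi): ignored */
  finish (&c);
  for (size_t i = 0; i < c.nout; i++)
    out[i] = c.out[i];
  return c.nout;
}
```

In this made-up input the five objects inside the window collapse into
two ranges, which is the kind of reduction in root count the patch is
after.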



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Implement-igc_realloc_ambig.patch --]
[-- Type: text/x-diff, Size: 933 bytes --]

From 1a58ec991dac1f93736aaf7f25d1fca5f090d680 Mon Sep 17 00:00:00 2001
From: Helmut Eller <eller.helmut@gmail.com>
Date: Fri, 10 May 2024 09:39:46 +0200
Subject: [PATCH] Implement igc_realloc_ambig

After awaking from hibernation the X server reinitializes devices and
requires igc_realloc_ambig.

* src/igc.c (igc_realloc_ambig):
---
 src/igc.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/igc.c b/src/igc.c
index 5a27267b50a..2ccab619c40 100644
--- a/src/igc.c
+++ b/src/igc.c
@@ -2257,7 +2257,6 @@ igc_xzalloc_ambig (size_t size)
 void *
 igc_realloc_ambig (void *block, size_t size)
 {
-#if 0 // non tested code:
   struct igc_root_list *r = root_find (block);
   igc_assert (r != NULL);
   destroy_root (&r);
@@ -2269,8 +2268,6 @@ igc_realloc_ambig (void *block, size_t size)
   void *end = (char *)p + new_size;
   root_create_ambig (global_igc, p, end);
   return p;
-#endif
-  emacs_abort ();
 }
 
 
-- 
2.39.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0001-In-dflt_scanx-check-types-more-carefully.patch --]
[-- Type: text/x-diff, Size: 1746 bytes --]

From 3b7bfdf9931b624be0da7c75b3806bbba5a2ac4b Mon Sep 17 00:00:00 2001
From: Helmut Eller <eller.helmut@gmail.com>
Date: Fri, 10 May 2024 09:43:19 +0200
Subject: [PATCH] In dflt_scanx, check types more carefully

* src/igc.c (dflt_scanx): Make sure that obj_type and pvec_type are in
the valid range before using them as index.
---
 src/igc.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/src/igc.c b/src/igc.c
index 2ccab619c40..e2cf00b2de6 100644
--- a/src/igc.c
+++ b/src/igc.c
@@ -1252,10 +1252,17 @@ dflt_scanx (mps_ss_t ss, mps_addr_t base_start, mps_addr_t base_limit,
 	if (closure)
 	  {
 	    struct igc_stats *st = closure;
-	    st->obj[header->obj_type].nwords += header->nwords;
-	    st->obj[header->obj_type].nobjs += 1;
-	    st->obj[header->pvec_type].nwords += header->nwords;
-	    st->obj[header->pvec_type].nobjs += 1;
+	    mps_word_t obj_type = header->obj_type;
+	    igc_assert (obj_type < IGC_OBJ_LAST);
+	    st->obj[obj_type].nwords += header->nwords;
+	    st->obj[obj_type].nobjs += 1;
+	    if (obj_type != IGC_OBJ_PAD)
+	      {
+		mps_word_t pvec_type = header->pvec_type;
+		igc_assert (pvec_type <= PVEC_TAG_MAX);
+		st->obj[pvec_type].nwords += header->nwords;
+		st->obj[pvec_type].nobjs += 1;
+	      }
 	  }
 
 	switch (header->obj_type)
@@ -3115,7 +3122,9 @@ DEFUN ("igc-info", Figc_info, Sigc_info, 0, 0, 0, doc : /* */)
   struct igc *gc = global_igc;
   struct igc_stats st = { 0 };
   mps_res_t res;
-  IGC_WITH_PARKED (gc) { res = mps_pool_walk (gc->dflt_pool, dflt_scanx, &st); }
+  IGC_WITH_PARKED (gc) {
+     res = mps_pool_walk (gc->dflt_pool, dflt_scanx, &st);
+  }
   if (res != MPS_RES_OK)
     error ("Error %d walking memory", res);
 
-- 
2.39.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #4: 0001-Tighter-bounds-for-the-dumped-hot-region.patch --]
[-- Type: text/x-diff, Size: 1095 bytes --]

From f920bb66bfd7a6364ff1fa8796a6b3fd5a6606f0 Mon Sep 17 00:00:00 2001
From: Helmut Eller <eller.helmut@gmail.com>
Date: Thu, 16 May 2024 09:01:45 +0200
Subject: [PATCH] Tighter bounds for the dumped hot region

* src/pdumper.c (pdumper_load): Exclude the header and the discardable
part.
---
 src/pdumper.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/pdumper.c b/src/pdumper.c
index 2437a70f0a8..00bb7dd8db8 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -5956,8 +5956,11 @@ pdumper_load (const char *dump_filename, char *argv0)
   dump_public.end = dump_public.start + dump_size;
 
 #ifdef HAVE_MPS
-  void *hot_start = (void *) dump_base;
-  void *hot_end = (void *) (dump_base + adj_discardable_start);
+  size_t aligned_header_size
+    = ((sizeof (struct dump_header) + DUMP_ALIGNMENT - 1)
+       & ~(DUMP_ALIGNMENT - 1));
+  void *hot_start = (void *) (dump_base + aligned_header_size);
+  void *hot_end = (void *) (dump_base + header->discardable_start);
 #endif
 
   dump_do_all_dump_reloc_for_phase (header, dump_base, EARLY_RELOCS);
-- 
2.39.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #5: 0001-Factorize-common-pattern-to-dump-arrays.patch --]
[-- Type: text/x-diff, Size: 3226 bytes --]

From 874c30967e1886ad3d42aa21c625f29381612e9d Mon Sep 17 00:00:00 2001
From: Helmut Eller <eller.helmut@gmail.com>
Date: Thu, 16 May 2024 09:09:39 +0200
Subject: [PATCH] Factorize common pattern to dump arrays

* src/pdumper.c (dump_object_array): New.
(dump_hash_table_key, dump_hash_table_value, dump_obarray_buckets): Use it.
---
 src/pdumper.c | 53 +++++++++++++--------------------------------------
 1 file changed, 13 insertions(+), 40 deletions(-)

diff --git a/src/pdumper.c b/src/pdumper.c
index 00bb7dd8db8..738aeb458c9 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -2781,49 +2781,39 @@ hash_table_freeze (struct Lisp_Hash_Table *h)
 }
 
 static dump_off
-dump_hash_table_key (struct dump_context *ctx, struct Lisp_Hash_Table *h)
+dump_object_array (struct dump_context *ctx,
+		   const Lisp_Object array[], size_t len)
 {
   dump_align_output (ctx, DUMP_ALIGNMENT);
   dump_off start_offset = ctx->offset;
-  ptrdiff_t n = h->count;
 
   struct dump_flags old_flags = ctx->flags;
   ctx->flags.pack_objects = true;
 
-  for (ptrdiff_t i = 0; i < n; i++)
+  for (size_t i = 0; i < len; i++)
     {
       Lisp_Object out;
-      const Lisp_Object *slot = &h->key[i];
+      const Lisp_Object *slot = &array[i];
       dump_object_start_1 (ctx, &out, sizeof out);
       dump_field_lv (ctx, &out, slot, slot, WEIGHT_STRONG);
       dump_object_finish_1 (ctx, &out, sizeof out);
     }
 
   ctx->flags = old_flags;
+
   return start_offset;
 }
 
 static dump_off
-dump_hash_table_value (struct dump_context *ctx, struct Lisp_Hash_Table *h)
+dump_hash_table_key (struct dump_context *ctx, struct Lisp_Hash_Table *h)
 {
-  dump_align_output (ctx, DUMP_ALIGNMENT);
-  dump_off start_offset = ctx->offset;
-  ptrdiff_t n = h->count;
-
-  struct dump_flags old_flags = ctx->flags;
-  ctx->flags.pack_objects = true;
-
-  for (ptrdiff_t i = 0; i < n; i++)
-    {
-      Lisp_Object out;
-      const Lisp_Object *slot = &h->value[i];
-      dump_object_start_1 (ctx, &out, sizeof out);
-      dump_field_lv (ctx, &out, slot, slot, WEIGHT_STRONG);
-      dump_object_finish_1 (ctx, &out, sizeof out);
-    }
+  return dump_object_array (ctx, h->key, h->count);
+}
 
-  ctx->flags = old_flags;
-  return start_offset;
+static dump_off
+dump_hash_table_value (struct dump_context *ctx, struct Lisp_Hash_Table *h)
+{
+  return dump_object_array (ctx, h->value, h->count);
 }
 
 static dump_off
@@ -2875,24 +2865,7 @@ dump_hash_table (struct dump_context *ctx, Lisp_Object object)
 static dump_off
 dump_obarray_buckets (struct dump_context *ctx, const struct Lisp_Obarray *o)
 {
-  dump_align_output (ctx, DUMP_ALIGNMENT);
-  dump_off start_offset = ctx->offset;
-  ptrdiff_t n = obarray_size (o);
-
-  struct dump_flags old_flags = ctx->flags;
-  ctx->flags.pack_objects = true;
-
-  for (ptrdiff_t i = 0; i < n; i++)
-    {
-      Lisp_Object out;
-      const Lisp_Object *slot = &o->buckets[i];
-      dump_object_start_1 (ctx, &out, sizeof out);
-      dump_field_lv (ctx, &out, slot, slot, WEIGHT_STRONG);
-      dump_object_finish_1 (ctx, &out, sizeof out);
-    }
-
-  ctx->flags = old_flags;
-  return start_offset;
+  return dump_object_array (ctx, o->buckets, obarray_size (o));
 }
 
 static dump_off
-- 
2.39.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #6: 0001-Include-stats-about-pseudovectors-in-igc-info.patch --]
[-- Type: text/x-diff, Size: 3757 bytes --]

From b8c40588bfa76f58cc00f51a844044c8d30f7d01 Mon Sep 17 00:00:00 2001
From: Helmut Eller <eller.helmut@gmail.com>
Date: Thu, 16 May 2024 09:13:00 +0200
Subject: [PATCH] Include stats about pseudovectors in igc-info

* src/igc.c (pvec_type_names, pvec_type_name): New.
(dflt_scanx): Better accounting for each pseudovector type.
(Figc_info): Return the accumulated values for pseudovectors.
---
 src/igc.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 68 insertions(+), 10 deletions(-)

diff --git a/src/igc.c b/src/igc.c
index 9656d4190c1..db02cdb85f8 100644
--- a/src/igc.c
+++ b/src/igc.c
@@ -208,6 +208,56 @@ #define IGC_DEFINE_LIST(data)                                                  \
 
 igc_static_assert (ARRAYELTS (obj_type_names) == IGC_OBJ_LAST);
 
+static const char *pvec_type_names[] = {
+  "PVEC_NORMAL_VECTOR",
+  "PVEC_FREE",
+  "PVEC_BIGNUM",
+  "PVEC_MARKER",
+  "PVEC_OVERLAY",
+  "PVEC_FINALIZER",
+  "PVEC_SYMBOL_WITH_POS",
+  "PVEC_MISC_PTR",
+  "PVEC_USER_PTR",
+  "PVEC_PROCESS",
+  "PVEC_FRAME",
+  "PVEC_WINDOW",
+  "PVEC_BOOL_VECTOR",
+  "PVEC_BUFFER",
+  "PVEC_HASH_TABLE",
+  "PVEC_OBARRAY",
+  "PVEC_TERMINAL",
+  "PVEC_WINDOW_CONFIGURATION",
+  "PVEC_SUBR",
+  "PVEC_OTHER",
+  "PVEC_XWIDGET",
+  "PVEC_XWIDGET_VIEW",
+  "PVEC_THREAD",
+  "PVEC_MUTEX",
+  "PVEC_CONDVAR",
+  "PVEC_MODULE_FUNCTION",
+  "PVEC_MODULE_GLOBAL_REFERENCE",
+  "PVEC_NATIVE_COMP_UNIT",
+  "PVEC_TS_PARSER",
+  "PVEC_TS_NODE",
+  "PVEC_TS_COMPILED_QUERY",
+  "PVEC_SQLITE",
+  "PVEC_WEAK_REF",
+  "PVEC_COMPILED",
+  "PVEC_CHAR_TABLE",
+  "PVEC_SUB_CHAR_TABLE",
+  "PVEC_RECORD",
+  "PVEC_FONT",
+};
+
+igc_static_assert (ARRAYELTS (pvec_type_names) == PVEC_TAG_MAX + 1);
+
+static const char *
+pvec_type_name (enum pvec_type type)
+{
+  igc_assert (0 <= type && type <= PVEC_TAG_MAX);
+  return pvec_type_names[type];
+}
+
 struct igc_stats
 {
   struct
@@ -1246,12 +1296,13 @@ dflt_scanx (mps_ss_t ss, mps_addr_t base_start, mps_addr_t base_limit,
 	    igc_assert (obj_type < IGC_OBJ_LAST);
 	    st->obj[obj_type].nwords += header->nwords;
 	    st->obj[obj_type].nobjs += 1;
-	    if (obj_type != IGC_OBJ_PAD)
+	    if (obj_type == IGC_OBJ_VECTOR)
 	      {
-		mps_word_t pvec_type = header->pvec_type;
-		igc_assert (pvec_type <= PVEC_TAG_MAX);
-		st->obj[pvec_type].nwords += header->nwords;
-		st->obj[pvec_type].nobjs += 1;
+		struct Lisp_Vector* v = (struct Lisp_Vector*)client;
+		enum pvec_type pvec_type = PSEUDOVECTOR_TYPE (v);
+		igc_assert (0 <= pvec_type && pvec_type <= PVEC_TAG_MAX);
+		st->pvec[pvec_type].nwords += header->nwords;
+		st->pvec[pvec_type].nobjs += 1;
 	      }
 	  }
 
@@ -3107,8 +3158,9 @@ DEFUN ("igc-info", Figc_info, Sigc_info, 0, 0, 0, doc : /* */)
   struct igc *gc = global_igc;
   struct igc_stats st = { 0 };
   mps_res_t res;
-  IGC_WITH_PARKED (gc) {
-     res = mps_pool_walk (gc->dflt_pool, dflt_scanx, &st);
+  IGC_WITH_PARKED (gc)
+  {
+    res = mps_pool_walk (gc->dflt_pool, dflt_scanx, &st);
   }
   if (res != MPS_RES_OK)
     error ("Error %d walking memory", res);
@@ -3117,11 +3169,17 @@ DEFUN ("igc-info", Figc_info, Sigc_info, 0, 0, 0, doc : /* */)
   for (int i = 0; i < IGC_OBJ_LAST; ++i)
     {
       Lisp_Object e
-	= list3 (build_string (obj_type_names[i]), make_int (st.obj[i].nobjs),
-		 make_int (st.obj[i].nwords));
+	  = list3 (build_string (obj_type_names[i]),
+		   make_int (st.obj[i].nobjs), make_int (st.obj[i].nwords));
+      result = Fcons (e, result);
+    }
+  for (enum pvec_type i = 0; i <= PVEC_TAG_MAX; i++)
+    {
+      Lisp_Object e
+	  = list3 (build_string (pvec_type_name (i)),
+		   make_int (st.pvec[i].nobjs), make_int (st.pvec[i].nwords));
       result = Fcons (e, result);
     }
-
   return result;
 }
 
-- 
2.39.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #7: 0001-Register-the-dump-as-exact-root.patch --]
[-- Type: text/x-diff, Size: 2365 bytes --]

From f4b2c18a788e03fcd7a2e3640288c4794a7d9057 Mon Sep 17 00:00:00 2001
From: Helmut Eller <eller.helmut@gmail.com>
Date: Thu, 16 May 2024 10:22:48 +0200
Subject: [PATCH] Register the dump as exact root

* src/igc.c (register_pdump_roots_ctx, register_pdump_roots_1)
(register_pdump_roots): New.
(igc_on_pdump_loaded): Use it.
---
 src/igc.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 54 insertions(+), 1 deletion(-)

diff --git a/src/igc.c b/src/igc.c
index db02cdb85f8..c8698183bd4 100644
--- a/src/igc.c
+++ b/src/igc.c
@@ -3973,10 +3973,63 @@ mirror_dump (void)
   mirror_objects (&m);
 }
 
+struct register_pdump_roots_ctx
+{
+  void *hot_start;  /* start of hot section in pdump */
+  void *hot_end;    /* end of hot section in pdump */
+  void *root_start; /* start (or NULL) of current root */
+  void *root_end;   /* end (or NULL) of current root */
+};
+
+/* Try to combine adjacent objects into one root.  Naively creating a
+   separate root for each object seems to run into serious efficiency
+   problems. */
+static void
+register_pdump_roots_1 (void *start, void *closure)
+{
+  struct igc_header *h = start;
+  void *end = (char *)start + to_bytes (h->nwords);
+  struct register_pdump_roots_ctx *ctx = closure;
+  if (start < ctx->hot_start || ctx->hot_end <= start)
+    return;
+  if (ctx->root_end == start) /* adjacent objects? */
+    {
+      ctx->root_end = end; /* combine them */
+    }
+  else
+    {
+      if (ctx->root_start != NULL)
+	{
+	  root_create_exact (global_igc, ctx->root_start, ctx->root_end,
+			     dflt_scanx);
+	}
+      ctx->root_start = start;
+      ctx->root_end = end;
+    }
+}
+
+static void
+register_pdump_roots (void *start, void *end)
+{
+  struct register_pdump_roots_ctx ctx = {
+    .hot_start = start,
+    .hot_end = end,
+    .root_start = NULL,
+    .root_end = NULL,
+  };
+  pdumper_visit_object_starts (register_pdump_roots_1, &ctx);
+  if (ctx.root_start != NULL)
+    {
+      root_create_exact (global_igc, ctx.root_start, ctx.root_end,
+			 dflt_scanx);
+    }
+}
+
 void
 igc_on_pdump_loaded (void *start, void *end)
 {
-  root_create_ambig (global_igc, start, end);
+  // root_create_ambig (global_igc, start, end);
+  register_pdump_roots (start, end);
   specpdl_ref count = igc_park_arena ();
   mirror_dump ();
   unbind_to (count, Qnil);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: MPS: Loaded pdump
  2024-05-16  8:36                         ` Helmut Eller
@ 2024-05-16  8:46                           ` Gerd Möllmann
  2024-05-16  9:01                           ` Gerd Möllmann
  1 sibling, 0 replies; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-16  8:46 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Thu, May 16 2024, Gerd Möllmann wrote:
>
>> The dump is still an ambig root, objects are copied to MPS when a dump
>> is loaded but references in the new graph in MPS are not yet mirrored.
>> The infrastructure for that is there, but a felt gazillion small
>> functions need to be written, one for each type :-/.
>
> I'd like to submit some patches:
>
> 1) Actually implement igc_realloc_ambig; this is needed on X.
>
> 2) Make igc-info a bit more useful by reporting the different pvec
>    types separately.
>
> 3) Some minor refactoring for the pdump code
>
> 4) Code to register the dump as exact root.  This will break some
>    things.  Not surprising of course.  E.g.
>    (progn
>      (view-hello-file)
>      (dotimes (_ 5) (redisplay) (igc--collect) (forward-line)))
>
>     Not sure if now is a good time to make this change.

Moin Helmut, I've pushed all of them :-).



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: MPS: Loaded pdump
  2024-05-16  8:36                         ` Helmut Eller
  2024-05-16  8:46                           ` Gerd Möllmann
@ 2024-05-16  9:01                           ` Gerd Möllmann
  2024-05-16  9:31                             ` Helmut Eller
  1 sibling, 1 reply; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-16  9:01 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

> 4) Code to register the dump as exact root.  This will break some
>    things.  Not surprising of course.  E.g.
>    (progn
>      (view-hello-file)
>      (dotimes (_ 5) (redisplay) (igc--collect) (forward-line)))
>
>     Not sure if now is a good time to make this change.

FWIW, I don't see a crash here. Or is it something else?



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: MPS: Loaded pdump
  2024-05-16  9:01                           ` Gerd Möllmann
@ 2024-05-16  9:31                             ` Helmut Eller
  2024-05-16  9:42                               ` Gerd Möllmann
  2024-05-16 12:07                               ` Eli Zaretskii
  0 siblings, 2 replies; 62+ messages in thread
From: Helmut Eller @ 2024-05-16  9:31 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, emacs-devel

On Thu, May 16 2024, Gerd Möllmann wrote:

> Helmut Eller <eller.helmut@gmail.com> writes:
>
>> 4) Code to register the dump as exact root.  This will break some
>>    things.  Not surprising of course.  E.g.
>>    (progn
>>      (view-hello-file)
>>      (dotimes (_ 5) (redisplay) (igc--collect) (forward-line)))
>>
>>     Not sure if now is a good time to make this change.
>
> FWIW, I don't see a crash here. Or is it something else?

For me, this one definitely crashes.  This line
 if (BUFFERP (glyph->object))
in xdisp.c:set_cursor_from_row.  The tty version crashes too.

Starting program: /scratch/emacs/emacs-igc2/src/emacs -Q -eval \(progn\ \(view-hello-file\)\ \(dotimes\ \(_\ 5\)\ \(redisplay\)\ \(igc--collect\)\ \(forward-line\)\)\)
[...]
Thread 1 "emacs" hit Breakpoint 1, terminate_due_to_signal (sig=11, 
    backtrace_limit=40) at emacs.c:443
443       signal (sig, SIG_DFL);
(gdb) ba 15
#0  terminate_due_to_signal (sig=11, backtrace_limit=40) at emacs.c:443
#1  0x00005555557974a6 in handle_fatal_signal (sig=11) at sysdep.c:1800
#2  0x000055555579747b in deliver_thread_signal
    (sig=11, handler=0x55555579748c <handle_fatal_signal>) at sysdep.c:1792
#3  0x00005555557974e7 in deliver_fatal_thread_signal (sig=11) at sysdep.c:1812
#4  0x000055555579768b in handle_sigsegv
    (sig=11, siginfo=0x5555560df230 <sigsegv_stack+64784>, arg=0x5555560df100 <sigsegv_stack+64480>) at sysdep.c:1950
#5  0x00007ffff61a4050 in <signal handler called> ()
    at /lib/x86_64-linux-gnu/libc.so.6
#6  0x00007ffff61a4267 in __GI_kill ()
    at ../sysdeps/unix/syscall-template.S:120
#7  0x000055555598cc0e in sigHandle
    (sig=<optimized out>, info=0x7fffffff8c70, uap=0x7fffffff8b40)
    at protsgix.c:122
#8  0x00007ffff61a4050 in <signal handler called> ()
    at /lib/x86_64-linux-gnu/libc.so.6
#9  0x00005555555c7aaf in PSEUDOVECTORP (a=XIL(0x7fffe3bbd9ad), code=13)
    at /scratch/emacs/emacs-igc2/src/lisp.h:1101
#10 0x00005555555ca5af in BUFFERP (a=XIL(0x7fffe3bbd9ad))
    at /scratch/emacs/emacs-igc2/src/buffer.h:722
#11 0x00005555556032da in set_cursor_from_row
    (w=0x7fffe39a7820, row=0x555556a62220, matrix=0x55555634ee40, delta=0, delta_bytes=0, dy=0, dvpos=0) at xdisp.c:18224
#12 0x0000555555609093 in try_cursor_movement
    (window=XIL(0x7fffe39a7825), startp=..., scroll_step=0x7fffffffa66f)
    at xdisp.c:19648
#13 0x000055555560ba01 in redisplay_window
    (window=XIL(0x7fffe39a7825), just_this_one_p=true) at xdisp.c:20405
#14 0x0000555555602bcd in redisplay_window_1 (window=XIL(0x7fffe39a7825))
    at xdisp.c:18026
(More stack frames follow...)
[Thread 0x7fffdf3b66c0 (LWP 38423) exited]



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: MPS: Loaded pdump
  2024-05-16  9:31                             ` Helmut Eller
@ 2024-05-16  9:42                               ` Gerd Möllmann
  2024-05-16  9:54                                 ` Gerd Möllmann
  2024-05-16 12:08                                 ` Eli Zaretskii
  2024-05-16 12:07                               ` Eli Zaretskii
  1 sibling, 2 replies; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-16  9:42 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Thu, May 16 2024, Gerd Möllmann wrote:
>
>> Helmut Eller <eller.helmut@gmail.com> writes:
>>
>>> 4) Code to register the dump as exact root.  This will break some
>>>    things.  Not surprising of course.  E.g.
>>>    (progn
>>>      (view-hello-file)
>>>      (dotimes (_ 5) (redisplay) (igc--collect) (forward-line)))
>>>
>>>     Not sure if now is a good time to make this change.
>>
>> FWIW, I don't see a crash here. Or is it something else?
>
> For me, this one definitely crashes.  This line
>  if (BUFFERP (glyph->object))
> in xdisp.c:set_cursor_from_row.  The tty version crashes too.
>
> Starting program: /scratch/emacs/emacs-igc2/src/emacs -Q -eval \(progn\ \(view-hello-file\)\ \(dotimes\ \(_\ 5\)\ \(redisplay\)\ \(igc--collect\)\ \(forward-line\)\)\)
> [...]
> Thread 1 "emacs" hit Breakpoint 1, terminate_due_to_signal (sig=11, 

Seems to work here, both in my Emacs and GNU. Hm.

I've BTW pushed something again. This changes nothing; it's just to make
sure that my branch and GNU are easier to synchronize.



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: MPS: Loaded pdump
  2024-05-16  9:42                               ` Gerd Möllmann
@ 2024-05-16  9:54                                 ` Gerd Möllmann
  2024-05-16 12:43                                   ` Helmut Eller
  2024-05-16 12:08                                 ` Eli Zaretskii
  1 sibling, 1 reply; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-16  9:54 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Helmut Eller <eller.helmut@gmail.com> writes:
>
>> On Thu, May 16 2024, Gerd Möllmann wrote:
>>
>>> Helmut Eller <eller.helmut@gmail.com> writes:
>>>
>>>> 4) Code to register the dump as exact root.  This will break some
>>>>    things.  Not surprising of course.  E.g.
>>>>    (progn
>>>>      (view-hello-file)
>>>>      (dotimes (_ 5) (redisplay) (igc--collect) (forward-line)))
>>>>
>>>>     Not sure if now is a good time to make this change.
>>>
>>> FWIW, I don't see a crash here. Or is it something else?
>>
>> For me, this one definitely crashes.  This line
>>  if (BUFFERP (glyph->object))
>> in xdisp.c:set_cursor_from_row.  The tty version crashes too.
>>
>> Starting program: /scratch/emacs/emacs-igc2/src/emacs -Q -eval \(progn\ \(view-hello-file\)\ \(dotimes\ \(_\ 5\)\ \(redisplay\)\ \(igc--collect\)\ \(forward-line\)\)\)
>> [...]
>> Thread 1 "emacs" hit Breakpoint 1, terminate_due_to_signal (sig=11, 
>
> Seems to work here, both in my Emacs and GNU. Hm.

Done 10 rounds now for GUI and tty, with 50 forward-line, ASLR enabled
and no crash so far 🤷



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: MPS: Loaded pdump
  2024-05-16  9:31                             ` Helmut Eller
  2024-05-16  9:42                               ` Gerd Möllmann
@ 2024-05-16 12:07                               ` Eli Zaretskii
  2024-05-16 12:21                                 ` Gerd Möllmann
  1 sibling, 1 reply; 62+ messages in thread
From: Eli Zaretskii @ 2024-05-16 12:07 UTC (permalink / raw)
  To: Helmut Eller; +Cc: gerd.moellmann, emacs-devel

> From: Helmut Eller <eller.helmut@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>,  emacs-devel@gnu.org
> Date: Thu, 16 May 2024 11:31:28 +0200
> 
> On Thu, May 16 2024, Gerd Möllmann wrote:
> 
> > Helmut Eller <eller.helmut@gmail.com> writes:
> >
> >> 4) Code to register the dump as exact root.  This will break some
> >>    things.  Not surprising of course.  E.g.
> >>    (progn
> >>      (view-hello-file)
> >>      (dotimes (_ 5) (redisplay) (igc--collect) (forward-line)))
> >>
> >>     Not sure if now is a good time to make this change.
> >
> > FWIW, I don't see a crash here. Or is it something else?
> 
> For me, this one definitely crashes.  This line
>  if (BUFFERP (glyph->object))
> in xdisp.c:set_cursor_from_row.  The tty version crashes too.

Yes, I see that on Windows as well:

  Thread 1 received signal SIGSEGV, Segmentation fault.
  PSEUDOVECTORP (code=13, a=0xb93100d) at lisp.h:755
  755       return lisp_h_XLP (o);
  (gdb) bt
  #0  PSEUDOVECTORP (code=13, a=0xb93100d) at lisp.h:755
  #1  BUFFERP (a=0xb93100d) at buffer.h:722
  #2  set_cursor_from_row (w=w@entry=0xaf74fe8, row=row@entry=0x71a5dc0,
      matrix=0x711ee50, delta=delta@entry=0, delta_bytes=delta_bytes@entry=0,
      dy=dy@entry=0, dvpos=dvpos@entry=0) at xdisp.c:18224

and the object A in the PSEUDOVECTORP call cannot be looked at:

  (gdb) p a
  $2 = XIL(0xb93100d)
  (gdb) xtype
  Lisp_Vectorlike
  Cannot access memory at address 0xb931008

That's glyph->object of the first glyph of a glyph row produced from
this line of text:

  but to demonstrate some of the character sets and writing

which is the 3rd line of HELLO.



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: MPS: Loaded pdump
  2024-05-16  9:42                               ` Gerd Möllmann
  2024-05-16  9:54                                 ` Gerd Möllmann
@ 2024-05-16 12:08                                 ` Eli Zaretskii
  2024-05-16 12:27                                   ` Gerd Möllmann
  1 sibling, 1 reply; 62+ messages in thread
From: Eli Zaretskii @ 2024-05-16 12:08 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: eller.helmut, emacs-devel

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>,  emacs-devel@gnu.org
> Date: Thu, 16 May 2024 11:42:48 +0200
> 
> > Thread 1 "emacs" hit Breakpoint 1, terminate_due_to_signal (sig=11, 
> 
> Seems to work here, both in my Emacs and GNU. Hm.

macOS redisplay is...weird.



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: MPS: Loaded pdump
  2024-05-16 12:07                               ` Eli Zaretskii
@ 2024-05-16 12:21                                 ` Gerd Möllmann
  2024-05-16 12:27                                   ` Eli Zaretskii
  0 siblings, 1 reply; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-16 12:21 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Helmut Eller, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> and the object A in the PSEUDOVECTORP call cannot be looked at:
>
>   (gdb) p a
>   $2 = XIL(0xb93100d)
>   (gdb) xtype
>   Lisp_Vectorlike
>   Cannot access memory at address 0xb931008

Usually calling igc_postmortem helps in such a case.



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: MPS: Loaded pdump
  2024-05-16 12:08                                 ` Eli Zaretskii
@ 2024-05-16 12:27                                   ` Gerd Möllmann
  0 siblings, 0 replies; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-16 12:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eller.helmut, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>,  emacs-devel@gnu.org
>> Date: Thu, 16 May 2024 11:42:48 +0200
>> 
>> > Thread 1 "emacs" hit Breakpoint 1, terminate_due_to_signal (sig=11, 
>> 
>> Seems to work here, both in my Emacs and GNU. Hm.
>
> macOS redisplay is...weird.

Yeah :-)



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: MPS: Loaded pdump
  2024-05-16 12:21                                 ` Gerd Möllmann
@ 2024-05-16 12:27                                   ` Eli Zaretskii
  2024-05-16 12:43                                     ` Gerd Möllmann
  0 siblings, 1 reply; 62+ messages in thread
From: Eli Zaretskii @ 2024-05-16 12:27 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: eller.helmut, emacs-devel

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Helmut Eller <eller.helmut@gmail.com>,  emacs-devel@gnu.org
> Date: Thu, 16 May 2024 14:21:20 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > and the object A in the PSEUDOVECTORP call cannot be looked at:
> >
> >   (gdb) p a
> >   $2 = XIL(0xb93100d)
> >   (gdb) xtype
> >   Lisp_Vectorlike
> >   Cannot access memory at address 0xb931008
> 
> Usually calling igc_postmortem helps in such a case.

Compliance!

  Thread 1 received signal SIGSEGV, Segmentation fault.
  PSEUDOVECTORP (code=13, a=0xb95c2dd) at lisp.h:755
  755       return lisp_h_XLP (o);
  (gdb) call igc_postmortem()
  (gdb) c
  Continuing.

  Thread 1 received signal SIGSEGV, Segmentation fault.
  PSEUDOVECTORP (code=13, a=0xb95c2dd) at lisp.h:755
  755       return lisp_h_XLP (o);
  (gdb) call igc_postmortem()

  traceanc.c:611: Emacs fatal error: assertion failed: !RingIsSingle(_old)
  [Thread 28004.0x5c2c exited with code 0]

  Thread 1 received signal SIGTRAP, Trace/breakpoint trap.
  0x76d258d3 in KERNELBASE!DebugBreak () from C:\WINDOWS\SysWOW64\KernelBase.dll
  The program being debugged was signaled while in a function called from GDB.
  GDB remains in the frame where the signal was received.
  To change this behavior use "set unwindonsignal on".
  Evaluation of the expression containing the function
  (igc_postmortem) will be abandoned.
  When the function is done executing, GDB will silently stop.

Now what?




* Re: MPS: Loaded pdump
  2024-05-16 12:27                                   ` Eli Zaretskii
@ 2024-05-16 12:43                                     ` Gerd Möllmann
  0 siblings, 0 replies; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-16 12:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eller.helmut, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> >   (gdb) xtype
>> >   Lisp_Vectorlike
>> >   Cannot access memory at address 0xb931008
>> 
>> Usually calling igc_postmortem helps in such a case.
>
> Compliance!
>
>   Thread 1 received signal SIGSEGV, Segmentation fault.
>   PSEUDOVECTORP (code=13, a=0xb95c2dd) at lisp.h:755
>   755       return lisp_h_XLP (o);
>   (gdb) call igc_postmortem()
>   (gdb) c
>   Continuing.
>
>   Thread 1 received signal SIGSEGV, Segmentation fault.
>   PSEUDOVECTORP (code=13, a=0xb95c2dd) at lisp.h:755
>   755       return lisp_h_XLP (o);
>   (gdb) call igc_postmortem()
>
>   traceanc.c:611: Emacs fatal error: assertion failed: !RingIsSingle(_old)
>   [Thread 28004.0x5c2c exited with code 0]
>
>   Thread 1 received signal SIGTRAP, Trace/breakpoint trap.
>   0x76d258d3 in KERNELBASE!DebugBreak () from C:\WINDOWS\SysWOW64\KernelBase.dll
>   The program being debugged was signaled while in a function called from GDB.
>   GDB remains in the frame where the signal was received.
>   To change this behavior use "set unwindonsignal on".
>   Evaluation of the expression containing the function
>   (igc_postmortem) will be abandoned.
>   When the function is done executing, GDB will silently stop.
>
> Now what?

The c was too much :-).

What I meant is that igc_postmortem probably makes the memory accessible
by making MPS lift its barriers. Alas MPS is also dead as a mouse after
that.





* Re: MPS: Loaded pdump
  2024-05-16  9:54                                 ` Gerd Möllmann
@ 2024-05-16 12:43                                   ` Helmut Eller
  2024-05-16 12:47                                     ` Gerd Möllmann
  0 siblings, 1 reply; 62+ messages in thread
From: Helmut Eller @ 2024-05-16 12:43 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 291 bytes --]

On Thu, May 16 2024, Gerd Möllmann wrote:

> Done 10 rounds now for GUI and tty, with 50 forward-line, ASLR enabled
> and no crash so far 🤷

The change below fixes the problem for me.  Apparently buffers can now
be moved.  How can this not be relevant for macOS?  Very strange.


[-- Attachment #2: 0001-Fix-fix_glyph_matrix.patch --]
[-- Type: text/x-diff, Size: 736 bytes --]

From 1c537516a5c674e5e70eebfee8ba3a4ba8ea7ddf Mon Sep 17 00:00:00 2001
From: Helmut Eller <eller.helmut@gmail.com>
Date: Thu, 16 May 2024 14:39:09 +0200
Subject: [PATCH] Fix fix_glyph_matrix

* src/igc.c (fix_glyph_matrix): Fix buffers too, not just strings.
---
 src/igc.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/igc.c b/src/igc.c
index 5d9315a0c44..fef9797b4d4 100644
--- a/src/igc.c
+++ b/src/igc.c
@@ -1464,8 +1464,7 @@ fix_glyph_matrix (mps_ss_t ss, struct glyph_matrix *matrix)
 	      for (; glyph < end_glyph; ++glyph)
 		{
 		  Lisp_Object *obj_ptr = &glyph->object;
-		  if (STRINGP (*obj_ptr))
-		    IGC_FIX12_OBJ (ss, obj_ptr);
+		  IGC_FIX12_OBJ (ss, obj_ptr);
 		}
 	    }
 	}
-- 
2.39.2
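
Why skipping non-string glyph objects breaks once buffers can move can be shown with a toy model. This is only a sketch and not MPS or igc.c code: `collect`, the addresses, and the one-element lists standing in for `glyph->object` slots are all made up for illustration.

```python
# Toy model, not MPS: a "heap" maps addresses to objects, and a moving
# collection relocates every object to a new address.

def collect(heap, refs_to_fix):
    """Move every object and update only the references in refs_to_fix.

    heap: dict mapping address -> payload.
    refs_to_fix: list of one-element lists, each holding an address.
    Returns the new heap.
    """
    forwarding = {old: old + 0x1000 for old in heap}  # old -> new address
    new_heap = {forwarding[old]: payload for old, payload in heap.items()}
    for ref in refs_to_fix:
        ref[0] = forwarding[ref[0]]
    return new_heap


# Before the patch: only the string reference is fixed.
heap = {0x10: "a string", 0x20: "a buffer"}
string_ref, buffer_ref = [0x10], [0x20]
heap = collect(heap, [string_ref])
# string_ref now points into the new heap; buffer_ref is stale.

# After the patch: every glyph object reference is fixed.
heap2 = {0x10: "a string", 0x20: "a buffer"}
string_ref2, buffer_ref2 = [0x10], [0x20]
heap2 = collect(heap2, [string_ref2, buffer_ref2])
```

In the first run the buffer reference still holds the old address, which is the kind of dangling pointer that shows up later as "Cannot access memory".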



* Re: MPS: Loaded pdump
  2024-05-16 12:43                                   ` Helmut Eller
@ 2024-05-16 12:47                                     ` Gerd Möllmann
  0 siblings, 0 replies; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-16 12:47 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Thu, May 16 2024, Gerd Möllmann wrote:
>
>> Done 10 rounds now for GUI and tty, with 50 forward-line, ASLR enabled
>> and no crash so far 🤷
>
> The change below fixes the problem for me.  Apparently buffers can now
> be moved.  How can this not be relevant for macOS?  Very strange.

Buffers are just normal pseudo vectors... :-).

As Eli said - macOS Emacs is generally weird. Heaven knows.

I'll commit that in a moment. Thanks!




* Re: MPS: Loaded pdump
  2024-05-16  4:25                       ` Gerd Möllmann
  2024-05-16  8:36                         ` Helmut Eller
@ 2024-05-16 14:09                         ` Helmut Eller
  2024-05-16 14:24                           ` Gerd Möllmann
  1 sibling, 1 reply; 62+ messages in thread
From: Helmut Eller @ 2024-05-16 14:09 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, emacs-devel

On Thu, May 16 2024, Gerd Möllmann wrote:

> The infrastructure for that is there, but a felt gazillion small
> functions need to be written, one for each type :-/.

Can we move dflt_scan and the fix_* functions to an extra file and then
include that twice; each time with suitable definitions for MPS_FIX1 and
MPS_FIX2 so that it either does the normal fixing or the mirroring?




* Re: MPS: Loaded pdump
  2024-05-16 14:09                         ` Helmut Eller
@ 2024-05-16 14:24                           ` Gerd Möllmann
  2024-05-16 15:48                             ` Eli Zaretskii
  2024-05-16 16:56                             ` Andrea Corallo
  0 siblings, 2 replies; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-16 14:24 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Thu, May 16 2024, Gerd Möllmann wrote:
>
>> The infrastructure for that is there, but a felt gazillion small
>> functions need to be written, one for each type :-/.
>
> Can we move dflt_scan and the fix_* functions to an extra file and then
> include that twice; each time with suitable definitions for MPS_FIX1 and
> MPS_FIX2 so that it either does the normal fixing or the mirroring?

Don't know if that's flexible enough :-(.

Another idea I had yesterday would be to have one set of functions that
when called record meta-info about types. That could be done on startup.
Generic scan and mirror function could then interpret that meta-info.
;-).
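
The meta-info idea can be sketched roughly as follows. This is an illustration in Python rather than C, and every name in it (`LAYOUTS`, `register_layout`, `walk_refs`, the field kinds) is hypothetical; nothing like it exists in igc.c. The point is only that one recorded layout table can drive both scanning and mirroring.

```python
# Hypothetical sketch: LAYOUTS, register_layout and walk_refs are
# illustrative names, not part of igc.c.

LAYOUTS = {}  # type name -> list of (field name, kind)


def register_layout(type_name, fields):
    """Record which fields of a type hold references (once, at startup)."""
    LAYOUTS[type_name] = list(fields)


def walk_refs(type_name, obj, visit):
    """Generic traversal: call visit(kind, value) for each recorded field
    and store the returned value back into the object."""
    for field, kind in LAYOUTS[type_name]:
        obj[field] = visit(kind, obj[field])


register_layout("Lisp_Cons", [("car", "lisp_obj"), ("cdr", "lisp_obj")])

# "Scanning" only looks at references; "mirroring" replaces references
# to old objects with references to their copies.
seen = []


def scan(kind, value):
    seen.append((kind, value))
    return value


old_to_new = {"old-a": "new-a", "old-b": "new-b"}


def mirror(kind, value):
    return old_to_new.get(value, value)


cons = {"car": "old-a", "cdr": "old-b"}
walk_refs("Lisp_Cons", cons, scan)
walk_refs("Lisp_Cons", cons, mirror)
```

The cost of this design is an interpretive loop per object instead of straight-line code per type, which is one reason generated code may still be preferable.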

Code generation would be my personal favourite, though.

And I was sooo close to doing it in C++ in the beginning. I think I
could do that sh*t with templates, too. Too bad.

Ceterum censeo Emacs should allow C++.




* Re: MPS: Loaded pdump
  2024-05-16 14:24                           ` Gerd Möllmann
@ 2024-05-16 15:48                             ` Eli Zaretskii
  2024-05-16 16:56                             ` Andrea Corallo
  1 sibling, 0 replies; 62+ messages in thread
From: Eli Zaretskii @ 2024-05-16 15:48 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: eller.helmut, emacs-devel

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>,  emacs-devel@gnu.org
> Date: Thu, 16 May 2024 16:24:29 +0200
> 
> Ceterum censeo Emacs should allow C++.

If you don't care about losing me, go ahead.




* Re: MPS: Loaded pdump
  2024-05-16 14:24                           ` Gerd Möllmann
  2024-05-16 15:48                             ` Eli Zaretskii
@ 2024-05-16 16:56                             ` Andrea Corallo
  2024-05-16 17:27                               ` Gerd Möllmann
  1 sibling, 1 reply; 62+ messages in thread
From: Andrea Corallo @ 2024-05-16 16:56 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Helmut Eller, Eli Zaretskii, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Helmut Eller <eller.helmut@gmail.com> writes:
>
>> On Thu, May 16 2024, Gerd Möllmann wrote:
>>
>>> The infrastructure for that is there, but a felt gazillion small
>>> functions need to be written, one for each type :-/.
>>
>> Can we move dflt_scan and the fix_* functions to an extra file and then
>> include that twice; each time with suitable definitions for MPS_FIX1 and
>> MPS_FIX2 so that it either does the normal fixing or the mirroring?
>
> Don't know if that's flexible enough :-(.
>
> Another idea I had yesterday would be to have one set of functions that
> when called record meta-info about types. That could be done on startup.
> Generic scan and mirror function could then interpret that meta-info.
> ;-).
>
> Code generation would be my personal favourite, though.
>
> And I was sooo close to doing it in C++ in the beginning. I think I
> could do that sh*t with templates, too. Too bad.
>
> Ceterum censeo Emacs should allow C++.

C++ is not an option for me either, sorry.

Could you show an example of code you want to generate (and the
corresponding code you want to generate from)?

Thanks

  Andrea




* Re: MPS: Loaded pdump
  2024-05-16 16:56                             ` Andrea Corallo
@ 2024-05-16 17:27                               ` Gerd Möllmann
  2024-05-16 17:50                                 ` Andrea Corallo
  2024-05-16 20:03                                 ` Helmut Eller
  0 siblings, 2 replies; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-16 17:27 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Helmut Eller, Eli Zaretskii, emacs-devel

Andrea Corallo <acorallo@gnu.org> writes:

> Could you show an example of code you want to generate (and the
> corresponding code you want to generate from)?

Sorry we had that already on emacs-devel, recently. My question then was
if someone in the last 20 years has tried to generate C code from C
parse trees, for example for structs, plus possibly annotations.
Apparently no one did.

In Emacs, off the top of my head, GC marking/fix/mirror functions for
structs come to mind, and the pdumper dump_xx functions also come to mind.

As I said, nothing came out of this. And nothing will.




* Re: MPS: Loaded pdump
  2024-05-16 17:27                               ` Gerd Möllmann
@ 2024-05-16 17:50                                 ` Andrea Corallo
  2024-05-16 20:03                                 ` Helmut Eller
  1 sibling, 0 replies; 62+ messages in thread
From: Andrea Corallo @ 2024-05-16 17:50 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Helmut Eller, Eli Zaretskii, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Andrea Corallo <acorallo@gnu.org> writes:
>
>> Could you show an example of code you want to generate (and the
>> corresponding code you want to generate from)?
>
> Sorry we had that already on emacs-devel, recently.

Sorry I don't have time to go through all this ginormous thread, I'll
not be able to help you then :(

  Andrea




* Re: MPS: Loaded pdump
  2024-05-16 17:27                               ` Gerd Möllmann
  2024-05-16 17:50                                 ` Andrea Corallo
@ 2024-05-16 20:03                                 ` Helmut Eller
  2024-05-17  4:04                                   ` Gerd Möllmann
  2024-05-17  6:09                                   ` Eli Zaretskii
  1 sibling, 2 replies; 62+ messages in thread
From: Helmut Eller @ 2024-05-16 20:03 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Andrea Corallo, Eli Zaretskii, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 186 bytes --]

On Thu, May 16 2024, Gerd Möllmann wrote:

> As I said, nothing came out of this. And nothing will.

The script below uses the libclang Python bindings to produce this
output:


[-- Attachment #2: x.c --]
[-- Type: text/x-csrc, Size: 1200 bytes --]

static void
mirror_Lisp_Symbol (struct igc_mirror *m, struct Lisp_Symbol *x)
{
  mirror_lisp_obj (m, &x->u.s.name);
  mirror_lisp_obj (m, &x->u.s.function);
  mirror_lisp_obj (m, &x->u.s.plist);
  mirror_ptr (m, &x->u.s.next);
}
static void
mirror_Lisp_String (struct igc_mirror *m, struct Lisp_String *x)
{
  mirror_ptr (m, &x->u.next);
  mirror_ptr (m, &x->u.s.intervals);
}
static void
mirror_interval (struct igc_mirror *m, struct interval *x)
{
  mirror_ptr (m, &x->left);
  mirror_ptr (m, &x->right);
  mirror_lisp_obj (m, &x->plist);
  mirror_ptr (m, &x->up.interval);
  mirror_lisp_obj (m, &x->up.obj);
}
static void
mirror_itree_node (struct igc_mirror *m, struct itree_node *x)
{
  mirror_ptr (m, &x->parent);
  mirror_ptr (m, &x->left);
  mirror_ptr (m, &x->right);
  mirror_lisp_obj (m, &x->data);
}
static void
mirror_image (struct igc_mirror *m, struct image *x)
{
  mirror_lisp_obj (m, &x->spec);
  mirror_lisp_obj (m, &x->dependencies);
  mirror_lisp_obj (m, &x->lisp_data);
  mirror_ptr (m, &x->next);
  mirror_ptr (m, &x->prev);
}
static void
mirror_Lisp_Cons (struct igc_mirror *m, struct Lisp_Cons *x)
{
  mirror_lisp_obj (m, &x->u.s.car);
  mirror_lisp_obj (m, &x->u.s.u.cdr);
}

[-- Attachment #3: Type: text/plain, Size: 28 bytes --]


Would this be acceptable?


[-- Attachment #4: igc_codegen.py --]
[-- Type: text/x-python, Size: 2123 bytes --]

import clang.cindex as cindex
import sys
import re

index = cindex.Index.create()
tu = index.parse("igc.c")

MPS_STRUCTS = ["Lisp_Symbol",
               "Lisp_String",
               "interval",
               "itree_node",
               "image",
               "Lisp_Cons"]

EXCLUDE_FIELDS = {
    "Lisp_Symbol": re.compile('u s val.*'),
    "Lisp_Cons": re.compile('.* chain'),
}

def type_for_struct_name (name) -> cindex.Type:
    for c in tu.cursor.get_children():
        if (c.kind == cindex.CursorKind.STRUCT_DECL and c.spelling == name):
            return c.type.get_canonical()
    raise Exception("struct type not found: " + name)

mps_types = [type_for_struct_name(name) for name in MPS_STRUCTS]

lisp_obj_type = next(c.type.get_canonical()
                     for c in tu.cursor.get_children()
                     if c.spelling == 'Lisp_Object'
                     if c.kind == cindex.CursorKind.TYPEDEF_DECL)

def is_mps_ref (t:cindex.Type) -> bool:
    return (t.get_canonical() == lisp_obj_type or
            (t.get_pointee().get_canonical() in mps_types))

def field_paths (t:cindex.Type):
    l1 = [([f.spelling], f.type.get_canonical()) for f in t.get_fields()]
    l2 = [(p1 + p2,t2) for (p1,t1) in l1 for (p2,t2) in field_paths(t1)]
    return l1 + l2

def ref_fields (t:cindex.Type):
    excluded = EXCLUDE_FIELDS.get(t.get_declaration().spelling, re.compile(""))
    return [(p,t) for (p,t) in field_paths(t)
            if is_mps_ref (t)
            if not excluded.fullmatch(' '.join(p))]

def emit_function_signature (t:cindex.Type):
    print("""static void
mirror_%(fname)s (struct igc_mirror *m, %(atype)s *x)
{"""% { "fname": t.get_declaration().spelling,
       "atype": t.spelling })

def emit_mirror (t:cindex.Type):
    emit_function_signature(t)
    # ftype rather than type, to avoid shadowing the builtin
    for (path, ftype) in ref_fields(t):
        if ftype == lisp_obj_type:
            print("  mirror_lisp_obj (m, &x->%s);" % '.'.join(path))
        else:
            print("  mirror_ptr (m, &x->%s);" % '.'.join(path))
    print("}")


def main():
    for t in mps_types:
        emit_mirror(t)

if __name__ == "__main__":
    main()
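
The path flattening and the EXCLUDE_FIELDS filtering in the script can be tried without libclang. The sketch below swaps the cindex types for plain dicts. The `LISP_CONS` layout is an assumption inferred from the field paths in the generated output, and `is_ref` is a simplified stand-in for `is_mps_ref`.

```python
import re

# Toy stand-in for libclang types: a dict is a struct or union whose
# values are field types; a string is a leaf type. The layout is an
# assumption inferred from the generated mirror_Lisp_Cons output.
LISP_CONS = {
    "u": {
        "s": {
            "car": "Lisp_Object",
            "u": {"cdr": "Lisp_Object", "chain": "struct Lisp_Cons *"},
        },
    },
}


def field_paths(t):
    """Return (path, type) for every field, including nested ones."""
    if not isinstance(t, dict):
        return []
    direct = [([name], ftype) for name, ftype in t.items()]
    nested = [(p1 + p2, t2)
              for (p1, t1) in direct
              for (p2, t2) in field_paths(t1)]
    return direct + nested


def is_ref(ftype):
    """Simplified stand-in for is_mps_ref in the script."""
    return ftype in ("Lisp_Object", "struct Lisp_Cons *")


def ref_fields(t, excluded):
    """Keep reference fields whose space-joined path is not excluded."""
    return [(p, ftype) for (p, ftype) in field_paths(t)
            if is_ref(ftype)
            if not excluded.fullmatch(" ".join(p))]


paths = [".".join(p) for (p, _) in ref_fields(LISP_CONS, re.compile(".* chain"))]
```

Here `paths` ends up as the two members that appear in the generated mirror_Lisp_Cons, while the `chain` member is dropped because its path "u s u chain" matches the exclusion pattern as a whole.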


* Re: MPS: Loaded pdump
  2024-05-16 20:03                                 ` Helmut Eller
@ 2024-05-17  4:04                                   ` Gerd Möllmann
  2024-05-17  6:09                                   ` Eli Zaretskii
  1 sibling, 0 replies; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-17  4:04 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Andrea Corallo, Eli Zaretskii, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Thu, May 16 2024, Gerd Möllmann wrote:
>
>> As I said, nothing came out of this. And nothing will.
>
> The script below uses the libclang Python bindings to produce this
> output:
>
>
>
> Would this be acceptable?

Wow, Danke Helmut!

For me, that's the direction to go.

What the GNU side says is of course not my beer.




* Re: MPS: Loaded pdump
  2024-05-16 20:03                                 ` Helmut Eller
  2024-05-17  4:04                                   ` Gerd Möllmann
@ 2024-05-17  6:09                                   ` Eli Zaretskii
  2024-05-18 18:55                                     ` Helmut Eller
  1 sibling, 1 reply; 62+ messages in thread
From: Eli Zaretskii @ 2024-05-17  6:09 UTC (permalink / raw)
  To: Helmut Eller; +Cc: gerd.moellmann, acorallo, emacs-devel

> From: Helmut Eller <eller.helmut@gmail.com>
> Cc: Andrea Corallo <acorallo@gnu.org>,  Eli Zaretskii <eliz@gnu.org>,
>   emacs-devel@gnu.org
> Date: Thu, 16 May 2024 22:03:14 +0200
> 
> On Thu, May 16 2024, Gerd Möllmann wrote:
> 
> > As I said, nothing came out of this. And nothing will.
> 
> The script below uses the libclang Python bindings to produce this
> output:
> 
> Would this be acceptable?

As a one-time thing, I don't think anyone will care how the code was
obtained, as long as it is maintained by hand henceforth.

But if you suggest this as a permanent inclusion into Emacs, then I
don't think we can go that way, since the tools to produce this are
neither standard ones available everywhere, nor something we can
include with Emacs.

Given that Emacs now has tree-sitter bindings, I wonder whether the
same can be done in Emacs Lisp using tree-sitter for parsing.  That'd
be acceptable, I think.

Thanks.




* Re: MPS: Loaded pdump
  2024-05-17  6:09                                   ` Eli Zaretskii
@ 2024-05-18 18:55                                     ` Helmut Eller
  2024-05-18 20:16                                       ` Andrea Corallo
  2024-05-19  3:48                                       ` Gerd Möllmann
  0 siblings, 2 replies; 62+ messages in thread
From: Helmut Eller @ 2024-05-18 18:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, acorallo, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1197 bytes --]

On Fri, May 17 2024, Eli Zaretskii wrote:

> As a one-time thing, I don't think anyone will care how the code was
> obtained, as long as it is maintained by hand henceforth.
>
> But if you suggest this as a permanent inclusion into Emacs, then I
> don't think we can go that way, since the tools to produce this are
> neither standard ones available everywhere, nor something we can
> include with Emacs.

I think the goal is not to edit the generated code manually but to keep
the code generator around and edit that if needed.

> Given that Emacs now has tree-sitter bindings, I wonder whether the
> same can be done in Emacs Lisp using tree-sitter for parsing.  That'd
> be acceptable, I think.

That's an interesting idea.  I tried to rewrite the Python code in Elisp
and it works, after a fashion.

The tree-sitter syntax tree is at a lower level than what libclang
offers and I had to rewrite the tree quite a bit to make it easier to
use.  I also pipe the C source code through the preprocessor first, so
that tree-sitter doesn't see macros.  With macros, it's even harder to
get some easy to use data structures out of it.
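
The preprocessing step might look roughly like this. It is a Python sketch and not the attached Elisp: `preprocess_command`, `strip_linemarkers`, and `preprocess` are hypothetical helpers, and the compiler name `cc` and the include directories are assumptions. It relies only on the common `-E` and `-I` compiler options and on the `# <line> "<file>"` linemarkers that GCC-style preprocessors write into their output.

```python
import re
import subprocess


def preprocess_command(source, include_dirs=(), cc="cc"):
    """Build a compiler command that stops after preprocessing (-E)."""
    cmd = [cc, "-E"]
    for d in include_dirs:
        cmd += ["-I", d]
    return cmd + [source]


def strip_linemarkers(text):
    """Drop '# <line> "<file>" ...' and '#line ...' lines, so the parser
    sees only declarations and no leftover directives."""
    marker = re.compile(r'^#\s*(line\s+)?\d+\s+"')
    return "\n".join(line for line in text.splitlines()
                     if not marker.match(line))


def preprocess(source, include_dirs=()):
    """Run the preprocessor and return macro-free C text."""
    result = subprocess.run(preprocess_command(source, include_dirs),
                            check=True, capture_output=True, text=True)
    return strip_linemarkers(result.stdout)
```

The returned text can then be handed to whatever parser is used, at the price that the generated code reflects one particular configuration of the build.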

I'm not sure how to proceed from here. Anyway the code is here:


[-- Attachment #2: igc-codegen.el --]
[-- Type: application/emacs-lisp, Size: 11124 bytes --]


* Re: MPS: Loaded pdump
  2024-05-18 18:55                                     ` Helmut Eller
@ 2024-05-18 20:16                                       ` Andrea Corallo
  2024-05-19  5:27                                         ` Eli Zaretskii
  2024-05-19  3:48                                       ` Gerd Möllmann
  1 sibling, 1 reply; 62+ messages in thread
From: Andrea Corallo @ 2024-05-18 20:16 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, gerd.moellmann, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Fri, May 17 2024, Eli Zaretskii wrote:
>
>> As a one-time thing, I don't think anyone will care how the code was
>> obtained, as long as it is maintained by hand henceforth.
>>
>> But if you suggest this as a permanent inclusion into Emacs, then I
>> don't think we can go that way, since the tools to produce this are
>> neither standard ones available everywhere, nor something we can
>> include with Emacs.
>
> I think the goal is not to edit the generated code manually but to keep
> the code generator around and edit that if needed.
>
>> Given that Emacs now has tree-sitter bindings, I wonder whether the
>> same can be done in Emacs Lisp using tree-sitter for parsing.  That'd
>> be acceptable, I think.
>
> That's an interesting idea.  I tried to rewrite the Python code in Elisp
> and it works, after a fashion.

Thanks for taking this up.

> The tree-sitter syntax tree is at a lower level than what libclang
> offers and I had to rewrite the tree quite a bit to make it easier to
> use.  I also pipe the C source code through the preprocessor first, so
> that tree-sitter doesn't see macros.  With macros, it's even harder to
> get some easy to use data structures out of it.
>
> I'm not sure how to proceed from here. Anyway the code is here:

I guess we want to have it in admin/ and run it when necessary?

Thanks

  Andrea




* Re: MPS: Loaded pdump
  2024-05-18 18:55                                     ` Helmut Eller
  2024-05-18 20:16                                       ` Andrea Corallo
@ 2024-05-19  3:48                                       ` Gerd Möllmann
  2024-05-19  6:39                                         ` Eli Zaretskii
  1 sibling, 1 reply; 62+ messages in thread
From: Gerd Möllmann @ 2024-05-19  3:48 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, acorallo, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Fri, May 17 2024, Eli Zaretskii wrote:
>
>> As a one-time thing, I don't think anyone will care how the code was
>> obtained, as long as it is maintained by hand henceforth.
>>
>> But if you suggest this as a permanent inclusion into Emacs, then I
>> don't think we can go that way, since the tools to produce this are
>> neither standard ones available everywhere, nor something we can
>> include with Emacs.
>
> I think the goal is not to edit the generated code manually but to keep
> the code generator around and edit that if needed.

+1

>> Given that Emacs now has tree-sitter bindings, I wonder whether the
>> same can be done in Emacs Lisp using tree-sitter for parsing.  That'd
>> be acceptable, I think.
>
> That's an interesting idea.  I tried to rewrite the Python code in Elisp
> and it works, after a fashion.
>
> The tree-sitter syntax tree is at a lower level than what libclang
> offers and I had to rewrite the tree quite a bit to make it easier to
> use.  I also pipe the C source code through the preprocessor first, so
> that tree-sitter doesn't see macros.  With macros, it's even harder to
> get some easy to use data structures out of it.
>
> I'm not sure how to proceed from here. Anyway the code is here:

Thanks.

I don't know how to proceed with this either. I mean technically it's
probably clear: have an annotated AST, write AST matchers and an AST
visitor framework (just as an example) and generate C code. Start with
generating some fix_ functions, then others.

It's quite some work. (And I must say that I slowly feel I'd rather do
something else now than MPS :-).





* Re: MPS: Loaded pdump
  2024-05-18 20:16                                       ` Andrea Corallo
@ 2024-05-19  5:27                                         ` Eli Zaretskii
  0 siblings, 0 replies; 62+ messages in thread
From: Eli Zaretskii @ 2024-05-19  5:27 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: eller.helmut, gerd.moellmann, emacs-devel

> From: Andrea Corallo <acorallo@gnu.org>
> Cc: Eli Zaretskii <eliz@gnu.org>,  gerd.moellmann@gmail.com,
>   emacs-devel@gnu.org
> Date: Sat, 18 May 2024 16:16:39 -0400
> 
> > The tree-sitter syntax tree is at a lower level than what libclang
> > offers and I had to rewrite the tree quite a bit to make it easier to
> > use.  I also pipe the C source code through the preprocessor first, so
> > that tree-sitter doesn't see macros.  With macros, it's even harder to
> > get some easy to use data structures out of it.
> >
> > I'm not sure how to proceed from here. Anyway the code is here:
> 
> I guess we want to have it in admin/ and run it when necessary?

Yes, I think so.




* Re: MPS: Loaded pdump
  2024-05-19  3:48                                       ` Gerd Möllmann
@ 2024-05-19  6:39                                         ` Eli Zaretskii
  0 siblings, 0 replies; 62+ messages in thread
From: Eli Zaretskii @ 2024-05-19  6:39 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: eller.helmut, acorallo, emacs-devel

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>,  acorallo@gnu.org,  emacs-devel@gnu.org
> Date: Sun, 19 May 2024 05:48:29 +0200
> 
> (And I must say that I slowly feel I'd rather do something else now
> than MPS :-).

IMO, if we drop the development of the branch now, the chances of its
ever being landed become very low indeed.  So I hope the branch will
keep being actively worked on.




end of thread, other threads:[~2024-05-19  6:39 UTC | newest]

Thread overview: 62+ messages
2024-05-09 10:52 MPS: Loaded pdump Gerd Möllmann
2024-05-09 11:00 ` Eli Zaretskii
2024-05-09 11:20   ` Gerd Möllmann
2024-05-09 12:28 ` Helmut Eller
2024-05-09 13:37   ` Gerd Möllmann
2024-05-09 16:10     ` Helmut Eller
2024-05-09 16:43       ` Gerd Möllmann
2024-05-09 17:57         ` Helmut Eller
2024-05-09 18:10           ` Gerd Möllmann
2024-05-09 13:38 ` Helmut Eller
2024-05-09 14:18   ` Gerd Möllmann
2024-05-09 15:01     ` Helmut Eller
2024-05-09 15:07       ` Gerd Möllmann
2024-05-10  7:59         ` Gerd Möllmann
2024-05-10  8:09           ` Helmut Eller
2024-05-10  8:35             ` Gerd Möllmann
2024-05-10  8:51               ` Helmut Eller
2024-05-10  8:54                 ` Gerd Möllmann
2024-05-10 10:25             ` Eli Zaretskii
2024-05-10 11:31               ` Gerd Möllmann
2024-05-10 12:52                 ` Gerd Möllmann
2024-05-10 13:37                   ` Helmut Eller
2024-05-10 13:59                     ` Gerd Möllmann
2024-05-10 14:31                       ` Helmut Eller
2024-05-10 14:36                         ` Gerd Möllmann
2024-05-13  9:11                   ` Gerd Möllmann
2024-05-14  8:23                     ` Gerd Möllmann
2024-05-14 14:22                       ` Helmut Eller
2024-05-14 15:46                         ` Gerd Möllmann
2024-05-14 17:49                           ` Eli Zaretskii
2024-05-14 18:10                             ` Gerd Möllmann
2024-05-16  4:25                       ` Gerd Möllmann
2024-05-16  8:36                         ` Helmut Eller
2024-05-16  8:46                           ` Gerd Möllmann
2024-05-16  9:01                           ` Gerd Möllmann
2024-05-16  9:31                             ` Helmut Eller
2024-05-16  9:42                               ` Gerd Möllmann
2024-05-16  9:54                                 ` Gerd Möllmann
2024-05-16 12:43                                   ` Helmut Eller
2024-05-16 12:47                                     ` Gerd Möllmann
2024-05-16 12:08                                 ` Eli Zaretskii
2024-05-16 12:27                                   ` Gerd Möllmann
2024-05-16 12:07                               ` Eli Zaretskii
2024-05-16 12:21                                 ` Gerd Möllmann
2024-05-16 12:27                                   ` Eli Zaretskii
2024-05-16 12:43                                     ` Gerd Möllmann
2024-05-16 14:09                         ` Helmut Eller
2024-05-16 14:24                           ` Gerd Möllmann
2024-05-16 15:48                             ` Eli Zaretskii
2024-05-16 16:56                             ` Andrea Corallo
2024-05-16 17:27                               ` Gerd Möllmann
2024-05-16 17:50                                 ` Andrea Corallo
2024-05-16 20:03                                 ` Helmut Eller
2024-05-17  4:04                                   ` Gerd Möllmann
2024-05-17  6:09                                   ` Eli Zaretskii
2024-05-18 18:55                                     ` Helmut Eller
2024-05-18 20:16                                       ` Andrea Corallo
2024-05-19  5:27                                         ` Eli Zaretskii
2024-05-19  3:48                                       ` Gerd Möllmann
2024-05-19  6:39                                         ` Eli Zaretskii
2024-05-09 18:24       ` Gerd Möllmann
2024-05-09 18:35         ` Gerd Möllmann
