unofficial mirror of emacs-devel@gnu.org 
* --with-native-compilation build failure on 32-bit systems
@ 2022-08-05  2:12 Joseph Mingrone
  2022-08-05 11:58 ` Lars Ingebrigtsen
  0 siblings, 1 reply; 47+ messages in thread
From: Joseph Mingrone @ 2022-08-05  2:12 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: emacs-devel, emacs

Hello Lars,

Could 261d6af have broken --with-native-compilation builds on 32-bit systems?  This is what I see building in a clean FreeBSD/i386 13.0 jail using 261d6af:
http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log

6fb2063 looks good though (the pkg-plist error at the end can be ignored).
http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h53m03s/logs/errors/emacs-devel-29.0.50.20220804,2.log

Is there any other information that I can provide?

Joe



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-05  2:12 --with-native-compilation build failure on 32-bit systems Joseph Mingrone
@ 2022-08-05 11:58 ` Lars Ingebrigtsen
  2022-08-05 13:30   ` Andrea Corallo
  0 siblings, 1 reply; 47+ messages in thread
From: Lars Ingebrigtsen @ 2022-08-05 11:58 UTC (permalink / raw)
  To: Joseph Mingrone; +Cc: emacs-devel, emacs, Andrea Corallo

Joseph Mingrone <jrm@ftfl.ca> writes:

> Could 261d6af have broken --with-native-compilation builds on 32-bit
> systems?  This is what I see building in a clean FreeBSD/i386 13.0
> jail using 261d6af:
> http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log

I guess these are the error messages?

emacs: Trying to load incoherent dumped eln file /wrkdirs/usr/ports/editors/emacs-devel/work-full/emacs-261d6af/native-lisp/29.0.50-7cc1a43d/preloaded/ediff-hook-0b92f1a2-f843c8a0.eln

I don't know what that means; Andrea added to the CCs.




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-05 11:58 ` Lars Ingebrigtsen
@ 2022-08-05 13:30   ` Andrea Corallo
  2022-08-05 14:40     ` Andrea Corallo
  0 siblings, 1 reply; 47+ messages in thread
From: Andrea Corallo @ 2022-08-05 13:30 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Joseph Mingrone, emacs-devel, emacs

Lars Ingebrigtsen <larsi@gnus.org> writes:

> Joseph Mingrone <jrm@ftfl.ca> writes:
>
>> Could 261d6af have broken --with-native-compilation builds on 32-bit
>> systems?  This is what I see building in a clean FreeBSD/i386 13.0
>> jail using 261d6af:
>> http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log
>
> I guess these are the error messages?
>
> emacs: Trying to load incoherent dumped eln file
> /wrkdirs/usr/ports/editors/emacs-devel/work-full/emacs-261d6af/native-lisp/29.0.50-7cc1a43d/preloaded/ediff-hook-0b92f1a2-f843c8a0.eln
>
> I don't know what that means; Andrea added to the CCs.

It's very surprising to see 261d6af causing this side effect; at least I
don't see why it should affect the 32-bit build only.

I'm trying to reproduce it on my 32bit env.

  Andrea




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-05 13:30   ` Andrea Corallo
@ 2022-08-05 14:40     ` Andrea Corallo
  2022-08-05 15:16       ` Lynn Winebarger
  2022-08-09  9:11       ` Andrea Corallo
  0 siblings, 2 replies; 47+ messages in thread
From: Andrea Corallo @ 2022-08-05 14:40 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Joseph Mingrone, emacs-devel, emacs

Andrea Corallo <akrl@sdf.org> writes:

> Lars Ingebrigtsen <larsi@gnus.org> writes:
>
>> Joseph Mingrone <jrm@ftfl.ca> writes:
>>
>>> Could 261d6af have broken --with-native-compilation builds on 32-bit
>>> systems?  This is what I see building in a clean FreeBSD/i386 13.0
>>> jail using 261d6af:
>>> http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log
>>
>> I guess these are the error messages?
>>
>> emacs: Trying to load incoherent dumped eln file
>> /wrkdirs/usr/ports/editors/emacs-devel/work-full/emacs-261d6af/native-lisp/29.0.50-7cc1a43d/preloaded/ediff-hook-0b92f1a2-f843c8a0.eln
>>
>> I don't know what that means; Andrea added to the CCs.
>
> It's very surprising to see 261d6af causing this side effect, at least I
> don't see why should effect the 32bit build only.
>
> I'm trying to reproduce it on my 32bit env.

I confirm the build is broken on my 32-bit env as well (but not on the
64-bit one).

Loading the second dump, while we are relocating the ediff-hook
compilation unit, we realize (@ pdumper.c:5304) that its file field is
not a cons as expected but just a string.

Now the question is why this is not fixed-up in loadup.el:477 as for the
other compilation units?

  Andrea
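
The load-time check described above can be pictured with a small toy model. This is a minimal sketch in Python, not Emacs code: the `CompUnit` class, the `check_on_load` helper, and the example paths are all hypothetical, and a Python tuple stands in for the cons that the dump loader expects in the file field.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass
class CompUnit:
    # Before fix-up: a plain path string.
    # After fix-up: a pair of relative paths (hypothetical layout).
    file: Union[str, Tuple[str, str]]


class IncoherentDump(Exception):
    """Raised when a dumped unit was never fixed up."""


def check_on_load(cu: CompUnit) -> Tuple[str, str]:
    """Accept only units whose file field is a pair, as the loader expects."""
    if not isinstance(cu.file, tuple):
        raise IncoherentDump(
            f"Trying to load incoherent dumped eln file {cu.file}"
        )
    return cu.file


# Hypothetical example data: one unit that was fixed up, one that was not.
fixed = CompUnit(file=("../lib/preloaded/files.eln", "../native-lisp/files.eln"))
broken = CompUnit(file="/build/native-lisp/preloaded/ediff-hook.eln")
```

In this model the failure only shows up when the dump is read back, which matches the symptom in the build log: the unit reaches the dump with a string in its file field and is rejected on load.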




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-05 14:40     ` Andrea Corallo
@ 2022-08-05 15:16       ` Lynn Winebarger
  2022-08-08  7:44         ` Andrea Corallo
  2022-08-09  9:11       ` Andrea Corallo
  1 sibling, 1 reply; 47+ messages in thread
From: Lynn Winebarger @ 2022-08-05 15:16 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Lars Ingebrigtsen, Joseph Mingrone, emacs-devel, emacs


On Fri, Aug 5, 2022, 10:42 AM Andrea Corallo <akrl@sdf.org> wrote:

> Andrea Corallo <akrl@sdf.org> writes:
>
> > Lars Ingebrigtsen <larsi@gnus.org> writes:
> >
> >> Joseph Mingrone <jrm@ftfl.ca> writes:
> >>
> >>> Could 261d6af have broken --with-native-compilation builds on 32-bit
> >>> systems?  This is what I see building in a clean FreeBSD/i386 13.0
> >>> jail using 261d6af:
> >>>
> http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log
> >>
> >> I guess these are the error messages?
> >>
> >> emacs: Trying to load incoherent dumped eln file
> >>
> /wrkdirs/usr/ports/editors/emacs-devel/work-full/emacs-261d6af/native-lisp/29.0.50-7cc1a43d/preloaded/ediff-hook-0b92f1a2-f843c8a0.eln
> >>
> >> I don't know what that means; Andrea added to the CCs.
> >
> > It's very surprising to see 261d6af causing this side effect, at least I
> > don't see why should effect the 32bit build only.
> >
> > I'm trying to reproduce it on my 32bit env.
>
> I confirm the build it's broken on my 32bit env as well, (but not on the
> 64 one).
>
> Loading the second dump, while we are relocating the ediff-hook
> compilation unit, we realize (@ pdumper.c:5304) that its file field is
> not a cons as expected but just a string.
>
> Now the question is why this is not fixed-up in loadup.el:477 as for the
> other compilation units?


Are you sure it's actually fixed up in the other compilation units?  When
I've seen this problem, it was because the bindir and elndir arguments were
not specified while dumping.  The complaint came up from one of the later
(but not last) files I had loaded for dumping, but none of the files were
fixed up.

This problem should be signaled by loadup if there are any NCUs it does not
fix up.  It would be a lot easier to diagnose the problem from there.

Lynn


* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-05 15:16       ` Lynn Winebarger
@ 2022-08-08  7:44         ` Andrea Corallo
  2022-08-08 10:22           ` Lynn Winebarger
  0 siblings, 1 reply; 47+ messages in thread
From: Andrea Corallo @ 2022-08-08  7:44 UTC (permalink / raw)
  To: Lynn Winebarger; +Cc: Lars Ingebrigtsen, Joseph Mingrone, emacs-devel, emacs

Lynn Winebarger <owinebar@gmail.com> writes:

> On Fri, Aug 5, 2022, 10:42 AM Andrea Corallo <akrl@sdf.org> wrote:
>
>  Andrea Corallo <akrl@sdf.org> writes:
>
>  > Lars Ingebrigtsen <larsi@gnus.org> writes:
>  >
>  >> Joseph Mingrone <jrm@ftfl.ca> writes:
>  >>
>  >>> Could 261d6af have broken --with-native-compilation builds on 32-bit
>  >>> systems?  This is what I see building in a clean FreeBSD/i386 13.0
>  >>> jail using 261d6af:
>  >>> http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log
>  >>
>  >> I guess these are the error messages?
>  >>
>  >> emacs: Trying to load incoherent dumped eln file
>  >>
>  /wrkdirs/usr/ports/editors/emacs-devel/work-full/emacs-261d6af/native-lisp/29.0.50-7cc1a43d/preloaded/ediff-hook-0b92f1a2-f843c8a0.eln
>  
>  >>
>  >> I don't know what that means; Andrea added to the CCs.
>  >
>  > It's very surprising to see 261d6af causing this side effect, at least I
>  > don't see why should effect the 32bit build only.
>  >
>  > I'm trying to reproduce it on my 32bit env.
>
>  I confirm the build it's broken on my 32bit env as well, (but not on the
>  64 one).
>
>  Loading the second dump, while we are relocating the ediff-hook
>  compilation unit, we realize (@ pdumper.c:5304) that its file field is
>  not a cons as expected but just a string.
>
>  Now the question is why this is not fixed-up in loadup.el:477 as for the
>  other compilation units?
>
> Are you sure it's actually fixed up in the other compilation units?

Indeed, otherwise an error is signaled.

> This problem should be signaled by loadup if there are any NCUs it does not fix up.  It would be a lot easier to diagnose
> the problem from there.

loadup is in charge of fixing up the file fields of all CUs, and indeed if
something goes wrong in that code an error is signaled.  But evidently
this is not the case, so there's something more to understand.

Regards

  Andrea




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-08  7:44         ` Andrea Corallo
@ 2022-08-08 10:22           ` Lynn Winebarger
  2022-08-08 13:14             ` Andrea Corallo
  0 siblings, 1 reply; 47+ messages in thread
From: Lynn Winebarger @ 2022-08-08 10:22 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Lars Ingebrigtsen, Joseph Mingrone, emacs-devel, emacs


On Mon, Aug 8, 2022, 3:44 AM Andrea Corallo <akrl@sdf.org> wrote:

> Lynn Winebarger <owinebar@gmail.com> writes:
>
> > On Fri, Aug 5, 2022, 10:42 AM Andrea Corallo <akrl@sdf.org> wrote:
> >
> >  Andrea Corallo <akrl@sdf.org> writes:
> >
> >  > Lars Ingebrigtsen <larsi@gnus.org> writes:
> >  >
> >  >> Joseph Mingrone <jrm@ftfl.ca> writes:
> >  >>
> >  >>> Could 261d6af have broken --with-native-compilation builds on 32-bit
> >  >>> systems?  This is what I see building in a clean FreeBSD/i386 13.0
> >  >>> jail using 261d6af:
> >  >>>
> http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log
> >  >>
> >  >> I guess these are the error messages?
> >  >>
> >  >> emacs: Trying to load incoherent dumped eln file
> >  >>
> >
> /wrkdirs/usr/ports/editors/emacs-devel/work-full/emacs-261d6af/native-lisp/29.0.50-7cc1a43d/preloaded/ediff-hook-0b92f1a2-f843c8a0.eln
> >
> >  >>
> >  >> I don't know what that means; Andrea added to the CCs.
> >  >
> >  > It's very surprising to see 261d6af causing this side effect, at
> least I
> >  > don't see why should effect the 32bit build only.
> >  >
> >  > I'm trying to reproduce it on my 32bit env.
> >
> >  I confirm the build it's broken on my 32bit env as well, (but not on the
> >  64 one).
> >
> >  Loading the second dump, while we are relocating the ediff-hook
> >  compilation unit, we realize (@ pdumper.c:5304) that its file field is
> >  not a cons as expected but just a string.
> >
> >  Now the question is why this is not fixed-up in loadup.el:477 as for the
> >  other compilation units?
> >
> > Are you sure it's actually fixed up in the other compilation units?
>
> Indeed, otherwise an error is signaled.
>
> > This problem should be signaled by loadup if there are any NCUs it does
> not fix up.  It would be a lot easier to diagnose
> > the problem from there.
>
> loadup is in charge of fixing up on all CU's file fields, and indeed if
> something goes wrong in that code an error is signaled.  But evidently
> this is not the case, so there's something more to understand.
>

I just looked, and there are 2 possible paths for NCUs to be in the dump
without an error being signaled:
1 - either the --bin-dest or --eln-dest flag is not specified (or is on the
command line but empty)
2 - there is an NCU loaded for which no symbol is bound to a subr in that
NCU.

Since I put in some code (in loadup) to explicitly test whether any loaded
NCU would be missed by (2), I have seen one instance pop up, though not
while only loading the files in loadup - site-load loads many more.
However, I've removed the requirement of having a cons cell in the NCU in
the dump file, so I don't know if it was destined for garbage collection,
and so discarded by the dump process.

Lynn
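
The two silent paths listed above can be illustrated with a toy model. This is a sketch in Python under stated assumptions, not the code in loadup.el: `fix_up_via_symbols`, the `Symbol` and `CompUnit` classes, and all paths are hypothetical. It only assumes what the message describes, namely that the fix-up finds compilation units by following the function slot of symbols and is skipped when either destination flag is missing or empty.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CompUnit:
    file: object  # str before fix-up, (str, str) pair after


@dataclass
class Symbol:
    name: str
    function_cu: Optional[CompUnit] = None  # CU owning the function, if any


def fix_up_via_symbols(symbols: List[Symbol], bin_dest: str, eln_dest: str) -> int:
    """Fix up only the CUs reachable from some symbol's function slot."""
    if not bin_dest or not eln_dest:
        return 0  # silent path 1: flags missing or empty, nothing is fixed
    fixed = 0
    for sym in symbols:
        cu = sym.function_cu
        if cu is not None and isinstance(cu.file, str):
            basename = cu.file.rsplit("/", 1)[-1]
            cu.file = (f"{eln_dest}/{basename}", cu.file)
            fixed += 1
    return fixed


# Hypothetical data: one CU defines a function, the other only variables.
files_cu = CompUnit(file="/build/native-lisp/files.eln")
vars_only_cu = CompUnit(file="/build/native-lisp/ediff-hook.eln")
symbols = [Symbol("some-function", files_cu), Symbol("some-variable")]

count = fix_up_via_symbols(symbols, "/usr/bin", "/usr/lib/emacs")
```

After the call, `vars_only_cu` is untouched (silent path 2): no symbol's function slot leads to it, so the walk never sees it and no error is raised.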


* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-08 10:22           ` Lynn Winebarger
@ 2022-08-08 13:14             ` Andrea Corallo
  2022-08-08 13:55               ` Lynn Winebarger
  0 siblings, 1 reply; 47+ messages in thread
From: Andrea Corallo @ 2022-08-08 13:14 UTC (permalink / raw)
  To: Lynn Winebarger; +Cc: Lars Ingebrigtsen, Joseph Mingrone, emacs-devel, emacs

Lynn Winebarger <owinebar@gmail.com> writes:

> On Mon, Aug 8, 2022, 3:44 AM Andrea Corallo <akrl@sdf.org> wrote:
>
>  Lynn Winebarger <owinebar@gmail.com> writes:
>
>  > On Fri, Aug 5, 2022, 10:42 AM Andrea Corallo <akrl@sdf.org> wrote:
>  >
>  >  Andrea Corallo <akrl@sdf.org> writes:
>  >
>  >  > Lars Ingebrigtsen <larsi@gnus.org> writes:
>  >  >
>  >  >> Joseph Mingrone <jrm@ftfl.ca> writes:
>  >  >>
>  >  >>> Could 261d6af have broken --with-native-compilation builds on 32-bit
>  >  >>> systems?  This is what I see building in a clean FreeBSD/i386 13.0
>  >  >>> jail using 261d6af:
>  >  >>> http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log
>  >  >>
>  >  >> I guess these are the error messages?
>  >  >>
>  >  >> emacs: Trying to load incoherent dumped eln file
>  >  >>
>  > 
>  /wrkdirs/usr/ports/editors/emacs-devel/work-full/emacs-261d6af/native-lisp/29.0.50-7cc1a43d/preloaded/ediff-hook-0b92f1a2-f843c8a0.eln
>  
>  >  
>  >  >>
>  >  >> I don't know what that means; Andrea added to the CCs.
>  >  >
>  >  > It's very surprising to see 261d6af causing this side effect, at least I
>  >  > don't see why should effect the 32bit build only.
>  >  >
>  >  > I'm trying to reproduce it on my 32bit env.
>  >
>  >  I confirm the build it's broken on my 32bit env as well, (but not on the
>  >  64 one).
>  >
>  >  Loading the second dump, while we are relocating the ediff-hook
>  >  compilation unit, we realize (@ pdumper.c:5304) that its file field is
>  >  not a cons as expected but just a string.
>  >
>  >  Now the question is why this is not fixed-up in loadup.el:477 as for the
>  >  other compilation units?
>  >
>  > Are you sure it's actually fixed up in the other compilation units?
>
>  Indeed, otherwise an error is signaled.
>
>  > This problem should be signaled by loadup if there are any NCUs it does not fix up.  It would be a lot easier to
>  diagnose
>  > the problem from there.
>
>  loadup is in charge of fixing up on all CU's file fields, and indeed if
>  something goes wrong in that code an error is signaled.  But evidently
>  this is not the case, so there's something more to understand.
>
> I just looked, and there are 2 possible paths for NCUs to be in the dump without an error being signaled:
> 1 - either the --bin-dest or --eln-dest flag is not specified (or is
> on the command line but empty)

This is not the case in our build.

> 2 - there is an NCU loaded for which no symbol is bound to a subr in that NCU.

CUs that are not reachable from the function slot of a symbol are
unloaded when GC runs.  We do run GC before dumping so this should not
happen.

  Andrea




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-08 13:14             ` Andrea Corallo
@ 2022-08-08 13:55               ` Lynn Winebarger
  2022-08-08 14:13                 ` Andrea Corallo
  0 siblings, 1 reply; 47+ messages in thread
From: Lynn Winebarger @ 2022-08-08 13:55 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Lars Ingebrigtsen, Joseph Mingrone, emacs-devel, emacs


On Mon, Aug 8, 2022, 9:14 AM Andrea Corallo <akrl@sdf.org> wrote:

> Lynn Winebarger <owinebar@gmail.com> writes:
>
> > On Mon, Aug 8, 2022, 3:44 AM Andrea Corallo <akrl@sdf.org> wrote:
> >
> >  Lynn Winebarger <owinebar@gmail.com> writes:
> >
> >  > On Fri, Aug 5, 2022, 10:42 AM Andrea Corallo <akrl@sdf.org> wrote:
> >  >
> >  >  Andrea Corallo <akrl@sdf.org> writes:
> >  >
> >  >  > Lars Ingebrigtsen <larsi@gnus.org> writes:
> >  >  >
> >  >  >> Joseph Mingrone <jrm@ftfl.ca> writes:
> >  >  >>
> >  >  >>> Could 261d6af have broken --with-native-compilation builds on
> 32-bit
> >  >  >>> systems?  This is what I see building in a clean FreeBSD/i386
> 13.0
> >  >  >>> jail using 261d6af:
> >  >  >>>
> http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log
> >  >  >>
> >  >  >> I guess these are the error messages?
> >  >  >>
> >  >  >> emacs: Trying to load incoherent dumped eln file
> >  >  >>
> >  >
> >
> /wrkdirs/usr/ports/editors/emacs-devel/work-full/emacs-261d6af/native-lisp/29.0.50-7cc1a43d/preloaded/ediff-hook-0b92f1a2-f843c8a0.eln
> >
> >  >
> >  >  >>
> >  >  >> I don't know what that means; Andrea added to the CCs.
> >  >  >
> >  >  > It's very surprising to see 261d6af causing this side effect, at
> least I
> >  >  > don't see why should effect the 32bit build only.
> >  >  >
> >  >  > I'm trying to reproduce it on my 32bit env.
> >  >
> >  >  I confirm the build it's broken on my 32bit env as well, (but not on
> the
> >  >  64 one).
> >  >
> >  >  Loading the second dump, while we are relocating the ediff-hook
> >  >  compilation unit, we realize (@ pdumper.c:5304) that its file field
> is
> >  >  not a cons as expected but just a string.
> >  >
> >  >  Now the question is why this is not fixed-up in loadup.el:477 as for
> the
> >  >  other compilation units?
> >  >
> >  > Are you sure it's actually fixed up in the other compilation units?
> >
> >  Indeed, otherwise an error is signaled.
> >
> >  > This problem should be signaled by loadup if there are any NCUs it
> does not fix up.  It would be a lot easier to
> >  diagnose
> >  > the problem from there.
> >
> >  loadup is in charge of fixing up on all CU's file fields, and indeed if
> >  something goes wrong in that code an error is signaled.  But evidently
> >  this is not the case, so there's something more to understand.
> >
> > I just looked, and there are 2 possible paths for NCUs to be in the dump
> without an error being signaled:
> > 1 - either the --bin-dest or --eln-dest flag is not specified (or is
> > on the command line but empty)
>
> This is not the case in our build.
>

No, but it is one way the dump can produce an unusable file without any
error signaled until an Emacs instance attempts to load it.

>
> > 2 - there is an NCU loaded for which no symbol is bound to a subr in
> that NCU.
>
> CUs that are not reachable from the function slot of a symbol are
> unloaded when GC runs.  We do run GC before dumping so this should not
> happen.


Yes, "should" is the operative word there.  Why not validate the condition
before writing the dump file?  If not in loadup, then in the procedure that
records the NCU in the dump?  Why wait until load-time to catch something
that was almost certainly (barring user performing surgery on the dump
file) the case when the dump was produced?  Just put the same check before
the "write" operation that is done immediately after the corresponding
"read" operation.

Lynn
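
The suggestion above, running the same check before the write that is run after the read, might look like this in a toy model. This is a Python sketch, not pdumper code: `unfixed_units` and `write_dump` are hypothetical names, and each compilation unit is represented only by its file field (a string before fix-up, a pair after).

```python
from typing import List, Sequence


def unfixed_units(file_fields: Sequence[object]) -> List[object]:
    """Return the file fields that are still plain strings."""
    return [f for f in file_fields if not isinstance(f, tuple)]


def write_dump(file_fields: Sequence[object]) -> List[object]:
    """Refuse to produce a dump that would later be rejected on load."""
    bad = unfixed_units(file_fields)
    if bad:
        raise ValueError(f"compilation units not fixed up before dump: {bad}")
    return list(file_fields)  # stands in for the serialized dump


# Hypothetical data: one fixed-up unit and one that was missed.
good = [("../lib/files.eln", "../native-lisp/files.eln")]
mixed = good + ["/build/native-lisp/ediff-hook.eln"]
```

With such a check the error would name the offending unit at dump time, in the build step that caused it, rather than when a later Emacs process tries to load the file.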


* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-08 13:55               ` Lynn Winebarger
@ 2022-08-08 14:13                 ` Andrea Corallo
  0 siblings, 0 replies; 47+ messages in thread
From: Andrea Corallo @ 2022-08-08 14:13 UTC (permalink / raw)
  To: Lynn Winebarger; +Cc: Lars Ingebrigtsen, Joseph Mingrone, emacs-devel, emacs

Lynn Winebarger <owinebar@gmail.com> writes:

> On Mon, Aug 8, 2022, 9:14 AM Andrea Corallo <akrl@sdf.org> wrote:
>
>  Lynn Winebarger <owinebar@gmail.com> writes:
>
>  > On Mon, Aug 8, 2022, 3:44 AM Andrea Corallo <akrl@sdf.org> wrote:
>  >
>  >  Lynn Winebarger <owinebar@gmail.com> writes:
>  >
>  >  > On Fri, Aug 5, 2022, 10:42 AM Andrea Corallo <akrl@sdf.org> wrote:
>  >  >
>  >  >  Andrea Corallo <akrl@sdf.org> writes:
>  >  >
>  >  >  > Lars Ingebrigtsen <larsi@gnus.org> writes:
>  >  >  >
>  >  >  >> Joseph Mingrone <jrm@ftfl.ca> writes:
>  >  >  >>
>  >  >  >>> Could 261d6af have broken --with-native-compilation builds on 32-bit
>  >  >  >>> systems?  This is what I see building in a clean FreeBSD/i386 13.0
>  >  >  >>> jail using 261d6af:
>  >  >  >>> http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log
>  >  >  >>
>  >  >  >> I guess these are the error messages?
>  >  >  >>
>  >  >  >> emacs: Trying to load incoherent dumped eln file
>  >  >  >>
>  >  > 
>  > 
>  /wrkdirs/usr/ports/editors/emacs-devel/work-full/emacs-261d6af/native-lisp/29.0.50-7cc1a43d/preloaded/ediff-hook-0b92f1a2-f843c8a0.eln
>  
>  >  
>  >  >  
>  >  >  >>
>  >  >  >> I don't know what that means; Andrea added to the CCs.
>  >  >  >
>  >  >  > It's very surprising to see 261d6af causing this side effect, at least I
>  >  >  > don't see why should effect the 32bit build only.
>  >  >  >
>  >  >  > I'm trying to reproduce it on my 32bit env.
>  >  >
>  >  >  I confirm the build it's broken on my 32bit env as well, (but not on the
>  >  >  64 one).
>  >  >
>  >  >  Loading the second dump, while we are relocating the ediff-hook
>  >  >  compilation unit, we realize (@ pdumper.c:5304) that its file field is
>  >  >  not a cons as expected but just a string.
>  >  >
>  >  >  Now the question is why this is not fixed-up in loadup.el:477 as for the
>  >  >  other compilation units?
>  >  >
>  >  > Are you sure it's actually fixed up in the other compilation units?
>  >
>  >  Indeed, otherwise an error is signaled.
>  >
>  >  > This problem should be signaled by loadup if there are any NCUs it does not fix up.  It would be a lot easier to
>  >  diagnose
>  >  > the problem from there.
>  >
>  >  loadup is in charge of fixing up on all CU's file fields, and indeed if
>  >  something goes wrong in that code an error is signaled.  But evidently
>  >  this is not the case, so there's something more to understand.
>  >
>  > I just looked, and there are 2 possible paths for NCUs to be in the dump without an error being signaled:
>  > 1 - either the --bin-dest or --eln-dest flag is not specified (or is
>  > on the command line but empty)
>
>  This is not the case in our build.
>
> No, but it is one way the dump can produce an unusable file without any error signaled until an Emacs instance attempts
> to load it.

That is understood, but it can't happen using our current build system,
and this is what we are interested in here.

>  > 2 - there is an NCU loaded for which no symbol is bound to a subr in that NCU.
>
>  CUs that are not reachable from the function slot of a symbol are
>  unloaded when GC runs.  We do run GC before dumping so this should not
>  happen.
>
> Yes, "should" is the operative word there.  Why not validate the condition before writing the dump file?  If not in
> loadup, then in the procedure that records the NCU in the dump?  Why wait until load-time to catch something that was
> almost certainly (barring user performing surgery on the dump file) the case when the dump was produced?  Just put the
> same check before the "write" operation that is  done immediately after the corresponding "read" operation.

Just submit a patch.

  Andrea




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-05 14:40     ` Andrea Corallo
  2022-08-05 15:16       ` Lynn Winebarger
@ 2022-08-09  9:11       ` Andrea Corallo
  2022-08-09  9:21         ` Andrea Corallo
  1 sibling, 1 reply; 47+ messages in thread
From: Andrea Corallo @ 2022-08-09  9:11 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Joseph Mingrone, emacs-devel, emacs

Andrea Corallo <akrl@sdf.org> writes:

> Andrea Corallo <akrl@sdf.org> writes:
>
>> Lars Ingebrigtsen <larsi@gnus.org> writes:
>>
>>> Joseph Mingrone <jrm@ftfl.ca> writes:
>>>
>>>> Could 261d6af have broken --with-native-compilation builds on 32-bit
>>>> systems?  This is what I see building in a clean FreeBSD/i386 13.0
>>>> jail using 261d6af:
>>>> http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log
>>>
>>> I guess these are the error messages?
>>>
>>> emacs: Trying to load incoherent dumped eln file
>>> /wrkdirs/usr/ports/editors/emacs-devel/work-full/emacs-261d6af/native-lisp/29.0.50-7cc1a43d/preloaded/ediff-hook-0b92f1a2-f843c8a0.eln
>>>
>>> I don't know what that means; Andrea added to the CCs.
>>
>> It's very surprising to see 261d6af causing this side effect, at least I
>> don't see why should effect the 32bit build only.
>>
>> I'm trying to reproduce it on my 32bit env.
>
> I confirm the build it's broken on my 32bit env as well, (but not on the
> 64 one).
>
> Loading the second dump, while we are relocating the ediff-hook
> compilation unit, we realize (@ pdumper.c:5304) that its file field is
> not a cons as expected but just a string.
>
> Now the question is why this is not fixed-up in loadup.el:477 as for the
> other compilation units?

Just had some time to look into this further:

Of all the CUs we are dumping, two are not fixed up in loadup.el before
the dump because they are not referenced by any function.

In particular, 'ediff-hook' contains only variable definitions, so this
is correct.

We do run a GC before dumping, so we should unload these unreferenced CUs
before the dump.  As expected, I don't see the ediff-hook CU being marked,
but we do not free it during sweep.

It looks to me like a GC bug so far.  Unfortunately I have very limited
time to dedicate to this this week.

BR

  Andrea
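
The expected behaviour described above, where an unmarked compilation unit is freed by the sweep, can be sketched with a toy mark-and-sweep pass. This is a Python model, not the Emacs collector: the object names, the `edges` table, and the `mark` and `sweep` helpers are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def mark(roots: Iterable[str], edges: Dict[str, List[str]]) -> Set[str]:
    """Return every object reachable from the roots."""
    marked: Set[str] = set()
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if obj in marked:
            continue
        marked.add(obj)
        pending.extend(edges.get(obj, ()))
    return marked


def sweep(heap: Set[str], marked: Set[str]) -> Set[str]:
    """Keep only marked objects; everything else is freed."""
    return {obj for obj in heap if obj in marked}


# Hypothetical heap: one CU is reachable through a function, one is not.
heap = {"symbol:some-function", "cu:files", "cu:ediff-hook"}
edges = {"symbol:some-function": ["cu:files"]}
survivors = sweep(heap, mark(["symbol:some-function"], edges))
```

In this model `cu:ediff-hook` is never marked and does not survive the sweep. The observation in the message is that the real build marks nothing for that unit either, yet the unit is still present when the dump is written.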





* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-09  9:11       ` Andrea Corallo
@ 2022-08-09  9:21         ` Andrea Corallo
  2022-08-09  9:48           ` Po Lu
                             ` (3 more replies)
  0 siblings, 4 replies; 47+ messages in thread
From: Andrea Corallo @ 2022-08-09  9:21 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Joseph Mingrone, emacs-devel, emacs

Andrea Corallo <akrl@sdf.org> writes:

[...]

> Just had some time to look into this further:
>
> Of all the CUs we are dumping two are not fixed-up in loadup.el before
> dump because not referenced by any function.
>
> In particular looking at 'ediff-hook' it does contain only variable
> definitions so this is correct.
>
> We do run a GC before dumping so we should unload these unreferenced CUs
> before dump.  And as expected I don't see ediff-hook CU being marked but
> we do not free it during sweep.
>
> It looks to me like a GC bug so far.  Unfortunatly I've very constrained
> time to dedicate on this this week.

Thinking about this... Maybe relying on the GC for this is not a very
good idea in the first place.  If we are conservative on the stack we
might always mark a CU accidentally and fall into the same issue.

I think we should maintain a list of all loaded CUs so we can fix them
up reliably.  If this is agreed not to be a bad idea I'll prepare a
patch.

BR

  Andrea
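
The proposal above, keeping an explicit list of loaded CUs and fixing up every entry, could be sketched as follows. This is a Python toy model and not the eventual patch: `CompUnitRegistry`, its methods, and the paths are hypothetical. The point it illustrates is that the fix-up no longer depends on which units the GC happened to keep or free.

```python
from typing import Callable, Dict, List, Tuple


class CompUnitRegistry:
    """Records every compilation unit at load time."""

    def __init__(self) -> None:
        self._units: List[Dict[str, object]] = []

    def register(self, cu: Dict[str, object]) -> None:
        self._units.append(cu)

    def fix_up_all(self, to_pair: Callable[[str], Tuple[str, str]]) -> int:
        """Fix up every registered unit, referenced by a function or not."""
        fixed = 0
        for cu in self._units:
            if isinstance(cu["file"], str):
                cu["file"] = to_pair(cu["file"])
                fixed += 1
        return fixed


def to_pair(path: str) -> Tuple[str, str]:
    # Hypothetical rewrite: (installed-relative, build-relative) names.
    name = path.rsplit("/", 1)[-1]
    return (f"../lib/{name}", f"../native-lisp/{name}")


registry = CompUnitRegistry()
files_cu = {"file": "/build/native-lisp/files.eln"}
vars_only_cu = {"file": "/build/native-lisp/ediff-hook.eln"}
registry.register(files_cu)
registry.register(vars_only_cu)
fixed_count = registry.fix_up_all(to_pair)
```

Because the registry is filled when a unit is loaded, a unit that only defines variables, or one that a conservative stack scan keeps alive by accident, is fixed up like any other.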

PS still dunno what's going on with the GC here




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-09  9:21         ` Andrea Corallo
@ 2022-08-09  9:48           ` Po Lu
  2022-08-09 10:03             ` Andrea Corallo
  2022-08-09 10:20           ` Lynn Winebarger
                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 47+ messages in thread
From: Po Lu @ 2022-08-09  9:48 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Lars Ingebrigtsen, Joseph Mingrone, emacs-devel, emacs

Andrea Corallo <akrl@sdf.org> writes:

> PS still dunno what's going on with the GC here

It will remain conservative for the foreseeable future.




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-09  9:48           ` Po Lu
@ 2022-08-09 10:03             ` Andrea Corallo
  2022-08-09 10:10               ` Po Lu
  0 siblings, 1 reply; 47+ messages in thread
From: Andrea Corallo @ 2022-08-09 10:03 UTC (permalink / raw)
  To: Po Lu; +Cc: Lars Ingebrigtsen, Joseph Mingrone, emacs-devel, emacs

Po Lu <luangruo@yahoo.com> writes:

> Andrea Corallo <akrl@sdf.org> writes:
>
>> PS still dunno what's going on with the GC here
>
> It will remain conservative for the forseeable future.

I guess so; here I'm referring to the fact that being conservative on
the stack still seems not to be the root cause of the issue here.

  Andrea




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-09 10:03             ` Andrea Corallo
@ 2022-08-09 10:10               ` Po Lu
  0 siblings, 0 replies; 47+ messages in thread
From: Po Lu @ 2022-08-09 10:10 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Lars Ingebrigtsen, Joseph Mingrone, emacs-devel, emacs

Andrea Corallo <akrl@sdf.org> writes:

> I guess so, here I'm referring to the fact that being conservative on
> the stack still seams not to be the root cause of the issue here.
>
>   Andrea

Oh, okay.  Sorry for the noise then.




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-09  9:21         ` Andrea Corallo
  2022-08-09  9:48           ` Po Lu
@ 2022-08-09 10:20           ` Lynn Winebarger
  2022-08-09 11:16           ` Eli Zaretskii
  2022-08-09 15:32           ` Lars Ingebrigtsen
  3 siblings, 0 replies; 47+ messages in thread
From: Lynn Winebarger @ 2022-08-09 10:20 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Lars Ingebrigtsen, Joseph Mingrone, emacs-devel, emacs


On Tue, Aug 9, 2022, 5:22 AM Andrea Corallo <akrl@sdf.org> wrote:

> Andrea Corallo <akrl@sdf.org> writes:
>
> [...]
>
> > Just had some time to look into this further:
> >
> > Of all the CUs we are dumping two are not fixed-up in loadup.el before
> > dump because not referenced by any function.
> >
> > In particular looking at 'ediff-hook' it does contain only variable
> > definitions so this is correct.
> >
> > We do run a GC before dumping so we should unload these unreferenced CUs
> > before dump.  And as expected I don't see ediff-hook CU being marked but
> > we do not free it during sweep.
> >
> > It looks to me like a GC bug so far.  Unfortunatly I've very constrained
> > time to dedicate on this this week.
>
> Thinking about this... Maybe relying on the GC for this is not a very
> good idea in the first place.  If we are conservative on the stack my
> might always mark a CU accidentally and fall into the same issue.
>
> I think we should maintain a list of all loaded CUs so we can fix them
> up reliably.  If this is agreed not to be a bad idea I'll prepare a
> patch.


Just a heads up - when I was validating what was failing while dumping, I
tried printing the comp units before and after they were fixed up.  When
the comp unit has a cons cell in the name field, princ segfaults (at least
in 28.1).
I didn't report this as a bug because it would be very unusual for a user
to have access to comp units in this state.

Lynn







[-- Attachment #2: Type: text/html, Size: 2351 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-09  9:21         ` Andrea Corallo
  2022-08-09  9:48           ` Po Lu
  2022-08-09 10:20           ` Lynn Winebarger
@ 2022-08-09 11:16           ` Eli Zaretskii
  2022-08-17 19:59             ` Andrea Corallo
  2022-08-09 15:32           ` Lars Ingebrigtsen
  3 siblings, 1 reply; 47+ messages in thread
From: Eli Zaretskii @ 2022-08-09 11:16 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: larsi, jrm, emacs-devel, emacs

> From: Andrea Corallo <akrl@sdf.org>
> Cc: Joseph Mingrone <jrm@ftfl.ca>, emacs-devel@gnu.org, emacs@FreeBSD.org
> Date: Tue, 09 Aug 2022 09:21:11 +0000
> 
> > It looks to me like a GC bug so far.  Unfortunatly I've very constrained
> > time to dedicate on this this week.
> 
> Thinking about this... Maybe relying on the GC for this is not a very
> good idea in the first place.  If we are conservative on the stack my
> might always mark a CU accidentally and fall into the same issue.
> 
> I think we should maintain a list of all loaded CUs so we can fix them
> up reliably.  If this is agreed not to be a bad idea I'll prepare a
> patch.

I suggest postponing the decision until we have a good understanding
of what happens in this particular case and why it happens only in
32-bit builds.  Maybe we will decide to do what you suggest, but there
are likely other factors at work here, and it would be good to know
what they are.

Thanks.



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-09  9:21         ` Andrea Corallo
                             ` (2 preceding siblings ...)
  2022-08-09 11:16           ` Eli Zaretskii
@ 2022-08-09 15:32           ` Lars Ingebrigtsen
  3 siblings, 0 replies; 47+ messages in thread
From: Lars Ingebrigtsen @ 2022-08-09 15:32 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Joseph Mingrone, emacs-devel, emacs

Andrea Corallo <akrl@sdf.org> writes:

> Thinking about this... Maybe relying on the GC for this is not a very
> good idea in the first place.  If we are conservative on the stack my
> might always mark a CU accidentally and fall into the same issue.
>
> I think we should maintain a list of all loaded CUs so we can fix them
> up reliably.  If this is agreed not to be a bad idea I'll prepare a
> patch.

Relying on the GC is indeed inherently fragile, so maintaining an
explicit list sounds like a good idea in any case -- even if GC doesn't
turn out to be the culprit here.



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-09 11:16           ` Eli Zaretskii
@ 2022-08-17 19:59             ` Andrea Corallo
  2022-08-17 21:01               ` Andrea Corallo
  2022-08-18  5:17               ` Eli Zaretskii
  0 siblings, 2 replies; 47+ messages in thread
From: Andrea Corallo @ 2022-08-17 19:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, jrm, emacs-devel, emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Andrea Corallo <akrl@sdf.org>
>> Cc: Joseph Mingrone <jrm@ftfl.ca>, emacs-devel@gnu.org, emacs@FreeBSD.org
>> Date: Tue, 09 Aug 2022 09:21:11 +0000
>> 
>> > It looks to me like a GC bug so far.  Unfortunatly I've very constrained
>> > time to dedicate on this this week.
>> 
>> Thinking about this... Maybe relying on the GC for this is not a very
>> good idea in the first place.  If we are conservative on the stack my
>> might always mark a CU accidentally and fall into the same issue.
>> 
>> I think we should maintain a list of all loaded CUs so we can fix them
>> up reliably.  If this is agreed not to be a bad idea I'll prepare a
>> patch.
>
> I suggest to postpone the decision until we have a good understanding
> of what happens in this particular case and why it happens only in
> 32-bit builds.  Maybe we will decide what you suggest, but there are
> likely other factors at work here, and it would be good to know what
> they are.
>
> Thanks.

Okay, I had some time to work on this and this is what's going on:

After having loaded ediff-hook, temacs never sweeps vectors because,
even if we call `garbage-collect' before dumping, this is inhibited
because we overflowed purespace.

Interestingly, we warn about purespace overflow by calling
'check_pure_size' when dumping with unexec but not with pdumper.  Given
that the overflow makes the GC non-functional (at least in this phase),
I'm wondering if we shouldn't warn with pdumper as well.

Also, thinking about the whole system some more, I think fixing up only
the CUs reachable from named functions is definitely a bad idea for
another reason: lambdas!  We could have a lambda referenced somewhere
that keeps a CU loaded, and we need to fix it up anyway before dumping.

So yeah, I guess tomorrow I'll prepare the patch where we keep a list of
loaded CUs to fix up.

Best Regards

  Andrea
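
The difference between the two fix-up strategies discussed above can be
sketched with a toy model.  This is only an illustration under stated
assumptions, not the Emacs implementation: `CompUnit`, the two
`fixup_*` functions, `some-named-function` and the `lambdas-only` CU
are all hypothetical names.  Only `ediff-hook` comes from the thread.

```python
from dataclasses import dataclass


@dataclass
class CompUnit:
    """Toy stand-in for a loaded native compilation unit (CU)."""
    name: str
    fixed_up: bool = False


def fixup_via_named_functions(named_functions):
    """Old idea: fix up only CUs reachable from named function definitions."""
    for cu in named_functions.values():
        cu.fixed_up = True


def fixup_via_registry(loaded_cus):
    """Proposed idea: fix up every CU recorded in an explicit list at load time."""
    for cu in loaded_cus:
        cu.fixed_up = True


def build_image():
    """Return (list of all loaded CUs, mapping of named functions to their CU)."""
    subr = CompUnit("subr")
    ediff_hook = CompUnit("ediff-hook")      # only variable definitions
    lambdas_only = CompUnit("lambdas-only")  # hypothetical: reachable via a lambda only
    loaded = [subr, ediff_hook, lambdas_only]
    named = {"some-named-function": subr}    # hypothetical function name
    return loaded, named


def unfixed(loaded_cus):
    """Names of CUs that would be dumped without having been fixed up."""
    return [cu.name for cu in loaded_cus if not cu.fixed_up]


if __name__ == "__main__":
    loaded, named = build_image()
    fixup_via_named_functions(named)
    print("missed by named-function walk:", unfixed(loaded))

    loaded, named = build_image()
    fixup_via_registry(loaded)
    print("missed by explicit list:", unfixed(loaded))
```

In this model the named-function walk is only correct if every CU it
misses has already been unloaded by the GC, which is the dependency the
explicit list removes.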



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-17 19:59             ` Andrea Corallo
@ 2022-08-17 21:01               ` Andrea Corallo
  2022-08-18  5:30                 ` Eli Zaretskii
  2022-08-18  5:17               ` Eli Zaretskii
  1 sibling, 1 reply; 47+ messages in thread
From: Andrea Corallo @ 2022-08-17 21:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, jrm, emacs-devel, emacs

Andrea Corallo <akrl@sdf.org> writes:

[...]

> Okay, I had some time to work on this and this is what's going:
>
> After having loaded ediff-hooks temacs never sweeps vectors because,
> even if call `garbage-collect' before dumping, this is inhibited cause
> we overflowed purespace.
>
> Interestingly we warn for purespace overflow calling 'check_pure_size'
> when dumping with unexec and not with pdumper.  Given this makes the GC
> not functional (at least in this phase) I'm wondering if we shouldn't do
> this as well.
>
> Also, thinking about the whole system even better, I think fixing-up CUs
> reachable from named functions is definitely a bad for another reason
> that is lambdas!  We could have a lambda referenced somewhere that keeps
> a CU loaded and we need to fix it up anyway before dumping.
>
> So yeah I guess tomorrow I'll prepare the patch were we keep a list of
> loaded CU to fix-up.

Right, I pushed the fix to scratch/better-cu-fixup for now, as:

- I don't know if we want 1a637303b4 and 4bdda39f71 in master or 28.

- I suspect there's some good reason I'm not aware of why we don't
  do eb539e92e9 at all (this is not necessary to fix the reported issue
  though).

Bests!

  Andrea



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-17 19:59             ` Andrea Corallo
  2022-08-17 21:01               ` Andrea Corallo
@ 2022-08-18  5:17               ` Eli Zaretskii
  2022-08-18  7:59                 ` Andrea Corallo
  1 sibling, 1 reply; 47+ messages in thread
From: Eli Zaretskii @ 2022-08-18  5:17 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: larsi, jrm, emacs-devel, emacs

> From: Andrea Corallo <akrl@sdf.org>
> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
> Date: Wed, 17 Aug 2022 19:59:59 +0000
> 
> Okay, I had some time to work on this and this is what's going:
> 
> After having loaded ediff-hooks temacs never sweeps vectors because,
> even if call `garbage-collect' before dumping, this is inhibited cause
> we overflowed purespace.
> 
> Interestingly we warn for purespace overflow calling 'check_pure_size'
> when dumping with unexec and not with pdumper.  Given this makes the GC
> not functional (at least in this phase) I'm wondering if we shouldn't do
> this as well.

I always thought that pure-space overflow with pdumper doesn't matter,
that's why we don't warn.  You seem to be saying that it does matter?
By "makes GC not functional" do you mean during dumping, or do you
mean after restarting Emacs with the pdumper file?

> Also, thinking about the whole system even better, I think fixing-up CUs
> reachable from named functions is definitely a bad for another reason
> that is lambdas!  We could have a lambda referenced somewhere that keeps
> a CU loaded and we need to fix it up anyway before dumping.
> 
> So yeah I guess tomorrow I'll prepare the patch were we keep a list of
> loaded CU to fix-up.

Thanks, but why does this only affect 32-bit builds?



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-17 21:01               ` Andrea Corallo
@ 2022-08-18  5:30                 ` Eli Zaretskii
  2022-08-18  8:06                   ` Andrea Corallo
  2022-08-18 13:40                   ` Stefan Monnier
  0 siblings, 2 replies; 47+ messages in thread
From: Eli Zaretskii @ 2022-08-18  5:30 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: larsi, jrm, emacs-devel, emacs

> From: Andrea Corallo <akrl@sdf.org>
> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
> Date: Wed, 17 Aug 2022 21:01:48 +0000
> 
> > So yeah I guess tomorrow I'll prepare the patch were we keep a list of
> > loaded CU to fix-up.
> 
> Right I pushed the fix to scratch/better-cu-fixup so far as:
> 
> - I don't know if we want 1a637303b4 and 4bdda39f71 in master or 28.

The problem doesn't exist on emacs-28, does it?  I use a 32-bit build
of that branch all the time, including Emacs 28.1 and the pretests of
Emacs 28.2, and never had any problems.

If the problem doesn't exist on the release branch, I'd prefer to
leave it alone, as these changes are not entirely trivial, although
look quite simple.

> - I suspect there's some good reason I'm not aware of why we don't
>   eb539e92e9 at all (this is not necessary to fix the reported issue
>   tho).

Like I said earlier, I always thought that this problem doesn't affect
the pdumper builds.  Perhaps that's not true with native-compilation?



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18  5:17               ` Eli Zaretskii
@ 2022-08-18  7:59                 ` Andrea Corallo
  2022-08-18  8:14                   ` Eli Zaretskii
  0 siblings, 1 reply; 47+ messages in thread
From: Andrea Corallo @ 2022-08-18  7:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, jrm, emacs-devel, emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Andrea Corallo <akrl@sdf.org>
>> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
>> Date: Wed, 17 Aug 2022 19:59:59 +0000
>> 
>> Okay, I had some time to work on this and this is what's going:
>> 
>> After having loaded ediff-hooks temacs never sweeps vectors because,
>> even if call `garbage-collect' before dumping, this is inhibited cause
>> we overflowed purespace.
>> 
>> Interestingly we warn for purespace overflow calling 'check_pure_size'
>> when dumping with unexec and not with pdumper.  Given this makes the GC
>> not functional (at least in this phase) I'm wondering if we shouldn't do
>> this as well.
>
> I always thought that pure-space overflow with pdumper doesn't matter,
> that's why we don't warn.  You seem to be saying that it does matter?

It certainly does, at least in temacs, if we overflow the purespace
when using purecopy.

> By "makes GC not functional" do you mean during dumping, or do you
> mean after restarting Emacs with the pdumper file?

The former: during dumping.  I believe the produced Emacs is functional.

>> Also, thinking about the whole system even better, I think fixing-up CUs
>> reachable from named functions is definitely a bad for another reason
>> that is lambdas!  We could have a lambda referenced somewhere that keeps
>> a CU loaded and we need to fix it up anyway before dumping.
>> 
>> So yeah I guess tomorrow I'll prepare the patch were we keep a list of
>> loaded CU to fix-up.
>
> Thanks, but why does this only affect 32-bit builds?

That's a good question, I guess for some reason we overflowed only on
the 32-bit builds?

Thanks

  Andrea



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18  5:30                 ` Eli Zaretskii
@ 2022-08-18  8:06                   ` Andrea Corallo
  2022-08-18  8:15                     ` Eli Zaretskii
                                       ` (2 more replies)
  2022-08-18 13:40                   ` Stefan Monnier
  1 sibling, 3 replies; 47+ messages in thread
From: Andrea Corallo @ 2022-08-18  8:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, jrm, emacs-devel, emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Andrea Corallo <akrl@sdf.org>
>> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
>> Date: Wed, 17 Aug 2022 21:01:48 +0000
>> 
>> > So yeah I guess tomorrow I'll prepare the patch were we keep a list of
>> > loaded CU to fix-up.
>> 
>> Right I pushed the fix to scratch/better-cu-fixup so far as:
>> 
>> - I don't know if we want 1a637303b4 and 4bdda39f71 in master or 28.
>
> The problem doesn't exist on emacs-28, does it?

AFAIK we never got a report of it.

> I use a 32-bit build
> of that branch all the time, including Emacs 28.1 and the pretests of
> Emacs 28.2, and never had any problems.
>
> If the problem doesn't exist on the release branch, I'd prefer to
> leave it alone, as these changes are not entirely trivial, although
> look quite simple.

I agree.

>> - I suspect there's some good reason I'm not aware of why we don't
>>   eb539e92e9 at all (this is not necessary to fix the reported issue
>>   tho).
>
> Like I said earlier, I always thought that this problem doesn't affect
> the pdumper builds.  Perhaps that's not true with native-compilation?

Certainly native-compilation was relying more on the GC than the
standard build for the discussed mechanism.  OTOH I'm not sure how
serious an issue it is to potentially have a non-functional GC in temacs.

But it is also worth considering that, given almost no one is building
with unexec nowadays, if we don't monitor for purespace overflow in
pdumper we'll overflow from time to time without noticing it.

Regards

  Andrea



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18  7:59                 ` Andrea Corallo
@ 2022-08-18  8:14                   ` Eli Zaretskii
  2022-08-18  9:06                     ` Andrea Corallo
  0 siblings, 1 reply; 47+ messages in thread
From: Eli Zaretskii @ 2022-08-18  8:14 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: larsi, jrm, emacs-devel, emacs

> From: Andrea Corallo <akrl@sdf.org>
> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
> Date: Thu, 18 Aug 2022 07:59:22 +0000
> 
> > Thanks, but why does this only affect 32-bit builds?
> 
> That's a good question, I guess for some reason we overflowed only on
> the 32-bit builds?

That's unlikely to happen, AFAIU.  It's more likely to be the other
way around: the 64-bit builds overflow sooner, due to wider data
types.



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18  8:06                   ` Andrea Corallo
@ 2022-08-18  8:15                     ` Eli Zaretskii
  2022-08-18  9:08                       ` Andrea Corallo
  2022-08-18  8:31                     ` Po Lu
  2022-08-18 11:48                     ` Joseph Mingrone
  2 siblings, 1 reply; 47+ messages in thread
From: Eli Zaretskii @ 2022-08-18  8:15 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: larsi, jrm, emacs-devel, emacs

> From: Andrea Corallo <akrl@sdf.org>
> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
> Date: Thu, 18 Aug 2022 08:06:43 +0000
> 
> But is also worth considering that, given almost no-one is building with
> unexec now days, if we don't monitor for purespace overflow in pdumper
> we'll regularly overflow from time to time without noticing it.

I build the unexec version once a week.



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18  8:06                   ` Andrea Corallo
  2022-08-18  8:15                     ` Eli Zaretskii
@ 2022-08-18  8:31                     ` Po Lu
  2022-08-18 11:48                     ` Joseph Mingrone
  2 siblings, 0 replies; 47+ messages in thread
From: Po Lu @ 2022-08-18  8:31 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Eli Zaretskii, larsi, jrm, emacs-devel, emacs

Andrea Corallo <akrl@sdf.org> writes:

> But is also worth considering that, given almost no-one is building with
> unexec now days, if we don't monitor for purespace overflow in pdumper
> we'll regularly overflow from time to time without noticing it.

I am, for two reasons: the unexec build is easier to work with during
development, since it doesn't leave lots of pdump files around, and the
MS-DOS build doesn't support anything else.



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18  8:14                   ` Eli Zaretskii
@ 2022-08-18  9:06                     ` Andrea Corallo
  2022-08-18  9:45                       ` Eli Zaretskii
  0 siblings, 1 reply; 47+ messages in thread
From: Andrea Corallo @ 2022-08-18  9:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, jrm, emacs-devel, emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Andrea Corallo <akrl@sdf.org>
>> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
>> Date: Thu, 18 Aug 2022 07:59:22 +0000
>> 
>> > Thanks, but why does this only affect 32-bit builds?
>> 
>> That's a good question, I guess for some reason we overflowed only on
>> the 32-bit builds?
>
> That's unlikely to happen, AFAIU.  It's more likely to be the other
> way around: the 64-bit builds overflow sooner, due to wider data
> types.

That's correct, but it's not that simple.  I also see that pure_size
depends on many factors.  E.g. I have it as 2000000 on the 32-bit build
and 3333333 on the 64-bit one.

What I see comparing the two builds (my testbed is ATM on aff5961274) is
that we overflow on both, but on the 64-bit one we do it a little later
in the execution, so the GC has the chance to collect ediff-hook before
we overflow purespace.

I pushed the fix for the nativecomp side to master, since we understood
the mechanism needed improvement.

I'll let the maintainers decide about the purespace overflow warning one.

Thanks!

  Andrea
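
The timing argument above can be illustrated with a small simulation.
It is a toy model under loud assumptions: sizes are in arbitrary units,
the event order is invented, and `run_temacs` is a hypothetical helper,
not Emacs code.  The only behaviour taken from the thread is that once
purespace has overflowed, garbage collection is inhibited.

```python
def run_temacs(pure_size, events):
    """Replay load-time events and return the CUs still loaded at dump time.

    events is a list of (kind, arg) pairs:
      ("purecopy", n)              allocate n units of purespace
      ("load_unreferenced_cu", s)  load a CU named s that nothing references
      ("gc", None)                 run a garbage collection
    """
    used = 0
    overflowed = False
    loaded = set()        # CUs currently loaded
    unreferenced = set()  # loaded CUs that nothing points to
    for kind, arg in events:
        if kind == "purecopy":
            used += arg
            if used > pure_size:
                overflowed = True
        elif kind == "load_unreferenced_cu":
            loaded.add(arg)
            unreferenced.add(arg)
        elif kind == "gc":
            if overflowed:
                continue  # GC is inhibited after a purespace overflow
            loaded -= unreferenced
            unreferenced.clear()
    return loaded


# Invented event order: some purecopy, load ediff-hook, GC, more purecopy, GC.
EVENTS = [
    ("purecopy", 1500),
    ("load_unreferenced_cu", "ediff-hook"),
    ("gc", None),
    ("purecopy", 1000),
    ("gc", None),
]

if __name__ == "__main__":
    # Overflow happens late (after the first GC): the CU is collected.
    print("late overflow:", run_temacs(2000, EVENTS))
    # Overflow happens early (before any GC): the CU survives until the dump.
    print("early overflow:", run_temacs(1000, EVENTS))
```

Both runs overflow, but only the early one leaves the unreferenced CU
loaded at dump time, which matches the 32-bit versus 64-bit difference
described above.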



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18  8:15                     ` Eli Zaretskii
@ 2022-08-18  9:08                       ` Andrea Corallo
  0 siblings, 0 replies; 47+ messages in thread
From: Andrea Corallo @ 2022-08-18  9:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, jrm, emacs-devel, emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Andrea Corallo <akrl@sdf.org>
>> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
>> Date: Thu, 18 Aug 2022 08:06:43 +0000
>> 
>> But is also worth considering that, given almost no-one is building with
>> unexec now days, if we don't monitor for purespace overflow in pdumper
>> we'll regularly overflow from time to time without noticing it.
>
> I build the unexec version once a week.

Sorry for putting you in the "almost no-one" category :)  Still my
opinion is that one build can't give us the coverage we want if we care
about not overflowing purespace.

Thanks

  Andrea



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18  9:06                     ` Andrea Corallo
@ 2022-08-18  9:45                       ` Eli Zaretskii
  2022-08-18  9:57                         ` Andrea Corallo
  0 siblings, 1 reply; 47+ messages in thread
From: Eli Zaretskii @ 2022-08-18  9:45 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: larsi, jrm, emacs-devel, emacs

> From: Andrea Corallo <akrl@sdf.org>
> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
> Date: Thu, 18 Aug 2022 09:06:14 +0000
> 
> What I see comparing the two builds (my testbed is ATM on aff5961274) is
> that we overflow on both, but on the 64bit we do it a little later in
> the execution so the GC has the chance to collect ediff-hook before we
> overflow purespace.

That's strange, because I just built the unexec build on a 64-bit
system, and it didn't overflow for me.

Does it overflow on your system during bootstrap, i.e. when it loads
all the Lisp packages in source form?  Or does it overflow when it
loads the *.elc byte-compiled files?  Or is this a native-comp build,
and it overflows when loading the *.eln files?  Or did you discover
the overflow via some method that is not part of the standard build?

> I pushed the fix for the nativecomp side to master as we understood the
> mechanism needed improvement.

Thanks.



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18  9:45                       ` Eli Zaretskii
@ 2022-08-18  9:57                         ` Andrea Corallo
  2022-08-18 10:31                           ` Eli Zaretskii
  0 siblings, 1 reply; 47+ messages in thread
From: Andrea Corallo @ 2022-08-18  9:57 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, jrm, emacs-devel, emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Andrea Corallo <akrl@sdf.org>
>> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
>> Date: Thu, 18 Aug 2022 09:06:14 +0000
>> 
>> What I see comparing the two builds (my testbed is ATM on aff5961274) is
>> that we overflow on both, but on the 64bit we do it a little later in
>> the execution so the GC has the chance to collect ediff-hook before we
>> overflow purespace.
>
> That's strange, because I just built the unexec build on a 64-bit
> system, and it didn't overflow for me.
>
> Does it overflow on your system during bootstrap, i.e. when it loads
> all the Lisp packages in source form?  Or does it overflow when it
> loads the *.elc byte-compiled files?  Or is this a native-comp build,
> and it overflows when loading the *.eln files?

Mine is a build with native compilation.  There are many variables in
play, and indeed native compilation might be one of the main causes of
the higher purespace use here.

It does overflow during bootstrap after having loaded the eln files.

> Or did you discover
> the overflow via some method that is nor part of the standard build?

That's a regular make bootstrap configured with '--without-x
--with-native-compilation'.

Bests

  Andrea



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18  9:57                         ` Andrea Corallo
@ 2022-08-18 10:31                           ` Eli Zaretskii
  2022-08-18 11:08                             ` Andrea Corallo
  0 siblings, 1 reply; 47+ messages in thread
From: Eli Zaretskii @ 2022-08-18 10:31 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: larsi, jrm, emacs-devel, emacs

> From: Andrea Corallo <akrl@sdf.org>
> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
> Date: Thu, 18 Aug 2022 09:57:32 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> What I see comparing the two builds (my testbed is ATM on aff5961274) is
> >> that we overflow on both, but on the 64bit we do it a little later in
> >> the execution so the GC has the chance to collect ediff-hook before we
> >> overflow purespace.
> >
> > That's strange, because I just built the unexec build on a 64-bit
> > system, and it didn't overflow for me.
> >
> > Does it overflow on your system during bootstrap, i.e. when it loads
> > all the Lisp packages in source form?  Or does it overflow when it
> > loads the *.elc byte-compiled files?  Or is this a native-comp build,
> > and it overflows when loading the *.eln files?
> 
> Mine is a build with native compilation, there are many variables into
> play and indeed native compilation might be one of the main responsible
> for the higher use the purespace here.
> 
> It does overflow during bootstrap after having loaded the eln files.

How many more bytes do you need to avoid overflowing?

I guess we will need to enlarge SYSTEM_PURESIZE_EXTRA in the
native-comp build.  The answer to the question above will let us figure
out by how much to enlarge it.



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18 10:31                           ` Eli Zaretskii
@ 2022-08-18 11:08                             ` Andrea Corallo
  2022-08-18 13:08                               ` Eli Zaretskii
  0 siblings, 1 reply; 47+ messages in thread
From: Andrea Corallo @ 2022-08-18 11:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, jrm, emacs-devel, emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Andrea Corallo <akrl@sdf.org>
>> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
>> Date: Thu, 18 Aug 2022 09:57:32 +0000
>> 
>> Eli Zaretskii <eliz@gnu.org> writes:
>> 
>> >> What I see comparing the two builds (my testbed is ATM on aff5961274) is
>> >> that we overflow on both, but on the 64bit we do it a little later in
>> >> the execution so the GC has the chance to collect ediff-hook before we
>> >> overflow purespace.
>> >
>> > That's strange, because I just built the unexec build on a 64-bit
>> > system, and it didn't overflow for me.
>> >
>> > Does it overflow on your system during bootstrap, i.e. when it loads
>> > all the Lisp packages in source form?  Or does it overflow when it
>> > loads the *.elc byte-compiled files?  Or is this a native-comp build,
>> > and it overflows when loading the *.eln files?
>> 
>> Mine is a build with native compilation, there are many variables into
>> play and indeed native compilation might be one of the main responsible
>> for the higher use the purespace here.
>> 
>> It does overflow during bootstrap after having loaded the eln files.
>
> How many more bytes do you need to avoid overflowing?

On 64bit I get:

emacs:0:Pure Lisp storage overflow (approx. 3366891 bytes needed)

On 32:

emacs:0:Pure Lisp storage overflow (approx. 2549794 bytes needed)

  Andrea
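
For a rough sense of scale, the two messages above can be compared with
the pure_size values reported earlier in this thread (2000000 on the
32-bit build, 3333333 on the 64-bit one).  The sketch below assumes
that the "bytes needed" figure is the total purespace required rather
than the missing amount, and that both sets of numbers come from the
same builds.  `shortfall` is a hypothetical helper.

```python
# Figures quoted in this thread; the pairing of the two sets is an assumption.
REPORTS = {
    "64-bit": {"pure_size": 3333333, "needed": 3366891},
    "32-bit": {"pure_size": 2000000, "needed": 2549794},
}


def shortfall(pure_size, needed):
    """Bytes missing, assuming `needed` is the total purespace required."""
    return needed - pure_size


if __name__ == "__main__":
    for build, r in REPORTS.items():
        missing = shortfall(r["pure_size"], r["needed"])
        percent = 100.0 * missing / r["pure_size"]
        print(f"{build}: short by {missing} bytes ({percent:.1f}% of pure_size)")
```

Under those assumptions the 64-bit build misses its limit by a small
margin while the 32-bit build misses it by a much larger one, which
would be consistent with the 32-bit build overflowing earlier during
loadup.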



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18  8:06                   ` Andrea Corallo
  2022-08-18  8:15                     ` Eli Zaretskii
  2022-08-18  8:31                     ` Po Lu
@ 2022-08-18 11:48                     ` Joseph Mingrone
  2 siblings, 0 replies; 47+ messages in thread
From: Joseph Mingrone @ 2022-08-18 11:48 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Eli Zaretskii, larsi, emacs-devel, emacs

On Thu, 2022-08-18 at 08:06, Andrea Corallo <akrl@sdf.org> wrote:

> Eli Zaretskii <eliz@gnu.org> writes:

>>> From: Andrea Corallo <akrl@sdf.org>
>>> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
>>> Date: Wed, 17 Aug 2022 21:01:48 +0000

>>> > So yeah I guess tomorrow I'll prepare the patch were we keep a list of
>>> > loaded CU to fix-up.

>>> Right I pushed the fix to scratch/better-cu-fixup so far as:

>>> - I don't know if we want 1a637303b4 and 4bdda39f71 in master or 28.

>> The problem doesn't exist on emacs-28, does it?

> AFAIK we never got a report of it.

>> I use a 32-bit build
>> of that branch all the time, including Emacs 28.1 and the pretests of
>> Emacs 28.2, and never had any problems.

>> If the problem doesn't exist on the release branch, I'd prefer to
>> leave it alone, as these changes are not entirely trivial, although
>> look quite simple.

> I agree.

>>> - I suspect there's some good reason I'm not aware of why we don't
>>>   eb539e92e9 at all (this is not necessary to fix the reported issue
>>>   tho).

>> Like I said earlier, I always thought that this problem doesn't affect
>> the pdumper builds.  Perhaps that's not true with native-compilation?

> Certainly native-compilation was relying more on the GC than the
> standard build for the discussed mechanism, OTOH I'm not sure how
> potentially having the GC not functional in temacs is a serious issue.

> But is also worth considering that, given almost no-one is building with
> unexec now days, if we don't monitor for purespace overflow in pdumper
> we'll regularly overflow from time to time without noticing it.

FWIW, 32-bit builds started completing again (on FreeBSD 13.0) in the last week or so.

http://pkg.ftfl.ca/data/latest-per-pkg/emacs-devel/29.0.50.20220811%2C2/13i386-default.log



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18 11:08                             ` Andrea Corallo
@ 2022-08-18 13:08                               ` Eli Zaretskii
  2022-08-18 14:09                                 ` Andrea Corallo
  0 siblings, 1 reply; 47+ messages in thread
From: Eli Zaretskii @ 2022-08-18 13:08 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: larsi, jrm, emacs-devel, emacs

> From: Andrea Corallo <akrl@sdf.org>
> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
> Date: Thu, 18 Aug 2022 11:08:32 +0000
> 
> On 64bit I get:
> 
> emacs:0:Pure Lisp storage overflow (approx. 3366891 bytes needed)
> 
> On 32:
> 
> emacs:0:Pure Lisp storage overflow (approx. 2549794 bytes needed)

That's soooo strange!  If I start Emacs under GDB and print the value
of PURESIZE, I get 6000000 bytes in a 64-bit build and 4480000 bytes
in a 32-bit build --with-wide-int.  What values do you see?

Maybe the problem happens only in --without-x builds?



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18  5:30                 ` Eli Zaretskii
  2022-08-18  8:06                   ` Andrea Corallo
@ 2022-08-18 13:40                   ` Stefan Monnier
  2022-08-18 13:47                     ` Lynn Winebarger
  2022-08-18 14:49                     ` Andrea Corallo
  1 sibling, 2 replies; 47+ messages in thread
From: Stefan Monnier @ 2022-08-18 13:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Andrea Corallo, larsi, jrm, emacs-devel, emacs

>> - I suspect there's some good reason I'm not aware of why we don't
>>   eb539e92e9 at all (this is not necessary to fix the reported issue
>>   tho).
>
> Like I said earlier, I always thought that this problem doesn't affect
> the pdumper builds.  Perhaps that's not true with native-compilation?

I can't see any good reason not to warn about purespace overflow,
regardless of whether it leads to misbehavior or not: we clearly do
want to size the purespace to avoid overflow.


        Stefan




^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18 13:40                   ` Stefan Monnier
@ 2022-08-18 13:47                     ` Lynn Winebarger
  2022-08-18 14:49                     ` Andrea Corallo
  1 sibling, 0 replies; 47+ messages in thread
From: Lynn Winebarger @ 2022-08-18 13:47 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Eli Zaretskii, Andrea Corallo, Lars Ingebrigtsen, Joseph Mingrone,
	emacs-devel, emacs

[-- Attachment #1: Type: text/plain, Size: 691 bytes --]

On Thu, Aug 18, 2022, 9:42 AM Stefan Monnier <monnier@iro.umontreal.ca>
wrote:

> >> - I suspect there's some good reason I'm not aware of why we don't
> >>   eb539e92e9 at all (this is not necessary to fix the reported issue
> >>   tho).
> >
> > Like I said earlier, I always thought that this problem doesn't affect
> > the pdumper builds.  Perhaps that's not true with native-compilation?
>
> I can't see any good reason not to warn about purespace overflow,
> regardless if it leads to misbehavior or not: we clearly do want to size
> the purespace to avoid overflow.

See
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=46916
or
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=56793


Lynn



* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18 13:08                               ` Eli Zaretskii
@ 2022-08-18 14:09                                 ` Andrea Corallo
  2022-08-18 14:22                                   ` Eli Zaretskii
  0 siblings, 1 reply; 47+ messages in thread
From: Andrea Corallo @ 2022-08-18 14:09 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, jrm, emacs-devel, emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Andrea Corallo <akrl@sdf.org>
>> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
>> Date: Thu, 18 Aug 2022 11:08:32 +0000
>> 
>> On 64bit I get:
>> 
>> emacs:0:Pure Lisp storage overflow (approx. 3366891 bytes needed)
>> 
>> On 32:
>> 
>> emacs:0:Pure Lisp storage overflow (approx. 2549794 bytes needed)
>
> That's soooo strange!  If I start Emacs under GDB and print the value
> of PURESIZE, I get 6000000 bytes in a 64-bit build and 4480000 bytes
> in a 32-bit build --with-wide-int.  What values do you see?
>
> Maybe the problem happens only in --without-x builds?

I get 2000000 on the 32bit build and 3333333 on 64 bit.  Both are indeed
--without-x.  According to a comment in puresize.h, this has an effect.

  Andrea




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18 14:09                                 ` Andrea Corallo
@ 2022-08-18 14:22                                   ` Eli Zaretskii
  2022-08-18 14:50                                     ` Andrea Corallo
  0 siblings, 1 reply; 47+ messages in thread
From: Eli Zaretskii @ 2022-08-18 14:22 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: larsi, jrm, emacs-devel, emacs

> From: Andrea Corallo <akrl@sdf.org>
> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
> Date: Thu, 18 Aug 2022 14:09:45 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> From: Andrea Corallo <akrl@sdf.org>
> >> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
> >> Date: Thu, 18 Aug 2022 11:08:32 +0000
> >> 
> >> On 64bit I get:
> >> 
> >> emacs:0:Pure Lisp storage overflow (approx. 3366891 bytes needed)
> >> 
> >> On 32:
> >> 
> >> emacs:0:Pure Lisp storage overflow (approx. 2549794 bytes needed)
> >
> > That's soooo strange!  If I start Emacs under GDB and print the value
> > of PURESIZE, I get 6000000 bytes in a 64-bit build and 4480000 bytes
> > in a 32-bit build --with-wide-int.  What values do you see?
> >
> > Maybe the problem happens only in --without-x builds?
> 
> I get 2000000 on the 32bit build and 3333333 on 64 bit.  Both are indeed
> --without-x.  According to a comment in puresize.h, this has an effect.

What is the value of SYSTEM_PURESIZE_EXTRA in both cases?




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18 13:40                   ` Stefan Monnier
  2022-08-18 13:47                     ` Lynn Winebarger
@ 2022-08-18 14:49                     ` Andrea Corallo
  1 sibling, 0 replies; 47+ messages in thread
From: Andrea Corallo @ 2022-08-18 14:49 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, larsi, jrm, emacs-devel, emacs

[-- Attachment #1: Type: text/plain, Size: 823 bytes --]

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>> - I suspect there's some good reason I'm not aware of why we don't
>>>   eb539e92e9 at all (this is not necessary to fix the reported issue
>>>   tho).
>>
>> Like I said earlier, I always thought that this problem doesn't affect
>> the pdumper builds.  Perhaps that's not true with native-compilation?
>
> I can't see any good reason not to warn about purespace overflow,
> regardless if it leads to misbehavior or not: we clearly do want to size
> the purespace to avoid overflow.

I agree.

And I think we'd better also install the attached patch (in addition to
my other suggested patch).  It warns not only at the end, when dumping,
but also at the moment the overflow happens.  This helps with debugging
a crash or analyzing the dynamics of an issue.

  Andrea


[-- Attachment #2: 0001-src-alloc.c-pure_alloc-Warn-for-pure-space-overflow.patch --]
[-- Type: text/x-diff, Size: 972 bytes --]

From 89e034a7acef12ad187b95a8b45970c89fdc9e0b Mon Sep 17 00:00:00 2001
From: Andrea Corallo <akrl@sdf.org>
Date: Thu, 18 Aug 2022 16:41:26 +0200
Subject: [PATCH] * src/alloc.c (pure_alloc): Warn for pure space overflow

---
 src/alloc.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/src/alloc.c b/src/alloc.c
index 2ffee9f729..34bedac36b 100644
--- a/src/alloc.c
+++ b/src/alloc.c
@@ -5314,6 +5314,7 @@ valid_lisp_object_p (Lisp_Object obj)
 pure_alloc (size_t size, int type)
 {
   void *result;
+  static bool pure_overflow_warned = false;
 
  again:
   if (type >= 0)
@@ -5338,6 +5339,12 @@ pure_alloc (size_t size, int type)
   if (pure_bytes_used <= pure_size)
     return result;
 
+  if (!pure_overflow_warned)
+    {
+      message ("Pure Lisp storage overflowed");
+      pure_overflow_warned = true;
+    }
+
   /* Don't allocate a large amount here,
      because it might get mmap'd and then its address
      might not be usable.  */
-- 
2.25.1
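
The warn-once pattern used in the patch above can be tried in isolation.
The following is a toy sketch, not Emacs code: the names toy_pure_alloc,
TOY_PURE_SIZE, toy_pure_bytes_used and toy_warnings_emitted are
hypothetical, and fprintf stands in for the message call in the patch.  It
only illustrates how a function-local static flag limits the warning to
the first overflow.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the real purespace bookkeeping.  */
enum { TOY_PURE_SIZE = 64 };
static size_t toy_pure_bytes_used;
static int toy_warnings_emitted;

/* Account for SIZE bytes in the toy pure area.  Return true if the
   request still fits, false once the area has overflowed.  The
   warning is printed only on the first overflow.  */
static bool
toy_pure_alloc (size_t size)
{
  /* Function-local static: keeps its value across calls.  */
  static bool overflow_warned = false;

  toy_pure_bytes_used += size;
  if (toy_pure_bytes_used <= TOY_PURE_SIZE)
    return true;

  if (!overflow_warned)
    {
      fprintf (stderr, "Toy pure storage overflowed\n");
      overflow_warned = true;
      toy_warnings_emitted++;
    }
  return false;
}
```

Calling toy_pure_alloc (60) succeeds; each later call with a size of 10
reports an overflow, but only the first of them prints the warning.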



* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18 14:22                                   ` Eli Zaretskii
@ 2022-08-18 14:50                                     ` Andrea Corallo
  2022-08-18 15:57                                       ` Eli Zaretskii
  0 siblings, 1 reply; 47+ messages in thread
From: Andrea Corallo @ 2022-08-18 14:50 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, jrm, emacs-devel, emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Andrea Corallo <akrl@sdf.org>
>> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
>> Date: Thu, 18 Aug 2022 14:09:45 +0000
>> 
>> Eli Zaretskii <eliz@gnu.org> writes:
>> 
>> >> From: Andrea Corallo <akrl@sdf.org>
>> >> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
>> >> Date: Thu, 18 Aug 2022 11:08:32 +0000
>> >> 
>> >> On 64bit I get:
>> >> 
>> >> emacs:0:Pure Lisp storage overflow (approx. 3366891 bytes needed)
>> >> 
>> >> On 32:
>> >> 
>> >> emacs:0:Pure Lisp storage overflow (approx. 2549794 bytes needed)
>> >
>> > That's soooo strange!  If I start Emacs under GDB and print the value
>> > of PURESIZE, I get 6000000 bytes in a 64-bit build and 4480000 bytes
>> > in a 32-bit build --with-wide-int.  What values do you see?
>> >
>> > Maybe the problem happens only in --without-x builds?
>> 
>> I get 2000000 on the 32bit build and 3333333 on 64 bit.  Both are indeed
>> --without-x.  According to a comment in puresize.h, this has an effect.
>
> What is the value of SYSTEM_PURESIZE_EXTRA in both cases?

Zero in both cases.

  Andrea




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18 14:50                                     ` Andrea Corallo
@ 2022-08-18 15:57                                       ` Eli Zaretskii
  2022-08-18 16:42                                         ` Andrea Corallo
  0 siblings, 1 reply; 47+ messages in thread
From: Eli Zaretskii @ 2022-08-18 15:57 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: larsi, jrm, emacs-devel, emacs

> From: Andrea Corallo <akrl@sdf.org>
> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
> Date: Thu, 18 Aug 2022 14:50:23 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> From: Andrea Corallo <akrl@sdf.org>
> >> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
> >> Date: Thu, 18 Aug 2022 14:09:45 +0000
> >> 
> >> Eli Zaretskii <eliz@gnu.org> writes:
> >> 
> >> >> From: Andrea Corallo <akrl@sdf.org>
> >> >> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
> >> >> Date: Thu, 18 Aug 2022 11:08:32 +0000
> >> >> 
> >> >> On 64bit I get:
> >> >> 
> >> >> emacs:0:Pure Lisp storage overflow (approx. 3366891 bytes needed)
> >> >> 
> >> >> On 32:
> >> >> 
> >> >> emacs:0:Pure Lisp storage overflow (approx. 2549794 bytes needed)
> >> >
> >> > That's soooo strange!  If I start Emacs under GDB and print the value
> >> > of PURESIZE, I get 6000000 bytes in a 64-bit build and 4480000 bytes
> >> > in a 32-bit build --with-wide-int.  What values do you see?
> >> >
> >> > Maybe the problem happens only in --without-x builds?
> >> 
> >> I get 2000000 on the 32bit build and 3333333 on 64 bit.  Both are indeed
> >> --without-x.  According to a comment in puresize.h, this has an effect.
> >
> > What is the value of SYSTEM_PURESIZE_EXTRA in both cases?
> 
> Zero in both cases.

I'm confused.  puresize.h says

  #define BASE_PURESIZE (2750000 + SYSTEM_PURESIZE_EXTRA + SITELOAD_PURESIZE_EXTRA)
  [...]
  #define PURESIZE  (BASE_PURESIZE * PURESIZE_RATIO * PURESIZE_CHECKING_RATIO)

So even if PURESIZE_RATIO and PURESIZE_CHECKING_RATIO are both 1, how
come you get 2000000 in the 32-bit build, when SYSTEM_PURESIZE_EXTRA
is zero?  I must be missing something.




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18 15:57                                       ` Eli Zaretskii
@ 2022-08-18 16:42                                         ` Andrea Corallo
  2022-08-18 17:11                                           ` Eli Zaretskii
  0 siblings, 1 reply; 47+ messages in thread
From: Andrea Corallo @ 2022-08-18 16:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, jrm, emacs-devel, emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Andrea Corallo <akrl@sdf.org>
>> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
>> Date: Thu, 18 Aug 2022 14:50:23 +0000
>> 
>> Eli Zaretskii <eliz@gnu.org> writes:
>> 
>> >> From: Andrea Corallo <akrl@sdf.org>
>> >> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
>> >> Date: Thu, 18 Aug 2022 14:09:45 +0000
>> >> 
>> >> Eli Zaretskii <eliz@gnu.org> writes:
>> >> 
>> >> >> From: Andrea Corallo <akrl@sdf.org>
>> >> >> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
>> >> >> Date: Thu, 18 Aug 2022 11:08:32 +0000
>> >> >> 
>> >> >> On 64bit I get:
>> >> >> 
>> >> >> emacs:0:Pure Lisp storage overflow (approx. 3366891 bytes needed)
>> >> >> 
>> >> >> On 32:
>> >> >> 
>> >> >> emacs:0:Pure Lisp storage overflow (approx. 2549794 bytes needed)
>> >> >
>> >> > That's soooo strange!  If I start Emacs under GDB and print the value
>> >> > of PURESIZE, I get 6000000 bytes in a 64-bit build and 4480000 bytes
>> >> > in a 32-bit build --with-wide-int.  What values do you see?
>> >> >
>> >> > Maybe the problem happens only in --without-x builds?
>> >> 
>> >> I get 2000000 on the 32bit build and 3333333 on 64 bit.  Both are indeed
>> >> --without-x.  According to a comment in puresize.h, this has an effect.
>> >
>> > What is the value of SYSTEM_PURESIZE_EXTRA in both cases?
>> 
>> Zero in both cases.
>
> I'm confused.  puresize.h says
>
>   #define BASE_PURESIZE (2750000 + SYSTEM_PURESIZE_EXTRA + SITELOAD_PURESIZE_EXTRA)
>   [...]
>   #define PURESIZE  (BASE_PURESIZE * PURESIZE_RATIO * PURESIZE_CHECKING_RATIO)
>
> So even if PURESIZE_RATIO and PURESIZE_CHECKING_RATIO are both 1, how
> come you get 2000000 in the 32-bit build, when SYSTEM_PURESIZE_EXTRA
> is zero?  I must be missing something.

It's 2000000 because, as mentioned, my testbed for this bug is based on
aff5961274 (master from around the time the bug was reported), so before
your e46668847d.  Your commit changed the constant we add when computing
BASE_PURESIZE from 2000000 to 2750000.

This also explains why Joseph reported that the build works again even
without my fixes on the nativecomp side.

Thanks

  Andrea




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18 16:42                                         ` Andrea Corallo
@ 2022-08-18 17:11                                           ` Eli Zaretskii
  2022-08-18 19:35                                             ` Andrea Corallo
  0 siblings, 1 reply; 47+ messages in thread
From: Eli Zaretskii @ 2022-08-18 17:11 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: larsi, jrm, emacs-devel, emacs

> From: Andrea Corallo <akrl@sdf.org>
> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
> Date: Thu, 18 Aug 2022 16:42:24 +0000
> 
> > So even if PURESIZE_RATIO and PURESIZE_CHECKING_RATIO are both 1, how
> > come you get 2000000 in the 32-bit build, when SYSTEM_PURESIZE_EXTRA
> > is zero?  I must be missing something.
> 
> It's 2000000 because, as mentioned, my testbed for this bug is based on
> aff5961274 (master from around the time the bug was reported), so before
> your e46668847d.  Your commit changed the constant we add when computing
> BASE_PURESIZE from 2000000 to 2750000.

Ah, okay.  So I guess the current values are already large enough, and
we don't need to do anything with this issue.
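
A rough arithmetic check of that conclusion, as a standalone C sketch
rather than code from puresize.h.  It assumes SYSTEM_PURESIZE_EXTRA and
SITELOAD_PURESIZE_EXTRA are zero and PURESIZE_CHECKING_RATIO is 1, and it
assumes a 64-bit word-size ratio of 10 / 6, which is only inferred here
from the reported 3333333 for a base of 2000000.  The helper names
scaled_puresize and fits are hypothetical.  The "bytes needed" figures
are the ones reported for the older testbed, so they may differ on
current master.

```c
#include <assert.h>
#include <stdbool.h>

/* Scale BASE by the ratio NUM / DEN with left-to-right integer
   arithmetic, the way an unparenthesized macro expansion such as
   BASE * 10 / 6 would be evaluated.  */
static long
scaled_puresize (long base, long num, long den)
{
  assert (den != 0);
  return base * num / den;
}

/* Return true if NEEDED bytes fit into AVAILABLE bytes.  */
static bool
fits (long needed, long available)
{
  return needed <= available;
}

/* Figures quoted in the thread.  */
enum
{
  OLD_BASE = 2000000,	/* constant before e46668847d */
  NEW_BASE = 2750000,	/* constant after e46668847d */
  NEEDED_32 = 2549794,	/* "approx. ... bytes needed" on 32-bit */
  NEEDED_64 = 3366891	/* "approx. ... bytes needed" on 64-bit */
};
```

Under these assumptions, scaled_puresize (OLD_BASE, 1, 1) is 2000000 and
scaled_puresize (OLD_BASE, 10, 6) is 3333333, both smaller than the
corresponding "bytes needed" figures, which matches the reported
overflows.  With NEW_BASE the results are 2750000 and 4583333, and both
figures fit.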




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18 17:11                                           ` Eli Zaretskii
@ 2022-08-18 19:35                                             ` Andrea Corallo
  2022-08-19  5:49                                               ` Eli Zaretskii
  0 siblings, 1 reply; 47+ messages in thread
From: Andrea Corallo @ 2022-08-18 19:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, jrm, emacs-devel, emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Andrea Corallo <akrl@sdf.org>
>> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
>> Date: Thu, 18 Aug 2022 16:42:24 +0000
>> 
>> > So even if PURESIZE_RATIO and PURESIZE_CHECKING_RATIO are both 1, how
>> > come you get 2000000 in the 32-bit build, when SYSTEM_PURESIZE_EXTRA
>> > is zero?  I must be missing something.
>> 
>> It's 2000000 because, as mentioned, my testbed for this bug is based on
>> aff5961274 (master from around the time the bug was reported), so before
>> your e46668847d.  Your commit changed the constant we add when computing
>> BASE_PURESIZE from 2000000 to 2750000.
>
> Ah, okay.  So I guess the current values are already large enough, and
> we don't need to do anything with this issue.

Agreed; what remains to be decided is whether to warn when purespace
overflows.

I really don't see why we should not do that, so I propose merging
a7abd8f235 and def6d57669 from scratch/pure-overflow-warn into master.

Thanks

  Andrea




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-18 19:35                                             ` Andrea Corallo
@ 2022-08-19  5:49                                               ` Eli Zaretskii
  2022-08-19  8:11                                                 ` Andrea Corallo
  0 siblings, 1 reply; 47+ messages in thread
From: Eli Zaretskii @ 2022-08-19  5:49 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: larsi, jrm, emacs-devel, emacs

> From: Andrea Corallo <akrl@sdf.org>
> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
> Date: Thu, 18 Aug 2022 19:35:01 +0000
> 
> Agreed; what remains to be decided is whether to warn when purespace
> overflows.
> 
> I really don't see why we should not do that, so I propose merging
> a7abd8f235 and def6d57669 from scratch/pure-overflow-warn into master.

I didn't object, did I?




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-19  5:49                                               ` Eli Zaretskii
@ 2022-08-19  8:11                                                 ` Andrea Corallo
  0 siblings, 0 replies; 47+ messages in thread
From: Andrea Corallo @ 2022-08-19  8:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, jrm, emacs-devel, emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Andrea Corallo <akrl@sdf.org>
>> Cc: larsi@gnus.org, jrm@ftfl.ca, emacs-devel@gnu.org, emacs@FreeBSD.org
>> Date: Thu, 18 Aug 2022 19:35:01 +0000
>> 
>> Agreed; what remains to be decided is whether to warn when purespace
>> overflows.
>> 
>> I really don't see why we should not do that, so I propose merging
>> a7abd8f235 and def6d57669 from scratch/pure-overflow-warn into master.
>
> I didn't object, did I?

In fact you didn't.  Nice, I've pushed the two commits.

Thanks

  Andrea




Thread overview: 47+ messages
2022-08-05  2:12 --with-native-compilation build failure on 32-bit systems Joseph Mingrone
2022-08-05 11:58 ` Lars Ingebrigtsen
2022-08-05 13:30   ` Andrea Corallo
2022-08-05 14:40     ` Andrea Corallo
2022-08-05 15:16       ` Lynn Winebarger
2022-08-08  7:44         ` Andrea Corallo
2022-08-08 10:22           ` Lynn Winebarger
2022-08-08 13:14             ` Andrea Corallo
2022-08-08 13:55               ` Lynn Winebarger
2022-08-08 14:13                 ` Andrea Corallo
2022-08-09  9:11       ` Andrea Corallo
2022-08-09  9:21         ` Andrea Corallo
2022-08-09  9:48           ` Po Lu
2022-08-09 10:03             ` Andrea Corallo
2022-08-09 10:10               ` Po Lu
2022-08-09 10:20           ` Lynn Winebarger
2022-08-09 11:16           ` Eli Zaretskii
2022-08-17 19:59             ` Andrea Corallo
2022-08-17 21:01               ` Andrea Corallo
2022-08-18  5:30                 ` Eli Zaretskii
2022-08-18  8:06                   ` Andrea Corallo
2022-08-18  8:15                     ` Eli Zaretskii
2022-08-18  9:08                       ` Andrea Corallo
2022-08-18  8:31                     ` Po Lu
2022-08-18 11:48                     ` Joseph Mingrone
2022-08-18 13:40                   ` Stefan Monnier
2022-08-18 13:47                     ` Lynn Winebarger
2022-08-18 14:49                     ` Andrea Corallo
2022-08-18  5:17               ` Eli Zaretskii
2022-08-18  7:59                 ` Andrea Corallo
2022-08-18  8:14                   ` Eli Zaretskii
2022-08-18  9:06                     ` Andrea Corallo
2022-08-18  9:45                       ` Eli Zaretskii
2022-08-18  9:57                         ` Andrea Corallo
2022-08-18 10:31                           ` Eli Zaretskii
2022-08-18 11:08                             ` Andrea Corallo
2022-08-18 13:08                               ` Eli Zaretskii
2022-08-18 14:09                                 ` Andrea Corallo
2022-08-18 14:22                                   ` Eli Zaretskii
2022-08-18 14:50                                     ` Andrea Corallo
2022-08-18 15:57                                       ` Eli Zaretskii
2022-08-18 16:42                                         ` Andrea Corallo
2022-08-18 17:11                                           ` Eli Zaretskii
2022-08-18 19:35                                             ` Andrea Corallo
2022-08-19  5:49                                               ` Eli Zaretskii
2022-08-19  8:11                                                 ` Andrea Corallo
2022-08-09 15:32           ` Lars Ingebrigtsen

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
