unofficial mirror of emacs-devel@gnu.org 
* --with-native-compilation build failure on 32-bit systems
@ 2022-08-05  2:12 Joseph Mingrone
  2022-08-05 11:58 ` Lars Ingebrigtsen
  0 siblings, 1 reply; 18+ messages in thread
From: Joseph Mingrone @ 2022-08-05  2:12 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: emacs-devel, emacs

Hello Lars,

Could 261d6af have broken --with-native-compilation builds on 32-bit systems?  This is what I see building in a clean FreeBSD/i386 13.0 jail using 261d6af:
http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log

6fb2063 looks good though (the pkg-plist error at the end can be ignored).
http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h53m03s/logs/errors/emacs-devel-29.0.50.20220804,2.log

Is there any other information that I can provide?

Joe



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-05  2:12 --with-native-compilation build failure on 32-bit systems Joseph Mingrone
@ 2022-08-05 11:58 ` Lars Ingebrigtsen
  2022-08-05 13:30   ` Andrea Corallo
  0 siblings, 1 reply; 18+ messages in thread
From: Lars Ingebrigtsen @ 2022-08-05 11:58 UTC (permalink / raw)
  To: Joseph Mingrone; +Cc: emacs-devel, emacs, Andrea Corallo

Joseph Mingrone <jrm@ftfl.ca> writes:

> Could 261d6af have broken --with-native-compilation builds on 32-bit
> systems?  This is what I see building in a clean FreeBSD/i386 13.0
> jail using 261d6af:
> http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log

I guess these are the error messages?

emacs: Trying to load incoherent dumped eln file /wrkdirs/usr/ports/editors/emacs-devel/work-full/emacs-261d6af/native-lisp/29.0.50-7cc1a43d/preloaded/ediff-hook-0b92f1a2-f843c8a0.eln

I don't know what that means; Andrea added to the CCs.




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-05 11:58 ` Lars Ingebrigtsen
@ 2022-08-05 13:30   ` Andrea Corallo
  2022-08-05 14:40     ` Andrea Corallo
  0 siblings, 1 reply; 18+ messages in thread
From: Andrea Corallo @ 2022-08-05 13:30 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Joseph Mingrone, emacs-devel, emacs

Lars Ingebrigtsen <larsi@gnus.org> writes:

> Joseph Mingrone <jrm@ftfl.ca> writes:
>
>> Could 261d6af have broken --with-native-compilation builds on 32-bit
>> systems?  This is what I see building in a clean FreeBSD/i386 13.0
>> jail using 261d6af:
>> http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log
>
> I guess these are the error messages?
>
> emacs: Trying to load incoherent dumped eln file
> /wrkdirs/usr/ports/editors/emacs-devel/work-full/emacs-261d6af/native-lisp/29.0.50-7cc1a43d/preloaded/ediff-hook-0b92f1a2-f843c8a0.eln
>
> I don't know what that means; Andrea added to the CCs.

It's very surprising to see 261d6af causing this side effect; at least, I
don't see why it should affect only the 32-bit build.

I'm trying to reproduce it on my 32bit env.

  Andrea




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-05 13:30   ` Andrea Corallo
@ 2022-08-05 14:40     ` Andrea Corallo
  2022-08-05 15:16       ` Lynn Winebarger
  2022-08-09  9:11       ` Andrea Corallo
  0 siblings, 2 replies; 18+ messages in thread
From: Andrea Corallo @ 2022-08-05 14:40 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Joseph Mingrone, emacs-devel, emacs

Andrea Corallo <akrl@sdf.org> writes:

> Lars Ingebrigtsen <larsi@gnus.org> writes:
>
>> Joseph Mingrone <jrm@ftfl.ca> writes:
>>
>>> Could 261d6af have broken --with-native-compilation builds on 32-bit
>>> systems?  This is what I see building in a clean FreeBSD/i386 13.0
>>> jail using 261d6af:
>>> http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log
>>
>> I guess these are the error messages?
>>
>> emacs: Trying to load incoherent dumped eln file
>> /wrkdirs/usr/ports/editors/emacs-devel/work-full/emacs-261d6af/native-lisp/29.0.50-7cc1a43d/preloaded/ediff-hook-0b92f1a2-f843c8a0.eln
>>
>> I don't know what that means; Andrea added to the CCs.
>
> It's very surprising to see 261d6af causing this side effect, at least I
> don't see why should effect the 32bit build only.
>
> I'm trying to reproduce it on my 32bit env.

I can confirm the build is broken in my 32-bit environment as well (but not
in the 64-bit one).

Loading the second dump, while we are relocating the ediff-hook
compilation unit, we realize (@ pdumper.c:5304) that its file field is
not a cons as expected but just a string.

Now the question is: why is this not fixed up in loadup.el:477, as it is
for the other compilation units?
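
To make the failing check concrete, here is a toy Python model. It is not the
code in pdumper.c; `relocate_comp_unit` and the sample paths are made up for
illustration. It assumes only what is stated above: a fixed-up unit carries a
cons in its file field (modelled as a 2-tuple), while a unit that was never
fixed up still carries a plain string.

```python
def relocate_comp_unit(file_field):
    """Toy stand-in for the check made while relocating a dumped unit."""
    if isinstance(file_field, str):
        # The fix-up never replaced the plain file name with a pair.
        raise RuntimeError(
            f"Trying to load incoherent dumped eln file {file_field}")
    car, cdr = file_field  # the expected "cons"
    return car, cdr


# Hypothetical sample values, not taken from a real build.
fixed_up = ("../lib/preloaded/files.eln", "native-lisp/preloaded/files.eln")
print(relocate_comp_unit(fixed_up))
```

Calling the same function with a bare string, as happened for ediff-hook,
raises the error instead of returning a pair.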

  Andrea




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-05 14:40     ` Andrea Corallo
@ 2022-08-05 15:16       ` Lynn Winebarger
  2022-08-08  7:44         ` Andrea Corallo
  2022-08-09  9:11       ` Andrea Corallo
  1 sibling, 1 reply; 18+ messages in thread
From: Lynn Winebarger @ 2022-08-05 15:16 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Lars Ingebrigtsen, Joseph Mingrone, emacs-devel, emacs


On Fri, Aug 5, 2022, 10:42 AM Andrea Corallo <akrl@sdf.org> wrote:

> Andrea Corallo <akrl@sdf.org> writes:
>
> > Lars Ingebrigtsen <larsi@gnus.org> writes:
> >
> >> Joseph Mingrone <jrm@ftfl.ca> writes:
> >>
> >>> Could 261d6af have broken --with-native-compilation builds on 32-bit
> >>> systems?  This is what I see building in a clean FreeBSD/i386 13.0
> >>> jail using 261d6af:
> >>>
> http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log
> >>
> >> I guess these are the error messages?
> >>
> >> emacs: Trying to load incoherent dumped eln file
> >>
> /wrkdirs/usr/ports/editors/emacs-devel/work-full/emacs-261d6af/native-lisp/29.0.50-7cc1a43d/preloaded/ediff-hook-0b92f1a2-f843c8a0.eln
> >>
> >> I don't know what that means; Andrea added to the CCs.
> >
> > It's very surprising to see 261d6af causing this side effect, at least I
> > don't see why should effect the 32bit build only.
> >
> > I'm trying to reproduce it on my 32bit env.
>
> I confirm the build it's broken on my 32bit env as well, (but not on the
> 64 one).
>
> Loading the second dump, while we are relocating the ediff-hook
> compilation unit, we realize (@ pdumper.c:5304) that its file field is
> not a cons as expected but just a string.
>
> Now the question is why this is not fixed-up in loadup.el:477 as for the
> other compilation units?


Are you sure it's actually fixed up in the other compilation units?  When
I've seen this problem, it was because the bindir and elndir arguments were
not specified while dumping.  The complaint came up from one of the later
(but not last) files I had loaded for dumping, but none of the files were
fixed up.

This problem should be signaled by loadup if there are any NCUs it does not
fix up.  It would be a lot easier to diagnose the problem from there.

Lynn


* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-05 15:16       ` Lynn Winebarger
@ 2022-08-08  7:44         ` Andrea Corallo
  2022-08-08 10:22           ` Lynn Winebarger
  0 siblings, 1 reply; 18+ messages in thread
From: Andrea Corallo @ 2022-08-08  7:44 UTC (permalink / raw)
  To: Lynn Winebarger; +Cc: Lars Ingebrigtsen, Joseph Mingrone, emacs-devel, emacs

Lynn Winebarger <owinebar@gmail.com> writes:

> On Fri, Aug 5, 2022, 10:42 AM Andrea Corallo <akrl@sdf.org> wrote:
>
>  Andrea Corallo <akrl@sdf.org> writes:
>
>  > Lars Ingebrigtsen <larsi@gnus.org> writes:
>  >
>  >> Joseph Mingrone <jrm@ftfl.ca> writes:
>  >>
>  >>> Could 261d6af have broken --with-native-compilation builds on 32-bit
>  >>> systems?  This is what I see building in a clean FreeBSD/i386 13.0
>  >>> jail using 261d6af:
>  >>> http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log
>  >>
>  >> I guess these are the error messages?
>  >>
>  >> emacs: Trying to load incoherent dumped eln file
>  >>
>  /wrkdirs/usr/ports/editors/emacs-devel/work-full/emacs-261d6af/native-lisp/29.0.50-7cc1a43d/preloaded/ediff-hook-0b92f1a2-f843c8a0.eln
>  
>  >>
>  >> I don't know what that means; Andrea added to the CCs.
>  >
>  > It's very surprising to see 261d6af causing this side effect, at least I
>  > don't see why should effect the 32bit build only.
>  >
>  > I'm trying to reproduce it on my 32bit env.
>
>  I confirm the build it's broken on my 32bit env as well, (but not on the
>  64 one).
>
>  Loading the second dump, while we are relocating the ediff-hook
>  compilation unit, we realize (@ pdumper.c:5304) that its file field is
>  not a cons as expected but just a string.
>
>  Now the question is why this is not fixed-up in loadup.el:477 as for the
>  other compilation units?
>
> Are you sure it's actually fixed up in the other compilation units?

Indeed, otherwise an error is signaled.

> This problem should be signaled by loadup if there are any NCUs it does not fix up.  It would be a lot easier to diagnose
> the problem from there.

loadup is in charge of fixing up the file fields of all CUs, and indeed if
something goes wrong in that code an error is signaled.  But evidently
no error was signaled here, so there's something more to understand.

Regards

  Andrea




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-08  7:44         ` Andrea Corallo
@ 2022-08-08 10:22           ` Lynn Winebarger
  2022-08-08 13:14             ` Andrea Corallo
  0 siblings, 1 reply; 18+ messages in thread
From: Lynn Winebarger @ 2022-08-08 10:22 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Lars Ingebrigtsen, Joseph Mingrone, emacs-devel, emacs


On Mon, Aug 8, 2022, 3:44 AM Andrea Corallo <akrl@sdf.org> wrote:

> Lynn Winebarger <owinebar@gmail.com> writes:
>
> > On Fri, Aug 5, 2022, 10:42 AM Andrea Corallo <akrl@sdf.org> wrote:
> >
> >  Andrea Corallo <akrl@sdf.org> writes:
> >
> >  > Lars Ingebrigtsen <larsi@gnus.org> writes:
> >  >
> >  >> Joseph Mingrone <jrm@ftfl.ca> writes:
> >  >>
> >  >>> Could 261d6af have broken --with-native-compilation builds on 32-bit
> >  >>> systems?  This is what I see building in a clean FreeBSD/i386 13.0
> >  >>> jail using 261d6af:
> >  >>>
> http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log
> >  >>
> >  >> I guess these are the error messages?
> >  >>
> >  >> emacs: Trying to load incoherent dumped eln file
> >  >>
> >
> /wrkdirs/usr/ports/editors/emacs-devel/work-full/emacs-261d6af/native-lisp/29.0.50-7cc1a43d/preloaded/ediff-hook-0b92f1a2-f843c8a0.eln
> >
> >  >>
> >  >> I don't know what that means; Andrea added to the CCs.
> >  >
> >  > It's very surprising to see 261d6af causing this side effect, at
> least I
> >  > don't see why should effect the 32bit build only.
> >  >
> >  > I'm trying to reproduce it on my 32bit env.
> >
> >  I confirm the build it's broken on my 32bit env as well, (but not on the
> >  64 one).
> >
> >  Loading the second dump, while we are relocating the ediff-hook
> >  compilation unit, we realize (@ pdumper.c:5304) that its file field is
> >  not a cons as expected but just a string.
> >
> >  Now the question is why this is not fixed-up in loadup.el:477 as for the
> >  other compilation units?
> >
> > Are you sure it's actually fixed up in the other compilation units?
>
> Indeed, otherwise an error is signaled.
>
> > This problem should be signaled by loadup if there are any NCUs it does
> not fix up.  It would be a lot easier to diagnose
> > the problem from there.
>
> loadup is in charge of fixing up on all CU's file fields, and indeed if
> something goes wrong in that code an error is signaled.  But evidently
> this is not the case, so there's something more to understand.
>

I just looked, and there are 2 possible paths for NCUs to be in the dump
without an error being signaled:
1 - either the --bin-dest or --eln-dest flag is not specified (or is on the
command line but empty)
2 - there is an NCU loaded for which no symbol is bound to a subr in that
NCU.
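
The two paths can be sketched with a small Python model.  This is only an
illustration under stated assumptions, not loadup.el: it assumes the fix-up
finds units by walking symbols whose function lives in a unit (as path 2
implies), and every name here (`fix_up_via_symbols`, `vars-only.eln`, the
sample directories) is hypothetical.

```python
def fix_up_via_symbols(symbol_to_unit, unit_files, bin_dest, eln_dest):
    """Fix up units reachable from symbols; return the units left untouched."""
    if bin_dest and eln_dest:  # path 1: both destinations must be non-empty
        for unit in set(symbol_to_unit.values()):
            old = unit_files[unit]
            if isinstance(old, str):
                # Replace the plain string with a pair (the "cons").
                unit_files[unit] = (eln_dest + unit, old)
    # path 2: a unit that no symbol points at was never visited
    return sorted(u for u, f in unit_files.items() if isinstance(f, str))


unit_files = {"files.eln": "/build/files.eln",
              "vars-only.eln": "/build/vars-only.eln"}
symbol_to_unit = {"find-file": "files.eln"}  # vars-only defines no functions
missed = fix_up_via_symbols(symbol_to_unit, unit_files, "/bin/", "/lib/")
print(missed)
```

In both paths the function returns normally, which is the point: nothing
complains at dump time unless the returned list is checked explicitly.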

Since I put in some code (in loadup) to explicitly test whether any loaded
NCU would be missed by (2), I have seen one instance pop up, though not
while only loading the files in loadup - site-load loads many more.
However, I've removed the requirement of having a cons cell in the NCU in
the dump file, so I don't know whether it was destined for garbage
collection and would therefore have been discarded by the dump process.

Lynn


* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-08 10:22           ` Lynn Winebarger
@ 2022-08-08 13:14             ` Andrea Corallo
  2022-08-08 13:55               ` Lynn Winebarger
  0 siblings, 1 reply; 18+ messages in thread
From: Andrea Corallo @ 2022-08-08 13:14 UTC (permalink / raw)
  To: Lynn Winebarger; +Cc: Lars Ingebrigtsen, Joseph Mingrone, emacs-devel, emacs

Lynn Winebarger <owinebar@gmail.com> writes:

> On Mon, Aug 8, 2022, 3:44 AM Andrea Corallo <akrl@sdf.org> wrote:
>
>  Lynn Winebarger <owinebar@gmail.com> writes:
>
>  > On Fri, Aug 5, 2022, 10:42 AM Andrea Corallo <akrl@sdf.org> wrote:
>  >
>  >  Andrea Corallo <akrl@sdf.org> writes:
>  >
>  >  > Lars Ingebrigtsen <larsi@gnus.org> writes:
>  >  >
>  >  >> Joseph Mingrone <jrm@ftfl.ca> writes:
>  >  >>
>  >  >>> Could 261d6af have broken --with-native-compilation builds on 32-bit
>  >  >>> systems?  This is what I see building in a clean FreeBSD/i386 13.0
>  >  >>> jail using 261d6af:
>  >  >>> http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log
>  >  >>
>  >  >> I guess these are the error messages?
>  >  >>
>  >  >> emacs: Trying to load incoherent dumped eln file
>  >  >>
>  > 
>  /wrkdirs/usr/ports/editors/emacs-devel/work-full/emacs-261d6af/native-lisp/29.0.50-7cc1a43d/preloaded/ediff-hook-0b92f1a2-f843c8a0.eln
>  
>  >  
>  >  >>
>  >  >> I don't know what that means; Andrea added to the CCs.
>  >  >
>  >  > It's very surprising to see 261d6af causing this side effect, at least I
>  >  > don't see why should effect the 32bit build only.
>  >  >
>  >  > I'm trying to reproduce it on my 32bit env.
>  >
>  >  I confirm the build it's broken on my 32bit env as well, (but not on the
>  >  64 one).
>  >
>  >  Loading the second dump, while we are relocating the ediff-hook
>  >  compilation unit, we realize (@ pdumper.c:5304) that its file field is
>  >  not a cons as expected but just a string.
>  >
>  >  Now the question is why this is not fixed-up in loadup.el:477 as for the
>  >  other compilation units?
>  >
>  > Are you sure it's actually fixed up in the other compilation units?
>
>  Indeed, otherwise an error is signaled.
>
>  > This problem should be signaled by loadup if there are any NCUs it does not fix up.  It would be a lot easier to
>  diagnose
>  > the problem from there.
>
>  loadup is in charge of fixing up on all CU's file fields, and indeed if
>  something goes wrong in that code an error is signaled.  But evidently
>  this is not the case, so there's something more to understand.
>
> I just looked, and there are 2 possible paths for NCUs to be in the dump without an error being signaled:
> 1 - either the --bin-dest or --eln-dest flag is not specified (or is
> on the command line but empty)

This is not the case in our build.

> 2 - there is an NCU loaded for which no symbol is bound to a subr in that NCU.

CUs that are not reachable from the function slot of a symbol are
unloaded when GC runs.  We do run GC before dumping so this should not
happen.

  Andrea




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-08 13:14             ` Andrea Corallo
@ 2022-08-08 13:55               ` Lynn Winebarger
  2022-08-08 14:13                 ` Andrea Corallo
  0 siblings, 1 reply; 18+ messages in thread
From: Lynn Winebarger @ 2022-08-08 13:55 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Lars Ingebrigtsen, Joseph Mingrone, emacs-devel, emacs


On Mon, Aug 8, 2022, 9:14 AM Andrea Corallo <akrl@sdf.org> wrote:

> Lynn Winebarger <owinebar@gmail.com> writes:
>
> > On Mon, Aug 8, 2022, 3:44 AM Andrea Corallo <akrl@sdf.org> wrote:
> >
> >  Lynn Winebarger <owinebar@gmail.com> writes:
> >
> >  > On Fri, Aug 5, 2022, 10:42 AM Andrea Corallo <akrl@sdf.org> wrote:
> >  >
> >  >  Andrea Corallo <akrl@sdf.org> writes:
> >  >
> >  >  > Lars Ingebrigtsen <larsi@gnus.org> writes:
> >  >  >
> >  >  >> Joseph Mingrone <jrm@ftfl.ca> writes:
> >  >  >>
> >  >  >>> Could 261d6af have broken --with-native-compilation builds on
> 32-bit
> >  >  >>> systems?  This is what I see building in a clean FreeBSD/i386
> 13.0
> >  >  >>> jail using 261d6af:
> >  >  >>>
> http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log
> >  >  >>
> >  >  >> I guess these are the error messages?
> >  >  >>
> >  >  >> emacs: Trying to load incoherent dumped eln file
> >  >  >>
> >  >
> >
> /wrkdirs/usr/ports/editors/emacs-devel/work-full/emacs-261d6af/native-lisp/29.0.50-7cc1a43d/preloaded/ediff-hook-0b92f1a2-f843c8a0.eln
> >
> >  >
> >  >  >>
> >  >  >> I don't know what that means; Andrea added to the CCs.
> >  >  >
> >  >  > It's very surprising to see 261d6af causing this side effect, at
> least I
> >  >  > don't see why should effect the 32bit build only.
> >  >  >
> >  >  > I'm trying to reproduce it on my 32bit env.
> >  >
> >  >  I confirm the build it's broken on my 32bit env as well, (but not on
> the
> >  >  64 one).
> >  >
> >  >  Loading the second dump, while we are relocating the ediff-hook
> >  >  compilation unit, we realize (@ pdumper.c:5304) that its file field
> is
> >  >  not a cons as expected but just a string.
> >  >
> >  >  Now the question is why this is not fixed-up in loadup.el:477 as for
> the
> >  >  other compilation units?
> >  >
> >  > Are you sure it's actually fixed up in the other compilation units?
> >
> >  Indeed, otherwise an error is signaled.
> >
> >  > This problem should be signaled by loadup if there are any NCUs it
> does not fix up.  It would be a lot easier to
> >  diagnose
> >  > the problem from there.
> >
> >  loadup is in charge of fixing up on all CU's file fields, and indeed if
> >  something goes wrong in that code an error is signaled.  But evidently
> >  this is not the case, so there's something more to understand.
> >
> > I just looked, and there are 2 possible paths for NCUs to be in the dump
> without an error being signaled:
> > 1 - either the --bin-dest or --eln-dest flag is not specified (or is
> > on the command line but empty)
>
> This is not the case in our build.
>

No, but it is one way the dump can produce an unusable file without any
error signaled until an Emacs instance attempts to load it.

>
> > 2 - there is an NCU loaded for which no symbol is bound to a subr in
> that NCU.
>
> CUs that are not reachable from the function slot of a symbol are
> unloaded when GC runs.  We do run GC before dumping so this should not
> happen.


Yes, "should" is the operative word there.  Why not validate the condition
before writing the dump file?  If not in loadup, then in the procedure that
records the NCU in the dump?  Why wait until load time to catch something
that was almost certainly (barring a user performing surgery on the dump
file) already the case when the dump was produced?  Just put the same check
before the "write" operation that is done immediately after the
corresponding "read" operation.
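
A minimal sketch of that idea in Python, assuming the same toy model as the
thread describes (a fixed-up file field is a pair, an unfixed one a plain
string).  `write_dump` and its `write` callback are hypothetical names, not
part of the dumper.

```python
def write_dump(unit_files, write):
    """Refuse to write a dump that would fail the load-time check."""
    not_fixed = sorted(name for name, f in unit_files.items()
                       if not isinstance(f, tuple))
    if not_fixed:
        # Same condition the loader tests, but reported at dump time.
        raise ValueError("comp units not fixed up: " + ", ".join(not_fixed))
    write(unit_files)


written = []
write_dump({"files.eln": ("lib/files.eln", "build/files.eln")}, written.append)
print(len(written))
```

With an unfixed unit in the table, the call raises before `write` is ever
invoked, so no unusable dump file is produced.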

Lynn


* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-08 13:55               ` Lynn Winebarger
@ 2022-08-08 14:13                 ` Andrea Corallo
  0 siblings, 0 replies; 18+ messages in thread
From: Andrea Corallo @ 2022-08-08 14:13 UTC (permalink / raw)
  To: Lynn Winebarger; +Cc: Lars Ingebrigtsen, Joseph Mingrone, emacs-devel, emacs

Lynn Winebarger <owinebar@gmail.com> writes:

> On Mon, Aug 8, 2022, 9:14 AM Andrea Corallo <akrl@sdf.org> wrote:
>
>  Lynn Winebarger <owinebar@gmail.com> writes:
>
>  > On Mon, Aug 8, 2022, 3:44 AM Andrea Corallo <akrl@sdf.org> wrote:
>  >
>  >  Lynn Winebarger <owinebar@gmail.com> writes:
>  >
>  >  > On Fri, Aug 5, 2022, 10:42 AM Andrea Corallo <akrl@sdf.org> wrote:
>  >  >
>  >  >  Andrea Corallo <akrl@sdf.org> writes:
>  >  >
>  >  >  > Lars Ingebrigtsen <larsi@gnus.org> writes:
>  >  >  >
>  >  >  >> Joseph Mingrone <jrm@ftfl.ca> writes:
>  >  >  >>
>  >  >  >>> Could 261d6af have broken --with-native-compilation builds on 32-bit
>  >  >  >>> systems?  This is what I see building in a clean FreeBSD/i386 13.0
>  >  >  >>> jail using 261d6af:
>  >  >  >>> http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log
>  >  >  >>
>  >  >  >> I guess these are the error messages?
>  >  >  >>
>  >  >  >> emacs: Trying to load incoherent dumped eln file
>  >  >  >>
>  >  > 
>  > 
>  /wrkdirs/usr/ports/editors/emacs-devel/work-full/emacs-261d6af/native-lisp/29.0.50-7cc1a43d/preloaded/ediff-hook-0b92f1a2-f843c8a0.eln
>  
>  >  
>  >  >  
>  >  >  >>
>  >  >  >> I don't know what that means; Andrea added to the CCs.
>  >  >  >
>  >  >  > It's very surprising to see 261d6af causing this side effect, at least I
>  >  >  > don't see why should effect the 32bit build only.
>  >  >  >
>  >  >  > I'm trying to reproduce it on my 32bit env.
>  >  >
>  >  >  I confirm the build it's broken on my 32bit env as well, (but not on the
>  >  >  64 one).
>  >  >
>  >  >  Loading the second dump, while we are relocating the ediff-hook
>  >  >  compilation unit, we realize (@ pdumper.c:5304) that its file field is
>  >  >  not a cons as expected but just a string.
>  >  >
>  >  >  Now the question is why this is not fixed-up in loadup.el:477 as for the
>  >  >  other compilation units?
>  >  >
>  >  > Are you sure it's actually fixed up in the other compilation units?
>  >
>  >  Indeed, otherwise an error is signaled.
>  >
>  >  > This problem should be signaled by loadup if there are any NCUs it does not fix up.  It would be a lot easier to
>  >  diagnose
>  >  > the problem from there.
>  >
>  >  loadup is in charge of fixing up on all CU's file fields, and indeed if
>  >  something goes wrong in that code an error is signaled.  But evidently
>  >  this is not the case, so there's something more to understand.
>  >
>  > I just looked, and there are 2 possible paths for NCUs to be in the dump without an error being signaled:
>  > 1 - either the --bin-dest or --eln-dest flag is not specified (or is
>  > on the command line but empty)
>
>  This is not the case in our build.
>
> No, but it is one way the dump can produce an unusable file without any error signaled until an Emacs instance attempts
> to load it.

That is understood, but it can't happen using our current build system,
and this is what we are interested in here.

>  > 2 - there is an NCU loaded for which no symbol is bound to a subr in that NCU.
>
>  CUs that are not reachable from the function slot of a symbol are
>  unloaded when GC runs.  We do run GC before dumping so this should not
>  happen.
>
> Yes, "should" is the operative word there.  Why not validate the condition before writing the dump file?  If not in
> loadup, then in the procedure that records the NCU in the dump?  Why wait until load-time to catch something that was
> almost certainly (barring user performing surgery on the dump file) the case when the dump was produced?  Just put the
> same check before the "write" operation that is  done immediately after the corresponding "read" operation.

Just submit a patch.

  Andrea




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-05 14:40     ` Andrea Corallo
  2022-08-05 15:16       ` Lynn Winebarger
@ 2022-08-09  9:11       ` Andrea Corallo
  2022-08-09  9:21         ` Andrea Corallo
  1 sibling, 1 reply; 18+ messages in thread
From: Andrea Corallo @ 2022-08-09  9:11 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Joseph Mingrone, emacs-devel, emacs

Andrea Corallo <akrl@sdf.org> writes:

> Andrea Corallo <akrl@sdf.org> writes:
>
>> Lars Ingebrigtsen <larsi@gnus.org> writes:
>>
>>> Joseph Mingrone <jrm@ftfl.ca> writes:
>>>
>>>> Could 261d6af have broken --with-native-compilation builds on 32-bit
>>>> systems?  This is what I see building in a clean FreeBSD/i386 13.0
>>>> jail using 261d6af:
>>>> http://pkg.ftfl.ca/data/13i386-default/2022-08-04_22h38m28s/logs/errors/emacs-devel-29.0.50.20220804,2.log
>>>
>>> I guess these are the error messages?
>>>
>>> emacs: Trying to load incoherent dumped eln file
>>> /wrkdirs/usr/ports/editors/emacs-devel/work-full/emacs-261d6af/native-lisp/29.0.50-7cc1a43d/preloaded/ediff-hook-0b92f1a2-f843c8a0.eln
>>>
>>> I don't know what that means; Andrea added to the CCs.
>>
>> It's very surprising to see 261d6af causing this side effect, at least I
>> don't see why should effect the 32bit build only.
>>
>> I'm trying to reproduce it on my 32bit env.
>
> I confirm the build it's broken on my 32bit env as well, (but not on the
> 64 one).
>
> Loading the second dump, while we are relocating the ediff-hook
> compilation unit, we realize (@ pdumper.c:5304) that its file field is
> not a cons as expected but just a string.
>
> Now the question is why this is not fixed-up in loadup.el:477 as for the
> other compilation units?

Just had some time to look into this further:

Of all the CUs we are dumping, two are not fixed up in loadup.el before
the dump because they are not referenced by any function.

In particular, 'ediff-hook' contains only variable definitions, so this
is correct.

We do run a GC before dumping, so we should unload these unreferenced CUs
before the dump.  As expected, I don't see the ediff-hook CU being marked,
but we do not free it during the sweep.

It looks to me like a GC bug so far.  Unfortunately, I have very limited
time to dedicate to this this week.

BR

  Andrea





* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-09  9:11       ` Andrea Corallo
@ 2022-08-09  9:21         ` Andrea Corallo
  2022-08-09  9:48           ` Po Lu
                             ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Andrea Corallo @ 2022-08-09  9:21 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Joseph Mingrone, emacs-devel, emacs

Andrea Corallo <akrl@sdf.org> writes:

[...]

> Just had some time to look into this further:
>
> Of all the CUs we are dumping two are not fixed-up in loadup.el before
> dump because not referenced by any function.
>
> In particular looking at 'ediff-hook' it does contain only variable
> definitions so this is correct.
>
> We do run a GC before dumping so we should unload these unreferenced CUs
> before dump.  And as expected I don't see ediff-hook CU being marked but
> we do not free it during sweep.
>
> It looks to me like a GC bug so far.  Unfortunatly I've very constrained
> time to dedicate on this this week.

Thinking about this... Maybe relying on the GC for this is not a very
good idea in the first place.  If we are conservative on the stack, we
might always mark a CU accidentally and fall into the same issue.
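
The risk can be illustrated with a toy Python model of conservative marking.
It is a sketch of the general technique only, not Emacs's collector, and it
shows the hypothetical hazard rather than the bug observed here (where the CU
was not marked at all).  All names and addresses are made up.

```python
def conservative_mark(stack_words, heap):
    """Treat any stack word that equals an object address as a reference."""
    return {word for word in stack_words if word in heap}


def sweep(heap, marked):
    """Keep only the objects that were marked."""
    return {addr: obj for addr, obj in heap.items() if addr in marked}


heap = {0x1000: "comp unit with no referencing function"}
stack = [42, 0x1000]  # 0x1000 is just an integer that happens to match
survivors = sweep(heap, conservative_mark(stack, heap))
print(survivors)
```

The unit survives although nothing really refers to it, so a fix-up that
relies on unreferenced units having been collected would miss it.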

I think we should maintain a list of all loaded CUs so we can fix them
up reliably.  If this is agreed not to be a bad idea I'll prepare a
patch.
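
A rough Python sketch of what such a list could look like, under the same toy
model as before (plain string before fix-up, pair after).  The class and
method names are hypothetical and say nothing about how an actual patch would
be written.

```python
class LoadedCompUnits:
    """Toy registry of every loaded unit, independent of GC reachability."""

    def __init__(self):
        self._files = {}

    def register(self, name, file_name):
        # Called once when a unit is loaded.
        self._files[name] = file_name

    def fix_up_all(self, eln_dest):
        # Visit every registered unit, whether or not a function refers to it.
        for name, f in list(self._files.items()):
            if isinstance(f, str):
                self._files[name] = (eln_dest + name, f)

    def unfixed(self):
        return sorted(n for n, f in self._files.items() if isinstance(f, str))


registry = LoadedCompUnits()
registry.register("files.eln", "/build/files.eln")
registry.register("ediff-hook.eln", "/build/ediff-hook.eln")
registry.fix_up_all("/lib/")
print(registry.unfixed())
```

Because the fix-up iterates over the registry rather than over symbols, a
unit that only defines variables is handled like any other.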

BR

  Andrea

PS still dunno what's going on with the GC here




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-09  9:21         ` Andrea Corallo
@ 2022-08-09  9:48           ` Po Lu
  2022-08-09 10:03             ` Andrea Corallo
  2022-08-09 10:20           ` Lynn Winebarger
                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: Po Lu @ 2022-08-09  9:48 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Lars Ingebrigtsen, Joseph Mingrone, emacs-devel, emacs

Andrea Corallo <akrl@sdf.org> writes:

> PS still dunno what's going on with the GC here

It will remain conservative for the foreseeable future.




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-09  9:48           ` Po Lu
@ 2022-08-09 10:03             ` Andrea Corallo
  2022-08-09 10:10               ` Po Lu
  0 siblings, 1 reply; 18+ messages in thread
From: Andrea Corallo @ 2022-08-09 10:03 UTC (permalink / raw)
  To: Po Lu; +Cc: Lars Ingebrigtsen, Joseph Mingrone, emacs-devel, emacs

Po Lu <luangruo@yahoo.com> writes:

> Andrea Corallo <akrl@sdf.org> writes:
>
>> PS still dunno what's going on with the GC here
>
> It will remain conservative for the forseeable future.

I guess so; here I'm referring to the fact that being conservative on
the stack still seems not to be the root cause of the issue here.

  Andrea




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-09 10:03             ` Andrea Corallo
@ 2022-08-09 10:10               ` Po Lu
  0 siblings, 0 replies; 18+ messages in thread
From: Po Lu @ 2022-08-09 10:10 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Lars Ingebrigtsen, Joseph Mingrone, emacs-devel, emacs

Andrea Corallo <akrl@sdf.org> writes:

> I guess so, here I'm referring to the fact that being conservative on
> the stack still seams not to be the root cause of the issue here.
>
>   Andrea

Oh, okay.  Sorry for the noise then.




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-09  9:21         ` Andrea Corallo
  2022-08-09  9:48           ` Po Lu
@ 2022-08-09 10:20           ` Lynn Winebarger
  2022-08-09 11:16           ` Eli Zaretskii
  2022-08-09 15:32           ` Lars Ingebrigtsen
  3 siblings, 0 replies; 18+ messages in thread
From: Lynn Winebarger @ 2022-08-09 10:20 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Lars Ingebrigtsen, Joseph Mingrone, emacs-devel, emacs


On Tue, Aug 9, 2022, 5:22 AM Andrea Corallo <akrl@sdf.org> wrote:

> Andrea Corallo <akrl@sdf.org> writes:
>
> [...]
>
> > Just had some time to look into this further:
> >
> > Of all the CUs we are dumping two are not fixed-up in loadup.el before
> > dump because not referenced by any function.
> >
> > In particular looking at 'ediff-hook' it does contain only variable
> > definitions so this is correct.
> >
> > We do run a GC before dumping so we should unload these unreferenced CUs
> > before dump.  And as expected I don't see ediff-hook CU being marked but
> > we do not free it during sweep.
> >
> > It looks to me like a GC bug so far.  Unfortunatly I've very constrained
> > time to dedicate on this this week.
>
> Thinking about this... Maybe relying on the GC for this is not a very
> good idea in the first place.  If we are conservative on the stack my
> might always mark a CU accidentally and fall into the same issue.
>
> I think we should maintain a list of all loaded CUs so we can fix them
> up reliably.  If this is agreed not to be a bad idea I'll prepare a
> patch.


Just a heads up - when I was validating what was failing while dumping, I
tried printing the comp units before and after they were fixed up.  When
the comp unit has a cons cell in the name field, princ segfaults (at least
in 28.1).
I didn't report this as a bug because it would be very unusual for a user
to have access to comp units in this state.

Lynn








* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-09  9:21         ` Andrea Corallo
  2022-08-09  9:48           ` Po Lu
  2022-08-09 10:20           ` Lynn Winebarger
@ 2022-08-09 11:16           ` Eli Zaretskii
  2022-08-09 15:32           ` Lars Ingebrigtsen
  3 siblings, 0 replies; 18+ messages in thread
From: Eli Zaretskii @ 2022-08-09 11:16 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: larsi, jrm, emacs-devel, emacs

> From: Andrea Corallo <akrl@sdf.org>
> Cc: Joseph Mingrone <jrm@ftfl.ca>, emacs-devel@gnu.org, emacs@FreeBSD.org
> Date: Tue, 09 Aug 2022 09:21:11 +0000
> 
> > It looks to me like a GC bug so far.  Unfortunately I've very constrained
> > time to dedicate to this this week.
> 
> Thinking about this... Maybe relying on the GC for this is not a very
> good idea in the first place.  If we are conservative on the stack we
> might always mark a CU accidentally and fall into the same issue.
> 
> I think we should maintain a list of all loaded CUs so we can fix them
> up reliably.  If this is agreed not to be a bad idea I'll prepare a
> patch.

I suggest postponing the decision until we have a good understanding
of what happens in this particular case and why it happens only in
32-bit builds.  Maybe we will end up doing what you suggest, but there
are likely other factors at work here, and it would be good to know
what they are.

Thanks.




* Re: --with-native-compilation build failure on 32-bit systems
  2022-08-09  9:21         ` Andrea Corallo
                             ` (2 preceding siblings ...)
  2022-08-09 11:16           ` Eli Zaretskii
@ 2022-08-09 15:32           ` Lars Ingebrigtsen
  3 siblings, 0 replies; 18+ messages in thread
From: Lars Ingebrigtsen @ 2022-08-09 15:32 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Joseph Mingrone, emacs-devel, emacs

Andrea Corallo <akrl@sdf.org> writes:

> Thinking about this... Maybe relying on the GC for this is not a very
> good idea in the first place.  If we are conservative on the stack we
> might always mark a CU accidentally and fall into the same issue.
>
> I think we should maintain a list of all loaded CUs so we can fix them
> up reliably.  If this is agreed not to be a bad idea I'll prepare a
> patch.

Relying on the GC is indeed inherently fragile, so maintaining an
explicit list sounds like a good idea in any case -- even if GC doesn't
turn out to be the culprit here.
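
To make the trade-off concrete, here is a minimal toy model in Python.  It
is not Emacs code and not the patch discussed above: `CompUnit`,
`conservative_mark`, `sweep`, `CompUnitRegistry` and the addresses are all
hypothetical names invented for illustration.  It only sketches how a
conservative stack scan can keep an otherwise unreferenced CU alive, and
how an explicit list of loaded CUs would let every one of them be fixed
up without depending on what the GC happens to free.

```python
from dataclasses import dataclass


@dataclass
class CompUnit:
    """Toy stand-in for a native compilation unit (hypothetical)."""
    name: str
    address: int            # pretend heap address
    fixed_up: bool = False


def conservative_mark(heap, stack_words):
    """Return the names of CUs whose address appears on the 'stack'.

    A conservative scanner cannot tell an integer from a pointer, so any
    stack word equal to a heap address keeps that object alive.
    """
    return {cu.name for cu in heap if cu.address in stack_words}


def sweep(heap, marked):
    """Keep only the CUs that were marked."""
    return [cu for cu in heap if cu.name in marked]


class CompUnitRegistry:
    """Explicit list of every loaded CU, independent of the GC."""

    def __init__(self):
        self._loaded = []

    def register(self, cu):
        self._loaded.append(cu)

    def fix_up_all(self):
        """Fix up every registered CU and return their names in load order."""
        for cu in self._loaded:
            cu.fixed_up = True
        return [cu.name for cu in self._loaded]


if __name__ == "__main__":
    heap = [CompUnit("ediff-hook", 0x1000), CompUnit("subr", 0x2000)]
    # Assume "subr" is really referenced, while 0x1000 is just a stray
    # integer on the stack that collides with ediff-hook's address.
    survivors = sweep(heap, conservative_mark(heap, {0x1000, 0x2000}))
    print("survive GC without fix-up:",
          [cu.name for cu in survivors if not cu.fixed_up])

    registry = CompUnitRegistry()
    for cu in heap:
        registry.register(cu)
    print("fixed up via registry:", registry.fix_up_all())
```

In this model the stray stack word decides whether the unreferenced CU
survives the sweep, whereas the registry visits every loaded CU whether or
not anything marked it.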




end of thread, other threads:[~2022-08-09 15:32 UTC | newest]

Thread overview: 18+ messages
2022-08-05  2:12 --with-native-compilation build failure on 32-bit systems Joseph Mingrone
2022-08-05 11:58 ` Lars Ingebrigtsen
2022-08-05 13:30   ` Andrea Corallo
2022-08-05 14:40     ` Andrea Corallo
2022-08-05 15:16       ` Lynn Winebarger
2022-08-08  7:44         ` Andrea Corallo
2022-08-08 10:22           ` Lynn Winebarger
2022-08-08 13:14             ` Andrea Corallo
2022-08-08 13:55               ` Lynn Winebarger
2022-08-08 14:13                 ` Andrea Corallo
2022-08-09  9:11       ` Andrea Corallo
2022-08-09  9:21         ` Andrea Corallo
2022-08-09  9:48           ` Po Lu
2022-08-09 10:03             ` Andrea Corallo
2022-08-09 10:10               ` Po Lu
2022-08-09 10:20           ` Lynn Winebarger
2022-08-09 11:16           ` Eli Zaretskii
2022-08-09 15:32           ` Lars Ingebrigtsen

Code repositories for project(s) associated with this inbox:

	https://git.savannah.gnu.org/cgit/emacs.git
