* seg-fault in unexelf.c
@ 2006-07-21 17:59 Chip Coldwell
2006-07-21 20:53 ` Chip Coldwell
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Chip Coldwell @ 2006-07-21 17:59 UTC (permalink / raw)
I have been having a problem building emacs on GNU/Linux (in
particular, the Fedora Core 5 distribution). The build process runs
the following command in the src/ subdirectory:
./temacs -l loadup -batch dump
to cause the temacs binary (essentially emacs without the Lisp code
loaded) to set up the Lisp environment in the process address space
and then create a new ELF file with a new .data section that contains
the .data and .bss sections from the temacs process.
If I build the temacs binary with no compiler optimization (gcc (GCC)
4.1.0 20060304 (Red Hat 4.1.0-3)), the command above seg-faults in the
unexec function (file unexelf.c) while executing this line:
    memcpy (NEW_SECTION_H (nn).sh_offset + new_base,
            (caddr_t) OLD_SECTION_H (n).sh_addr,
            new_data2_size);
I unrolled the memcpy thus:
    p = NEW_SECTION_H (nn).sh_offset + new_base;
    q = (caddr_t) OLD_SECTION_H (n).sh_addr;
    for (i = 0; i < new_data2_size; i++)
      p[i] = q[i];
ran the debugger and found the segfault happens when
(gdb) p/x q+i
$5 = 0x82f0000
(gdb) p/x i
$8 = 0xed160
(gdb) p/x new_bss_addr
$10 = 0x852a000
In the meantime, if I look in /proc/[PID]/maps I find this:
08048000-081fa000 r-xp 00000000 fd:00 1311728 /home/coldwell/rpm/BUILD/emacs-21.4/src/temacs
081fa000-08203000 rw-p 001b2000 fd:00 1311728 /home/coldwell/rpm/BUILD/emacs-21.4/src/temacs
08203000-082f0000 rw-p 08203000 00:00 0
08337000-0852a000 rw-p 08337000 00:00 0 [heap]
b730e000-b7da3000 rw-p b730e000 00:00 0
The problem is that the Linux kernel has set up the process virtual
memory with a hole in it, and when the memcpy steps into this hole, it
seg-faults.
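One way to confirm a hole like this without a debugger is to scan the maps listing for a gap inside the range that is about to be copied. The following is only a sketch under assumptions: the function name `first_hole` is hypothetical (it is not part of unexelf.c), and it relies on the kernel listing mappings in ascending address order.

```c
#include <assert.h>
#include <stdio.h>

/* Scan text in /proc/PID/maps format, read from MAPS, and report the
   first unmapped gap inside [LO, HI).  Returns 1 and fills *GAP_START and
   *GAP_END when a gap exists, 0 when the whole range is mapped.
   Assumption: entries appear in ascending address order.  */
int
first_hole (FILE *maps, unsigned long lo, unsigned long hi,
            unsigned long *gap_start, unsigned long *gap_end)
{
  char line[8192];
  unsigned long start, end;
  unsigned long cursor = lo;  /* lowest address not yet known to be mapped */

  assert (lo <= hi);
  while (cursor < hi && fgets (line, sizeof line, maps))
    {
      if (sscanf (line, "%lx-%lx", &start, &end) != 2)
        continue;             /* not a mapping line */
      if (end <= cursor)
        continue;             /* mapping lies entirely below the cursor */
      if (start > cursor)
        {
          /* Nothing maps [cursor, start): that is a hole.  */
          *gap_start = cursor;
          *gap_end = start < hi ? start : hi;
          return 1;
        }
      cursor = end;           /* mapped up to END; keep scanning */
    }
  if (cursor < hi)
    {
      /* Ran out of mappings before reaching HI.  */
      *gap_start = cursor;
      *gap_end = hi;
      return 1;
    }
  return 0;
}
```

Called on a stream opened from /proc/self/maps, with the .bss start address as LO and sbrk(0) as HI, this would report the unmapped range that the memcpy is about to walk into.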
Here are more details. "new_data2_size" is computed from (paraphrasing)
new_data2_size = sbrk(0) - OLD_SECTION_H (old_bss_index).sh_addr
In other words, the size of the new data section is determined to be
the top of the heap of the current temacs process minus the bottom of
the .bss in the temacs ELF file on disk.
"readelf -S temacs" gives
  [20] .data    PROGBITS    081fa880 1b2880 008610 00  WA  0   0 32
  [21] .bss     NOBITS      08202ea0 1bae90 0ec6d0 00  WA  0   0 32
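
The numbers above are mutually consistent, which can be checked with a little arithmetic. The sketch below is illustrative only: the constants are copied from the gdb, maps and readelf output in this message, the function names are made up for the example, and the 4 KiB page size is an assumption about this i386 system.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the output quoted in this message.  */
static const uintptr_t bss_addr   = 0x08202ea0;  /* .bss sh_addr (readelf) */
static const uintptr_t bss_size   = 0x000ec6d0;  /* .bss sh_size (readelf) */
static const uintptr_t heap_start = 0x08337000;  /* start of [heap] (maps) */
static const uintptr_t heap_top   = 0x0852a000;  /* new_bss_addr, i.e. sbrk (0) */
static const uintptr_t page_size  = 0x1000;      /* assumed 4 KiB pages */

/* Round ADDR up to the next multiple of PAGE, which must be a power of 2.  */
uintptr_t
page_round_up (uintptr_t addr, uintptr_t page)
{
  assert (page != 0 && (page & (page - 1)) == 0);
  return (addr + page - 1) & ~(page - 1);
}

/* Number of bytes the memcpy is asked to copy: 0x327160.  */
uintptr_t
copy_size (void)
{
  return heap_top - bss_addr;
}

/* First address past the page-rounded end of .bss: 0x82f0000,
   the same address at which gdb shows the fault.  */
uintptr_t
first_unmapped (void)
{
  return page_round_up (bss_addr + bss_size, page_size);
}

/* Loop index at which the copy faults: 0xed160, matching `p/x i'.  */
uintptr_t
fault_offset (void)
{
  return first_unmapped () - bss_addr;
}

/* Size of the gap between the end of .bss and the heap: 0x47000.  */
uintptr_t
hole_size (void)
{
  return heap_start - first_unmapped ();
}
```

So the copy faults on the first byte past the last page backing .bss, 0x47000 bytes (71 pages) short of where the heap actually begins.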
Chip
--
Charles M. "Chip" Coldwell
Senior Software Engineer
Red Hat, Inc
978-392-2426
* Re: seg-fault in unexelf.c
From: Chip Coldwell @ 2006-07-21 20:53 UTC (permalink / raw)
On Fri, 21 Jul 2006, Chip Coldwell wrote:
>
> If I build the temacs binary with no compiler optimization (gcc (GCC)
> 4.1.0 20060304 (Red Hat 4.1.0-3)), the command above seg-faults in the
> unexec function (file unexelf.c) while executing this line:
>
> memcpy (NEW_SECTION_H (nn).sh_offset + new_base,
> (caddr_t) OLD_SECTION_H (n).sh_addr,
> new_data2_size);
[ ... ]
> The problem is that the Linux kernel has set up the process virtual
> memory with a hole in it, and when the memcpy steps into this hole, it
> seg-faults.
To paraphrase: the memcpy copies from the process address space into
the .data section of the new ELF file it is creating. Its lower bound
is the .bss section start address from the temacs ELF file, and its
upper bound is sbrk(0) from the running temacs process. But the Linux
kernel can set up the process address space such that there are holes
in the virtual address space between these two addresses. Stepping
into such a hole gets you a segmentation fault.
A colleague suggested that one (crude) way to cope with this would be
signal(SIGSEGV, SIG_IGN);
before the memcpy and
signal(SIGSEGV, SIG_DFL);
afterwards, although it might be better to unroll the memcpy to do
just a page at a time if taking this approach.
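
A caveat on the SIG_IGN idea: POSIX leaves behavior undefined once a process ignores a SIGSEGV that was not generated by kill(), sigqueue() or raise(), so it may not survive the fault at all. A page-at-a-time copy can instead probe each source page before touching it. The sketch below is one possible way to do that, not a proposed patch: the function names are hypothetical, it uses Linux mincore(), which fails for unmapped ranges, and zero-filling the skipped bytes is an assumption about what the dumped file should contain. It does not detect pages that are mapped but unreadable (PROT_NONE).

```c
#define _DEFAULT_SOURCE         /* for mincore on glibc */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Return 1 if the page starting at page-aligned address PAGE is mapped.
   Any mincore failure is treated as "do not read this page".  */
int
page_is_mapped (const void *page, size_t page_size)
{
  unsigned char vec;
  return mincore ((void *) page, page_size, &vec) == 0;
}

/* Copy N bytes from SRC to DST one page at a time.  Source pages that
   are not mapped are skipped, and the corresponding bytes of DST are
   zeroed.  Returns the number of bytes skipped.  */
size_t
copy_skipping_holes (char *dst, const char *src, size_t n)
{
  long ps = sysconf (_SC_PAGESIZE);
  size_t page_size, skipped = 0;

  assert (ps > 0);
  page_size = (size_t) ps;

  while (n > 0)
    {
      /* Start of the page containing SRC, and the number of bytes left
         before SRC crosses into the next page.  */
      uintptr_t page = (uintptr_t) src & ~(uintptr_t) (page_size - 1);
      size_t chunk = page_size - (size_t) ((uintptr_t) src - page);
      if (chunk > n)
        chunk = n;

      if (page_is_mapped ((const void *) page, page_size))
        memcpy (dst, src, chunk);
      else
        {
          memset (dst, 0, chunk);
          skipped += chunk;
        }
      dst += chunk;
      src += chunk;
      n -= chunk;
    }
  return skipped;
}
```

Probing costs one system call per page, which is negligible for a one-off dump of a few megabytes.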
Chip
* seg-fault in unexelf.c
From: Nick Roberts @ 2006-07-22 0:33 UTC (permalink / raw)
Cc: emacs-devel
> I have been having a problem building emacs on GNU/Linux (in
> particular, the Fedora Core 5 distribution). The build process runs
> the following command in the src/ subdirectory:
>
> ./temacs -l loadup -batch dump
>
> to cause the temacs binary (essentially emacs without the Lisp code
> loaded) to set up the Lisp environment in the process address space
> and then create a new ELF file with a new .data section that contains
> the .data and .bss sections from the temacs process.
>
> If I build the temacs binary with no compiler optimization (gcc (GCC)
> 4.1.0 20060304 (Red Hat 4.1.0-3)), the command above seg-faults in the
> unexec function (file unexelf.c) while executing this line:
>...
I regularly build on FC5 without optimization and don't have any problems
(although I haven't done "make bootstrap" for a while; the final build uses
"./temacs -l loadup -batch dump").
gcc (GCC) 4.1.1 20060525 (Red Hat 4.1.1-1)
Linux kahikatea.snap.net.nz 2.6.15-1.2054_FC5 #1 Tue Mar 14 15:48:33 EST 2006 i686 i686 i386 GNU/Linux
In GNU Emacs 22.0.50.41 (i686-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
of 2006-07-21 on kahikatea.snap.net.nz
X server distributor `The X.Org Foundation', version 11.0.70000000
configured using `configure 'CFLAGS=-g3''
--
Nick http://www.inet.net.nz/~nickrob
* Re: seg-fault in unexelf.c
From: Richard Stallman @ 2006-07-22 4:39 UTC (permalink / raw)
Cc: emacs-devel
Which Emacs version is this?
* Re: seg-fault in unexelf.c
From: Chip Coldwell @ 2006-07-22 12:27 UTC (permalink / raw)
Cc: emacs-devel
On Sat, 22 Jul 2006, Richard Stallman wrote:
> Which Emacs version is this?
21.4. The unexec function has changed in 22 (no later than 22.0.50), but
it looks like it still has this same problem. I will verify that.
Chip
* Re: seg-fault in unexelf.c
From: Chip Coldwell @ 2006-07-22 12:37 UTC (permalink / raw)
Cc: emacs-devel
On Sat, 22 Jul 2006, Nick Roberts wrote:
>
> I regularly build on FC5 without optimization and don't have any problems
> (although I haven't done "make bootstrap" for a while; the final build uses
> "./temacs -l loadup -batch dump").
Which version? If you mean 21.4 (the one that ships with FC5), then all
this means is that, for whatever reason, your kernel is always setting up
the temacs process with a contiguous address space. The fact that the
Linux kernel could set up a process virtual address space with holes in it
has to be dealt with anyway.
Although I must admit, a poll of all the vm-gurus in the building couldn't
come up with a good reason *why* you would want a discontiguous process
address space.
Chip
* Re: seg-fault in unexelf.c
From: Nick Roberts @ 2006-07-22 12:50 UTC (permalink / raw)
Cc: emacs-devel
Chip Coldwell writes:
> On Sat, 22 Jul 2006, Nick Roberts wrote:
> >
> > I regularly build on FC5 without optimization and don't have any problems
> > (although I haven't done "make bootstrap" for a while; the final build uses
> > "./temacs -l loadup -batch dump").
>
> Which version?
Emacs 22.0.50.
> If you mean 21.4 (the one that ships with FC5), then all
> this means is that, for whatever reason, your kernel is always setting up
> the temacs process with a contiguous address space. The fact that the
> Linux kernel could set up a process virtual address space with holes in it
> has to be dealt with anyway.
Isn't this related to items in the PROBLEMS file (some of which I thought had
been solved)?:
*** Linux: Segfault during `make bootstrap' under certain recent versions of
the Linux kernel.
With certain recent Linux kernels (like the one of Redhat Fedora Core 1 and
newer), the new "Exec-shield" functionality is enabled by default, which
creates a different memory layout that breaks the emacs dumper. Emacs tries
to handle this at build time, but if the workaround used fails, these
instructions can be useful. The work-around explained here is not enough on
Fedora Core 4 (and possibly newer). Read the next item.
Configure can overcome the problem of exec-shield if the architecture is x86
and the program setarch is present. On other architectures no workaround is
known.
You can check the Exec-shield state like this:
cat /proc/sys/kernel/exec-shield
It returns non-zero when Exec-shield is enabled, 0 otherwise. Please read
your system documentation for more details on Exec-shield and associated
commands. Exec-shield can be turned off with this command:
echo "0" > /proc/sys/kernel/exec-shield
When Exec-shield is enabled, building Emacs will segfault during the
execution of this command:
./temacs --batch --load loadup [dump|bootstrap]
To work around this problem, it is necessary either to temporarily disable
Exec-shield while building Emacs or, on x86, to use the `setarch' command
when running temacs, like this:
setarch i386 ./temacs --batch --load loadup [dump|bootstrap]
*** Fedora Core 4 GNU/Linux: Segfault during dumping.
In addition to the exec-shield behavior described in the item above ("Linux:
Segfault during `make bootstrap' under certain recent versions of the Linux
kernel"), the Linux kernel shipped with Fedora Core 4 randomizes the virtual
address space of a process. As a result, dumping may fail even if you turn
off exec-shield. In this case, use the -R option to the setarch command:
setarch i386 -R ./temacs --batch --load loadup [dump|bootstrap]
or
setarch i386 -R make bootstrap
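
What `setarch -R' does can also be done from inside the program with the Linux personality() call: set the ADDR_NO_RANDOMIZE flag and re-exec, so that the new process image is laid out without randomization. The sketch below is a hedged illustration, not code from Emacs: the function names are hypothetical, it assumes argv[0] names something that can be re-executed, and it does not switch to the i386 execution domain the way `setarch i386' does.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/personality.h>
#include <unistd.h>

/* Return 1 if PERSONA, as returned by personality (0xffffffff), still
   has address-space randomization enabled.  */
int
aslr_enabled (int persona)
{
  return (persona & ADDR_NO_RANDOMIZE) == 0;
}

/* If randomization is on, switch it off and re-exec ARGV so that the
   new image is laid out without it.  Returns 0 when no re-exec was
   needed and -1 on failure; does not return when the exec succeeds.  */
int
reexec_without_aslr (char **argv)
{
  int persona;

  assert (argv != NULL && argv[0] != NULL);
  persona = personality (0xffffffff);   /* query without changing */
  if (persona == -1)
    return -1;
  if (!aslr_enabled (persona))
    return 0;                           /* already off: carry on */
  if (personality ((unsigned long) persona | ADDR_NO_RANDOMIZE) == -1)
    return -1;
  execvp (argv[0], argv);               /* the flag survives the exec */
  perror ("execvp");
  return -1;
}
```

Called first thing in main, this would make a program restart itself once with randomization disabled; the second time through the flag is already set, so it returns 0 and the program continues.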