* seg-fault in unexelf.c
@ 2006-07-21 17:59 Chip Coldwell
  2006-07-21 20:53 ` Chip Coldwell
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Chip Coldwell @ 2006-07-21 17:59 UTC (permalink / raw)



I have been having a problem building emacs on GNU/Linux (in
particular, the Fedora Core 5 distribution).  The build process runs
the following command in the src/ subdirectory:

./temacs -l loadup -batch dump

to cause the temacs binary (essentially emacs without the Lisp code
loaded) to set up the Lisp environment in the process address space
and then create a new ELF file with a new .data section that contains
the .data and .bss sections from the temacs process.

If I build the temacs binary with no compiler optimization (gcc (GCC)
4.1.0 20060304 (Red Hat 4.1.0-3)), the command above seg-faults in the
unexec function (file unexelf.c) while executing this line:

           memcpy (NEW_SECTION_H (nn).sh_offset + new_base,
 		  (caddr_t) OLD_SECTION_H (n).sh_addr,
 		  new_data2_size);

I unrolled the memcpy thus:

 	  p = NEW_SECTION_H (nn).sh_offset + new_base;
 	  q = (caddr_t) OLD_SECTION_H (n).sh_addr;
 	  for (i = 0; i < new_data2_size; i++)
 	    p[i] = q[i];

ran the debugger and found the segfault happens when

(gdb) p/x q+i
$5 = 0x82f0000
(gdb) p/x i
$8 = 0xed160
(gdb) p/x new_bss_addr
$10 = 0x852a000

In the meantime, if I look in /proc/[PID]/maps I find this:

08048000-081fa000 r-xp 00000000 fd:00 1311728    /home/coldwell/rpm/BUILD/emacs-21.4/src/temacs
081fa000-08203000 rw-p 001b2000 fd:00 1311728    /home/coldwell/rpm/BUILD/emacs-21.4/src/temacs
08203000-082f0000 rw-p 08203000 00:00 0 
08337000-0852a000 rw-p 08337000 00:00 0          [heap]
b730e000-b7da3000 rw-p b730e000 00:00 0

The problem is that the Linux kernel has set up the process virtual
memory with a hole in it, and when the memcpy steps into this hole, it
seg-faults.
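
The hole can also be located mechanically from a listing like the one
above.  As a minimal sketch (find_hole and sample_maps are hypothetical
names, not part of unexelf.c), the following C helper walks text in
/proc/[PID]/maps format and reports the first unmapped gap inside a
given address range.  It assumes the entries are sorted by address, as
the kernel prints them; the sample data is the listing above, trimmed
to the address and permission fields.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The maps listing from this message, trimmed to the first two fields.  */
static const char sample_maps[] =
  "08048000-081fa000 r-xp\n"
  "081fa000-08203000 rw-p\n"
  "08203000-082f0000 rw-p\n"
  "08337000-0852a000 rw-p\n"
  "b730e000-b7da3000 rw-p\n";

/* Hypothetical helper: find the first unmapped gap inside [lo, hi).
   MAPS is text in /proc/[PID]/maps format, sorted by address.
   Returns 1 and stores the gap bounds if a gap exists, else 0.  */
static int
find_hole (const char *maps, unsigned long lo, unsigned long hi,
           unsigned long *hole_start, unsigned long *hole_end)
{
  unsigned long covered = lo;   /* everything in [lo, covered) is mapped */
  const char *line = maps;

  assert (maps && hole_start && hole_end);
  while (line && *line && covered < hi)
    {
      unsigned long start, end;

      if (sscanf (line, "%lx-%lx", &start, &end) == 2 && end > covered)
        {
          if (start > covered)
            {
              /* Nothing is mapped between COVERED and this entry.  */
              *hole_start = covered;
              *hole_end = start < hi ? start : hi;
              return 1;
            }
          covered = end;
        }
      line = strchr (line, '\n');
      if (line)
        line++;
    }
  if (covered < hi)
    {
      /* The listing ended before the range was fully covered.  */
      *hole_start = covered;
      *hole_end = hi;
      return 1;
    }
  return 0;
}
```

Called on sample_maps with a range that starts at 0x08203000 and ends at
the top of [heap] (0x0852a000), it reports the gap from 0x082f0000 to
0x08337000, which is where the copy faults.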

Here are more details.  "new_data2_size" is computed from (paraphrasing)

new_data2_size = sbrk(0) - OLD_SECTION_H (old_bss_index).sh_addr

In other words, the size of the new data section is determined to be
the top of the heap of the current temacs process minus the bottom of
the .bss in the temacs ELF file on disk.

"readelf -S temacs" gives

  [20] .data             PROGBITS        081fa880 1b2880 008610 00  WA  0   0 32
  [21] .bss              NOBITS          08202ea0 1bae90 0ec6d0 00  WA  0   0 32
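
The figures above are consistent with each other, which a few lines of
C can confirm.  This is only a sketch of the arithmetic: the macro and
function names are made up for illustration, and it assumes that
sbrk(0) equals 0x0852a000, the end of [heap], which is also the value
gdb prints for new_bss_addr.

```c
#include <assert.h>
#include <stdint.h>

/* All constants are copied from the gdb, maps and readelf output above.  */
#define BSS_START   UINT32_C (0x08202ea0)  /* .bss address (readelf) */
#define BSS_SIZE    UINT32_C (0x000ec6d0)  /* .bss size (readelf) */
#define FAULT_INDEX UINT32_C (0x000ed160)  /* i at the fault (gdb) */
#define MAP_END     UINT32_C (0x082f0000)  /* end of the third mapping */
#define HEAP_START  UINT32_C (0x08337000)  /* start of [heap] */
#define HEAP_END    UINT32_C (0x0852a000)  /* assumed value of sbrk(0) */

/* End of .bss as recorded in the ELF file on disk.  */
static uint32_t bss_end (void) { return BSS_START + BSS_SIZE; }

/* Source address q + i at the moment of the fault.  */
static uint32_t fault_address (void) { return BSS_START + FAULT_INDEX; }

/* Size of the unmapped gap between the third mapping and [heap].  */
static uint32_t hole_size (void) { return HEAP_START - MAP_END; }

/* new_data2_size under the sbrk(0) assumption stated above.  */
static uint32_t copy_size (void) { return HEAP_END - BSS_START; }

/* The first unreadable byte is exactly where the third mapping ends.  */
static void
check_fault_is_at_hole (void)
{
  assert (fault_address () == MAP_END);
}
```

The .bss section ends at 0x082ef570, which rounds up to the page
boundary 0x082f0000 where the third mapping ends.  The heap does not
start until 0x08337000, so a copy of 0x327160 bytes starting at the
bottom of .bss has to cross a gap of 0x47000 bytes (71 pages of 4 KiB).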

Chip

-- 
Charles M. "Chip" Coldwell
Senior Software Engineer
Red Hat, Inc
978-392-2426


* Re: seg-fault in unexelf.c
  2006-07-21 17:59 seg-fault in unexelf.c Chip Coldwell
@ 2006-07-21 20:53 ` Chip Coldwell
  2006-07-22  0:33 ` Nick Roberts
  2006-07-22  4:39 ` Richard Stallman
  2 siblings, 0 replies; 7+ messages in thread
From: Chip Coldwell @ 2006-07-21 20:53 UTC (permalink / raw)


On Fri, 21 Jul 2006, Chip Coldwell wrote:
>
> If I build the temacs binary with no compiler optimization (gcc (GCC)
> 4.1.0 20060304 (Red Hat 4.1.0-3)), the command above seg-faults in the
> unexec function (file unexelf.c) while executing this line:
>
>          memcpy (NEW_SECTION_H (nn).sh_offset + new_base,
> 		  (caddr_t) OLD_SECTION_H (n).sh_addr,
> 		  new_data2_size);

[ ... ]

> The problem is that the Linux kernel has set up the process virtual
> memory with a hole in it, and when the memcpy steps into this hole, it
> seg-faults.

To paraphrase: the memcpy copies from the process address space into
the .data section of the new ELF file it is creating.  Its lower bound
is the .bss section start address from the temacs ELF file, and its
upper bound is sbrk(0) from the running temacs process.  The Linux
kernel, however, can set up the process address space such that there
are holes in the virtual address space between these two addresses.
Stepping into such a hole gets you a segmentation fault.

A colleague suggested that one (crude) way to cope with this would be

signal(SIGSEGV, SIG_IGN);

before the memcpy and

signal(SIGSEGV, SIG_DFL);

afterwards, although it might be better to unroll the memcpy to do
just a page at a time if taking this approach.
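
One caveat with SIG_IGN: POSIX leaves the behavior undefined after a
process ignores a SIGSEGV that was not generated by kill(), sigqueue()
or raise(), so the memcpy may not simply carry on past the hole.  A
page-at-a-time copy can avoid signals altogether.  The sketch below
uses a different technique from the one suggested above: it probes
each source page with the Linux mincore() call, which fails with
ENOMEM on unmapped memory, and copies only the pages that are mapped.
The names page_is_mapped and copy_skipping_holes are hypothetical, and
zero-filling the skipped regions is an assumption about what the
dumped file should contain there.

```c
#define _DEFAULT_SOURCE         /* for mincore() on glibc */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if the page containing ADDR is mapped.
   Any mincore() failure (ENOMEM for unmapped memory, or anything else)
   is treated as "not mapped", so the caller skips that page.  */
static int
page_is_mapped (const char *addr, size_t pagesize)
{
  unsigned char vec;
  uintptr_t page = (uintptr_t) addr & ~((uintptr_t) pagesize - 1);

  return mincore ((void *) page, pagesize, &vec) == 0;
}

/* Copy SIZE bytes from SRC to DST one source page at a time.  Parts of
   DST that correspond to unmapped source pages are zero-filled.
   Returns the number of bytes that were skipped.  */
static size_t
copy_skipping_holes (char *dst, const char *src, size_t size)
{
  long ps = sysconf (_SC_PAGESIZE);
  size_t pagesize, done = 0, skipped = 0;

  assert (ps > 0);
  pagesize = (size_t) ps;

  while (done < size)
    {
      /* Bytes left in the source page that contains src + done.  */
      size_t in_page
        = pagesize - ((uintptr_t) (src + done) & (pagesize - 1));
      size_t chunk = in_page < size - done ? in_page : size - done;

      if (page_is_mapped (src + done, pagesize))
        memcpy (dst + done, src + done, chunk);
      else
        {
          memset (dst + done, 0, chunk);
          skipped += chunk;
        }
      done += chunk;
    }
  return skipped;
}
```

In unexec this would stand in for the memcpy quoted above, with the
same three arguments.  A page that is mapped without read permission
still passes the mincore() probe and would fault, so this only covers
genuinely unmapped holes like the one in the maps listing.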

Chip

-- 
Charles M. "Chip" Coldwell
Senior Software Engineer
Red Hat, Inc
978-392-2426


* seg-fault in unexelf.c
  2006-07-21 17:59 seg-fault in unexelf.c Chip Coldwell
  2006-07-21 20:53 ` Chip Coldwell
@ 2006-07-22  0:33 ` Nick Roberts
  2006-07-22 12:37   ` Chip Coldwell
  2006-07-22  4:39 ` Richard Stallman
  2 siblings, 1 reply; 7+ messages in thread
From: Nick Roberts @ 2006-07-22  0:33 UTC (permalink / raw)
  Cc: emacs-devel

 > I have been having a problem building emacs on GNU/Linux (in
 > particular, the Fedora Core 5 distribution).  The build process runs
 > the following command in the src/ subdirectory:
 > 
 > ./temacs -l loadup -batch dump
 > 
 > to cause the temacs binary (essentially emacs without the Lisp code
 > loaded) to set up the Lisp environment in the process address space
 > and then create a new ELF file with a new .data section that contains
 > the .data and .bss sections from the temacs process.
 > 
 > If I build the temacs binary with no compiler optimization (gcc (GCC)
 > 4.1.0 20060304 (Red Hat 4.1.0-3)), the command above seg-faults in the
 > unexec function (file unexelf.c) while executing this line:
 >...

I regularly build on FC5 without optimization and don't have any problems
(although I haven't done "make bootstrap" for a while, the final build uses
"./temacs -l loadup -batch dump").


gcc (GCC) 4.1.1 20060525 (Red Hat 4.1.1-1)

Linux kahikatea.snap.net.nz 2.6.15-1.2054_FC5 #1 Tue Mar 14 15:48:33 EST 2006 i686 i686 i386 GNU/Linux

In GNU Emacs 22.0.50.41 (i686-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
 of 2006-07-21 on kahikatea.snap.net.nz
X server distributor `The X.Org Foundation', version 11.0.70000000
configured using `configure 'CFLAGS=-g3''

-- 
Nick                                           http://www.inet.net.nz/~nickrob


* Re: seg-fault in unexelf.c
  2006-07-21 17:59 seg-fault in unexelf.c Chip Coldwell
  2006-07-21 20:53 ` Chip Coldwell
  2006-07-22  0:33 ` Nick Roberts
@ 2006-07-22  4:39 ` Richard Stallman
  2006-07-22 12:27   ` Chip Coldwell
  2 siblings, 1 reply; 7+ messages in thread
From: Richard Stallman @ 2006-07-22  4:39 UTC (permalink / raw)
  Cc: emacs-devel

Which Emacs version is this?


* Re: seg-fault in unexelf.c
  2006-07-22  4:39 ` Richard Stallman
@ 2006-07-22 12:27   ` Chip Coldwell
  0 siblings, 0 replies; 7+ messages in thread
From: Chip Coldwell @ 2006-07-22 12:27 UTC (permalink / raw)
  Cc: emacs-devel

On Sat, 22 Jul 2006, Richard Stallman wrote:

> Which Emacs version is this?

21.4.  The unexec function has changed in 22 (no later than 22.0.50), but 
it looks like it still has this same problem.  I will verify that.

Chip

-- 
Charles M. "Chip" Coldwell
Senior Software Engineer
Red Hat, Inc
978-392-2426


* Re: seg-fault in unexelf.c
  2006-07-22  0:33 ` Nick Roberts
@ 2006-07-22 12:37   ` Chip Coldwell
  2006-07-22 12:50     ` Nick Roberts
  0 siblings, 1 reply; 7+ messages in thread
From: Chip Coldwell @ 2006-07-22 12:37 UTC (permalink / raw)
  Cc: emacs-devel

On Sat, 22 Jul 2006, Nick Roberts wrote:
>
> I regularly build on FC5 without optimization and don't have any problems
> (although I haven't done "make bootstrap" for a while, the final build uses
> "./temacs -l loadup -batch dump").

Which version?  If you mean 21.4 (the one that ships with FC5), then all 
this means is that, for whatever reason, your kernel is always setting up 
the temacs process with a contiguous address space.  The fact that the 
Linux kernel could set up a process virtual address space with holes in it 
has to be dealt with anyway.

Although I must admit, a poll of all the vm-gurus in the building couldn't 
come up with a good reason *why* you would want a discontiguous process 
address space.

Chip

-- 
Charles M. "Chip" Coldwell
Senior Software Engineer
Red Hat, Inc
978-392-2426


* Re: seg-fault in unexelf.c
  2006-07-22 12:37   ` Chip Coldwell
@ 2006-07-22 12:50     ` Nick Roberts
  0 siblings, 0 replies; 7+ messages in thread
From: Nick Roberts @ 2006-07-22 12:50 UTC (permalink / raw)
  Cc: emacs-devel

Chip Coldwell writes:
 > On Sat, 22 Jul 2006, Nick Roberts wrote:
 > >
 > > I regularly build on FC5 without optimization and don't have any problems
 > > (although I haven't done "make bootstrap" for a while, the final build uses
 > > "./temacs -l loadup -batch dump").
 > 
 > Which version?

Emacs 22.0.50.

 >                 If you mean 21.4 (the one that ships with FC5), then all 
 > this means is that, for whatever reason, your kernel is always setting up 
 > the temacs process with a contiguous address space.  The fact that the 
 > Linux kernel could set up a process virtual address space with holes in it 
 > has to be dealt with anyway.

Isn't this related to items in the PROBLEMS file (some of which I thought had
been solved)?:

  *** Linux: Segfault during `make bootstrap' under certain recent versions of
      the Linux kernel.

  With certain recent Linux kernels (like the one of Redhat Fedora Core 1 and
  newer), the new "Exec-shield" functionality is enabled by default, which
  creates a different memory layout that breaks the emacs dumper.  Emacs tries
  to handle this at build time, but if the workaround used fails, these
  instructions can be useful.  The work-around explained here is not enough on
  Fedora Core 4 (and possibly newer). Read the next item.

  Configure can overcome the problem of exec-shield if the architecture is x86
  and the program setarch is present.  On other architectures no workaround is
  known.

  You can check the Exec-shield state like this:

    cat /proc/sys/kernel/exec-shield

  It returns non-zero when Exec-shield is enabled, 0 otherwise.  Please read
  your system documentation for more details on Exec-shield and associated
  commands.  Exec-shield can be turned off with this command:

    echo "0" > /proc/sys/kernel/exec-shield

  When Exec-shield is enabled, building Emacs will segfault during the
  execution of this command:

    ./temacs --batch --load loadup [dump|bootstrap]

  To work around this problem, temporarily disable Exec-shield while
  building Emacs or, on x86, run temacs through the `setarch' command
  like this:

    setarch i386 ./temacs --batch --load loadup [dump|bootstrap]


  *** Fedora Core 4 GNU/Linux: Segfault during dumping.

  In addition to the exec-shield issue explained in the item above ("Linux:
  Segfault during `make bootstrap' under certain recent versions of the Linux
  kernel"), the Linux kernel shipped with Fedora Core 4 randomizes the virtual
  address space of a process. As a result, dumping may fail even if you turn
  off exec-shield. In this case, use the -R option to the setarch command:

   setarch i386 -R ./temacs --batch --load loadup [dump|bootstrap]

 or

   setarch i386 -R make bootstrap



-- 
Nick                                           http://www.inet.net.nz/~nickrob
