unofficial mirror of emacs-devel@gnu.org 
* Intermittent unexec failures on Linux >= 2.6.25
@ 2008-09-03 22:29 Ulrich Mueller
  2008-10-20 17:20 ` Christian Faulhammer
  0 siblings, 1 reply; 6+ messages in thread
From: Ulrich Mueller @ 2008-09-03 22:29 UTC
  To: emacs-devel; +Cc: emacs

Building of Emacs 22.2.92 (also 22.2) on Linux 2.6.25 (or later)
sometimes fails with a segmentation fault in dump-emacs / unexec.

This was reported by Jan Hrabe as Gentoo bug 236579,
<http://bugs.gentoo.org/236579>.

I've investigated and found that indeed temacs fails in dump-emacs
intermittently. For my test, I have run "make; rm src/emacs" 250 times
in a loop, and in 3 cases a segmentation fault of temacs occurred.

The problem seems to be that heap_bss_diff is too large for unexec
to succeed (due to kernel heap randomisation, see
<http://lkml.org/lkml/2007/10/23/435>).

On the other hand, it is (in case of the 3 failures) not large enough
to fulfill the condition (heap_bss_diff > MAX_HEAP_BSS_DIFF) which
would trigger the correct behaviour, namely setting the personality
and calling execve of itself.

In the 247 successful cases, heap_bss_diff first had a large value
(up to about 32 MiB), and in the exec'd temacs its value was constant,
namely 1887 bytes.

The 3 failures had heap_bss_diff = 575327, 911199, and 268127, which
are all smaller than MAX_HEAP_BSS_DIFF (1024*1024), so execvp was
_not_ called.
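
To make the comparison concrete, here is a minimal sketch of the
threshold test. Only the value of MAX_HEAP_BSS_DIFF is taken from
emacs.c; the helper name gap_triggers_reexec is hypothetical and the
real code is structured differently.

```c
#include <stddef.h>

/* Threshold as quoted from emacs.c in this thread.  */
#define MAX_HEAP_BSS_DIFF (1024 * 1024)

/* Hypothetical helper, not part of Emacs: return 1 if the gap between
   the end of bss and the start of the heap is large enough that temacs
   would set the personality and exec itself, 0 otherwise.  */
int
gap_triggers_reexec (size_t heap_bss_diff)
{
  return heap_bss_diff > MAX_HEAP_BSS_DIFF;
}
```

With the three failing values (575327, 911199, 268127) the helper
returns 0, so no exec happens, although the gap is far larger than the
1887 bytes seen in the exec'd temacs.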

Where does that value of MAX_HEAP_BSS_DIFF = 1 MiB come from? Could it
be decreased, or could temacs execve itself unconditionally on Linux?
In my opinion, a failure rate of about 1 % is too high.

(The problem doesn't exist for Linux 2.6.24, or if heap randomisation
is turned off, i.e. with /proc/sys/kernel/randomize_va_space < 2.)

Ulrich





* Re: Intermittent unexec failures on Linux >= 2.6.25
  2008-09-03 22:29 Intermittent unexec failures on Linux >= 2.6.25 Ulrich Mueller
@ 2008-10-20 17:20 ` Christian Faulhammer
  2008-10-20 17:56   ` Chong Yidong
  0 siblings, 1 reply; 6+ messages in thread
From: Christian Faulhammer @ 2008-10-20 17:20 UTC
  To: emacs; +Cc: emacs-devel


Hi,

I want to remind you of this bug report. Could you please respond to
it? We are able to reproduce the problem.

<URL:http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=900>

Ulrich Mueller <ulm@gentoo.org>:
> Building of Emacs 22.2.92 (also 22.2) on Linux 2.6.25 (or later)
> sometimes fails with a segmentation fault in dump-emacs / unexec.
> 
> This was reported by Jan Hrabe as Gentoo bug 236579,
> <http://bugs.gentoo.org/236579>.
> 
> I've investigated and found that indeed temacs fails in dump-emacs
> intermittently. For my test, I have run "make; rm src/emacs" 250 times
> in a loop, and in 3 cases a segmentation fault of temacs occurred.
> 
> The problem seems to be that heap_bss_diff is too large for unexec
> to succeed (due to kernel heap randomisation, see
> <http://lkml.org/lkml/2007/10/23/435>).
> 
> On the other hand, it is (in case of the 3 failures) not large enough
> to fulfill the condition (heap_bss_diff > MAX_HEAP_BSS_DIFF) which
> would trigger the correct behaviour, namely setting the personality
> and calling execve of itself.
> 
> In the 247 successful cases, heap_bss_diff first had a large value
> (up to about 32 MiB), and in the exec'd temacs its value was constant,
> namely 1887 bytes.
> 
> The 3 failures had heap_bss_diff = 575327, 911199, and 268127, which
> are all smaller than MAX_HEAP_BSS_DIFF (1024*1024), so execvp was
> _not_ called.
> 
> Where does that value of MAX_HEAP_BSS_DIFF = 1 MiB come from? Could it
> be decreased, or could temacs execve itself unconditionally on Linux?
> In my opinion, a failure rate of about 1 % is too high.
> 
> (The problem doesn't exist for Linux 2.6.24, or if heap randomisation
> is turned off, i.e. with /proc/sys/kernel/randomize_va_space < 2.)
> 
> Ulrich


-- 
Christian Faulhammer, Gentoo Lisp project
<URL:http://www.gentoo.org/proj/en/lisp/>, #gentoo-lisp on FreeNode

<URL:http://www.faulhammer.org/>


* Re: Intermittent unexec failures on Linux >= 2.6.25
  2008-10-20 17:20 ` Christian Faulhammer
@ 2008-10-20 17:56   ` Chong Yidong
  2008-10-21  6:32     ` Jan Djärv
  0 siblings, 1 reply; 6+ messages in thread
From: Chong Yidong @ 2008-10-20 17:56 UTC
  To: Jan Djärv; +Cc: Christian Faulhammer, emacs-devel, emacs

Hi Jan,

>> Building of Emacs 22.2.92 (also 22.2) on Linux 2.6.25 (or later)
>> sometimes fails with a segmentation fault in dump-emacs / unexec.
>> 
>> This was reported by Jan Hrabe as Gentoo bug 236579,
>> <http://bugs.gentoo.org/236579>.
>> 
>> I've investigated and found that indeed temacs fails in dump-emacs
>> intermittently. For my test, I have run "make; rm src/emacs" 250 times
>> in a loop, and in 3 cases a segmentation fault of temacs occurred.
>> 
>> The problem seems to be that heap_bss_diff is too large for unexec
>> to succeed (due to kernel heap randomisation, see
>> <http://lkml.org/lkml/2007/10/23/435>).
>>
>> On the other hand, it is (in case of the 3 failures) not large enough
>> to fulfill the condition (heap_bss_diff > MAX_HEAP_BSS_DIFF) which
>> would trigger the correct behaviour, namely setting the personality
>> and calling execve of itself.

Do you remember the rationale for setting

#define MAX_HEAP_BSS_DIFF (1024*1024)

in emacs.c?  This variable was introduced by you on 2004-10-20, and I'm
not too familiar with this part of the code.

>> In the 247 successful cases, heap_bss_diff first had a large value
>> (up to about 32 MiB), and in the exec'd temacs its value was constant,
>> namely 1887 bytes.
>> 
>> The 3 failures had heap_bss_diff = 575327, 911199, and 268127, which
>> are all smaller than MAX_HEAP_BSS_DIFF (1024*1024), so execvp was
>> _not_ called.
>> 
>> Where does that value of MAX_HEAP_BSS_DIFF = 1 MiB come from? Could it
>> be decreased, or could temacs execve itself unconditionally on Linux?
>> In my opinion, a failure rate of about 1 % is too high.
>> 
>> (The problem doesn't exist for Linux 2.6.24, or if heap randomisation
>> is turned off, i.e. with /proc/sys/kernel/randomize_va_space < 2.)





* Re: Intermittent unexec failures on Linux >= 2.6.25
  2008-10-20 17:56   ` Chong Yidong
@ 2008-10-21  6:32     ` Jan Djärv
  2008-10-21  8:32       ` Ulrich Mueller
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Djärv @ 2008-10-21  6:32 UTC
  To: Chong Yidong; +Cc: Christian Faulhammer, emacs-devel, emacs

I got it from the kernel source at the time.
I see now that there is no lower limit on the heap gap produced by
randomization.  I guess we must exec every time to be sure.

I think it is only heap randomization that unexec has problems with.  Other
address randomizations are ok.  But we will unconditionally turn off all of
them when we exec and dump.
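
A rough sketch of the idea, for illustration only and not the change
that was checked in: query the current persona, add ADDR_NO_RANDOMIZE,
and exec the same binary again. personality() and execvp() are the
real Linux/POSIX calls; the two helper names are hypothetical, and
using the persona bit to avoid exec'ing twice is an assumption of this
sketch.

```c
#include <sys/personality.h>
#include <unistd.h>

/* Return nonzero if persona PERS still allows address randomisation.  */
int
randomization_enabled (int pers)
{
  return (pers & ADDR_NO_RANDOMIZE) == 0;
}

/* Hypothetical helper: turn off address randomisation and exec ARGV[0]
   again.  Returns 0 if randomisation is already off (nothing to do),
   -1 on error.  Does not return if the exec succeeds.  */
int
reexec_without_randomization (char **argv)
{
  /* 0xffffffff queries the current persona without changing it.  */
  int pers = personality (0xffffffff);
  if (pers == -1)
    return -1;
  if (!randomization_enabled (pers))
    return 0;
  if (personality (pers | ADDR_NO_RANDOMIZE) == -1)
    return -1;
  execvp (argv[0], argv);
  return -1;  /* Only reached if execvp failed.  */
}
```

The persona is inherited across exec, so the second run sees the flag
already set and carries on to dump.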

I have checked in a fix.

	Jan D.






* Re: Intermittent unexec failures on Linux >= 2.6.25
  2008-10-21  6:32     ` Jan Djärv
@ 2008-10-21  8:32       ` Ulrich Mueller
  2008-10-21 10:18         ` Jan Djärv
  0 siblings, 1 reply; 6+ messages in thread
From: Ulrich Mueller @ 2008-10-21  8:32 UTC
  To: Jan Djärv; +Cc: Chong Yidong, emacs-devel, emacs

>>>>> On Tue, 21 Oct 2008, Jan Djärv wrote:

> Chong Yidong skrev:
>> Do you remember the rationale for setting
>> 
>> #define MAX_HEAP_BSS_DIFF (1024*1024)
>> 
>> in emacs.c?  This variable was introduced by you on 2004-10-20, and I'm
>> not too familiar with this part of the code.

> I got it from the kernel source at the time.
> I see now that there is no lower limit on the heap gap produced by
> randomization.

Unfortunately, yes. If I read the kernel sources right, the gap can
have any size between one page and 32 MiB.

(One could test if heap_bss_diff is larger than the page size ...
but probably it's difficult to get the test right, without making a
fence-post error.)
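
Such a test might look like the sketch below. The function name is
hypothetical, and whether the comparison should be > or >= is exactly
the fence-post question; treat the choice here as an assumption.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if HEAP_BSS_DIFF is larger than one
   page, which would suggest a randomised heap start.  If the page size
   cannot be determined, return 1 so the caller errs on the side of
   exec'ing again.  */
int
gap_exceeds_page (size_t heap_bss_diff)
{
  long page = sysconf (_SC_PAGESIZE);
  if (page <= 0)
    return 1;
  /* Assumption: strictly greater; >= may be the right test.  */
  return heap_bss_diff > (size_t) page;
}
```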

> I guess we must exec every time to be sure.

I attached a patch to my bug report
<http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=900>
which asks the kernel (by reading /proc/sys/kernel/randomize_va_space)
if heap randomisation is switched on. Or is this too fragile?
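
For readers without the patch at hand, the approach is roughly as
follows. This is a sketch written for this message, not the patch
itself: both function names are hypothetical, and treating an
unreadable file as "randomisation on" is an assumption. The value 2 as
the level that enables heap randomisation follows from the observation
that values < 2 avoid the problem.

```c
#include <stdio.h>

/* Read the integer stored in PATH, normally
   "/proc/sys/kernel/randomize_va_space".  Return the value, or -1 if
   the file cannot be opened or parsed.  */
int
read_randomize_va_space (const char *path)
{
  FILE *f = fopen (path, "r");
  int value;

  if (f == NULL)
    return -1;
  if (fscanf (f, "%d", &value) != 1)
    value = -1;
  fclose (f);
  return value;
}

/* Return 1 if heap randomisation should be assumed to be on.
   Assumption: an unreadable file counts as "on", so that the caller
   takes the safe path and execs again.  */
int
heap_randomization_enabled (const char *path)
{
  int value = read_randomize_va_space (path);
  return value < 0 || value >= 2;
}
```

The fragile part is the dependency on the /proc file being present and
keeping its meaning across kernel versions.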

Ulrich





* Re: Intermittent unexec failures on Linux >= 2.6.25
  2008-10-21  8:32       ` Ulrich Mueller
@ 2008-10-21 10:18         ` Jan Djärv
  0 siblings, 0 replies; 6+ messages in thread
From: Jan Djärv @ 2008-10-21 10:18 UTC
  To: Ulrich Mueller; +Cc: Chong Yidong, emacs-devel, emacs

Ulrich Mueller skrev:

> 
> I had attached a patch at my bug report
> <http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=900>
> which asks the kernel (by reading /proc/sys/kernel/randomize_va_space)
> if heap randomisation is switched on. Or is this too fragile?
> 

I thought so.

	Jan D.





