From: Chip Coldwell
Newsgroups: gmane.emacs.devel
Subject: Re: 22.0.99 emacs dumper (?) problem
Date: Mon, 21 May 2007 10:45:18 -0400 (EDT)
To: Richard Stallman
Cc: emacs-devel@gnu.org, coldwell@redhat.com, Chong Yidong

On Mon, 21 May 2007, Richard Stallman wrote:

> I cannot reconcile the things that you've said about the problem.
>
> If this is the full extent of the problem
>
>     This appears to be due to a change in the internal format of freed
>     blocks introduced between glibc-2.5.90-21 and glibc-2.5.90-22. As a
>     result, Emacs binaries built using older versions of glibc may crash
>     when run using newer versions of glibc.
>
> then it ought to work just to relink temacs with the newer libc
> and then dump Emacs again.
>
> (Mr Coldwell, what happens if you do that?)

Yes, that works; the resulting dumped emacs binary does not seg-fault on startup.

> However, if this change is needed also
>
>     + MALLOC_MMAP_MAX_=0 LC_ALL=C $(RUN_TEMACS) -nl -batch -l loadup dump
>
> to make Emacs work with the newer libc, it implies there is a worse
> problem. It implies that Emacs fails to work correctly with the newer
> libc, and never mind what the older libc did.
>
> Which is it?

My current belief is that it is the former -- changes in the internal malloc data structures meant that the dumping temacs and dumped emacs had to have the same glibc version.
I currently believe the MALLOC_MMAP_MAX_=0 is a red herring; I've built several binaries without it that work fine.

dump_emacs contains this line

    malloc_state_ptr = malloc_get_state();

just before the call to unexec. Then malloc_initialize_hook (bound to the weak symbol __malloc_initialize_hook) does

    if (initialized)
      {
        [ ... ]
        malloc_set_state (malloc_state_ptr);
    #ifndef XMALLOC_OVERRUN_CHECK
        free (malloc_state_ptr);
    #endif
      }

The "initialized" variable is zero in the dumping emacs and nonzero in the dumped emacs. So malloc_get_state returns a pointer to an opaque data structure in the .bss segment (I believe all calls to malloc in temacs are guarded such that malloc always uses sbrk, not mmap). The .bss gets dumped into a new .data segment in the emacs binary, along with the opaque data structure. The dumped emacs binary then starts up and passes the pointer to the opaque data structure in the call to malloc_set_state.

Since the temacs binary and the emacs binary could be linked to different versions of glibc, this only works if there are no incompatible changes to this opaque data structure between glibc versions. IOW, the data structure is not quite as opaque as the glibc maintainers believed.

> Meanwhile, is this workaround really ok?

We are currently building without it; I believe this workaround isn't even necessary.

> Can Mr Coldwell (or anyone) tell us which?

Once the glibc situation settles down, I will verify that I can dump emacs on glibc-2.5.90-21 and run the resulting binary when it links to glibc-2.6-whatever. Then we can put this to rest.

> If it only prevents use of mmap when running temacs, then it is ok as
> a workaround, but I would like to generate it automatically thru the
> makefile mechanism rather than ask users to patch it by hand.

I think that the reason the workaround worked has something to do with the specifics of the malloc saved state opaque structure.
If mmap was off during dumping, then that saved state structure from previous glibc versions was compatible with the new glibc version, just by luck.

Chip

--
Charles M. "Chip" Coldwell
Senior Software Engineer
Red Hat, Inc
978-392-2426
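
The save-then-restore pattern discussed in this message can be illustrated with a small, self-contained sketch. It does not call glibc's malloc_get_state or malloc_set_state and does not reproduce glibc's real data layout; the names toy_state, toy_get_state, toy_set_state, and the magic/version fields are all hypothetical, chosen only to show why a saved blob written by one library version may be unsafe to hand to another, and how a restore routine could refuse it instead of crashing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an allocator's "opaque" saved state.
   This is NOT glibc's layout; it only illustrates the idea. */
#define TOY_STATE_MAGIC   0x544f5931L  /* arbitrary tag for this sketch */
#define TOY_STATE_VERSION 2L           /* layout version of "this" library */

struct toy_state {
    long   magic;       /* marks the blob as a saved state */
    long   version;     /* layout version of the library that wrote it */
    size_t free_bytes;  /* example payload */
};

/* Analogue of the save step performed just before dumping.
   Returns a heap-allocated blob that the caller must free, or NULL. */
void *toy_get_state(size_t free_bytes)
{
    struct toy_state *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->magic = TOY_STATE_MAGIC;
    s->version = TOY_STATE_VERSION;
    s->free_bytes = free_bytes;
    return s;
}

/* Analogue of the restore step run when the dumped binary starts.
   Returns 0 on success, -1 if the blob is missing or has the wrong
   magic, and -2 if it was written with a different layout version. */
int toy_set_state(const void *blob, size_t *free_bytes_out)
{
    const struct toy_state *s = blob;

    assert(free_bytes_out != NULL);
    if (s == NULL || s->magic != TOY_STATE_MAGIC)
        return -1;
    if (s->version != TOY_STATE_VERSION)
        return -2;
    *free_bytes_out = s->free_bytes;
    return 0;
}
```

In this sketch, a blob produced by toy_get_state restores cleanly, while one whose version field differs from TOY_STATE_VERSION is rejected with -2 rather than being interpreted with the wrong layout. Whether and how a real allocator performs such a check is outside what this message establishes.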