From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Dumper problems and a possible solutions
Date: Tue, 24 Jun 2014 17:37:39 -0400
To: Rich Felker
Cc: emacs-devel@gnu.org
References: <20140624171955.GS179@brightrain.aerifal.cx> <20140624194026.GT179@brightrain.aerifal.cx> <20140624211519.GU179@brightrain.aerifal.cx>
In-Reply-To: <20140624211519.GU179@brightrain.aerifal.cx> (Rich Felker's message of "Tue, 24 Jun 2014 17:15:19 -0400")

>> > Thanks for the feedback. Can you elaborate on how/why the hash
>> > changes, and where it's stored that would need to be updated?
>> When placing an object in a hash-table, the hashing function often just
>> uses the address as "the hash value". So any hash-table that uses such
>> a hash-function will need to be rehashed after relocation.
> I see. Is this hashing all at the C level, or is it happening in lisp
> code?

It's in C.

> Can the lisp code even see the address value for lisp objects?

It usually doesn't see it, but the `sxhash' function does return values
which can depend on the address of objects (not for cons cells or
arrays, but for objects such as processes, markers, buffers, overlays,
...). I think it happens rarely enough that we can hope to be OK on
this front.

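To make the address-as-hash point concrete, here is a toy sketch. It is written in Python for brevity and is not Emacs code: the "addresses" are plain integers, and `Obj`, `AddrHashTable`, `relocate` and `demo` are hypothetical names invented for the illustration. It only shows why a table keyed on addresses stops finding its entries once objects move, and why a rehash fixes it.

```python
# Toy model only: "addresses" are simulated integers, not real pointers,
# and none of these names exist in Emacs.

class Obj:
    def __init__(self, name, addr):
        self.name = name
        self.addr = addr  # simulated heap address


class AddrHashTable:
    """Hash table whose hash value is simply the key's (simulated) address."""

    def __init__(self, nbuckets=8):
        self.buckets = [[] for _ in range(nbuckets)]

    def _index(self, obj):
        # The bucket is derived from the address, so it changes if obj moves.
        return obj.addr % len(self.buckets)

    def put(self, obj, value):
        bucket = self.buckets[self._index(obj)]
        for i, (key, _) in enumerate(bucket):
            if key is obj:
                bucket[i] = (obj, value)
                return
        bucket.append((obj, value))

    def get(self, obj, default=None):
        for key, value in self.buckets[self._index(obj)]:
            if key is obj:
                return value
        return default

    def rehash(self):
        # Re-insert every entry using the keys' current addresses.
        entries = [entry for bucket in self.buckets for entry in bucket]
        self.buckets = [[] for _ in self.buckets]
        for key, value in entries:
            self.buckets[self._index(key)].append((key, value))


def relocate(objs, delta):
    """Simulate loading the dumped heap at a different base address."""
    for obj in objs:
        obj.addr += delta


def demo():
    marker = Obj("marker", 0x1000)
    table = AddrHashTable()
    table.put(marker, "payload")
    before = table.get(marker)   # found: entry sits in the bucket for 0x1000
    relocate([marker], 0x13)     # address changes, stored bucket does not
    stale = table.get(marker)    # lookup probes the wrong bucket
    table.rehash()
    after = table.get(marker)    # found again after rehashing
    return before, stale, after
```

In this model the entry is never lost; it is merely filed under the old address, which is why walking the heap and rehashing each table found is enough to repair it.
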
> If it's purely at the C level I doubt it would be hard to re-do the
> hashes but I obviously haven't read the relevant code.

Indeed, we just need to rehash all the hash-tables we find while
traversing the heap.

>> Yes, the GC already knows how to find the references that are inside
>> Lisp objects, but there can also be references coming from global
>> variables (for sure) or non-Lisp data-structures or maybe from the stack
>> (not sure about those last two).
> How does the GC avoid freeing objects that have these kinds of
> references?

It knows about some of those pointers (via `staticpro' for global
variables and via conservative stack scanning for the stack).

> BTW, at the point of dumping, my impression is that there should not
> be relevant references from the stack;

That'd be my hope as well.

>> We could support relocation at mmap-time to solve this.
> Yes, but that's conceptually just as difficult as dumping to a C
> array: you have to patch up all the addresses and the hash values will
> change.

Agreed. Relocation is the big issue and pretty much any technique we
may like to use will need to address the problem. I wonder how
smalltalk machines deal with it.

> I agree completely. The current situation makes it nearly impossible
> to port emacs to a system that's not making strong guarantees about
> its implementation internals, and (at least from my understanding
> reading list archives) it's imposing ugly constraints on existing
> implementations (glibc) not to change internals in ways that would
> break emacs' dumper. I would really like to see fixing this issue
> treated as a priority in the future direction of emacs.

It's been a latent problem for the last 20 years or so, but it rarely
bites, so it's not of terribly high priority in general, especially
since new systems don't show up very often. But it's important enough
that we might be willing to pay some price (e.g. the relocation code
will likely either require significant changes to the GC code, or it
will duplicate significant chunks of the GC code).


        Stefan