From: Rich Felker
Newsgroups: gmane.emacs.devel
Subject: Re: Dumper problems and a possible solutions
Date: Wed, 25 Jun 2014 15:03:33 -0400
Message-ID: <20140625190333.GZ179@brightrain.aerifal.cx>
References: <20140624171955.GS179@brightrain.aerifal.cx> <53AB0EF8.4090608@yandex.ru> <831tucrguf.fsf@gnu.org> <20140625183241.GW179@brightrain.aerifal.cx> <83wqc4q0xl.fsf@gnu.org>
In-Reply-To: <83wqc4q0xl.fsf@gnu.org>
To: Eli Zaretskii
Cc: dmantipov@yandex.ru, emacs-devel@gnu.org

On Wed, Jun 25, 2014 at 09:49:42PM +0300, Eli Zaretskii wrote:
> > Date: Wed, 25 Jun 2014 14:32:41 -0400
> > From: Rich Felker
> > Cc: Dmitry Antipov , emacs-devel@gnu.org
> >
> > > Is it possible to provide our own implementation of sbrk that
> > > allocates memory from some large static array?
> >
> > That's exactly the hack I described which I'm using right now. But
> > since I didn't implement a free-like operation and since
> > load_charset_map_from_file allocates >700k every time it's called, I
> > had to make the static array 400MB.
>
> That's not a problem, because those 700K are free'd before the next
> one is allocated. And in any case, they are all free'd before we call
> unexec. Just implement sbrk for negative increment. The Windows port

But load_charset_map_from_file doesn't call an sbrk-like interface; it
calls (indirectly) xmalloc and xfree. So there's at least some
nontrivial glue that goes in between.

> already does that, see w32heap.c on the trunk. It works with only
> 11MB of static array for 32-bit builds and 18MB for 64-bit.

Nice to know.
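
For concreteness, here is a minimal sketch of the idea discussed above:
an sbrk-like function that hands out memory from a static array and
also accepts a negative increment. It is not the w32heap.c code; the
name static_sbrk and the 16MB array size are assumptions made up for
illustration, and a real replacement would still need the glue to
xmalloc and xfree mentioned above.

```c
#include <stddef.h>

/* Hypothetical size, chosen only for this sketch. */
#define STATIC_HEAP_SIZE (16u * 1024 * 1024)

static unsigned char static_heap[STATIC_HEAP_SIZE];
static size_t static_brk;   /* offset of the current break in static_heap */

/* sbrk-like interface over the static array (hypothetical name).
   Moves the break by INCREMENT bytes, which may be negative, and
   returns the previous break.  Returns (void *) -1 and leaves the
   break unchanged if the request does not fit in the array.  */
void *
static_sbrk (ptrdiff_t increment)
{
  size_t old = static_brk;

  if (increment >= 0)
    {
      if ((size_t) increment > STATIC_HEAP_SIZE - old)
        return (void *) -1;     /* would run past the end of the array */
      static_brk = old + (size_t) increment;
    }
  else
    {
      /* Unsigned negation is well defined even for PTRDIFF_MIN.  */
      size_t shrink = (size_t) 0 - (size_t) increment;
      if (shrink > old)
        return (void *) -1;     /* would move below the start */
      static_brk = old - shrink;
    }
  return static_heap + old;
}
```

With this shape, a 700K allocation that is released again with a
matching negative increment leaves the break where it started, so the
array only has to cover the peak usage before dumping.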
> > I think it would work with a "real" mini-malloc implementation using
> > the static array, and a much smaller static array (maybe 8-15 MB)
> > but my attempts to write a quick one have been sloppy and buggy so
> > far.
>
> If supporting deallocation in such an sbrk isn't feasible, how about
> using gmalloc, as an malloc replacement before dumping?

I suspect it's a lot of work to wire up gmalloc to (1) avoid
interposing on the malloc/free/etc. names, (2) use the static mini-brk
buffer, and (3) only allocate from the mini-brk buffer before dumping
(otherwise pass to the real malloc), while still checking realloc/free
calls after dumping and handling the case where the old memory was in
the mini-brk.

What seems easier, and what I tried, is writing a completely naive
malloc with a single free list that is searched linearly on each
allocation and that does not coalesce free chunks. But my
implementation still has bugs, because it's not working. I'm not sure
whether they're bugs in the allocator or bugs in how it's used (maybe
I missed some places that would have to be redirected through it and
which are still calling malloc or free directly).

> > I would be reasonably happy with this solution (at least it would fix
> > the problems I'm experiencing), but I don't think it's as elegant as
> > fixing the portability problem completely by getting rid of the need
> > to dump executable binary files and instead dumping a C array.
>
> But it's conceptually much simpler and reliable. That's "elegant" in
> my book, when such hairy stuff is concerned.

No, it's less reliable. See my other posts in the thread about what
happens if you have other libraries linked and they do nontrivial
things prior to dumping (e.g. from static ctors). Dumping JUST the
lisp object state in a C array ensures that none of the pre-dump state
from other libraries (or even libc) can pollute the state observed
after dumping.
Both the current method and the proposed simple fixes above suffer
from this issue and are therefore very fragile. As an example (I think
I mentioned this earlier), if you link statically, musl libc remembers
the clock_gettime vdso pointer from the pre-dump state and attempts to
use it later (which is not valid because the kernel maps the vdso at a
random address).

> > And it doesn't fix the fact that you can't build a PIE emacs.
>
> Why is that important?

Since emacs processes lots of potentially untrusted data, PIE may help
harden it against vulnerabilities where an attacker would otherwise be
able to perform arbitrary code execution as the user running emacs.
I'm not aware of such vulnerabilities, but given that I found things
that look suspiciously like use-after-free while reading the
allocator-related code, I wouldn't be surprised if they exist.

Rich
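
As an illustration of the naive allocator described earlier in this
message, the following is a rough sketch, not the implementation that
was actually tried: a single free list searched linearly, no
coalescing and no splitting, carved out of a static buffer. It also
shows the kind of ownership check needed to route free calls after
dumping. All names (mini_malloc, mini_free, mini_owns, pre_dump_malloc,
pre_dump_free, dumped) and the buffer size are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Chunk header; the union forces payloads to be maximally aligned.  */
union header
{
  struct { size_t units; union header *next; } s;
  max_align_t align;
};

#define MINI_HEAP_UNITS 65536           /* hypothetical buffer size */

static union header mini_heap[MINI_HEAP_UNITS];
static size_t mini_top;                 /* first never-used unit */
static union header *mini_freelist;     /* singly linked, unsorted */
static int dumped;                      /* hypothetical: set after dumping */

/* Nonzero if P points into the static buffer.  */
int
mini_owns (const void *p)
{
  uintptr_t a = (uintptr_t) p, lo = (uintptr_t) mini_heap;
  return a >= lo && a - lo < sizeof mini_heap;
}

void *
mini_malloc (size_t nbytes)
{
  if (nbytes == 0)
    nbytes = 1;
  if (nbytes > (MINI_HEAP_UNITS - 1) * sizeof (union header))
    return NULL;
  /* Payload rounded up to whole units, plus one unit for the header.  */
  size_t units = (nbytes + sizeof (union header) - 1)
                 / sizeof (union header) + 1;

  /* First fit: linear search; a free chunk is reused whole.  */
  for (union header **pp = &mini_freelist; *pp; pp = &(*pp)->s.next)
    if ((*pp)->s.units >= units)
      {
        union header *h = *pp;
        *pp = h->s.next;
        return h + 1;
      }

  /* Nothing on the free list fits: carve from the unused tail.  */
  if (units > MINI_HEAP_UNITS - mini_top)
    return NULL;
  union header *h = &mini_heap[mini_top];
  mini_top += units;
  h->s.units = units;
  return h + 1;
}

/* Push the chunk on the free list; adjacent chunks are not merged.  */
void
mini_free (void *p)
{
  if (!p)
    return;
  union header *h = (union header *) p - 1;
  h->s.next = mini_freelist;
  mini_freelist = h;
}

/* Glue: static buffer before dumping, system malloc afterwards.  */
void *
pre_dump_malloc (size_t n)
{
  return dumped ? malloc (n) : mini_malloc (n);
}

/* Free must check ownership, since pre-dump memory outlives the dump.  */
void
pre_dump_free (void *p)
{
  if (mini_owns (p))
    mini_free (p);
  else
    free (p);
}
```

Because chunks are never split or merged, fragmentation can be severe;
the sketch only aims to show how little code the approach needs and
where the pre-dump/post-dump distinction has to be made.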