unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#23529: Request for fixing randomize_va_space build issues
@ 2016-05-13 12:18 Philippe Vaucher
  2016-05-13 15:58 ` Paul Eggert
  0 siblings, 1 reply; 66+ messages in thread
From: Philippe Vaucher @ 2016-05-13 12:18 UTC (permalink / raw)
  To: 23529

Hello,

When /proc/sys/kernel/randomize_va_space is 2, emacs fails to build:

    Dumping under the name emacs
    **************************************************
    Warning: Your system has a gap between BSS and the
    heap (20865783 bytes).  This usually means that exec-shield
    or something similar is in effect.  The dump may
    fail because of this.  See the section about
    exec-shield in etc/PROBLEMS for more information.
    **************************************************
    /bin/bash: line 7:  8981 Segmentation fault      (core dumped) ./temacs --batch --load loadup bootstrap
    Makefile:815: recipe for target 'bootstrap-emacs' failed
    make[1]: *** [bootstrap-emacs] Error 1
    make[1]: Leaving directory '/tmp/emacs/src'

This is a somewhat known bug:

https://debbugs.gnu.org/db/13/13964.html
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=598234
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=566947
https://bugzilla.redhat.com/show_bug.cgi?id=160814

I know there is a workaround that goes like:

    echo 0 > /proc/sys/kernel/randomize_va_space
    make
    echo 2 > /proc/sys/kernel/randomize_va_space

But in my case this is not possible: I'm building as a user with
limited privileges.
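
For reference, the current setting can at least be inspected without root; only writing to the file needs privileges. The following is a minimal sketch, not part of Emacs or any tool: the function name `describe_aslr` is made up for illustration, and the meanings of 0, 1 and 2 are taken from the Linux kernel sysctl documentation.

```sh
# Map a randomize_va_space value to its meaning.
# (Hypothetical helper, for illustration only.)
describe_aslr() {
    case "$1" in
        0) echo "ASLR disabled" ;;
        1) echo "stack, mmap base and VDSO randomized" ;;
        2) echo "stack, mmap base, VDSO and heap (brk) randomized" ;;
        *) echo "unknown value: $1" ;;
    esac
}

# Reading the file needs no special privileges.
if [ -r /proc/sys/kernel/randomize_va_space ]; then
    describe_aslr "$(cat /proc/sys/kernel/randomize_va_space)"
fi
```

On a system where the value is 2, as in this report, the last command prints the third description.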

The file etc/PROBLEMS mentions another workaround:

    setarch x86_64 -R make

But this fails with the same error.

I'm far from being the only one with this problem:

https://github.com/boot2docker/boot2docker/issues/1136
https://github.com/ensime/ensime-emacs/issues/369
https://github.com/proot-me/PRoot/issues/52

And basically the workaround is always "set randomize_va_space to 0".

Can someone explain what the real issue is and what we could do to
_really_ fix it? One should be able to compile emacs without changing
kernel parameters!

Thanks in advance,
Philippe





^ permalink raw reply	[flat|nested] 66+ messages in thread

* bug#23529: Request for fixing randomize_va_space build issues
  2016-05-13 12:18 bug#23529: Request for fixing randomize_va_space build issues Philippe Vaucher
@ 2016-05-13 15:58 ` Paul Eggert
  2016-05-17 16:38   ` Philippe Vaucher
  0 siblings, 1 reply; 66+ messages in thread
From: Paul Eggert @ 2016-05-13 15:58 UTC (permalink / raw)
  To: Philippe Vaucher; +Cc: 23529

I am not observing the problem on Fedora 23 x86-64, even though 
/proc/sys/kernel/randomize_va_space is 2 on my platform.

Emacs has had bug fixes in this area. You don't mention which version of 
Emacs you're using, or which platform. I suggest trying the latest test 
version of Emacs, and if this doesn't work then please send details 
about your platform and how you configured and built Emacs.

ftp://alpha.gnu.org/gnu/emacs/pretest/emacs-25.0.93.tar.xz







* bug#23529: Request for fixing randomize_va_space build issues
  2016-05-13 15:58 ` Paul Eggert
@ 2016-05-17 16:38   ` Philippe Vaucher
  2016-05-18  7:53     ` Philippe Vaucher
  2016-05-18  8:21     ` Paul Eggert
  0 siblings, 2 replies; 66+ messages in thread
From: Philippe Vaucher @ 2016-05-17 16:38 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 23529

On Fri, May 13, 2016 at 5:58 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
> I am not observing the problem on Fedora 23 x86-64, even though
> /proc/sys/kernel/randomize_va_space is 2 on my platform.

Yes, because the build runs ./temacs, which calls "personality" as
shown here:
https://github.com/emacs-mirror/emacs/blob/master/src/emacs.c#L802-819
This basically does the same as disabling randomize_va_space.

Disallow the personality syscall and you'll see emacs segfault
while building.

Some information about why the personality syscall is disabled in my env:

https://github.com/docker/docker/blob/master/docs/security/seccomp.md

> Emacs has had bug fixes in this area. You don't mention which version of
> Emacs you're using, or which platform. I suggest trying the latest test
> version of Emacs, and if this doesn't work then please send details about
> your platform and how you configured and built Emacs.

I'm building on Ubuntu 16.04 Linux 4.4.0-22-generic x86_64 GNU/Linux
with Docker 1.11.1.

I tried to run "./temacs --batch --load loadup bootstrap" inside GDB
to get more insights about why it segfaults there, but somehow gdb
fails to catch it. Maybe because of spawned processes?

I run gdb like this: "gdb --args ./temacs --batch --load loadup
bootstrap" followed by "run"

I also tried to disable personalities altogether by undefining
HAVE_PERSONALITY_LINUX32 but the only way I found was to mess with the
./configure detection... I'll investigate. If you have any tricks to
make emacs more verbose about its segfault, they'd be appreciated.

Thanks,
Philippe






* bug#23529: Request for fixing randomize_va_space build issues
  2016-05-17 16:38   ` Philippe Vaucher
@ 2016-05-18  7:53     ` Philippe Vaucher
  2016-05-18  8:21     ` Paul Eggert
  1 sibling, 0 replies; 66+ messages in thread
From: Philippe Vaucher @ 2016-05-18  7:53 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 23529

> Yes, because when building emacs it calls ./temacs which calls
> "personality" like here
> https://github.com/emacs-mirror/emacs/blob/master/src/emacs.c#L802-819
> This basically does the same as disabling randomize_va_space.

For information, I also opened an issue with docker because it might
be an ubuntu:16.04/seccomp issue
https://github.com/docker/docker/issues/22801

I'll keep you posted, but it's interesting that emacs needs
`personality` syscalls in order to build. I'm curious about why there
is this somewhat intermediate "temacs" binary that seems to have the
ability to dump itself into the final binary.

Philippe






* bug#23529: Request for fixing randomize_va_space build issues
  2016-05-17 16:38   ` Philippe Vaucher
  2016-05-18  7:53     ` Philippe Vaucher
@ 2016-05-18  8:21     ` Paul Eggert
  2016-05-18  8:44       ` Philippe Vaucher
  1 sibling, 1 reply; 66+ messages in thread
From: Paul Eggert @ 2016-05-18  8:21 UTC (permalink / raw)
  To: Philippe Vaucher; +Cc: 23529


Some background: Emacs has an 'undump' function that saves the Emacs state as an
executable: when you run the executable, you get an Emacs with the same (or
nearly the same) state. This makes Emacs startup considerably faster. Objects in
the restored state must be in the same location as when they were saved, so the
executable cannot be subject to ASLR.

On 05/17/2016 09:38 AM, Philippe Vaucher wrote:

> Some information about why the personality syscall is disabled in my env:
>
> https://github.com/docker/docker/blob/master/docs/security/seccomp.md
>

That says the 'personality' syscall is "Not inherently dangerous, but poorly
tested". Although this justifies paranoia in some applications, we are talking
*Emacs* here. (People worried about poorly tested code should not be running
Emacs. :-) So a simple way to fix the problem, as I guess you've discovered, is
to allow the 'personality' syscall in your Docker image.
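
As an illustration of the Docker route, a container can be started with a relaxed seccomp profile. This is only a sketch under assumptions: `run_build_container` and the `DRY_RUN` switch are hypothetical helpers, and the image name in the usage line is a placeholder. Note that `seccomp=unconfined` turns seccomp filtering off entirely, which is broader than merely allowing 'personality'; passing a custom profile file to the same option would be narrower.

```sh
# Start a container without the default seccomp filter.
# With DRY_RUN set, print the command instead of running it.
# (Hypothetical wrapper; arguments are the image and the command.)
run_build_container() {
    ${DRY_RUN:+echo} docker run --rm \
        --security-opt seccomp=unconfined \
        "$@"
}
```

A call such as `run_build_container my-emacs-build make` would then run the build inside the (hypothetical) image `my-emacs-build`.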

I don't know all the ins and outs of why it is necessary for Emacs to invoke
'personality'. As I understand it, the build procedure should invoke the shell
command 'setfattr -n user.pax.flags -v er temacs' immediately after building
temacs, and I don't know why this doesn't make the 'personality' call
unnecessary. Perhaps you can consult a seccomp expert who can tell you what's
going on, as seccomp is not well-documented. If there is some way to disable
ASLR without calling 'personality', that should fix your problem.

Regardless, the advice in etc/PROBLEMS is clearly obsolete here, so I installed
the attached patch to try to make things clearer. We're not going to greatly
alter the dumping procedure before Emacs 25 comes out (it's too late in the
release process) but we should do better in the future. For now we should at
least document the issues better.

> I tried to run "./temacs --batch --load loadup bootstrap" inside GDB
> to get more insights about why it segfaults there, but somehow gdb
> fails to catch it. Maybe because of spawned processes?

Yes, the code you highlighted does an execvp. You might try fiddling with GDB's
follow-exec-mode variable; see
<https://sourceware.org/gdb/onlinedocs/gdb/Forks.html>.
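
A possible way to try this from the command line is sketched below. It is an untested suggestion rather than a known fix: `debug_temacs` and `DRY_RUN` are hypothetical names, `set follow-exec-mode new` asks GDB to create a new inferior for the program started by the exec, and `set disable-randomization off` is included on the assumption that GDB's default of disabling randomization for the programs it starts could otherwise mask the crash.

```sh
# Run temacs under GDB, following it across the exec.
# With DRY_RUN set, print the GDB command line instead of running it.
# (Hypothetical wrapper, for illustration only.)
debug_temacs() {
    ${DRY_RUN:+echo} gdb \
        -ex 'set follow-exec-mode new' \
        -ex 'set disable-randomization off' \
        -ex run \
        --args ./temacs --batch --load loadup bootstrap
}
```

Calling `debug_temacs` from the src directory starts the session; typing `bt` after the segfault should then show where the re-executed temacs crashed.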


[-- Attachment #2: 0001-Modernize-ASLR-advice-in-etc-PROBLEMS.patch --]
[-- Type: text/x-diff; name="0001-Modernize-ASLR-advice-in-etc-PROBLEMS.patch", Size: 5183 bytes --]

From b412bcd921b4dd788c17f9077f02d1d592ea7e0a Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Wed, 18 May 2016 01:05:00 -0700
Subject: [PATCH] Modernize ASLR advice in etc/PROBLEMS

* etc/PROBLEMS (Segfault during 'make'): Modernize advice for
seccomp, Docker, and NetBSD (Bug#23529).
---
 etc/PROBLEMS | 77 +++++++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 48 insertions(+), 29 deletions(-)

diff --git a/etc/PROBLEMS b/etc/PROBLEMS
index 533c4e9..8733095 100644
--- a/etc/PROBLEMS
+++ b/etc/PROBLEMS
@@ -2600,51 +2600,70 @@ See <URL:http://debbugs.gnu.org/327>, <URL:http://debbugs.gnu.org/821>.
 
 ** Dumping
 
-*** Segfault during 'make bootstrap' under the Linux kernel.
+*** Segfault during 'make'
 
-In Red Hat Linux kernels, "Exec-shield" functionality is enabled by
-default, which creates a different memory layout that can break the
-emacs dumper.  Emacs tries to handle this at build time, but if this
-fails, the following instructions may be useful.
+If Emacs segfaults when 'make' executes one of these commands:
 
-Exec-shield is enabled on your system if
+  LC_ALL=C ./temacs -batch -l loadup bootstrap
+  LC_ALL=C ./temacs -batch -l loadup dump
 
-    cat /proc/sys/kernel/exec-shield
+the problem may be due to inadequate workarounds for address space
+layout randomization (ASLR), an operating system feature that
+randomizes the virtual address space of a process.  ASLR is commonly
+enabled in Linux and NetBSD kernels, and is intended to deter exploits
+of pointer-related bugs in applications.  If ASLR is enabled, the
+command:
 
-prints a value other than 0.  (Please read your system documentation
-for more details on Exec-shield and associated commands.)
+   cat /proc/sys/kernel/randomize_va_space  # GNU/Linux
+   sysctl security.pax.aslr.global          # NetBSD
 
-Additionally, Linux kernel versions since 2.6.12 randomize the virtual
-address space of a process by default.  If this feature is enabled on
-your system, then
+outputs a nonzero value.
 
-   cat /proc/sys/kernel/randomize_va_space
+These segfaults should not occur on most modern systems, because the
+Emacs build procedure uses the command 'setfattr' or 'paxctl' to mark
+the Emacs executable as requiring non-randomized address space, and
+Emacs uses the 'personality' system call to disable address space
+randomization when dumping.  However, older kernels may not support
+'setfattr', 'paxctl', or 'personality', and newer Linux kernels have a
+secure computing mode (seccomp) that can be configured to disable the
+'personality' call.
 
-prints a value other than 0.
+It may be possible to work around the 'personality' problem in a newer
+Linux kernel by configuring seccomp to allow the 'personality' call.
+For example, if you are building Emacs under Docker, you can run the
+Docker container with a security profile that allows 'personality' by
+using Docker's --security-opt option with an appropriate profile; see
+<https://docs.docker.com/engine/security/seccomp/>.
 
-When these features are enabled, building Emacs may segfault during
-the execution of this command:
+To work around the ASLR problem in either an older or a newer kernel,
+you can temporarily disable the feature while building Emacs.  On
+GNU/Linux you can do so using the following command (as root).
 
-    ./temacs --batch --load loadup [dump|bootstrap]
+    echo 0 > /proc/sys/kernel/randomize_va_space
 
-To work around this problem, you can temporarily disable these
-features while building Emacs.  You can do so using the following
-commands (as root).  Remember to re-enable them when you are done,
-by echoing the original values back to the files.
+You can re-enable the feature when you are done, by echoing the
+original value back to the file.  NetBSD uses a different command,
+e.g., 'sysctl -w security.pax.aslr.global=0'.
 
-    echo 0 > /proc/sys/kernel/exec-shield
-    echo 0 > /proc/sys/kernel/randomize_va_space
+Alternatively, you can try using the 'setarch' command when building
+temacs like this, where -R disables address space randomization:
 
-Or, on x86, you can try using the 'setarch' command when running
-temacs, like this:
+    setarch $(uname -m) -R make
 
-    setarch i386 -R ./temacs --batch --load loadup [dump|bootstrap]
+ASLR is not the only problem that can break Emacs dumping.  Another
+issue is that in Red Hat Linux kernels, Exec-shield is enabled by
+default, and this creates a different memory layout.  Emacs should
+handle this at build time, but if this fails the following
+instructions may be useful.  Exec-shield is enabled on your system if
 
-or
+    cat /proc/sys/kernel/exec-shield
+
+prints a nonzero value.  You can temporarily disable it as follows:
 
-    setarch i386 -R make
+    echo 0 > /proc/sys/kernel/exec-shield
 
-(The -R option disables address space randomization.)
+As with randomize_va_space, you can re-enable Exec-shield when you are
+done, by echoing the original value back to the file.
 
 *** temacs prints "Pure Lisp storage exhausted".
 
-- 
2.5.5



* bug#23529: Request for fixing randomize_va_space build issues
  2016-05-18  8:21     ` Paul Eggert
@ 2016-05-18  8:44       ` Philippe Vaucher
  2016-05-20 17:52         ` Paul Eggert
  0 siblings, 1 reply; 66+ messages in thread
From: Philippe Vaucher @ 2016-05-18  8:44 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 23529

> Some background: Emacs has an 'undump' function that saves the Emacs state as an
> executable: when you run the executable, you get an Emacs with the same (or
> nearly the same) state. This makes Emacs startup considerably faster. Objects in
> the restored state must be in the same location as when they were saved, so the
> executable cannot be subject to ASLR.

Alright, that makes sense now.

> I don't know all the ins and outs of why it is necessary for Emacs to invoke
> 'personality'. As I understand it, the build procedure should invoke the shell
> command 'setfattr -n user.pax.flags -v er temacs' immediately after building
> temacs, and I don't know why this doesn't make the 'personality' call
> unnecessary. Perhaps you can consult a seccomp expert who can tell you what's
> going on, as seccomp is not well-documented. If there is some way to disable
> ASLR without calling 'personality', that should fix your problem.

I'll try to debug the `setfattr` part to see what it does. It seems
that `setarch -R` and `personality` both "work" return-status wise
but in practice inside docker they don't change anything (and thus
don't disable ASLR). It looks like eventually the problem will be
fixed on the docker side... but maybe the debug session will yield
some emacs patch.
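
Since the exit status is not a reliable indicator here, one way to check whether randomization is really off is to compare the stack mapping of two fresh processes. The sketch below assumes a Linux /proc filesystem; `stack_range` and `aslr_verdict` are hypothetical helper names. Identical ranges in both samples suggest a fixed layout, while differing ranges mean ASLR is still in effect.

```sh
# Print the address range of the [stack] mapping of a fresh awk
# process, started through the given wrapper command (possibly none).
stack_range() {
    "$@" awk '$NF == "[stack]" { print $1 }' /proc/self/maps 2>/dev/null || true
}

# Compare two samples; empty samples mean the check could not run.
aslr_verdict() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "unknown"
    elif [ "$1" = "$2" ]; then
        echo "not randomized"
    else
        echo "randomized"
    fi
}

first=$(stack_range setarch "$(uname -m)" -R)
second=$(stack_range setarch "$(uname -m)" -R)
echo "under setarch -R: $(aslr_verdict "$first" "$second")"
```

If this reports "randomized" even though `setarch -R` exits successfully, the request to disable ASLR is being ignored, which would match the behavior described above.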

> Regardless, the advice in etc/PROBLEMS is clearly obsolete here, so I installed
> the attached patch to try to make things clearer. We're not going to greatly
> alter the dumping procedure before Emacs 25 comes out (it's too late in the
> release process) but we should do better in the future. For now we should at
> least document the issues better.

Ah, good patch! About the dumping procedure, do you mean there *is* a
plan to alter it after Emacs 25 comes out? In my experience, the build
behaves much the same with respect to ASLR in 24.5 and 25.0.93.

>> I tried to run "./temacs --batch --load loadup bootstrap" inside GDB
>> to get more insights about why it segfaults there, but somehow gdb
>> fails to catch it. Maybe because of spawned processes?
>
> Yes, the code you highlighted does an execvp. You might try fiddling with GDB's
> follow-exec-mode variable; see
> <https://sourceware.org/gdb/onlinedocs/gdb/Forks.html>.

I'll play with it. Thanks!
Philippe






* bug#23529: Request for fixing randomize_va_space build issues
  2016-05-18  8:44       ` Philippe Vaucher
@ 2016-05-20 17:52         ` Paul Eggert
  2016-09-06  9:22           ` Philipp Stephani
  0 siblings, 1 reply; 66+ messages in thread
From: Paul Eggert @ 2016-05-20 17:52 UTC (permalink / raw)
  To: Philippe Vaucher; +Cc: 23529

On 05/18/2016 01:44 AM, Philippe Vaucher wrote:
> About the dumping procedure, do you mean there *is* a
> plan to alter it after Emacs 25 comes out?

Although we have no specific plan, it's on my list of things to do. I 
took a stab at it a while ago but was not happy with the way my draft 
was headed. I will see if I can do better next time.







* bug#23529: Request for fixing randomize_va_space build issues
  2016-05-20 17:52         ` Paul Eggert
@ 2016-09-06  9:22           ` Philipp Stephani
  2016-09-06 17:21             ` Paul Eggert
  0 siblings, 1 reply; 66+ messages in thread
From: Philipp Stephani @ 2016-09-06  9:22 UTC (permalink / raw)
  To: Paul Eggert, Philippe Vaucher; +Cc: 23529


Paul Eggert <eggert@cs.ucla.edu> schrieb am Fr., 20. Mai 2016 um 19:54 Uhr:

> On 05/18/2016 01:44 AM, Philippe Vaucher wrote:
> > About the dumping procedure, do you mean there *is* a
> > plan to alter it after Emacs 25 comes out?
>
> Although we have no specific plan, it's on my list of things to do. I
> took a stab at it a while ago but was not happy with the way my draft
> was headed. I will see if I can do better next time.
>
>
>
Did you (or somebody else) by chance start working on this? I think
removing unexec should have highest priority once 25 is out; its
assumptions become increasingly less valid on modern systems (ASLR,
seccomp-bpf, ASan, containers...).


* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-06  9:22           ` Philipp Stephani
@ 2016-09-06 17:21             ` Paul Eggert
  2016-09-06 17:40               ` Eli Zaretskii
  0 siblings, 1 reply; 66+ messages in thread
From: Paul Eggert @ 2016-09-06 17:21 UTC (permalink / raw)
  To: Philipp Stephani, Philippe Vaucher; +Cc: 23529

On 09/06/2016 02:22 AM, Philipp Stephani wrote:
>
> Did you (or somebody else) by chance start working on this?

Not since we last wrote, no.

My idea is pretty simple: just output the objects as a C file, then 
compile and link the file. It should be reasonably portable. But there 
are a lot of details.
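
To make the idea concrete, here is a toy sketch of "output the objects as a C file". It is not the proposed implementation: `emit_c_array` is a hypothetical helper that only handles a flat list of integers, whereas real Lisp objects contain pointers and would need far more care. The point it illustrates is that once the data is ordinary C source, the compiler and linker place it, so nothing depends on the process having a fixed memory layout.

```sh
# Emit a C definition of an integer array from the given values.
# (Hypothetical helper; expects at least one value.)
emit_c_array() {
    name=$1
    shift
    printf 'const long %s[] = {' "$name"
    sep=' '
    for v in "$@"; do
        printf '%s%s' "$sep" "$v"
        sep=', '
    done
    printf ' };\n'
}
```

For example, `emit_c_array dumped_ints 1 2 3 > dumped.c` followed by `cc -c dumped.c` produces an object file that could be linked like any other.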







* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-06 17:21             ` Paul Eggert
@ 2016-09-06 17:40               ` Eli Zaretskii
  2016-09-06 17:46                 ` Philippe Vaucher
                                   ` (2 more replies)
  0 siblings, 3 replies; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-06 17:40 UTC (permalink / raw)
  To: Paul Eggert; +Cc: p.stephani2, philippe.vaucher, 23529

> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Tue, 6 Sep 2016 10:21:15 -0700
> Cc: 23529@debbugs.gnu.org
> 
> My idea is pretty simple: just output the objects as a C file, then 
> compile and link the file.

So we will be giving up the ability of end-users to re-dump their
Emacs, unless they have a compiler/binutils installed that are
compatible with the ones used to build the Emacs binary?






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-06 17:40               ` Eli Zaretskii
@ 2016-09-06 17:46                 ` Philippe Vaucher
  2016-09-06 17:55                   ` Philipp Stephani
  2016-09-06 17:59                   ` Eli Zaretskii
  2016-09-06 18:18                 ` Clément Pit--Claudel
  2016-09-06 18:44                 ` Paul Eggert
  2 siblings, 2 replies; 66+ messages in thread
From: Philippe Vaucher @ 2016-09-06 17:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, Paul Eggert, 23529

>> My idea is pretty simple: just output the objects as a C file, then
>> compile and link the file.
>
> So we will be giving up the ability of end-users to re-dump their
> Emacs, unless they have a compiler/binutils installed that are
> compatible with the ones used to build the Emacs binary?

I doubt many end-users are aware of this feature, let alone use it.

IMHO, a saner compile cycle & the ability to play nice with modern
systems (ASLR, containers) largely outweighs the loss of this
feature.

Philippe






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-06 17:46                 ` Philippe Vaucher
@ 2016-09-06 17:55                   ` Philipp Stephani
  2016-09-06 18:04                     ` Eli Zaretskii
  2016-09-06 17:59                   ` Eli Zaretskii
  1 sibling, 1 reply; 66+ messages in thread
From: Philipp Stephani @ 2016-09-06 17:55 UTC (permalink / raw)
  To: Philippe Vaucher, Eli Zaretskii; +Cc: 23529, Paul Eggert


Philippe Vaucher <philippe.vaucher@gmail.com> schrieb am Di., 6. Sep. 2016
um 19:47 Uhr:

> >> My idea is pretty simple: just output the objects as a C file, then
> >> compile and link the file.
> >
> > So we will be giving up the ability of end-users to re-dump their
> > Emacs, unless they have a compiler/binutils installed that are
> > compatible with the ones used to build the Emacs binary?
>
> I doubt many end-users are aware of this feature, let alone use it.
>
> IMHO, a saner compile cycle & the ability to play nice with modern
> systems (ASLR, containers) largely outweighs the loss of this
> feature.
>
>
I agree.


* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-06 17:46                 ` Philippe Vaucher
  2016-09-06 17:55                   ` Philipp Stephani
@ 2016-09-06 17:59                   ` Eli Zaretskii
  2016-09-06 18:03                     ` Philipp Stephani
  2016-09-06 18:24                     ` Philippe Vaucher
  1 sibling, 2 replies; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-06 17:59 UTC (permalink / raw)
  To: Philippe Vaucher; +Cc: p.stephani2, eggert, 23529

> From: Philippe Vaucher <philippe.vaucher@gmail.com>
> Date: Tue, 6 Sep 2016 19:46:47 +0200
> Cc: Paul Eggert <eggert@cs.ucla.edu>, p.stephani2@gmail.com, 23529@debbugs.gnu.org
> 
> >> My idea is pretty simple: just output the objects as a C file, then
> >> compile and link the file.
> >
> > So we will be giving up the ability of end-users to re-dump their
> > Emacs, unless they have a compiler/binutils installed that are
> > compatible with the ones used to build the Emacs binary?
> 
> I doubt many end-users are aware of this feature, let alone use it.

Rarely used is not the same as useless and unneeded.  Whenever you
remove a feature, expect someone to come up with complaints about
regressions.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-06 17:59                   ` Eli Zaretskii
@ 2016-09-06 18:03                     ` Philipp Stephani
  2016-09-06 18:32                       ` Eli Zaretskii
  2016-09-06 18:24                     ` Philippe Vaucher
  1 sibling, 1 reply; 66+ messages in thread
From: Philipp Stephani @ 2016-09-06 18:03 UTC (permalink / raw)
  To: Eli Zaretskii, Philippe Vaucher; +Cc: 23529, eggert


Eli Zaretskii <eliz@gnu.org> schrieb am Di., 6. Sep. 2016 um 20:00 Uhr:

> > From: Philippe Vaucher <philippe.vaucher@gmail.com>
> > Date: Tue, 6 Sep 2016 19:46:47 +0200
> > Cc: Paul Eggert <eggert@cs.ucla.edu>, p.stephani2@gmail.com,
> 23529@debbugs.gnu.org
> >
> > >> My idea is pretty simple: just output the objects as a C file, then
> > >> compile and link the file.
> > >
> > > So we will be giving up the ability of end-users to re-dump their
> > > Emacs, unless they have a compiler/binutils installed that are
> > > compatible with the ones used to build the Emacs binary?
> >
> > I doubt many end-users are aware of this feature, let alone use it.
>
> Rarely used is not the same as useless and unneeded.  Whenever you
> remove a feature, expect someone to come up with complaints about
> regressions.
>

If we care enough about this feature, then instead of writing C code we can
write some portable serialization format (e.g. protobuf). That will be a
bit slower, but that might be OK.


* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-06 17:55                   ` Philipp Stephani
@ 2016-09-06 18:04                     ` Eli Zaretskii
  0 siblings, 0 replies; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-06 18:04 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: philippe.vaucher, eggert, 23529

> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Tue, 06 Sep 2016 17:55:42 +0000
> Cc: Paul Eggert <eggert@cs.ucla.edu>, 23529@debbugs.gnu.org
> 
>  IMHO, a saner compile cycle & the ability to play nice with modern
>  systems (ASLR, containers) largely outweighs the loss of this
>  feature.
> 
> I agree. 

Why not look for ways of having the cake and eating it, too?  There's
no reason to believe there couldn't be a way to play nice with modern
systems without losing any existing feature.  After all, what we dump
is just data, we just cannot safely write it into the executable file.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-06 17:40               ` Eli Zaretskii
  2016-09-06 17:46                 ` Philippe Vaucher
@ 2016-09-06 18:18                 ` Clément Pit--Claudel
  2016-09-06 19:09                   ` Eli Zaretskii
  2016-09-06 18:44                 ` Paul Eggert
  2 siblings, 1 reply; 66+ messages in thread
From: Clément Pit--Claudel @ 2016-09-06 18:18 UTC (permalink / raw)
  To: 23529



On 2016-09-06 13:40, Eli Zaretskii wrote:
>> From: Paul Eggert <eggert@cs.ucla.edu>
>> Date: Tue, 6 Sep 2016 10:21:15 -0700
>> Cc: 23529@debbugs.gnu.org
>>
>> My idea is pretty simple: just output the objects as a C file, then 
>> compile and link the file.
> 
> So we will be giving up the ability of end-users to re-dump their
> Emacs, unless they have a compiler/binutils installed that are
> compatible with the ones used to build the Emacs binary?

Does this feature exist? I keep seeing complaints about the fact that it was disabled:

    $ emacs -Q --batch --eval '(dump-emacs "a" "b")'
    Emacs can be dumped only once

Or am I misunderstanding your point?

Clément.




* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-06 17:59                   ` Eli Zaretskii
  2016-09-06 18:03                     ` Philipp Stephani
@ 2016-09-06 18:24                     ` Philippe Vaucher
  2016-09-06 19:11                       ` Eli Zaretskii
  1 sibling, 1 reply; 66+ messages in thread
From: Philippe Vaucher @ 2016-09-06 18:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, Paul Eggert, 23529

>> I doubt many end-users are aware of this feature, let alone use it.
>
> Rarely used is not the same as useless and unneeded.  Whenever you
> remove a feature, expect someone to come up with complaints about
> regressions.

You got me curious, can you explain a bit why it is useful? I always
thought the reason was about speed, something historical that was
needed "back then".

AFAIK, nowadays most people that care about speed use autoloads or
use-package and get under 2-3s of load-time. Don't get me wrong I
understand that dumping emacs is much faster (< 1s load time), but the
cost of having to re-dump each time you add a package makes it not
very practical.

Maybe one of the advantages is when you want to carry your customized
emacs as one file on some USB key?






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-06 18:03                     ` Philipp Stephani
@ 2016-09-06 18:32                       ` Eli Zaretskii
  2016-09-06 19:01                         ` Philipp Stephani
  0 siblings, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-06 18:32 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: philippe.vaucher, eggert, 23529

> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Tue, 06 Sep 2016 18:03:13 +0000
> Cc: eggert@cs.ucla.edu, 23529@debbugs.gnu.org
> 
> If we care enough about this feature, then instead of writing C code we can write some portable serialization
> format (e.g. protobuf).

Something like that, yes.

> That will be a bit slower, but that might be OK. 

Slower than compile, link, and load?  I doubt that.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-06 17:40               ` Eli Zaretskii
  2016-09-06 17:46                 ` Philippe Vaucher
  2016-09-06 18:18                 ` Clément Pit--Claudel
@ 2016-09-06 18:44                 ` Paul Eggert
  2016-09-06 19:18                   ` Eli Zaretskii
  2 siblings, 1 reply; 66+ messages in thread
From: Paul Eggert @ 2016-09-06 18:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, philippe.vaucher, 23529

On 09/06/2016 10:40 AM, Eli Zaretskii wrote:
> So we will be giving up the ability of end-users to re-dump their
> Emacs, unless they have a compiler/binutils installed that are
> compatible with the ones used to build the Emacs binary?

No, the idea would be to keep the current undump as-is, and to use the 
new mechanism when building and installing Emacs. That way, end-users 
would not lose any abilities that they already have. It's OK that the new
mechanism would work on some platforms where the current undump does not.







* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-06 18:32                       ` Eli Zaretskii
@ 2016-09-06 19:01                         ` Philipp Stephani
  0 siblings, 0 replies; 66+ messages in thread
From: Philipp Stephani @ 2016-09-06 19:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: philippe.vaucher, eggert, 23529


Eli Zaretskii <eliz@gnu.org> schrieb am Di., 6. Sep. 2016 um 20:33 Uhr:

> > That will be a bit slower, but that might be OK.
>
> Slower than compile, link, and load?  I doubt that.
>

Not while dumping (nobody cares about the dumping speed), but while
loading. In the alternative proposal, the dumped data could be read
directly into the process memory, requiring only some relocations. My
suggestion would require an extra parsing step on every start.


* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-06 18:18                 ` Clément Pit--Claudel
@ 2016-09-06 19:09                   ` Eli Zaretskii
  2016-09-06 19:59                     ` Clément Pit--Claudel
  0 siblings, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-06 19:09 UTC (permalink / raw)
  To: Clément Pit--Claudel; +Cc: 23529

> From: Clément Pit--Claudel <clement.pit@gmail.com>
> Date: Tue, 6 Sep 2016 14:18:38 -0400
> 
> > So we will be giving up the ability of end-users to re-dump their
> > Emacs, unless they have a compiler/binutils installed that are
> > compatible with the ones used to build the Emacs binary?
> 
> Does this feature exist? I keep seeing complaints about the fact that it was disabled:

AFAIR it was disabled because no one cared to fix its quirks wrt ASLR
etc. (or maybe some similar nuisance).  Once the dumped data is
portable, those issues will be all but gone.

Moving to compiled C code will kill that feature for good.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-06 18:24                     ` Philippe Vaucher
@ 2016-09-06 19:11                       ` Eli Zaretskii
  0 siblings, 0 replies; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-06 19:11 UTC (permalink / raw)
  To: Philippe Vaucher; +Cc: p.stephani2, eggert, 23529

> From: Philippe Vaucher <philippe.vaucher@gmail.com>
> Date: Tue, 6 Sep 2016 20:24:19 +0200
> Cc: Paul Eggert <eggert@cs.ucla.edu>, p.stephani2@gmail.com, 23529@debbugs.gnu.org
> 
> You got me curious, can you explain a bit why it is useful? I always
> thought the reason was about speed, something historical that was
> needed "back then".

If you are loading a lot of code in your init files, you could simply
dump Emacs with all that code loaded.

> AFAIK, nowadays most people that care about speed use autoloads or
> use-package and get under 2-3s of load-time.

Loading a large package can still be slow enough to annoy.  Try
loading Org, for example; then imagine you have several such large
packages to load at startup.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-06 18:44                 ` Paul Eggert
@ 2016-09-06 19:18                   ` Eli Zaretskii
  2016-09-06 20:37                     ` Paul Eggert
  0 siblings, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-06 19:18 UTC (permalink / raw)
  To: Paul Eggert; +Cc: p.stephani2, philippe.vaucher, 23529

> Cc: p.stephani2@gmail.com, philippe.vaucher@gmail.com, 23529@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Tue, 6 Sep 2016 11:44:57 -0700
> 
> On 09/06/2016 10:40 AM, Eli Zaretskii wrote:
> > So we will be giving up the ability of end-users to re-dump their
> > Emacs, unless they have a compiler/binutils installed that are
> > compatible with the ones used to build the Emacs binary?
> 
> No, the idea would be to keep the current undump as-is, and to use the 
> new mechanism when building and installing Emacs. That way, end-users 
> would not lose any abilities that they already have. It's OK that the new 
> mechanism would work on some platforms where the current undump does not.

Then users on those platforms will never be able to re-dump.

I actually don't understand why the data should be serialized as C
code.  Why not just data that is read into memory (with conversion to
the native format)?  A compiler is not the only way to convert text
into binary data.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-06 19:09                   ` Eli Zaretskii
@ 2016-09-06 19:59                     ` Clément Pit--Claudel
  0 siblings, 0 replies; 66+ messages in thread
From: Clément Pit--Claudel @ 2016-09-06 19:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 23529



On 2016-09-06 15:09, Eli Zaretskii wrote:
>> From: Clément Pit--Claudel <clement.pit@gmail.com>
>> Date: Tue, 6 Sep 2016 14:18:38 -0400
>>
>>> So we will be giving up the ability of end-users to re-dump their
>>> Emacs, unless they have a compiler/binutils installed that are
>>> compatible with the ones used to build the Emacs binary?
>>
>> Does this feature exist? I keep seeing complaints about the fact that it was disabled:
> 
> AFAIR it was disabled because no one cared to fix its quirks wrt ASLR
> etc. (or maybe some similar nuisance).  Once the dumped data is
> portable, those issues will be all but gone.
> 
> Moving to compiled C code will kill that feature for good.

Thanks, I understand better now.  So ideally when we change the implementation we'll be able to re-enable this feature.

Cheers,
Clément.



* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-06 19:18                   ` Eli Zaretskii
@ 2016-09-06 20:37                     ` Paul Eggert
  2016-09-07  7:12                       ` Philippe Vaucher
  2016-09-07 14:21                       ` Eli Zaretskii
  0 siblings, 2 replies; 66+ messages in thread
From: Paul Eggert @ 2016-09-06 20:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, philippe.vaucher, 23529

On 09/06/2016 12:18 PM, Eli Zaretskii wrote:
> Then users on those platforms will never be able to re-dump.

True. But they'll still be better off than they are now, since they 
can't dump at all now. Plus, for extra credit we could dynamically link 
the dumped object modules at Emacs startup, with the idea of making it 
practical to re-dump.

> I actually don't understand why the data should be serialized as C
> code.  Why not just data that is read into memory (with conversion to
> the native format)?  A compiler is not the only way to convert text
> into binary data.

The compiler-based approach should be simpler and more portable than 
messing with low-level binary I/O. For example, it should be easy to 
arrange for some of the objects to be read-only: just declare them to be 
'const'. Another example: on hardened platforms with PIEs 
(position-independent executables), you get a PIE for free as the dumped 
executable, instead of having to disable PIE as we do now.

Although Emacs can do this sort of work itself (e.g., randomizing 
locations of dumped objects, munging pointers as they come in to match 
the random locations, and using mmap to make the relevant objects 
const), it should be better for Emacs to use the linking technology 
already available on modern platforms, rather than trying to reinvent 
the wheel.
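
To make the compiler-based idea above concrete, here is a minimal sketch of
what a generated dump file might contain.  The 'struct cell' type and the
'cell_a', 'cell_b', 'cell_c' names are hypothetical stand-ins and not Emacs
data structures.  The sketch only shows that 'const' placement and
cross-object addresses are handled by the compiler and linker, with no
load-time fixup code written by hand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cons-like cell; not an actual Emacs type. */
struct cell {
    int value;
    const struct cell *next;
};

/* What a generated dump file might contain.  Declaring the objects
   'const' lets the toolchain place them in read-only storage, and the
   addresses in the initializers are resolved at link or load time,
   whether or not the executable is position-independent. */
static const struct cell cell_c = { 3, NULL };
static const struct cell cell_b = { 2, &cell_c };
static const struct cell cell_a = { 1, &cell_b };

/* Entry point the rest of the program would use to reach the data. */
const struct cell *dumped_list_head(void)
{
    return &cell_a;
}

/* Walks the list and adds up the values, to show the links are usable. */
int sum_list(const struct cell *c)
{
    int sum = 0;
    for (; c != NULL; c = c->next)
        sum += c->value;
    return sum;
}
```

Calling sum_list (dumped_list_head ()) walks all three cells without any
startup code having touched the pointers.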







* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-06 20:37                     ` Paul Eggert
@ 2016-09-07  7:12                       ` Philippe Vaucher
  2016-09-07  7:40                         ` Paul Eggert
  2016-09-07 14:21                       ` Eli Zaretskii
  1 sibling, 1 reply; 66+ messages in thread
From: Philippe Vaucher @ 2016-09-07  7:12 UTC (permalink / raw)
  To: Paul Eggert; +Cc: p.stephani2, 23529


>
> Although Emacs can do this sort of work itself (e.g., randomizing
> locations of dumped objects, munging pointers as they come in to match the
> random locations, and using mmap to make the relevant objects const), it
> should be better for Emacs to use the linking technology already available
> on modern platforms, rather than trying to reinvent the wheel.
>

I agree.

Would that also avoid having to require special privileges when building?
e.g., the "personality" syscall.

Philippe


* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-07  7:12                       ` Philippe Vaucher
@ 2016-09-07  7:40                         ` Paul Eggert
  2016-09-07 11:01                           ` Philipp Stephani
  0 siblings, 1 reply; 66+ messages in thread
From: Paul Eggert @ 2016-09-07  7:40 UTC (permalink / raw)
  To: Philippe Vaucher; +Cc: p.stephani2, 23529

Philippe Vaucher wrote:
> Would that also avoid having to require special privileges when building?
> e.g., the "personality" syscall.

I hope not.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-07  7:40                         ` Paul Eggert
@ 2016-09-07 11:01                           ` Philipp Stephani
  0 siblings, 0 replies; 66+ messages in thread
From: Philipp Stephani @ 2016-09-07 11:01 UTC (permalink / raw)
  To: Paul Eggert, Philippe Vaucher; +Cc: 23529


Paul Eggert <eggert@cs.ucla.edu> schrieb am Mi., 7. Sep. 2016 um 09:40 Uhr:

> Philippe Vaucher wrote:
> > Would that also avoid having to require special privileges when building?
> > e.g., the "personality" syscall.
>
> I hope not.
>

I hope you mean "I hope so" :)
I think any new dumper should be portable enough to work with ASLR, ASan,
... (essentially just portable C code).


* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-06 20:37                     ` Paul Eggert
  2016-09-07  7:12                       ` Philippe Vaucher
@ 2016-09-07 14:21                       ` Eli Zaretskii
  2016-09-07 16:11                         ` Paul Eggert
  1 sibling, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-07 14:21 UTC (permalink / raw)
  To: Paul Eggert; +Cc: p.stephani2, philippe.vaucher, 23529

> Cc: p.stephani2@gmail.com, philippe.vaucher@gmail.com, 23529@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Tue, 6 Sep 2016 13:37:20 -0700
> 
> On 09/06/2016 12:18 PM, Eli Zaretskii wrote:
> > Then users on those platforms will never be able to re-dump.
> 
> True. But they'll still be better off than they are now, since they 
> can't dump at all now. Plus, for extra credit we could dynamically link 
> the dumped object modules at Emacs startup, with the idea of making it 
> practical to re-dump.

That would be good, thanks.  I still think we should consider
approaches that don't require a compiler.

> The compiler-based approach should be simpler and more portable than 
> messing with low-level binary I/O.

You mean, 'read' and 'write'?  That's hardly problematic, and is
available on all supported platforms.  Even 'mmap' is reasonably
portable.

> Another example: on hardened platforms with PIEs
> (position-independent executables), you get a PIE for free as the
> dumped executable, instead of having to disable PIE as we do now.

I'm not sure how PIE is relevant: the stuff we dump, and need to load
into a running Emacs, is data, not code.  What am I missing?






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-07 14:21                       ` Eli Zaretskii
@ 2016-09-07 16:11                         ` Paul Eggert
  2016-09-07 17:10                           ` Eli Zaretskii
  0 siblings, 1 reply; 66+ messages in thread
From: Paul Eggert @ 2016-09-07 16:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, philippe.vaucher, 23529

Eli Zaretskii wrote:

>> The compiler-based approach should be simpler and more portable than
>> messing with low-level binary I/O.
>
> You mean, 'read' and 'write'?

I meant all the other stuff associated with the I/O.

>> Another example: on hardened platforms with PIEs
>> (position-independent executables), you get a PIE for free as the
>> dumped executable, instead of having to disable PIE as we do now.
>
> I'm not sure how PIE is relevant: the stuff we dump, and need to load
> into a running Emacs, is data, not code.  What am I missing?

PIE can relocate data as well as code. And with modules, we also have code to dump.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-07 16:11                         ` Paul Eggert
@ 2016-09-07 17:10                           ` Eli Zaretskii
  2016-09-07 17:40                             ` Paul Eggert
  0 siblings, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-07 17:10 UTC (permalink / raw)
  To: Paul Eggert; +Cc: p.stephani2, philippe.vaucher, 23529

> Cc: p.stephani2@gmail.com, philippe.vaucher@gmail.com, 23529@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Wed, 7 Sep 2016 09:11:31 -0700
> 
> >> Another example: on hardened platforms with PIEs
> >> (position-independent executables), you get a PIE for free as the
> >> dumped executable, instead of having to disable PIE as we do now.
> >
> > I'm not sure how PIE is relevant: the stuff we dump, and need to load
> > into a running Emacs, is data, not code.  What am I missing?
> 
> PIE can relocate data as well as code.

Since we will be reading data into existing variables, that would
happen automatically.

> And with modules, we also have code to dump.

??? What do you mean by that?  Modules cannot be preloaded, AFAIK.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-07 17:10                           ` Eli Zaretskii
@ 2016-09-07 17:40                             ` Paul Eggert
  2016-09-07 18:11                               ` Eli Zaretskii
  0 siblings, 1 reply; 66+ messages in thread
From: Paul Eggert @ 2016-09-07 17:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, philippe.vaucher, 23529

Eli Zaretskii wrote:
>> PIE can relocate data as well as code.
> Since we will be reading data into existing variables, that would
> happen automatically.

I'm afraid I'm not following. Any existing variables (i.e., existing in Emacs 
when it starts up) are of fixed size, so they can't hold all the data of a 
dumped Emacs. The newly starting-up Emacs must decide how much storage to 
allocate to hold the dumped state that Emacs is about to read.  This storage's 
addresses should be randomized, and the data that Emacs reads will contain 
pointers-to-data that Emacs itself would need to relocate.

All this is doable, of course. It's just that it should be easier and more 
portable to use the existing compilers and linkers rather than reinvent the wheel.
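
For comparison, the do-it-yourself route described above (allocate storage,
read the dumped bytes, relocate the pointers inside them) could be sketched
roughly as follows.  This is only an illustration under simplifying
assumptions: 'struct dump_image', its fields and 'load_image' are made-up
names, the whole dump is one block, and the dump is assumed to carry a
table saying where its non-null pointer slots are.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical dump descriptor; none of these names come from Emacs. */
struct dump_image {
    uintptr_t old_base;          /* address the block had when dumped */
    size_t size;                 /* size of the block in bytes */
    size_t nrelocs;              /* number of entries in relocs */
    const size_t *relocs;        /* byte offsets of non-null pointer slots */
    const unsigned char *bytes;  /* the dumped bytes themselves */
};

/* Copies the image into freshly allocated storage and rebases every
   listed pointer slot from the old block address to the new one.
   Returns NULL if allocation fails; the caller frees the result. */
void *load_image(const struct dump_image *img)
{
    unsigned char *mem = malloc(img->size);
    if (mem == NULL)
        return NULL;
    memcpy(mem, img->bytes, img->size);

    uintptr_t new_base = (uintptr_t) mem;
    for (size_t i = 0; i < img->nrelocs; i++) {
        uintptr_t slot;
        assert(img->relocs[i] + sizeof slot <= img->size);
        memcpy(&slot, mem + img->relocs[i], sizeof slot);
        slot = slot - img->old_base + new_base;   /* same delta everywhere */
        memcpy(mem + img->relocs[i], &slot, sizeof slot);
    }
    return mem;
}
```

The arithmetic is short, but the relocation table has to come from
somewhere, and a real dump would have more than one block.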

>> > And with modules, we also have code to dump.
> ??? What do you mean by that?  Modules cannot be preloaded, AFAIK.

You're right, saving objects as C source code doesn't fix that problem all by 
itself.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-07 17:40                             ` Paul Eggert
@ 2016-09-07 18:11                               ` Eli Zaretskii
  2016-09-07 20:12                                 ` Paul Eggert
  0 siblings, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-07 18:11 UTC (permalink / raw)
  To: Paul Eggert; +Cc: p.stephani2, philippe.vaucher, 23529

> Cc: p.stephani2@gmail.com, philippe.vaucher@gmail.com, 23529@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Wed, 7 Sep 2016 10:40:14 -0700
> 
> Eli Zaretskii wrote:
> >> PIE can relocate data as well as code.
> > Since we will be reading data into existing variables, that would
> > happen automatically.
> 
> I'm afraid I'm not following. Any existing variables (i.e., existing in Emacs 
> when it starts up) are of fixed size, so they can't hold all the data of a 
> dumped Emacs. The newly starting-up Emacs must decide how much storage to 
> allocate to hold the dumped state that Emacs is about to read.  This storage's 
> addresses should be randomized, and the data that Emacs reads will contain 
> pointers-to-data that Emacs itself would need to relocate.

Data that has to be dumped and loaded is accessed through pointers
(since it's malloced in temacs).  When Emacs starts, it will allocate
memory off the heap and read the dumped data into that, using those
pointers to access it.  The pointers are of fixed size, so they will
already exist in the Emacs binary (and relocated if PIE wants that).
I assume that randomization affects the addresses of the buffers
allocated off the heap, so we don't need to do anything else to
randomize the data we load.

> All this is doable, of course. It's just that it should be easier and more 
> portable to use the existing compilers and linkers rather than reinvent the wheel.

I very much doubt that it would be easier, since linking nowadays is
also much more complicated.  We'd need to plug the compiled data into
data structures that support the Lisp interpreter, something the
compiler and linker won't help us with.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-07 18:11                               ` Eli Zaretskii
@ 2016-09-07 20:12                                 ` Paul Eggert
  2016-09-09  5:40                                   ` Eli Zaretskii
  0 siblings, 1 reply; 66+ messages in thread
From: Paul Eggert @ 2016-09-07 20:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, philippe.vaucher, 23529

On 09/07/2016 11:11 AM, Eli Zaretskii wrote:
> Data that has to be dumped and loaded is accessed through pointers

Sure, but the data contains pointers to other data (and perhaps to code? 
I haven't checked), and when the pointer-containing data is loaded into 
a fresh Emacs those pointers need to be relocated appropriately for the 
new Emacs.

> We'd need to plug the compiled data into
> data structures that support the Lisp interpreter, something the
> compiler and linker won't help us with.

Ah, but they can! Because Emacs now assumes the LSB representation, 
Emacs objects now encapsulate pointers simply by adding a constant to 
them. All C compilers and linkers support that, even for addresses 
defined by other compilation units.
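
A small sketch of the "pointer plus a constant" point: in C, an address
constant plus an integer constant is itself a valid static initializer, so
a tagged reference to another static object can be written directly in
generated source.  The tag values, the 'tagged' typedef and 'struct cons'
below are illustrative assumptions and not the real Lisp_Object layout.
The sketch also uses C11 '_Alignas' to keep the low address bits free.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tag layout: the low three bits of a word carry the type. */
enum { TAG_CONS = 3, TAG_MASK = 7 };

typedef const char *tagged;   /* an object address plus a small constant */

struct cons {
    int car;
    tagged cdr;               /* 0 marks the end of the list */
};

/* 8-byte alignment keeps the low three bits of each address zero.
   The initializer of 'head' is an address constant plus an integer
   constant, which the compiler and linker resolve for us. */
static _Alignas(8) const struct cons tail = { 2, 0 };
static _Alignas(8) const struct cons head =
    { 1, (const char *) &tail + TAG_CONS };

/* Extracts the tag bits from a tagged word. */
int tag_of(tagged t)
{
    return (int) ((uintptr_t) t & TAG_MASK);
}

/* Removes the tag again to recover the object address. */
const struct cons *untag_cons(tagged t)
{
    assert(tag_of(t) == TAG_CONS);
    return (const struct cons *) (const void *) (t - TAG_CONS);
}

/* Tagged reference to the first cell of the dumped list. */
tagged dumped_head(void)
{
    return (const char *) &head + TAG_CONS;
}

/* Follows one tagged link and returns the car of the second cell. */
int second_car(tagged list)
{
    return untag_cons(untag_cons(list)->cdr)->car;
}
```

The same initializer form works when 'tail' is an 'extern' object defined
in a different compilation unit.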







* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-07 20:12                                 ` Paul Eggert
@ 2016-09-09  5:40                                   ` Eli Zaretskii
  2016-09-09  7:10                                     ` Paul Eggert
  2016-09-09 18:29                                     ` Andreas Schwab
  0 siblings, 2 replies; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-09  5:40 UTC (permalink / raw)
  To: Paul Eggert; +Cc: p.stephani2, philippe.vaucher, 23529

> Cc: p.stephani2@gmail.com, philippe.vaucher@gmail.com, 23529@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Wed, 7 Sep 2016 13:12:27 -0700
> 
> On 09/07/2016 11:11 AM, Eli Zaretskii wrote:
> > Data that has to be dumped and loaded is accessed through pointers
> 
> Sure, but the data contains pointers to other data

I guess you mean the 'previous' and 'next' pointers?  Fixing that is
just a simple job of adding a fixed offset to each such pointer.

> (and perhaps to code? I haven't checked)

defsubr does that, but fixing the address of the function after
loading the dumped data is also very simple: for each defsubr, rewrite
its function pointer.
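
One way the function-pointer rewrite might look, as a rough sketch only:
the dump stores a stable index for each primitive instead of trusting the
saved address, and the loader refills the pointer from a table compiled
into the binary.  'struct subr', 'subr_table' and 'fixup_subrs' are
hypothetical names for illustration and not the actual defsubr machinery.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*subr_fn)(int);

/* Two stand-in primitives. */
int subr_add1(int x) { return x + 1; }
int subr_double(int x) { return 2 * x; }

/* Table compiled into every binary; an entry's index is the same in the
   dumping process and in the process that later loads the dump. */
static const subr_fn subr_table[] = { subr_add1, subr_double };
enum { NSUBRS = sizeof subr_table / sizeof subr_table[0] };

/* Hypothetical dumped record for one primitive. */
struct subr {
    size_t index;   /* what the dump file records */
    subr_fn fn;     /* stale after loading; rewritten by fixup_subrs */
};

/* Rewrites each function pointer from the index saved in the dump. */
void fixup_subrs(struct subr *subrs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(subrs[i].index < (size_t) NSUBRS);
        subrs[i].fn = subr_table[subrs[i].index];
    }
}
```
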

> > We'd need to plug the compiled data into
> > data structures that support the Lisp interpreter, something the
> > compiler and linker won't help us with.
> 
> Ah, but they can! Because Emacs now assumes the LSB representation, 
> Emacs objects now encapsulate pointers simply by adding a constant to 
> them. All C compilers and linkers support that, even for addresses 
> defined by other compilation units.

First, Emacs doesn't assume LSB representation when built with wide
ints.  And second, I think you forget the part of the task that with
your proposed method is required to "serialize" the dumped data as C
code.  AFAIU, you are talking about writing and debugging an entirely
new back-end to all the DEFSYM, DEFVAR, defsubr, etc. stuff we use
during dumping, and in addition some new code that would either
replace lisp_malloc and friends during dumping, to produce C code, or
something that would traverse the Lisp data as part of the new
implementation of unexec and convert the Lisp data into C code.  That
is a formidable job in itself, which I think is much more complex than
data I/O and the necessary fixups.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-09  5:40                                   ` Eli Zaretskii
@ 2016-09-09  7:10                                     ` Paul Eggert
  2016-09-09  7:50                                       ` Eli Zaretskii
  2016-09-09 18:29                                     ` Andreas Schwab
  1 sibling, 1 reply; 66+ messages in thread
From: Paul Eggert @ 2016-09-09  7:10 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, philippe.vaucher, 23529

Eli Zaretskii wrote:

> I guess you mean the 'previous' and 'next' pointers?

I mean all the pointers in the data. There are more than just 'previous' and 
'next'. Most Lisp objects are tagged pointers, and data contains them.

> Emacs doesn't assume LSB representation when built with wide
> ints.

True, I should have said that the representation is always a pointer plus a 
constant offset, which is true even with wide ints. (It didn't use to be true, 
but those days are long gone.)

> your proposed method is required to "serialize" the dumped data as C
> code.

Sure, but that's true of any dumping method. The advantage of dumping to C code 
is that the compiler and linker will deserialize it for you.
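
As a rough idea of what the serialization side of this could involve, the
sketch below writes C declarations for a small linked list into a buffer.
It is only an illustration: 'emit_list_as_c' and the 'struct cell' layout
it prints are hypothetical, and a real dumper would have to walk arbitrary
Lisp data rather than an array of ints.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes C declarations for a linked list holding values[0..n-1] into
   out.  Returns the number of characters written, or -1 if out is too
   small.  The emitted text refers to a 'struct cell { int; pointer }'
   type that the generated file is assumed to declare elsewhere. */
int emit_list_as_c(const int *values, size_t n, char *out, size_t outsz)
{
    size_t used = 0;

    if (outsz == 0)
        return -1;
    out[0] = '\0';

    /* Emit the last cell first, so that every cell refers to one that
       has already been declared. */
    for (size_t i = n; i-- > 0; ) {
        int w;
        if (i + 1 < n)
            w = snprintf(out + used, outsz - used,
                         "static const struct cell c%zu = { %d, &c%zu };\n",
                         i, values[i], i + 1);
        else
            w = snprintf(out + used, outsz - used,
                         "static const struct cell c%zu = { %d, 0 };\n",
                         i, values[i]);
        if (w < 0 || (size_t) w >= outsz - used)
            return -1;
        used += (size_t) w;
    }
    return (int) used;
}
```

The generated text is then handed to the ordinary compiler and linker,
which is where the deserialization effort is saved.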







* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-09  7:10                                     ` Paul Eggert
@ 2016-09-09  7:50                                       ` Eli Zaretskii
  2016-09-09  8:54                                         ` Paul Eggert
  0 siblings, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-09  7:50 UTC (permalink / raw)
  To: Paul Eggert; +Cc: p.stephani2, philippe.vaucher, 23529

> Cc: p.stephani2@gmail.com, philippe.vaucher@gmail.com, 23529@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Fri, 9 Sep 2016 00:10:22 -0700
> 
> Eli Zaretskii wrote:
> 
> > I guess you mean the 'previous' and 'next' pointers?
> 
> I mean all the pointers in the data. There are more than just 'previous' and 
> 'next'. Most Lisp objects are tagged pointers, and data contains them.

Lisp objects are referenced through the obarray, which will be part of
the dumped data, so fixing that up, as part of walking through all the
structures created by temacs, will take care of this problem.  Once
again, a constant offset will do.

> > your proposed method is required to "serialize" the dumped data as C
> > code.
> 
> Sure, but that's true of any dumping method.

No.  Writing out the dumped data is almost trivial, no changes in the
current implementation are needed beyond just the file I/O itself.

> The advantage of dumping to C code is that the compiler and linker
> will deserialize it for you.

That's true, but I think you pay much more in the serialization phase.

In addition, the compiler and the linker were not meant for these
jobs, and their developers certainly don't take such jobs into
account, so we should expect to bump into unexpected problems.  By
contrast, writing the dumped data and then reading it with fixups is
something we can do ourselves without relying on any external
technologies which need to be bent to our needs.  The latter aspects
may well become a problem, not unlike what we have today, at some
future point.  As the number of people who are able to futz with Emacs
internals at this depth continues to dwindle, I don't think we want to
go through replacing this stuff more than just this once, or even
risking that.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-09  7:50                                       ` Eli Zaretskii
@ 2016-09-09  8:54                                         ` Paul Eggert
  2016-09-09  9:09                                           ` Eli Zaretskii
  0 siblings, 1 reply; 66+ messages in thread
From: Paul Eggert @ 2016-09-09  8:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, philippe.vaucher, 23529

Eli Zaretskii wrote:

> Lisp objects are referenced through the obarray

Sure, but they are also referenced in many other ways. The obarray is just one 
corner of this.

>> Sure, but that's true of any dumping method.
>
> Writing out the dumped data is almost trivial

Not really. Not nowadays.

>> The advantage of dumping to C code is that the compiler and linker
>> will deserialize it for you.
>
> That's true, but I think you pay much more in the serialization phase.

That's fine. Serialization is rare, typically just when Emacs is built. 
Deserialization is much more common, typically whenever Emacs starts up. So it 
can be a win to speed up and simplify deserialization at the expense of 
serialization.

> the compiler and the linker were not meant for these jobs

I don't see why today's compilers and linkers wouldn't be up to these jobs. 
Emacs is not that large by today's standards. The proof of that will be in the 
pudding, no?

> writing the dumped data and then reading it with fixups is
> something we can do ourselves without relying on any external
> technologies which need to be bent to our needs.

I don't think so. We need to rely on and/or work around properties of address 
randomization which will be platform-dependent. It will be tempting to do the 
job poorly, and lose any reliability and/or security benefits of randomization 
that we might otherwise get for free. Letting the compiler and linker do this 
work for us will save us work in the long run.

> As the number of people who are able to futz with Emacs
> internals at this depth continues to dwindle,

This is exactly why we should let the compiler- and linker-writers do this work 
for us.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-09  8:54                                         ` Paul Eggert
@ 2016-09-09  9:09                                           ` Eli Zaretskii
  2016-09-09 16:16                                             ` Paul Eggert
  0 siblings, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-09  9:09 UTC (permalink / raw)
  To: Paul Eggert; +Cc: p.stephani2, philippe.vaucher, 23529

> Cc: p.stephani2@gmail.com, philippe.vaucher@gmail.com, 23529@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Fri, 9 Sep 2016 01:54:07 -0700
> 
> Eli Zaretskii wrote:
> 
> > Lisp objects are referenced through the obarray
> 
> Sure, but they are also referenced in many other ways. The obarray is just one 
> corner of this.

Can you elaborate about the other ways you had in mind?

> >> Sure, but that's true of any dumping method.
> >
> > Writing out the dumped data is almost trivial
> 
> Not really. Not nowadays.

Again, please elaborate.

What I had in mind is just a single 'write' (resp. 'read') call for
any contiguous region of memory.  (For best results, we will probably
want to use gmalloc so that it allocates memory off a single array we
define, so that we have fewer regions to write and read.)
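
A minimal sketch of the single-region idea, under simplifying assumptions:
all objects live in one array, are addressed by offset (so this toy needs
no pointer fixups at all), and the region is saved with one 'fwrite' for
the length and one for the contents.  'struct arena' and the 'arena_*'
functions are hypothetical names; this is not gmalloc.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { ARENA_SIZE = 4096 };

/* One contiguous region that holds every dumped object. */
struct arena {
    size_t used;
    unsigned char bytes[ARENA_SIZE];
};

/* Copies size bytes into the arena and returns their offset,
   or (size_t) -1 if the arena is full. */
size_t arena_put(struct arena *a, const void *src, size_t size)
{
    assert(a->used <= ARENA_SIZE);
    if (size > ARENA_SIZE - a->used)
        return (size_t) -1;
    size_t offset = a->used;
    memcpy(a->bytes + offset, src, size);
    a->used += size;
    return offset;
}

/* Writes the length, then the whole used region in a single call.
   Returns nonzero on success. */
int arena_dump(const struct arena *a, FILE *f)
{
    return fwrite(&a->used, sizeof a->used, 1, f) == 1
           && fwrite(a->bytes, 1, a->used, f) == a->used;
}

/* Reads back what arena_dump wrote.  Returns nonzero on success. */
int arena_load(struct arena *a, FILE *f)
{
    size_t used;
    if (fread(&used, sizeof used, 1, f) != 1 || used > ARENA_SIZE)
        return 0;
    a->used = used;
    return fread(a->bytes, 1, used, f) == used;
}
```

With real pointers inside the region instead of offsets, a fixup pass
would still be needed after arena_load.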

> >> The advantage of dumping to C code is that the compiler and linker
> >> will deserialize it for you.
> >
> > That's true, but I think you pay much more in the serialization phase.
> 
> That's fine. Serialization is rare, typically just when Emacs is built. 

By "pay" I meant the development, debugging, and maintenance costs,
not the run-time costs.

> Deserialization is much more common, typically whenever Emacs starts up. So it 
> can be a win to speed up and simplify deserialization at the expense of 
> serialization.

A typical non-trivial Emacs session takes several seconds, sometimes
25 or more, to start, so I don't think the un-dumping that needs to
read the data will be significant.  (Isn't that more or less what
XEmacs did with their "portable dumper"?)

> > the compiler and the linker were not meant for these jobs
> 
> I don't see why today's compilers and linkers wouldn't be up to these jobs. 

They are up to it today, but they are not meant for it.  Their
developers could easily decide that these jobs don't need to be
supported, and then we will be in the same situation as we are today
vis-à-vis the glibc development.

> > writing the dumped data and then reading it with fixups is
> > something we can do ourselves without relying on any external
> > technologies which need to be bent to our needs.
> 
> I don't think so. We need to rely on and/or work around properties of address 
> randomization which will be platform-dependent.

By the time you read the dumped data into Emacs, the randomization
will have been done already, so all you need is to fix up the pointers
in the dumped data accordingly.  Since the final effect of the
randomization is just to change the addresses by some fixed amount,
the fixup should be trivial, once you have a way of finding all the
pointers which need that.

> > As the number of people who are able to futz with Emacs
> > internals at this depth continues to dwindle,
> 
> This is exactly why we should let the compiler- and linker-writers do this work 
> for us.

But they won't!  They develop compilers and linkers, not tools to
undump Emacs.  Our specific use of their tools is not in their
projects' goals.

We once thought that Emacs is important enough for the Free Software
libraries to tweak themselves to accommodate us.  We were proven
wrong.  (AFAIK, only one Free Software library took that seriously,
and does that to this day.)  I see no reason to believe we will never
bump into similar problems by using tools whose main job is something
else.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-09  9:09                                           ` Eli Zaretskii
@ 2016-09-09 16:16                                             ` Paul Eggert
  2016-09-09 18:45                                               ` Eli Zaretskii
  2016-09-09 20:00                                               ` Philippe Vaucher
  0 siblings, 2 replies; 66+ messages in thread
From: Paul Eggert @ 2016-09-09 16:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, philippe.vaucher, 23529

On 09/09/2016 02:09 AM, Eli Zaretskii wrote:
>
> Can you elaborate about the other ways you had in mind?

The best way to elaborate this is to write code. That being said, there 
are a lot of pointers in the data structures of (e.g.) alloc.c and they 
need to be saved and restored and demangled in the process.

> What I had in mind is just a single 'write' (resp. 'read') call for
> any contiguous region of memory.  (For best results, we will probably
> want to use gmalloc so that it allocates memory off a single array we
> define, so that we have fewer regions to write and read.)

That is exactly the wrong way to go. We should not implement our own low 
level memory allocator again! Memory allocation is getting fancier and 
fancier internally in glibc and in other C libraries, for both 
performance and security/robustness reasons, and we shouldn't be wasting 
our development resources trying to keep up.


> By "pay" I meant the development, debugging, and maintenance costs,
> not the run-time costs.

I meant both.

> A typical non-trivial Emacs session takes several seconds, sometimes
> 25 or more, to start

?!  That may be typical for *you*. It is not typical for me. On the 
six-year-old desktop at work that I'm using to type this message (hard 
disks, no flash) in normal mode, Emacs by default takes 1.2 seconds to 
start up. Even 1.2 seconds is too long, as I start up Emacs a lot.

> Their
> developers could easily decide that these jobs don't need to be
> supported

That's not likely. C compilers are commonly used as back ends for other 
systems. Compiler writers take that part of the job seriously.

> all you need is to fixup the pointers
> in the dumped data accordingly.  Since the final effect of the
> randomization is just to change the addresses by some fixed amount,

No, every block is put into a random location. Otherwise it's not 
random. So different values need to be added to different pointers. 
Worse, you have to know where the pointers are.

> They develop compilers and linkers, not tools to
> undump Emacs.

And as long as we use them as compilers and linkers, we will be fine. We 
got into the current mess because we went under the covers of the 
underlying systems. That was reasonable in the 1980s when things were 
simpler, but it is not reasonable now.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-09  5:40                                   ` Eli Zaretskii
  2016-09-09  7:10                                     ` Paul Eggert
@ 2016-09-09 18:29                                     ` Andreas Schwab
  2016-09-09 18:56                                       ` Eli Zaretskii
  1 sibling, 1 reply; 66+ messages in thread
From: Andreas Schwab @ 2016-09-09 18:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, Paul Eggert, philippe.vaucher, 23529

On Sep 09 2016, Eli Zaretskii <eliz@gnu.org> wrote:

> defsubr does that, but fixing the address of the function after
> loading the dumped data is also very simple: for each defsubr, rewrite
> its function pointer.

Function pointers are difficult to handle, especially on architectures
that use function descriptors.  That's why the "portable" dumper of
xemacs doesn't work on ia64: it lumps together function and data
pointers.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-09 16:16                                             ` Paul Eggert
@ 2016-09-09 18:45                                               ` Eli Zaretskii
  2016-09-09 19:59                                                 ` Paul Eggert
  2016-09-09 20:00                                               ` Philippe Vaucher
  1 sibling, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-09 18:45 UTC (permalink / raw)
  To: Paul Eggert; +Cc: p.stephani2, philippe.vaucher, 23529

> Cc: p.stephani2@gmail.com, philippe.vaucher@gmail.com, 23529@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Fri, 9 Sep 2016 09:16:39 -0700
> 
> On 09/09/2016 02:09 AM, Eli Zaretskii wrote:
> >
> > Can you elaborate about the other ways you had in mind?
> 
> The best way to elaborate this is to write code. That being said, there 
> are a lot of pointers in the data structures of (e.g.) alloc.c and they 
> need to be saved and restored and demangled in the process.

All of those data structures live in memory allocated for Lisp objects
and their supporting structures, whose layouts are known, so we know
exactly which pointers need fixing.

> > What I had in mind is just a single 'write' (resp. 'read') call for
> > any contiguous region of memory.  (For best results, we will probably
> > want to use gmalloc so that it allocates memory off a single array we
> > define, so that we have fewer regions to write and read.)
> 
> That is exactly the wrong way to go. We should not implement our own low 
> level memory allocator again!

gmalloc is already implemented.

If there are libc's out there that allow the application to define its
own sbrk, then we could use that (we do on Windows).  If not, gmalloc
will be good enough for the temacs run; emacs will of course use the
normal libc allocators.

> Memory allocation is getting fancier and fancier internally in glibc
> and in other C libraries, for both performance and
> security/robustness reasons, and we shouldn't be wasting our
> development resources trying to keep up.

We will use libc in emacs.

> > By "pay" I meant the development, debugging, and maintenance costs,
> > not the run-time costs.
> 
> I meant both.

Each one is a different tradeoff.

> > A typical non-trivial Emacs session takes several seconds, sometimes
> > 25 or more, to start
> 
> ?!  That may be typical for *you*. It is not typical for me. On the 
> six-year-old desktop at work that I'm using to type this message (hard 
> disks, no flash) in normal mode, Emacs by default takes 1.2 seconds to 
> start up.

You have a small .emacs, I guess.

Anyway, even 1.2 sec is an eternity for the job at hand.  I don't see
a problem.

> > Their
> > developers could easily decide that these jobs don't need to be
> > supported
> 
> That's not likely. C compilers are commonly used as back ends for other 
> systems. Compiler writers take that part of the job seriously.

Yes, we used to think that back when unexec was implemented.

> > all you need is to fixup the pointers
> > in the dumped data accordingly.  Since the final effect of the
> > randomization is just to change the addresses by some fixed amount,
> 
> No, every block is put into a random location.

What is a "block" in this context?  Surely, a data structure with
linked pointers cannot be distributed between different "blocks",
since a linker will not know how to fixup each address, because it
doesn't understand the data structure.  So I think you are talking
about an issue that will not affect us.  If that's not so, please
describe the details; please don't hide behind the "easier to write
the code" argument, because this issue is IMO of the utmost importance
for the future of Emacs.

> Worse, you have to know where the pointers are.

We know.

> > They develop compilers and linkers, not tools to
> > undump Emacs.
> 
> And as long as we use them as compilers and linkers, we will be
> fine.

We won't be able to use them as just compilers and linkers.  We will
be using them for a job that is quite a bit more complex and
different.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-09 18:29                                     ` Andreas Schwab
@ 2016-09-09 18:56                                       ` Eli Zaretskii
  0 siblings, 0 replies; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-09 18:56 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: p.stephani2, eggert, philippe.vaucher, 23529

> From: Andreas Schwab <schwab@linux-m68k.org>
> Cc: Paul Eggert <eggert@cs.ucla.edu>,  p.stephani2@gmail.com,  philippe.vaucher@gmail.com,  23529@debbugs.gnu.org
> Date: Fri, 09 Sep 2016 20:29:30 +0200
> 
> On Sep 09 2016, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> > defsubr does that, but fixing the address of the function after
> > loading the dumped data is also very simple: for each defsubr, rewrite
> > its function pointer.
> 
> Function pointers are difficult to handle, especially on architectures
> that use function descriptors.  That's why the "portable" dumper of
> xemacs doesn't work on ia64: it lumps together function and data
> pointers.

Sorry, I don't understand: does defsubr work on ia64?  If so, doing in
emacs just the last part of it, which stores the function pointer,
should also work, right?






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-09 18:45                                               ` Eli Zaretskii
@ 2016-09-09 19:59                                                 ` Paul Eggert
  2016-09-10  6:06                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 66+ messages in thread
From: Paul Eggert @ 2016-09-09 19:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, philippe.vaucher, 23529

On 09/09/2016 11:45 AM, Eli Zaretskii wrote:
> All of those data structures are memory allocated for Lisp objects and
> their supporting structures, with known structures, so we know exactly
> which pointers need fixing.

Of course. But it's not trivial to fix them. It can be done, but it will 
take code that will be hard to maintain portably.

> gmalloc is already implemented

Yes, and its problems are prompting this discussion. gmalloc was a fine 
design for the 1980s but is not now.

> If there are libc's out there that allow the application to define its
> own sbrk, then we could use that (we do on Windows).

The sbrk model is becoming less and less plausible.

> If not, gmalloc
> will be good enough for the temacs run; emacs will of course use the
> normal libc allocators.

This would give up on redumping, no? Plus, it assumes sbrk, which is 
backward-looking. POSIX has withdrawn support for sbrk and there is 
movement to deprecate it in C libraries due to security/robustness 
concerns. This is something we should encourage, not run away from.

> What is a "block" in this context? Surely, a data structure with
> linked pointers cannot be distributed between different "blocks",
> since a linker will not know how to fixup each address, because it
> doesn't understand the data structure.

It can be distributed between different "blocks", because we can tell 
the compiler and linker the data structure. Here's a quick example with 
two small "blocks" dX and dY (the actual code would differ, this is just 
a proof of concept):

   /* Simplified version of lisp.h.  */
   #include <stdint.h>
   typedef intptr_t Lisp_Object;
   enum { Lisp_Int0 = 2, Lisp_Cons = 3 /* ... */};
   #define make_number(n) (((n) << 2) + Lisp_Int0)
   #define TAG_PTR(tag, ptr) ((intptr_t) (ptr) + (tag))
   #define Qnil ((Lisp_Object) 0)
   struct Lisp_Cons { Lisp_Object car, cdr; };

   /* Define a statically-allocated pair x that is equal to (10).  */
   struct Lisp_Cons dX = { make_number (10), Qnil };
   #define x TAG_PTR (Lisp_Cons, &dX)

   /* Use x to build a statically-allocated list y that is equal to
      (5 10).  */
   struct Lisp_Cons dY = { make_number (5), x };
   #define y TAG_PTR (Lisp_Cons, &dY)
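
One way to check the claim on a given toolchain is to decode the statically initialized list at run time. The following is only a sketch: the accessors xcons, xint and list_sum are hypothetical helpers rather than lisp.h functions, the macros are spelled LIST_X and LIST_Y instead of x and y to avoid clashing with other identifiers, and it assumes a compiler such as GCC or Clang that accepts a pointer-to-integer cast in a static initializer.

```c
/* Simplified version of lisp.h, as in the example above.  */
#include <stdint.h>

typedef intptr_t Lisp_Object;
enum { Lisp_Int0 = 2, Lisp_Cons = 3 };
#define make_number(n) (((n) << 2) + Lisp_Int0)
#define TAG_PTR(tag, ptr) ((intptr_t) (ptr) + (tag))
#define Qnil ((Lisp_Object) 0)
struct Lisp_Cons { Lisp_Object car, cdr; };

/* The same two statically allocated "blocks".  The initializer of
   dY.cdr holds the address of dX plus a tag, so the linker or loader
   has to relocate it; nothing in this file does that by hand.  */
struct Lisp_Cons dX = { make_number (10), Qnil };
#define LIST_X TAG_PTR (Lisp_Cons, &dX)
struct Lisp_Cons dY = { make_number (5), LIST_X };
#define LIST_Y TAG_PTR (Lisp_Cons, &dY)

/* Hypothetical accessor: strip the cons tag and recover the pointer.  */
struct Lisp_Cons *
xcons (Lisp_Object obj)
{
  return (struct Lisp_Cons *) (obj - Lisp_Cons);
}

/* Hypothetical accessor: strip the integer tag and recover the value.  */
intptr_t
xint (Lisp_Object obj)
{
  return (obj - Lisp_Int0) >> 2;
}

/* Walk a list of integers and add up its elements.  */
intptr_t
list_sum (Lisp_Object list)
{
  intptr_t sum = 0;
  while (list != Qnil)
    {
      struct Lisp_Cons *cell = xcons (list);
      sum += xint (cell->car);
      list = cell->cdr;
    }
  return sum;
}
```

If list_sum (LIST_Y) walks both cells when the program is built as a position-independent executable with randomization enabled, the tagged pointer stored in dY.cdr was relocated by the toolchain and the loader.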


> We won't be able to use them as just compilers and linkers. We will
> be using them for a job that is quite a bit more complex and
> different.

No, this sort of thing is something that compilers and linkers do all 
the time.







* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-09 16:16                                             ` Paul Eggert
  2016-09-09 18:45                                               ` Eli Zaretskii
@ 2016-09-09 20:00                                               ` Philippe Vaucher
  2016-09-10  6:13                                                 ` Eli Zaretskii
  1 sibling, 1 reply; 66+ messages in thread
From: Philippe Vaucher @ 2016-09-09 20:00 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Philipp Stephani, 23529


>> A typical non-trivial Emacs session takes several seconds, sometimes
>> 25 or more, to start
>
> ?!  That may be typical for *you*. It is not typical for me. On the
> six-year-old desktop at work that I'm using to type this message (hard
> disks, no flash) in normal mode, Emacs by default takes 1.2 seconds to
> start up. Even 1.2 seconds is too long, as I start up Emacs a lot.

I second that: most people load emacs in less than 7 seconds, and in
under 2 when they use use-package or equivalent. That is for emacs with
50+ packages.

> And as long as we use them as compilers and linkers, we will be fine. We
> got into the current mess because we went under the covers of the
> underlying systems. That was reasonable in the 1980s when things were
> simpler, but it is not reasonable now.

I agree with the spirit: we should try to simplify & modernize emacs.
Adding something very custom sounds like another maintenance hell.

Philippe



* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-09 19:59                                                 ` Paul Eggert
@ 2016-09-10  6:06                                                   ` Eli Zaretskii
  2016-09-10  7:52                                                     ` Paul Eggert
  0 siblings, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-10  6:06 UTC (permalink / raw)
  To: Paul Eggert; +Cc: p.stephani2, philippe.vaucher, 23529

> Cc: p.stephani2@gmail.com, philippe.vaucher@gmail.com, 23529@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Fri, 9 Sep 2016 12:59:16 -0700
> 
> On 09/09/2016 11:45 AM, Eli Zaretskii wrote:
> > All of those data structures are memory allocated for Lisp objects and
> > their supporting structures, with known structures, so we know exactly
> > which pointers need fixing.
> 
> Of course. But it's not trivial to fix them. It can be done, but it will 
> take code that will be hard to maintain portably.

I fail to see why it would be hard to maintain that portably.  Those
data structures are entirely our design and implementation; their
differences between platforms are almost non-existent.  Finding all
the pointers in them is almost trivial.

> > gmalloc is already implemented
> 
> Yes, and its problems are prompting this discussion. gmalloc was a fine 
> design for the 1980s but is not now.

temacs is not a program that needs to run for prolonged time
intervals; its only purpose is to produce the data that the un-dumped
Emacs will use.  So whether its malloc implementation is strong enough
by today's standards is not a relevant question.  What matters is
whether it is good enough for what temacs should do before it exits.

> > If there are libc's out there that allow the application to define its
> > own sbrk, then we could use that (we do on Windows).
> 
> The sbrk model is becoming less and less plausible.

Or whatever other back-end is used by malloc implementations; sbrk is
not an important detail.

> > If not, gmalloc
> > will be good enough for the temacs run; emacs will of course use the
> > normal libc allocators.
> 
> This would give up on redumping, no?

Not necessarily, we could have a variable that would force using the
pre-dump malloc in emacs.

> Plus, it assumes sbrk, which is backward-looking.

What part assumes sbrk?

> POSIX has withdrawn support for sbrk and there is 
> movement to deprecate it in C libraries due to security/robustness 
> concerns. This is something we should encourage, not run away from.

This is the wrong tree to bark up.  What we need is a malloc back-end
that will allow us to allocate memory off an implementation-specified
memory block, that's all.

If we cannot have that (which would surprise me, since MS-Windows does
provide such a feature), we can still implement undump using a data
file, but it will make our job slightly more complex, as we'd need to
collect the data allocated off the heap before dumping it.  Not rocket
science, either.
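
As a rough illustration of allocating off a single memory block that can then be written out with one call, here is a minimal bump-allocator sketch. It is not gmalloc and not existing Emacs code: arena_alloc, arena_bytes_used, arena_dump, ARENA_SIZE and ARENA_ALIGN are all hypothetical names, and it assumes a C11 compiler for _Alignas. It never frees memory, which would only be tolerable for a short temacs-style run.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical sizes chosen for the sketch.  */
enum { ARENA_SIZE = 1 << 16, ARENA_ALIGN = 16 };

/* The single contiguous region all allocations come from.  */
static _Alignas (16) unsigned char arena[ARENA_SIZE];
static size_t arena_used;

/* Hand out N bytes from the arena, rounded up to ARENA_ALIGN.
   Return NULL when the request overflows or does not fit.  */
void *
arena_alloc (size_t n)
{
  size_t rounded = (n + ARENA_ALIGN - 1) & ~(size_t) (ARENA_ALIGN - 1);
  if (rounded < n || rounded > ARENA_SIZE - arena_used)
    return NULL;
  void *p = arena + arena_used;
  arena_used += rounded;
  return p;
}

/* Number of arena bytes handed out so far.  */
size_t
arena_bytes_used (void)
{
  return arena_used;
}

/* Write the used part of the arena to FP with a single call.
   Return nonzero on success.  */
int
arena_dump (FILE *fp)
{
  return fwrite (arena, 1, arena_used, fp) == arena_used;
}
```

Pointers stored inside the arena would still be absolute addresses, so a reader of the dumped file would have to adjust them if the arena ends up at a different address.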

> > What is a "block" in this context? Surely, a data structure with
> > linked pointers cannot be distributed between different "blocks",
> > since a linker will not know how to fixup each address, because it
> > doesn't understand the data structure.
> 
> It can be distributed between different "blocks", because we can tell 
> the compiler and linker the data structure. Here's a quick example with 
> two small "blocks" dX and dY (the actual code would differ, this is just 
> a proof of concept):
> 
>    /* Simplified version of lisp.h.  */
>    #include <stdint.h>
>    typedef intptr_t Lisp_Object;
>    enum { Lisp_Int0 = 2, Lisp_Cons = 3 /* ... */};
>    #define make_number(n) (((n) << 2) + Lisp_Int0)
>    #define TAG_PTR(tag, ptr) ((intptr_t) (ptr) + (tag))
>    #define Qnil ((Lisp_Object) 0)
>    struct Lisp_Cons { Lisp_Object car, cdr; };
> 
>    /* Define a statically-allocated pair x that is equal to (10).  */
>    struct Lisp_Cons dX = { make_number (10), Qnil };
>    #define x TAG_PTR (Lisp_Cons, &dX)
> 
>    /* Use x to build a statically-allocated list y that is equal to
>       (5 10).  */
>    struct Lisp_Cons dY = { make_number (5), x };
>    #define y TAG_PTR (Lisp_Cons, &dY)

But we don't do these things in our code, so how is this relevant to
this discussion?

What I had in mind is the data structures we use to support
maintenance of Lisp objects.  One example is string_blocks, which we
use to maintain Lisp strings.  Surely, this structure will be in a
single "block" under memory randomization, right?

> > We won't be able to use them as just compilers and linkers. We will
> > be using them for a job that is quite a bit more complex and
> > different.
> 
> No, this sort of thing is something that compilers and linkers do all 
> the time.

We won't know for sure until this is fully implemented.

Anyway, my take from this discussion is that we shouldn't give up so
easily on dumping data as a binary file, as that approach sounds to me
more future-proof than relying (again) on external technologies that
were not meant for this specific job.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-09 20:00                                               ` Philippe Vaucher
@ 2016-09-10  6:13                                                 ` Eli Zaretskii
  0 siblings, 0 replies; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-10  6:13 UTC (permalink / raw)
  To: Philippe Vaucher; +Cc: p.stephani2, eggert, 23529

> From: Philippe Vaucher <philippe.vaucher@gmail.com>
> Date: Fri, 9 Sep 2016 22:00:43 +0200
> Cc: Philipp Stephani <p.stephani2@gmail.com>, 23529@debbugs.gnu.org, 
> 	Eli Zaretskii <eliz@gnu.org>
> 
> >> A typical non-trivial Emacs session takes several seconds, sometimes
> >> 25 or more, to start
> >
> > ?! That may be typical for *you*. It is not typical for me. On the
> > six-year-old desktop at work that I'm using to type this message (hard
> > disks, no flash) in normal mode, Emacs by default takes 1.2 seconds to
> > start up. Even 1.2 seconds is too long, as I start up Emacs a lot.
> 
> I second that, most people load emacs in less than 7 seconds, and below 2 when they use use-package or
> equivalent. That is for emacs with 50+ packages.

Didn't I say "several seconds"?  Where's the contradiction?

Several seconds is a lot of time for modern CPUs; address fixup is a
rather cheap operation.  That's what the dynamic linker does every
time you load a shared library -- did you ever find that loading
annoyingly long?

> I agree with the spirit: we should try to simplify & modernize emacs.
> Adding something very custom sounds like another maintenance hell.

Excuse me, but how is relying on compilers and linkers more "modern"?

"Maintenance hell"?  Did you see how many changes in Emacs are done to
adapt Emacs to some particular compiler on some particular system?  I
suggest you take a look at "git log" (search for "Port "); keeping up
with that is no less "maintenance hell" than maintaining our own code.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-10  6:06                                                   ` Eli Zaretskii
@ 2016-09-10  7:52                                                     ` Paul Eggert
  2016-09-10 10:19                                                       ` Eli Zaretskii
  0 siblings, 1 reply; 66+ messages in thread
From: Paul Eggert @ 2016-09-10  7:52 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, philippe.vaucher, 23529

Eli Zaretskii wrote:

> I fail to see why it would be hard to maintain that portably.  Those
> data structures are entirely our design and implementation

If it were *that* easy to do, the garbage collector would be doing it. It does 
not. It uses conservative collection, which is easier as it does not relocate 
pointers.

> temacs is not a program that needs to run for prolonged time
> intervals, its only purpose is to produce the data that the un-dumped
> Emacs will use.  So whether its malloc implementation is strong enough
> by today's standards is not a relevant question.  What matters is
> whether it is good enough for what temacs should do before it exits.

Fair enough. Still this hybrid-implementation approach, where the code uses one 
malloc implementation before dumping, and a different one after, is an extra 
complexity that makes it harder to understand and maintain Emacs. It would be 
better to remove this hack, and we should not be piling even more gingerbread 
atop it.

> we could have a variable that would force using the
> pre-dump malloc in emacs.

That would be still more complexity and state.

>> Plus, it assumes sbrk, which is backward-looking.
>
> What part assumes sbrk?

The current gmalloc implementation assumes the sbrk model, and operates poorly 
(if at all) when the underlying implementation uses address randomization. We 
are already at the edge of portability here; the fact that it works at all on 
modern GNU/Linux is a bit of an accident, requires mysterious tweaks 
occasionally at the C level, and there's no guarantee it will continue to work.

> we can still implement undump using a data
> file, but it will make our job slightly more complex, as we'd need to
> collect the data allocated off the heap before dumping it.  Not rocket
> science, either.

None of this is rocket science! But it is unnecessary complexity.

> But we don't do these things in our code, so how is this relevant to
> this discussion?

We do almost all of that example in our code already. Most of the example was 
taken from lisp.h (with some simplifications just for the example; the actual 
implementation would be based on the current lisp.h). The example demonstrates 
that compilers and linkers can relocate tagged Lisp pointers themselves, which 
means we don't have to do that ourselves.

> One example is string_blocks, which we
> use to maintain Lisp strings.  Surely, this structure will be in a
> single "block" under memory randomization, right?

That would be simpler, at least at first. But it's not the only possibility. For 
example, we could put each pure string in a separate block.







* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-10  7:52                                                     ` Paul Eggert
@ 2016-09-10 10:19                                                       ` Eli Zaretskii
  2016-09-10 23:01                                                         ` Paul Eggert
  0 siblings, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-10 10:19 UTC (permalink / raw)
  To: Paul Eggert; +Cc: p.stephani2, philippe.vaucher, 23529

> Cc: p.stephani2@gmail.com, philippe.vaucher@gmail.com, 23529@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sat, 10 Sep 2016 00:52:33 -0700
> 
> Eli Zaretskii wrote:
> 
> > I fail to see why it would be hard to maintain that portably.  Those
> > data structures are entirely our design and implementation
> 
> If it were *that* easy to do, the garbage collector would be doing it. It does 
> not. It uses conservative collection, which is easier as it does not relocate 
> pointers.

Conservative stack marking is for Lisp objects held in variables on
the stack.  Those objects cannot be relevant to dumping, because
stack-based variables disappear without a trace when we dump _today_,
and we don't have any problems with that.

GC cannot disregard stack-based values, without asking the programmer
to use GCPRO.

> > temacs is not a program that needs to run for prolonged time
> > intervals, its only purpose is to produce the data that the un-dumped
> > Emacs will use.  So whether its malloc implementation is strong enough
> > by today's standards is not a relevant question.  What matters is
> > whether it is good enough for what temacs should do before it exits.
> 
> Fair enough. Still this hybrid-implementation approach, where the code uses one 
> malloc implementation before dumping, and a different one after, is an extra 
> complexity that makes it harder to understand and maintain Emacs. It would be 
> better to remove this hack, and we should not be piling even more gingerbread 
> atop it.

I agree.  If mainline libc allows such control over its memory
allocation back-end, it is better to use that than rely on our own
replacement allocator.

> > we could have a variable that would force using the
> > pre-dump malloc in emacs.
> 
> That would be still more complexity and state.
> 
> >> Plus, it assumes sbrk, which is backward-looking.
> >
> > What part assumes sbrk?
> 
> The current gmalloc implementation assumes the sbrk model, and operates poorly 
> (if at all) when the underlying implementation uses address randomization.

What about disabling randomization for the temacs run?

> > But we don't do these things in our code, so how is this relevant to
> > this discussion?
> 
> We do almost all of that example in our code already. Most of the example was 
> taken from lisp.h (with some simplifications just for the example; the actual 
> implementation would be based on the current lisp.h).

No, I don't think we do that in code that runs in temacs.  If you see
such code, which defines statically-allocated Lisp objects that need
to survive dumping, please point me to it.

In any case, even if such static Lisp objects exist, they just need to
be fixed as well, as part of un-dumping.

> The example demonstrates 
> that compilers and linkers can relocate tagged Lisp pointers themselves, which 
> means we don't have to do that ourselves.

You don't need to convince me that a linker can relocate addresses, I
know that.  Our differences of opinions are not about that.

> > One example is string_blocks, which we
> > use to maintain Lisp strings.  Surely, this structure will be in a
> > single "block" under memory randomization, right?
> 
> That would be simpler, at least at first. But it's not the only possibility. For 
> example, we could put each pure string in a separate block.

I don't see why we would want to; it would have too many
disadvantages.  But even if we did, it just means a separate fixup
value for each block, that's all.
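
A minimal sketch of what per-block fixup could look like when dumped data is read back in. The names struct block_map and fixup_address are hypothetical and not existing Emacs code. The sketch assumes the dump recorded each block's old base address and size, that the loader knows where each block landed, and that any tag bits have already been stripped from the pointer being fixed.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry per dumped block: where it lived when dumped, how big it
   is, and where it was loaded this time.  */
struct block_map
{
  uintptr_t old_base;
  size_t size;
  uintptr_t new_base;
};

/* Translate OLD_ADDR, an address valid in the dumping process, into
   the matching address in the current process.  Each block gets its
   own offset, so blocks may move independently.  Return 0 if OLD_ADDR
   lies outside every dumped block.  */
uintptr_t
fixup_address (const struct block_map *map, size_t nblocks,
               uintptr_t old_addr)
{
  for (size_t i = 0; i < nblocks; i++)
    {
      /* Unsigned subtraction wraps around for addresses below
         old_base, so one comparison covers both bounds.  */
      uintptr_t offset = old_addr - map[i].old_base;
      if (offset < map[i].size)
        return map[i].new_base + offset;
    }
  return 0;
}
```

The linear search is only for clarity. With many blocks a sorted table and binary search would be the obvious refinement, and the harder part remains enumerating every pointer that needs this treatment.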






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-10 10:19                                                       ` Eli Zaretskii
@ 2016-09-10 23:01                                                         ` Paul Eggert
  2016-09-11 15:23                                                           ` Eli Zaretskii
  0 siblings, 1 reply; 66+ messages in thread
From: Paul Eggert @ 2016-09-10 23:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, philippe.vaucher, 23529

Eli Zaretskii wrote:

> Conservative stack marking is for Lisp objects held in variables on
> the stack.  Those objects cannot be relevant to dumping

Yes, but the conservativeness of the marking phase means Emacs cannot relocate 
objects. This is true regardless of whether the objects-that-can't-be-moved 
reside on the stack or on the heap.

> If mainline libc allows such control on its memory
> allocation back-end, it is better to use that than rely on our own
> replacement allocator.

Although that might be better than what we're doing, better yet would be to not 
fiddle with such internal details of malloc at all.

> What about disabling randomization for the temacs run?

That is yet another low-level thing to configure, and to get right in new ports. 
The approach I'm suggesting does not rely on disabling randomization.

> I don't think we do that in code that runs in temacs.

This point is a tangent to its containing thread, as the thread in question is 
about whether compilers and linkers can relocate pointers for us. The code 
example establishes that compilers and linkers can do so, regardless of whether 
Emacs is using that capability now.







* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-10 23:01                                                         ` Paul Eggert
@ 2016-09-11 15:23                                                           ` Eli Zaretskii
  2016-09-11 16:59                                                             ` Paul Eggert
  2016-09-11 19:32                                                             ` Philippe Vaucher
  0 siblings, 2 replies; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-11 15:23 UTC (permalink / raw)
  To: Paul Eggert; +Cc: p.stephani2, philippe.vaucher, 23529

> Cc: p.stephani2@gmail.com, philippe.vaucher@gmail.com, 23529@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sat, 10 Sep 2016 16:01:28 -0700
> 
>     Conservative stack marking is for Lisp objects held in variables on
>     the stack.  Those objects cannot be relevant to dumping
> 
> Yes, but the conservativeness of the marking phase means Emacs cannot relocate objects.

I don't understand how this is relevant.  What do you mean by
"relocating objects", and why would we need to do that as part of
un-dumping?

>     If mainline libc allows such control on its memory
>     allocation back-end, it is better to use that than rely on our own
>     replacement allocator.
> 
> Although that might be better than what we're doing, better yet would
> be to not fiddle with such internal details of malloc at all.

Yes, and it's better not to fiddle with Emacs at all, if all we want
is simple C programs.

>     What about disabling randomization for the temacs run?
> 
> That is yet another low-level thing to configure, and to get right in new ports.

We already have that in Emacs, don't we?

> The approach I'm suggesting does not rely on disabling randomization.

It has other costs, though.  A tradeoff should consider them all, not
one by one.

> This point is a tangent to its containing thread, as the thread in
> question is about whether compilers and linkers can relocate pointers
> for us. The code example establishes that compilers and linkers can do
> so, regardless of whether Emacs is using that capability now.

No, this point started with me saying that dumping and reading dumped
data with fixups is relatively easy, and you objecting that address
randomization will defeat that.  Now we agree that it's a tangential
issue unrelated to my proposal.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-11 15:23                                                           ` Eli Zaretskii
@ 2016-09-11 16:59                                                             ` Paul Eggert
  2016-09-11 17:19                                                               ` Eli Zaretskii
  2016-09-11 19:32                                                             ` Philippe Vaucher
  1 sibling, 1 reply; 66+ messages in thread
From: Paul Eggert @ 2016-09-11 16:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, philippe.vaucher, 23529

Eli Zaretskii wrote:

> What do you mean by
> "relocating objects", and why would we need to do that as part of
> un-dumping?

Objects in the new Emacs might have different addresses from objects in the old 
Emacs, due to address randomization.

>>     What about disabling randomization for the temacs run?
>>
>> That is yet another low-level thing to configure, and to get right in new ports.
>
> We already have that in Emacs, don't we?

Yes, and that's one of the problems that we should fix. It causes us to run into 
portability problems. It's still not clear that the current implementation will 
actually work on POSIXish systems. We are on (or over) the edge of portability 
here, and we need to get off.

> No, this point started with me saying dumping and reading dumped data
> with fixups is relatively easy, and you objecting saying address
> randomizations will defeat that.  Now we agree that it's a tangential
> issue unrelated to my proposal.

We don't agree.







* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-11 16:59                                                             ` Paul Eggert
@ 2016-09-11 17:19                                                               ` Eli Zaretskii
  0 siblings, 0 replies; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-11 17:19 UTC (permalink / raw)
  To: Paul Eggert; +Cc: p.stephani2, philippe.vaucher, 23529

> Cc: p.stephani2@gmail.com, philippe.vaucher@gmail.com, 23529@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sun, 11 Sep 2016 09:59:52 -0700
> 
> Eli Zaretskii wrote:
> 
> > What do you mean by
> > "relocating objects", and why would we need to do that as part of
> > un-dumping?
> 
> Objects in the new Emacs might have different addresses from objects in the old 
> Emacs, due to address randomization.

Yes, but I don't think we will see them in the dumped data, except as
results of DEFVAR etc., which will have to be fixed up anyway.
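
To make "fixing up" concrete, here is a minimal sketch of the general
technique.  It is not Emacs code, and every name in it (struct node,
struct dumped_node, dump_nodes, load_nodes, roundtrip_sum) is
hypothetical.  It assumes all objects live in one contiguous block:
pointers are written out as byte offsets from the start of the block
and turned back into pointers relative to wherever the block happens
to land when it is read back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical live object: a list node holding an absolute pointer.  */
struct node
{
  int value;
  struct node *next;
};

/* Hypothetical on-disk form: the pointer becomes an offset.  */
struct dumped_node
{
  int value;
  size_t next_offset;
};

/* Offset that stands for a null pointer in the dumped image.  */
#define NIL_OFFSET SIZE_MAX

/* Write N nodes starting at BASE into OUT, replacing each pointer by
   its byte offset from BASE.  */
void
dump_nodes (const struct node *base, size_t n, struct dumped_node *out)
{
  for (size_t i = 0; i < n; i++)
    {
      out[i].value = base[i].value;
      out[i].next_offset
        = (base[i].next
           ? (size_t) ((const char *) base[i].next - (const char *) base)
           : NIL_OFFSET);
    }
}

/* Rebuild N nodes at NEW_BASE, which may be at any address; this is
   the fixup step.  */
void
load_nodes (const struct dumped_node *in, size_t n, struct node *new_base)
{
  for (size_t i = 0; i < n; i++)
    {
      new_base[i].value = in[i].value;
      new_base[i].next
        = (in[i].next_offset == NIL_OFFSET
           ? NULL
           : (struct node *) ((char *) new_base + in[i].next_offset));
    }
}

/* Dump a three-node list, reload it into freshly allocated memory,
   and return the sum of the values reached by following the reloaded
   pointers.  Return -1 if allocation fails.  */
int
roundtrip_sum (void)
{
  struct node live[3] = { { 1, &live[2] }, { 2, NULL }, { 3, &live[1] } };
  struct dumped_node image[3];
  dump_nodes (live, 3, image);

  struct node *fresh = malloc (3 * sizeof *fresh);
  if (!fresh)
    return -1;
  load_nodes (image, 3, fresh);

  int sum = 0;
  for (struct node *p = fresh; p; p = p->next)
    {
      /* Every reloaded pointer must point into the new block.  */
      assert (fresh <= p && p < fresh + 3);
      sum += p->value;
    }
  free (fresh);
  return sum;
}
```

Nothing in the dumped image depends on where the original block was,
so the round trip gives the same list whatever addresses the two
processes get.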






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-11 15:23                                                           ` Eli Zaretskii
  2016-09-11 16:59                                                             ` Paul Eggert
@ 2016-09-11 19:32                                                             ` Philippe Vaucher
  2016-09-12  2:30                                                               ` Eli Zaretskii
  1 sibling, 1 reply; 66+ messages in thread
From: Philippe Vaucher @ 2016-09-11 19:32 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Philipp Stephani, Paul Eggert, 23529

>>     What about disabling randomization for the temacs run?
>>
>> That is yet another low-level thing to configure, and to get right in new ports.
>
> We already have that in Emacs, don't we?

That is exactly why I made the bug report 23529!

Because Emacs does things at build time that require "high"
privileges (like the personality() syscall), one cannot build Emacs in
various restricted environments.
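
For readers unfamiliar with the call, here is a small Linux-only
sketch of what querying and changing the ASLR personality flag looks
like.  It is an illustration, not the code Emacs uses; the function
names aslr_disabled and try_disable_aslr are made up.  As far as I
understand, a program that wants the flag to apply to its own image
then has to re-exec itself, and it is the flag-setting call that
restricted environments tend to refuse.

```c
#include <sys/personality.h>

/* Return 1 if ASLR is already disabled for this process, 0 if it is
   not, and -1 if the query itself failed.  Passing 0xffffffff asks
   for the current persona without changing it.  */
int
aslr_disabled (void)
{
  int persona = personality (0xffffffff);
  if (persona == -1)
    return -1;
  return (persona & ADDR_NO_RANDOMIZE) != 0;
}

/* Try to add ADDR_NO_RANDOMIZE to the current persona.  Return 0 on
   success and -1 on failure (errno is set by personality).  A
   sandbox or seccomp filter may make this fail.  */
int
try_disable_aslr (void)
{
  int persona = personality (0xffffffff);
  if (persona == -1)
    return -1;
  if (personality ((unsigned long) persona | ADDR_NO_RANDOMIZE) == -1)
    return -1;
  return 0;
}
```

A build step that depends on try_disable_aslr succeeding will break
wherever that call is filtered, which is the situation described
above.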

Disabling randomization is exactly what we should get rid of, at least
at build time. Having some part of Emacs that requires somewhat high
privileges when it runs is ok, but that should *not* be part of the
standard build procedure... Being able to build GCC inside a container
but not Emacs is just "wrong" IMHO.

Philippe






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-11 19:32                                                             ` Philippe Vaucher
@ 2016-09-12  2:30                                                               ` Eli Zaretskii
  2016-09-12  2:58                                                                 ` Clément Pit--Claudel
  2016-09-12 14:10                                                                 ` Philippe Vaucher
  0 siblings, 2 replies; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-12  2:30 UTC (permalink / raw)
  To: Philippe Vaucher; +Cc: p.stephani2, eggert, 23529

> From: Philippe Vaucher <philippe.vaucher@gmail.com>
> Date: Sun, 11 Sep 2016 21:32:51 +0200
> Cc: Paul Eggert <eggert@cs.ucla.edu>, Philipp Stephani <p.stephani2@gmail.com>, 23529@debbugs.gnu.org
> 
> >>     What about disabling randomization for the temacs run?
> >>
> >> That is yet another low-level thing to configure, and to get right in new ports.
> >
> > We already have that in Emacs, don't we?
> 
> That is exactly why I made the bug report 23529!
> 
> Because Emacs does stuffs at build time that requires "high"
> privileges (like the personality() syscall), one cannot build Emacs in
> various restricted environments.
> 
> Disabling randomization is exactly what we should get rid of, at least
> at build time.

Isn't it the other way around: the first priority is to enable
randomization and all the other modern techniques for running the
dumped Emacs?






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-12  2:30                                                               ` Eli Zaretskii
@ 2016-09-12  2:58                                                                 ` Clément Pit--Claudel
  2016-09-12  6:09                                                                   ` Philipp Stephani
  2016-09-12 14:10                                                                 ` Philippe Vaucher
  1 sibling, 1 reply; 66+ messages in thread
From: Clément Pit--Claudel @ 2016-09-12  2:58 UTC (permalink / raw)
  To: 23529



On 2016-09-11 22:30, Eli Zaretskii wrote:
>> From: Philippe Vaucher <philippe.vaucher@gmail.com>
>> Date: Sun, 11 Sep 2016 21:32:51 +0200
>> Cc: Paul Eggert <eggert@cs.ucla.edu>, Philipp Stephani <p.stephani2@gmail.com>, 23529@debbugs.gnu.org
>>
>>>>     What about disabling randomization for the temacs run?
>>>>
>>>> That is yet another low-level thing to configure, and to get right in new ports.
>>>
>>> We already have that in Emacs, don't we?
>>
>> That is exactly why I made the bug report 23529!
>>
>> Because Emacs does stuffs at build time that requires "high"
>> privileges (like the personality() syscall), one cannot build Emacs in
>> various restricted environments.
>>
>> Disabling randomization is exactly what we should get rid of, at least
>> at build time.
> 
> Isn't it the other way around: the first priority is to enable
> randomization and all the other modern techniques for running the
> dumped Emacs?

I think we want to be able to build the full Emacs in a container; that is without needing, at any point in the process, to disable randomization.  If I understand correctly, this means that even the process of dumping Emacs cannot involve disabling randomization.

Clément.




* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-12  2:58                                                                 ` Clément Pit--Claudel
@ 2016-09-12  6:09                                                                   ` Philipp Stephani
  2016-09-12 17:04                                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 66+ messages in thread
From: Philipp Stephani @ 2016-09-12  6:09 UTC (permalink / raw)
  To: Clément Pit--Claudel, 23529


On Mon, 12 Sep 2016 at 04:59, Clément Pit--Claudel
<clement.pit@gmail.com> wrote:

> On 2016-09-11 22:30, Eli Zaretskii wrote:
> >> From: Philippe Vaucher <philippe.vaucher@gmail.com>
> >> Date: Sun, 11 Sep 2016 21:32:51 +0200
> >> Cc: Paul Eggert <eggert@cs.ucla.edu>, Philipp Stephani <
> p.stephani2@gmail.com>, 23529@debbugs.gnu.org
> >>
> >>>>     What about disabling randomization for the temacs run?
> >>>>
> >>>> That is yet another low-level thing to configure, and to get right in
> new ports.
> >>>
> >>> We already have that in Emacs, don't we?
> >>
> >> That is exactly why I made the bug report 23529!
> >>
> >> Because Emacs does stuffs at build time that requires "high"
> >> privileges (like the personality() syscall), one cannot build Emacs in
> >> various restricted environments.
> >>
> >> Disabling randomization is exactly what we should get rid of, at least
> >> at build time.
> >
> > Isn't it the other way around: the first priority is to enable
> > randomization and all the other modern techniques for running the
> > dumped Emacs?
>
> I think we want to be able to build the full Emacs in a container; that is
> without needing, at any point in the process, to disable randomization.  If
> I understand correctly, this means that even the process of dumping Emacs
> cannot involve disabling randomization.
>

Yes, that's correct. No step in the build process should have to disable
randomization.



* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-12  2:30                                                               ` Eli Zaretskii
  2016-09-12  2:58                                                                 ` Clément Pit--Claudel
@ 2016-09-12 14:10                                                                 ` Philippe Vaucher
  2016-09-12 14:18                                                                   ` Philippe Vaucher
  1 sibling, 1 reply; 66+ messages in thread
From: Philippe Vaucher @ 2016-09-12 14:10 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Philipp Stephani, Paul Eggert, 23529

On Mon, Sep 12, 2016 at 4:30 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>> From: Philippe Vaucher <philippe.vaucher@gmail.com>
>> Date: Sun, 11 Sep 2016 21:32:51 +0200
>> Cc: Paul Eggert <eggert@cs.ucla.edu>, Philipp Stephani <p.stephani2@gmail.com>, 23529@debbugs.gnu.org
>>
>> >>     What about disabling randomization for the temacs run?
>> >>
>> >> That is yet another low-level thing to configure, and to get right in new ports.
>> >
>> > We already have that in Emacs, don't we?
>>
>> That is exactly why I made the bug report 23529!
>>
>> Because Emacs does stuffs at build time that requires "high"
>> privileges (like the personality() syscall), one cannot build Emacs in
>> various restricted environments.
>>
>> Disabling randomization is exactly what we should get rid of, at least
>> at build time.
>
> Isn't it the other way around: the first priority is to enable
> randomization and all the other modern techniques for running the
> dumped Emacs?

Now you confuse me... let's start over: I'm saying that we should try
to make Emacs build fine whether or not there is address randomization.
There should not be "disable ASLR" hacks when building/dumping, like
there currently are.

Basically, Emacs should not require higher privileges than GCC, which
is currently not the case with all the personality() syscalls &
friends.

Now, maybe I misunderstood you when you said "What about disabling
randomization for the temacs run?". I understood that as "Why don't we
disable ASLR when temacs runs?", which is exactly why Emacs has
problems building in restricted environments.

Philippe






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-12 14:10                                                                 ` Philippe Vaucher
@ 2016-09-12 14:18                                                                   ` Philippe Vaucher
  2016-09-13 14:47                                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 66+ messages in thread
From: Philippe Vaucher @ 2016-09-12 14:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Philipp Stephani, Paul Eggert, 23529

Interesting links from https://github.com/docker/docker/issues/22801

http://www.openwall.com/lists/musl/2015/02/03/1
https://lwn.net/Articles/673724/






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-12  6:09                                                                   ` Philipp Stephani
@ 2016-09-12 17:04                                                                     ` Eli Zaretskii
  0 siblings, 0 replies; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-12 17:04 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: 23529, clement.pit

> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Mon, 12 Sep 2016 06:09:10 +0000
> 
>     > Isn't it the other way around: the first priority is to enable
>     > randomization and all the other modern techniques for running the
>     > dumped Emacs?
> 
>     I think we want to be able to build the full Emacs in a container; that is without needing, at any point in the process, to disable randomization.  If I understand correctly, this means that even the process of dumping Emacs cannot involve disabling randomization.
> 
> Yes, that's correct. No step in the build process should have to disable randomization.

Got it, thanks.

However, on second thought, I don't see why this would be an issue.
I've mentioned gmalloc as a candidate for a malloc implementation
during the temacs run (i.e. during dumping), because gmalloc can be
told to use our own sbrk, and that sbrk could allocate memory off an
array we define; this might make the job of finding the memory to dump
easier.  Paul said that gmalloc doesn't work well when ASLR is
enabled, but I now think this is not relevant, because we will be
allocating from a single contiguous array, which AFAIU is unaffected
by ASLR, and also makes those gmalloc problems a non-issue as a side
effect.
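
A rough sketch of the "sbrk off an array" idea, to show how little is
involved.  This is only an illustration under my own assumptions: the
names arena_sbrk and ARENA_SIZE are hypothetical, the size is
arbitrary, and how gmalloc would be told to call such a function is
not shown here.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical size of the static arena; a real value would need to
   cover everything allocated during the temacs run.  */
enum { ARENA_SIZE = 1 << 20 };

static char arena[ARENA_SIZE];
static size_t arena_used;

/* sbrk-like interface over the static array: move the "break" by
   INCREMENT bytes and return the previous break.  On failure return
   (void *) -1 and set errno to ENOMEM, as sbrk does.  */
void *
arena_sbrk (ptrdiff_t increment)
{
  int out_of_range
    = (increment < 0
       /* Shrinking by more than is in use; written this way to avoid
          negating the most negative ptrdiff_t.  */
       ? (size_t) -(increment + 1) >= arena_used
       /* Growing past the end of the array.  */
       : (size_t) increment > ARENA_SIZE - arena_used);
  if (out_of_range)
    {
      errno = ENOMEM;
      return (void *) -1;
    }

  void *old_break = arena + arena_used;
  arena_used = (size_t) ((ptrdiff_t) arena_used + increment);
  return old_break;
}
```

Everything handed out this way lies between arena and
arena + arena_used, so the memory to dump is one contiguous range
whose layout relative to the start of the array does not depend on
where the array itself is loaded.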

Moreover, if for some reason using gmalloc is not an option, or
doesn't really help with this job, that would just make the job of
collecting the memory to dump harder, but not too hard.  Again, ASLR
adds nothing to this picture, as the job of collecting the memory to
dump will be based on known pointers to known data structures, and the
values of the addresses these pointers point to are of no importance.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-12 14:18                                                                   ` Philippe Vaucher
@ 2016-09-13 14:47                                                                     ` Eli Zaretskii
  2016-09-13 15:21                                                                       ` Philippe Vaucher
  2016-09-13 15:51                                                                       ` Paul Eggert
  0 siblings, 2 replies; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-13 14:47 UTC (permalink / raw)
  To: Philippe Vaucher; +Cc: p.stephani2, eggert, 23529

What about this idea:

  https://lists.gnu.org/archive/html/emacs-devel/2016-01/msg01049.html

To recap: the idea is to dump the dumping (pun intended), and instead
load the preloaded packages upon each session start.  This currently
takes about 5 to 12 sec, depending on the platform and the
optimization switches, but a simple optimization brings it down to
below 0.5 sec in an optimized build.  Further testing indicates that
lumping all of the files we preload into a single .elc file reduces
the load time even more, so it becomes around 0.1-0.2 sec.  That
should be short enough to make it negligible, right?

This sounds like a much easier and lower-risk approach.  Its significant
advantage is that it doesn't require any serious changes in the
low-level infrastructure -- no need for generating C code or dump data
records as part of DEFUN, defsubr, and staticpro, no need to hunt for,
dump, and restore global variables that hold Lisp objects, no
dependencies on external tools, etc.  Almost everything in support of
this method, like the ability to redirect results of compiling several
source files into a single .elc file, can be done in Lisp, mainly in
loadup.el, plus some Makefile changes in the last stages of the build
process.  A much simpler design and higher-level implementation mean
more people could be involved in working on this, testing, and
debugging, instead of relying on a couple of "usual suspects" who are
overloaded anyway.  Having a single .elc file provides a nice bonus of
being able to run Emacs even if the Lisp files are not available.  And
re-dumping can be supported with almost no effort.

If this idea is accepted, the first question to ask is whether we
still need pure storage on modern platforms.  If we do,
find_string_data_in_pure will have to be sped up by using a hash table
or some such.  If pure storage is not needed, purecopy can be a no-op
and find_string_data_in_pure should simply go away, which is of course
much easier.
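
To give an idea of what "using a hash table" could look like, here is
a generic sketch of looking up string data by content.  It does not
reproduce what find_string_data_in_pure does today, and all names in
it (find_or_record, TABLE_SLOTS, struct entry, fnv1a) are
hypothetical.  It only handles whole-string matches and never grows
the table.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of slots; a power of two so that masking can replace the
   modulo.  This sketch never resizes the table.  */
enum { TABLE_SLOTS = 1024 };

struct entry
{
  const char *data;   /* NULL marks an empty slot.  */
  size_t len;
};

static struct entry table[TABLE_SLOTS];

/* 32-bit FNV-1a hash of DATA[0..LEN).  */
static uint32_t
fnv1a (const char *data, size_t len)
{
  uint32_t h = 2166136261u;
  for (size_t i = 0; i < len; i++)
    {
      h ^= (unsigned char) data[i];
      h *= 16777619u;
    }
  return h;
}

/* If a string with the same LEN bytes as DATA was recorded earlier,
   return that earlier copy.  Otherwise record DATA and return it.
   DATA must not be NULL.  Return NULL if the table is full.  */
const char *
find_or_record (const char *data, size_t len)
{
  size_t i = fnv1a (data, len) & (TABLE_SLOTS - 1);
  for (size_t probes = 0; probes < TABLE_SLOTS; probes++)
    {
      struct entry *e = &table[i];
      if (!e->data)
        {
          e->data = data;
          e->len = len;
          return data;
        }
      if (e->len == len && memcmp (e->data, data, len) == 0)
        return e->data;
      /* Linear probing: try the next slot, wrapping around.  */
      i = (i + 1) & (TABLE_SLOTS - 1);
    }
  return NULL;
}
```

Each lookup then costs roughly one hash computation plus a few
comparisons, independent of how much string data has already been
recorded.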

Comments?






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-13 14:47                                                                     ` Eli Zaretskii
@ 2016-09-13 15:21                                                                       ` Philippe Vaucher
  2016-09-13 15:55                                                                         ` Eli Zaretskii
  2016-09-13 15:51                                                                       ` Paul Eggert
  1 sibling, 1 reply; 66+ messages in thread
From: Philippe Vaucher @ 2016-09-13 15:21 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Philipp Stephani, Paul Eggert, 23529

> What about this idea:
>
>   https://lists.gnu.org/archive/html/emacs-devel/2016-01/msg01049.html
>
> To recap: the idea is to dump the dumping (pun intended), and instead
> load the preloaded packages upon each session start.

I'm all for removing the dumping entirely, because I never use it
(using use-package is fast enough for me).

I thought you wanted to keep the dumping feature though, maybe I
misunderstood you (again) :-)

Philippe






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-13 14:47                                                                     ` Eli Zaretskii
  2016-09-13 15:21                                                                       ` Philippe Vaucher
@ 2016-09-13 15:51                                                                       ` Paul Eggert
  2016-09-13 19:24                                                                         ` Eli Zaretskii
  1 sibling, 1 reply; 66+ messages in thread
From: Paul Eggert @ 2016-09-13 15:51 UTC (permalink / raw)
  To: Eli Zaretskii, Philippe Vaucher; +Cc: p.stephani2, 23529

Eli Zaretskii wrote:
> the idea is to dump the dumping (pun intended), and instead
> load the preloaded packages upon each session start.

This approach could well work. It addresses the portability concerns with 
malloc, as it means we can just use the system malloc. If it has acceptable 
performance and its bugs can be fixed, it would be a good way to go.

My main worry is performance. On the desktop that I'm typing this message on, 
emacs -Q takes about 40 ms to start up in batch mode, about 90 ms in a terminal, 
and about 300 ms on a graphical display. These are the sorts of times that I'd 
like to see with the proposed approach too. (This desktop is using a 4-year-old 
Xeon E3-1225 V2 CPU, and Emacs is stored on flash and typically cached in RAM.)

By "dump the dumping" I assume you mean withdraw the traditional dump-emacs 
function.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-13 15:21                                                                       ` Philippe Vaucher
@ 2016-09-13 15:55                                                                         ` Eli Zaretskii
  0 siblings, 0 replies; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-13 15:55 UTC (permalink / raw)
  To: Philippe Vaucher; +Cc: p.stephani2, eggert, 23529

> From: Philippe Vaucher <philippe.vaucher@gmail.com>
> Date: Tue, 13 Sep 2016 17:21:18 +0200
> Cc: Paul Eggert <eggert@cs.ucla.edu>, Philipp Stephani <p.stephani2@gmail.com>, 23529@debbugs.gnu.org
> 
> I thought you wanted to keep the dumping feature tho, maybe I
> missunderstood you (again) :-)

I want to keep the ability of users to add more packages to the stuff
that is automatically available at startup.






* bug#23529: Request for fixing randomize_va_space build issues
  2016-09-13 15:51                                                                       ` Paul Eggert
@ 2016-09-13 19:24                                                                         ` Eli Zaretskii
  0 siblings, 0 replies; 66+ messages in thread
From: Eli Zaretskii @ 2016-09-13 19:24 UTC (permalink / raw)
  To: Paul Eggert; +Cc: philippe.vaucher, p.stephani2, 23529

> Cc: p.stephani2@gmail.com, 23529@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Tue, 13 Sep 2016 08:51:47 -0700
> 
> Eli Zaretskii wrote:
> > the idea is to dump the dumping (pun intended), and instead
> > load the preloaded packages upon each session start.
> 
> This approach could well work. It addresses the portability concerns with 
> malloc, as it means we can just use the system malloc. If it has acceptable 
> performance and its bugs can be fixed, it would be a good way to go.

That's the goal, yes.

> My main worry is performance. On the desktop that I'm typing this message on, 
> emacs -Q takes about 40 ms to start up in batch mode, about 90 ms in a terminal, 
> and about 300 ms on a graphical display. These are the sorts of times that I'd 
> like to see with the proposed approach too. (This desktop is using a 4-year-old 
> Xeon E3-1225 V2 CPU, and Emacs is stored on flash and typically cached in RAM.)

I don't see why we couldn't keep these times, or come close, if your
disk is flash memory.  I got 0.2 sec with a real disk, in a 32-bit
build with wide ints, and that was without any serious attempt to find
the hot spots and optimize them.

> By "dump the dumping" I assume you mean withdraw the traditional dump-emacs 
> function.

Yes.  Instead, there should be mostly Lisp code to produce a single
.elc file for all the preloaded stuff, and calculate values of some
variables that can only be recorded at build time (like
source-directory, for example).






* bug#23529: Request for fixing randomize_va_space build issues
  2019-09-14  8:52 ` Philippe Vaucher
@ 2019-09-14 10:39   ` Stefan Kangas
  0 siblings, 0 replies; 66+ messages in thread
From: Stefan Kangas @ 2019-09-14 10:39 UTC (permalink / raw)
  To: Philippe Vaucher; +Cc: 23529, 13964

tags 23529 fixed
close 23529 27.1
quit

Philippe Vaucher <philippe.vaucher@gmail.com> writes:
>> Is this still an issue with Emacs 27.0.50 (current master branch)?
>
> No, this is indeed fixed in master.
>
> This ticket can thus be closed.

Thanks.  I'm therefore closing this bug.

Best regards,
Stefan Kangas






