unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
From: Daniel Colascione <dancol@dancol.org>
To: Eli Zaretskii <eliz@gnu.org>
Cc: gerd.moellmann@gmail.com, pipcet@protonmail.com, 75322@debbugs.gnu.org
Subject: bug#75322: SAFE_ALLOCA assumed to root Lisp_Objects/SSDATA(string)
Date: Mon, 06 Jan 2025 09:48:09 -0500	[thread overview]
Message-ID: <87wmf8t2vq.fsf@dancol.org> (raw)
In-Reply-To: <867c786quc.fsf@gnu.org> (Eli Zaretskii's message of "Mon, 06 Jan 2025 14:59:07 +0200")

Eli Zaretskii <eliz@gnu.org> writes:

>> Date: Sun, 05 Jan 2025 16:15:47 -0500
>> From: Daniel Colascione <dancol@dancol.org>
>> CC: pipcet@protonmail.com, 75322@debbugs.gnu.org
>> 
>> >  . an automatic variable
>> >  . a static variable that is protected by someone
>> >  . a global variable that is protected by someone
>> >  . a result of dereferencing a pointer that is somehow protected
>> >
>> >etc. etc., where "protected by someone" means that it is a descendant
>> >of some staticpro, or of some root, or...
>> 
>> Well, yeah. Every other GC program does this. Emacs can too. There's
>> no great burden: all Lisp objects get traced
>> automatically. Everything on the stack or in a register gets traced
>> automatically, and, because the scanning is conservative,
>> pinned. You only have to take extra steps to tell the GC about
>> something when you're going out of your way to go around the GC.
>> 
>> It's simply not true that to adopt a modern GC every line of code has to change.  I wrote a moving GC for Emacs myself years ago. Worked fine. No rewrite.
>
> The opinions in this thread are that changes in the code _are_ needed.
> Maybe that's not a "rewrite" you had in mind, but if we need to make
> such substantial changes in many places, that's a "rewrite" in my
> book.

I wouldn't call it a "rewrite".  If auditing the codebase for memory
safety is a "rewrite", I'm a "duck".  We're talking about a few hundred
lines of changes at the most.  Most of the work is just auditing the
code for problems.  We should be grateful Gerd has done this work
already, not "run away from MPS, fast".

>> >And if we cannot prove to ourselves that one of the above happens,
>> >then we'd need to force a copy of the variable to be on the stack?
>> >
>> >Does this sound practical?
>> >
>> >If this is the price of using MPS, and I'm not missing something
>> >obvious, then it sounds like we should run away from MPS, fast.
>> >Because we will sooner or later have to rewrite every single line of
>> >code we ever wrote.
>> 
>> No, you do it by adopting a rule that when a function receives a
>> pointer, the caller guarantees the validity of the pointer for the
>> duration of the call. This way, only the level of the stack that
>> spills the array to the heap has to take on the responsibility of
>> keeping the referenced objects alive, and making the spilled array a
>> pointer to the pinned guts of a Lisp vector is an adequate way to
>> do this.
>
> We are talking about code such as this one:
>
> 	    SAFE_NALLOCA (args2, 1, nargs + 1);
> 	    args2[0] = Qcall_process;
> 	    for (i = 0; i < nargs; i++) args2[i + 1] = args[i];
> 	    coding_systems = Ffind_operation_coding_system (nargs + 1, args2);
> 	    val = CONSP (coding_systems) ? XCDR (coding_systems) : Qnil;
>
> "Look, ma: no pointers!"

      Lisp_Object val, *args2;

In the C programming language, "*" means "pointer".

> The args[] array is fine: it comes from the caller and is valid.  The
> problem being discussed here is the args2[] array, in the case where
> SAFE_NALLOCA decides args2[] is too large for the stack and instead
> malloc's it.  In that case, args2[] stores on the heap copies of the
> objects in args[] (still no pointers!), and the issue is that when GC
> happens (say, at some point inside Ffind_operation_coding_system), the
> objects in args[] are updated by GC, but their copies in args2[] are
> not.

Each Lisp_Object held across a GC must either:

1) be visible to the GC for precise marking and updating (which is the
case for all references from Lisp object to another as well as
staticpro-protected static storage), or

2) be visible to the GC for conservative marking and pinning instead
of moving, which today means the variable is on the stack.

A Lisp_Object embedded in a block of malloc-heap memory doesn't satisfy
either condition and so is unsafe.

Also, this pattern is perfectly fine:

void
foo()
{
  Lisp_Object c = Fcons(...);
  bar(&c);
}

void
bar(Lisp_Object* c)
{
  eassert(CONSP(c));
  Fgarbage_collect();
  eassert(CONSP(c));
}

because when bar() gets its pointer, it implicitly assumes that it's
*caller* has followed the rule above, which it has.

The Emacs code already works this way.  The changes we're talking about
are not anywhere near as large-scale as what you might have in mind.

> So this code needs to be changed.

The snippet you quoted above can be fixed with a one-liner --- replace
SAFE_NALLOCA with SAFE_ALLOCA_LISP.  (We have to fix
SAFE_ALLOCA_LISP_EXTRA to make it safe with the old GC, but that's not a
change that needs to happen at call sites.)

> And if you look around, we have quite a lot of these in many places.

Sounds like Gerd's spent some time hunting them down.

>> "Oh, but won't that kill performance?"
>
> That wasn't my primary bother.  The primary bother is the need to
> modify many places.  Which is not necessarily a purely mechanical
> refactoring, either.  Consider:
>
>   static Lisp_Object last_coding;
>   Lisp_Object coding = Vcoding_system_for_write;
>   [...]
>   last_coding = coding;
>   call_some_function ();
>   if (NILP (last_coding))
>     do_something ();
>
> If call_some_function can trigger GC, the value of 'coding' after the
> call is still valid, but the value of 'last_coding' could be garbage
> (unless 'last_coding' is staticpro'd).

Correct.  The fix is to move last_coding to file scope and gcpro it.
You can argue that moving this declaration around is not a "mechanical"
fix, but it's straightforward, safe, and makes the code clearer: after
the change, the static storage is at file scope where you can see it,
not buried in a function body.

> We have almost 200 static
> Lisp_Object variables, probably not all of them staticpro'd (8 of them
> inside functions, like the above example, so definitely not
> staticpro'd).  So now we need to examine the uses of all of them and
> either staticpro them or do something else (like move the assignment
> to 'last_coding' to after call_some_function).

Changing eight variables from function statics to file statics hardly
seems like a monumental effort.  The static-storage global-scope
Lisp_Object variables are probably almost all gcproed already.

Gerd has already gone a long way towards identifying the various problem
spots.  In particular, his change to move uses of SAFE_ALLOCA to the
_LISP variant in various places is good and should be applied regardless
of the readiness of the rest of the MPS branch. Even in cases where for
one reason or another the code he's changing is safe under the old GC,
the old code is fragile and can break under modification. 

> And we will probably find other usage patterns which trip the same

Yep.

> And the amazing part is not the above, it''s the fact that with all
> that "unsafe" code people are using the branch for production, and
> most of them report a stable Emacs.  What does that tell us? that
> most, if not all, of those places are perfectly valid code, and we are
> haunted by the proverbial shadow of a dwarf?  Or maybe it just makes
> the job harder, because we now need to identify the places where these
> bad things can _really_ happen, and fix only them?  Or something else?

> This is what I'm trying to establish.  The problem is not theoretical,
> it's entirely practical: where to steer our efforts of stabilizing the
> branch and preparing it for landing, given the available resources.

There are three reasons we might not see code with dubious adherence to
GC rules cause problems in practice:

1) the dubious code is unsafe only in rare edge cases (who has thousands
of call-process parameters?) or the problems are GC-timing dependent,

2) the dubious code is safe only because all the Lisp values in question
are protected by other references and the old GC is non-moving, and

3) the dubious code is safe only because the only Lisp values stored in
these malloc-heap regions talk about objects that are naturally
immortal, e.g. Qnil, and the GC doesn't move immortal values.

A case of #1 is a memory safety problem.  #2 and #3 merely mean we have
fragile code.  Sometimes it's hard to tell the difference, which is why
we should fix all three.

A moving GC like MPS imposes more stringent requirements on a program
than a non-moving one does, but the changes needed to make Emacs safe
under a moving GC are not particularly invasive and make the code safer
and more maintainable overall. Most of the changes should be one- or
few-line tweaks.

We're talking about nothing remotely close to a general rewrite.

>> The *existing* code is broken, and you don't see it because we use alloca to allocate on the stack almost all the time. 
>
> If we allocate on the stack almost all the time, we will keep
> allocating on the stack almost all the time.  But anyway, if you are
> now saying that the code is broken, then a rewrite, and a significant
> one at that, _is_ needed, right?

"Almost" isn't good enough when it comes to memory safety problems and
the security issues they often cause.





  reply	other threads:[~2025-01-06 14:48 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-03 17:20 bug#75322: SAFE_ALLOCA assumed to root Lisp_Objects/SSDATA(string) Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-03 19:55 ` Gerd Möllmann
2025-01-03 20:34   ` Gerd Möllmann
2025-01-03 20:48     ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-04  4:40       ` Gerd Möllmann
2025-01-04  7:57         ` Eli Zaretskii
2025-01-04  8:47           ` Gerd Möllmann
2025-01-04  9:56             ` Eli Zaretskii
2025-01-04 10:20               ` Gerd Möllmann
2025-01-05 13:30                 ` Eli Zaretskii
2025-01-05 14:11                   ` Gerd Möllmann
2025-01-05 17:45                     ` Eli Zaretskii
2025-01-05 18:17                       ` Gerd Möllmann
2025-01-05 19:07                         ` Eli Zaretskii
2025-01-05 20:04                           ` Gerd Möllmann
2025-01-05 20:24                             ` Eli Zaretskii
2025-01-06  3:57                               ` Gerd Möllmann
2025-01-06  8:25                                 ` Gerd Möllmann
2025-01-06 14:07                                 ` Eli Zaretskii
2025-01-05 21:15                           ` Daniel Colascione
2025-01-06 12:59                             ` Eli Zaretskii
2025-01-06 14:48                               ` Daniel Colascione [this message]
2025-01-06 15:12                                 ` Eli Zaretskii
2025-01-06 15:27                                   ` Daniel Colascione
2025-01-05 21:01                     ` Daniel Colascione
2025-01-05 23:28                       ` Daniel Colascione
2025-01-06 13:26                         ` Eli Zaretskii
2025-01-06 15:08                           ` Daniel Colascione
2025-01-06  4:23                       ` Gerd Möllmann
2025-01-04 11:41               ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-04 11:29         ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-04 12:17           ` Gerd Möllmann
2025-01-04  7:00     ` Eli Zaretskii
2025-01-04  7:17       ` Gerd Möllmann
2025-01-04  8:23         ` Eli Zaretskii
2025-01-04  8:58           ` Gerd Möllmann
2025-01-04 11:08       ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-04 13:47         ` Eli Zaretskii
2025-01-04 14:13           ` Gerd Möllmann
2025-01-04 15:26           ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-04 15:34             ` Gerd Möllmann
2025-01-04 18:19               ` Eli Zaretskii
2025-01-04 18:35                 ` Gerd Möllmann
2025-01-04 19:10                   ` Eli Zaretskii
2025-01-04 19:24                     ` Gerd Möllmann
2025-01-04 18:02             ` Eli Zaretskii
2025-01-04 19:32               ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-04 20:31                 ` Eli Zaretskii
2025-01-04 21:15                   ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-05  8:23                     ` Eli Zaretskii
2025-01-05  9:04                       ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-05  9:32                         ` Eli Zaretskii
2025-01-05  9:47                           ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-05 11:04                             ` Eli Zaretskii
2025-01-06 15:54                               ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-06 19:16                                 ` Gerd Möllmann
2025-01-05  6:32 ` Gerd Möllmann
2025-01-05  6:59   ` Gerd Möllmann
2025-01-05 10:21     ` Eli Zaretskii
2025-01-05 10:30       ` Gerd Möllmann
2025-01-05 10:35         ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-05 10:45           ` Gerd Möllmann
2025-01-05 11:29         ` Eli Zaretskii
2025-01-05 11:37           ` Gerd Möllmann
2025-01-05 12:15             ` Eli Zaretskii
2025-01-05 13:21               ` Gerd Möllmann
2025-01-05 17:31                 ` Eli Zaretskii
2025-01-05 17:49                   ` Gerd Möllmann
2025-01-05 18:42                     ` Eli Zaretskii
2025-01-05 19:02                       ` Gerd Möllmann
2025-01-05  7:48   ` Eli Zaretskii
2025-01-05  8:19     ` Gerd Möllmann
2025-01-05 10:33       ` Eli Zaretskii
2025-01-05 10:40         ` Gerd Möllmann
2025-01-05 11:21           ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-05 11:27             ` Gerd Möllmann
2025-01-05 11:49             ` Paul Eggert
2025-01-06  6:26           ` Gerd Möllmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87wmf8t2vq.fsf@dancol.org \
    --to=dancol@dancol.org \
    --cc=75322@debbugs.gnu.org \
    --cc=eliz@gnu.org \
    --cc=gerd.moellmann@gmail.com \
    --cc=pipcet@protonmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).