From: Eli Zaretskii <eliz@gnu.org>
To: Pip Cet <pipcet@protonmail.com>
Cc: gerd.moellmann@gmail.com, 75322@debbugs.gnu.org
Subject: bug#75322: SAFE_ALLOCA assumed to root Lisp_Objects/SSDATA(string)
Date: Sat, 04 Jan 2025 20:02:42 +0200 [thread overview]
Message-ID: <86y0zqbgot.fsf@gnu.org> (raw)
In-Reply-To: <877c7aha9n.fsf@protonmail.com> (message from Pip Cet on Sat, 04 Jan 2025 15:26:01 +0000)
> Date: Sat, 04 Jan 2025 15:26:01 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: gerd.moellmann@gmail.com, 75322@debbugs.gnu.org
>
> "Eli Zaretskii" <eliz@gnu.org> writes:
>
> > AFAICT, expand-file-name is called before we start using the SSDATA of
> > strings in the args[] array. Or what did I miss?
>
> You're right, thanks. I got confused between args and new_argv.
>
> The next thing I'd look at is the final call to ENCODE_FILE, called
> after new_argv is set up; this ends up in encode_coding_object, which
> calls safe_funcall in what seems to me to be an unlikely but possible
> code path. I assume that's unsafe (the safe_ refers to redisplay, not
> GC, IIUC)?
ENCODE_FILE can indeed trigger GC in rare cases, but I see only one
such call:
path = ENCODE_FILE (path);
We could simply move this to before the loop that sets up new_argv[].
> While maybe_quit is "safe" because it inhibits GC, I believe it can
> still call the debugger, which might require more memory than is
> available until GC is re-enabled.
maybe_quit is not the problem, the problem AFAIU is that any encoding
could have pre-write-conversion function written in Lisp.
> >> Yes, make_environment_block does say its callers can't run GC, but
> >> call_process doesn't indicate when and how it establishes a no-GC
> >> assumption.
> >
> > What would be needed to indicate that?
>
> I'd prefer a pair of macros (no-ops in regular builds) to comments, but
> there is no obvious best solution here.
Sorry, I don't understand: why macros? Do we use something like that
anywhere else in Emacs?
> My proposal would be to remove most (ideally, all, and then we're done)
> no-GC assumptions, and put the few ones that must remain into separate
> function bodies for the no-GC / possible-GC cases. Then we can put the
> no-GC function bodies into a separate section and prohibit references
> from no-GC to possible-GC functions, and have the linker check that.
First, the techniques that rely on separate segments are non-portable,
so not all platforms will support them. More importantly, I'm afraid
the amount of code where we currently don't expect GC is much more
than you seem to imagine, so I don't think the above is practical.
In any case, the starting point is to audit all the places where GC
can happen, and that needs to be done anyway if we want to do it
thoroughly and proactively (as opposed to only when someone reports a
bug). We can delay the decision what to do with these places to
later, when we understand better the magnitude of the problem and its
various aspects (in terms of several different effects that GC can
have on various objects).
> >> As we agreed, code should be written to assume GC can strike at any
> >> time.
> >
> > I don't think we agreed to that. At least I didn't, not in this
> > general form. It would be a huge job to make all of our code comply
> > with this.
>
> I said "That's what you should think: GC can strike at any time", and
> you said "The same is true with the old GC". I misread that as
> agreement.
I only meant that MPS is not very different from the old GC in this
regard, that's all.
> >> IIUC, Gerd explained that the old GC can still move the string *data*
> >> used in that structure, even if the string metadata stays in place.
> >
> > If string data is moved, then accessing the old pointer would trigger
> > the barrier and MPS will do its thing, not?
>
> Sorry, but I think I'm confused here.
>
> IIUC, MPS doesn't currently use barriers on data that is moved (it
> could, so the data is copied lazily, but I don't think that's what you
> meant), it uses barriers on data that contains references that may not
> have been fixed.
The safe thing is to re-initialize the pointer from the string
whenever we think GC could happen, and otherwise make sure GC cannot
happen.
> If a pointer to "old" data is ever exposed to Emacs, we lose, because
> MPS will reuse the memory for new data, which might be behind a barrier.
>
> If we ever do:
>
> static Lisp_Object unmarked;
> unmarked = string;
> ... trigger GC here ...
> puts (SDATA (unmarked);
>
> the most likely outcome (thanks to Gerd for the hint) is that
> nonsensical data is printed
Are you sure? AFAIU, referencing 'unmarked' is safe, even after GC.
The below is indeed unsafe:
char *p = SDATA (unmarked);
... trigger GC here ...
puts (p);
> > To clarify, I was trying to understand whether the error message
> > reported by Ihor in another thread could have happened because of GC
> > in this are of the code.
>
> I currently think that Ihor's test case calls execve () with nonsensical
> "environment" strings a lot, and once in a while they'll even be behind
> the barrier which causes an EFAULT.
Before we agree that this could be the reason, we'd need to find the
places where GC could reuse the memory of a live string, while that
string appears in some live data structure, and as long as no GC can
happen between the time we put the SDATA pointers into the environment
array and the time posix_spawn returns.
next prev parent reply other threads:[~2025-01-04 18:02 UTC|newest]
Thread overview: 79+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-03 17:20 bug#75322: SAFE_ALLOCA assumed to root Lisp_Objects/SSDATA(string) Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-03 19:55 ` Gerd Möllmann
2025-01-03 20:34 ` Gerd Möllmann
2025-01-03 20:48 ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-04 4:40 ` Gerd Möllmann
2025-01-04 7:57 ` Eli Zaretskii
2025-01-04 8:47 ` Gerd Möllmann
2025-01-04 9:56 ` Eli Zaretskii
2025-01-04 10:20 ` Gerd Möllmann
2025-01-05 13:30 ` Eli Zaretskii
2025-01-05 14:11 ` Gerd Möllmann
2025-01-05 17:45 ` Eli Zaretskii
2025-01-05 18:17 ` Gerd Möllmann
2025-01-05 19:07 ` Eli Zaretskii
2025-01-05 20:04 ` Gerd Möllmann
2025-01-05 20:24 ` Eli Zaretskii
2025-01-06 3:57 ` Gerd Möllmann
2025-01-06 8:25 ` Gerd Möllmann
2025-01-06 14:07 ` Eli Zaretskii
2025-01-05 21:15 ` Daniel Colascione
2025-01-06 12:59 ` Eli Zaretskii
2025-01-06 14:48 ` Daniel Colascione
2025-01-06 15:12 ` Eli Zaretskii
2025-01-06 15:27 ` Daniel Colascione
2025-01-05 21:01 ` Daniel Colascione
2025-01-05 23:28 ` Daniel Colascione
2025-01-06 13:26 ` Eli Zaretskii
2025-01-06 15:08 ` Daniel Colascione
2025-01-06 4:23 ` Gerd Möllmann
2025-01-04 11:41 ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-04 11:29 ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-04 12:17 ` Gerd Möllmann
2025-01-04 7:00 ` Eli Zaretskii
2025-01-04 7:17 ` Gerd Möllmann
2025-01-04 8:23 ` Eli Zaretskii
2025-01-04 8:58 ` Gerd Möllmann
2025-01-04 11:08 ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-04 13:47 ` Eli Zaretskii
2025-01-04 14:13 ` Gerd Möllmann
2025-01-04 15:26 ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-04 15:34 ` Gerd Möllmann
2025-01-04 18:19 ` Eli Zaretskii
2025-01-04 18:35 ` Gerd Möllmann
2025-01-04 19:10 ` Eli Zaretskii
2025-01-04 19:24 ` Gerd Möllmann
2025-01-04 18:02 ` Eli Zaretskii [this message]
2025-01-04 19:32 ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-04 20:31 ` Eli Zaretskii
2025-01-04 21:15 ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-05 8:23 ` Eli Zaretskii
2025-01-05 9:04 ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-05 9:32 ` Eli Zaretskii
2025-01-05 9:47 ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-05 11:04 ` Eli Zaretskii
2025-01-06 15:54 ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-06 19:16 ` Gerd Möllmann
2025-01-08 3:46 ` Gerd Möllmann
2025-01-05 6:32 ` Gerd Möllmann
2025-01-05 6:59 ` Gerd Möllmann
2025-01-05 10:21 ` Eli Zaretskii
2025-01-05 10:30 ` Gerd Möllmann
2025-01-05 10:35 ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-05 10:45 ` Gerd Möllmann
2025-01-05 11:29 ` Eli Zaretskii
2025-01-05 11:37 ` Gerd Möllmann
2025-01-05 12:15 ` Eli Zaretskii
2025-01-05 13:21 ` Gerd Möllmann
2025-01-05 17:31 ` Eli Zaretskii
2025-01-05 17:49 ` Gerd Möllmann
2025-01-05 18:42 ` Eli Zaretskii
2025-01-05 19:02 ` Gerd Möllmann
2025-01-05 7:48 ` Eli Zaretskii
2025-01-05 8:19 ` Gerd Möllmann
2025-01-05 10:33 ` Eli Zaretskii
2025-01-05 10:40 ` Gerd Möllmann
2025-01-05 11:21 ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2025-01-05 11:27 ` Gerd Möllmann
2025-01-05 11:49 ` Paul Eggert
2025-01-06 6:26 ` Gerd Möllmann
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=86y0zqbgot.fsf@gnu.org \
--to=eliz@gnu.org \
--cc=75322@debbugs.gnu.org \
--cc=gerd.moellmann@gmail.com \
--cc=pipcet@protonmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this external index
https://git.savannah.gnu.org/cgit/emacs.git
https://git.savannah.gnu.org/cgit/emacs/org-mode.git
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.