From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Philipp Stephani
> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Fri, 20 Nov 2015 23:22:49 +0000
> Cc: tzz@lifelogs.com, aurelien.aptel+emacs@gmail.com, emacs-devel@gnu.org
>
>     Ah, you are talking about C++ dynamic initializers! So the model is
>     that someone writes a module in C++, starts a thread there, and then
>     inside some dynamic initializer calls emacs-module interface
>     functions, is that it? If that's the situation, I'd suggest a
>     prominent commentary describing this in the source.
>
> The SO post talks about C++ but the issue is the same in C. AFAIK with C11 and
> C++11 the execution models are harmonized.
It doesn't matter for the issue at hand whether it's C, C++, Java, or
whatever.  My originally incorrect interpretation of what you wrote
was that you are talking about initializers that are part of the
module code, i.e. the emacs_module_init function they implement.
> It seems the comment is overly confusing. It is supposed to warn about the
> following. Naively, if you wanted to test whether you are in the main thread,
> you would do (modulo types and naming):
>
> static thread_id main_thread = get_current_thread();
> bool in_main_thread() { return get_current_thread() == main_thread; }
>
> The dynamic initializer here is the first "get_current_thread()"; it is not
> guaranteed to run in the main thread, so "main_thread" is not guaranteed to
> contain the main thread ID.  Therefore you have to do:
>
> static thread_id main_thread; // initialized later
> int main() {
>   // guaranteed to be in the main thread
>   main_thread = get_current_thread();
> }
>
> That's all. I'm not aware of any runtime that would run dynamic initializers
> outside of the main thread, but it's not impossible and easy to protect
> against.
AFAIK, code that does this:
  static thread_id main_thread = get_current_thread();
is invalid in C, because such initializers cannot call functions.  So
it would surprise me to see code which tried to record its thread ID
before 'main'.
So I think we should reword that comment to be much less mysterious
and confusing than it is now.  (Look how much we needed to talk
for you to explain to me what the intent of the comment was.)

Since the alternative isn't even legal, the comment can just go away.
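
For illustration, here is a minimal C sketch of the explicit-initialization pattern discussed above. It assumes POSIX threads; record_main_thread and in_main_thread are hypothetical names and are not taken from emacs-module.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Deliberately left without an initializer.  In C,
   "static pthread_t main_thread = pthread_self ();" does not compile,
   because a static initializer must be a constant expression.  */
static pthread_t main_thread;
static bool main_thread_recorded = false;

/* Hypothetical hook: call exactly once from code that is known to run
   on the main thread, for example at the top of main.  */
void
record_main_thread (void)
{
  assert (!main_thread_recorded);
  main_thread = pthread_self ();
  main_thread_recorded = true;
}

/* True only if the main thread was recorded and the caller is on it.  */
bool
in_main_thread (void)
{
  return main_thread_recorded
         && pthread_equal (pthread_self (), main_thread) != 0;
}
```

Before record_main_thread has run, in_main_thread reports false rather than comparing against an indeterminate ID; that is one possible design choice, not the only one.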
>     Anyway, thanks for explaining this, I now know how to change the code
>     to DTRT on MS-Windows wrt the thread checks.
>
> This is unfortunately all surprisingly subtle and vaguely defined. See e.g.
> http://stackoverflow.com/q/19744250/178761 (apparently the standards are vague
> about what happens to detached threads after main has exited).
I don't see how that affects the issue at hand.  The issue at hand is
whether a thread ID of the main thread could be reused while some of
the other threads belonging to the Emacs process are still running.
And the answer to that on MS-Windows is AFAIU a resounding NO, because
as long as the Emacs process is alive, it holds a handle on the main
thread, which precludes the OS from discarding that thread's kernel
object.  Why? Because a thread handle can be, and is, used to query the
OS about that thread's conditions, like its exit code, or wait for its
completion in the likes of WaitForSingleObject.  So the kernel object
that represents the thread must be kept by the OS as long as at least
one open handle for the thread exists, and that prevents the OS from
reusing the thread ID.

Does it actually hold that handle? It sounds reasonable, but I can't
find it documented.
>     > See also
>     > http://blogs.msdn.com/b/oldnewthing/archive/2006/09/27/773741.aspx.
>
>     We don't use IsBadWritePtr on Windows to check this, see
>     w32_valid_pointer_p for how this is actually implemented.
>
> Much of this applies generally.
> > "But what should I do, then, if somebody passes me a bad pointer?"
> > You should crash.
Which is what we do, since eassert aborts.  We will just do it sooner,
which IME is a Good Thing.

OK. We should be a bit careful with the current implementation of
valid_pointer_p though, as AFAIK write(2) could cast the pointer it
receives to char* and dereference it, but that's probably no worse than
not checking it at all, and unrelated to modules.
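
To make the write(2) idea concrete, here is a rough sketch of probing a pointer by asking the kernel to copy from it into a pipe. It assumes a POSIX system where write fails with EFAULT on an unmapped range; pointer_looks_readable is a hypothetical helper, not the actual valid_pointer_p, and it is only a partial check in the sense discussed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper, not the Emacs function: returns true if the
   kernel could read LEN bytes starting at P.  */
bool
pointer_looks_readable (const void *p, size_t len)
{
  /* Keep the write small so it cannot block on an empty pipe.  */
  assert (len > 0 && len <= 512);
  if (p == NULL)
    return false;

  int fd[2];
  if (pipe (fd) != 0)
    return false;  /* Could not probe; conservatively report failure.  */

  /* The kernel is expected to copy the bytes itself and to fail
     (typically with EFAULT) instead of crashing the process when the
     range is not mapped.  */
  ssize_t written = write (fd[1], p, len);

  close (fd[0]);
  close (fd[1]);
  return written == (ssize_t) len;
}
```

A true result only says the range was readable at that moment; it says nothing about whether the pointer refers to an object of the expected type.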
>     Anyway, I'm surprised by this extreme POV: even if we cannot validate
>     a pointer 100%, surely it doesn't mean we cannot or shouldn't do some
>     partial job? Why this "all or nothing" approach?
>
> We can check whether it's NULL. Apart from that, everything else is outside of
> the C standard.
Emacs is not a Standard C program, far from it.  It uses a lot of
stuff outside of any C standard, and for a very good reason: it is a
large and complicated program with many features that require access to
OS facilities.
IOW, using only Standard C features is not, and cannot be, a
requirement for Emacs code.

Fair enough.
>     We need to devise a way for it to detect that it was called from
>     emacs-module.c, the rest is simple, I think.
>
> Hmm, why does it need to detect anything? Can't it just be a different function
> that doesn't signal, similar to push_handler and push_handler_nosignal?
I don't think we want each of its callers to call the signaling part by
itself.  That would be repeating the problem with malloc itself: many
programs simply neglect to include the code which does TRT when it
returns NULL.  xmalloc solves this, and makes sure the (non-trivial)
error action and message are always the same in that case.
We need a variant of this for memory allocated on behalf of modules, I
think.

But this would require modules to be prepared for handling longjmps,
which in general they aren't.

In an "unsafe" language like C, we can't do without the users'
cooperation. If users ignore NULL returns from malloc, that's a bug and
we can't do much about it. We could make it harder to accidentally
ignore the result:

    bool emacs_malloc(size_t size, void **out) __attribute__((warn_unused_result));

but that requires a compiler extension and is not really consistent with
the rest of the module interface, where NULL is regularly returned on
failure.
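
As a sketch only, the declaration above could be fleshed out as follows. The body and the WARN_UNUSED_RESULT macro are assumptions made for illustration; nothing like this exists in the module interface as far as this thread shows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* The attribute is a GCC/Clang extension, so guard it.  */
#if defined __GNUC__ || defined __clang__
# define WARN_UNUSED_RESULT __attribute__ ((warn_unused_result))
#else
# define WARN_UNUSED_RESULT
#endif

/* Hypothetical allocator: reports failure through the return value
   instead of signaling, so no longjmp crosses module code.  */
bool emacs_malloc (size_t size, void **out) WARN_UNUSED_RESULT;

bool
emacs_malloc (size_t size, void **out)
{
  assert (out != NULL);
  /* Request at least one byte so that a NULL result always means
     failure, even for a zero-sized request.  */
  *out = malloc (size ? size : 1);
  return *out != NULL;
}
```

With GCC or Clang, a caller that writes "emacs_malloc (n, &p);" and drops the result should get a warning, while "if (!emacs_malloc (n, &p)) return NULL;" compiles quietly.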