From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Philipp Stephani
> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Fri, 20 Nov 2015 19:58:25 +0000
> Cc: tzz@lifelogs= .com, aurelien.aptel+emacs@gmail.com, emacs-devel@gnu.org
>
>=C2=A0 =C2=A0 =C2=A0So you are saying that module_init runs in the main= thread, but
>=C2=A0 =C2=A0 =C2=A0dynamic initializers might run in another thread? H= ow's that
>=C2=A0 =C2=A0 =C2=A0possible?
>
> According to http://stackoverflow.com/a/29776416 it is n= ot guaranteed that
> dynamic initializers run in the main thread.
Ah, you are talking about C++ dynamic initializers!=C2=A0 So the model is that someone writes a module in C++, starts a thread there, and then
inside some dynamic initializer calls emacs-module interface
functions, is that it?=C2=A0 If that's the situation, I'd suggest a=
prominent commentary describing this in the source.
>=C2=A0 =C2=A0 =C2=A0On what OS can this happen? It cannot happen on Win= dows, AFAIK,
>
> http://blogs.msdn.com/b/old= newthing/archive/2010/08/27/10054832.aspx seems to
> indicate that it is possible that other threads continue running after= the main
> thread has exited.
Not in Emacs: we use the C runtime, and never call ExitThread, let
alone ExitProcess.=C2=A0 Doing that would be a silly bug.=C2=A0 There's= no need
to make a single component of Emacs, and not the most important one,
protected to such extreme degree against calamities that are only
possible if Emacs developers will lose their senses.
> My initial code didn't use a thread identifier at all because it c= ould have
> been reused after the main thread ended, introducing a subtle bug.
No, it cannot be reused, because we hold a handle on the main thread,
see w32term.c around line 6985.=C2=A0 As long as a single handle is open on=
a thread, it still exists from the OS POV, and its ID cannot be
reused.
>=C2=A0 =C2=A0 =C2=A0> This is undefined behavior, but we included an= explicit check if
>=C2=A0 =C2=A0 =C2=A0> checking is enabled because that case is somew= hat subtle.
>
>=C2=A0 =C2=A0 =C2=A0I'm surprised this could happen on some OS. Can= you point to any
>=C2=A0 =C2=A0 =C2=A0details about this?
>
>
> It's not ruled out by the C standard or the APIs of operating syst= ems. If Emacs
> doesn't call ExitThread or manipulates the C runtime, then indeed = it can't
> happen (at least on Windows), but I think it's brittle to rely on = such
> subtleties without explicitly checking for them (somebody might introd= uce a
> call to ExitThread in the future without looking at module code).
I think these precautions are unnecessary.
Anyway, thanks for explaining this, I now know how to change the code
to DTRT on MS-Windows wrt to the thread checks.
>=C2=A0 =C2=A0 =C2=A0OK, but then either there should have been a commen= t explaining this,
>=C2=A0 =C2=A0 =C2=A0or, better, the test should have been after the add= ition. (Which, by
>=C2=A0 =C2=A0 =C2=A0some strange luck -- or maybe something else -- is = just what Paul
>=C2=A0 =C2=A0 =C2=A0already did ;-)
>
> If the test gets moved after the addition, then we should have a verif= y
> (MAX_EMACS_UINT - 1 > MOST_POSITIVE_FIXNUM) or use __builtin_add_ov= erflow to
> make sure the addition doesn't cause UB.
I don't think so.=C2=A0 Please take a look at the code Paul wrote there= , I
don't see a problem with it.
>=C2=A0 =C2=A0 =C2=A0> No validity checks on 'fin'?
>=C2=A0 =C2=A0 =C2=A0>
>=C2=A0 =C2=A0 =C2=A0> How should it be validated? In C an arbitrary = (invalid) pointer could be
>=C2=A0 =C2=A0 =C2=A0> passed. I think we just have to accept that th= is is UB.
>
>=C2=A0 =C2=A0 =C2=A0Given the extensive checks elsewhere in the file, t= his surprised me,
>=C2=A0 =C2=A0 =C2=A0that's all. How to test it? calling valid_point= er_p would be one
>=C2=A0 =C2=A0 =C2=A0thing I'd suggest. Maybe there are other (perha= ps platform-dependent)
>=C2=A0 =C2=A0 =C2=A0checks possible here, I would have to look around. = For example, it
>=C2=A0 =C2=A0 =C2=A0would be good to make sure this points into executa= ble code.
>
> You can't validate pointers in C. For example, is the following po= inter valid?
>
> long a;
> double *p =3D (double *)(&a);
>
> The answer is no; this is an aliasing violation, and dereferencing p i= s UB, but
> you won't detect this by trying to read the process's address = space.
The function valid_pointer_p validates a pointer in the sense that it
points into the program's address space.=C2=A0 That's not a full
validation, and won't catch unaligned pointers or pointers into
non-executable portions of memory, but it's better than nothing.=C2=A0 = I'd
certainly consider it under "#ifdef ENABLE_CHECKING" at least.
> See also
> http://blogs.msdn.com/b/oldne= wthing/archive/2006/09/27/773741.aspx.
We don't use IsBadWritePtr on Windows to check this, see
w32_valid_pointer_p for how this is actually implemented.
Anyway, I'm surprised by this extreme POV: even if we cannot validate a pointer 100%, surely it doesn't mean we cannot or shouldn't do so= me
partial job?=C2=A0 Why this "all or nothing" approach?We can check whether it's NULL. Apart from th= at, everything else is outside of the C standard.=C2=A0
>=C2=A0 =C2=A0 =C2=A0> if (len > INT_MAX || len < envptr->mi= n_arity || (envptr->max_arity >=3D 0
>=C2=A0 =C2=A0 =C2=A0&&
>=C2=A0 =C2=A0 =C2=A0> len > envptr->max_arity))
>=C2=A0 =C2=A0 =C2=A0> xsignal2 (Qwrong_number_of_arguments, module_f= ormat_fun_env (envptr),
>=C2=A0 =C2=A0 =C2=A0> make_number (len));
>=C2=A0 =C2=A0 =C2=A0>
>=C2=A0 =C2=A0 =C2=A0> Why the test against INT_MAX? EMACS_INT can le= gitimately be a 64-bit
>=C2=A0 =C2=A0 =C2=A0> data type with values far exceeding INT_MAX. W= hy impose this
>=C2=A0 =C2=A0 =C2=A0> limitation here?
>=C2=A0 =C2=A0 =C2=A0>
>=C2=A0 =C2=A0 =C2=A0> Because the nargs argument in the module inter= face is an int.
>
>=C2=A0 =C2=A0 =C2=A0Shouldn't it be EMACS_INT instead?
>
> EMACS_INT is an internal type that isn't exposed to module authors= .
OK, but the type of nargs is ptrdiff_t, so the test should be against
the maximum of that type.=C2=A0 On 64-bit hosts, ptrdiff_t is a 64-bit
type, while an int is still 32-bit wide.
> Could xmalloc grow a variant that is guaranteed not to call longjmp?
We need to devise a way for it to detect that it was called from
emacs-module.c, the rest is simple, I think.