unofficial mirror of emacs-devel@gnu.org 
* Conservative GC isn't safe
@ 2016-11-26  8:11 Daniel Colascione
  2016-11-26  8:30 ` Paul Eggert
  2016-11-26 19:08 ` Pip Cet
  0 siblings, 2 replies; 46+ messages in thread
From: Daniel Colascione @ 2016-11-26  8:11 UTC (permalink / raw)
  To: Emacs developers

I was poking at alloc.c recently and realized that the existing 
conservative GC code is somewhat unsafe. In particular,

   1) mark_maybe_pointer looks only for exact matches on object start. 
It's perfectly legal for the compiler to keep an interior object pointer 
and discard the pointer to the object start.

   2) INTERVAL is GCed, but it's not represented in the memory tree: 
struct interval isn't a real lisp object and it's allocated as 
MEM_TYPE_NON_LISP. Even a direct pointer to the start of an interval 
won't protect it from GC. Shouldn't we treat intervals like conses?

We've been getting by on dumb luck and the magnanimity of the compiler.



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Conservative GC isn't safe
  2016-11-26  8:11 Conservative GC isn't safe Daniel Colascione
@ 2016-11-26  8:30 ` Paul Eggert
  2016-11-26  8:33   ` Daniel Colascione
  2016-11-26 15:03   ` Stefan Monnier
  2016-11-26 19:08 ` Pip Cet
  1 sibling, 2 replies; 46+ messages in thread
From: Paul Eggert @ 2016-11-26  8:30 UTC (permalink / raw)
  To: Daniel Colascione, Emacs developers

On 11/26/2016 12:11 AM, Daniel Colascione wrote:
>
>   1) mark_maybe_pointer looks only for exact matches on object start. 
> It's perfectly legal for the compiler to keep an interior object 
> pointer and discard the pointer to the object start.

Yes, just as it's perfectly legal for the compiler to subtract 42 from 
every pointer before putting it in a register or storing it into memory. 
In practice, though, compilers don't do this around calls to the garbage 
collector. (True, this assumption should be documented better.)

>
>   2) INTERVAL is GCed, but it's not represented in the memory tree: 
> struct interval isn't a real lisp object and it's allocated as 
> MEM_TYPE_NON_LISP. Even a direct pointer to the start of an interval 
> won't protect it from GC. Shouldn't we treat intervals like conses?

Does the code ever create an interval that is accessible only via locals 
when a GC occurs? If not, Emacs should be OK. (This should also be 
documented better.)




* Re: Conservative GC isn't safe
  2016-11-26  8:30 ` Paul Eggert
@ 2016-11-26  8:33   ` Daniel Colascione
  2016-11-26  9:01     ` Eli Zaretskii
  2016-11-26 15:03   ` Stefan Monnier
  1 sibling, 1 reply; 46+ messages in thread
From: Daniel Colascione @ 2016-11-26  8:33 UTC (permalink / raw)
  To: Paul Eggert, Emacs developers

On 11/26/2016 12:30 AM, Paul Eggert wrote:
> On 11/26/2016 12:11 AM, Daniel Colascione wrote:
>>
>>   1) mark_maybe_pointer looks only for exact matches on object start.
>> It's perfectly legal for the compiler to keep an interior object
>> pointer and discard the pointer to the object start.
>
> Yes, just as it's perfectly legal for the compiler to subtract 42 from
> every pointer before putting it in a register or storing it into memory.
> In practice, though, compilers don't do this around calls to the garbage
> collector. (True, this assumption should be documented better.)

I can imagine a compiler having a legitimate reason to use an interior 
pointer, but I can't see why it'd subtract 42, XOR it with 0xDEADBEEF, 
or make other opaque transformations. We already search the memory tree, 
and each tree node has both start and end information for each 
allocation. We should be able to cope with interior pointers.
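
A minimal sketch of what coping with interior pointers could look like, assuming a sorted array of non-overlapping blocks as a stand-in for the memory tree. The names `mem_block` and `find_enclosing_block` are hypothetical and are not taken from alloc.c; a real collector would still have to map the interior address back to the start of the object inside the block.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a node of the allocator's memory tree:
   each allocation records where it starts and where it ends.  */
struct mem_block
{
  uintptr_t start;   /* address of the first byte */
  uintptr_t end;     /* address one past the last byte */
};

/* Return the block containing P, or NULL if there is none.  BLOCKS
   must be sorted by start address and must not overlap.  Unlike an
   exact-match lookup, this accepts any address in [start, end), so an
   interior pointer finds its enclosing allocation.  */
static const struct mem_block *
find_enclosing_block (const struct mem_block *blocks, size_t n, uintptr_t p)
{
  size_t lo = 0, hi = n;
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (p < blocks[mid].start)
        hi = mid;
      else if (p >= blocks[mid].end)
        lo = mid + 1;
      else
        return &blocks[mid];
    }
  return NULL;
}
```

With blocks at [0x1000, 0x1040) and [0x2000, 0x2100), the address 0x1008 resolves to the first block even though it is not the block's start, while 0x1040 (one past the end) resolves to nothing.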

>>   2) INTERVAL is GCed, but it's not represented in the memory tree:
>> struct interval isn't a real lisp object and it's allocated as
>> MEM_TYPE_NON_LISP. Even a direct pointer to the start of an interval
>> won't protect it from GC. Shouldn't we treat intervals like conses?
>
> Does the code ever create an interval that is accessible only via locals
> when a GC occurs? If not, Emacs should be OK. (This should also be
> documented better.)

Anywhere in the code? Forever? I wouldn't be confident saying so.




* Re: Conservative GC isn't safe
  2016-11-26  8:33   ` Daniel Colascione
@ 2016-11-26  9:01     ` Eli Zaretskii
  2016-11-26  9:04       ` Daniel Colascione
  0 siblings, 1 reply; 46+ messages in thread
From: Eli Zaretskii @ 2016-11-26  9:01 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: eggert, emacs-devel

> From: Daniel Colascione <dancol@dancol.org>
> Date: Sat, 26 Nov 2016 00:33:13 -0800
> 
> >>   2) INTERVAL is GCed, but it's not represented in the memory tree:
> >> struct interval isn't a real lisp object and it's allocated as
> >> MEM_TYPE_NON_LISP. Even a direct pointer to the start of an interval
> >> won't protect it from GC. Shouldn't we treat intervals like conses?
> >
> > Does the code ever create an interval that is accessible only via locals
> > when a GC occurs? If not, Emacs should be OK. (This should also be
> > documented better.)
> 
> Anywhere in the code? Forever? I wouldn't be confident saying so.

A simple practical solution to such assumptions is to add an assertion
in some strategic place(s).

I don't think it's TRT to sprinkle our sources with code that is there
"just in case", i.e. it will never actually run.




* Re: Conservative GC isn't safe
  2016-11-26  9:01     ` Eli Zaretskii
@ 2016-11-26  9:04       ` Daniel Colascione
  2016-11-26  9:24         ` Eli Zaretskii
  2016-11-26 15:05         ` Stefan Monnier
  0 siblings, 2 replies; 46+ messages in thread
From: Daniel Colascione @ 2016-11-26  9:04 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, emacs-devel

On 11/26/2016 01:01 AM, Eli Zaretskii wrote:
>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Sat, 26 Nov 2016 00:33:13 -0800
>>
>>>>   2) INTERVAL is GCed, but it's not represented in the memory tree:
>>>> struct interval isn't a real lisp object and it's allocated as
>>>> MEM_TYPE_NON_LISP. Even a direct pointer to the start of an interval
>>>> won't protect it from GC. Shouldn't we treat intervals like conses?
>>>
>>> Does the code ever create an interval that is accessible only via locals
>>> when a GC occurs? If not, Emacs should be OK. (This should also be
>>> documented better.)
>>
>> Anywhere in the code? Forever? I wouldn't be confident saying so.
>
> A simple practical solution to such assumptions is to add an assertion
> in some strategic place(s).
>
> I don't think it's TRT to sprinkle our sources with code that is there
> "just in case", i.e. it will never actually run.

How would you assert dynamically that if an interval is reachable, its 
owning string or buffer must be too? It's not enough for the variable 
holding the reference to the string or buffer to be in scope: you have 
to be sure that the reference isn't dead.




* Re: Conservative GC isn't safe
  2016-11-26  9:04       ` Daniel Colascione
@ 2016-11-26  9:24         ` Eli Zaretskii
  2016-11-26 15:05         ` Stefan Monnier
  1 sibling, 0 replies; 46+ messages in thread
From: Eli Zaretskii @ 2016-11-26  9:24 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: eggert, emacs-devel

> Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Sat, 26 Nov 2016 01:04:18 -0800
> 
> On 11/26/2016 01:01 AM, Eli Zaretskii wrote:
> >> From: Daniel Colascione <dancol@dancol.org>
> >> Date: Sat, 26 Nov 2016 00:33:13 -0800
> >>
> >>>>   2) INTERVAL is GCed, but it's not represented in the memory tree:
> >>>> struct interval isn't a real lisp object and it's allocated as
> >>>> MEM_TYPE_NON_LISP. Even a direct pointer to the start of an interval
> >>>> won't protect it from GC. Shouldn't we treat intervals like conses?
> >>>
> >>> Does the code ever create an interval that is accessible only via locals
> >>> when a GC occurs? If not, Emacs should be OK. (This should also be
> >>> documented better.)
> >>
> >> Anywhere in the code? Forever? I wouldn't be confident saying so.
> >
> > A simple practical solution to such assumptions is to add an assertion
> > in some strategic place(s).
> >
> > I don't think it's TRT to sprinkle our sources with code that is there
> > "just in case", i.e. it will never actually run.
> 
> How would you assert dynamically that if an interval is reachable, its 
> owning string or buffer must be too? It's not enough for the variable 
> holding the reference to the string or buffer to be in scope: you have 
> to be sure that the reference isn't dead.

I don't understand the use case you have in mind.  Why would an
interval be created that is not reachable from the interval tree of
some Lisp object?




* Re: Conservative GC isn't safe
  2016-11-26  8:30 ` Paul Eggert
  2016-11-26  8:33   ` Daniel Colascione
@ 2016-11-26 15:03   ` Stefan Monnier
  2016-11-26 15:12     ` Eli Zaretskii
  2016-11-27  6:17     ` Ken Raeburn
  1 sibling, 2 replies; 46+ messages in thread
From: Stefan Monnier @ 2016-11-26 15:03 UTC (permalink / raw)
  To: emacs-devel

>> 1) mark_maybe_pointer looks only for exact matches on object start. It's
>> perfectly legal for the compiler to keep an interior object pointer and
>> discard the pointer to the object start.
> Yes, just as it's perfectly legal for the compiler to subtract 42 from every
> pointer before putting it in a register or storing it into memory. In
> practice, though, compilers don't do this around calls to the garbage
> collector. (True, this assumption should be documented better.)

Indeed.  Hans Boehm's done a fair bit of research in this issue,
including discussing the underlying assumptions and arguing that
compilers should (and usually do) guarantee those assumptions.

>> 2) INTERVAL is GCed, but it's not represented in the memory tree: struct
>> interval isn't a real lisp object and it's allocated as
>> MEM_TYPE_NON_LISP. Even a direct pointer to the start of an interval won't
>> protect it from GC. Shouldn't we treat intervals like conses?
> Does the code ever create an interval that is accessible only via locals
> when a GC occurs? If not, Emacs should be OK. (This should also be
> documented better.)

Indeed, this is a fairly delicate assumption that we don't check.
It's fairly rare to manipulate "struct interval" directly, so I think
the assumption is probably acceptable, but we should maybe document it
more prominently.


        Stefan





* Re: Conservative GC isn't safe
  2016-11-26  9:04       ` Daniel Colascione
  2016-11-26  9:24         ` Eli Zaretskii
@ 2016-11-26 15:05         ` Stefan Monnier
  2016-11-26 15:21           ` Camm Maguire
  2016-11-28 17:51           ` Daniel Colascione
  1 sibling, 2 replies; 46+ messages in thread
From: Stefan Monnier @ 2016-11-26 15:05 UTC (permalink / raw)
  To: emacs-devel

> How would you assert dynamically that if an interval is reachable, its
> owning string or buffer must be too?

You don't.  You check it statically (by a human).

> It's not enough for the variable holding the reference to the string
> or buffer to be in scope: you have to be sure that the reference
> isn't dead.

It should be: if it's in scope, it's not dead.


        Stefan





* Re: Conservative GC isn't safe
  2016-11-26 15:03   ` Stefan Monnier
@ 2016-11-26 15:12     ` Eli Zaretskii
  2016-11-26 16:29       ` Stefan Monnier
  2016-11-27  6:17     ` Ken Raeburn
  1 sibling, 1 reply; 46+ messages in thread
From: Eli Zaretskii @ 2016-11-26 15:12 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sat, 26 Nov 2016 10:03:39 -0500
> 
> >> 2) INTERVAL is GCed, but it's not represented in the memory tree: struct
> >> interval isn't a real lisp object and it's allocated as
> >> MEM_TYPE_NON_LISP. Even a direct pointer to the start of an interval won't
> >> protect it from GC. Shouldn't we treat intervals like conses?
> > Does the code ever create an interval that is accessible only via locals
> > when a GC occurs? If not, Emacs should be OK. (This should also be
> > documented better.)
> 
> Indeed, this is a fairly delicate assumption that we don't check.
> It's fairly rare to manipulate "struct interval" directly, so I think
> the assumption is probably acceptable, but we should maybe document it
> more prominently.

Documentation aspects aside, if by "manipulate struct interval" you
mean what we do in intervals.c between the call to make_interval and
the return value being plugged into some Lisp object, either a buffer
or a string, then we could set a variable during that time, which
would cause an abort in GC, if that happens somehow.

Would that address these concerns?  If not, what kind of direct
manipulations with intervals did you have in mind?
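
A rough sketch of the guard-variable idea described above. All names here are hypothetical; nothing below is existing intervals.c or alloc.c code, and the placement of the begin/end calls is an assumption.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical counter: nonzero while an interval has been allocated
   but has not yet been attached to a buffer or string.  */
static int unattached_interval_depth;

/* To be called right after an interval is allocated.  */
static void
begin_interval_construction (void)
{
  unattached_interval_depth++;
}

/* To be called once the interval is reachable from a Lisp object.  */
static void
end_interval_construction (void)
{
  unattached_interval_depth--;
}

/* True when no interval is in the unattached state.  */
static bool
gc_allowed_now (void)
{
  return unattached_interval_depth == 0;
}

/* Hypothetical collector entry point: abort if a collection starts
   while an unattached interval exists.  */
static void
collect_garbage_checked (void)
{
  if (!gc_allowed_now ())
    abort ();
  /* ... the real collection would run here ... */
}
```

A counter rather than a boolean lets nested constructions balance correctly; whether nesting actually occurs in intervals.c is not something this sketch establishes.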




* Re: Conservative GC isn't safe
  2016-11-26 15:05         ` Stefan Monnier
@ 2016-11-26 15:21           ` Camm Maguire
  2016-11-28 17:51           ` Daniel Colascione
  1 sibling, 0 replies; 46+ messages in thread
From: Camm Maguire @ 2016-11-26 15:21 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

GCL uses a conservative GC as well.  I've always pondered how to make it
more efficient by controlling the allocation and initialization of the
stack.  The general idea is that the stack should be initialized when
allocated, including alignment padding.  The natural place to do this is
with a gcc switch, as I cannot see any way to distinguish variables in
register in standard C.  

Anyone know of any developments along these lines?

Take care,

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> How would you assert dynamically that if an interval is reachable, its
>> owning string or buffer must be too?
>
> You don't.  You check it statically (by a human).
>
>> It's not enough for the variable holding the reference to the string
>> or buffer to be in scope: you have to be sure that the reference
>> isn't dead.
>
> It should be: if it's in scope, it's not dead.
>
>
>         Stefan
>
>
>
>
>

-- 
Camm Maguire			     		    camm@maguirefamily.org
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah





* Re: Conservative GC isn't safe
  2016-11-26 15:12     ` Eli Zaretskii
@ 2016-11-26 16:29       ` Stefan Monnier
  2016-11-26 16:42         ` Eli Zaretskii
  0 siblings, 1 reply; 46+ messages in thread
From: Stefan Monnier @ 2016-11-26 16:29 UTC (permalink / raw)
  To: emacs-devel

> Documentation aspects aside, if by "manipulate struct interval" you
> mean what we do in intervals.c between the call to make_interval and
> the return value being plugged into some Lisp object, either a buffer

Yes, basically, that kind of manipulation.

> or a string, then we could set a variable during that time, which
> would cause an abort in GC, if that happens somehow.

Such a var would only catch some of the possible issues I think
(there's also the issue of when we take an existing struct interval
pointer, remove it from one lvalue and plug it into another, plus
various other cases).

IOW it sounds difficult to make such a test be "complete" (catch
most/all cases).  I also think it could prove fiddly to avoid
false positives.


        Stefan





* Re: Conservative GC isn't safe
  2016-11-26 16:29       ` Stefan Monnier
@ 2016-11-26 16:42         ` Eli Zaretskii
  2016-11-26 18:43           ` Stefan Monnier
  0 siblings, 1 reply; 46+ messages in thread
From: Eli Zaretskii @ 2016-11-26 16:42 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sat, 26 Nov 2016 11:29:06 -0500
> 
> > Documentation aspects aside, if by "manipulate struct interval" you
> > mean what we do in intervals.c between the call to make_interval and
> > the return value being plugged into some Lisp object, either a buffer
> 
> Yes, basically, that kind of manipulation.

All of these cases are in intervals.c.  There are no other calls to
make_interval anywhere in our sources.

So the question is: are those _the_only_ cases that you are talking
about, or do you see any others?

> > or a string, then we could set a variable during that time, which
> > would cause an abort in GC, if that happens somehow.
> 
> Such a var would only catch some of the possible issues I think
> (there's also the issue of when we take an existing struct interval
> pointer, remove it from one lvalue and plug it into another, plus
> various other cases).
> 
> IOW it sounds difficult to make such a test be "complete" (catch
> most/all cases).

That doesn't mean we shouldn't do what we can.  Provided that we
consider this danger to be real, of course.

> I also think it could prove fiddly to avoid false positives.

How can this cause false positives?  The current code doesn't allow
any GC in those functions I described above.  This is purely a
defensive technique against possible changes in the future which will
mistakenly allow that.




* Re: Conservative GC isn't safe
  2016-11-26 16:42         ` Eli Zaretskii
@ 2016-11-26 18:43           ` Stefan Monnier
  0 siblings, 0 replies; 46+ messages in thread
From: Stefan Monnier @ 2016-11-26 18:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

>> Yes, basically, that kind of manipulation.
> All of these cases are in intervals.c.
> There are no other calls to make_interval anywhere in our sources.
> So the question is: are those _the_only_ cases that you are talking
> about, or do you see any others?

I think the only code that manipulates struct intervals is in
intervals.c, indeed, and the risks should be limited to this file.

>> IOW it sounds difficult to make such a test be "complete" (catch
>> most/all cases).
> That doesn't mean we shouldn't do what we can.
> Provided that we consider this danger to be real, of course.

Right, it's a tradeoff.  I personally don't think the tradeoffs favor
writing such code (which is why I haven't done any such thing), as
opposed to relying on code review (this code doesn't change very often).

>> I also think it could prove fiddly to avoid false positives.
> How can this cause false positives?

I was mostly thinking of cases where the flag that signals we're
in the process of manipulating intervals could stay set after the fact
(because of non-local exit), but I'm sure there could be other cases
(it will all depend on exactly what we check and how): false positives
are pretty hard to avoid completely.
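
The non-local-exit case can be illustrated with plain setjmp/longjmp. This is only a sketch: the flag name and both functions are hypothetical, and longjmp stands in for a Lisp-level signal.

```c
#include <setjmp.h>
#include <stdbool.h>

static int manipulating_intervals;   /* hypothetical guard flag */
static jmp_buf error_handler;        /* stands in for a condition handler */

/* Set the guard, then leave through longjmp before it is cleared.  */
static void
graft_with_error (void)
{
  manipulating_intervals = 1;
  longjmp (error_handler, 1);        /* simulated signal */
  /* The statement that would clear the flag is never reached.  */
}

/* Return true if the guard is still set after the non-local exit,
   i.e. a later, perfectly safe GC would trip the check.  */
static bool
guard_stuck_after_error (void)
{
  manipulating_intervals = 0;
  if (setjmp (error_handler) == 0)
    graft_with_error ();
  return manipulating_intervals != 0;
}
```

Avoiding this would require resetting the flag on every exit path, for example through an unwind handler, which is part of what makes the check fiddly.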

> The current code doesn't allow any GC in those functions I described
> above.  This is purely a defensive technique against possible changes
> in the future which will mistakenly allow that.

Another approach would be to change the conservative GC code so as to
also look for "struct intervals" pointers.  We could do it "all the
time", so as to just completely avoid the problem, or we could do it
only depending on a debug flag and then signal an error when this extra
code detects a reference that's not "redundant".
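
One way the debug-flag variant might look, as a toy model only. It assumes ordinary marking has already run, so every interval reachable through its owner is marked; any stack word that points at an unmarked interval is then a reference that is not redundant. The types and the function name are hypothetical, and the linear search stands in for a proper lookup.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy interval: only the mark bit matters for this check.  */
struct toy_interval
{
  bool marked;   /* set when reached through its owning string or buffer */
};

/* Count the words in STACK[0..NWORDS) that point at the start of an
   interval in POOL[0..NPOOL) which ordinary marking did not reach.
   Each such word is the only thing keeping that interval alive.  */
static size_t
count_unprotected_interval_refs (const uintptr_t *stack, size_t nwords,
                                 const struct toy_interval *pool,
                                 size_t npool)
{
  size_t count = 0;
  for (size_t i = 0; i < nwords; i++)
    for (size_t j = 0; j < npool; j++)
      if (stack[i] == (uintptr_t) &pool[j] && !pool[j].marked)
        count++;
  return count;
}
```

A debug build could signal an error when the count is nonzero; the "all the time" variant would instead mark the interval and continue.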

In any case, so far I think it's just a problem in theory, but in
practice I haven't seen any indication that we really have a problem.


        Stefan




* Re: Conservative GC isn't safe
  2016-11-26  8:11 Conservative GC isn't safe Daniel Colascione
  2016-11-26  8:30 ` Paul Eggert
@ 2016-11-26 19:08 ` Pip Cet
  2016-11-27  0:24   ` Paul Eggert
  1 sibling, 1 reply; 46+ messages in thread
From: Pip Cet @ 2016-11-26 19:08 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: Emacs developers

On Sat, Nov 26, 2016 at 8:11 AM, Daniel Colascione <dancol@dancol.org> wrote:
>   1) mark_maybe_pointer looks only for exact matches on object start. It's
> perfectly legal for the compiler to keep an interior object pointer and
> discard the pointer to the object start.

There's a new, currently undocumented, GCC option called
-fkeep-gc-roots-live, which I think addresses this problem. My
understanding is previous versions of GCC did not in practice break
conservative GC (except for strings, which we handle specially for
this reason).




* Re: Conservative GC isn't safe
  2016-11-26 19:08 ` Pip Cet
@ 2016-11-27  0:24   ` Paul Eggert
  0 siblings, 0 replies; 46+ messages in thread
From: Paul Eggert @ 2016-11-27  0:24 UTC (permalink / raw)
  To: Pip Cet, Daniel Colascione; +Cc: Emacs developers

On 11/26/2016 11:08 AM, Pip Cet wrote:
> There's a new, currently undocumented, GCC option called
> -fkeep-gc-roots-live, which I think addresses this problem.

Thanks, I hadn't heard about this option. Apparently it's designed for 
Go, whose garbage collector follows pointers to any part of the 
pointed-at object. Although this doesn't match Emacs's current GC, it 
should be an improvement. Perhaps someone could try using this option to 
build Emacs, to see whether it hurts performance or correctness.





* Re: Conservative GC isn't safe
  2016-11-26 15:03   ` Stefan Monnier
  2016-11-26 15:12     ` Eli Zaretskii
@ 2016-11-27  6:17     ` Ken Raeburn
  2016-11-27 15:39       ` Eli Zaretskii
                         ` (2 more replies)
  1 sibling, 3 replies; 46+ messages in thread
From: Ken Raeburn @ 2016-11-27  6:17 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

On Nov 26, 2016, at 10:03, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> 
>>> 1) mark_maybe_pointer looks only for exact matches on object start. It's
>>> perfectly legal for the compiler to keep an interior object pointer and
>>> discard the pointer to the object start.
>> Yes, just as it's perfectly legal for the compiler to subtract 42 from every
>> pointer before putting it in a register or storing it into memory. In
>> practice, though, compilers don't do this around calls to the garbage
>> collector. (True, this assumption should be documented better.)
> 
> Indeed.  Hans Boehm's done a fair bit of research in this issue,
> including discussing the underlying assumptions and arguing that
> compilers should (and usually do) guarantee those assumptions.

I’d be surprised if that held reliably when the last use of a Lisp_Object in some function extracts an object pointer and then never references the Lisp_Object as such ever again.

Lisp_Object foo (Lisp_Object obj)
{
  …
  return mumble (XSYMBOL (obj));
}

It’s got no reason to specifically obfuscate the value, but it may also have no reason to keep a copy of the Lisp_Object value around when it’s no longer needed.  It’s not so much that the compiler has decided to start using an interior pointer on its own, but instead just doing what we told it to do.  If “mumble” triggers GC, stack marking may well find only the pointer and not the original “obj” value in this function, especially if the compiler optimizes away the stack frame of “foo” completely.

If Boehm has found that compilers really do keep references even in cases like this (“usually” probably isn’t good enough), I’d be interested in reading up on that.

Ken



* Re: Conservative GC isn't safe
  2016-11-27  6:17     ` Ken Raeburn
@ 2016-11-27 15:39       ` Eli Zaretskii
  2016-11-28  9:50         ` Ken Raeburn
  2016-11-27 16:15       ` Paul Eggert
  2016-11-27 16:52       ` Stefan Monnier
  2 siblings, 1 reply; 46+ messages in thread
From: Eli Zaretskii @ 2016-11-27 15:39 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: monnier, emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Sun, 27 Nov 2016 01:17:54 -0500
> Cc: emacs-devel@gnu.org
> 
> > Indeed.  Hans Boehm's done a fair bit of research in this issue,
> > including discussing the underlying assumptions and arguing that
> > compilers should (and usually do) guarantee those assumptions.
> 
> I’d be surprised if that held reliably when the last use of a Lisp_Object in some function extracts an object pointer and then never references the Lisp_Object as such ever again.
> 
> Lisp_Object foo (Lisp_Object obj)
> {
>   …
>   return mumble (XSYMBOL (obj));
> }
> 
> It’s got no reason to specifically obfuscate the value, but it may also have no reason to keep a copy of the Lisp_Object value around when it’s no longer needed.  It’s not so much that the compiler has decided to start using an interior pointer on its own, but instead just doing what we told it to do.  If “mumble” triggers GC, stack marking may well find only the pointer and not the original “obj” value in this function, especially if the compiler optimizes away the stack frame of “foo” completely.

IOW, you envision the possibility that the object identified by 'obj'
is not held in any stack-based memory cell in any of the 'foo's
callers, and instead all of them hold that object only in registers?
Is that a reasonable assumption?




* Re: Conservative GC isn't safe
  2016-11-27  6:17     ` Ken Raeburn
  2016-11-27 15:39       ` Eli Zaretskii
@ 2016-11-27 16:15       ` Paul Eggert
  2016-11-28  9:36         ` Ken Raeburn
  2016-11-27 16:52       ` Stefan Monnier
  2 siblings, 1 reply; 46+ messages in thread
From: Paul Eggert @ 2016-11-27 16:15 UTC (permalink / raw)
  To: Ken Raeburn, Stefan Monnier; +Cc: emacs-devel

Ken Raeburn wrote:
>> > Indeed.  Hans Boehm's done a fair bit of research in this issue,
>> > including discussing the underlying assumptions and arguing that
>> > compilers should (and usually do) guarantee those assumptions.

> I’d be surprised if that held reliably when the last use of a Lisp_Object in some function extracts an object pointer and then never references the Lisp_Object as such ever again.

That's not a problem for Emacs, since the Emacs GC marks the object either way.




* Re: Conservative GC isn't safe
  2016-11-27  6:17     ` Ken Raeburn
  2016-11-27 15:39       ` Eli Zaretskii
  2016-11-27 16:15       ` Paul Eggert
@ 2016-11-27 16:52       ` Stefan Monnier
  2 siblings, 0 replies; 46+ messages in thread
From: Stefan Monnier @ 2016-11-27 16:52 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: emacs-devel

> If Boehm has found that compilers really do keep references even in cases
> like this (“usually” probably isn’t good enough), I’d be interested in
> reading up on that.

Try http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.432


        Stefan




* Re: Conservative GC isn't safe
  2016-11-27 16:15       ` Paul Eggert
@ 2016-11-28  9:36         ` Ken Raeburn
  2016-11-28 15:55           ` Eli Zaretskii
  2016-11-28 16:13           ` Paul Eggert
  0 siblings, 2 replies; 46+ messages in thread
From: Ken Raeburn @ 2016-11-28  9:36 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Stefan Monnier, emacs-devel

On Nov 27, 2016, at 11:15, Paul Eggert <eggert@cs.ucla.edu> wrote:
> 
> Ken Raeburn wrote:
>>> > Indeed.  Hans Boehm's done a fair bit of research in this issue,
>>> > including discussing the underlying assumptions and arguing that
>>> > compilers should (and usually do) guarantee those assumptions.
> 
>> I’d be surprised if that held reliably when the last use of a Lisp_Object in some function extracts an object pointer and then never references the Lisp_Object as such ever again.
> 
> That's not a problem for Emacs, since the Emacs GC marks the object either way.

Ah, sorry, I misunderstood the case Daniel was describing.  Yes, the case I was thinking of is in fact handled; stack slots holding either Lisp_Object values or pointers to the start of the Lisp data structures will be fine.
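
The handled case can be sketched as follows, assuming a low-bit tagging scheme in which the tagged value is the object's address plus a small tag. The function is hypothetical and greatly simplified compared with the real stack-marking code; it only shows which stack words are recognized and which are not.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if WORD refers to the object that starts at OBJ_START, either
   as a raw pointer to its first byte or as a tagged value carrying
   TAG in the low bits.  Any other address inside the object, i.e. an
   interior pointer, is not recognized.  */
static bool
word_refers_to_object (uintptr_t word, uintptr_t obj_start, unsigned tag)
{
  bool as_raw_pointer = word == obj_start;
  bool as_tagged_value = word == obj_start + tag;
  return as_raw_pointer || as_tagged_value;
}
```

For an object at 0x1000 with tag 5, the words 0x1000 and 0x1005 are both recognized, but 0x1008, which points into the object's body, is not.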

But we do use interior pointers sometimes; looking at Fsubstring’s handling of a vector object:

  else
    res = Fvector (ito - ifrom, aref_addr (string, ifrom));

  return res;
}

… here we pass Fvector a pointer to somewhere within the “contents” array of the vector passed as argument “string”; it’s neither the Lisp_Object value, nor the start of the allocated structure.  Now, I don’t think this is a case that can trigger GC at the critical time.  But clearly we’ve got at least one case where we keep an interior pointer and — locally, at least; the caller could be another matter — don’t keep a live handle on the object itself.

And the compiler can do it too.  For example, if we did something like this:

  DEFUN ("frob-array-elts", Ffrob_array_elts, Sfrob_array_elts, 1, 1, 0,
         doc: /* Blah */ )
    (Lisp_Object obj)
  {
    int i;
    for (i = 0; i < 30; i += 3)
      {
        frob (AREF (obj, i));
      }
    return Qnil;
  }

I tried compiling this (“gcc version 4.9.2 (Debian 4.9.2-10)” on x86-64).  The generated code computes obj+3 (vector tag is 5, contents array starts at offset 8) and obj+0xf3 (end of the iteration), and overwrites the register containing the original “obj” value with the argument to be passed to “frob”.
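
The two displacements can be checked by hand. The sketch below takes the tag and the contents offset from the message and assumes 8-byte vector slots on x86-64, which the message does not state explicitly; the function name is made up for illustration.

```c
#include <stdint.h>

enum
{
  VECTOR_TAG = 5,        /* from the message above */
  CONTENTS_OFFSET = 8,   /* from the message above */
  SLOT_SIZE = 8,         /* assumption: 8-byte slots on x86-64 */
  LOOP_LIMIT = 30        /* the loop runs while i < 30 */
};

/* Displacement from the tagged value `obj' to the address of
   contents[i]: drop the tag, skip the header, then index.  */
static intptr_t
displacement_of_element (int i)
{
  return CONTENTS_OFFSET - VECTOR_TAG + (intptr_t) i * SLOT_SIZE;
}
```

Element 0 sits at obj + 3, and the loop bound at i = 30 sits at obj + 3 + 30 * 8 = obj + 243 = obj + 0xf3. Neither value, nor any of the intermediate ones, equals the tagged `obj' or the untagged start of the vector, so an exact-match scan would not recognize them.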

If, in this case, “frob” were something that could trigger GC, then stack scanning would not see “obj”, at least not in this stack frame.  And if the caller is doing something like:

  Ffrob_array_elts (get_vector_of_stuff ());

then the caller needn’t retain any other references to “obj” either.

Ken



* Re: Conservative GC isn't safe
  2016-11-27 15:39       ` Eli Zaretskii
@ 2016-11-28  9:50         ` Ken Raeburn
  2016-11-28 15:55           ` Eli Zaretskii
  0 siblings, 1 reply; 46+ messages in thread
From: Ken Raeburn @ 2016-11-28  9:50 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

On Nov 27, 2016, at 10:39, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Ken Raeburn <raeburn@raeburn.org>
>> Date: Sun, 27 Nov 2016 01:17:54 -0500
>> Cc: emacs-devel@gnu.org
>> 
>>> Indeed.  Hans Boehm's done a fair bit of research in this issue,
>>> including discussing the underlying assumptions and arguing that
>>> compilers should (and usually do) guarantee those assumptions.
>> 
>> I’d be surprised if that held reliably when the last use of a Lisp_Object in some function extracts an object pointer and then never references the Lisp_Object as such ever again.
>> 
>> Lisp_Object foo (Lisp_Object obj)
>> {
>>  …
>>  return mumble (XSYMBOL (obj));
>> }
>> 
>> It’s got no reason to specifically obfuscate the value, but it may also have no reason to keep a copy of the Lisp_Object value around when it’s no longer needed.  It’s not so much that the compiler has decided to start using an interior pointer on its own, but instead just doing what we told it to do.  If “mumble” triggers GC, stack marking may well find only the pointer and not the original “obj” value in this function, especially if the compiler optimizes away the stack frame of “foo” completely.
> 
> IOW, you envision the possibility that the object identified by 'obj'
> is not held in any stack-based memory cell in any of the 'foo's
> callers, and instead all of them hold that object only in registers?
> Is that a reasonable assumption?

Anything held in registers should get scanned by GC as well, once setjmp forces them into memory.
But if the caller never refers to the object again, it’s certainly possible that it wouldn’t keep around any copies of the Lisp_Object value, even in other registers, once it’s set up the arguments for the one function call that uses the object.  If the caller also created the object, there may not be any other references left.
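
The setjmp technique can be sketched as below. This is an illustration rather than the actual collector code: the function names are hypothetical, and what exactly lands in a jmp_buf (and in what form) is implementation-specific.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Call VISIT on every word-sized chunk of the register snapshot that
   setjmp writes into a jmp_buf.  Return the number of words visited.  */
static size_t
scan_saved_registers (void (*visit) (uintptr_t word, void *ctx), void *ctx)
{
  jmp_buf regs;
  uintptr_t words[sizeof regs / sizeof (uintptr_t)];

  memset (regs, 0, sizeof regs);        /* avoid reading indeterminate bytes */
  setjmp (regs);                        /* spill callee-saved registers */
  memcpy (words, regs, sizeof words);   /* view the snapshot as plain words */

  size_t n = sizeof words / sizeof words[0];
  for (size_t i = 0; i < n; i++)
    visit (words[i], ctx);
  return n;
}

/* Example visitor: count the words seen.  A collector would instead
   test each word against its table of live allocations.  */
static void
count_word (uintptr_t word, void *ctx)
{
  (void) word;
  ++*(size_t *) ctx;
}
```

This covers values that are still in registers at the time of the call; it does nothing for a value the compiler has already discarded, which is the situation described above.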

Ken



* Re: Conservative GC isn't safe
  2016-11-28  9:36         ` Ken Raeburn
@ 2016-11-28 15:55           ` Eli Zaretskii
  2016-11-28 16:15             ` Stefan Monnier
  2016-11-28 17:03             ` Björn Lindqvist
  2016-11-28 16:13           ` Paul Eggert
  1 sibling, 2 replies; 46+ messages in thread
From: Eli Zaretskii @ 2016-11-28 15:55 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: eggert, monnier, emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Mon, 28 Nov 2016 04:36:56 -0500
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
> 
> But we do use interior pointers sometimes; looking at Fsubstring’s handling of a vector object:
> 
>   else
>     res = Fvector (ito - ifrom, aref_addr (string, ifrom));
> 
>   return res;
> }
> 
> … here we pass Fvector a pointer to somewhere within the “contents” array of the vector passed as argument “string”; it’s neither the Lisp_Object value, nor the start of the allocated structure.  Now, I don’t think this is a case that can trigger GC at the critical time.  But clearly we’ve got at least one case where we keep an interior pointer and — locally, at least; the caller could be another matter — don’t keep a live handle on the object itself.

But 'string' still references the contents array of the vector, so GC
will mark it when it comes to 'string'.

> And the compiler can do it too.  For example, if we did something like this:
> 
>   DEFUN ("frob-array-elts", Ffrob_array_elts, Sfrob_array_elts, 1, 1, 0,
>          doc: /* Blah */ )
>     (Lisp_Object obj)
>   {
>     int i;
>     for (i = 0; i < 30; i += 3)
>       {
>         frob (AREF (obj, i));
>       }
>     return Qnil;
>   }
> 
> I tried compiling this (“gcc version 4.9.2 (Debian 4.9.2-10)” on x86-64).  The generated code computes obj+3 (vector tag is 5, contents array starts at offset 8) and obj+0xf3 (end of the iteration), and overwrites the register containing the original “obj” value with the argument to be passed to “frob”.
> 
> If, in this case, “frob” were something that could trigger GC, then stack scanning would not see “obj”, at least not in this stack frame.  And if the caller is doing something like:
> 
>   Ffrob_array_elts (get_vector_of_stuff ());
> 
> then the caller needn’t retain any other references to “obj” either.

Again, 'obj' will be somewhere on the stack up the call frames.
Otherwise, how did it wind up in your frob-array-elts function?




* Re: Conservative GC isn't safe
  2016-11-28  9:50         ` Ken Raeburn
@ 2016-11-28 15:55           ` Eli Zaretskii
  0 siblings, 0 replies; 46+ messages in thread
From: Eli Zaretskii @ 2016-11-28 15:55 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: monnier, emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Mon, 28 Nov 2016 04:50:56 -0500
> Cc: monnier@iro.umontreal.ca,
>  emacs-devel@gnu.org
> 
> But if the caller never refers to the object again, it’s certainly possible that it wouldn’t keep around any copies of the Lisp_Object value, even in other registers, once it’s set up the arguments for the one function call that uses the object.  If the caller also created the object, there may not be any other references left.

One of the callers should still hold the reference.




* Re: Conservative GC isn't safe
  2016-11-28  9:36         ` Ken Raeburn
  2016-11-28 15:55           ` Eli Zaretskii
@ 2016-11-28 16:13           ` Paul Eggert
  1 sibling, 0 replies; 46+ messages in thread
From: Paul Eggert @ 2016-11-28 16:13 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: Stefan Monnier, emacs-devel

On 11/28/2016 01:36 AM, Ken Raeburn wrote:
> here we pass Fvector a pointer to somewhere within the “contents” array of the vector passed as argument “string”

Sure, but in practice there are always other copies of 'string' floating 
around. Look at the call to Fsubstring in bytecode.c, for example:

   TOP = Fsubstring (TOP, v1, v2);

There's another copy of 'string' in the stack. Even if that code were 
changed to this:

   Lisp_Object v = POP; PUSH (Fsubstring (v, v1, v2));

because the garbage collector doesn't know where the stack top is and so 
scans the entire stack, the GC would still mark the string.

In practice, Emacs C code doesn't create objects that no Elisp code 
cares about, and then mess with those objects in a way that can cause GC 
and that does not need the object base address. Partly this is due to 
the longstanding tradition where such objects needed to be GCPRO'd 
anyway. Partly it's because the Emacs C code is designed to be 
subsidiary to the Lisp code, and it's unusual to create and mess with 
objects not intended to be exported to Lisp.

If Emacs code starts doing what you're worried about, then we'll have to 
modify the GC marker to chase all pointers into objects, not merely base 
pointers. But I don't see this being a problem in the current code.





* Re: Conservative GC isn't safe
  2016-11-28 15:55           ` Eli Zaretskii
@ 2016-11-28 16:15             ` Stefan Monnier
  2016-11-28 17:37               ` Eli Zaretskii
  2016-11-28 17:03             ` Björn Lindqvist
  1 sibling, 1 reply; 46+ messages in thread
From: Stefan Monnier @ 2016-11-28 16:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Ken Raeburn, emacs-devel, eggert

>> res = Fvector (ito - ifrom, aref_addr (string, ifrom));
> But 'string' still references the contents array of the vector, so GC
> will mark it when it comes to 'string'.

But after computing "aref_addr (string, ifrom)", it may very well be
that `string` is dead and the compiler may then decide not to write
it into the stack.


        Stefan




* Re: Conservative GC isn't safe
  2016-11-28 15:55           ` Eli Zaretskii
  2016-11-28 16:15             ` Stefan Monnier
@ 2016-11-28 17:03             ` Björn Lindqvist
  1 sibling, 0 replies; 46+ messages in thread
From: Björn Lindqvist @ 2016-11-28 17:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Ken Raeburn, emacs-devel, Stefan Monnier, eggert

>> If, in this case, “frob” were something that could trigger GC, then stack scanning would not see “obj”, at least not in this stack frame.  And if the caller is doing something like:
>>
>>   Ffrob_array_elts (get_vector_of_stuff ());
>>
>> then the caller needn’t retain any other references to “obj” either.
>
> Again, 'obj' will be somewhere on the stack up the call frames.
> Otherwise, how did it wind up in your frob-array-elts function?

With the 64-bit x86 SysV ABI, the compiler will put the return value of
get_vector_of_stuff () in the RDI register when calling the function,
so the value might never exist on the stack. It is different from 32-bit
architectures, where parameters are mostly passed on the stack. Then, in
the loop in Ffrob_array_elts, the base pointer in RDI to the object
returned by get_vector_of_stuff () might be overwritten, which would
cause the object to be lost completely.

The first problem is easy to deal with by pushing all general purpose
registers onto the stack before running the gc.

The other problem, that RDI might contain an interior pointer into the
object returned by get_vector_of_stuff (), can be solved too. Like this:

a) Add all allocated objects to an ordered set O, keyed by address.
b) For each pointer P to trace:
c) If P does not point to an object header:
d) Find the object in O with the largest address < P and trace that one instead.

The ordered set can be implemented as a red-black tree so the lookup
in d) will be a fast O(log N) operation. This method is often used to
trace code heaps because return addresses are a form of interior
pointers. But it should work for data heaps too.
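
A minimal sketch of steps a) to d), under some assumptions: a sorted
array with binary search stands in for the red-black tree (the lookup
is O(log N) either way), and the names obj_range, find_predecessor and
object_base are made up for illustration, not taken from alloc.c. The
sketch also records each object's size and rejects pointers that land
in the gap after the preceding object, which the steps above leave
implicit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one allocated object: start address and size. */
struct obj_range {
    uintptr_t start;
    size_t size;
};

/* Return the index of the range with the largest start <= p, or -1 if
   p lies below every recorded object.  `ranges` must be sorted by
   ascending start address. */
static long find_predecessor(const struct obj_range *ranges, size_t n,
                             uintptr_t p)
{
    assert(ranges != NULL || n == 0);
    size_t lo = 0, hi = n;
    /* Invariant: every index below lo has start <= p, and every index
       at or above hi has start > p. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (ranges[mid].start <= p)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (long) lo - 1;
}

/* Map a possibly interior pointer to the start of its object, or
   return 0 if p does not fall inside any recorded object. */
static uintptr_t object_base(const struct obj_range *ranges, size_t n,
                             uintptr_t p)
{
    long i = find_predecessor(ranges, n, p);
    if (i < 0)
        return 0;
    if (p - ranges[i].start >= ranges[i].size)
        return 0;               /* p is in the gap after object i */
    return ranges[i].start;
}
```

With a table holding an object at 0x2000 of size 0x100, a candidate
word of 0x20f3 resolves to 0x2000, so the collector would trace the
whole object rather than discard the word.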


-- 
mvh/best regards Björn Lindqvist




* Re: Conservative GC isn't safe
  2016-11-28 16:15             ` Stefan Monnier
@ 2016-11-28 17:37               ` Eli Zaretskii
  2016-11-28 17:49                 ` Stefan Monnier
  2016-11-28 19:09                 ` Ken Raeburn
  0 siblings, 2 replies; 46+ messages in thread
From: Eli Zaretskii @ 2016-11-28 17:37 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: raeburn, emacs-devel, eggert

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Ken Raeburn <raeburn@raeburn.org>,  eggert@cs.ucla.edu,  emacs-devel@gnu.org
> Date: Mon, 28 Nov 2016 11:15:28 -0500
> 
> >> res = Fvector (ito - ifrom, aref_addr (string, ifrom));
> > But 'string' still references the contents array of the vector, so GC
> > will mark it when it comes to 'string'.
> 
> But after computing "aref_addr (string, ifrom)", it may very well be
> that `string` is dead and the compiler may then decide not to write
> it into the stack.

It will still be there in the caller.




* Re: Conservative GC isn't safe
  2016-11-28 17:37               ` Eli Zaretskii
@ 2016-11-28 17:49                 ` Stefan Monnier
  2016-11-28 17:57                   ` Eli Zaretskii
  2016-11-28 19:09                 ` Ken Raeburn
  1 sibling, 1 reply; 46+ messages in thread
From: Stefan Monnier @ 2016-11-28 17:49 UTC (permalink / raw)
  To: emacs-devel

>> >> res = Fvector (ito - ifrom, aref_addr (string, ifrom));
>> > But 'string' still references the contents array of the vector, so GC
>> > will mark it when it comes to 'string'.
>> But after computing "aref_addr (string, ifrom)", it may very well be
>> that `string` is dead and the compiler may then decide not to write
>> it into the stack.
> It will still be there in the caller.

Only if the caller still needs it after the call.
That's usually the case, but it's not a given.


        Stefan





* Re: Conservative GC isn't safe
  2016-11-26 15:05         ` Stefan Monnier
  2016-11-26 15:21           ` Camm Maguire
@ 2016-11-28 17:51           ` Daniel Colascione
  2016-11-28 18:00             ` Eli Zaretskii
  2016-11-28 18:03             ` Stefan Monnier
  1 sibling, 2 replies; 46+ messages in thread
From: Daniel Colascione @ 2016-11-28 17:51 UTC (permalink / raw)
  To: Stefan Monnier, emacs-devel

On 11/26/2016 07:05 AM, Stefan Monnier wrote:
>> How would you assert dynamically that if an interval is reachable, its
>> owning string or buffer must be too?
>
> You don't.  You check it statically (by a human).
>
>> It's not enough for the variable holding the reference to the string
>> or buffer to be in scope: you have to be sure that the reference
>> isn't dead.
>
> It should be: if it's in scope, it's not dead.

That's not the case.

struct foo* f = something();
int* x = &f->field;
something_else(); // invalidate global memory
*x = 5; // f is dead here, but still in scope

Even if you don't write this kind of code, the compiler is allowed to 
generate it.




* Re: Conservative GC isn't safe
  2016-11-28 17:49                 ` Stefan Monnier
@ 2016-11-28 17:57                   ` Eli Zaretskii
  2016-11-28 18:05                     ` Stefan Monnier
  0 siblings, 1 reply; 46+ messages in thread
From: Eli Zaretskii @ 2016-11-28 17:57 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Mon, 28 Nov 2016 12:49:28 -0500
> 
> >> >> res = Fvector (ito - ifrom, aref_addr (string, ifrom));
> >> > But 'string' still references the contents array of the vector, so GC
> >> > will mark it when it comes to 'string'.
> >> But after computing "aref_addr (string, ifrom)", it may very well be
> >> that `string` is dead and the compiler may then decide not to write
> >> it into the stack.
> > It will still be there in the caller.
> 
> Only if the caller still needs it after the call.
> That's usually the case, but it's not a given.

If it is not needed after the call, it will be clobbered after the
call, but not during the call.




* Re: Conservative GC isn't safe
  2016-11-28 17:51           ` Daniel Colascione
@ 2016-11-28 18:00             ` Eli Zaretskii
  2016-11-28 18:03               ` Daniel Colascione
  2016-11-28 18:03             ` Stefan Monnier
  1 sibling, 1 reply; 46+ messages in thread
From: Eli Zaretskii @ 2016-11-28 18:00 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: monnier, emacs-devel

> From: Daniel Colascione <dancol@dancol.org>
> Date: Mon, 28 Nov 2016 09:51:37 -0800
> 
> struct foo* f = something();
> int* x = &f->field;
> something_else(); // invalidate global memory
> *x = 5; // f is dead here, but still in scope
> 
> Even if you don't write this kind of code, the compiler is allowed to 
> generate it.

But there's no such code in Emacs, and will never be.  Lisp objects we
create are either temporaries that can be GC'ed, or values that cannot
be GC'ed, in which case they are passed to some other code, either a
callee or returned as a value.  The only ones that can be dead as
above are the first variety, about which we don't care.




* Re: Conservative GC isn't safe
  2016-11-28 17:51           ` Daniel Colascione
  2016-11-28 18:00             ` Eli Zaretskii
@ 2016-11-28 18:03             ` Stefan Monnier
  2016-11-28 19:18               ` Daniel Colascione
  2016-11-28 19:26               ` Andreas Schwab
  1 sibling, 2 replies; 46+ messages in thread
From: Stefan Monnier @ 2016-11-28 18:03 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel

>>> How would you assert dynamically that if an interval is reachable, its
>>> owning string or buffer must be too?
>> You don't.  You check it statically (by a human).
>>> It's not enough for the variable holding the reference to the string
>>> or buffer to be in scope: you have to be sure that the reference
>>> isn't dead.
>> It should be: if it's in scope, it's not dead.
> That's not the case.

In general, no, but in this specific case I think it always will.
E.g. because in order to be in the process of working on the intervals of
a buffer, that buffer needs to be not just reachable but buffer-live-p,
so you'd have to mess with buffer-alist before the buffer can be
reclaimed, which is highly unlikely to happen within the functions that
manipulate intervals.

For strings, the argument might not be as strong.  Maybe code like
(ignore (propertize "foo" 'a 'b)) could lead to us working on an
unreachable string, so it could get GC'd while we manipulate its
intervals.

So for those cases, I guess the main safety argument we have is that we
will not call the GC while we're in the middle of manipulating struct
interval objects.


        Stefan




* Re: Conservative GC isn't safe
  2016-11-28 18:00             ` Eli Zaretskii
@ 2016-11-28 18:03               ` Daniel Colascione
  2016-11-28 18:50                 ` Eli Zaretskii
  0 siblings, 1 reply; 46+ messages in thread
From: Daniel Colascione @ 2016-11-28 18:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

On 11/28/2016 10:00 AM, Eli Zaretskii wrote:
>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Mon, 28 Nov 2016 09:51:37 -0800
>>
>> struct foo* f = something();
>> int* x = &f->field;
>> something_else(); // invalidate global memory
>> *x = 5; // f is dead here, but still in scope
>>
>> Even if you don't write this kind of code, the compiler is allowed to
>> generate it.
>
> But there's no such code in Emacs, and will never be.

I think you have too little faith in the ingenuity of compiler writers. 
Why can't the compiler generate this sort of code in cases we don't 
anticipate?

> Lisp objects we
> create are either temporaries that can be GC'ed, or values that cannot
> be GC'ed, in which case they are passed to some other code, either a
> callee or returned as a value.  The only ones that can be dead as
> above are the first variety, about which we don't care.

When this assumption stops holding, it's going to be very difficult to 
debug the resulting occasional crashes. Wouldn't it be easier to use the 
information *already in the memory tree* to make GC more conservative 
and understand interior pointers?
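
To make the idea concrete, here is a rough sketch of the difference
between an exact-start check and an interior-aware one, assuming a heap
block that holds fixed-size cells. The struct cell_block and both
functions are hypothetical stand-ins for illustration; this is not the
actual alloc.c code, and the real memory tree carries more than this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one heap block holding fixed-size cells. */
struct cell_block {
    uintptr_t start;      /* address of the first cell */
    size_t cell_size;     /* bytes per cell, must be nonzero */
    size_t cell_count;    /* number of cells in the block */
};

/* Exact-start check: accept p only if it is the first byte of a cell. */
static int is_cell_start(const struct cell_block *b, uintptr_t p)
{
    assert(b->cell_size > 0);
    if (p < b->start)
        return 0;
    uintptr_t offset = p - b->start;
    return offset < b->cell_size * b->cell_count
           && offset % b->cell_size == 0;
}

/* Interior-aware check: return the start of the cell containing p,
   or 0 if p lies outside the block. */
static uintptr_t containing_cell(const struct cell_block *b, uintptr_t p)
{
    assert(b->cell_size > 0);
    if (p < b->start)
        return 0;
    uintptr_t offset = p - b->start;
    if (offset >= b->cell_size * b->cell_count)
        return 0;
    return p - offset % b->cell_size;   /* round down to the cell start */
}
```

The only extra work over the exact-start check is the rounding step, at
the price of keeping alive any cell that some stray word happens to
point into.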




* Re: Conservative GC isn't safe
  2016-11-28 17:57                   ` Eli Zaretskii
@ 2016-11-28 18:05                     ` Stefan Monnier
  0 siblings, 0 replies; 46+ messages in thread
From: Stefan Monnier @ 2016-11-28 18:05 UTC (permalink / raw)
  To: emacs-devel

>> >> >> res = Fvector (ito - ifrom, aref_addr (string, ifrom));
>> >> > But 'string' still references the contents array of the vector, so GC
>> >> > will mark it when it comes to 'string'.
>> >> But after computing "aref_addr (string, ifrom)", it may very well be
>> >> that `string` is dead and the compiler may then decide not to write
>> >> it into the stack.
>> > It will still be there in the caller.
>> Only if the caller still needs it after the call.
>> That's usually the case, but it's not a given.
> If it is not needed after the call, it will be clobbered after the
> call, but not during the call.

Not necessarily.  Depends on details such as the argument passing convention.


        Stefan





* Re: Conservative GC isn't safe
  2016-11-28 18:03               ` Daniel Colascione
@ 2016-11-28 18:50                 ` Eli Zaretskii
  0 siblings, 0 replies; 46+ messages in thread
From: Eli Zaretskii @ 2016-11-28 18:50 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: monnier, emacs-devel

> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Mon, 28 Nov 2016 10:03:49 -0800
> 
> On 11/28/2016 10:00 AM, Eli Zaretskii wrote:
> >> From: Daniel Colascione <dancol@dancol.org>
> >> Date: Mon, 28 Nov 2016 09:51:37 -0800
> >>
> >> struct foo* f = something();
> >> int* x = &f->field;
> >> something_else(); // invalidate global memory
> >> *x = 5; // f is dead here, but still in scope
> >>
> >> Even if you don't write this kind of code, the compiler is allowed to
> >> generate it.
> >
> > But there's no such code in Emacs, and will never be.
> 
> I think you have too little faith in the ingenuity of compiler writers. 
> Why can't the compiler generate this sort of code in cases we don't 
> anticipate?

Because no matter how ingenious the compiler writers are, they cannot
produce code that will trigger GC where we didn't write such code to
begin with.  As long as there's no GC, the above is harmless.

> > Lisp objects we
> > create are either temporaries that can be GC'ed, or values that cannot
> > be GC'ed, in which case they are passed to some other code, either a
> > callee or returned as a value.  The only ones that can be dead as
> > above are the first variety, about which we don't care.
> 
> When this assumption stops holding, it's going to be very difficult to 
> debug the resulting occasional crashes. Wouldn't it be easier to use the 
> information *already in the memory tree* to make GC more conservative 
> and understand interior pointers?

I don't see why such assumptions should stop holding: we write code as
part of the Lisp interpreter, not as just any C program.  So creating
Lisp objects just to fiddle with their C-side internals will never
make sense, and code like that will always be rejected or rewritten.




* Re: Conservative GC isn't safe
  2016-11-28 17:37               ` Eli Zaretskii
  2016-11-28 17:49                 ` Stefan Monnier
@ 2016-11-28 19:09                 ` Ken Raeburn
  2016-11-28 19:33                   ` Eli Zaretskii
  1 sibling, 1 reply; 46+ messages in thread
From: Ken Raeburn @ 2016-11-28 19:09 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, Stefan Monnier, emacs-devel


> On Nov 28, 2016, at 12:37, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: Ken Raeburn <raeburn@raeburn.org>,  eggert@cs.ucla.edu,  emacs-devel@gnu.org
>> Date: Mon, 28 Nov 2016 11:15:28 -0500
>> 
>>>> res = Fvector (ito - ifrom, aref_addr (string, ifrom));
>>> But 'string' still references the contents array of the vector, so GC
>>> will mark it when it comes to 'string'.
>> 
>> But after computing "aref_addr (string, ifrom)", it may very well be
>> that `string` is dead and the compiler may then decide not to write
>> it into the stack.
> 
> It will still be there in the caller.

Not if Fsubstring (or whatever function) is applied directly to the return value from some other call.  As Björn described, the caller may not keep it around at all, just shift it from register to register, and the callee can scribble over those registers.  Even if there’s an automatic variable assigned the value, if it’s not used later, the compiler may optimize it away; I’ve lost track of how many times GDB has indicated to me that it recognizes a variable name but at the current point in the code it’s been optimized out so GDB can’t show it to me.



* Re: Conservative GC isn't safe
  2016-11-28 18:03             ` Stefan Monnier
@ 2016-11-28 19:18               ` Daniel Colascione
  2016-11-28 19:33                 ` Stefan Monnier
  2016-11-28 19:37                 ` Eli Zaretskii
  2016-11-28 19:26               ` Andreas Schwab
  1 sibling, 2 replies; 46+ messages in thread
From: Daniel Colascione @ 2016-11-28 19:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

On 11/28/2016 10:03 AM, Stefan Monnier wrote:
>>>> How would you assert dynamically that if an interval is reachable, its
>>>> owning string or buffer must be too?
>>> You don't.  You check it statically (by a human).
>>>> It's not enough for the variable holding the reference to the string
>>>> or buffer to be in scope: you have to be sure that the reference
>>>> isn't dead.
>>> It should be: if it's in scope, it's not dead.
>> That's not the case.
>
> In general, no, but in this specific case I think it always will.
> E.g. because in order to be in the process of working on the intervals of
> a buffer, that buffer needs to be not just reachable but buffer-live-p,
> so you'd have to mess with buffer-alist before the buffer can be
> reclaimed, which is highly unlikely to happen within the functions that
> manipulate intervals.
>
> For strings, the argument might not be as strong.  Maybe code like
> (ignore (propertize "foo" 'a 'b)) could lead to us working on an
> unreachable string, so it could get GC'd while we manipulate its
> intervals.
>
> So for those cases, I guess the main safety argument we have is that we
> will not call the GC while we're in the middle of manipulating struct
> interval objects.

It's not just strings and buffers and intervals. What about cons cells? 
There's nothing wrong with getting a cons from something, doing 
something that might GC with its car, then doing something that might GC 
with its cdr. There's nothing stopping the compiler from keeping a 
pointer to the cdr instead of the car and indexing when it's time to 
dereference the cons and get the cdr out of it.

It's legal for the compiler to emit code that does things like this. 
That it hasn't happened yet is no guarantee that it won't in the future.
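
A small illustration of the cons case, as a sketch only: toy_cons is a
made-up two-word cell, not Emacs's struct Lisp_Cons, and the helper
names are hypothetical. It shows that a pointer to the cdr slot never
compares equal to the cell start, and that the start can be recovered
only by code that knows the slot offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-word cell standing in for a cons. */
struct toy_cons {
    intptr_t car;
    intptr_t cdr;
};

/* The address a compiler might keep live after it is done with the
   car: a pointer to the cdr slot, which is interior to the cell. */
static intptr_t *cdr_slot(struct toy_cons *c)
{
    return &c->cdr;
}

/* Exact-match test, as a start-only marker would apply it. */
static int matches_start(const struct toy_cons *c, const void *candidate)
{
    return (uintptr_t) candidate == (uintptr_t) c;
}

/* Recover the cell start from a cdr-slot pointer by subtracting the
   slot's offset within the cell. */
static struct toy_cons *cons_from_cdr_slot(intptr_t *slot)
{
    assert(slot != NULL);
    return (struct toy_cons *) ((char *) slot
                                - offsetof(struct toy_cons, cdr));
}
```

If the cdr-slot pointer is the only surviving reference when GC runs,
matches_start fails for it and a start-only marker would treat the cell
as garbage.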

Let me ask again: we already have all the runtime data we need for more 
conservative GC. Where is the resistance to the idea coming from?




* Re: Conservative GC isn't safe
  2016-11-28 18:03             ` Stefan Monnier
  2016-11-28 19:18               ` Daniel Colascione
@ 2016-11-28 19:26               ` Andreas Schwab
  2016-11-28 19:34                 ` Stefan Monnier
  1 sibling, 1 reply; 46+ messages in thread
From: Andreas Schwab @ 2016-11-28 19:26 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Daniel Colascione, emacs-devel

On Nov 28 2016, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> For strings, the argument might not be as strong.  Maybe code like
> (ignore (propertize "foo" 'a 'b)) could lead to us working on an
> unreachable string, so it could get GC'd while we manipulate its
> intervals.

The arguments to propertize will still be reachable from the bytecode
stack.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."




* Re: Conservative GC isn't safe
  2016-11-28 19:18               ` Daniel Colascione
@ 2016-11-28 19:33                 ` Stefan Monnier
  2016-11-28 19:37                 ` Eli Zaretskii
  1 sibling, 0 replies; 46+ messages in thread
From: Stefan Monnier @ 2016-11-28 19:33 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel

> It's not just strings and buffers and intervals.

The case of intervals is different because pointers to them from the
stack aren't recognized by our conservative stack scanner (CSS), so they're
more at risk, regardless of funny compiler shenanigans.

> What about cons cells?  There's nothing wrong with getting a cons from
> something, doing something that might GC with its car, then doing
> something that might GC with its cdr.  There's nothing stopping the
> compiler from keeping a pointer to the cdr instead of the car and
> indexing when it's time to dereference the cons and get the cdr out
> of it.

Read Boehm&Chase's article: indeed, it's legal for compilers to do that.

But Emacs is not the only program using CSS, so there is some amount of
pressure to try and make sure C compilers don't make life impossible
for CSS.

> Let me ask again: we already have all the runtime data we need for more
> conservative GC. Where is the resistance to the idea coming from?

A few reasons I can think of for this inertia:
- The problem is hypothetical.
- Even if you pay attention to internal pointers, there are still
  (hypothetical) cases that won't be caught.
- No one volunteered to write it.
- It will likely increase the CPU cost of stack scanning.
- It will likely increase the amount of garbage we erroneously keep alive.


        Stefan




* Re: Conservative GC isn't safe
  2016-11-28 19:09                 ` Ken Raeburn
@ 2016-11-28 19:33                   ` Eli Zaretskii
  2016-11-29  8:49                     ` Ken Raeburn
  0 siblings, 1 reply; 46+ messages in thread
From: Eli Zaretskii @ 2016-11-28 19:33 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: eggert, monnier, emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Mon, 28 Nov 2016 14:09:44 -0500
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>,
>  eggert@cs.ucla.edu,
>  emacs-devel@gnu.org
> 
> >> But after computing "aref_addr (string, ifrom)", it may very well be
> >> that `string` is dead and the compiler may then decide not to write
> >> it into the stack.
> > 
> > It will still be there in the caller.
> 
> Not if Fsubstring (or whatever function) is applied directly to the return value from some other call.

No, it still will be there.

You need to keep in mind how these variables get to Lisp, and for what
purposes.  Our code is written for certain purposes, and that mandates
how data flows in the code.  If you go high enough up the stack, you
will find every Lisp object value that matters at some level.

> As Björn described, the caller may not keep it around at all, just shift it from register to register, and the callee can scribble over those registers.

The number of registers is finite, even in the x86_64 architectures.
The compiler cannot keep storing values in registers, because it needs
to leave enough of them available for the next level of function
invocation, about which it knows nothing before that function is
called.  Go deep enough, and values will be pushed on the stack
because the compiler needs to free a register.

> Even if there’s an automatic variable assigned the value, if it’s not used later, the compiler may optimize it away; I’ve lost track of how many times GDB has indicated to me that it recognizes a variable name but at the current point in the code it’s been optimized out so GDB can’t show it to me.

The fact that GDB cannot tell you the value, and instead throws the
"optimized out" thing at you, is just a sign that GCC didn't leave
behind enough information for GDB to get its act together, and/or that
GDB has bugs, and/or that DWARF is not expressive enough to cover some
ingenious optimization technique.  It doesn't necessarily tell
anything about where the value really is.  If anything, it means the
value is not in a register, i.e. likely on the stack (where else?).




* Re: Conservative GC isn't safe
  2016-11-28 19:26               ` Andreas Schwab
@ 2016-11-28 19:34                 ` Stefan Monnier
  0 siblings, 0 replies; 46+ messages in thread
From: Stefan Monnier @ 2016-11-28 19:34 UTC (permalink / raw)
  To: emacs-devel

>> For strings, the argument might not be as strong.  Maybe code like
>> (ignore (propertize "foo" 'a 'b)) could lead to us working on an
>> unreachable string, so it could get GC'd while we manipulate its
>> intervals.
> The arguments to propertize will still be reachable from the bytecode
> stack.

I was giving this code as an example of the kind of operations and the
order in which they're performed.


        Stefan





* Re: Conservative GC isn't safe
  2016-11-28 19:18               ` Daniel Colascione
  2016-11-28 19:33                 ` Stefan Monnier
@ 2016-11-28 19:37                 ` Eli Zaretskii
  2016-11-28 19:40                   ` Daniel Colascione
  1 sibling, 1 reply; 46+ messages in thread
From: Eli Zaretskii @ 2016-11-28 19:37 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: monnier, emacs-devel

> From: Daniel Colascione <dancol@dancol.org>
> Date: Mon, 28 Nov 2016 11:18:32 -0800
> Cc: emacs-devel@gnu.org
> 
> Let me ask again: we already have all the runtime data we need for more 
> conservative GC. Where is the resistance to the idea coming from?

I already answered that up-thread: it will be dead code, and thus will
likely do the wrong thing if it ever runs.

I also suggested what to do instead: add assertions that express what
we believe should never happen.  Stefan says doing that is unlikely to
be justified by the dangers, but if we think so, then we shouldn't be
afraid of the problem in the first place.  If we are, then adding
assertions is the way to go.




* Re: Conservative GC isn't safe
  2016-11-28 19:37                 ` Eli Zaretskii
@ 2016-11-28 19:40                   ` Daniel Colascione
  2016-11-28 20:03                     ` Eli Zaretskii
  0 siblings, 1 reply; 46+ messages in thread
From: Daniel Colascione @ 2016-11-28 19:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

On 11/28/2016 11:37 AM, Eli Zaretskii wrote:
>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Mon, 28 Nov 2016 11:18:32 -0800
>> Cc: emacs-devel@gnu.org
>>
>> Let me ask again: we already have all the runtime data we need for more
>> conservative GC. Where is the resistance to the idea coming from?
>
> I already answered that up-thread: it will be dead code, and thus will
> likely do the wrong thing if it ever runs.

We'll always have stray pointers to object interiors that will exercise 
these code paths. I've broken enough of them recently enough to know.

> I also suggested what to do instead: add assertions that express what
> we believe should never happen.  Stefan says doing that is unlikely to
> be justified by the dangers, but if we think so, then we shouldn't be
> afraid of the problem in the first place.  If we are, then adding
> assertions is the way to go.

It's not possible to assert, statically or dynamically, that we don't 
have this problem. If we could, we wouldn't need conservative GC at all, 
since we'd know the exact locations of all pointers.




* Re: Conservative GC isn't safe
  2016-11-28 19:40                   ` Daniel Colascione
@ 2016-11-28 20:03                     ` Eli Zaretskii
  2016-11-28 20:09                       ` Daniel Colascione
  0 siblings, 1 reply; 46+ messages in thread
From: Eli Zaretskii @ 2016-11-28 20:03 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: monnier, emacs-devel

> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Mon, 28 Nov 2016 11:40:22 -0800
> 
> > I already answered that up-thread: it will be dead code, and thus will
> > likely do the wrong thing if it ever runs.
> 
> We'll always have stray pointers to object interiors that will exercise 
> these code paths. I've broken enough of them recently enough to know.

We are not talking about broken code.  It's easy to get yourself hung
in Emacs by writing bad code.  Writing additions to GC to detect bad
code is not a good use of CPU cycles and of our time.

> > I also suggested what to do instead: add assertions that express what
> > we believe should never happen.  Stefan says doing that is unlikely to
> > be justified by the dangers, but if we think so, then we shouldn't be
> > afraid of the problem in the first place.  If we are, then adding
> > assertions is the way to go.
> 
> It's not possible to assert, statically or dynamically, that we don't 
> have this problem.

If we cannot assert the invariants, either we don't really understand
the problem, or the problem doesn't exist in the first place (or
both).




* Re: Conservative GC isn't safe
  2016-11-28 20:03                     ` Eli Zaretskii
@ 2016-11-28 20:09                       ` Daniel Colascione
  0 siblings, 0 replies; 46+ messages in thread
From: Daniel Colascione @ 2016-11-28 20:09 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

On 11/28/2016 12:03 PM, Eli Zaretskii wrote:
>> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Mon, 28 Nov 2016 11:40:22 -0800
>>
>>> I already answered that up-thread: it will be dead code, and thus will
>>> likely do the wrong thing if it ever runs.
>>
>> We'll always have stray pointers to object interiors that will exercise
>> these code paths. I've broken enough of them recently enough to know.
>
> We are not talking about broken code.  It's easy to get yourself hung
> in Emacs by writing bad code.  Writing additions to GC to detect bad
> code is not a good use of CPU cycles and of our time.

What are you talking about? No matter how good the Emacs C code is, the 
compiler is legally allowed to generate code that trips up the current 
conservative GC. Code that looks perfectly safe may end up being 
compiled into something that trips up the current GC scheme. Even if it 
is possible to arrange C code such that, at runtime, we always keep GC 
roots around, the rules must be very subtle.

We can make a simple, cheap modification to the GC to plug this hole 
once and for all. Other systems that use conservative GC have made 
similar modifications. I still don't understand why you're resisting 
this change when I've outlined a clear way in which things might go wrong.
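
As a rough illustration of the kind of change meant here, the sketch below contrasts an exact-match lookup with one that also accepts interior pointers. It uses a toy heap of fixed-size cells; `toy_block`, `toy_find_exact` and `toy_find_interior` are hypothetical names made up for this example and are not the real alloc.c data structures or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy heap: one block of fixed-size cells.  This is not
   Emacs's memory tree, only an illustration of the two lookups.  */
enum { TOY_CELL_SIZE = 16, TOY_NCELLS = 8 };

struct toy_block
{
  unsigned char cells[TOY_NCELLS][TOY_CELL_SIZE];
};

/* Exact-match policy: accept P only if it is the start of a cell.
   Returns the cell start, or NULL if P is not an object start.  */
void *
toy_find_exact (struct toy_block *b, void *p)
{
  uintptr_t base = (uintptr_t) b->cells;
  uintptr_t addr = (uintptr_t) p;
  if (addr < base || addr >= base + sizeof b->cells)
    return NULL;                       /* not in this block at all */
  size_t off = addr - base;
  return off % TOY_CELL_SIZE == 0 ? p : NULL;
}

/* More conservative policy: accept any address inside a cell and
   return the start of the cell that contains it.  */
void *
toy_find_interior (struct toy_block *b, void *p)
{
  uintptr_t base = (uintptr_t) b->cells;
  uintptr_t addr = (uintptr_t) p;
  if (addr < base || addr >= base + sizeof b->cells)
    return NULL;                       /* not in this block at all */
  size_t off = addr - base;
  return b->cells[off / TOY_CELL_SIZE]; /* round down to cell start */
}
```

With the exact-match policy, a pointer five bytes into a cell is ignored and the cell looks unreferenced; with the interior policy, the same pointer resolves to the cell's start, so the cell would be kept alive.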

You're the one always talking about future-proofing the C core so that 
it needs less expert maintenance. This is a change that will 
future-proof the Emacs C core so that it needs less expert maintenance.

>>> I also suggested what to do instead: add assertions that express what
>>> we believe should never happen.  Stefan says doing that is unlikely to
>>> be justified by the dangers, but if we think so, then we shouldn't be
>>> afraid of the problem in the first place.  If we are, then adding
>>> assertions is the way to go.
>>
>> It's not possible to assert, statically or dynamically, that we don't
>> have this problem.
>
> If we cannot assert the invariants, either we don't really understand
> the problem, or the problem doesn't exist in the first place (or
> both).

It's possible, from a CS perspective, to do a reachability analysis on 
the generated machine code and prove that all accessible objects have 
live head pointers. I think. But adding an analysis like this to the 
build process would be a research project in itself. It's easier to just 
make the GC more conservative.




* Re: Conservative GC isn't safe
  2016-11-28 19:33                   ` Eli Zaretskii
@ 2016-11-29  8:49                     ` Ken Raeburn
  0 siblings, 0 replies; 46+ messages in thread
From: Ken Raeburn @ 2016-11-29  8:49 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, monnier, emacs-devel

On Nov 28, 2016, at 14:33, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Ken Raeburn <raeburn@raeburn.org>
>> Date: Mon, 28 Nov 2016 14:09:44 -0500
>> Cc: Stefan Monnier <monnier@iro.umontreal.ca>,
>> eggert@cs.ucla.edu,
>> emacs-devel@gnu.org
>> 
>>>> But after computing "aref_addr (string, ifrom)", it may very well be
>>>> that `string` is dead and the compiler may then decide not to write
>>>> it into the stack.
>>> 
>>> It will still be there in the caller.
>> 
>> Not if Fsubstring (or whatever function) is applied directly to the return value from some other call.
> 
> No, it still will be there.

Not anywhere saved by this part of the call chain.

> You need to keep in mind how these variables get to Lisp, and for what
> purposes.  Our code is written for certain purposes, and that mandates
> how data flows in the code.  If you go high enough up the stack, you
> will find every Lisp object value that matters at some level.

I’m talking mainly about calls between C functions in Emacs.  Some original data may come from Lisp code, way up the stack, but if some C routine applies Fdowncase() or Fmapcar() or Fconcat() to the Lisp-supplied values, and does further manipulation on them, then we’re dealing with objects generated in the C code and not referenced by the Lisp stack.

> 
>> As Björn described, the caller may not keep it around at all, just shift it from register to register, and the callee can scribble over those registers.
> 
> The number of registers is finite, even in the x86_64 architectures.
> The compiler cannot keep storing values in registers, because it needs
> to leave enough of them available for the next level of function
> invocation, about which it knows nothing before that function is
> called.  Go deep enough, and values will be pushed on the stack
> because the compiler needs to free a register.

Or they get discarded, when the compiler sees that they’re not needed any more.  Not when they go out of scope, but when they’re no longer actually used in the function.
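
A minimal sketch of the pattern in question, assuming a made-up `toy_string` type and `toy_sum_range` function (neither is from the Emacs sources): once the two interior pointers have been computed, the pointer to the start of the object is never used again, so an optimizing compiler may well stop keeping it anywhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a heap-allocated string object; this is
   not Emacs's real struct Lisp_String.  */
struct toy_string
{
  size_t size;
  unsigned char data[64];
};

/* Sum the bytes in [from, to).  After `p` and `end` are computed, `s`
   is never used again, so the compiler is free to drop it and keep
   only the two interior pointers in registers.  */
unsigned
toy_sum_range (struct toy_string *s, size_t from, size_t to)
{
  assert (from <= to && to <= s->size);
  const unsigned char *p = s->data + from;   /* interior pointer */
  const unsigned char *end = s->data + to;   /* interior pointer */
  unsigned sum = 0;
  while (p < end)
    sum += *p++;   /* only `p` and `end` are needed from here on */
  return sum;
}
```

Whether the head pointer actually survives in a register or stack slot during the loop depends on the compiler, the optimization level and the caller; the C source alone does not guarantee it either way.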

>> Even if there’s an automatic variable assigned the value, if it’s not used later, the compiler may optimize it away; I’ve lost track of how many times GDB has indicated to me that it recognizes a variable name but at the current point in the code it’s been optimized out so GDB can’t show it to me.
> 
> The fact that GDB cannot tell you the value, and instead throws the
> "optimized out" thing at you, is just a sign that GCC didn't leave
> behind enough information for GDB to get its act together, and/or that
> GDB has bugs, and/or that DWARF is not expressive enough to cover some
> ingenious optimization technique.  It doesn't necessarily tell
> anything about where the value really is.  If anything, it means the
> value is not in a register, i.e. likely on the stack (where else?).

Bugs in one or more of the tools are certainly possibilities, but there really are cases where the value just doesn’t exist in any particular location, memory or register, because it’s not needed and the registers can be reused for something else.





Thread overview: 46+ messages
2016-11-26  8:11 Conservative GC isn't safe Daniel Colascione
2016-11-26  8:30 ` Paul Eggert
2016-11-26  8:33   ` Daniel Colascione
2016-11-26  9:01     ` Eli Zaretskii
2016-11-26  9:04       ` Daniel Colascione
2016-11-26  9:24         ` Eli Zaretskii
2016-11-26 15:05         ` Stefan Monnier
2016-11-26 15:21           ` Camm Maguire
2016-11-28 17:51           ` Daniel Colascione
2016-11-28 18:00             ` Eli Zaretskii
2016-11-28 18:03               ` Daniel Colascione
2016-11-28 18:50                 ` Eli Zaretskii
2016-11-28 18:03             ` Stefan Monnier
2016-11-28 19:18               ` Daniel Colascione
2016-11-28 19:33                 ` Stefan Monnier
2016-11-28 19:37                 ` Eli Zaretskii
2016-11-28 19:40                   ` Daniel Colascione
2016-11-28 20:03                     ` Eli Zaretskii
2016-11-28 20:09                       ` Daniel Colascione
2016-11-28 19:26               ` Andreas Schwab
2016-11-28 19:34                 ` Stefan Monnier
2016-11-26 15:03   ` Stefan Monnier
2016-11-26 15:12     ` Eli Zaretskii
2016-11-26 16:29       ` Stefan Monnier
2016-11-26 16:42         ` Eli Zaretskii
2016-11-26 18:43           ` Stefan Monnier
2016-11-27  6:17     ` Ken Raeburn
2016-11-27 15:39       ` Eli Zaretskii
2016-11-28  9:50         ` Ken Raeburn
2016-11-28 15:55           ` Eli Zaretskii
2016-11-27 16:15       ` Paul Eggert
2016-11-28  9:36         ` Ken Raeburn
2016-11-28 15:55           ` Eli Zaretskii
2016-11-28 16:15             ` Stefan Monnier
2016-11-28 17:37               ` Eli Zaretskii
2016-11-28 17:49                 ` Stefan Monnier
2016-11-28 17:57                   ` Eli Zaretskii
2016-11-28 18:05                     ` Stefan Monnier
2016-11-28 19:09                 ` Ken Raeburn
2016-11-28 19:33                   ` Eli Zaretskii
2016-11-29  8:49                     ` Ken Raeburn
2016-11-28 17:03             ` Björn Lindqvist
2016-11-28 16:13           ` Paul Eggert
2016-11-27 16:52       ` Stefan Monnier
2016-11-26 19:08 ` Pip Cet
2016-11-27  0:24   ` Paul Eggert
