* New GC concept @ 2021-06-04 3:30 Daniel Colascione 2021-06-04 8:00 ` Daniel Mendler ` (2 more replies) 0 siblings, 3 replies; 16+ messages in thread From: Daniel Colascione @ 2021-06-04 3:30 UTC (permalink / raw) To: emacs-devel Emacs has had the same GC for a decent amount of time now (since the 1980s, really). I spent some time in 2020 rewriting it from scratch. I haven't had time to work on the new GC recently, but I figure I'd throw it out here to get some feedback on the general concept. Check out https://github.com/dcolascione/emacs-1/blob/newgc-wip/src/alloc.c, specifically the big doc comment on top The new GC basically replaces alloc.c and a few other things. It has a few cool features: * fully copying and compacting * special treatment of sxhash to preserve object identify even while we move it around in memory * generational * contiguous storage of mark bits separately from the data heap * concurrent (in design, not current implementation): idea is that we do concurrent marking and barely pause for sweep * small string optimization * bump pointer allocation of new objects * heap enumeration support * hard requirement on pdumper * specialized GC spaces for conses, strings, arrays, and so on: no stupid header word for cons cells bloating memory use by 50%! * cool modern C implementation that relies heavily on compiler inlining and constant propagation The current implementation is deficient in many ways. Honestly, I'm not even sure whether that specific revision compiles. But like I said, I haven't had time recently to continue work on it. Still, I'm still curious about what people think of the overall effort. It might work nicely with the new native compilation stuff, giving us a managed code execution environment kind-of, sort-of on par with the big modern managed-code runtimes. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: New GC concept 2021-06-04 3:30 New GC concept Daniel Colascione @ 2021-06-04 8:00 ` Daniel Mendler 2021-06-04 9:47 ` Daniel Colascione 2021-06-04 8:56 ` Andrea Corallo via Emacs development discussions. 2021-06-07 17:32 ` Matt Armstrong 2 siblings, 1 reply; 16+ messages in thread From: Daniel Mendler @ 2021-06-04 8:00 UTC (permalink / raw) To: Daniel Colascione, emacs-devel Interesting, thank you for working on this! I had hoped that a new GC would surface at some point given the recent improvements regarding native compilation. As you say this can bring Emacs on par with modern managed language runtimes. Can you elaborate a bit more of some of your concepts? > * fully copying and compacting How do you ensure that compaction works together with the conservative stack scanning? You pin memory blocks, which are potentially referenced by the stack? > * generational Do you rely on the OS memory protection as a write barrier to separate the different generations, similar to how the bdwgc does that? > * specialized GC spaces for conses, strings, arrays, and so on: no stupid header word for cons cells bloating memory use by 50%! Avoiding the headers is a good idea. You are using a first fit strategy to find the next free space for a new object. How do you use the headerless approach for objects of different sizes? Isn't it the case that every block should then contain objects of only a single type and of a single size? Probably most of the objects fall in a few size classes, so it may be possible to do better than first fit? Overall is your design similar to the approach of the bdwgc plus that a memory/object layout tailored to the needs of Emacs and the compaction? How well does such a GC hold up to a GC which is precise and does not rely on the OS facilities for barriers? It appears such a precise GC is impossible to retrofit on the existing Elisp runtime, so I assume your approach is the right way to go. Daniel Mendler ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: New GC concept 2021-06-04 8:00 ` Daniel Mendler @ 2021-06-04 9:47 ` Daniel Colascione 2021-06-04 10:50 ` Eli Zaretskii 2021-06-04 11:06 ` Daniel Mendler 0 siblings, 2 replies; 16+ messages in thread From: Daniel Colascione @ 2021-06-04 9:47 UTC (permalink / raw) To: Daniel Mendler, emacs-devel On 6/4/21 1:00 AM, Daniel Mendler wrote: > Interesting, thank you for working on this! > > I had hoped that a new GC would surface at some point given the recent > improvements regarding native compilation. As you say this can bring > Emacs on par with modern managed language runtimes. Can you elaborate a > bit more of some of your concepts? Thanks for taking a look. >> * fully copying and compacting > How do you ensure that compaction works together with the conservative > stack scanning? You pin memory blocks, which are potentially referenced > by the stack? Yes, pinning is how we combine conservative stack scanning with a copying collector. We don't pin whole memory blocks though. We pin at object granularity. (Things like Hosking's "mostly copying collector" use block pinning IIRC, but we can do much better these days.) Just as each object has a mark bit, each object has a pin bit. We pin only those specific objects that conservative scanning flags as potentially referenced from native code. We still copy pinned objects from the from-space to the to-space actually --- it's important copy pinned objects because it's during copying that we update all the pointers that a pinned object might contain. Pinning just ensures that we copy in a specific way such that after we're done with GC and swap the to-space and from-space, each pinned object ends up at the same virtual address it had before GC started. This way, although we *do* copy pinned objects, the mutator never observes a pinned object changing position. The pin bits end up using very little memory because they're stored contiguously in side arrays and almost entirely zero, and each zero page shares the same backing RAM until something makes it non-zero. Like I mentioned in the new alloc.c, address space is abundant. >> * generational > Do you rely on the OS memory protection as a write barrier to separate > the different generations, similar to how the bdwgc does that? Correct. IMHO, it's not practical to retrofit write barriers or read barriers into Emacs. >> * specialized GC spaces for conses, strings, arrays, and so on: no > stupid header word for cons cells bloating memory use by 50%! > > Avoiding the headers is a good idea. You are using a first fit strategy > to find the next free space for a new object. How do you use the > headerless approach for objects of different sizes? We don't. :-) In the new GC, the overall Emacs heap is divided into "heaps" for various object types; each heap has its own list of blocks and its own heap memory format. The heaps for fixed-size objects like cons cells and intervals don't have headers. The heaps for variable-sized objects like strings and vectorlikes *do* use conventional object headers. > Isn't it the case > that every block should then contain objects of only a single type and > of a single size? Some heaps (most importantly, the vectorlike heap) do support variable-sized objects, and blocks belonging to these heap types contain a mixture of object types. > Probably most of the objects fall in a few size > classes, so it may be possible to do better than first fit? First-fit is better than it sounds in the context of a compacting collector. First-fit allocation always (except in two cases described below) succeeds on the first try: because each GC pass compacts all the objects at the "start" of the heap, and we start first-fit allocation from the end of the last compacted object. That's why I wrote that the first-fit allocation scheme is equivalent in practice to bump-pointer allocation. The two cases where we fail first-fit allocation are: 1) we're in a heap that supports variable-sized objects and there's not enough space in the current block to hold the object we're allocating, and 2) there's a pinned object "in the way" of where we want to place the object via first-fit allocation. #1 isn't a problem in practice: if we're trying to allocate an object that's too big to place in the tail of the object's heap's current block, we allocate a new block and put the new object there instead. The new object is guaranteed to fit in the new block because we allocate larger-than-block objects in a separate storage area, as is traditional in GCs of this sort. (See the large vector list.) When we move to a new block this way, we don't commit the memory of the tail of the previous block, so moving to the next block is practically free, modulo page-tail wastage. #2 isn't a problem either: pinned objects are rare, and when we do encounter one, we can "skip over" it efficiently using the free-object bitmap. Modern machines are really good at streaming analysis of bit arrays: we don't even need a freelist embedded in the heap, like Emacs currently has for conses. Scanning a bitmap is both simpler and kinder to the cache. Because pinned objects are rare, because pins are transient, and because each GC pass is a compacting pass, first-fit doesn't lead to the fragmentation that it normally causes in things like malloc implementations. > Overall is your design similar to the approach of the bdwgc plus that a > memory/object layout tailored to the needs of Emacs and the compaction? > How well does such a GC hold up to a GC which is precise and does not > rely on the OS facilities for barriers? It appears such a precise GC is > impossible to retrofit on the existing Elisp runtime, so I assume your > approach is the right way to go. Most other GCs use software barriers, true. But even that's changing. Relying on OS facilities for barriers has an important advantage: it reduces code size. If we used the non-OS-level facility in a native compilation world, we'd have to emit a write barrier before *every* mutator write (~6 instructions). These barriers add up and bloat the generated code. If we use OS memory protection instead, the generated code can be a lot smaller. Plus, using OS facilities, we don't have to change the rest of the Emacs C core. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: New GC concept 2021-06-04 9:47 ` Daniel Colascione @ 2021-06-04 10:50 ` Eli Zaretskii 2021-06-21 13:00 ` Fejfighter 2021-06-04 11:06 ` Daniel Mendler 1 sibling, 1 reply; 16+ messages in thread From: Eli Zaretskii @ 2021-06-04 10:50 UTC (permalink / raw) To: Daniel Colascione; +Cc: mail, emacs-devel > From: Daniel Colascione <dancol@dancol.org> > Date: Fri, 4 Jun 2021 02:47:32 -0700 > > On 6/4/21 1:00 AM, Daniel Mendler wrote: > > > Interesting, thank you for working on this! > > > > I had hoped that a new GC would surface at some point given the recent > > improvements regarding native compilation. As you say this can bring > > Emacs on par with modern managed language runtimes. Can you elaborate a > > bit more of some of your concepts? > > Thanks for taking a look. Seconded. And I really hope that more than just a look (and a discussion) will come out of this. Making our GC more efficient is an important development goal, of which over the years we've seen several attempts, but unfortunately little advancement in practice. I hope interested individuals will step forward and continue developing this (or any other) initiative so that we will eventually be able to replace our GC with a better one. Thanks in advance for developing this aspect of Emacs. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: New GC concept 2021-06-04 10:50 ` Eli Zaretskii @ 2021-06-21 13:00 ` Fejfighter 2021-06-21 13:31 ` Eli Zaretskii 2021-06-21 22:43 ` Daniel Colascione 0 siblings, 2 replies; 16+ messages in thread From: Fejfighter @ 2021-06-21 13:00 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mail, Daniel Colascione, emacs-devel [-- Attachment #1: Type: text/plain, Size: 2005 bytes --] Not wanting to see this drop, I found a little time to bring the changes up to emacs master with a few notes: 1) now sits on top of a4fb5811f (Do not attempt to write .elc....) 2) The C will compile with a couple of warnings as the native comp cases are currently not handled. 3) There's dead code, commented code and other atrocities on top of the current wip 3) It segfaults during the pdumper step in the build, there is not immediately obvious reason for this, but I suspect a move or free occurs and it's not tracked 4) I have ignored the comments for large_vector and large_vector_meta for now, so the meta is kept with the vector struct. code is up here: https://github.com/fejfighter/emacs/tree/feature/newgc I'm hoping now that it's a little closer to master and compiling someone might have a little more luck with the segfault issue I have been facing but I will keep try in the mean time, JeffW On Fri, Jun 4, 2021 at 8:52 PM Eli Zaretskii <eliz@gnu.org> wrote: > > From: Daniel Colascione <dancol@dancol.org> > > Date: Fri, 4 Jun 2021 02:47:32 -0700 > > > > On 6/4/21 1:00 AM, Daniel Mendler wrote: > > > > > Interesting, thank you for working on this! > > > > > > I had hoped that a new GC would surface at some point given the recent > > > improvements regarding native compilation. As you say this can bring > > > Emacs on par with modern managed language runtimes. Can you elaborate a > > > bit more of some of your concepts? > > > > Thanks for taking a look. > > Seconded. > > And I really hope that more than just a look (and a discussion) will > come out of this. Making our GC more efficient is an important > development goal, of which over the years we've seen several attempts, > but unfortunately little advancement in practice. I hope interested > individuals will step forward and continue developing this (or any > other) initiative so that we will eventually be able to replace our GC > with a better one. > > Thanks in advance for developing this aspect of Emacs. > > [-- Attachment #2: Type: text/html, Size: 2715 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: New GC concept 2021-06-21 13:00 ` Fejfighter @ 2021-06-21 13:31 ` Eli Zaretskii 2021-06-21 22:43 ` Daniel Colascione 1 sibling, 0 replies; 16+ messages in thread From: Eli Zaretskii @ 2021-06-21 13:31 UTC (permalink / raw) To: Fejfighter; +Cc: mail, dancol, emacs-devel > From: Fejfighter <fejfighter@gmail.com> > Date: Mon, 21 Jun 2021 23:00:09 +1000 > Cc: mail@daniel-mendler.de, Daniel Colascione <dancol@dancol.org>, > emacs-devel@gnu.org > > Not wanting to see this drop, I found a little time to bring the changes up to emacs master with a few notes: Thanks for working on this. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: New GC concept 2021-06-21 13:00 ` Fejfighter 2021-06-21 13:31 ` Eli Zaretskii @ 2021-06-21 22:43 ` Daniel Colascione 2021-07-24 13:39 ` Fejfighter 1 sibling, 1 reply; 16+ messages in thread From: Daniel Colascione @ 2021-06-21 22:43 UTC (permalink / raw) To: Fejfighter, Eli Zaretskii; +Cc: mail, emacs-devel On 6/21/21 6:00 AM, Fejfighter wrote: > Not wanting to see this drop, I found a little time to bring the > changes up to emacs master with a few notes: > > 1) now sits on top of a4fb5811f (Do not attempt to write .elc....) Awesome. Thanks! > 2) The C will compile with a couple of warnings as the native comp > cases are currently not handled. > 3) There's dead code, commented code and other atrocities on top of > the current wip Yep. There were also plenty of atrocities in the original. :-) > 3) It segfaults during the pdumper step in the build, there is not > immediately obvious reason for this, but I suspect a move or free > occurs and it's not tracked > 4) I have ignored the comments for large_vector and large_vector_meta > for now, so the meta is kept with the vector struct. Yeah. That's probably a minor optimization, but we should get around to completing it. The next big chunk of work is actually implementing concurrent marking. > code is up here: > https://github.com/fejfighter/emacs/tree/feature/newgc > <https://github.com/fejfighter/emacs/tree/feature/newgc> > I'm hoping now that it's a little closer to master and compiling > someone might have a little more luck with the segfault issue I have > been facing but I will keep try in the mean time, FWIW, I find rr [1] to be exceptionally useful for diagnosing segfaults like this. [1] https://rr-project.org/ ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: New GC concept 2021-06-21 22:43 ` Daniel Colascione @ 2021-07-24 13:39 ` Fejfighter 0 siblings, 0 replies; 16+ messages in thread From: Fejfighter @ 2021-07-24 13:39 UTC (permalink / raw) To: Daniel Colascione; +Cc: mail, Eli Zaretskii, emacs-devel [-- Attachment #1: Type: text/plain, Size: 1630 bytes --] On Tue, Jun 22, 2021 at 8:43 AM Daniel Colascione <dancol@dancol.org> wrote: > On 6/21/21 6:00 AM, Fejfighter wrote: > > > > code is up here: > > https://github.com/fejfighter/emacs/tree/feature/newgc > > <https://github.com/fejfighter/emacs/tree/feature/newgc> > > I'm hoping now that it's a little closer to master and compiling > > someone might have a little more luck with the segfault issue I have > > been facing but I will keep try in the mean time, > > FWIW, I find rr [1] to be exceptionally useful for diagnosing segfaults > like this. > > > [1] https://rr-project.org/ > While I had no luck with rr, I did trace that particular issue to not unprotecting memory, which got me through a little further to a point of a compacting sweep. This is where I have been spinning my wheels for the last few weeks in the short bursts of time I get to look at this codebase. The problem is highlighted by xxx_check_obarray, but will show up in future reads, where the ob array is not swept and does not get the updated references. I feel like it might be a special case where a global vector has references, because when marking we traverse the array as required, but this does not occur when sweeping, however, it does not affect other vector-like things. I think I will need to update the obarray, but simply calling `scan_vectorlike(XPNTR(Vobarray), GC_PHASE_SWEEP);` has bad values at that point. I'm hoping that this will either jog your memory and provide some background or get someone curious that might have a different understanding of how it all interacts and we can get over this particular hump, Thanks, Jeff W [-- Attachment #2: Type: text/html, Size: 2479 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: New GC concept 2021-06-04 9:47 ` Daniel Colascione 2021-06-04 10:50 ` Eli Zaretskii @ 2021-06-04 11:06 ` Daniel Mendler 1 sibling, 0 replies; 16+ messages in thread From: Daniel Mendler @ 2021-06-04 11:06 UTC (permalink / raw) To: Daniel Colascione, emacs-devel Thank you for answering my questions. Did you also consider the approach of using a non-copying collector, keeping the classical mark and sweep? Andrea mentioned in his mail that the focus should be on reactivity. With mark and sweep it is possible to avoid the copying step, which scales with the live size, such that one can achieve constant pause times. On the other hand the copying step is probably quick for the expected Emacs heap sizes. Furthermore with m&s, you have the fragmentation problem, it is harder to use such a bump-style allocator and it is harder to separate the generations, which is a requirement for the hardware write barrier. So I think overall your design is a sound approach as long as the heap stays at a reasonable size. Your approach seems to be quite general purpose and does not require intrusive changes to the Emacs C code; it seems to be relatively decoupled from the Elisp runtime. Do you think it is realistic to implement the GC as a "library" behind some abstract interface? Of course there is some dependence on the object memory layout, but the GC interface could offer APIs to request objects of different types. It should then be possible to use different GCs which use a similar approach (conservative stack scanning, no explicit read/write barriers). Daniel Mendler ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: New GC concept 2021-06-04 3:30 New GC concept Daniel Colascione 2021-06-04 8:00 ` Daniel Mendler @ 2021-06-04 8:56 ` Andrea Corallo via Emacs development discussions. 2021-06-07 17:32 ` Matt Armstrong 2 siblings, 0 replies; 16+ messages in thread From: Andrea Corallo via Emacs development discussions. @ 2021-06-04 8:56 UTC (permalink / raw) To: Daniel Colascione; +Cc: emacs-devel Daniel Colascione <dancol@dancol.org> writes: > Emacs has had the same GC for a decent amount of time now (since the > 1980s, really). I spent some time in 2020 rewriting it from scratch. I > haven't had time to work on the new GC recently, but I figure I'd > throw it out here to get some feedback on the general concept. > > Check out > https://github.com/dcolascione/emacs-1/blob/newgc-wip/src/alloc.c, > specifically the big doc comment on top > > The new GC basically replaces alloc.c and a few other things. It has a > few cool features: > > * fully copying and compacting > > * special treatment of sxhash to preserve object identify even while > we move it around in memory > > * generational > > * contiguous storage of mark bits separately from the data heap > > * concurrent (in design, not current implementation): idea is that we > do concurrent marking and barely pause for sweep > > * small string optimization > > * bump pointer allocation of new objects > > * heap enumeration support > > * hard requirement on pdumper > > * specialized GC spaces for conses, strings, arrays, and so on: no > stupid header word for cons cells bloating memory use by 50%! > > * cool modern C implementation that relies heavily on compiler > inlining and constant propagation > > The current implementation is deficient in many ways. Honestly, I'm > not even sure whether that specific revision compiles. But like I > said, I haven't had time recently to continue work on it. > > Still, I'm still curious about what people think of the overall > effort. It might work nicely with the new native compilation stuff, > giving us a managed code execution environment kind-of, sort-of on par > with the big modern managed-code runtimes. Sounds cool! The only comment I've so far is that IMO *the* important feature for a new Emacs GC is to have it concurrent (or say concurrent as much as possible). Emacs user experience is often dictated by its reactivity, we need to head towards a GC that is concurrent prioritizing in the design this feature over others, I wouldn't mind sacrificing some efficiency for that. I like the idea of a moving/generational GC but possibily porting what we have to a tri-color mark and sweep would solve already the problem with less impact. This is what I would have tried if I had time. Thanks for this work! Andrea ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: New GC concept 2021-06-04 3:30 New GC concept Daniel Colascione 2021-06-04 8:00 ` Daniel Mendler 2021-06-04 8:56 ` Andrea Corallo via Emacs development discussions. @ 2021-06-07 17:32 ` Matt Armstrong 2021-06-07 18:03 ` Daniel Colascione 2 siblings, 1 reply; 16+ messages in thread From: Matt Armstrong @ 2021-06-07 17:32 UTC (permalink / raw) To: Daniel Colascione, emacs-devel Daniel Colascione <dancol@dancol.org> writes: > Emacs has had the same GC for a decent amount of time now (since the > 1980s, really). I spent some time in 2020 rewriting it from scratch. I > haven't had time to work on the new GC recently, but I figure I'd throw > it out here to get some feedback on the general concept. > > Check out > https://github.com/dcolascione/emacs-1/blob/newgc-wip/src/alloc.c, > specifically the big doc comment on top Hey Daniel, I am no GC expert but I'm liking this a lot. I love the block comments in your alloc.c -- very clear and easy to understand. You're in a uniquely good position to work on this. I hope you continue! I'm curious about the answer to one of the unanswered questions in your alloc.c FAQ: What about systems without virtual memory? Asked another way: can we reasonably expect to entirely replace the current GC with this new one? Are there platforms Emacs supports today that would be left behind? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: New GC concept 2021-06-07 17:32 ` Matt Armstrong @ 2021-06-07 18:03 ` Daniel Colascione 2021-06-07 19:51 ` Daniele Nicolodi 0 siblings, 1 reply; 16+ messages in thread From: Daniel Colascione @ 2021-06-07 18:03 UTC (permalink / raw) To: Matt Armstrong, emacs-devel On 6/7/21 10:32 AM, Matt Armstrong wrote: > Daniel Colascione <dancol@dancol.org> writes: > >> Emacs has had the same GC for a decent amount of time now (since the >> 1980s, really). I spent some time in 2020 rewriting it from scratch. I >> haven't had time to work on the new GC recently, but I figure I'd throw >> it out here to get some feedback on the general concept. >> >> Check out >> https://github.com/dcolascione/emacs-1/blob/newgc-wip/src/alloc.c, >> specifically the big doc comment on top > Hey Daniel, I am no GC expert but I'm liking this a lot. I love the > block comments in your alloc.c -- very clear and easy to understand. > You're in a uniquely good position to work on this. I hope you > continue! Thank you! > I'm curious about the answer to one of the unanswered questions in your > alloc.c FAQ: What about systems without virtual memory? Asked another > way: can we reasonably expect to entirely replace the current GC with > this new one? Are there platforms Emacs supports today that would be > left behind? We can definitely replace the existing GC with the new GC everywhere. I've designed the new GC to work on systems without virtual memory facilities. On these systems, we'll have to run the GC in non-concurrent, non-generational mode, but that's no regression from what we have today. We'll also probably want to use a smaller block size on these systems to reduce fragmentation overhead. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: New GC concept 2021-06-07 18:03 ` Daniel Colascione @ 2021-06-07 19:51 ` Daniele Nicolodi 2021-06-08 2:22 ` Eli Zaretskii 0 siblings, 1 reply; 16+ messages in thread From: Daniele Nicolodi @ 2021-06-07 19:51 UTC (permalink / raw) To: emacs-devel On 07/06/2021 20:03, Daniel Colascione wrote: > On 6/7/21 10:32 AM, Matt Armstrong wrote: >> I'm curious about the answer to one of the unanswered questions in your >> alloc.c FAQ: What about systems without virtual memory? Asked another >> way: can we reasonably expect to entirely replace the current GC with >> this new one? Are there platforms Emacs supports today that would be >> left behind? > > We can definitely replace the existing GC with the new GC everywhere. > I've designed the new GC to work on systems without virtual memory > facilities. On these systems, we'll have to run the GC in > non-concurrent, non-generational mode, but that's no regression from > what we have today. We'll also probably want to use a smaller block size > on these systems to reduce fragmentation overhead. Isn't DOS the only system in this class? (It is not a rhetorical question: a while ago I asked which systems are officially supports and the answer was that all systems that currently run Emacs are supported). Does it make sense to still support DOS? Cheers, Dan ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: New GC concept 2021-06-07 19:51 ` Daniele Nicolodi @ 2021-06-08 2:22 ` Eli Zaretskii 2021-06-21 22:58 ` Daniel Colascione 0 siblings, 1 reply; 16+ messages in thread From: Eli Zaretskii @ 2021-06-08 2:22 UTC (permalink / raw) To: Daniele Nicolodi; +Cc: emacs-devel > From: Daniele Nicolodi <daniele@grinta.net> > Date: Mon, 7 Jun 2021 21:51:45 +0200 > > > We can definitely replace the existing GC with the new GC everywhere. > > I've designed the new GC to work on systems without virtual memory > > facilities. On these systems, we'll have to run the GC in > > non-concurrent, non-generational mode, but that's no regression from > > what we have today. We'll also probably want to use a smaller block size > > on these systems to reduce fragmentation overhead. > > Isn't DOS the only system in this class? (It is not a rhetorical > question: a while ago I asked which systems are officially supports and > the answer was that all systems that currently run Emacs are supported). > > Does it make sense to still support DOS? The development environment which is used to build the MS-DOS port of Emacs (DJGPP) does support virtual memory (IIUC what that means in this context). ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: New GC concept 2021-06-08 2:22 ` Eli Zaretskii @ 2021-06-21 22:58 ` Daniel Colascione 2021-06-22 12:59 ` Eli Zaretskii 0 siblings, 1 reply; 16+ messages in thread From: Daniel Colascione @ 2021-06-21 22:58 UTC (permalink / raw) To: Eli Zaretskii, Daniele Nicolodi; +Cc: emacs-devel On 6/7/21 7:22 PM, Eli Zaretskii wrote: >> From: Daniele Nicolodi <daniele@grinta.net> >> Date: Mon, 7 Jun 2021 21:51:45 +0200 >> >>> We can definitely replace the existing GC with the new GC everywhere. >>> I've designed the new GC to work on systems without virtual memory >>> facilities. On these systems, we'll have to run the GC in >>> non-concurrent, non-generational mode, but that's no regression from >>> what we have today. We'll also probably want to use a smaller block size >>> on these systems to reduce fragmentation overhead. >> Isn't DOS the only system in this class? (It is not a rhetorical >> question: a while ago I asked which systems are officially supports and >> the answer was that all systems that currently run Emacs are supported). >> >> Does it make sense to still support DOS? > The development environment which is used to build the MS-DOS port of > Emacs (DJGPP) does support virtual memory (IIUC what that means in > this context). Oh, right. I completely forgot that we have DPMI. It's been a very long time since I looked at that. Does DJGPP provide DPMI 0.9 or 1.0? To get generational GC under DJGPP, we'll need something like a SIGSEGV handler, a bit of code that we run when the CPU signals a memory protection fault. I think we get there by installing an exception interrupt handler, as in http://www.delorie.com/djgpp/doc/dpmi/ch4.5.html, and I think it'll work in both DPMI 0.9 and 1.0. Another thing we need for generational GC is the ability to mark a range of pages read-only, as with mprotect. I think DPMI gives us the ability to change page permissions, but 0.9 does not. See http://www.delorie.com/djgpp/doc/dpmi/api/310507.html The other thing we get with VM is the ability to swap the from-space and the to-space without an additional memory copy. DPMI 1.0 appears to provide a shared memory facility that would let us do that (the equivalent of mmap/MapViewOfFile of an anonymous segment), but I'm not sure that DPMI 0.9 gives us that ability. Anyway, even if it is theoretically possible to implement the new GC's fancy VM stuff in terms of DPMI, I think it should have lower priority than the rest of the system. The new GC run without virtual memory use at all should still be no worse overall than the current GC, so MS-DOS Emacs at least wouldn't see a regression if we switched to a version of the new GC that didn't understand DPMI. But DPMI support for the new GC would definitely be a fun retro computing project. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: New GC concept 2021-06-21 22:58 ` Daniel Colascione @ 2021-06-22 12:59 ` Eli Zaretskii 0 siblings, 0 replies; 16+ messages in thread From: Eli Zaretskii @ 2021-06-22 12:59 UTC (permalink / raw) To: Daniel Colascione; +Cc: daniele, emacs-devel > Cc: emacs-devel@gnu.org > From: Daniel Colascione <dancol@dancol.org> > Date: Mon, 21 Jun 2021 15:58:20 -0700 > > > The development environment which is used to build the MS-DOS port of > > Emacs (DJGPP) does support virtual memory (IIUC what that means in > > this context). > > Oh, right. I completely forgot that we have DPMI. > > It's been a very long time since I looked at that. Does DJGPP provide > DPMI 0.9 or 1.0? The DPMI provider which comes with DJGPP supports DPMI 0.9 with some extensions. (If one runs a DJGPP program on MS-Windows, one gets what Windows provides instead, which is DPMI 0.9.) > To get generational GC under DJGPP, we'll need something like a SIGSEGV > handler, a bit of code that we run when the CPU signals a memory > protection fault. I think we get there by installing an exception > interrupt handler, as in > http://www.delorie.com/djgpp/doc/dpmi/ch4.5.html, and I think it'll work > in both DPMI 0.9 and 1.0. Yes, this is supported. > Another thing we need for generational GC is > the ability to mark a range of pages read-only, as with mprotect. I > think DPMI gives us the ability to change page permissions, but 0.9 does > not. See http://www.delorie.com/djgpp/doc/dpmi/api/310507.html DJGPP has mprotect. It indeed requires DPMI 1.0, but it is also one of the extensions supported by the DPMI provider that comes with DJGPP. > The other thing we get with VM is the ability to swap the from-space and > the to-space without an additional memory copy. DPMI 1.0 appears to > provide a shared memory facility that would let us do that (the > equivalent of mmap/MapViewOfFile of an anonymous segment), but I'm not > sure that DPMI 0.9 gives us that ability. Right, this requires DPMI 1.0. > Anyway, even if it is theoretically possible to implement the new GC's > fancy VM stuff in terms of DPMI, I think it should have lower priority > than the rest of the system. The new GC run without virtual memory use > at all should still be no worse overall than the current GC, so MS-DOS > Emacs at least wouldn't see a regression if we switched to a version of > the new GC that didn't understand DPMI. Yes, definitely. ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2021-07-24 13:39 UTC | newest] Thread overview: 16+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2021-06-04 3:30 New GC concept Daniel Colascione 2021-06-04 8:00 ` Daniel Mendler 2021-06-04 9:47 ` Daniel Colascione 2021-06-04 10:50 ` Eli Zaretskii 2021-06-21 13:00 ` Fejfighter 2021-06-21 13:31 ` Eli Zaretskii 2021-06-21 22:43 ` Daniel Colascione 2021-07-24 13:39 ` Fejfighter 2021-06-04 11:06 ` Daniel Mendler 2021-06-04 8:56 ` Andrea Corallo via Emacs development discussions. 2021-06-07 17:32 ` Matt Armstrong 2021-06-07 18:03 ` Daniel Colascione 2021-06-07 19:51 ` Daniele Nicolodi 2021-06-08 2:22 ` Eli Zaretskii 2021-06-21 22:58 ` Daniel Colascione 2021-06-22 12:59 ` Eli Zaretskii
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).