* Re: Emacs-devel Digest, Vol 246, Issue 17
       [not found] <mailman.39.1723910423.12184.emacs-devel@gnu.org>
@ 2024-08-17 22:49 ` ali_gnu2
  2024-08-18  0:10   ` Po Lu
  0 siblings, 1 reply; 112+ messages in thread
From: ali_gnu2 @ 2024-08-17 22:49 UTC (permalink / raw)
  To: emacs-devel

On 8/17/24 10:00 AM, emacs-devel-request@gnu.org wrote:
>> Moreover, whatever becomes of the portable ELF unexec should
>> not affect the Solaris unexec, which is provided by the operating system
>> and should function without the likes of gmalloc.
> AFAIK, the portable dumper is the default also on Solaris, so there is
> no need to keep the unexec build around just for that platform.

The Solaris code that does that is called dldump() and was
invented years ago (~25 years?) to support emacs. We used to
get occasional bug reports about emacs not dumping from time
to time, and dldump() put an end to that.

I'm the person who maintains that code in Solaris, and also the
person who packages Emacs for our platform. We stopped using the
unexec code the moment the portable dumper arrived, and haven't
looked back. I don't think we'd even notice if unexec() went away.

There are open source variants of Solaris for whom I don't
speak, but from what I know about our common code, they should
not be any more stuck on unexec() than we are. pdumper really
doesn't use any unix features that didn't exist decades ago.

Thanks for caring, but don't let us slow this down. The portable
dumper is The Way.

- Ali

^ permalink raw reply	[flat|nested] 112+ messages in thread
* Re: Emacs-devel Digest, Vol 246, Issue 17
From: Po Lu @ 2024-08-18 0:10 UTC
  To: ali_gnu2; +Cc: emacs-devel

ali_gnu2@emvision.com writes:

> The Solaris code that does that is called dldump() and was
> invented years ago (~25 years?) to support emacs. We used to
> get occasional bug reports about emacs not dumping from time
> to time, and dldump() put an end to that.
>
> I'm the person who maintains that code in Solaris, and also the
> person who packages Emacs for our platform. We stopped using the
> unexec code the moment the portable dumper arrived, and haven't
> looked back. I don't think we'd even notice if unexec() went away.
>
> Thanks for caring, but don't let us slow this down. The portable
> dumper is The Way.
>
> - Ali

Hello Ali!

I think you underestimate the number of programs using dldump. I've
seen both Perl 5 and GNU Make hacked to save state with dldump, on
Oracle Solaris, producing binaries that don't depend on the presence of
a state file and probably start faster as well. Meanwhile
pdumper-dumped binaries appear to crash in an x86 Solaris 10 zone,
though I don't really use this configuration and I'm not interested in
trying the portable dumper on sparc:

core 'core' of 7021: ../../src/bootstrap-emacs -batch --no-site-file --no-site-lisp -f batc
 00007fffaf433dc2 ???????? ()
 00007fffaf5eb3d7 ???????? ()
 00007fffaf5ec590 ???????? ()
 00007fffae3f351a _lwp_kill () + a
 00007fffae3981b9 raise () + 19
 00000000008baf90 terminate_due_to_signal () + c0
 000000000090236e ???????? ()
 0000000000902334 deliver_thread_signal () + 74
 00000000009023b0 deliver_fatal_thread_signal () + 10
 00000000009024ef handle_sigsegv () + 4f
 00007fffae3edd16 __sighndlr () + 6
 00007fffae3e25e2 call_user_handler () + 252
 00007fffae3e280e sigacthandler () + ee
 00007fffaf5ea82d ???????? ()
 ffffffffffffffff ???????? ()
 00000000009c77e7 lisp_align_malloc () + 4d7
 00000000009c9dd2 make_float () + 42
 00000000009d2e9d init_alloc () + d
 00000000008bd373 main () + bb3
 00000000006d15ab ???????? ()

> There are open source variants of Solaris for whom I don't
> speak, but from what I know about our common code, they should
> not be any more stuck on unexec() than we are. pdumper really
> doesn't use any unix features that didn't exist decades ago.

I don't believe we try to support Illumos. If Emacs should work, more
power to them, but they have bigger fish to fry when GCC exception
handling fails if an exception is raised the instant an object is
unmapped, prompting dl_iterate_phdr to return -1.
* Re: Emacs-devel Digest, Vol 246, Issue 17
From: Po Lu @ 2024-08-18 0:19 UTC
  To: ali_gnu2; +Cc: emacs-devel

Po Lu <luangruo@yahoo.com> writes:

> I don't believe we try to support Illumos. If Emacs should work, more
> power to them, but they have bigger fish to fry when GCC exception
> handling fails if an exception is raised the instant an object is
> unmapped, prompting dl_iterate_phdr to return -1.

As when an iconv_t is closed while an exception is raised in another
thread.
* Solaris dldump (was: Pure space)
From: ali_gnu2 @ 2024-08-18 1:15 UTC
  To: Po Lu; +Cc: emacs-devel

On 8/17/24 6:10 PM, Po Lu wrote:
> Hello Ali!
>
> I think you underestimate the number of programs using dldump. I've
> seen both Perl 5 and GNU Make hacked to save state with dldump, on
> Oracle Solaris, producing binaries that don't depend on the presence of
> a state file and probably start faster as well. Meanwhile
> pdumper-dumped binaries appear to crash in an x86 Solaris 10 zone,
> though I don't really use this configuration and I'm not interested in
> trying the portable dumper on sparc:
>
> core 'core' of 7021: ../../src/bootstrap-emacs -batch --no-site-file --no-site-lisp -f batc
>  00007fffaf433dc2 ???????? ()
>  00007fffaf5eb3d7 ???????? ()
>  00007fffaf5ec590 ???????? ()
>  00007fffae3f351a _lwp_kill () + a
>  00007fffae3981b9 raise () + 19
>  00000000008baf90 terminate_due_to_signal () + c0
>  000000000090236e ???????? ()
>  0000000000902334 deliver_thread_signal () + 74
>  00000000009023b0 deliver_fatal_thread_signal () + 10
>  00000000009024ef handle_sigsegv () + 4f
>  00007fffae3edd16 __sighndlr () + 6
>  00007fffae3e25e2 call_user_handler () + 252
>  00007fffae3e280e sigacthandler () + ee
>  00007fffaf5ea82d ???????? ()
>  ffffffffffffffff ???????? ()
>  00000000009c77e7 lisp_align_malloc () + 4d7
>  00000000009c9dd2 make_float () + 42
>  00000000009d2e9d init_alloc () + d
>  00000000008bd373 main () + bb3
>  00000000006d15ab ???????? ()
>

Hello!

Is that stack from the s10 zone?

You're probably right that I don't know who is using
dldump(), outside of emacs, but not to worry, it's not
going away. It's a committed interface, so hard to remove,
and at the same time, isn't causing any problems. Nonetheless,
it's not our favored way to deploy emacs, and I wouldn't want
anyone to think we prefer its use, or require it.

We use pdumper on newer Solaris 11.4, both x86 and sparc,
with no reported issues. I wasn't aware of the Solaris 10
zone problems (haven't seen any reports). If you end up
looking at it, and think that the s10 zone is somehow at
fault, please feel free to contact me offline. However,
given that s10 is 20 years old, it wouldn't be unreasonable
to drop it off the support tail. From discussions with
Rainer Orth, who maintains gcc for Solaris, I believe
that s10 support for gcc has ended, or is very close to
ending. My personal opinion is that anyone happy to use a
20 year old OS should have no problem using an older gcc,
or emacs, so it's not really the end of the road for those
folks.

>> There are open source variants of Solaris for whom I don't
>> speak, but from what I know about our common code, they should
>> not be any more stuck on unexec() than we are. pdumper really
>> doesn't use any unix features that didn't exist decades ago.
>
> I don't believe we try to support Illumos. If Emacs should work, more
> power to them, but they have bigger fish to fry when GCC exception
> handling fails if an exception is raised the instant an object is
> unmapped, prompting dl_iterate_phdr to return -1.

I expect that we're both benefiting from your work anyway.
Isn't emacs still largely C (not C++)? I wouldn't expect
exception handling to be needed, so maybe it's OK.

I do know that dl_iterate_phdr() is a relatively recent
addition for us, and was done after the split, so that is
a case where the code is not common. No doubt the fix for
Illumos would not be difficult, if/when they get to it.

Thanks!

- Ali
* Re: Solaris dldump
From: Po Lu @ 2024-08-18 1:25 UTC
  To: ali_gnu2; +Cc: emacs-devel

ali_gnu2@emvision.com writes:

> On 8/17/24 6:10 PM, Po Lu wrote:
>> Hello Ali!
>> I think you underestimate the number of programs using dldump. I've
>> seen both Perl 5 and GNU Make hacked to save state with dldump, on
>> Oracle Solaris, producing binaries that don't depend on the presence of
>> a state file and probably start faster as well. Meanwhile
>> pdumper-dumped binaries appear to crash in an x86 Solaris 10 zone,
>> though I don't really use this configuration and I'm not interested in
>> trying the portable dumper on sparc:
>> [...]
>
> Hello!
>
> Is that stack from the s10 zone?

Yes.

> You're probably right that I don't know who is using
> dldump(), outside of emacs, but not to worry, it's not
> going away. It's a committed interface, so hard to remove,
> and at the same time, isn't causing any problems. Nonetheless,
> it's not our favored way to deploy emacs, and I wouldn't want
> anyone to think we prefer its use, or require it.
>
> We use pdumper on newer Solaris 11.4, both x86 and sparc,
> with no reported issues. I wasn't aware of the Solaris 10
> zone problems (haven't seen any reports). If you end up
> looking at it

I plan to, but not till Emacs 30 is released.

> and think that the s10 zone is somehow at fault, please feel free to
> contact me offline. However, given that s10 is 20 years old, it
> wouldn't be unreasonable to drop it off the support tail. From
> discussions with Rainer Orth, who maintains gcc for Solaris, I believe
> that s10 support for gcc has ended, or is very close to ending. My
> personal opinion is that anyone happy to use a 20 year old OS should
> have no problem using an older gcc, or emacs, so it's not really the
> end of the road for those folks.

I'm fine with using an older C compiler (whether GCC or no), but we have
plenty of precedent in these quarters for remaining on decades-old
operating systems. Not least when the operating system is to be
supported for two more years.

> I expect that we're both benefiting from your work anyway.
> Isn't emacs still largely C (not C++)? I wouldn't expect
> exception handling to be needed, so maybe it's OK.

C, but several libraries draw in C++ dependencies and others create
threads: HarfBuzz and librsvg for example.
* Re: Solaris dldump
From: Stefan Kangas @ 2024-08-18 22:27 UTC
  To: Po Lu, ali_gnu2; +Cc: emacs-devel

Po Lu <luangruo@yahoo.com> writes:

> ali_gnu2@emvision.com writes:
>
>> [...] However, given that s10 is 20 years old, it
>> wouldn't be unreasonable to drop it off the support tail. From
>> discussions with Rainer Orth, who maintains gcc for Solaris, I believe
>> that s10 support for gcc has ended, or is very close to ending. My
>> personal opinion is that anyone happy to use a 20 year old OS should
>> have no problem using an older gcc, or emacs, so it's not really the
>> end of the road for those folks.

Thank you for sharing your informed opinion. I also can't see why we
should consider the 20 year old Solaris 10 a blocker for removing the
unexec build in Emacs 31.

For example, even according to current Oracle communications, it will
reach EOL in around two years. This means that by the time we release
Emacs 31, users will already be busy moving to Solaris 11. If they
aren't, they're fine on Emacs 30, or they can help us fix pdumper.

> I'm fine with using an older C compiler (whether GCC or no), but we have
> plenty of precedent in these quarters for remaining on decades-old
> operating systems. Not least when the operating system is to be
> supported for two more years.

If there is interest in that very old proprietary system, and there is
some problem with using pdumper there, then users should report bugs and
volunteers should step up to fix them.
* Re: Solaris dldump
From: Po Lu @ 2024-08-18 23:56 UTC
  To: Stefan Kangas; +Cc: ali_gnu2, emacs-devel

Stefan Kangas <stefankangas@gmail.com> writes:

> Thank you for sharing your informed opinion. I also can't see why we
> should consider the 20 year old Solaris 10 a blocker for removing the
> unexec build in Emacs 31.
>
> For example, even according to current Oracle communications, it will
> reach EOL in around two years.

"Even" implies that it will reach EOL sooner, but by all indications the
EOL date will be as stated, if it is not postponed any further, and
Oracle and related organizations will continue to support the operating
system at a reduced intensity indefinitely. Why do you suppose this is,
if otherwise than because the operating system is abundantly used?

Fedora 40's remaining support period is shorter; should we not cease to
support it any longer, in view of the one or two crashes in the PGTK
configuration that can only be reproduced with the distribution
packages, and which continue to languish on the bug tracker?

> This means that by the time we release Emacs 31, users will already be
> busy moving to Solaris 11

This is nonsense. It's impossible to upgrade installed Solaris 10
systems to Solaris 11, and being a robust system many users are content
to remain there till hell freezes over.

> If they aren't, they're fine on Emacs 30, or they can help us fix
> pdumper.

No one is ever "fine on" an outdated text editor. You agree that this
principle applies to operating systems, but when Emacs is in question,
the about-face comes very quickly.

It's a waste of my time (and my organization's) that would be totally
needless if you were not so trigger-happy with old and proven features.
It's a-ok to retain pure space to avoid burdening someone with very
hypothetical additional labor, but it's not possible to take a far less
radical measure to conserve my time. In any event, I promised to devote
some of it to this issue after Emacs 30 is released.

> If there is interest in that very old proprietary system, and there is
> some problem with using pdumper there, then users should report bugs and
> volunteers should step up to fix them.

According to Microsoft, Windows XP reached EOL in 2014, and yet its
users are none the less inclined to the latest releases of Emacs (nor
has it been prevented from retaining 0.38% of Windows's aggregate market
share, in excess of Windows 8's 0.24):

  https://gs.statcounter.com/os-version-market-share/windows/desktop/worldwide.
* Re: Solaris dldump
From: Eli Zaretskii @ 2024-08-19 11:18 UTC
  To: Po Lu; +Cc: stefankangas, ali_gnu2, emacs-devel

> From: Po Lu <luangruo@yahoo.com>
> Cc: ali_gnu2@emvision.com, emacs-devel@gnu.org
> Date: Mon, 19 Aug 2024 07:56:13 +0800
>
> Stefan Kangas <stefankangas@gmail.com> writes:
>
> > Thank you for sharing your informed opinion. I also can't see why we
> > should consider the 20 year old Solaris 10 a blocker for removing the
> > unexec build in Emacs 31.
> >
> > For example, even according to current Oracle communications, it will
> > reach EOL in around two years.
>
> "Even" implies that it will reach EOL sooner, but by all indications the
> EOL date will be as stated, if it is not postponed any further, and
> Oracle and related organizations will continue to support the operating
> system at a reduced intensity indefinitely. Why do you suppose this is,
> if otherwise than because the operating system is abundantly used?
>
> Fedora 40's remaining support period is shorter; should we not cease to
> support it any longer, in view of the one or two crashes in the PGTK
> configuration that can only be reproduced with the distribution
> packages, and which continue to languish on the bug tracker?

These aspects are almost unrelated to the issue at hand: we don't make
our decisions of dropping support of some platform or feature because
it is EOLed by its vendor or developers. Instead, we make our own
decisions, and in general try not to drop any feature/platform if we
don't have to.

In this case, keeping the support of unexec longer becomes a
maintenance burden (just look at the #ifdef mess it requires), and
that is the reason why we think we should drop those platforms that
don't currently support pdumper. The fact that all those platforms
are either very old or have better alternatives is just a supporting
consideration, not the main reason.

> It's a waste of my time (and my organization's) that would be totally
> needless if you were not so trigger-happy with old and proven features.

We are very far from being "trigger-happy" in these matters. In fact,
we are often accused of the opposite. E.g., Gnulib dropped support for
some of these platforms long ago, and couldn't be convinced to
reconsider, even when told that Emacs needs that continued support.
So what you say above is completely uncalled-for and unfair.

> It's a-ok to retain pure space to avoid burdening someone with very
> hypothetical additional labor, but it's not possible to take a far less
> radical measure to conserve my time. In any event, I promised to devote
> some of it to this issue after Emacs 30 is released.

If you intend to work on modifying the unexec code to not use pure
space, don't waste your time: I will object to any serious development
of the unexec code. The only way forward for the platforms that
currently need unexec is to start using pdumper.

> > If there is interest in that very old proprietary system, and there is
> > some problem with using pdumper there, then users should report bugs and
> > volunteers should step up to fix them.
>
> According to Microsoft, Windows XP reached EOL in 2014, and yet its
> users are none the less inclined to the latest releases of Emacs (nor
> has it been prevented from retaining 0.38% of Windows's aggregate market
> share, in excess of Windows 8's 0.24):
>
> https://gs.statcounter.com/os-version-market-share/windows/desktop/worldwide.

Once again, it is immaterial when a platform was EOLed. That is not
the reason why we want to drop unexec.
* Re: Solaris dldump
From: Po Lu @ 2024-08-19 12:09 UTC
  To: Eli Zaretskii; +Cc: stefankangas, ali_gnu2, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> In this case, keeping the support of unexec longer becomes a
> maintenance burden (just look at the #ifdef mess it requires), and
> that is the reason why we think we should drop those platforms that
> don't currently support pdumper. The fact that all those platforms
> are either very old or have better alternatives is just a supporting
> consideration, not the main reason.

You mean the 35 instances of "HAVE_UNEXEC" in C source files, not
excepting the "HAVE_PDUMPER || HAVE_UNEXEC" conditions, or the malloc
and Gnulib flags that aren't necessary on unexsol? I would be as glad
as you to see most of them removed, as they are not significant on the
systems where unexec should be retained.

> If you intend to work on modifying the unexec code to not use pure
> space, don't waste your time: I will object to any serious development
> of the unexec code. The only way forward for the platforms that
> currently need unexec is to start using pdumper.

I need not modify the unexec code, or adapt it to configurations without
pure space, as there simply is no code to adapt. unexsol.c works _now_
with or without pure space, and I would be immensely surprised if the
same were not true of DJGPP, and as it happens, whether in Emacs or
elsewhere.

> Once again, it is immaterial when a platform was EOLed. That is not
> the reason why we want to drop unexec.

That's not what I heard just one message removed from mine, where being
two years from EOL was stated to be sufficient grounds to withdraw
support.
* Re: Solaris dldump
From: Eli Zaretskii @ 2024-08-19 12:50 UTC
  To: Po Lu; +Cc: stefankangas, ali_gnu2, emacs-devel

> From: Po Lu <luangruo@yahoo.com>
> Cc: stefankangas@gmail.com, ali_gnu2@emvision.com, emacs-devel@gnu.org
> Date: Mon, 19 Aug 2024 20:09:36 +0800
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > In this case, keeping the support of unexec longer becomes a
> > maintenance burden (just look at the #ifdef mess it requires), and
> > that is the reason why we think we should drop those platforms that
> > don't currently support pdumper. The fact that all those platforms
> > are either very old or have better alternatives is just a supporting
> > consideration, not the main reason.
>
> You mean the 35 instances of "HAVE_UNEXEC" in C source files, not
> excepting the "HAVE_PDUMPER || HAVE_UNEXEC" conditions, or the malloc
> and Gnulib flags that aren't necessary on unexsol?

I mean all of them, and I also mean the need to understand the fine
details of unexec, the differences between it and pdumper mode, and
the reason for some tricky code we need for unexec. Most of current
frequent contributors to Emacs have no idea about that, and thus the
unexec build is very easy to break by some change that doesn't take it
into account.

> I would be as glad as you to see most of them removed, as they are
> not significant on the systems where unexec should be retained.

They are necessary.

> > Once again, it is immaterial when a platform was EOLed. That is not
> > the reason why we want to drop unexec.
>
> That's not what I heard just one message removed from mine

You've misunderstood what Stefan meant. He was just responding to
your message, nothing more nothing less.
* Re: Solaris dldump
From: Pip Cet @ 2024-08-19 11:44 UTC
  To: Po Lu; +Cc: Stefan Kangas, ali_gnu2, emacs-devel

"Po Lu" <luangruo@yahoo.com> writes:

> It's a-ok to retain pure space to avoid burdening someone with very
> hypothetical additional labor

Wait, I'm not sure I understand that part. How does removing pure space
burden anyone with additional labor, hypothetical or not?

Also, do the systems that don't support pdumper but do support unexec
work without dumping, when running temacs directly? It takes very long
to build Emacs that way, but since we're talking non-free operating
systems it might be acceptable to ask people to cross-compile for now,
kind of like we do for the Android builds where the .elc files are
generated on the build systems.

Pip
* Re: Solaris dldump
From: Po Lu @ 2024-08-19 11:57 UTC
  To: Pip Cet; +Cc: Stefan Kangas, ali_gnu2, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> Wait, I'm not sure I understand that part. How does removing pure space
> burden anyone with additional labor, hypothetical or not?

Isn't this theoretical burden the reason that pure space is not to be
removed except along with unexec?

> Also, do the systems that don't support pdumper but do support unexec
> work without dumping, when running temacs directly? It takes very long
> to build Emacs that way, but since we're talking non-free operating
> systems it might be acceptable to ask people to cross-compile for now,
> kind of like we do for the Android builds where the .elc files are
> generated on the build systems.

There is a substantial segment of our users who don't expect Emacs to
start in 5+ seconds, if only judging by the hullabaloo that erupts
whenever startup performance is threatened or even mildly retarded.
Even in the Android port, this penalty is paid once on installation and
a dump file is retained for subsequent initializations of the same
binary.

Anyway, I want pure space gone as much as any of us, I just don't agree
that taking unexec down with it is justified. Maybe the ELF, XCOFF, and
Windows unexecs, but not the Solaris or DOS ones.
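[Illustration, not part of the thread.] The "pay the start-up cost once,
then reuse the dump for the same binary" scheme described in the message
above can be pictured roughly as follows. This is a minimal Python sketch
of the general idea only, not Emacs's actual code: the function names, the
cache layout, and the use of SHA-256 as the binary fingerprint are all
assumptions made for the example, and `slow_init` merely stands in for the
expensive work of loading the preloaded Lisp files.

```python
import hashlib
from pathlib import Path
from typing import Callable, Tuple


def binary_fingerprint(binary: Path) -> str:
    """Return a hex digest identifying this exact build of the binary.

    Hypothetical helper: hashing the file contents is an assumption
    made for this sketch.
    """
    return hashlib.sha256(binary.read_bytes()).hexdigest()


def dump_path_for(binary: Path, cache_dir: Path) -> Path:
    """Name the dump after the fingerprint, so that a rebuilt binary
    never picks up a dump written by an older build."""
    return cache_dir / f"{binary.name}-{binary_fingerprint(binary)}.dump"


def load_or_create_dump(
    binary: Path, cache_dir: Path, slow_init: Callable[[], bytes]
) -> Tuple[bytes, bool]:
    """Return (state, reused).

    slow_init runs only when no dump exists yet for this binary;
    every later start of the same binary reads the saved dump.
    """
    dump = dump_path_for(binary, cache_dir)
    if dump.exists():
        return dump.read_bytes(), True   # fast path: reuse the earlier dump
    state = slow_init()                  # slow path: paid once per binary
    cache_dir.mkdir(parents=True, exist_ok=True)
    dump.write_bytes(state)
    return state, False
```

Keying the dump on a fingerprint of the binary is what makes the reuse
safe in this sketch: replacing the binary changes the file name looked up,
so the slow path runs again instead of loading a mismatched dump.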
* Re: Solaris dldump
From: Pip Cet @ 2024-08-19 12:10 UTC
  To: Po Lu; +Cc: Stefan Kangas, ali_gnu2, emacs-devel

"Po Lu" <luangruo@yahoo.com> writes:

> Pip Cet <pipcet@protonmail.com> writes:
>> Wait, I'm not sure I understand that part. How does removing pure space
>> burden anyone with additional labor, hypothetical or not?
>
> Isn't this theoretical burden the reason that pure space is not to be
> removed except along with unexec?

Maybe a compromise would be to keep unexec but put it on probation,
promising to remove it if problems arise that cannot be convincingly and
immediately fixed?

>> Also, do the systems that don't support pdumper but do support unexec
>> work without dumping, when running temacs directly? It takes very long
>> to build Emacs that way, but since we're talking non-free operating
>> systems it might be acceptable to ask people to cross-compile for now,
>> kind of like we do for the Android builds where the .elc files are
>> generated on the build systems.
>
> There is a substantial segment of our users who don't expect Emacs to
> start in 5+ seconds, if only judging by the hullabaloo that erupts
> whenever startup performance is threatened or even mildly retarded.

I agree, but I also think some compromise will have to be found.

> Even in the Android port, this penalty is paid once on installation and
> a dump file is retained for subsequent initializations of the same
> binary.

A very clever hack, I must say!

> Anyway, I want pure space gone as much as any of us, I just don't agree
> that taking unexec down with it is justified. Maybe the ELF, XCOFF, and
> Windows unexecs, but not the Solaris or DOS ones.

DOS in particular is what triggered my question: given the limitations
of DOS systems, it's quite possible temacs-as-emacs just wouldn't fly on
those machines.

Pip
* Re: Solaris dldump
From: Eli Zaretskii @ 2024-08-19 12:55 UTC
  To: Pip Cet; +Cc: luangruo, stefankangas, ali_gnu2, emacs-devel

> Date: Mon, 19 Aug 2024 12:10:19 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: Stefan Kangas <stefankangas@gmail.com>, ali_gnu2@emvision.com,
>  emacs-devel@gnu.org
>
> "Po Lu" <luangruo@yahoo.com> writes:
> > Pip Cet <pipcet@protonmail.com> writes:
> >> Wait, I'm not sure I understand that part. How does removing pure space
> >> burden anyone with additional labor, hypothetical or not?
> >
> > Isn't this theoretical burden the reason that pure space is not to be
> > removed except along with unexec?
>
> Maybe a compromise would be to keep unexec but put it on probation,
> promising to remove it if problems arise that cannot be convincingly and
> immediately fixed?

That'd just add to code churn and maintenance burden. So I prefer
removing it to begin with.

> > Anyway, I want pure space gone as much as any of us, I just don't agree
> > that taking unexec down with it is justified. Maybe the ELF, XCOFF, and
> > Windows unexecs, but not the Solaris or DOS ones.
>
> DOS in particular is what triggered my question: given the limitations
> of DOS systems, it's quite possible temacs-as-emacs just wouldn't fly on
> those machines.

Those limitations are not relevant in our case. But this all is
beside the point.
* Re: Solaris dldump
From: Pip Cet @ 2024-08-19 13:46 UTC
  To: Eli Zaretskii; +Cc: luangruo, stefankangas, ali_gnu2, emacs-devel

"Eli Zaretskii" <eliz@gnu.org> writes:

>> Date: Mon, 19 Aug 2024 12:10:19 +0000
>> From: Pip Cet <pipcet@protonmail.com>
>> Cc: Stefan Kangas <stefankangas@gmail.com>, ali_gnu2@emvision.com,
>>  emacs-devel@gnu.org
>>
>> "Po Lu" <luangruo@yahoo.com> writes:
>> > Pip Cet <pipcet@protonmail.com> writes:
>> >> Wait, I'm not sure I understand that part. How does removing pure space
>> >> burden anyone with additional labor, hypothetical or not?
>> >
>> > Isn't this theoretical burden the reason that pure space is not to be
>> > removed except along with unexec?
>>
>> Maybe a compromise would be to keep unexec but put it on probation,
>> promising to remove it if problems arise that cannot be convincingly and
>> immediately fixed?
>
> That'd just add to code churn and maintenance burden. So I prefer
> removing it to begin with.

I've just gone through configure.ac removing all the code that depends
on unexec (no doubt I've missed some), and I must say I now agree it is
time for unexec to go. In particular, it had so far escaped my
attention that it's incompatible with native compilation!

So I'll update the scratch/no-purespace branch to also remove unexec,
and of course I'm offering to help anyone who wants to fix the remaining
non-pdumper ports.

And while I am skeptical of the value of ASLR, it would be really
embarrassing to run into a security issue that's exploitable only
because Emacs disables ASLR for unexec builds.

>> > Anyway, I want pure space gone as much as any of us, I just don't agree
>> > that taking unexec down with it is justified. Maybe the ELF, XCOFF, and
>> > Windows unexecs, but not the Solaris or DOS ones.
>>
>> DOS in particular is what triggered my question: given the limitations
>> of DOS systems, it's quite possible temacs-as-emacs just wouldn't fly on
>> those machines.
>
> Those limitations are not relevant in our case.

I think it's relevant whether DOS will become completely unusable or
merely difficult to use once unexec is removed, and what can be done to
fix it. Does the DOS port work on free DOS clones? And is there a way
to gain access to a Solaris machine to fix pdumper on it?

Pip
* Re: Solaris dldump 2024-08-19 13:46 ` Pip Cet @ 2024-08-19 14:39 ` Eli Zaretskii 2024-08-19 15:26 ` Corwin Brust 2024-08-19 20:51 ` Stefan Kangas 1 sibling, 1 reply; 112+ messages in thread From: Eli Zaretskii @ 2024-08-19 14:39 UTC (permalink / raw) To: Pip Cet; +Cc: luangruo, stefankangas, ali_gnu2, emacs-devel > Date: Mon, 19 Aug 2024 13:46:11 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: luangruo@yahoo.com, stefankangas@gmail.com, ali_gnu2@emvision.com, emacs-devel@gnu.org > > I've just gone through configure.ac removing all the code that depends > on unexec (no doubt I've missed some), and I must say I now agree it is > time for unexec to go. In particular, it had so far escaped my > attention that it's incompatible with native compilation! None of the major new features support unexec; native-compilation was just the first. > And while I am skeptical of the value of ASLR, it would be really > embarrassing to run into a security issue that's exploitable only > because Emacs disables ASLR for unexec builds. We disable ASLR only during the build, AFAIR, not when we run the dumped Emacs. > >> DOS in particular is what triggered my question: given the limitations > >> of DOS systems, it's quite possible temacs-as-emacs just wouldn't fly on > >> those machines. > > > > Those limitations are not relevant in our case. > > I think it's relevant whether DOS will become completely unusable or > merely difficult to use once unexec is removed, and what can be done to > fix it. I was talking specifically about running temacs. > Does the DOS port work on free DOS clones? Yes, AFAIK (although I myself have only run the DOS port on Windows for many years now). > And is there a way to gain access to a Solaris machine to fix > pdumper on it? No idea, perhaps Po Lu does. 
^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Solaris dldump 2024-08-19 14:39 ` Eli Zaretskii @ 2024-08-19 15:26 ` Corwin Brust 2024-08-19 15:31 ` Corwin Brust 0 siblings, 1 reply; 112+ messages in thread From: Corwin Brust @ 2024-08-19 15:26 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Pip Cet, luangruo, stefankangas, ali_gnu2, emacs-devel On Mon, Aug 19, 2024 at 9:39 AM Eli Zaretskii <eliz@gnu.org> wrote: > > > Date: Mon, 19 Aug 2024 13:46:11 +0000 > > From: Pip Cet <pipcet@protonmail.com> > > Cc: luangruo@yahoo.com, stefankangas@gmail.com, ali_gnu2@emvision.com, emacs-devel@gnu.org > > > > > And is there a way to gain access to a Solaris machine to fix > > pdumper on it? > > No idea, perhaps Po Lu does. > It may be worth reaching out to the GCC devs/testers - a peer on the FSF sysadmin team thinks they may have a few machines running Solaris (but isn't sure if 10 or 11 or both). ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Solaris dldump 2024-08-19 15:26 ` Corwin Brust @ 2024-08-19 15:31 ` Corwin Brust 0 siblings, 0 replies; 112+ messages in thread From: Corwin Brust @ 2024-08-19 15:31 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Pip Cet, luangruo, stefankangas, ali_gnu2, emacs-devel On Mon, Aug 19, 2024 at 10:26 AM Corwin Brust <corwin@bru.st> wrote: > > It may be worth reaching out to the GCC devs/testers - a peer on the FSF > sysadmin team thinks they may have a few machines running Solaris (but > isn't sure if 10 or 11 or both). More information! cfarm210: Solaris 10 See: https://portal.cfarm.net/machines/list/ ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Solaris dldump 2024-08-19 13:46 ` Pip Cet 2024-08-19 14:39 ` Eli Zaretskii @ 2024-08-19 20:51 ` Stefan Kangas 1 sibling, 0 replies; 112+ messages in thread From: Stefan Kangas @ 2024-08-19 20:51 UTC (permalink / raw) To: Pip Cet, Eli Zaretskii; +Cc: luangruo, ali_gnu2, emacs-devel Pip Cet <pipcet@protonmail.com> writes: > I've just gone through configure.ac removing all the code that depends > on unexec (no doubt I've missed some), and I must say I now agree it is > time for unexec to go. In particular, it had so far escaped my > attention that it's incompatible with native compilation! > > So I'll update the scratch/no-purespace branch to also remove unexec, > and of course I'm offering to help anyone who wants to fix the remaining > non-pdumper ports. Thanks for working on this. Please push the branch to Savannah when it's ready, and let's take it from there. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Solaris dldump 2024-08-18 23:56 ` Po Lu 2024-08-19 11:18 ` Eli Zaretskii 2024-08-19 11:44 ` Pip Cet @ 2024-08-19 20:35 ` Stefan Kangas 2 siblings, 0 replies; 112+ messages in thread From: Stefan Kangas @ 2024-08-19 20:35 UTC (permalink / raw) To: Po Lu; +Cc: ali_gnu2, emacs-devel Po Lu <luangruo@yahoo.com> writes: > Fedora 40's remaining support period is shorter; should we not cease to > support it any longer, We seem to be miscommunicating. I'm not suggesting explicitly desupporting Solaris 10. Spending considerable time on keeping the unexec build alive has stopped making sense for the project as a whole. The good news is that there is nothing to suggest that the portable dumper should not be fixable on the systems where it's reportedly not yet up to scratch. I'm saying 1) that the reported problems with the portable dumper on some proprietary systems (MS-DOS, Windows 98, Solaris 10) should be fixed, 2) that I do not consider this blocking us from dropping the unexec build at the present time, and 3) I urged volunteers to step forward to improve and/or fix pdumper on these systems. I recommend reading the "Information for Maintainers of GNU Software" manual to get a better view of some of the principles that are guiding my thinking: https://www.gnu.org/prep/maintain/maintain.html#Platforms Note in particular this part: "Supporting other platforms is optional -- we do it when that seems like a good idea, but we don’t consider it obligatory. If the users don’t take care of a certain platform, you may have to desupport it unless and until users come forward to help. Conversely, if a user offers changes to support an additional platform, you will probably want to install them, but you don’t have to. If you feel the changes are complex and ugly, if you think that they will increase the burden of future maintenance, you can and should reject them. 
This includes both free or mainly-free platforms such as OpenBSD, FreeBSD, and NetBSD, and nonfree platforms such as Windows." These are the basics, applicable to the GNU project as a whole. There are special considerations for Emacs, of course, some of which have been indicated in this thread. For example, we are probably "best-in-class" among GNU projects when it comes to supporting various platforms, even very old/obsolete ones, and at the cost of valuable time and resources. So don't let anyone believe that we rush to leave (even fringe) groups of users behind for no good reason. That's just not the case. But we do have certain things that we prioritize ahead of others. That sometimes means asking for volunteer help to do things that are merely secondary, if that helps us advance our primary goals. Right now, it's very clear that the unexec build has reached the end of the road. Thus, we must ask volunteers to help us improve pdumper on systems that, with respect to existing users, are not currently primary considerations. I hope that helps make things more clear. > In any event, I promised to devote some of it to this issue after > Emacs 30 is released. That is good and welcome. Thanks in advance for your efforts. ^ permalink raw reply [flat|nested] 112+ messages in thread
* pdumper on Solaris 10 2024-08-18 0:10 ` Po Lu 2024-08-18 0:19 ` Po Lu 2024-08-18 1:15 ` Solaris dldump (was: Pure space) ali_gnu2 @ 2024-12-08 12:17 ` Pip Cet via Emacs development discussions. 2024-12-08 13:05 ` Eli Zaretskii ` (2 more replies) 2 siblings, 3 replies; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-08 12:17 UTC (permalink / raw) To: Po Lu; +Cc: ali_gnu2, emacs-devel "Po Lu" <luangruo@yahoo.com> writes: > pdumper-dumped binaries appear to crash in an x86 Solaris 10 zone, > though I don't really use this configuration and I'm not interested in > trying the portable dumper on sparc: > > core 'core' of 7021: ../../src/bootstrap-emacs -batch --no-site-file --no-site-lisp -f batc > 00007fffaf433dc2 ???????? () > 00007fffaf5eb3d7 ???????? () > 00007fffaf5ec590 ???????? () > 00007fffae3f351a _lwp_kill () + a > 00007fffae3981b9 raise () + 19 > 00000000008baf90 terminate_due_to_signal () + c0 > 000000000090236e ???????? () > 0000000000902334 deliver_thread_signal () + 74 > 00000000009023b0 deliver_fatal_thread_signal () + 10 > 00000000009024ef handle_sigsegv () + 4f > 00007fffae3edd16 __sighndlr () + 6 > 00007fffae3e25e2 call_user_handler () + 252 > 00007fffae3e280e sigacthandler () + ee > 00007fffaf5ea82d ???????? () > ffffffffffffffff ???????? () > 00000000009c77e7 lisp_align_malloc () + 4d7 > 00000000009c9dd2 make_float () + 42 > 00000000009d2e9d init_alloc () + d > 00000000008bd373 main () + bb3 > 00000000006d15ab ???????? () FWIW, this issue doesn't appear to happen on a "fresh" Solaris 10 install, in a qemu virtual machine, on x86. I used the sol-10-u11-ga-x86-dvd.iso image, installed to a new disk, then installed OpenCSW and built Emacs from the master branch with and without CFLAGS="-m64" (plus the linker path selection). Both builds appear to work. 
What's odd about that backtrace is that lisp_align_malloc in the current build is only 435 bytes long (with -m64), so it's hard to guess which part of the alignment code used to be at offset 0x4d7. But while we're talking about rare and unusual systems, !USE_LSB builds are currently broken except for the WIDE_EMACS_INT case, because the stack scanning code makes no attempt to remove MSB tags. It may be time to simply remove MSB tag support, unless there are systems around that actually fail to align static objects to 8-byte boundaries (but such systems would have been broken for a while now). Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-08 12:17 ` pdumper on Solaris 10 Pip Cet via Emacs development discussions. @ 2024-12-08 13:05 ` Eli Zaretskii 2024-12-08 13:52 ` Pip Cet via Emacs development discussions. 2024-12-09 0:58 ` Po Lu 2024-12-09 1:01 ` Po Lu 2 siblings, 1 reply; 112+ messages in thread From: Eli Zaretskii @ 2024-12-08 13:05 UTC (permalink / raw) To: Pip Cet; +Cc: luangruo, ali_gnu2, emacs-devel > Date: Sun, 08 Dec 2024 12:17:05 +0000 > Cc: ali_gnu2@emvision.com, emacs-devel@gnu.org > From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> > > "Po Lu" <luangruo@yahoo.com> writes: > > But while we're talking about rare and unusual systems, !USE_LSB builds > are currently broken except for the WIDE_EMACS_INT case, because the > stack scanning code makes no attempt to remove MSB tags. Which builds except WIDE_EMACS_INT need to use !USE_LSB? ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-08 13:05 ` Eli Zaretskii @ 2024-12-08 13:52 ` Pip Cet via Emacs development discussions. 2024-12-08 14:52 ` Eli Zaretskii 2024-12-09 1:08 ` Po Lu 0 siblings, 2 replies; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-08 13:52 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, ali_gnu2, emacs-devel "Eli Zaretskii" <eliz@gnu.org> writes: >> Date: Sun, 08 Dec 2024 12:17:05 +0000 >> Cc: ali_gnu2@emvision.com, emacs-devel@gnu.org >> From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> >> >> "Po Lu" <luangruo@yahoo.com> writes: >> >> But while we're talking about rare and unusual systems, !USE_LSB builds >> are currently broken except for the WIDE_EMACS_INT case, because the >> stack scanning code makes no attempt to remove MSB tags. > > Which builds except WIDE_EMACS_INT need to use !USE_LSB? The only platforms that "need" to use !USE_LSB are those that cannot guarantee 8-byte alignment for static objects, which is why I asked about those. If those exist, we should have received bug reports indicating that !WIDE_EMACS_INT builds don't work on such platforms. In particular, WIDE_EMACS_INT shouldn't imply !USE_LSB. That it currently does is a very questionable optimization at best (fixnum manipulation may be very slightly faster with !USE_LSB, but pointer manipulation will be slower and requires extra registers, which is an issue on i386). For example, NILP() would only need to look at a single 32-bit word for the WIDE_EMACS_INT + USE_LSB configuration. I strongly suspect that effect alone would make WIDE_EMACS_INT + USE_LSB faster than WIDE_EMACS_INT + !USE_LSB (of course, the relevant optimization would have to be made first). (Of course, WIDE_EMACS_INT is almost always a bad deal, anyway. 
As far as I can tell, the justification for its continued existence is that some C code assumes buffer positions are fixnums (and, because we expose fixnum-ness to Lisp, some broken Lisp code might do that, too). If we had implemented fixnums to be transparent, we could simply remove WIDE_EMACS_INT, but that mistake has been made...) Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
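[Editorial sketch, not part of the thread.] The NILP() remark in the message above can be illustrated with a toy model of a 64-bit object word that carries a 32-bit payload. Everything here is an assumption made for illustration: the type `wide_obj`, the tag values, and the helper names are hypothetical and are not taken from Emacs's lisp.h, and fixnums are given a single tag for simplicity. The sketch only tries to show why, with the tag in the low bits, a nil test could look at the low 32-bit word alone, while with the tag in the high bits it has to look at both words.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-bit object word holding a 32-bit payload.
   Names and tag values are illustrative, not Emacs's lisp.h.  */
typedef uint64_t wide_obj;

enum { TAG_BITS = 3, TAG_SYMBOL = 0, TAG_INT = 2, TAG_CONS = 3 };

static uint32_t low_word (wide_obj o)  { return (uint32_t) o; }
static uint32_t high_word (wide_obj o) { return (uint32_t) (o >> 32); }

/* LSB scheme: the tag lives in the low 3 bits of the low word, so
   pointer payloads must be 8-byte aligned.  */
static wide_obj lsb_ptr (uint32_t addr, unsigned tag)
{
  return (wide_obj) (addr | tag);
}
static wide_obj lsb_int (uint32_t n)
{
  return ((wide_obj) n << TAG_BITS) | TAG_INT;
}

/* MSB scheme: the tag lives in the top 3 bits of the 64-bit value,
   and the low word holds the untagged payload.  */
static wide_obj msb_ptr (uint32_t addr, unsigned tag)
{
  return ((wide_obj) tag << 61) | addr;
}
static wide_obj msb_int (uint32_t n)
{
  return ((wide_obj) TAG_INT << 61) | n;
}

/* In this model nil is the symbol at offset 0, i.e. the all-zero word.
   With LSB tags every other object has a nonzero low word (a nonzero
   tag or a nonzero address), so one 32-bit compare is enough.  */
static int lsb_nilp (wide_obj o) { return low_word (o) == 0; }

/* With MSB tags the fixnum 0 also has a zero low word, so the test
   has to inspect the whole 64-bit value.  */
static int msb_nilp (wide_obj o) { return o == 0; }
```

In this model, `lsb_int (0)` has a nonzero low word, whereas `msb_int (0)` has a zero low word and is only distinguished from nil by its high word. Whether the real WIDE_EMACS_INT code could exploit this as cheaply is exactly what the thread leaves open.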
* Re: pdumper on Solaris 10 2024-12-08 13:52 ` Pip Cet via Emacs development discussions. @ 2024-12-08 14:52 ` Eli Zaretskii 2024-12-08 16:17 ` Pip Cet via Emacs development discussions. 2024-12-09 1:08 ` Po Lu 1 sibling, 1 reply; 112+ messages in thread From: Eli Zaretskii @ 2024-12-08 14:52 UTC (permalink / raw) To: Pip Cet; +Cc: luangruo, ali_gnu2, emacs-devel > Date: Sun, 08 Dec 2024 13:52:09 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: luangruo@yahoo.com, ali_gnu2@emvision.com, emacs-devel@gnu.org > > "Eli Zaretskii" <eliz@gnu.org> writes: > > > Which builds except WIDE_EMACS_INT need to use !USE_LSB? > > The only platforms that "need" to use !USE_LSB are those that cannot > guarantee 8-byte alignment for static objects, which is why I asked > about those. That means: none, AFAIK. At least not given the platforms we currently support. So it's little wonder that configuration had bit-rotten. > In particular, WIDE_EMACS_INT shouldn't imply !USE_LSB. That it > currently does is a very questionable optimization at best (fixnum > manipulation may be very slightly faster with !USE_LSB, but pointer > manipulation will be slower and requires extra registers, which is an > issue on i386). Where can one find i386 these days, except in a museum? > (Of course, WIDE_EMACS_INT is almost always a bad deal, anyway. As far > as I can tell, the justification for its continued existence is that > some C code assumes buffer positions are fixnums (and, because we expose > fixnum-ness to Lisp, some broken Lisp code might do that, too). If we > had implemented fixnums to be transparent, we could simply remove > WIDE_EMACS_INT, but that mistake has been made...) I'm a very happy user of WIDE_EMACS_INT, so bad-mouthing it is not recommended ;-) In fact, one of my strongest reservations about the igc branch is that it will most probably force me to lose WIDE_EMACS_INT. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-08 14:52 ` Eli Zaretskii @ 2024-12-08 16:17 ` Pip Cet via Emacs development discussions. 2024-12-08 16:49 ` Eli Zaretskii 0 siblings, 1 reply; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-08 16:17 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, ali_gnu2, emacs-devel "Eli Zaretskii" <eliz@gnu.org> writes: >> Date: Sun, 08 Dec 2024 13:52:09 +0000 >> From: Pip Cet <pipcet@protonmail.com> >> Cc: luangruo@yahoo.com, ali_gnu2@emvision.com, emacs-devel@gnu.org >> >> "Eli Zaretskii" <eliz@gnu.org> writes: >> >> > Which builds except WIDE_EMACS_INT need to use !USE_LSB? >> >> The only platforms that "need" to use !USE_LSB are those that cannot >> guarantee 8-byte alignment for static objects, which is why I asked >> about those. > > That means: none, AFAIK. At least not given the platforms we > currently support. So it's little wonder that configuration had > bit-rotten. So let's remove it, and switch WIDE_EMACS_INT builds to USE_LSB? >> In particular, WIDE_EMACS_INT shouldn't imply !USE_LSB. That it >> currently does is a very questionable optimization at best (fixnum >> manipulation may be very slightly faster with !USE_LSB, but pointer >> manipulation will be slower and requires extra registers, which is an >> issue on i386). > > Where can one find i386 these days, except in a museum? I meant all x86 systems using the 32-bit instruction set (and, in particular, its limited exposed register set). Those will be around for a while. >> (Of course, WIDE_EMACS_INT is almost always a bad deal, anyway. As far >> as I can tell, the justification for its continued existence is that >> some C code assumes buffer positions are fixnums (and, because we expose >> fixnum-ness to Lisp, some broken Lisp code might do that, too). If we >> had implemented fixnums to be transparent, we could simply remove >> WIDE_EMACS_INT, but that mistake has been made...) 
> > I'm a very happy user of WIDE_EMACS_INT, so bad-mouthing it is not > recommended ;-) I don't think you should be happy; WIDE_EMACS_INT is sadly necessary to support buffers > 512MB on 32-bit systems, but you're wasting 32 bits for almost every Lisp_Object, and registers as well. As 32-bit systems go away, it will become harder to write Lisp code that works correctly in !WIDE_EMACS_INT 32-bit builds, so we may well have to make WIDE_EMACS_INT the default at some point. > In fact, one of my strongest reservations about the igc branch is that > it will most probably force me to lose WIDE_EMACS_INT. I believe that problem is exclusively due to the fact that WIDE_EMACS_INT implies USE_LSB=0. Dropping !USE_LSB should allow us to use WIDE_EMACS_INT normally in MPS builds, I think. (This is somewhat theoretical because I can't build mingw32 Emacs right now; https://dl.osdn.net alternates between being entirely unreachable and responding with an expired certificate.) The "low-hanging fruit" performance improvements USE_LSB allows for (faster stack scanning during GC and many places which don't need to look at the MSB word at all) are, I think, real, while the way in which !USE_LSB is superior (we dereference pointer words without having to untag them first) may reduce code size slightly, but shouldn't really affect performance. Of course, if we set out to do so, 32-bit Emacs could be optimized in many other ways, but it's too late for that. Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-08 16:17 ` Pip Cet via Emacs development discussions. @ 2024-12-08 16:49 ` Eli Zaretskii 2024-12-08 17:37 ` Pip Cet via Emacs development discussions. ` (2 more replies) 0 siblings, 3 replies; 112+ messages in thread From: Eli Zaretskii @ 2024-12-08 16:49 UTC (permalink / raw) To: Pip Cet; +Cc: luangruo, ali_gnu2, emacs-devel > Date: Sun, 08 Dec 2024 16:17:53 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: luangruo@yahoo.com, ali_gnu2@emvision.com, emacs-devel@gnu.org > > "Eli Zaretskii" <eliz@gnu.org> writes: > > >> Date: Sun, 08 Dec 2024 13:52:09 +0000 > >> From: Pip Cet <pipcet@protonmail.com> > >> Cc: luangruo@yahoo.com, ali_gnu2@emvision.com, emacs-devel@gnu.org > >> > >> "Eli Zaretskii" <eliz@gnu.org> writes: > >> > >> > Which builds except WIDE_EMACS_INT need to use !USE_LSB? > >> > >> The only platforms that "need" to use !USE_LSB are those that cannot > >> guarantee 8-byte alignment for static objects, which is why I asked > >> about those. > > > > That means: none, AFAIK. At least not given the platforms we > > currently support. So it's little wonder that configuration had > > bit-rotten. > > So let's remove it, and switch WIDE_EMACS_INT builds to USE_LSB? That'd be a waste of effort. What we have now works, and works well. I'm not interested in throwing away a lot of hard work which got us to where we are with WIDE_EMACS_INT, for advantages which I'm not sure even exist, let alone are significant. Those bits are unused in the WIDE_EMACS_INT build, so using them is a no-brainer, IMO. > >> In particular, WIDE_EMACS_INT shouldn't imply !USE_LSB. That it > >> currently does is a very questionable optimization at best (fixnum > >> manipulation may be very slightly faster with !USE_LSB, but pointer > >> manipulation will be slower and requires extra registers, which is an > >> issue on i386). > > > > Where can one find i386 these days, except in a museum? 
> > I meant all x86 systems using the 32-bit instruction set (and, in > particular, its limited exposed register set). Those will be around for > a while. Modern x86 CPUs can handle 64-bit values just fine, thank you. > >> (Of course, WIDE_EMACS_INT is almost always a bad deal, anyway. As far > >> as I can tell, the justification for its continued existence is that > >> some C code assumes buffer positions are fixnums (and, because we expose > >> fixnum-ness to Lisp, some broken Lisp code might do that, too). If we > >> had implemented fixnums to be transparent, we could simply remove > >> WIDE_EMACS_INT, but that mistake has been made...) > > > > I'm a very happy user of WIDE_EMACS_INT, so bad-mouthing it is not > > recommended ;-) > > I don't think you should be happy; WIDE_EMACS_INT is sadly necessary to > support buffers > 512MB on 32-bit systems, but you're wasting 32 bits > for almost every Lisp_Object, and registers as well. Why should I care? It isn't like each wasted bit comes with some monetary fine, does it? > As 32-bit systems go away, it will become harder to write Lisp code that > works correctly in !WIDE_EMACS_INT 32-bit builds, so we may well have to > make WIDE_EMACS_INT the default at some point. If you are trying to convince me to switch to 64-bit development environment, you are wasting your time. I have my very good reasons, and don't plan on doing so any time soon. And 64-but Windows supports 32-bit code perfectly for my needs. > > In fact, one of my strongest reservations about the igc branch is that > > it will most probably force me to lose WIDE_EMACS_INT. > > I believe that problem is exclusively due to the fact that > WIDE_EMACS_INT implies USE_LSB=0. Dropping !USE_LSB should allow us to > use WIDE_EMACS_INT normally in MPS builds, I think. No, there's also a built-in assumption in MPS about the size of a word. 
> The "low-hanging fruit" performance improvements USE_LSB allows for > (faster stack scanning during GC and many places which don't need to > look at the MSB word at all) are, I think, real, while the way in which > !USE_LSB is superior (we dereference pointer words without having to > untag them first) may reduce code size slightly, but shouldn't really > affect performance. I have no problems with performance that I can report, so I don't expect anyone to waste time and effort on these optimizations. We have enough real problems for the resources we have. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-08 16:49 ` Eli Zaretskii @ 2024-12-08 17:37 ` Pip Cet via Emacs development discussions. 2024-12-08 18:41 ` Eli Zaretskii 2024-12-08 18:47 ` Pip Cet via Emacs development discussions. 2024-12-09 1:13 ` Po Lu 2 siblings, 1 reply; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-08 17:37 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, ali_gnu2, emacs-devel "Eli Zaretskii" <eliz@gnu.org> writes: >> So let's remove it, and switch WIDE_EMACS_INT builds to USE_LSB? > > That'd be a waste of effort. It'd be a good investment of effort today, in exchange for making the GC code significantly easier to understand and maintain in the future. It would certainly not be without its benefits, so calling it a "waste of effort" is unfair. > I'm not interested in throwing away a lot of hard work which got us to > where we are with WIDE_EMACS_INT, for advantages which I'm not sure > even exist, let alone are significant. I think maintainability of the GC code is significant. > Those bits are unused in the WIDE_EMACS_INT build, so using them is a > no-brainer, IMO. As are the low-order bits of pointers, which have the advantage of already being present in the 32-bit register rather than needing a second register. >> >> In particular, WIDE_EMACS_INT shouldn't imply !USE_LSB. That it >> >> currently does is a very questionable optimization at best (fixnum >> >> manipulation may be very slightly faster with !USE_LSB, but pointer >> >> manipulation will be slower and requires extra registers, which is an >> >> issue on i386). >> > >> > Where can one find i386 these days, except in a museum? >> >> I meant all x86 systems using the 32-bit instruction set (and, in >> particular, its limited exposed register set). Those will be around for >> a while. > > Modern x86 CPUs can handle 64-bit values just fine, thank you. 
Modern x86 CPUs running 32-bit code (x86, not x32) still need two register names for each 64-bit value. With 8 GPRs, that's a significant problem. So, no, "just fine" isn't accurate here. >> >> (Of course, WIDE_EMACS_INT is almost always a bad deal, anyway. As far >> >> as I can tell, the justification for its continued existence is that >> >> some C code assumes buffer positions are fixnums (and, because we expose >> >> fixnum-ness to Lisp, some broken Lisp code might do that, too). If we >> >> had implemented fixnums to be transparent, we could simply remove >> >> WIDE_EMACS_INT, but that mistake has been made...) >> > >> > I'm a very happy user of WIDE_EMACS_INT, so bad-mouthing it is not >> > recommended ;-) >> >> I don't think you should be happy; WIDE_EMACS_INT is sadly necessary to >> support buffers > 512MB on 32-bit systems, but you're wasting 32 bits >> for almost every Lisp_Object, and registers as well. > > Why should I care? It isn't like each wasted bit comes with some > monetary fine, does it? I think most users of 32-bit systems at this stage do care about wasting a lot of memory, even if you personally don't. >> As 32-bit systems go away, it will become harder to write Lisp code that >> works correctly in !WIDE_EMACS_INT 32-bit builds, so we may well have to >> make WIDE_EMACS_INT the default at some point. > > If you are trying to convince me to switch to 64-bit development > environment, you are wasting your time. I have my very good reasons, > and don't plan on doing so any time soon. I wasn't, and I'm not sure how you got the impression I was. I meant what I said, that we may have to give up on !WIDE_EMACS_INT 32-bit builds at some point. As you're using WIDE_EMACS_INT already, this wouldn't affect you. >> > In fact, one of my strongest reservations about the igc branch is that >> > it will most probably force me to lose WIDE_EMACS_INT. >> >> I believe that problem is exclusively due to the fact that >> WIDE_EMACS_INT implies USE_LSB=0. 
Dropping !USE_LSB should allow us to >> use WIDE_EMACS_INT normally in MPS builds, I think. > > No, there's also a built-in assumption in MPS about the size of a > word. That's very vague. If there is an assumption that EMACS_INT == mps_word_t, it would certainly not be built into MPS, which doesn't know about EMACS_INT at all. But as it is, I have no idea where you even suspect this "built-in" assumption is made. >> The "low-hanging fruit" performance improvements USE_LSB allows for >> (faster stack scanning during GC and many places which don't need to >> look at the MSB word at all) are, I think, real, while the way in which >> !USE_LSB is superior (we dereference pointer words without having to >> untag them first) may reduce code size slightly, but shouldn't really >> affect performance. > > I have no problems with performance that I can report, so I don't > expect anyone to waste time and effort on these optimizations. We > have enough real problems for the resources we have. If performance and wasted memory aren't issues, then it's a tradeoff between leaving old code untouched and simplifying it to enable future development. Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
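[Editorial sketch, not part of the thread.] The "buffers > 512MB on 32-bit systems" figure quoted above follows from simple arithmetic on the fixnum range. The helper below is hypothetical, and it assumes that fixnums effectively give up two bits of the word to tagging (which, as far as I understand, matches Emacs, but treat it as an assumption here).

```c
#include <assert.h>
#include <stdint.h>

/* Largest positive fixnum for a word of WORD_BITS bits when
   FIXNUM_TAG_BITS of them are reserved for the tag.  Hypothetical
   helper for illustration; not a function from the Emacs sources.  */
static int64_t
max_fixnum (int word_bits, int fixnum_tag_bits)
{
  int value_bits = word_bits - fixnum_tag_bits;  /* includes the sign bit */
  return (INT64_C (1) << (value_bits - 1)) - 1;
}
```

Under that assumption, `max_fixnum (32, 2)` is 2^29 - 1, one short of 512 * 1024 * 1024, so a buffer position stored as a fixnum in a 32-bit word cannot go past roughly 512 MiB. Widening the object word to 64 bits, `max_fixnum (64, 2)`, lifts the limit to 2^61 - 1, which is the trade the thread is arguing about.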
* Re: pdumper on Solaris 10 2024-12-08 17:37 ` Pip Cet via Emacs development discussions. @ 2024-12-08 18:41 ` Eli Zaretskii 2024-12-08 19:15 ` Gerd Möllmann 2024-12-09 4:59 ` Stefan Kangas 0 siblings, 2 replies; 112+ messages in thread From: Eli Zaretskii @ 2024-12-08 18:41 UTC (permalink / raw) To: Pip Cet; +Cc: luangruo, ali_gnu2, emacs-devel > Date: Sun, 08 Dec 2024 17:37:50 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: luangruo@yahoo.com, ali_gnu2@emvision.com, emacs-devel@gnu.org > > "Eli Zaretskii" <eliz@gnu.org> writes: > > >> So let's remove it, and switch WIDE_EMACS_INT builds to USE_LSB? > > > > That'd be a waste of effort. > > It'd be a good investment of effort today, in exchange for making the GC > code significantly easier to understand and maintain in the future. It > would certainly not be without its benefits, so calling it a "waste of > effort" is unfair. I disagree. We've lived with this GC code for a long time, and I don't see any complications due to !USE_LSB. And if we are going to switch to igc at some point, investment in GC is even less sensible. I don't see what's unfair in making my position clear. > > I'm not interested in throwing away a lot of hard work which got us to > > where we are with WIDE_EMACS_INT, for advantages which I'm not sure > > even exist, let alone are significant. > > I think maintainability of the GC code is significant. It is, but there are no significant issues there at this time due to !USE_LSB. > > Those bits are unused in the WIDE_EMACS_INT build, so using them is a > > no-brainer, IMO. > > As are the low-order bits of pointers, which have the advantage of > already being present in the 32-bit register rather than needing a > second register. What's your point? The !USE_LSB code works, the one you suggest needs to be written and debugged. > >> >> In particular, WIDE_EMACS_INT shouldn't imply !USE_LSB. 
That it > >> >> currently does is a very questionable optimization at best (fixnum > >> >> manipulation may be very slightly faster with !USE_LSB, but pointer > >> >> manipulation will be slower and requires extra registers, which is an > >> >> issue on i386). > >> > > >> > Where can one find i386 these days, except in a museum? > >> > >> I meant all x86 systems using the 32-bit instruction set (and, in > >> particular, its limited exposed register set). Those will be around for > >> a while. > > > > Modern x86 CPUs can handle 64-bit values just fine, thank you. > > Modern x86 CPUs running 32-bit code (x86, not x32) still need two > register names for each 64-bit value. With 8 GPRs, that's a significant > problem. So, no, "just fine" isn't accurate here. I again disagree. And you forget other registers. > >> > In fact, one of my strongest reservations about the igc branch is that > >> > it will most probably force me to lose WIDE_EMACS_INT. > >> > >> I believe that problem is exclusively due to the fact that > >> WIDE_EMACS_INT implies USE_LSB=0. Dropping !USE_LSB should allow us to > >> use WIDE_EMACS_INT normally in MPS builds, I think. > > > > No, there's also a built-in assumption in MPS about the size of a > > word. > > That's very vague. If there is an assumption that EMACS_INT == > mps_word_t, it would certainly not be built into MPS, which doesn't know > about EMACS_INT at all. Not EMACS_INT, Lisp_Object. At least that's what Gerd explained to me back when I asked about WIDE_EMACS_INT in the MPS build. Maybe he can chime in and clarify this. > >> The "low-hanging fruit" performance improvements USE_LSB allows for > >> (faster stack scanning during GC and many places which don't need to > >> look at the MSB word at all) are, I think, real, while the way in which > >> !USE_LSB is superior (we dereference pointer words without having to > >> untag them first) may reduce code size slightly, but shouldn't really > >> affect performance. 
> > > > I have no problems with performance that I can report, so I don't > > expect anyone to waste time and effort on these optimizations. We > > have enough real problems for the resources we have. > > If performance and wasted memory aren't issues, then it's a tradeoff > between leaving old code untouched and simplifying it to enable future > development. The existing code doesn't preclude nor interfere with future development. So yes, leaving working code untouched is the preference here. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-08 18:41 ` Eli Zaretskii @ 2024-12-08 19:15 ` Gerd Möllmann 2024-12-08 20:38 ` Eli Zaretskii 2024-12-09 4:59 ` Stefan Kangas 1 sibling, 1 reply; 112+ messages in thread From: Gerd Möllmann @ 2024-12-08 19:15 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Pip Cet, luangruo, ali_gnu2, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> Date: Sun, 08 Dec 2024 17:37:50 +0000 >> From: Pip Cet <pipcet@protonmail.com> >> Cc: luangruo@yahoo.com, ali_gnu2@emvision.com, emacs-devel@gnu.org >> >> "Eli Zaretskii" <eliz@gnu.org> writes: >> >> >> So let's remove it, and switch WIDE_EMACS_INT builds to USE_LSB? >> > >> > That'd be a waste of effort. >> >> It'd be a good investment of effort today, in exchange for making the GC >> code significantly easier to understand and maintain in the future. It >> would certainly not be without its benefits, so calling it a "waste of >> effort" is unfair. > > I disagree. We've lived with this GC code for a long time, and I > don't see any complications due to !USE_LSB. And if we are going to > switch to igc at some point, investment in GC is even less sensible. > > I don't see what's unfair in making my position clear. I think Pip meant igc. That would be a lot simpler without the 32-bit stuff, wide ints or not. I said already what I think about that before. > >> >> > In fact, one of my strongest reservations about the igc branch is that >> >> > it will most probably force me to lose WIDE_EMACS_INT. >> >> >> >> I believe that problem is exclusively due to the fact that >> >> WIDE_EMACS_INT implies USE_LSB=0. Dropping !USE_LSB should allow us to >> >> use WIDE_EMACS_INT normally in MPS builds, I think. >> > >> > No, there's also a built-in assumption in MPS about the size of a >> > word. >> >> That's very vague. If there is an assumption that EMACS_INT == >> mps_word_t, it would certainly not be built into MPS, which doesn't know >> about EMACS_INT at all. > > Not EMACS_INT, Lisp_Object. 
At least that's what Gerd explained to me > back when I asked about WIDE_EMACS_INT in the MPS build. Maybe he can > chime in and clarify this. (Not sure I understand the context in which you are discussing.) As far as igc goes, a Lisp_Object consisting of 2 mps_word_t poses a problem because we scan one mps_word_t at a time. Depending on where the tag bits are, we need the other mps_word_t belonging to a Lisp_Object to be able to determine its type (Lisp_Int0/1, Lisp_Symbol, ...). IIRC this is currently the case, and it's a major PITA. ^ permalink raw reply [flat|nested] 112+ messages in thread
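The scanning problem described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not code from igc.c or lisp.h: it assumes 3 tag bits, 32-bit scanner words, 8-byte-aligned pointers, and a pointer stored in the low half of a 64-bit object. All `sk_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for illustration only: 3 tag bits, a hypothetical tag
   value for conses, and 32-bit words as seen by the scanner.  */
enum { SK_TAG_BITS = 3, SK_TAG_CONS = 3 };

typedef uint64_t sk_wide_object;   /* stands in for a 64-bit object */

/* LSB scheme: the tag sits in the low bits of the pointer word.  */
static sk_wide_object
sk_make_lsb (uint32_t ptr, unsigned tag)
{
  return (uint64_t) ptr | tag;
}

/* MSB scheme: the tag sits in the top bits of the 64-bit value,
   that is, in the 32-bit word that does not hold the pointer.  */
static sk_wide_object
sk_make_msb (uint32_t ptr, unsigned tag)
{
  return ((uint64_t) tag << (64 - SK_TAG_BITS)) | ptr;
}

static uint32_t sk_low_word (sk_wide_object o) { return (uint32_t) o; }
static uint32_t sk_high_word (sk_wide_object o) { return (uint32_t) (o >> 32); }

/* What a scanner can read from one 32-bit word in isolation.  */
static unsigned
sk_tag_in_low_bits (uint32_t w)
{
  return w & ((1u << SK_TAG_BITS) - 1);
}

static unsigned
sk_tag_in_high_bits (uint32_t w)
{
  return w >> (32 - SK_TAG_BITS);
}

/* Self-check showing where the tag ends up in each scheme.  */
static void
sk_self_check (void)
{
  uint32_t ptr = 0x0804a010u;   /* made-up, 8-byte-aligned address */
  sk_wide_object lsb = sk_make_lsb (ptr, SK_TAG_CONS);
  sk_wide_object msb = sk_make_msb (ptr, SK_TAG_CONS);

  /* LSB: the pointer word alone carries the tag.  */
  assert (sk_tag_in_low_bits (sk_low_word (lsb)) == SK_TAG_CONS);
  /* MSB: the pointer word is a bare address; the tag is only
     visible in the other word.  */
  assert (sk_low_word (msb) == ptr);
  assert (sk_tag_in_high_bits (sk_high_word (msb)) == SK_TAG_CONS);
}
```

With the tag in the low bits, a scanner that visits the pointer word has everything it needs. With the tag in the high bits, it has to fetch the neighbouring word first, which is the two-word dependency called a PITA above.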
* Re: pdumper on Solaris 10 2024-12-08 19:15 ` Gerd Möllmann @ 2024-12-08 20:38 ` Eli Zaretskii 2024-12-09 3:09 ` Gerd Möllmann 0 siblings, 1 reply; 112+ messages in thread From: Eli Zaretskii @ 2024-12-08 20:38 UTC (permalink / raw) To: Gerd Möllmann; +Cc: pipcet, luangruo, ali_gnu2, emacs-devel > From: Gerd Möllmann <gerd.moellmann@gmail.com> > Cc: Pip Cet <pipcet@protonmail.com>, luangruo@yahoo.com, > ali_gnu2@emvision.com, emacs-devel@gnu.org > Date: Sun, 08 Dec 2024 20:15:09 +0100 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> It'd be a good investment of effort today, in exchange for making the GC > >> code significantly easier to understand and maintain in the future. It > >> would certainly not be without its benefits, so calling it a "waste of > >> effort" is unfair. > > > > I disagree. We've lived with this GC code for a long time, and I > > don't see any complications due to !USE_LSB. And if we are going to > > switch to igc at some point, investment in GC is even less sensible. > > > > I don't see what's unfair in making my position clear. > > I think Pip meant igc. Then it's all a huge misunderstanding, and I apologize for not guessing that it was about igc. In my defense I can only say that igc was never mentioned. > That would be a lot simpler without the 32-bit > stuff, wide ints or not. I said already what I think about that before. If you want to drop the 32-bit stuff, then (a) you will need to find someone else to regularly build and test the Windows port of the branch, and (b) we will need to agree on emacs-devel right now that 32-bit builds of Emacs will be dropped when igc lands. > >> > No, there's also a built-in assumption in MPS about the size of a > >> > word. > >> > >> That's very vague. If there is an assumption that EMACS_INT == > >> mps_word_t, it would certainly not be built into MPS, which doesn't know > >> about EMACS_INT at all. > > > > Not EMACS_INT, Lisp_Object.
At least that's what Gerd explained to me > > back when I asked about WIDE_EMACS_INT in the MPS build. Maybe he can > > chime in and clarify this. > > (Not sure I understand the context in which you are discussing.) > > As far as igc goes, a Lisp_Object consisting of 2 mps_word_t poses a > problem because we scan one mps_word_t at a time. Depending on where the > tag bits are, we need the other mps_word_t belonging to a Lisp_Object to > be able to determine its type (Lisp_Int0/1, Lisp_Symbol, ...). IIRC > this is currently the case, and it's a major PITA. That's what I remembered from when you explained that a few months ago. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-08 20:38 ` Eli Zaretskii @ 2024-12-09 3:09 ` Gerd Möllmann 2024-12-09 3:32 ` Eli Zaretskii 2024-12-09 9:56 ` Pip Cet via Emacs development discussions. 0 siblings, 2 replies; 112+ messages in thread From: Gerd Möllmann @ 2024-12-09 3:09 UTC (permalink / raw) To: Eli Zaretskii; +Cc: pipcet, luangruo, ali_gnu2, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> From: Gerd Möllmann <gerd.moellmann@gmail.com> >> Cc: Pip Cet <pipcet@protonmail.com>, luangruo@yahoo.com, >> ali_gnu2@emvision.com, emacs-devel@gnu.org >> Date: Sun, 08 Dec 2024 20:15:09 +0100 >> >> Eli Zaretskii <eliz@gnu.org> writes: >> >> >> It'd be a good investment of effort today, in exchange for making the GC >> >> code significantly easier to understand and maintain in the future. It >> >> would certainly not be without its benefits, so calling it a "waste of >> >> effort" is unfair. >> > >> > I disagree. We've lived with this GC code for a long time, and I >> > don't see any complications due to !USE_LSB. And if we are going to >> > switch to igc at some point, investment in GC is even less sensible. >> > >> > I don't see what's unfair in making my position clear. >> >> I think Pip meant igc. > > Then it's all a huge misunderstanding, and I apologize for not > guessing that it was about igc. In my defense I can only say that igc > was never mentioned. Or I'm wrong, and Pip meant something else. >> That would be a lot simpler without the 32-bit >> stuff, wide ints or not. I said already what I think about that before. > > If you want to drop the 32-bit stuff, then (a) you will need to find > someone else to regularly build and test the Windows port of the > branch, and (b) we will need to agree on emacs-devel right now that > 32-bit builds of Emacs will be dropped when igc lands. I would recommend that, indeed, but I don't expect it to happen any time soon :-). >> >> > No, there's also a built-in assumption in MPS about the size of a >> >> > word.
>> >> >> >> That's very vague. If there is an assumption that EMACS_INT == >> >> mps_word_t, it would certainly not be built into MPS, which doesn't know >> >> about EMACS_INT at all. >> > >> > Not EMACS_INT, Lisp_Object. At least that's what Gerd explained to me >> > back when I asked about WIDE_EMACS_INT in the MPS build. Maybe he can >> > chime in and clarify this. >> >> (Not sure I understand the context in which you are discussing.) >> >> As far as igc goes, a Lisp_Object consisting of 2 mps_word_t poses a >> problem because we scan one mps_word_t at a time. Depending on where the >> tag bits are, we need the other mps_word_t belonging to a Lisp_Object to >> be able to determine its type (Lisp_Int0/1, Lisp_Symbol, ...). IIRC >> this is currently the case, and it's a major PITA. > > That's what I remembered from when you explained that a few months > ago. What about dropping, officially sanctioned so to speak, WIDE_EMACS_INT support for igc? That would help. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-09 3:09 ` Gerd Möllmann @ 2024-12-09 3:32 ` Eli Zaretskii 2024-12-09 3:43 ` Gerd Möllmann 2024-12-09 9:56 ` Pip Cet via Emacs development discussions. 1 sibling, 1 reply; 112+ messages in thread From: Eli Zaretskii @ 2024-12-09 3:32 UTC (permalink / raw) To: Gerd Möllmann; +Cc: pipcet, luangruo, ali_gnu2, emacs-devel > From: Gerd Möllmann <gerd.moellmann@gmail.com> > Cc: pipcet@protonmail.com, luangruo@yahoo.com, ali_gnu2@emvision.com, > emacs-devel@gnu.org > Date: Mon, 09 Dec 2024 04:09:39 +0100 > > What about dropping, officially sanctioned so to speak, WIDE_EMACS_INT > support for igc? That would help. You already dropped it, didn't you? ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-09 3:32 ` Eli Zaretskii @ 2024-12-09 3:43 ` Gerd Möllmann 2024-12-09 4:53 ` Stefan Kangas 0 siblings, 1 reply; 112+ messages in thread From: Gerd Möllmann @ 2024-12-09 3:43 UTC (permalink / raw) To: Eli Zaretskii; +Cc: pipcet, luangruo, ali_gnu2, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> From: Gerd Möllmann <gerd.moellmann@gmail.com> >> Cc: pipcet@protonmail.com, luangruo@yahoo.com, ali_gnu2@emvision.com, >> emacs-devel@gnu.org >> Date: Mon, 09 Dec 2024 04:09:39 +0100 >> >> What about dropping, officially sanctioned so to speak, WIDE_EMACS_INT >> support for igc? That would help. > > You already dropped it, didn't you? There is #ifdef WIDE_EMACS_INT # error "WIDE_EMACS_INT not supported" #endif in igc.c simply because it's not implemented. Mentally, I've dropped it, yes. I think it would make things really ugly, and not having it doesn't take away anything from users of WIDE_EMACS_INT which they currently have, i.e. the current GC. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-09 3:43 ` Gerd Möllmann @ 2024-12-09 4:53 ` Stefan Kangas 2024-12-09 5:26 ` Gerd Möllmann 2024-12-09 13:58 ` Eli Zaretskii 0 siblings, 2 replies; 112+ messages in thread From: Stefan Kangas @ 2024-12-09 4:53 UTC (permalink / raw) To: Gerd Möllmann, Eli Zaretskii; +Cc: pipcet, luangruo, ali_gnu2, emacs-devel Gerd Möllmann <gerd.moellmann@gmail.com> writes: > There is > > #ifdef WIDE_EMACS_INT > # error "WIDE_EMACS_INT not supported" > #endif > > in igc.c simply because it's not implemented. > > Mentally, I've dropped it, yes. I think it would make things really > ugly, and not having it doesn't take away anything from users of > WIDE_EMACS_INT which they currently have, i.e. the current GC. Is the idea to continue supporting both the old GC and MPS for the foreseeable future? ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-09 4:53 ` Stefan Kangas @ 2024-12-09 5:26 ` Gerd Möllmann 2024-12-09 13:58 ` Eli Zaretskii 1 sibling, 0 replies; 112+ messages in thread From: Gerd Möllmann @ 2024-12-09 5:26 UTC (permalink / raw) To: Stefan Kangas; +Cc: Eli Zaretskii, pipcet, luangruo, ali_gnu2, emacs-devel Stefan Kangas <stefankangas@gmail.com> writes: > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > >> There is >> >> #ifdef WIDE_EMACS_INT >> # error "WIDE_EMACS_INT not supported" >> #endif >> >> in igc.c simply because it's not implemented. >> >> Mentally, I've dropped it, yes. I think it would make things really >> ugly, and not having it doesn't take away anything from users of >> WIDE_EMACS_INT which they currently have, i.e. the current GC. > > Is the idea to continue supporting both the old GC and MPS for the > foreseeable future? ISTR Po Lu mentioning that some OS (Android?) does not support MPS, so yes from me. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-09 4:53 ` Stefan Kangas 2024-12-09 5:26 ` Gerd Möllmann @ 2024-12-09 13:58 ` Eli Zaretskii 2024-12-10 0:02 ` Po Lu 1 sibling, 1 reply; 112+ messages in thread From: Eli Zaretskii @ 2024-12-09 13:58 UTC (permalink / raw) To: Stefan Kangas; +Cc: gerd.moellmann, pipcet, luangruo, ali_gnu2, emacs-devel > From: Stefan Kangas <stefankangas@gmail.com> > Date: Sun, 8 Dec 2024 20:53:05 -0800 > Cc: pipcet@protonmail.com, luangruo@yahoo.com, ali_gnu2@emvision.com, > emacs-devel@gnu.org > > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > > > There is > > > > #ifdef WIDE_EMACS_INT > > # error "WIDE_EMACS_INT not supported" > > #endif > > > > in igc.c simply because it's not implemented. > > > > Mentally, I've dropped it, yes. I think it would make things really > > ugly, and not having it doesn't take away anything from users of > > WIDE_EMACS_INT which they currently have, i.e. the current GC. > > Is the idea to continue supporting both the old GC and MPS for the > foreseeable future? It could be, if MPS support is less than universal. But that doesn't necessarily decide the fate of WIDE_EMACS_INT. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-09 13:58 ` Eli Zaretskii @ 2024-12-10 0:02 ` Po Lu 0 siblings, 0 replies; 112+ messages in thread From: Po Lu @ 2024-12-10 0:02 UTC (permalink / raw) To: Eli Zaretskii Cc: Stefan Kangas, gerd.moellmann, pipcet, ali_gnu2, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> From: Stefan Kangas <stefankangas@gmail.com> >> Date: Sun, 8 Dec 2024 20:53:05 -0800 >> Cc: pipcet@protonmail.com, luangruo@yahoo.com, ali_gnu2@emvision.com, >> emacs-devel@gnu.org >> >> Gerd Möllmann <gerd.moellmann@gmail.com> writes: >> >> > There is >> > >> > #ifdef WIDE_EMACS_INT >> > # error "WIDE_EMACS_INT not supported" >> > #endif >> > >> > in igc.c simply because it's not implemented. >> > >> > Mentally, I've dropped it, yes. I think it would make things really >> > ugly, and not having it doesn't take away anything from users of >> > WIDE_EMACS_INT which they currently have, i.e. the current GC. >> >> Is the idea to continue supporting both the old GC and MPS for the >> foreseeable future? > > It could be, if MPS support is less than universal. It is less than universal, although it was simpler to port than I anticipated. E.g. Solaris on SPARC is not a supported configuration upstream. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-09 3:09 ` Gerd Möllmann 2024-12-09 3:32 ` Eli Zaretskii @ 2024-12-09 9:56 ` Pip Cet via Emacs development discussions. 2024-12-10 0:04 ` Po Lu 1 sibling, 1 reply; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-09 9:56 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Eli Zaretskii, luangruo, ali_gnu2, emacs-devel Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Eli Zaretskii <eliz@gnu.org> writes: > >>> From: Gerd Möllmann <gerd.moellmann@gmail.com> >>> Cc: Pip Cet <pipcet@protonmail.com>, luangruo@yahoo.com, >>> ali_gnu2@emvision.com, emacs-devel@gnu.org >>> Date: Sun, 08 Dec 2024 20:15:09 +0100 >>> >>> Eli Zaretskii <eliz@gnu.org> writes: >>> >>> >> It'd be a good investment of effort today, in exchange for making the GC >>> >> code significantly easier to understand and maintain in the future. It >>> >> would certainly not be without its benefits, so calling it a "waste of >>> >> effort" is unfair. >>> > >>> > I disagree. We've lived with this GC code for a long time, and I >>> > don't see any complications due to !USE_LSB. And if we are going to >>> > switch to igc at some point, investment in GC is even less sensible. >>> > >>> > I don't see what's unfair in making my position clear. >>> >>> I think Pip meant igc. >> >> Then it's all a huge misunderstanding, and I apologize for not >> guessing that it was about igc. In my defense I can only say that igc >> was never mentioned. > > Or I'm wrong, and Pip meant something else. I was talking about the non-mps branch, yes. We should drop !USE_LSB, which doesn't work in its original use case today and hasn't for a while. It does happen to work in the WIDE_EMACS_INT case, but that's a fortuitous accident at best. >>> >> > No, there's also a built-in assumption in MPS about the size of a >>> >> > word. >>> >> >>> >> That's very vague.
If there is an assumption that EMACS_INT == >>> >> mps_word_t, it would certainly not be built into MPS, which doesn't know >>> >> about EMACS_INT at all. >>> > >>> > Not EMACS_INT, Lisp_Object. At least that's what Gerd explained to me (Of course, we have typedef EMACS_INT Lisp_Word; typedef Lisp_Word Lisp_Object; so this is the same thing) >>> > back when I asked about WIDE_EMACS_INT in the MPS build. Maybe he can >>> > chime in and clarify this. >>> >>> (Not sure I understand the context in which you are discussing.) >>> >>> As far as igc goes, a Lisp_Object consisting of 2 mps_word_t poses a >>> problem because we scan one mps_word_t at a time. Depending on where the >>> tag bits are, we need the other mps_word_t belonging to a Lisp_Object to >>> be able to determine its type (Lisp_Int0/1, Lisp_Symbol, ...). IIRC >>> this is currently the case, and it's a major PITA. So the problem is !USE_LSB, not WIDE_EMACS_INT. Another reason we should drop !USE_LSB, since it gives us working WIDE_EMACS_INT + MPS builds. >> That's what I remembered from when you explained that a few months >> ago. > > What about dropping, officially sanctioned so to speak, WIDE_EMACS_INT > support for igc? That would help. I don't see a technical reason to do so, since WIDE_EMACS_INT + USE_LSB works fine (see the patch I sent). We already refuse to build igc.c in !USE_LSB situations and should continue to do so. Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-09 9:56 ` Pip Cet via Emacs development discussions. @ 2024-12-10 0:04 ` Po Lu 2024-12-10 3:34 ` Eli Zaretskii 0 siblings, 1 reply; 112+ messages in thread From: Po Lu @ 2024-12-10 0:04 UTC (permalink / raw) To: Pip Cet; +Cc: Gerd Möllmann, Eli Zaretskii, ali_gnu2, emacs-devel Pip Cet <pipcet@protonmail.com> writes: > I was talking about the non-mps branch, yes. We should drop !USE_LSB, > which doesn't work in its original use case today and hasn't for a > while. It does happen to work in the WIDE_EMACS_INT case, but that's a > fortuitous accident at best. I propose to make it work again. It ought to be a simple matter of scanning stack slots twice, with and without tag bits. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-10 0:04 ` Po Lu @ 2024-12-10 3:34 ` Eli Zaretskii 2024-12-11 1:13 ` Po Lu 0 siblings, 1 reply; 112+ messages in thread From: Eli Zaretskii @ 2024-12-10 3:34 UTC (permalink / raw) To: Po Lu; +Cc: pipcet, gerd.moellmann, ali_gnu2, emacs-devel > From: Po Lu <luangruo@yahoo.com> > Cc: Gerd Möllmann <gerd.moellmann@gmail.com>, Eli > Zaretskii <eliz@gnu.org>, > ali_gnu2@emvision.com, emacs-devel@gnu.org > Date: Tue, 10 Dec 2024 08:04:03 +0800 > > Pip Cet <pipcet@protonmail.com> writes: > > > I was talking about the non-mps branch, yes. We should drop !USE_LSB, > > which doesn't work in its original use case today and hasn't for a > > while. It does happen to work in the WIDE_EMACS_INT case, but that's a > > fortuitous accident at best. > > I propose to make it work again. It ought to be a simple matter of > scanning stack slots twice, with and without tag bits. Patches to that effect will be welcome, thanks. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-10 3:34 ` Eli Zaretskii @ 2024-12-11 1:13 ` Po Lu 2024-12-11 11:29 ` Pip Cet via Emacs development discussions. 0 siblings, 1 reply; 112+ messages in thread From: Po Lu @ 2024-12-11 1:13 UTC (permalink / raw) To: Eli Zaretskii; +Cc: pipcet, gerd.moellmann, ali_gnu2, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> From: Po Lu <luangruo@yahoo.com> >> Cc: Gerd Möllmann <gerd.moellmann@gmail.com>, Eli >> Zaretskii <eliz@gnu.org>, >> ali_gnu2@emvision.com, emacs-devel@gnu.org >> Date: Tue, 10 Dec 2024 08:04:03 +0800 >> >> Pip Cet <pipcet@protonmail.com> writes: >> >> > I was talking about the non-mps branch, yes. We should drop !USE_LSB, >> > which doesn't work in its original use case today and hasn't for a >> > while. It does happen to work in the WIDE_EMACS_INT case, but that's a >> > fortuitous accident at best. >> >> I propose to make it work again. It ought to be a simple matter of >> scanning stack slots twice, with and without tag bits. > > Patches to that effect will be welcome, thanks. Yes, like I said at the beginning of this (burgeoning) thread, I intend to return to active Emacs development after the release of Emacs 30. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-11 1:13 ` Po Lu @ 2024-12-11 11:29 ` Pip Cet via Emacs development discussions. 0 siblings, 0 replies; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-11 11:29 UTC (permalink / raw) To: Po Lu; +Cc: Eli Zaretskii, gerd.moellmann, ali_gnu2, emacs-devel "Po Lu" <luangruo@yahoo.com> writes: > Eli Zaretskii <eliz@gnu.org> writes: >>> From: Po Lu <luangruo@yahoo.com> >>> Cc: Gerd Möllmann <gerd.moellmann@gmail.com>, Eli >>> Zaretskii <eliz@gnu.org>, >>> ali_gnu2@emvision.com, emacs-devel@gnu.org >>> Date: Tue, 10 Dec 2024 08:04:03 +0800 >>> >>> Pip Cet <pipcet@protonmail.com> writes: >>> >>> > I was talking about the non-mps branch, yes. We should drop !USE_LSB, >>> > which doesn't work in its original use case today and hasn't for a >>> > while. It does happen to work in the WIDE_EMACS_INT case, but that's a >>> > fortuitous accident at best. >>> >>> I propose to make it work again. It ought to be a simple matter of >>> scanning stack slots twice, with and without tag bits. >> >> Patches to that effect will be welcome, thanks. > > Yes, like I said at the beginning of this (burgeoning) thread, I intend > to return to active Emacs development after the release of Emacs 30. That's great to hear, but I'd like to make a final (promise!) attempt to dissuade you from making this particular change ("fixing" the code to support !USE_LSB_TAG more often). The changes that are necessary concern the most delicate part of the garbage collector: ambiguous scanning needs to remove the tag (the easy part), and live_cons_p etc. have to be changed to allow for more offsets (we need to recognize pointers to &Lisp_Object + 4 as well as pointers to &Lisp_Object itself; I think this bug is already present on big-endian 32-bit builds utilizing WIDE_EMACS_INT, but no one's using that). 
I suspect other changes will be necessary (in particular, I expect breakage on systems that use the high byte of 64-bit pointers, as some Android systems do; I also expect there will be sign extension / zero extension problems). The pdumper code also needs to be studied carefully, and most likely changed. (Pure space and unexec will likely have gone away by then, but they would be affected, too). This is not a quick fix. What makes this code delicate is that it's very rare for a stack reference, particularly an unusual one, to be the last reference that keeps another object alive; even if we fail to recognize an ambiguous reference and free the object it refers to, the most likely outcome is an invisible UAF error, because we happen to use-after-free memory right after the garbage collection, and it'll still have the expected contents. This part of the garbage collector has long been in need of some work (we currently search the RB tree twice for every word, even though the second pass is usually unnecessary). Obviously, that will be harder if we change the code in other ways. The very best outcome of making the changes you propose is that no one will ever use the changed code; in that case, all that will be achieved is to add unused code to a function that's already hard to understand, and to make future changes that much harder. But that's not what I think will happen. What I think will happen is that users will start or continue using !USE_LSB_TAG, try to switch to MPS, run into a problem, (hopefully) report a bug, and we won't be able to deal with that bug report because we're comparing a USE_LSB_TAG + MPS build to a !USE_LSB_TAG + !MPS one, and it'll be impossible to tell which of the two major changes is causing the problem. In other words, every person affected by your proposed changes will be unable to usefully test MPS. I think that's bad.
If you insist on making the changes, please make sure there is a visible "feature" in the corresponding MPS build which will let us know that bug reports are useless and should be disregarded. I personally won't ask anyone to test MPS in a setting where they cannot usefully report bugs. Obviously, reducing the number of people who can usefully test MPS will make it slightly less likely it'll ever land. Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
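The "scan each stack slot twice, with and without tag bits" idea, and the extra-offset concern raised against it, can be sketched as follows. This is a simplified illustration, not the Emacs implementation: it assumes 3 tag bits in the top of a word, uses a linear search over a block list where the message above mentions an RB tree, and all `sk_` names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

enum { SK_TAG_BITS = 3 };   /* assumed tag width */

/* Mask that clears a tag stored in the top bits of a word.  */
#define SK_VALUE_MASK (UINTPTR_MAX >> SK_TAG_BITS)

/* One allocated region holding equally sized objects.  */
struct sk_block { uintptr_t start; size_t size; };

/* Build an MSB-tagged word from an address (illustration only).  */
static uintptr_t
sk_tag_msb (uintptr_t ptr, unsigned tag)
{
  return ptr | ((uintptr_t) tag
                << (sizeof (uintptr_t) * CHAR_BIT - SK_TAG_BITS));
}

/* Linear stand-in for the allocator's address lookup.  */
static const struct sk_block *
sk_find_block (const struct sk_block *blocks, size_t n, uintptr_t addr)
{
  for (size_t i = 0; i < n; i++)
    if (addr >= blocks[i].start && addr - blocks[i].start < blocks[i].size)
      return &blocks[i];
  return NULL;
}

/* Return 1 if WORD, read from the stack, may refer to an object.
   The word is tried as-is and with the tag bits stripped.  An
   address counts if it hits an object start, or EXTRA_OFFSET bytes
   into an object when EXTRA_OFFSET is nonzero.  */
static int
sk_maybe_live (const struct sk_block *blocks, size_t n, uintptr_t word,
               size_t obj_size, size_t extra_offset)
{
  uintptr_t candidates[2] = { word, word & SK_VALUE_MASK };
  for (int i = 0; i < 2; i++)
    {
      const struct sk_block *b = sk_find_block (blocks, n, candidates[i]);
      if (!b)
        continue;
      size_t off = (candidates[i] - b->start) % obj_size;
      if (off == 0 || (extra_offset != 0 && off == extra_offset))
        return 1;
    }
  return 0;
}

/* Self-check with one made-up block of 16-byte objects.  */
static void
sk_scan_self_check (void)
{
  struct sk_block heap[] = { { 0x1000, 0x100 } };
  assert (sk_maybe_live (heap, 1, 0x1010, 16, 0));
  assert (sk_maybe_live (heap, 1, sk_tag_msb (0x1010, 3), 16, 0));
  assert (!sk_maybe_live (heap, 1, 0x1014, 16, 0));
  assert (sk_maybe_live (heap, 1, 0x1014, 16, 4));
  assert (!sk_maybe_live (heap, 1, 0x2000, 16, 0));
}
```

Every accepted candidate is a potential false retention, and every missed one is a potential use-after-free, so widening the set of accepted forms (a second candidate, an extra offset) changes the most sensitive decision in the collector. That is the trade-off being argued over in this subthread.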
* Re: pdumper on Solaris 10 2024-12-08 18:41 ` Eli Zaretskii 2024-12-08 19:15 ` Gerd Möllmann @ 2024-12-09 4:59 ` Stefan Kangas 2024-12-09 14:39 ` Eli Zaretskii 2024-12-09 16:21 ` Pip Cet via Emacs development discussions. 1 sibling, 2 replies; 112+ messages in thread From: Stefan Kangas @ 2024-12-09 4:59 UTC (permalink / raw) To: Eli Zaretskii, Pip Cet; +Cc: luangruo, ali_gnu2, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> Date: Sun, 08 Dec 2024 17:37:50 +0000 >> From: Pip Cet <pipcet@protonmail.com> >> Cc: luangruo@yahoo.com, ali_gnu2@emvision.com, emacs-devel@gnu.org >> >> "Eli Zaretskii" <eliz@gnu.org> writes: >> >> >> So let's remove it, and switch WIDE_EMACS_INT builds to USE_LSB? >> > >> > That'd be a waste of effort. >> >> It'd be a good investment of effort today, in exchange for making the GC >> code significantly easier to understand and maintain in the future. It >> would certainly not be without its benefits, so calling it a "waste of >> effort" is unfair. > > I disagree. We've lived with this GC code for a long time, and I > don't see any complications due to !USE_LSB. And if we are going to > switch to igc at some point, investment in GC is even less sensible. Assuming that we are 100% sure that MPS will land, then I can agree that making any changes here is basically wasted effort. Unless, of course, the change would also simplify the MPS work (would it?). On the other hand, IIUC, we have some way to go with making the merging of the MPS branch a guarantee. While I'm an enthusiastic supporter of the great work that's being done on the MPS branch, isn't hedging our bets prudent until that work is done? Or am I misunderstanding how close we are to merging the MPS branch? >> If performance and wasted memory aren't issues, then it's a tradeoff >> between leaving old code untouched and simplifying it to enable future >> development. > > The existing code doesn't preclude nor interfere with future > development.
So yes, leaving working code untouched is the preference > here. Based on my limited mucking around in the GC, it does interfere somewhat because you do need to understand both configurations, at least on a high level, and once you do you need to mentally filter that stuff out when reading the code. So I think I'd appreciate the simplification, at least. If the only known drawbacks are stability concerns, we could also consider an intermediate step along these lines: Leave the USE_LSB_TAG code as is, but set it to 1 in all configurations on master. See what issues crop up, if any. If anything does come up, ask Pip Cet to fix it (he volunteered, IIUC), and if things are starting to look too hairy, revert WIDE_EMACS_INT back to !USE_LSB_TAG. If nothing too bad comes up, we can then consider removing the associated code in Emacs 32. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-09 4:59 ` Stefan Kangas @ 2024-12-09 14:39 ` Eli Zaretskii 2024-12-09 21:06 ` Merging MPS a.k.a. scratch/igc, yet again Stefan Kangas 2024-12-10 0:09 ` pdumper on Solaris 10 Stefan Kangas 2024-12-09 16:21 ` Pip Cet via Emacs development discussions. 1 sibling, 2 replies; 112+ messages in thread From: Eli Zaretskii @ 2024-12-09 14:39 UTC (permalink / raw) To: Stefan Kangas; +Cc: pipcet, luangruo, ali_gnu2, emacs-devel > From: Stefan Kangas <stefankangas@gmail.com> > Date: Sun, 8 Dec 2024 23:59:14 -0500 > Cc: luangruo@yahoo.com, ali_gnu2@emvision.com, emacs-devel@gnu.org > > Eli Zaretskii <eliz@gnu.org> writes: > > >> Date: Sun, 08 Dec 2024 17:37:50 +0000 > >> From: Pip Cet <pipcet@protonmail.com> > >> Cc: luangruo@yahoo.com, ali_gnu2@emvision.com, emacs-devel@gnu.org > >> > >> "Eli Zaretskii" <eliz@gnu.org> writes: > >> > >> >> So let's remove it, and switch WIDE_EMACS_INT builds to USE_LSB? > >> > > >> > That'd be a waste of effort. > >> > >> It'd be a good investment of effort today, in exchange for making the GC > >> code significantly easier to understand and maintain in the future. It > >> would certainly not be without its benefits, so calling it a "waste of > >> effort" is unfair. > > > > I disagree. We've lived with this GC code for a long time, and I > > don't see any complications due to !USE_LSB. And if we are going to > > switch to igc at some point, investment in GC is even less sensible. > > Assuming that we are 100% sure that MPS will land, then I can agree that > making any changes here is basically wasted effort. Unless, of course, > the change would also simplify the MPS work (would it?). The igc branch already dropped WIDE_EMACS_INT support, so it only supports USE_LSB anyway. > On the other hand, IIUC, we have some way to go with making the merging > of the MPS branch a guarantee.
While I'm an enthusiastic supporter of > the great work that's being done on the MPS branch, isn't hedging our > bets prudent until that work is done? From where I stand, what's left to do on the branch is stability: using the branch, reporting bugs, and fixing them, especially on some rarer platforms (*BSD, for example). Plus some decisions: do we fork MPS or not, for example. So it isn't such a distant future. > Or am I misunderstanding how close we are to merging the MPS branch? Possibly. > >> If performance and wasted memory aren't issues, then it's a tradeoff > >> between leaving old code untouched and simplifying it to enable future > >> development. > > > > The existing code doesn't preclude nor interfere with future > > development. So yes, leaving working code untouched is the preference > > here. > > Based on my limited mucking around in the GC, it does interfere somewhat > because you do need to understand both configurations, at least on a > high level, and once you do you need to mentally filter that stuff out > when reading the code. So I think I'd appreciate the simplification, at > least. The simplification is minuscule at best. We need to mask some bits, either at the LSB end or at MSB end, that's all the difference. And we have macros that hide the differences from most levels. And remember that the original scheme of tagging in Emacs was !USE_LSB, so some veterans might even prefer it. > If the only known drawbacks are stability concerns, we could also > consider an intermediate step along these lines: > > Leave the USE_LSB_TAG code as is, but set it to 1 in all configurations > on master. That would put the WIDE_EMACS_INT configuration at risk, since that configuration will need changes. > See what issues crop up, if any. If anything does come up, > ask Pip Cet to fix it (he volunteered, IIUC), and if things are starting > to look too hairy, revert WIDE_EMACS_INT back to !USE_LSB_TAG.
If > nothing too bad comes up, we can then consider removing the associated > code in Emacs 32. My point is that all of that could be avoided entirely, given some development decisions which basically drop !USE_LSB_TAG configurations. ^ permalink raw reply [flat|nested] 112+ messages in thread
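Editorial sketch of the point made above, that the two tagging schemes differ only in which end of the word gets masked. It assumes a 64-bit word and a 3-bit tag; the names (TAG_BITS, lsb_make, msb_make and so on) are hypothetical and are not the macros Emacs actually uses in lisp.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for this sketch; these are not the macros in lisp.h.  */
enum { TAG_BITS = 3, WORD_BITS = 64 };
#define TAG_MASK ((uint64_t) ((1u << TAG_BITS) - 1))

/* LSB scheme: the tag lives in the low bits, so the payload (an
   8-byte-aligned address, say) must have those bits clear.  */
uint64_t lsb_make (uint64_t payload, unsigned tag)
{
  assert ((payload & TAG_MASK) == 0);
  return payload | (tag & TAG_MASK);
}
unsigned lsb_tag (uint64_t word) { return (unsigned) (word & TAG_MASK); }
uint64_t lsb_payload (uint64_t word) { return word & ~TAG_MASK; }

/* MSB scheme: the tag lives in the high bits, so the payload must fit
   in the remaining WORD_BITS - TAG_BITS bits.  */
uint64_t msb_make (uint64_t payload, unsigned tag)
{
  assert ((payload >> (WORD_BITS - TAG_BITS)) == 0);
  return payload | ((uint64_t) (tag & TAG_MASK) << (WORD_BITS - TAG_BITS));
}
unsigned msb_tag (uint64_t word) { return (unsigned) (word >> (WORD_BITS - TAG_BITS)); }
uint64_t msb_payload (uint64_t word) { return word & (UINT64_MAX >> TAG_BITS); }
```

Either way, recovering the payload is a single mask. The practical difference is the constraint on the payload: alignment for the LSB scheme, a reduced value range for the MSB scheme.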
* Merging MPS a.k.a. scratch/igc, yet again 2024-12-09 14:39 ` Eli Zaretskii @ 2024-12-09 21:06 ` Stefan Kangas 2024-12-09 21:49 ` Óscar Fuentes ` (2 more replies) 2024-12-10 0:09 ` pdumper on Solaris 10 Stefan Kangas 1 sibling, 3 replies; 112+ messages in thread From: Stefan Kangas @ 2024-12-09 21:06 UTC (permalink / raw) To: Eli Zaretskii Cc: pipcet, luangruo, ali_gnu2, emacs-devel, Gerd Möllmann, Stefan Monnier Eli Zaretskii <eliz@gnu.org> writes: >> On the other hand, IIUC, we have some way to go with making the merging >> of the mpc branch a guarantee. While I'm an enthusiastic supporter of >> the great work that's being done on the mpc branch, isn't hedging our >> bets prudent until that work is done? > > From where I stand, what's left to do on the branch is stability: > using the branch, reporting bugs, and fixing them, especially on some > rarer platforms (*BSD, for example). Plus some decisions: do we fork > MPS or not, for example. So it isn't such a distant future. In that case, I'd suggest that we start working on getting README-IGC into an excellent state. In August, when I last tried building the branch, getting it to build was non-trivial, but I didn't try with the latest instructions. Taking a look at README-IGC, it seems like we're still missing build instructions for Debian. Maybe people could volunteer to add other popular distros too, and *BSD, etc. (If the idea is that such users should just follow the instructions under "Building MPS yourself", then we should say that instead of "TBD".) Once we feel happy that it's reasonably straightforward to follow the instructions, I'd suggest that Someone (TM) makes a post to emacs-devel, asking people to start seriously testing the branch. Such a post should normally get picked up by Emacs News, Reddit, etc. and hopefully the branch will then start seeing wider use. (Remember to Cc Sacha Chua to get it on Emacs News.) 
I'm sure that users will be excited to help test igc once they understand that we're working seriously on stabilizing it in preparation of getting it merged. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Merging MPS a.k.a. scratch/igc, yet again 2024-12-09 21:06 ` Merging MPS a.k.a. scratch/igc, yet again Stefan Kangas @ 2024-12-09 21:49 ` Óscar Fuentes 2024-12-10 4:17 ` Xiyue Deng 2024-12-10 13:09 ` Eli Zaretskii 2024-12-09 23:13 ` chad 2024-12-10 12:41 ` Eli Zaretskii 2 siblings, 2 replies; 112+ messages in thread From: Óscar Fuentes @ 2024-12-09 21:49 UTC (permalink / raw) To: emacs-devel Stefan Kangas <stefankangas@gmail.com> writes: > Taking a look at README-IGC, it seems like we're still missing build > instructions for Debian. AFAIK Debian does not package MPS. The instructions I added to README-IGC for building MPS from their git repo are distro-agnostic. They are tested in Debian Trixie (a.k.a Testing) which is what I have installed on all the machines I regularly use. In fact, I'm pretty sure that any experienced autotools hacker can add MPS to the Emacs build in no time. The only annoying bit is that some MPS headers collide with Emacs', so I chose to instruct the user to copy the needed headers to a new directory and tell the config script to use it. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Merging MPS a.k.a. scratch/igc, yet again 2024-12-09 21:49 ` Óscar Fuentes @ 2024-12-10 4:17 ` Xiyue Deng 2024-12-10 4:26 ` Sean Whitton ` (4 more replies) 2024-12-10 13:09 ` Eli Zaretskii 1 sibling, 5 replies; 112+ messages in thread From: Xiyue Deng @ 2024-12-10 4:17 UTC (permalink / raw) To: Óscar Fuentes, emacs-devel [-- Attachment #1: Type: text/plain, Size: 2196 bytes --] Óscar Fuentes <ofv@wanadoo.es> writes: > Stefan Kangas <stefankangas@gmail.com> writes: > >> Taking a look at README-IGC, it seems like we're still missing build >> instructions for Debian. > > AFAIK Debian does not package MPS. > > The instructions I added to README-IGC for building MPS from their git > repo are distro-agnostic. They are tested in Debian Trixie (a.k.a > Testing) which is what I have installed on all the machines I regularly > use. > > In fact, I'm pretty sure that any experienced autotools hacker can add > MPS to the Emacs build in no time. The only annoying bit is that some > MPS headers collide with Emacs', so I chose to instruct the user to copy > the needed headers to a new directory and tell the config script to use > it. > > If making MPS available in Debian would help Emacs packaging I'm willing to work on this (in the coming weeks as igc may not land with the upcoming Emacs 30 release so not in a hurry.) I have a few questions regarding the Emacs/igc usage of MPS: * Does igc require only mps.{h,c} or more sources from the MPS source package? It looks like there are many sources and it's autotools build script fails with GCC 14.2 in Debian Trixie due to several "-Werror"s. It may be easier to just compile and ship the required subset, though it may require providing a custom build script. * Does igc work with a dynamically linked MPS library? Currently I have seen people suggesting that directly compiling the source, which is effectively like using MPS as a static library. 
It would be less useful to package a static-only library in Debian because in case of any issues (usually security) updating the library is insufficient and its dependencies would need to be rebuilt as well. Using a dynamic library would solve this scalability issue, and it would be good to know if igc can work with a dynamically linked MPS. * Does igc work with the latest tagged version (release-1.118.0) or only the latest snapshot? Packaging a tagged version would be easier, though working with a snapshot may also work with a bit of extra efforts. -- Regards, Xiyue Deng [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 857 bytes --] ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Merging MPS a.k.a. scratch/igc, yet again 2024-12-10 4:17 ` Xiyue Deng @ 2024-12-10 4:26 ` Sean Whitton 2024-12-10 4:42 ` chad ` (3 subsequent siblings) 4 siblings, 0 replies; 112+ messages in thread From: Sean Whitton @ 2024-12-10 4:26 UTC (permalink / raw) To: Xiyue Deng; +Cc: Óscar Fuentes, emacs-devel Hello, I can review and sponsor Xiyue’s upload to Debian. -- Sean Whitton Please excuse top-posting and brevity. I am writing to you from a mobile phone. > On 10 Dec 2024, at 12:19, Xiyue Deng <manphiz@gmail.com> wrote: > > Óscar Fuentes <ofv@wanadoo.es> writes: > >> Stefan Kangas <stefankangas@gmail.com> writes: >> >>> Taking a look at README-IGC, it seems like we're still missing build >>> instructions for Debian. >> >> AFAIK Debian does not package MPS. >> >> The instructions I added to README-IGC for building MPS from their git >> repo are distro-agnostic. They are tested in Debian Trixie (a.k.a >> Testing) which is what I have installed on all the machines I regularly >> use. >> >> In fact, I'm pretty sure that any experienced autotools hacker can add >> MPS to the Emacs build in no time. The only annoying bit is that some >> MPS headers collide with Emacs', so I chose to instruct the user to copy >> the needed headers to a new directory and tell the config script to use >> it. >> >> > > If making MPS available in Debian would help Emacs packaging I'm willing > to work on this (in the coming weeks as igc may not land with the > upcoming Emacs 30 release so not in a hurry.) > > I have a few questions regarding the Emacs/igc usage of MPS: > > * Does igc require only mps.{h,c} or more sources from the MPS source > package? It looks like there are many sources and it's autotools > build script fails with GCC 14.2 in Debian Trixie due to several > "-Werror"s. It may be easier to just compile and ship the required > subset, though it may require providing a custom build script. > > * Does igc work with a dynamically linked MPS library? 
Currently I have > seen people suggesting that directly compiling the source, which is > effectively like using MPS as a static library. It would be less > useful to package a static-only library in Debian because in case of > any issues (usually security) updating the library is insufficient and > its dependencies would need to be rebuilt as well. Using a dynamic > library would solve this scalability issue, and it would be good to > know if igc can work with a dynamically linked MPS. > > * Does igc work with the latest tagged version (release-1.118.0) or only > the latest snapshot? Packaging a tagged version would be easier, > though working with a snapshot may also work with a bit of extra > efforts. > > -- > Regards, > Xiyue Deng > <signature.asc> ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Merging MPS a.k.a. scratch/igc, yet again 2024-12-10 4:17 ` Xiyue Deng 2024-12-10 4:26 ` Sean Whitton @ 2024-12-10 4:42 ` chad 2024-12-10 13:10 ` Óscar Fuentes ` (2 subsequent siblings) 4 siblings, 0 replies; 112+ messages in thread From: chad @ 2024-12-10 4:42 UTC (permalink / raw) To: Xiyue Deng; +Cc: Óscar Fuentes, emacs-devel [-- Attachment #1: Type: text/plain, Size: 1100 bytes --] On Mon, Dec 9, 2024 at 11:19 PM Xiyue Deng <manphiz@gmail.com> wrote: > * Does igc require only mps.{h,c} or more sources from the MPS source > package? It looks like there are many sources and it's autotools > build script fails with GCC 14.2 in Debian Trixie due to several > "-Werror"s. It may be easier to just compile and ship the required > subset, though it may require providing a custom build script.

Emacs itself needs:

    #include <mps.h>
    #include <mpsavm.h>
    #include <mpscamc.h>
    #include "mpscams.h"
    #include <mpscawl.h>
    #include <mpslib.h>

This is shorter than mps/code/mps*.h (which I suggested earlier). > * Does igc work with the latest tagged version (release-1.118.0) or only > the latest snapshot? Packaging a tagged version would be easier, > though working with a snapshot may also work with a bit of extra > efforts. I get the impression that Ravenbrook/mps is working towards an updated release, but at the moment, I believe that you really want patches that aren't in release-1.118.0. Hope that helps, ~Chad [-- Attachment #2: Type: text/html, Size: 1860 bytes --] ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Merging MPS a.k.a. scratch/igc, yet again 2024-12-10 4:17 ` Xiyue Deng 2024-12-10 4:26 ` Sean Whitton 2024-12-10 4:42 ` chad @ 2024-12-10 13:10 ` Óscar Fuentes 2024-12-10 15:10 ` Pip Cet via Emacs development discussions. 2024-12-10 13:20 ` Eli Zaretskii 2024-12-10 14:46 ` Pip Cet via Emacs development discussions. 4 siblings, 1 reply; 112+ messages in thread From: Óscar Fuentes @ 2024-12-10 13:10 UTC (permalink / raw) To: emacs-devel Xiyue Deng <manphiz@gmail.com> writes: > If making MPS available in Debian would help Emacs packaging I'm willing > to work on this (in the coming weeks as igc may not land with the > upcoming Emacs 30 release so not in a hurry.) As a Debian user, I value every package that is made available through its repos, thank you. However, in the specific case of Emacs/MPS, IMAO distro packaging is not the best way, because: * Depending on packaged MPS brings versioning problems, not to mention that it would take a long time to have MPS available on a large part of the distro ecosystem. We would need the DIY part of README-IGC anyway. * It is very likely that we end doing some patching to the MPS sources to adapt to our specific needs (if those patches end upstream or not, that's another question.) * MPS does a performance-critical job. Using it as a shared object might incur in a performance penalty. Having it in source form alongside the Emacs sources will result in opportunities for optimizations (LTO, PGO, ...) that may bring better performance. * MPS does a correctness-critical job. Depending on multiple external sources for such core component is a recipe for problems (future changes by the MPS maintainers, patching by packagers, buggy compilers, etc.) We need to keep a close watch on what MPS incarnation we use. Better yet, total control. For those reasons, incorporating MPS into the Emacs sources is the right thing to do. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Merging MPS a.k.a. scratch/igc, yet again 2024-12-10 13:10 ` Óscar Fuentes @ 2024-12-10 15:10 ` Pip Cet via Emacs development discussions. 2024-12-10 15:37 ` Óscar Fuentes 0 siblings, 1 reply; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-10 15:10 UTC (permalink / raw) To: Óscar Fuentes; +Cc: emacs-devel On Tuesday, December 10th, 2024 at 13:10, Óscar Fuentes <ofv@wanadoo.es> wrote: > Xiyue Deng manphiz@gmail.com writes: > * It is very likely that we end doing some patching to the MPS sources > to adapt to our specific needs (if those patches end upstream or not, > that's another question.) If that ends up being the case, we'll have to make sure not to use shared libraries which may contain the upstream code. But that's true of all libraries; in the particular case of a Debian package, both the APT versioning schemes and ELF versioning are available for that. > * MPS does a performance-critical job. Using it as a shared object might > incur in a performance penalty. Having it in source form alongside the > Emacs sources will result in opportunities for optimizations (LTO, > PGO, ...) that may bring better performance. ...and more problems. MPS has made the decision not to work with gcc -O3, only with -O2 or less, and LTO in particular is something MPS cannot reliably support, IIUC. > * MPS does a correctness-critical job. Depending on multiple external > sources for such core component is a recipe for problems (future > changes by the MPS maintainers, patching by packagers, buggy > compilers, etc.) We need to keep a close watch on what MPS incarnation > we use. Better yet, total control. I think the correctness argument goes both ways: shared linking means bugs may be fixed for you automatically, as is routinely the case with libc. > For those reasons, incorporating MPS into the Emacs sources is the right > thing to do. 
I don't think that's an option, because Emacs should remain capable of switching to GPLv4 if and when that is released, and we don't know whether the MPS license is compatible with such a future document. So it's either static or dynamic linking; static links have these disadvantages: * shared libraries on GNU/Linux have versioning, static libs don't, AFAIK * legally, statically-linked binaries are quite different from dynamically-linked ones * someone might enable LTO and break MPS (this may be done automatically by the compiler rather than a user error) * with dynamic linking, there is some hope we could switch from libmps.so to libmps-debug.so without having to recompile Emacs, which would help us diagnose crashes in their actual environment Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Merging MPS a.k.a. scratch/igc, yet again 2024-12-10 15:10 ` Pip Cet via Emacs development discussions. @ 2024-12-10 15:37 ` Óscar Fuentes 2024-12-10 15:47 ` Pip Cet via Emacs development discussions. 2024-12-10 17:16 ` Eli Zaretskii 0 siblings, 2 replies; 112+ messages in thread From: Óscar Fuentes @ 2024-12-10 15:37 UTC (permalink / raw) To: Pip Cet; +Cc: emacs-devel Pip Cet <pipcet@protonmail.com> writes: >> * MPS does a performance-critical job. Using it as a shared object might >> incur in a performance penalty. Having it in source form alongside the >> Emacs sources will result in opportunities for optimizations (LTO, >> PGO, ...) that may bring better performance. > > ...and more problems. MPS has made the decision not to work with gcc > -O3, only with -O2 or less, and LTO in particular is something MPS > cannot reliably support, IIUC. That sounds worrisome. If I understand the implications of what you wrote, MPS basically depends on the specifics of what gcc does. But gcc can do something else in future versions... not to mention what happens if the user wants to use other compilers. Can you point me to a description of how MPS is related to compiler optimizations and specifically to LTO? >> * MPS does a correctness-critical job. Depending on multiple external >> sources for such core component is a recipe for problems (future >> changes by the MPS maintainers, patching by packagers, buggy >> compilers, etc.) We need to keep a close watch on what MPS incarnation >> we use. Better yet, total control. > > I think the correctness argument goes both ways: shared linking means > bugs may be fixed for you automatically, as is routinely the case with > libc. libc is a central piece of any GNU/Linux distribution and therefore well cared for by the packagers. MPS is not. Fixes to minor packages like MPS can take *years* to propagate through the distro universe, if at all.
>> For those reasons, incorporating MPS into the Emacs sources is the right >> thing to do. > > I don't think that's an option, because Emacs should remain capable of > switching to GPLv4 if and when that is released, and we don't know > whether the MPS license is compatible with such a future document. Yeah, the licensing point is what I was too afraid to mention :-) ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Merging MPS a.k.a. scratch/igc, yet again 2024-12-10 15:37 ` Óscar Fuentes @ 2024-12-10 15:47 ` Pip Cet via Emacs development discussions. 2024-12-10 17:16 ` Eli Zaretskii 1 sibling, 0 replies; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-10 15:47 UTC (permalink / raw) To: Óscar Fuentes; +Cc: emacs-devel On Tuesday, December 10th, 2024 at 15:37, Óscar Fuentes <ofv@wanadoo.es> wrote: > Pip Cet pipcet@protonmail.com writes: > > > > * MPS does a performance-critical job. Using it as a shared object might > > > incur in a performance penalty. Having it in source form alongside the > > > Emacs sources will result in opportunities for optimizations (LTO, > > > PGO, ...) that may bring better performance. > > > > ...and more problems. MPS has made the decision not to work with gcc > > -O3, only with -O2 or less, and LTO in particular is something MPS > > cannot reliably support, IIUC. > > That sounds worrysome. If I understand the implications of what you > wrote, MPS basically depends on what the specifics of what gcc does. But No compiler can perform cross-object linking without LTO, so we're safe there. > gcc can do something else on future versions... not to mention what > happens if the user wants to use other compilers. GC in general depends a lot on the compiler (and the programmer) not misbehaving. It is perfectly legal for a C compiler to scramble a pointer in a register, for example, but all conservative stack marking GC approaches will fail to recognize such a scrambled pointer and crash. > Can you point me to a description of how MPS is related to compiler > optimizations and specifically to LTO? I'll have a look. IIRC, setjmp() and the "void *top_of_stack = &top_of_stack" trick failed to properly detect all registers when the entry point was being inlined across objects, and Ravenbrook decided against moving to assembly code for those entry points. 
Of course there's also the scrambled frame pointer problem, but that's about what the client code does, not just about the MPS code. > > > * MPS does a correctness-critical job. Depending on multiple external > > > sources for such core component is a recipe for problems (future > > > changes by the MPS maintainers, patching by packagers, buggy > > > compilers, etc.) We need to keep a close watch on what MPS incarnation > > > we use. Better yet, total control. > > > > I think the correctness argument goes both ways: shared linking means > > bugs may be fixed for you automatically, as is routinely the case with > > libc. > > libc is a central piece of any GNU/Linux distribution and therefore much > cared by the packagers. MPS not so. Fixes on minor packages like MPS can > take years to propagate through the distro universe, if at all. Very good point, thank you. > > > For those reasons, incorporating MPS into the Emacs sources is the right > > > thing to do. > > > > I don't think that's an option, because Emacs should remain capable of > > switching to GPLv4 if and when that is released, and we don't know > > whether the MPS license is compatible with such a future document. > > Yeah, the licensing point is what I was too afraid to mention :-) Don't get me wrong: if Ravenbrook were to assign copyright to the FSF, including it in Emacs would be TRT, but that's unlikely to happen. Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
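Editorial sketch of why a scrambled pointer defeats conservative scanning, as described above. This is not MPS code: it scans an explicit array of words standing in for a stack or register dump, and the function name and the heap bounds used with it are assumptions made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the words in words[0..n-1] that look like pointers into the
   half-open range [heap_start, heap_end).  A conservative collector
   treats every such word as an ambiguous root; any word outside the
   range is ignored, even if it encodes a live pointer.  */
size_t count_ambiguous_roots (const uintptr_t *words, size_t n,
                              uintptr_t heap_start, uintptr_t heap_end)
{
  assert (heap_start <= heap_end);
  size_t hits = 0;
  for (size_t i = 0; i < n; i++)
    if (words[i] >= heap_start && words[i] < heap_end)
      hits++;
  return hits;
}
```

With a made-up heap spanning 0x10000 to 0x20000, a plain pointer value such as 0x10040 is counted, but the same value with all bits inverted is not, so the object it refers to would look unreferenced to the collector.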
* Re: Merging MPS a.k.a. scratch/igc, yet again 2024-12-10 15:37 ` Óscar Fuentes 2024-12-10 15:47 ` Pip Cet via Emacs development discussions. @ 2024-12-10 17:16 ` Eli Zaretskii 1 sibling, 0 replies; 112+ messages in thread From: Eli Zaretskii @ 2024-12-10 17:16 UTC (permalink / raw) To: Óscar Fuentes; +Cc: pipcet, emacs-devel > From: Óscar Fuentes <ofv@wanadoo.es> > Cc: emacs-devel@gnu.org > Date: Tue, 10 Dec 2024 16:37:28 +0100 > > Pip Cet <pipcet@protonmail.com> writes: > > >> * MPS does a performance-critical job. Using it as a shared object might > >> incur in a performance penalty. Having it in source form alongside the > >> Emacs sources will result in opportunities for optimizations (LTO, > >> PGO, ...) that may bring better performance. > > > > ...and more problems. MPS has made the decision not to work with gcc > > -O3, only with -O2 or less, and LTO in particular is something MPS > > cannot reliably support, IIUC. > > That sounds worrysome. If I understand the implications of what you > wrote, MPS basically depends on what the specifics of what gcc does. But > gcc can do something else on future versions... not to mention what > happens if the user wants to use other compilers. MPS does quite a few questionable things and depends on several assumptions that are not easy to uphold. We have already bumped into some of them, with signals (like SIGPROF), for example, and don't yet have a satisfactory solution, at least IMO. It is hardly surprising for a library that attempts to literally pull the rug from under the feet of a running program. We will probably find other issues as we continue testing the branch. That is why it's important for as many people as possible to test it and report any problems. That is most of what is left to do on the branch before we decide it is ready to be merged (or, unlikely, decide the problems are too much for us to cope with). > >> For those reasons, incorporating MPS into the Emacs sources is the right > >> thing to do. 
> > > > I don't think that's an option, because Emacs should remain capable of > > switching to GPLv4 if and when that is released, and we don't know > > whether the MPS license is compatible with such a future document. > > Yeah, the licensing point is what I was too afraid to mention :-) We could alternatively fork the library and keep it in a separate repository, under a different but compatible license. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Merging MPS a.k.a. scratch/igc, yet again 2024-12-10 4:17 ` Xiyue Deng ` (2 preceding siblings ...) 2024-12-10 13:10 ` Óscar Fuentes @ 2024-12-10 13:20 ` Eli Zaretskii 2024-12-10 14:46 ` Pip Cet via Emacs development discussions. 4 siblings, 0 replies; 112+ messages in thread From: Eli Zaretskii @ 2024-12-10 13:20 UTC (permalink / raw) To: Xiyue Deng; +Cc: ofv, emacs-devel > From: Xiyue Deng <manphiz@gmail.com> > Date: Mon, 09 Dec 2024 20:17:54 -0800 > > If making MPS available in Debian would help Emacs packaging I'm willing > to work on this (in the coming weeks as igc may not land with the > upcoming Emacs 30 release so not in a hurry.) Thanks in advance. > I have a few questions regarding the Emacs/igc usage of MPS: > > * Does igc require only mps.{h,c} or more sources from the MPS source > package? It looks like there are many sources and it's autotools > build script fails with GCC 14.2 in Debian Trixie due to several > "-Werror"s. It may be easier to just compile and ship the required > subset, though it may require providing a custom build script. I suggest to use the detailed instructions under "Building the MPS for development" in manual/build.txt. This is what I did, and had no serious problems, even though I needed to concoct the various *.gmk Makefiles because my platform was not supported OOTB (GNU/Linux is supported OOTB). The reason I suggest that is that an official Debian distro of MPS had better included the several different builds of the library ("cool" and "hot"), and also included all the headers that any program using MPS might need, even if Emacs uses just part of them. The package should also include the Info manual, IMO. > * Does igc work with a dynamically linked MPS library? The MPS Makefiles build only static libraries, not shared libraries. Since this library implements GC, and Emacs must have some GC, why does it make sense to build MPS as a shared library? 
> Currently I have > seen people suggesting that directly compiling the source, which is > effectively like using MPS as a static library. It would be less > useful to package a static-only library in Debian because in case of > any issues (usually security) updating the library is insufficient and > its dependencies would need to be rebuilt as well. Using a dynamic > library would solve this scalability issue, and it would be good to > know if igc can work with a dynamically linked MPS. If you must build a shared library, you are basically on your own. And doing that is in stark contrast to what you asked above about headers used only by Emacs. > * Does igc work with the latest tagged version (release-1.118.0) or only > the latest snapshot? Packaging a tagged version would be easier, > though working with a snapshot may also work with a bit of extra > efforts. I built the official release, not a snapshot, FWIW. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Merging MPS a.k.a. scratch/igc, yet again 2024-12-10 4:17 ` Xiyue Deng ` (3 preceding siblings ...) 2024-12-10 13:20 ` Eli Zaretskii @ 2024-12-10 14:46 ` Pip Cet via Emacs development discussions. 4 siblings, 0 replies; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-10 14:46 UTC (permalink / raw) To: Xiyue Deng; +Cc: Óscar Fuentes, emacs-devel On Tuesday, December 10th, 2024 at 04:17, Xiyue Deng <manphiz@gmail.com> wrote: > Óscar Fuentes ofv@wanadoo.es writes: > If making MPS available in Debian would help Emacs packaging I'm willing > to work on this (in the coming weeks as igc may not land with the > upcoming Emacs 30 release so not in a hurry.) I think that would be great, even if we decide we cannot make do with an unmodified upstream version of MPS. > * Does igc require only mps.{h,c} or more sources from the MPS source > package? It looks like there are many sources and it's autotools > build script fails with GCC 14.2 in Debian Trixie due to several > "-Werror"s. It may be easier to just compile and ship the required > subset, though it may require providing a custom build script. Ravenbrook recommends building the library directly by compiling mps.c, and that's what I usually do. I still ended up having to remove -Werror from the .mk files, at some point... > * Does igc work with a dynamically linked MPS library? It definitely does, because that's what we're using on Android. On other systems, statically-linked code may be very slightly faster, but IMHO packaging a statically-linked Emacs+MPS binary is problematic for a few reasons, just as statically linking to libc would be. (It should go without saying that "we always use it" is not sufficient reason for using a statically-linked library) > Currently I have > seen people suggesting that directly compiling the source, which is > effectively like using MPS as a static library. 
That works, and it's what I do on GNU/Linux, but we should probably change our approach there. > It would be less > useful to package a static-only library in Debian because in case of > any issues (usually security) updating the library is insufficient and > its dependencies would need to be rebuilt as well. Using a dynamic > library would solve this scalability issue, and it would be good to > know if igc can work with a dynamically linked MPS. It definitely can work, and I'll look into switching my builds over to using dynamic linking. > * Does igc work with the latest tagged version (release-1.118.0) or only > the latest snapshot? Packaging a tagged version would be easier, > though working with a snapshot may also work with a bit of extra > efforts. It's not quite clear to me yet whether we're going to be able to use unpatched MPS on all architectures (that's somewhat unlikely) or on every architecture except for 32-bit x86 (more likely). Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Merging MPS a.k.a. scratch/igc, yet again 2024-12-09 21:49 ` Óscar Fuentes 2024-12-10 4:17 ` Xiyue Deng @ 2024-12-10 13:09 ` Eli Zaretskii 2024-12-10 13:20 ` Óscar Fuentes 1 sibling, 1 reply; 112+ messages in thread From: Eli Zaretskii @ 2024-12-10 13:09 UTC (permalink / raw) To: Óscar Fuentes; +Cc: emacs-devel > From: Óscar Fuentes <ofv@wanadoo.es> > Date: Mon, 09 Dec 2024 22:49:13 +0100 > > In fact, I'm pretty sure that any experienced autotools hacker can add > MPS to the Emacs build in no time. The only annoying bit is that some > MPS headers collide with Emacs' ??? The MPS build instructions in manual/build.txt say to copy to /usr/include only the headers that begin with "mps", and there are no such headers in Emacs, AFAICT. So what kind of collisions did you see? ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Merging MPS a.k.a. scratch/igc, yet again 2024-12-10 13:09 ` Eli Zaretskii @ 2024-12-10 13:20 ` Óscar Fuentes 2024-12-10 14:41 ` Eli Zaretskii 0 siblings, 1 reply; 112+ messages in thread From: Óscar Fuentes @ 2024-12-10 13:20 UTC (permalink / raw) To: emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> From: Óscar Fuentes <ofv@wanadoo.es> >> Date: Mon, 09 Dec 2024 22:49:13 +0100 >> >> In fact, I'm pretty sure that any experienced autotools hacker can add >> MPS to the Emacs build in no time. The only annoying bit is that some >> MPS headers collide with Emacs' > > ??? The MPS build instructions in manual/build.txt say to copy to > /usr/include only the headers that begin with "mps", and there are no > such headers in Emacs, AFAICT. So what kind of collisions did you > see? I don't recall the details, but passing -I/path/to/mps/code to Emacs' config script resulted in a failed build because the wrong headers were picked while compiling certain .c files. That should be quite easy to replicate, if you are interested. That problem does not happen if the directory passed to config only contains mps*.h files. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Merging MPS a.k.a. scratch/igc, yet again 2024-12-10 13:20 ` Óscar Fuentes @ 2024-12-10 14:41 ` Eli Zaretskii 0 siblings, 0 replies; 112+ messages in thread From: Eli Zaretskii @ 2024-12-10 14:41 UTC (permalink / raw) To: Óscar Fuentes; +Cc: emacs-devel > From: Óscar Fuentes <ofv@wanadoo.es> > Date: Tue, 10 Dec 2024 14:20:44 +0100 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> From: Óscar Fuentes <ofv@wanadoo.es> > >> Date: Mon, 09 Dec 2024 22:49:13 +0100 > >> > >> In fact, I'm pretty sure that any experienced autotools hacker can add > >> MPS to the Emacs build in no time. The only annoying bit is that some > >> MPS headers collide with Emacs' > > > > ??? The MPS build instructions in manual/build.txt say to copy to > > /usr/include only the headers that begin with "mps", and there are no > > such headers in Emacs, AFAICT. So what kind of collisions did you > > see? > > I don't recall the details, but passing -I/path/to/mps/code to Emacs' > config script resulted in a failed build because the wrong headers were > picked while compiling certain .c files. That should be quite easy to > replicate, if you are interested. If that's what you did, then I understand. The MPS instructions tell to copy all the mps*.h files into your /usr/include tree, not what you did. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Merging MPS a.k.a. scratch/igc, yet again 2024-12-09 21:06 ` Merging MPS a.k.a. scratch/igc, yet again Stefan Kangas 2024-12-09 21:49 ` Óscar Fuentes @ 2024-12-09 23:13 ` chad 2024-12-10 12:41 ` Eli Zaretskii 2 siblings, 0 replies; 112+ messages in thread From: chad @ 2024-12-09 23:13 UTC (permalink / raw) To: Stefan Kangas Cc: Eli Zaretskii, pipcet, luangruo, ali_gnu2, emacs-devel, Gerd Möllmann, Stefan Monnier [-- Attachment #1: Type: text/plain, Size: 1405 bytes --] On Mon, Dec 9, 2024 at 4:07 PM Stefan Kangas <stefankangas@gmail.com> wrote: > [...] > Taking a look at README-IGC, it seems like we're still missing build > instructions for Debian. > FWIW, I use a somewhat wacky Debian setup, in that it's basically pure Debian run inside ChromeOS, which adds some mild container restrictions. In practice, the only impact this has is that my window system/manager is pre-selected and mostly unchangeable. I switched to scratch/igc a couple weeks ago, and have noticed no issues. I used some advice from this list, which was basically:
git clone https://github.com/Ravenbrook/mps.git
cd mps/code
cc -O2 -c mps.c
ar rvs libmps.a mps.o
make this available; I put it in /usr/local/lib
make the header files available; I ended up doing cp mps*.h /usr/local/include
configure emacs-igc with "--with-mps=yes"; I also used "--enable-checking=yes --enable-check-lisp-object-type=yes", which is normal practice for me with non-release builds.
I suspect that not all of mps/code/mps*.h need to be copied into /usr/local/include, but I did the first few one at a time before breaking out the shotgun. I've been using this emacs regularly for almost 3 weeks now, but my usage has been quite light; mostly short Org/text docs, plus some occasional package updates (and thus byte & native compiling). I hope that helps, ~Chad [-- Attachment #2: Type: text/html, Size: 2116 bytes --] ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Merging MPS a.k.a. scratch/igc, yet again 2024-12-09 21:06 ` Merging MPS a.k.a. scratch/igc, yet again Stefan Kangas 2024-12-09 21:49 ` Óscar Fuentes 2024-12-09 23:13 ` chad @ 2024-12-10 12:41 ` Eli Zaretskii 2 siblings, 0 replies; 112+ messages in thread From: Eli Zaretskii @ 2024-12-10 12:41 UTC (permalink / raw) To: Stefan Kangas Cc: pipcet, luangruo, ali_gnu2, emacs-devel, gerd.moellmann, monnier > From: Stefan Kangas <stefankangas@gmail.com> > Date: Mon, 9 Dec 2024 13:06:18 -0800 > Cc: pipcet@protonmail.com, luangruo@yahoo.com, ali_gnu2@emvision.com, > emacs-devel@gnu.org, Gerd Möllmann <gerd.moellmann@gmail.com>, > Stefan Monnier <monnier@iro.umontreal.ca> > > Eli Zaretskii <eliz@gnu.org> writes: > > > From where I stand, what's left to do on the branch is stability: > > using the branch, reporting bugs, and fixing them, especially on some > > rarer platforms (*BSD, for example). Plus some decisions: do we fork > > MPS or not, for example. So it isn't such a distant future. > > In that case, I'd suggest that we start working on getting README-IGC > into an excellent state. In August, when I last tried building the > branch, getting it to build was non-trivial, but I didn't try with the > latest instructions. Sure, if the instructions could be improved, this would be good regardless. > Once we feel happy that it's reasonably straightforward to follow the > instructions, I'd suggest that Someone (TM) makes a post to emacs-devel, > asking people to start seriously testing the branch. Such a post should > normally get picked up by Emacs News, Reddit, etc. and hopefully the > branch will then start seeing wider use. (Remember to Cc Sacha Chua to > get it on Emacs News.) This was already done: https://lists.gnu.org/archive/html/emacs-devel/2024-09/msg00257.html and some people already provide such feedback. But, of course, repeating the request for testing and feedback can never do any harm, and can be posted right now. 
> I'm sure that users will be excited to help test igc once they > understand that we're working seriously on stabilizing it in preparation > of getting it merged. Let's hope you are right. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-09 14:39 ` Eli Zaretskii 2024-12-09 21:06 ` Merging MPS a.k.a. scratch/igc, yet again Stefan Kangas @ 2024-12-10 0:09 ` Stefan Kangas 2024-12-10 12:59 ` Eli Zaretskii 1 sibling, 1 reply; 112+ messages in thread From: Stefan Kangas @ 2024-12-10 0:09 UTC (permalink / raw) To: Eli Zaretskii; +Cc: pipcet, luangruo, ali_gnu2, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> From: Stefan Kangas <stefankangas@gmail.com> >> Date: Sun, 8 Dec 2024 23:59:14 -0500 >> Cc: luangruo@yahoo.com, ali_gnu2@emvision.com, emacs-devel@gnu.org >> >> Assuming that we are 100% sure that mpc will land, then I can agree that >> making any changes here is basically wasted effort. Unless, of course, >> the change would also simplify the mpc work (would it?). > > The igc branch already dropped WIDE_EMACS_INT support, so it only > supports USE_LSB anyway. I thought that WIDE_EMACS_INT will remain supported in non-MPS (i.e. "old GC") builds even after the igc merge? Am I mistaken? >> Based on my limited mucking around in the GC, it does interfere somewhat >> because you do need to understand both configurations, at least on a >> high level, and once you do you need to mentally filter that stuff out >> when reading the code. So I think I'd appreciate the simplification, at >> least. > > The simplification is minuscule at best. We need to mask some bits, > either at the LSB end or at MSB end, that's all the difference. And > we have macros that hide the differences from most levels. I agree that it's not a major issue, indeed. You don't need to look at this unless you want to understand how we do GC tagging in detail. OTOH, complexity almost always presents itself in small increments that individually don't look like much. It's only with the combined effect of many such small increments that they become a concern; hence the desire to take similarly small steps towards removing complexity. 
>> If the only known drawbacks are stability concerns, we could also >> consider an intermediate step along these lines: >> >> Leave the USE_LSB_TAG code as is, but set it to 1 in all configurations >> on master. > > That would put the WIDE_EMACS_INT configuration at risk, since that > configuration will need changes. That's why I proposed disabling it on master tentatively, with the option to revert the change if we don't like it. Setting a flag back to 0 is easy enough. But making the experiment I proposed might also demonstrate that we're fine, after all. OTOH, if we don't make the experiment, we have less data on which to base our decision. >> See what issues crop up, if any. If anything does come up, >> ask Pip Cet to fix it (he volunteered, IIUC), and if things are starting >> to look too hairy, revert EMACS_WIDE_INT back to !USE_LSB_TAG. If >> nothing too bad comes up, we can then consider removing the associated >> code in Emacs 32. > > My point is that all of that could be avoided entirely, given some > development decisions which basically drop !USE_LSB_TAG > configurations. Is your thinking here that we could merge MPS, wait, and then when it comes time to remove the old GC, we will get to drop !USE_LSB_TAG for free? If yes, couldn't that leave us waiting for a very long time indeed? Or are you saying something else? ^ permalink raw reply [flat|nested] 112+ messages in thread
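For readers following the USE_LSB_TAG discussion above, the difference between the two configurations ("we need to mask some bits, either at the LSB end or at MSB end") can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `word`, `TAG_BITS`, `lsb_make`, `msb_make` and friends are hypothetical and are not the macros in Emacs's lisp.h, and the sketch assumes a 64-bit tagged word with 3 tag bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration only; these are NOT Emacs's lisp.h macros. */
enum { TAG_BITS = 3 };
typedef uint64_t word;                 /* one tagged 64-bit value */

#define LOW_TAG_MASK ((word) ((1u << TAG_BITS) - 1))
#define HIGH_TAG_SHIFT (64 - TAG_BITS)
#define HIGH_VALUE_MASK ((((word) 1) << HIGH_TAG_SHIFT) - 1)

/* LSB scheme: the object must be 8-byte aligned, so the low three
   bits of its address are zero and can hold the tag.  */
static inline word lsb_make(const void *p, unsigned tag)
{
  word bits = (word) (uintptr_t) p;
  assert((bits & LOW_TAG_MASK) == 0);  /* alignment requirement */
  assert(tag <= LOW_TAG_MASK);
  return bits | tag;
}

static inline unsigned lsb_tag(word w)
{
  return (unsigned) (w & LOW_TAG_MASK);
}

static inline void *lsb_pointer(word w)
{
  return (void *) (uintptr_t) (w & ~LOW_TAG_MASK);
}

/* MSB scheme: the tag lives in the top three bits, so the address
   must fit in the remaining low bits of the word.  */
static inline word msb_make(const void *p, unsigned tag)
{
  word bits = (word) (uintptr_t) p;
  assert((bits >> HIGH_TAG_SHIFT) == 0);  /* address-range requirement */
  assert(tag <= LOW_TAG_MASK);
  return bits | ((word) tag << HIGH_TAG_SHIFT);
}

static inline unsigned msb_tag(word w)
{
  return (unsigned) (w >> HIGH_TAG_SHIFT);
}

static inline void *msb_pointer(word w)
{
  return (void *) (uintptr_t) (w & HIGH_VALUE_MASK);
}
```

In this sketch both schemes round-trip a pointer and a tag; they differ only in which end of the word is masked and in the precondition they place on the address (alignment for the LSB scheme, address range for the MSB scheme).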
* Re: pdumper on Solaris 10 2024-12-10 0:09 ` pdumper on Solaris 10 Stefan Kangas @ 2024-12-10 12:59 ` Eli Zaretskii 2024-12-10 13:39 ` Óscar Fuentes 2024-12-10 15:23 ` Pip Cet via Emacs development discussions. 0 siblings, 2 replies; 112+ messages in thread From: Eli Zaretskii @ 2024-12-10 12:59 UTC (permalink / raw) To: Stefan Kangas; +Cc: pipcet, luangruo, ali_gnu2, emacs-devel > From: Stefan Kangas <stefankangas@gmail.com> > Date: Mon, 9 Dec 2024 19:09:59 -0500 > Cc: pipcet@protonmail.com, luangruo@yahoo.com, ali_gnu2@emvision.com, > emacs-devel@gnu.org > > Eli Zaretskii <eliz@gnu.org> writes: > > >> From: Stefan Kangas <stefankangas@gmail.com> > >> Date: Sun, 8 Dec 2024 23:59:14 -0500 > >> Cc: luangruo@yahoo.com, ali_gnu2@emvision.com, emacs-devel@gnu.org > >> > >> Assuming that we are 100% sure that mpc will land, then I can agree that > >> making any changes here is basically wasted effort. Unless, of course, > >> the change would also simplify the mpc work (would it?). > > > > The igc branch already dropped WIDE_EMACS_INT support, so it only > > supports USE_LSB anyway. > > I thought that WIDE_EMACS_INT will remain supported in non-MPS > (i.e. "old GC") builds even after the igc merge? Am I mistaken? Probably, but who will want to give up igc to get back WIDE_EMACS_INT (if indeed they are incompatible, which seems to be in disagreement)? I most probably won't. > OTOH, complexity almost always presents itself in small increments that > individually don't look like much. But here we have only a handful of increments, so the sum is also minuscule. > >> Leave the USE_LSB_TAG code as is, but set it to 1 in all configurations > >> on master. > > > > That would put the WIDE_EMACS_INT configuration at risk, since that > > configuration will need changes. > > That's why I proposed disabling it on master tentatively, with the > option to revert the change if we don't like it. Setting a flag back to > 0 is easy enough. 
But making the experiment I proposed might also > demonstrate that we're fine, after all. I think we already know that we are "not fine"? Didn't someone say that stack scanning is broken? > > My point is that all of that could be avoided entirely, given some > > development decisions which basically drop !USE_LSB_TAG > > configurations. > > Is your thinking here that we could merge MPS, wait, and then when it > comes time to remove the old GC, we will get to drop !USE_LSB_TAG for > free? If yes, couldn't that leave us waiting for a very long time > indeed? Maybe so, but why is such a long wait a problem? GC _works_, and works well. There are no pressing problems there, and we've lived with it for many years virtually without changes. What's the urge to make modifications there now, especially when there are chances we will be dropping this GC at some point? IMO, our main task here is to develop the application levels of Emacs, and infrastructure needed to enable such developments. We should only invest efforts in stuff like GC and other basics if we see significant issues, or could envision significant performance gains. There are no such issues or gains here, AFAIU. So diverting our humble resources to such jobs is a mistake, IMO. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-10 12:59 ` Eli Zaretskii @ 2024-12-10 13:39 ` Óscar Fuentes 2024-12-10 14:39 ` Eli Zaretskii ` (2 more replies) 2024-12-10 15:23 ` Pip Cet via Emacs development discussions. 1 sibling, 3 replies; 112+ messages in thread From: Óscar Fuentes @ 2024-12-10 13:39 UTC (permalink / raw) To: emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > Maybe so, but why is such a long wait a problem? GC _works_, and > works well. Working on certain projects with lsp-mode is a miserable experience due to all the random pauses. My perception of the past week or two using igc is that those pauses are much less jarring, if perceptible at all. I need more time to make a definitive judgment, though. As code editing evolves and Emacs is put to more demanding tasks, the limitations of GC become more obvious (and CPUs are not getting faster anymore). Apart from that, I'm convinced that there is quite a bit of evolutionary pressure exerted by GC on the Elisp package ecosystem: something that works too slowly or is too bumpy does not attract users and dies. Others may end up devoting a lot of effort to optimizing GC usage, and when they finally work "well enough" (for some generous interpretation) most potential users have already made up their minds (flx.el is a paradigmatic case) or the package author simply stops working on it, sometimes without making the first release. GC also diminishes the benefits of native-comp and other performance enhancements: no matter how fast you make your Elisp execution engine, the time taken by GC establishes a hard limit. But the "stop the world" mode of GC operation makes the user experience considerably worse even if the total time to perform a task is smaller. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-10 13:39 ` Óscar Fuentes @ 2024-12-10 14:39 ` Eli Zaretskii 2024-12-10 15:21 ` Óscar Fuentes 2024-12-10 15:38 ` Pip Cet via Emacs development discussions. 2024-12-10 18:13 ` pdumper on Solaris 10 Gerd Möllmann 2 siblings, 1 reply; 112+ messages in thread From: Eli Zaretskii @ 2024-12-10 14:39 UTC (permalink / raw) To: Óscar Fuentes; +Cc: emacs-devel > From: Óscar Fuentes <ofv@wanadoo.es> > Date: Tue, 10 Dec 2024 14:39:54 +0100 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Maybe so, but why is such a long wait a problem? GC _works_, and > > works well. > > Working on certain projects with lsp-mode is a miserable experience due > to all the random pauses. And the changes discussed in this sub-thread will make it spectacularly faster? > My perception of the past week or two using igc is that those pauses are > much less jarring, if perceptible at all. I need more time to make a > definitive judgment, though. I was not talking about igc, and its advantages are clear to me. That's not what this sub-thread is about. > As code edition evolves and Emacs is put on more demanding tasks the > limitations of GC become more obvious (and CPUs are not getting faster > anymore). > > Apart from that, I'm convinced that there is quite a bit of evolutionary > pressure exerted by GC on the Elisp package ecosystem: something that > works too slowly or is too bumpy does not atract users and die. Others > may end devoting a lot of effort to optimize GC usage and when they > finally work "well enough" (for some generous interpretation) most > potential users already made their mind (flx.el is a paradigmatic case) > or the package author simply stops working on it, sometimes without > making the first release. > > GC also diminishes the benefits of native-comp and other performance > enhancements: no matter how fast you make your Elisp execution engine, > the time taken by GC stablishes a hard limit. 
> > But the "stop the world" mode of GC operation makes user experience > quite worse even if the total time to perform a task is smaller. All correct, but completely irrelevant to the issue at hand. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-10 14:39 ` Eli Zaretskii @ 2024-12-10 15:21 ` Óscar Fuentes 2024-12-10 16:39 ` Eli Zaretskii 0 siblings, 1 reply; 112+ messages in thread From: Óscar Fuentes @ 2024-12-10 15:21 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > All correct, but completely irrelevant to the issue at hand. I was specifically addressing your "GC works, and works well". A GC that takes big chunks of time on what is essentially a single-threaded execution engine and, even more significantly, introduces pauses that impact user experience, does not work well. I would say that it barely works at all, in the sense that it is far from adequate for the kind of application Emacs is. I mean, if igc is finally deemed a success, any effort directed at keeping GC at the expense of anything else would be work invested in a misfeature, IMHO. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-10 15:21 ` Óscar Fuentes @ 2024-12-10 16:39 ` Eli Zaretskii 0 siblings, 0 replies; 112+ messages in thread From: Eli Zaretskii @ 2024-12-10 16:39 UTC (permalink / raw) To: Óscar Fuentes; +Cc: emacs-devel > From: Óscar Fuentes <ofv@wanadoo.es> > Cc: emacs-devel@gnu.org > Date: Tue, 10 Dec 2024 16:21:13 +0100 > X-Spam-Status: No > > Eli Zaretskii <eliz@gnu.org> writes: > > > All correct, but completely irrelevant to the issue at hand. > > I was specifically addressing your "GC works, and works well". > > A GC that takes big chunks of time on what is essentially a > single-threaded execution engine and, even more significantly, > introduces pauses that impacts user experience, does not work well, I > would say that it barely works at all, in the sense that it is far from > adequate for the kind of application Emacs is. > > I mean, if igc is finally deemed a success, any effort directed at > keeping GC at the expense of anything else would be work invested on a > misfeature, IMHO. This sub-thread was not about GC vs igc, it was about changes in GC itself that would never come even close to igc. Everything I wrote should be assessed from that angle. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-10 13:39 ` Óscar Fuentes 2024-12-10 14:39 ` Eli Zaretskii @ 2024-12-10 15:38 ` Pip Cet via Emacs development discussions. 2024-12-10 16:04 ` Óscar Fuentes 2024-12-11 5:27 ` Gap buffer problem? Gerd Möllmann 2024-12-10 18:13 ` pdumper on Solaris 10 Gerd Möllmann 2 siblings, 2 replies; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-10 15:38 UTC (permalink / raw) To: Óscar Fuentes; +Cc: emacs-devel On Tuesday, December 10th, 2024 at 13:39, Óscar Fuentes <ofv@wanadoo.es> wrote: > Eli Zaretskii eliz@gnu.org writes: > > Maybe so, but why is such a long wait a problem? GC works, and > > works well. > > Working on certain projects with lsp-mode is a miserable experience due > to all the random pauses. To be fair, part of that may be the gap buffer problem rather than GC. > My perception of the past week or two using igc is that those pauses are > much less jarring, if perceptible at all. I need more time to make a > definitive judgment, though. If you do, and it's negative, please take into account that MPS offers many tunable parameters, and hasn't been fine-tuned for Emacs yet. Even if the current scratch/igc branch isn't satisfactory by itself, it's very likely it can be improved by changing some numbers. > But the "stop the world" mode of GC operation makes user experience > quite worse even if the total time to perform a task is smaller. Of course, these problems are largely fixable, and have been fixed, by such approaches as the fork()-based GC I wrote, which Eli vetoed (I believe the same applies to moving the GC mark bits to their own memory regions, which would have allowed us to interrupt GC on user input). 
The "don't touch the GC" edict has done a great deal of harm to Emacs; this is relevant because we're now discussing a simplification of the GC code which would help MPS, but is being vetoed (again), while putting effort into making our current code even more complicated by including an impossible code path is being encouraged. So, no, the current GC doesn't work well, it does cause problems, its code is overly complicated, and simplifications would make switching to MPS a lot easier. All is not well in GC land. Put drastically, if MPS fails to land, the most likely reason is the capriciously-applied "do not touch the GC" rule. Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-10 15:38 ` Pip Cet via Emacs development discussions. @ 2024-12-10 16:04 ` Óscar Fuentes 2024-12-10 17:23 ` Eli Zaretskii 2024-12-11 5:27 ` Gap buffer problem? Gerd Möllmann 1 sibling, 1 reply; 112+ messages in thread From: Óscar Fuentes @ 2024-12-10 16:04 UTC (permalink / raw) To: Pip Cet; +Cc: emacs-devel Pip Cet <pipcet@protonmail.com> writes: >> My perception of the past week or two using igc is that those pauses are >> much less jarring, if perceptible at all. I need more time to make a >> definitive judgment, though. > > If you do, and it's negative, please take into account that MPS offers > many tunable parameters, and hasn't been fine-tuned for Emacs yet. > Even if the current scratch/igc branch isn't satisfactory by itself, > it's very likely it can be improved by changing some numbers. Noted, thanks. > this is relevant because we're now discussing a simplification of the > GC code which would help MPS Those modifications can go on a branch (a fork of scratch/igc). When/if igc demonstrates its virtues and is considered a considerable improvement for Emacs, related changes will surely meet less opposition. Then you can point to that branch and suggest merging it instead of scratch/igc. > Put drastically, if MPS fails to land, the most likely reason is the > capriciously-applied "do not touch the GC" rule. What appears capricious from the outside may be responsible maintenance from the inside. Eli and a few others have a very long-term commitment to Emacs and, as maintainers, consider not degrading stability their principal duty towards users, which in practice means being almost overly conservative. And even if I sometimes get irritated by some decisions, knowing that I can rely on Emacs working (save for very occasional tweaks) is something that I appreciate very much. Remember XEmacs? ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-10 16:04 ` Óscar Fuentes @ 2024-12-10 17:23 ` Eli Zaretskii 0 siblings, 0 replies; 112+ messages in thread From: Eli Zaretskii @ 2024-12-10 17:23 UTC (permalink / raw) To: Óscar Fuentes; +Cc: pipcet, emacs-devel > From: Óscar Fuentes <ofv@wanadoo.es> > Cc: emacs-devel@gnu.org > Date: Tue, 10 Dec 2024 17:04:39 +0100 > > Pip Cet <pipcet@protonmail.com> writes: > > > Put drastically, if MPS fails to land, the most likely reason is the > > capriciously-applied "do not touch the GC" rule. > > What appears capriciously from the outside, may be responsible > maintenance from the inside. More importantly, since some platforms we care about probably won't support MPS, it could be that the old GC will have to stay with us for a very long time, alongside MPS. Keeping that old GC code stable and reliable is thus very important even if MPS will land (which I personally hope it will). Emacs is a very stable platform, and our users rely on us to keep it stable, even though we sometimes add semi-revolutionary new features to it. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Gap buffer problem? 2024-12-10 15:38 ` Pip Cet via Emacs development discussions. 2024-12-10 16:04 ` Óscar Fuentes @ 2024-12-11 5:27 ` Gerd Möllmann 2024-12-11 8:50 ` Pip Cet via Emacs development discussions. 2024-12-11 14:22 ` Eli Zaretskii 1 sibling, 2 replies; 112+ messages in thread From: Gerd Möllmann @ 2024-12-11 5:27 UTC (permalink / raw) To: Pip Cet via Emacs development discussions.; +Cc: Óscar Fuentes, Pip Cet Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> writes: > On Tuesday, December 10th, 2024 at 13:39, Óscar Fuentes <ofv@wanadoo.es> wrote: >> Eli Zaretskii eliz@gnu.org writes: >> > Maybe so, but why is such a long wait a problem? GC works, and >> > works well. >> >> Working on certain projects with lsp-mode is a miserable experience due >> to all the random pauses. > > To be fair, part of that may be the gap buffer problem rather than GC. Could you please tell more about the gap buffer problem? I've read a little about the tradeoffs between gap buffers, piece tables, ropes, but I'm wondering if there is something concrete already known for sure that is a performance problem in Emacs. Maybe a bug that has been analyzed or something. (I'm asking because I just recently encountered a performance problem when adding something to xdisp.c:27339 (with cc-mode, Eglot, Corfu), and editing there was so slow that it was absolutely no fun, and that on an M1 Pro. Haven't investigated the reason.) ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Gap buffer problem? 2024-12-11 5:27 ` Gap buffer problem? Gerd Möllmann @ 2024-12-11 8:50 ` Pip Cet via Emacs development discussions. 2024-12-11 9:35 ` Gerd Möllmann 2024-12-11 14:22 ` Eli Zaretskii 1 sibling, 1 reply; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-11 8:50 UTC (permalink / raw) To: Gerd Möllmann Cc: Pip Cet via "Emacs development discussions.", Óscar Fuentes Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> > writes: > >> On Tuesday, December 10th, 2024 at 13:39, Óscar Fuentes <ofv@wanadoo.es> wrote: >>> Eli Zaretskii eliz@gnu.org writes: >>> > Maybe so, but why is such a long wait a problem? GC works, and >>> > works well. >>> >>> Working on certain projects with lsp-mode is a miserable experience due >>> to all the random pauses. >> >> To be fair, part of that may be the gap buffer problem rather than GC. > > Could you please tell more about the gap buffer problem? Just anecdotes, I'm afraid. My problem was a large buffer of test descriptions for a programming language, and I was running the tests and modifying the buffer to contain the output for each test in a block after the test itself. That worked, but running several tests in parallel, moving back and forth in the buffer to modify text as the output came in ... not so much. I also recall discussion somewhere (nullprogram.com, maybe) about multiple cursors and the gap buffer, and that's also a potential use case where the gap buffer would make things very slow. > I've read a little about the tradeoffs between gap buffers, piece > tables, ropes, but I'm wondering if there is something concrete already > known for sure that is a performance problem in Emacs. Maybe a bug that > has been analyzed or something. I'd be very interested in such a bug. 
Replacing the gap buffer assumption is quite hard: IIRC, the main problem is that the regexp code has been hacked to support gap buffers but not other data structures, so we'd need to do something about that. > (I'm asking because I just recently encountered a performance problem > when adding something to xdisp.c:27339 (with cc-mode, Eglot, Corfu), and > editing there was so slow that it was absolutely no fun, and that on a > an M1 pro. Haven't investigated the reason.) Interesting. It may be worth it to try reproducing that and disabling modes one by one to find out which one is at fault. I suspect that it's overlays/the interval tree rather than the gap buffer per se (however, if we ever replace the gap buffer code, we should make sure its replacement actually handles buffer text and text properties/intervals in an integrated manner, rather than storing just buffer text). Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
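To make the scenario described above more concrete (edits landing alternately at distant positions in one buffer), here is a minimal gap-buffer sketch that counts how many bytes are copied to relocate the gap. It is a toy under stated assumptions, not Emacs's actual buffer code: the type `struct gap_buffer`, the fixed capacity, and all `gb_*` function names are hypothetical, and the gap never grows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-capacity gap buffer; not Emacs's implementation. */
enum { GB_CAPACITY = 64 };

struct gap_buffer {
  char text[GB_CAPACITY];
  size_t gap_start;    /* index of the first byte of the gap */
  size_t gap_end;      /* index one past the last byte of the gap */
  size_t bytes_moved;  /* total bytes copied while relocating the gap */
};

static inline void gb_init(struct gap_buffer *gb)
{
  gb->gap_start = 0;
  gb->gap_end = GB_CAPACITY;
  gb->bytes_moved = 0;
}

/* Number of bytes of actual text (everything except the gap). */
static inline size_t gb_length(const struct gap_buffer *gb)
{
  return GB_CAPACITY - (gb->gap_end - gb->gap_start);
}

/* Relocate the gap so that it starts at text position POS.  The cost
   is proportional to the distance between the old and new position. */
static inline void gb_move_gap(struct gap_buffer *gb, size_t pos)
{
  assert(pos <= gb_length(gb));
  if (pos < gb->gap_start) {
    /* Shift the text between POS and the gap to the far side of the gap. */
    size_t n = gb->gap_start - pos;
    memmove(gb->text + gb->gap_end - n, gb->text + pos, n);
    gb->gap_start -= n;
    gb->gap_end -= n;
    gb->bytes_moved += n;
  } else if (pos > gb->gap_start) {
    /* Shift text from just after the gap to just before it. */
    size_t n = pos - gb->gap_start;
    memmove(gb->text + gb->gap_start, gb->text + gb->gap_end, n);
    gb->gap_start += n;
    gb->gap_end += n;
    gb->bytes_moved += n;
  }
}

/* Insert the NUL-terminated string S at text position POS. */
static inline void gb_insert(struct gap_buffer *gb, size_t pos, const char *s)
{
  size_t n = strlen(s);
  assert(n <= gb->gap_end - gb->gap_start);  /* this toy never grows the gap */
  gb_move_gap(gb, pos);
  memcpy(gb->text + gb->gap_start, s, n);
  gb->gap_start += n;
}

/* Copy the text, without the gap, into OUT as a NUL-terminated string.
   OUT must have room for gb_length (gb) + 1 bytes.  */
static inline void gb_contents(const struct gap_buffer *gb, char *out)
{
  memcpy(out, gb->text, gb->gap_start);
  memcpy(out + gb->gap_start, gb->text + gb->gap_end,
         GB_CAPACITY - gb->gap_end);
  out[gb_length(gb)] = '\0';
}
```

In this sketch, consecutive insertions at the same spot copy nothing extra, while each jump to a distant position copies every byte between the old and new gap location. With several writers alternating between far-apart positions in a large buffer, that copying is paid on nearly every edit, which is one plausible reading of the anecdote above.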
* Re: Gap buffer problem? 2024-12-11 8:50 ` Pip Cet via Emacs development discussions. @ 2024-12-11 9:35 ` Gerd Möllmann 2024-12-11 11:50 ` Pip Cet via Emacs development discussions. 2024-12-11 12:27 ` Pip Cet via Emacs development discussions. 0 siblings, 2 replies; 112+ messages in thread From: Gerd Möllmann @ 2024-12-11 9:35 UTC (permalink / raw) To: Pip Cet Cc: Pip Cet via "Emacs development discussions.", Óscar Fuentes Pip Cet <pipcet@protonmail.com> writes: > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > >> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> >> writes: >> >>> On Tuesday, December 10th, 2024 at 13:39, Óscar Fuentes <ofv@wanadoo.es> wrote: >>>> Eli Zaretskii eliz@gnu.org writes: >>>> > Maybe so, but why is such a long wait a problem? GC works, and >>>> > works well. >>>> >>>> Working on certain projects with lsp-mode is a miserable experience due >>>> to all the random pauses. >>> >>> To be fair, part of that may be the gap buffer problem rather than GC. >> >> Could you please tell more about the gap buffer problem? > > Just anecdotes, I'm afraid. My problem was a large buffer of test > descriptions for a programming language, and I was running the tests and > modifying the buffer to contain the output for each test in a block > after the test itself. That worked, but running several tests in > parallel, moving back and forth in the buffer to modify text as the > output came in ... not so much. > > I also recall discussion somewhere (nullprogram.com, maybe) about > multiple cursors and the gap buffer, and that's also a potential use > case where the gap buffer would make things very slow. Thanks. > >> I've read a little about the tradeoffs between gap buffers, piece >> tables, ropes, but I'm wondering if there is something concrete already >> known for sure that is a performance problem in Emacs. Maybe a bug that >> has been analyzed or something. > > I'd be very interested in such a bug. 
Replacing the gap buffer > assumption is quite hard: IIRC, the main problem is that the regexp code > has been hacked to support gap buffers but not other data structures, so > we'd need to do something about that. > >> (I'm asking because I just recently encountered a performance problem >> when adding something to xdisp.c:27339 (with cc-mode, Eglot, Corfu), and >> editing there was so slow that it was absolutely no fun, and that on a >> an M1 pro. Haven't investigated the reason.) > > Interesting. It may be worth it to try reproducing that and disabling > modes one by one to find out which one is at fault. I suspect that it's > overlays/the interval tree rather than the gap buffer per se (however, Yeah, maybe I'll investigate that further at some point, not sure. I did try with VSCode and Zed now, though, for no good reason. They don't have a problem. > if we ever replace the gap buffer code, we should make sure its > replacement actually handles buffer text and text properties/intervals > in an integrated manner, rather than storing just buffer text). > > Pip And if I may add a wish to the future author: Make whatever you use persistent data structures, so that one could think of letting redisplay run concurrently. Really! :-) ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Gap buffer problem? 2024-12-11 9:35 ` Gerd Möllmann @ 2024-12-11 11:50 ` Pip Cet via Emacs development discussions. 2024-12-11 13:22 ` Gerd Möllmann 2024-12-11 12:27 ` Pip Cet via Emacs development discussions. 1 sibling, 1 reply; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-11 11:50 UTC (permalink / raw) To: Gerd Möllmann Cc: Pip Cet via "Emacs development discussions.", Óscar Fuentes Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Pip Cet <pipcet@protonmail.com> writes: >> if we ever replace the gap buffer code, we should make sure its >> replacement actually handles buffer text and text properties/intervals >> in an integrated manner, rather than storing just buffer text). >> >> Pip > > And if I may add a wish to the future author: Make whatever you use > persistent data structures, so that one could think of letting redisplay > run concurrently. Really! :-) You won't be surprised to hear I've been playing with some code, so could I ask you to expand on this point? What precisely does redisplay require? Full snapshotting or would it be sufficient to have fine-grained locking? (However, before anyone gets their hopes and/or fears up, my code depends on disabling most of the regexp code, and the additional number of garbage-collected objects is so great that I concluded I'd wait for MPS to land before resuming work on it. One of the few distinct advantages of the current gap buffer approach is that it doesn't affect GC...) I know virtually nothing about redisplay. Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Gap buffer problem? 2024-12-11 11:50 ` Pip Cet via Emacs development discussions. @ 2024-12-11 13:22 ` Gerd Möllmann 2024-12-11 14:53 ` Pip Cet via Emacs development discussions. 0 siblings, 1 reply; 112+ messages in thread From: Gerd Möllmann @ 2024-12-11 13:22 UTC (permalink / raw) To: Pip Cet Cc: Pip Cet via "Emacs development discussions.", Óscar Fuentes [-- Attachment #1: Type: text/plain, Size: 1621 bytes --] Pip Cet <pipcet@protonmail.com> writes: > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > >> Pip Cet <pipcet@protonmail.com> writes: >>> if we ever replace the gap buffer code, we should make sure its >>> replacement actually handles buffer text and text properties/intervals >>> in an integrated manner, rather than storing just buffer text). >>> >>> Pip >> >> And if I may add a wish to the future author: Make whatever you use >> persistent data structures, so that one could think of letting redisplay >> run concurrently. Really! :-) > > You won't be surprised to hear I've been playing with some code, Indeed, I was just thinking to myself "I knew it" :-). Two thumbs up! > so could I ask you to expand on this point? What precisely does > redisplay require? Full snapshotting or would it be sufficient to have > fine-grained locking? Maybe it's helpful if I tell something about the background. Some time last year I asked myself if I could make Emacs use more than one of my plenty of CPU cores without solving the multi-threaded Elisp problem. And the idea was that I could do that, possibly, by letting redisplay happen in another thread. I later realized, while thinking about the details, that this undertaking is an order of magnitude too large for me. Everything taking more than a few months is. And, in addition, I wouldn't want to do data structures in C anyway. So it's history. Won't happen. But there is an incomplete, terse, terrible Org file from those times that I kept. I talked a bit about this with Stefan Monnier and Eli at the time, just FYI.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Concurrent redisplay --]
[-- Type: text/x-org, Size: 19475 bytes --]

:PROPERTIES:
:ID: E5E87FA1-48D1-4753-AAAE-E86FB36F5742
:END:
#+title: Concurrent Redisplay
# -*- mode: org; eval: (auto-fill-mode 1) -*-
#+STARTUP: content
#+AUTHOR: gerd@gnu.org

* Concurrent Redisplay

Redisplay is currently performed sequentially as part of Emacs' command
loop. The command loop calls =redisplay= to make sure that changes in
buffers are made visible on the screen.

Concurrent redisplay means to change Emacs' architecture, so that
redisplay can be done concurrently with the command loop and running
Elisp.

In this document, I'm trying to get an impression of whether a parallel
redisplay is achievable, from a very high-level perspective at least.

To make thinking about this possible, I make a number of assumptions and
simplifications, which are described in the following.

** Multi-threaded Lisp

This document is in no way concerned with making Elisp multi-threaded,
if that's possible, if so how, and what else.

Due to demand from others, I'm also considering the case that a
concurrent redisplay can call Lisp. How this is made possible, I'm not
considering.

** Possible Gains

- Distribute work on more than one CPU core
- Makes it possible to implement advanced display features in the future
  that would be too costly to perform in a sequential redisplay.

** Concurrency Architecture

As an architecture that is simple to reason about, I assume that Emacs
will consist of two modules:

- The =main= module consists of command loop and Lisp, and runs in one
  thread.
- The =redisplay= module runs in another thread.

Both modules are isolated from each other, and may not access data owned
by the other module. Communication between modules only happens by
exchanging non-blocking messages.

I could imagine a GUI/TUI backend model in this picture, for good
measure, but won't consider that further.
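
A minimal sketch of the two-module architecture described above, written
in Python purely for illustration (it is not Emacs code). The message
names ("model", "redisplay", "glyphs", "quit") and the upper-casing
stand-in for "producing glyphs" are assumptions made up for this example.
Each module owns its data, and the only contact between them is a pair of
queues.

```python
import queue
import threading

def redisplay_worker(inbox, outbox):
    """Hypothetical redisplay module: owns its model, talks only via messages."""
    model = None  # private to this thread; main never touches it
    while True:
        msg = inbox.get()
        if msg[0] == "quit":
            break
        if msg[0] == "model":
            model = msg[1]  # an immutable snapshot (a str) handed over by main
        elif msg[0] == "redisplay" and model is not None:
            # Stand-in for "produce desired glyphs" from the owned model.
            glyphs = tuple(line.upper() for line in model.split("\n"))
            outbox.put(("glyphs", glyphs))

def run_demo():
    to_redisplay = queue.Queue()  # unbounded, so put() never blocks the sender
    to_main = queue.Queue()
    worker = threading.Thread(target=redisplay_worker,
                              args=(to_redisplay, to_main))
    worker.start()
    to_redisplay.put(("model", "hello\nworld"))
    to_redisplay.put(("redisplay",))
    # The demo waits for the reply; a real command loop would poll instead,
    # for example with get_nowait().
    reply = to_main.get(timeout=5)
    to_redisplay.put(("quit",))
    worker.join()
    return reply
```

Because the model is passed as an immutable value, neither side needs a
lock. This is the property the persistent data structures discussed later
are meant to provide for real buffer text.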

Random links:

The Problem with Threads
https://www2.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf

Plain Threads are the Goto of Modern Computing
https://isocpp.org/blog/2014/12/plain-threads-are-the-goto-of-todays-computing

** Display Model

Concurrent redisplay in the architecture described must work on a model
that it owns. It is assumed for now, that this model represents a
buffer's text plus a number of properties/variables relevant to
redisplay, like faces that apply to regions of text. See
[[*Redisplay Model]].

** Triggering Redisplay

Concurrent redisplay could choose to display at its own whim, or when
triggered by receiving a message from the main module. It could, for
example, decide to redisplay based on available hardware frame rates.
How this is done is not considered here.

** Display Update

Roughly speaking, current redisplay can be divided into two parts:

- Produce desired glyphs, which describe what the display should look
  like.
- Update the display by comparing current and desired glyphs and calling
  the GUI/TUI backend(s). Then set the current glyphs to the desired
  glyphs for the next round.

The update part is not considered in the following. There are several
conceivable ways to implement an update:

- Update in the =redisplay= module
  * Call GUI backend directly
  * Post messages to the =main= module, which calls GUI backend
  * Post messages to a possible GUI module
- Update in the =main= module
  + =Redisplay= posts message containing desired glyphs

This looks like a solvable problem to me. So, for simplicity, I don't
consider it here.

In the following, "redisplay" mainly refers to producing desired glyphs.

** One Window/Buffer

For simplicity, I only consider one window displaying one buffer. An
interesting, maybe even natural, idea might be to run more than one
redisplay in parallel, one for each window, but that is also not in
scope here.

** Frame-based Redisplay

Also not considered here is the update phase of TTY frames, which
currently requires a view of all windows on a frame at once, which is
commonly called frame-based redisplay.

* Aspects to Consider

The following is a list of things to consider when thinking of making
redisplay concurrent.

** jit-lock

Current redisplay runs =fontification-functions= to ensure that
properties are up to date for the text being displayed. This will not be
possible in a concurrent redisplay, unless one assumes that Lisp can be
called from multiple threads.

Stefan (and Eli?) think we will eventually need to be able to call some
Lisp-ish code from a concurrent redisplay before it can fully replace
the existing synchronous redisplay.

I myself would accept a display that is not always 100% accurate, for
example because parts of the text have not been fontified yet, or
compositions determined.

Instead of calling Lisp from redisplay, one could use background
fontification in the main module together with guesstimates of which
regions of text will actually be displayed, to make sure that
fontification results are visible as soon as possible. The redisplay
module could also post messages to guide this guessing, for example at
the end of redisplay.

** Hooks and functions

In general, the idea is that =redisplay= posts messages to the =main=
module that certain things have happened.

Hooks/functions run by current redisplay are:

- window-scroll-functions
- activate-menubar-hook (redisplay_internal -> prepare_menu_bar ->
  update_menu_bar)
- update-menubar-hook (update_menu_bar)
- fontification-functions (jit-lock, and maybe others)
- ?

Open: what are the expectations of these hook functions about variables,
buffers?

** Caches

Redisplay needs access to face, font and image caches which are stored
on frames (owned by main module).

I propose as an idea to remove the caches from frames and give ownership
of these to concurrent redisplay. Could be a table frame -> caches.

Some communication is necessary from the main module to redisplay for
frame changes, clearing caches from Lisp, ....

- does face/font/image code called during redisplay call Lisp?
- can face etc. code be run from another thread?

** Glyph matrices

Glyph matrices are a form of cache, so they should be treated likewise.
Depends on which module does the update phase of redisplay.

** Point and mark

Needed for region highlighting. Create new model version when changing
these. This requires that creating new model versions is reasonably
fast.

** window-start computation

Redisplay posts message back to main module containing information about
what is on the display. Window start/end could be part of that.

** move_it functions

This concerns Lisp functions like vertical-motion. Rely (relied?) partly
on current glyph matrix, and otherwise on redisplay functions that are
used without producing glyphs.

- do expect results based on current text, even if not displayed yet.
  Open. No solution in mind that isn't ugly (locking, maybe).
- need display model to operate on.
- Pixel positions in text?

** Mouse highlight

Open. No idea how that is done nowadays. It used to use the current
matrices, only.

** TTYs

The update phase of redisplay on ttys needs a view of the whole frame's
current and desired glyph matrices for optimization. This is done by
giving tty frames matrices, and sub-allocating window matrices from
these.

The descriptions above are not affected by this, but it has to be kept
in mind, for the update phase, which is not yet taken into account.

** minibuffer, reading from

Seems to be all the same as other windows, but open.

** Echo area

Open (-> window geometry?)

** Bidi

Eli says:

#+begin_quote
I think this can be removed from the list of issues. Basically (with a
few caveats, see below, which I don't think change anything in
principle), bidi.c is just a subroutine of set_iterator_to_next, which
implements the non-linear scanning of buffer text needed for bidi
reordering.
It effectively causes set_iterator_to_next to move to the next character
in _visual_ order, not in buffer position order (the latter would
require just incrementing the buffer position). To do this, bidi.c needs
access to buffer text, and little else.

The caveats I mentioned are:

  . sometimes we need to figure out the base paragraph direction,
    either L2R or R2L (the latter will be displayed with characters
    starting at the right edge of the window instead of the left), in
    this case bidi.c looks back using regexps for the beginning of the
    paragraph, because the Unicode Bidirectional Algorithm mandates
    that the paragraph direction is determined by the first strong
    directional character of the paragraph
  . when the buffer includes display properties, bidi.c treats all the
    characters "covered" by the property as a single neutral
    character, since this is how images and other such stuff needs to
    be handled for display reordering purposes -- this requires partial
    processing of display properties for the single purpose of
    determining whether they are "replacing" or "non-replacing"
    properties, and in the former case to determine at which buffer
    position the display property ends

I don't think these caveats change anything, since again they only need
to access buffer text. The bidi reordering code maintains a state
(struct bidi_it), but it is a sub-structure of struct it, and lives only
as long as the iterator object lives.
#+end_quote

** Narrowing

Don't remember how that is done.

** Selective display

Open. Should at long last die.

** Window geometry changes

Open.

** Others?

* Redisplay Model

What is being displayed and how it is displayed depends on

- buffer text
- Properties of the text (overlays, text properties)
- Values of display-relevant variables (=truncate-lines=, ...)

Concurrent redisplay must own such a model, so that no synchronization
is necessary between =main= and =redisplay= module.

** Buffer Text

*** Copying

One could think of making copies of everything that is needed for
redisplay and let concurrent redisplay work on such a model.

I believe this is out of the question, for performance reasons. Such a
copy would have to be made by the main module, and that could easily
cost more than what we do now in sequential redisplay, especially if we
don't exactly know what data redisplay will need (range of text, for
example).

*** Another "Copying" Possibility

Stefan Monnier had another interesting idea that I quote here

#+begin_quote
Note that you can also use the current text representation with a
concurrent redisplay: simply keep a whole copy of the buffer over in the
redisplay side. Updating that whole copy should usually be quite
efficient thanks to BEG/END_UNCHANGED.
#+end_quote

*** Persistence

To avoid copying, let buffer text be represented as a persistent data
structure.

Conceptually, this persistent data structure contains an ordered set of
buffer-text versions. When the =main= module modifies buffer text, new
versions are created. When =redisplay= starts, it picks the youngest
version available as buffer text. This is safe because it is known that
any modification in =main= will lead to a new version, and not modify an
existing version.

The "piece table" is an interesting representation for such a persistent
buffer text data structure. Some later descriptions assume that buffer
text uses a persistent piece table.
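
To make the idea concrete, here is a small sketch of a persistent piece
table in Python. It is an illustration only: the names =Piece=,
=Version=, =make_buffer= and =insert= are invented for this example, and
a real implementation would use a balanced tree and a shared append-only
buffer rather than tuples and string concatenation. The point it shows is
that an insertion builds a new version while every older version stays
readable and unchanged.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Piece:
    source: str  # "orig" or "add": which buffer the piece points into
    start: int
    length: int

@dataclass(frozen=True)
class Version:
    original: str
    added: str  # append-only in spirit: old pieces only reference a prefix
    pieces: Tuple[Piece, ...]

    def text(self) -> str:
        bufs = {"orig": self.original, "add": self.added}
        return "".join(bufs[p.source][p.start:p.start + p.length]
                       for p in self.pieces)

def make_buffer(text: str) -> Version:
    pieces = (Piece("orig", 0, len(text)),) if text else ()
    return Version(text, "", pieces)

def insert(v: Version, pos: int, s: str) -> Version:
    """Return a new version with s inserted at pos; v itself is not modified."""
    new_piece = Piece("add", len(v.added), len(s))
    out = []
    offset = 0
    placed = False
    for p in v.pieces:
        if not placed and offset <= pos <= offset + p.length:
            left = pos - offset
            if left > 0:  # part of the piece before the insertion point
                out.append(Piece(p.source, p.start, left))
            out.append(new_piece)
            if left < p.length:  # part of the piece after the insertion point
                out.append(Piece(p.source, p.start + left, p.length - left))
            placed = True
        else:
            out.append(p)
        offset += p.length
    if not placed:  # empty buffer, or pos past the end: append at the end
        out.append(new_piece)
    return Version(v.original, v.added + s, tuple(out))
```

With this, a redisplay thread could keep reading an old =Version= while
the main module goes on creating new ones. Deletion would follow the same
pattern, splitting pieces and returning a new tuple.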

Some links:

An interesting paper about text representations in general:
https://www.cs.unm.edu/~crowley/papers/sds.pdf

Piece tables:
https://www.averylaird.com/programming/the%20text%20editor/2017/09/30/the-piece-table

An implementation of a persistent piece table:
https://github.com/cdacamar/fredbuf

A blog post about VSCode using piece tables:
https://code.visualstudio.com/blogs/2018/03/23/text-buffer-reimplementation

An implementation of a persistent tree:
https://cglab.ca/~dana/pbst/#:~:text=A%20persistent%20binary%20search%20tree,into%2Fdeletion%20from%20the%20tree.

** Properties (Overlays + Text Properties)

Properties that are relevant for redisplay are:

- =face=
- =invisible=
- =display=
- =composition=

Redisplay needs the following information about properties:

- start + end position
- property value

Property values can contain constructs that eval Lisp. Examples:

- display (=:when=, =(:height FN)=, ...)
- =mode-line-format= may also contain =:eval=

If concurrent redisplay cannot call Lisp: The parts of the property
values that require evaluating Lisp must be part of the display model in
evaluated form.

Such a model could contain a map =Lisp_Object= -> =value= (at the time
the model version was current), where

- the key =Lisp_Object= is the part of the original property value
  containing =:eval=, for instance. It could be the =cons= cell of an
  =(:eval ...)=
- =value= is the evaluated value

Changing properties must create new model versions.

- adding/removing/changing props -> new model version
- each piece in a piece table could have a list of applicable props for
  the whole piece.
- mass changes could be done without producing lots of new model
  versions
  + requires that concurrent redisplay doesn't work on a model that is
    mass-updated, which could require synchronization, which is ugly.

Possible optimizations:

- discard/coalesce old model versions in the background, to reduce
  memory footprint?

The main module creates new versions, only.
Redisplay uses only the latest version.

** Variables

The display model must also contain a snapshot of the values of all
relevant variables at the time of the model version.

Relevant values are:

- truncate-lines
- scroll-conservatively
- window, frame, buffer, global values (window-start, ...)
- ?
- todo. make a list

* Persistent Data Structures

Wikipedia: https://en.wikipedia.org/wiki/Persistent_data_structure

** Terminology

Short summary of the terminology:

- persistent
  + general term encompassing variations below
  + always preserves versions of itself when modified.
  + immutable in the sense that they are not changed in-place.
- partially persistent
  + all versions can be read
  + only newest version can be modified.
- fully persistent
  + all versions can be read
  + every version can be modified.
- confluently persistent
  + fully persistent
  + versions can be merged (melded).

** Links

Kind of a brief overview:
https://academic-accelerator.com/encyclopedia/persistent-data-structure

Irmin: Mergeable ropes
https://inria.hal.science/hal-01099136v1/document

Interesting article:
https://blog.acolyer.org/2015/01/14/mergeable-persistent-data-structures/

Partially and fully persistent DS in C (no merges)
https://github.com/vineeths96/Persistent-Data-Structures

Confluently persistent DS paper
https://arxiv.org/pdf/1301.3388.pdf
https://www.cs.utexas.edu/~ecprice/papers/confluent_swat.pdf

Data visualization with persistent DS
https://www.researchgate.net/publication/258713092_Efficient_Dynamic_Data_Visualization_with_Persistent_Data_Structures

* Redisplay calling Lisp

This is a hypothetical scenario, but Eli and Stefan seem to assume that
it is important to have in order to make concurrent redisplay acceptable
to users.

- Redisplay calls Lisp to fontify etc.
  + just assuming that is possible in the future
  + as a substitute for storing a snapshot in display model.
  + how calling Lisp from redisplay works on the Lisp side, is not yet
    specified

My conclusions from this:

- Properties must be persistent data structures
  + because no props snapshot in display model
  + because redisplay needs props corresponding to its buffer-text
    version
- Properties must be confluently persistent data structures
  + need to be able to modify prop versions in Lisp
  + need to merge back changes into current versions
- buffer modifications from Lisp either
  + should be prevented (how?)
  + or buffer-text must be confluently persistent
    - merge or discard any buffer-text changes (delete version, if it
      was created). Probably discard.
- what about if Lisp changes display-relevant variables?
  - unclear
- Other modifications?

** Merging properties

Imagine =fontification-functions= adding properties for font-lock. These
modifications should not be lost once concurrent redisplay has finished.
That means the properties added to an old version of the buffer text etc
must be merged into newer versions.

- Confluently persistent props require
  + way to merge changes to newer versions
  + consider only merge version n-1 to n
- single prop = (beg end value)
  + position translation
    - know what changed in buffer-texts from n-1 to n
      + wanting translation pos_{n-1} to pos_n
      + piece added in n in front of pos_{n-1} => add length
      + piece deleted in front of pos_{n-1} => subtract
      + details depend on buffer-text DS
        - buffer-text DS must take into account that translations must
          be possible
        - looks doable
  + value merging
    + assume interval [beg end] in the following (including beg and end)
    + added property
      - [beg_n end_n] may intersect with 0+ props in version n.
      - say first intersection is in [a b], a >= beg, b <= end with
        value val
      - [beg a-1] -> new value
      - [a b] -> value-dependent handling of old/new value
        (merge/discard..., must be defined)
      - [b+1 end] -> either new value or merge with next intersecting
        prop from n
    + changed values
      - treat as remove + add
    + removed props
      - no direct representation in version of props in version n-1
      - assume scan whole version n for prop of the same kind
      - could record min/max pos of changes in n-1
      - let [a b val] be "interesting" prop in n (face, ...)
      - if there is no intersecting prop in [a b] in n-1, what does that
        mean?
        - it has been newly added in n compared to n-1
        - it was in n-2 and been removed in n-1
        - must find out to resolve
          + can we in all cases?

Assuming everything still open can be resolved, this looks doable, but
it is certainly non-trivial.

* Performance/Memory Considerations

The use of persistent data structures will have an impact on both
performance and memory consumption. How large this impact will be I find
impossible to tell, especially on older hardware. But keep in mind that
by the time this might be implemented, current hardware will be old.

* Personal Conclusions

I'm stopping here, despite open questions, because I think I have
reached a sufficient level of gut feeling about the subject.

I'd summarize my thoughts as:

- Concurrent redisplay is feasible, both with and without being able to
  call Lisp from redisplay.
- Changing buffer-text representation using a piece table is a big
  enough bite that it is only worth it if a concurrent redisplay comes
  at some point.
- Whether performance on old hardware will be acceptable, for some value
  of acceptable, I find unpredictable.
- Concurrent redisplay with the ability to call Lisp from redisplay is
  considerably more complex than without being able to call Lisp. I'd
  say at least 2 times.
- Concurrent redisplay will not happen unless at least 2 or 3 people
  with enough time decide to work on it.
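
Returning to the position translation sketched under "Merging
properties": the rule "piece added in front => add length, piece deleted
in front => subtract" could look roughly like this in Python. This is a
hedged illustration, not Emacs code. The edit format (a list of
=("insert", at, length)= and =("delete", at, length)= tuples) and the
function names are assumptions made for the example. So is the policy
that an insertion exactly at a position counts as "in front of" it, and
that a position inside deleted text is clamped to the start of the
deletion.

```python
def translate_pos(pos, edits):
    """Map a position valid in version n-1 to the matching one in version n.

    edits is a list of ("insert", at, length) or ("delete", at, length),
    applied in order; each `at` is in the coordinates that are current
    when that edit is applied.
    """
    for kind, at, length in edits:
        if kind == "insert":
            if at <= pos:  # text added in front of pos: shift right
                pos += length
        elif kind == "delete":
            if at + length <= pos:  # text deleted in front of pos: shift left
                pos -= length
            elif at < pos:  # pos was inside the deleted text: clamp
                pos = at
    return pos

def translate_prop(prop, edits):
    """Translate a single prop = (beg, end, value) from version n-1 to n."""
    beg, end, value = prop
    return (translate_pos(beg, edits), translate_pos(end, edits), value)
```

For instance, a face on [10 15] in version n-1, followed by an insertion
of 3 characters at 0 and a deletion of 4 characters at 2, ends up on
[9 14] in version n. The value merging step for overlapping intervals is
not covered by this sketch.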

* Random Grab Bag

- make pieces for long lines (max length of piece)
- concurrency -> dump complicated redisplay optimizations?
- pieces provide more detailed information about what text has changed
  (compared to BEG_UNCHANGED and END_UNCHANGED).

Zed editor, rasterization on GPU
https://zed.dev/blog/videogame

# end.

[-- Attachment #3: Type: text/plain, Size: 630 bytes --]

It's probably not very helpful, but at least I get the idea of a
concurrent redisplay planted into brains, where it can do its evil
work :-).

>
> (However, before anyone gets their hopes and/or fears up, my code
> depends on disabling most of the regexp code, and the additional number
> of garbage-collected objects is so great that I concluded I'd wait for
> MPS to land before resuming work on it. One of the few distinct
> advantages of the current gap buffer approach is that it doesn't affect
> GC...)
>
> I know virtually nothing about redisplay.
>
> Pip

What I've written is pretty high-level, nothing to worry about.

^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Gap buffer problem? 2024-12-11 13:22 ` Gerd Möllmann @ 2024-12-11 14:53 ` Pip Cet via Emacs development discussions. 2024-12-11 15:33 ` Gerd Möllmann 0 siblings, 1 reply; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-11 14:53 UTC (permalink / raw) To: Gerd Möllmann Cc: Pip Cet via "Emacs development discussions.", Óscar Fuentes Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Pip Cet <pipcet@protonmail.com> writes: > >> Gerd Möllmann <gerd.moellmann@gmail.com> writes: >> >>> Pip Cet <pipcet@protonmail.com> writes: >>>> if we ever replace the gap buffer code, we should make sure its >>>> replacement actually handles buffer text and text properties/intervals >>>> in an integrated manner, rather than storing just buffer text). >>>> >>>> Pip >>> >>> And if I may add a wish to the future author: Make whatever you use >>> persistent data structures, so that one could think of letting redisplay >>> run concurrently. Really! :-) >> >> You won't be surprised to hear I've been playing with some code, > > Indeed, I was just thinking to myself "I knew it" :-). > Two thumbs up! > >> so could I ask you to expand on this point? What precisely does >> redisplay require? Full snapshotting or would it be sufficient to have >> fine-grained locking? > > Maybe it's helpful when I tell something about the background. Some time > last year I asked myself if I could make Emacs more than one of my > plenty of CPU cores without solving the multi-threaded Elisp problem. > And the idea was that I could do that, possibly, by letting redisplay > happen in another thread. This may be a very stupid idea, but why not use a separate process? fork() is fast on GNU/Linux, and I suspect on macOS too, and the redisplay child would receive a consistent snapshot of the data to inspect and/or modify while coming up with the redisplay instructions, which it would then send back via a pipe or shared memory to be executed in the main process. 
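
A minimal sketch of the fork() idea described above, in Python on a
POSIX system and purely for illustration. The function name
=redisplay_in_child= and the numbered-lines "display instructions" are
invented for this example. The child works on a copy-on-write snapshot
of the parent's data, sends its result back through a pipe, and any
modification it makes is simply lost when it exits.

```python
import os

def redisplay_in_child(buffer_lines):
    """Fork; the child computes "display instructions" and pipes them back."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: sees a copy-on-write snapshot of the parent's memory.
        os.close(read_fd)
        try:
            out = "\n".join(f"{i + 1}: {line}"
                            for i, line in enumerate(buffer_lines))
            # A stray modification, as misbehaving fontification code might
            # make; it only affects the child's copy.
            buffer_lines.append("scribbled by child")
            with os.fdopen(write_fd, "w") as w:
                w.write(out)
        finally:
            os._exit(0)  # never fall back into the parent's code path
    # Parent: free to keep modifying its own data right away.
    os.close(write_fd)
    buffer_lines[0] = "changed after fork"  # the child never sees this
    with os.fdopen(read_fd) as r:
        instructions = r.read()  # reads until the child closes its end
    os.waitpid(pid, 0)
    return instructions
```

The parent's list ends up with its own change and without the child's
appended line, while the returned text reflects the data as it was at
the time of the fork. The sketch says nothing about fork() latency for
large processes, which is the concern raised further down in this
message.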
I suggested doing something similar for GC (the GC child would perform a full GC and send back the Lisp_Objects which are definitely unreachable via a pipe. No, I never figured out how to make that work for weak hash tables which may resurrect references, I just made all hash tables strong...), and in that case the pipe seemed sufficient for the amount of data that was transferred, but I'm not sure how compact (or otherwise) serialized redisplay "instructions" would be. One issue I see is that fork() does a lot of housekeeping work in addition to marking the child's memory as a COW copy of the parent's memory at the time of the fork(). ISTR you can split that process on GNU/Linux (probably not Android), so you'd already have a prepared thread/LWP which wouldn't need to "start up" when you un-share the memory, but I can't find the relevant manpage right now. However, I have no real idea just how bad the fork() latency would be (as you point out, most people have more CPU cores than they can use, so I don't consider the approximate doubling of CPU usage a problem). This would deal very nicely with fontification code attempting to modify data it shouldn't, by ignoring such modifications. It would also deal with catastrophic failure in the redisplay code, as it's insulated in a separate process and we could just print a nice message in the main process rather than crashing all of Emacs. I'm emphatically not suggesting letting the redisplay child actually communicate with the X server or equivalent. That would be much more difficult. In fact, I think a good way to test this approach would be to use the tty code, since there's already a standard serialization of redisplay instructions for tty displays: VT100 escape sequences. > I later realized while thinking about the details, that this undertaking > is an order of magnitude too large for me. Everything taking more than a > few months is. And, in addition, I wouldn't want to do data structures > in C anyway. 
I think the VT100 case could be done as a weekend project (those always end up taking several weeks for me...), but I'm not sure it's worth it as VT100 redisplay isn't the common use case, and the performance problems are more visible on GUI terminals. And, like pretty much all Emacs ideas, this depends on having a better GC. (However, I've just experimented with an 8 GB process forking, and it's much slower than I'd hoped for - about 70 ms. I wouldn't be surprised if most of that cost is setting up page tables for the ridiculously small 4KB page size x86 uses, so it may work a lot better for AArch64 systems such as yours). Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Gap buffer problem? 2024-12-11 14:53 ` Pip Cet via Emacs development discussions. @ 2024-12-11 15:33 ` Gerd Möllmann 2024-12-11 16:58 ` Eli Zaretskii 0 siblings, 1 reply; 112+ messages in thread From: Gerd Möllmann @ 2024-12-11 15:33 UTC (permalink / raw) To: Pip Cet Cc: Pip Cet via "Emacs development discussions.", Óscar Fuentes Pip Cet <pipcet@protonmail.com> writes: > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > >> Pip Cet <pipcet@protonmail.com> writes: >> >>> Gerd Möllmann <gerd.moellmann@gmail.com> writes: >>> >>>> Pip Cet <pipcet@protonmail.com> writes: >>>>> if we ever replace the gap buffer code, we should make sure its >>>>> replacement actually handles buffer text and text properties/intervals >>>>> in an integrated manner, rather than storing just buffer text). >>>>> >>>>> Pip >>>> >>>> And if I may add a wish to the future author: Make whatever you use >>>> persistent data structures, so that one could think of letting redisplay >>>> run concurrently. Really! :-) >>> >>> You won't be surprised to hear I've been playing with some code, >> >> Indeed, I was just thinking to myself "I knew it" :-). >> Two thumbs up! >> >>> so could I ask you to expand on this point? What precisely does >>> redisplay require? Full snapshotting or would it be sufficient to have >>> fine-grained locking? >> >> Maybe it's helpful when I tell something about the background. Some time >> last year I asked myself if I could make Emacs more than one of my >> plenty of CPU cores without solving the multi-threaded Elisp problem. >> And the idea was that I could do that, possibly, by letting redisplay >> happen in another thread. > > This may be a very stupid idea, but why not use a separate process? Not stupid at all. I thought about something similar in a different context, namely if one could decouple the GUI part of Emacs from the rest. Something like that has been done by Eberhard Mattes for OS/2 with the old redisplay. 
He had to do that because the whole Presentation Manager (GUI) in OS/2 would block, for all process, when an application did not timely handle events and return to the PM. Something like that. Eberhard's OS/2 Emacs had one process doing the GUI stuff, and one for the rest. Both communicated with each other using a defined message protocol. It worked. Don't remember what he used for process communication, pipes or something else. I got stuck with this idea because everything seemed to depend on everything else nowadays. Redisplay needs to execute Lisp, Font backends I think, not sure. Some GUIs call redisplay (nsterm). And then I imagined the licensing issues, and dropped the idea. Although - NS could really need something done, IMO, which was the reason I thought about that in the first place. NS is not working for me at least. I always wonder why nobody else has the same freezing problems that I have. I think the same dependency problems also creep up with to concurrent redisplay, don't know. Values of variables, faces, jit-lock, and so on. I think it would be "easier" to handle if one has everything in one process. But in principle both could be done. An actor model. > fork() is fast on GNU/Linux, and I suspect on macOS too, and the > redisplay child would receive a consistent snapshot of the data to > inspect and/or modify while coming up with the redisplay instructions, > which it would then send back via a pipe or shared memory to be executed > in the main process. > > I suggested doing something similar for GC (the GC child would perform a > full GC and send back the Lisp_Objects which are definitely unreachable > via a pipe. No, I never figured out how to make that work for weak hash > tables which may resurrect references, I just made all hash tables > strong...), and in that case the pipe seemed sufficient for the amount > of data that was transferred, but I'm not sure how compact (or > otherwise) serialized redisplay "instructions" would be. 
> > One issue I see is that fork() does a lot of housekeeping work in > addition to marking the child's memory as a COW copy of the parent's > memory at the time of the fork(). ISTR you can split that process on > GNU/Linux (probably not Android), so you'd already have a prepared > thread/LWP which wouldn't need to "start up" when you un-share the > memory, but I can't find the relevant manpage right now. However, I have > no real idea just how bad the fork() latency would be (as you point out, > most people have more CPU cores than they can use, so I don't consider > the approximate doubling of CPU usage a problem). > > This would deal very nicely with fontification code attempting to modify > data it shouldn't, by ignoring such modifications. It would also deal > with catastrophic failure in the redisplay code, as it's insulated in a > separate process and we could just print a nice message in the main > process rather than crashing all of Emacs. > > I'm emphatically not suggesting letting the redisplay child actually > communicate with the X server or equivalent. That would be much more > difficult. > > In fact, I think a good way to test this approach would be to use the > tty code, since there's already a standard serialization of redisplay > instructions for tty displays: VT100 escape sequences. > >> I later realized while thinking about the details, that this undertaking >> is an order of magnitude too large for me. Everything taking more than a >> few months is. And, in addition, I wouldn't want to do data structures >> in C anyway. > > I think the VT100 case could be done as a weekend project (those always > end up taking several weeks for me...), but I'm not sure it's worth it > as VT100 redisplay isn't the common use case, and the performance > problems are more visible on GUI terminals. Yes. In a way, it's already the case that the GUI part of Emacs that I described above for OS/2, is the terminal emulator, and the protocol is VT100. 
> And, like pretty much all Emacs ideas, this depends on having a better > GC. > > (However, I've just experimented with an 8 GB process forking, and it's > much slower than I'd hoped for - about 70 ms. I wouldn't be surprised > if most of that cost is setting up page tables for the ridiculously > small 4KB page size x86 uses, so it may work a lot better for AArch64 > systems such as yours). > > Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Gap buffer problem? 2024-12-11 15:33 ` Gerd Möllmann @ 2024-12-11 16:58 ` Eli Zaretskii 2024-12-11 17:13 ` Gerd Möllmann 2024-12-11 17:41 ` Pip Cet via Emacs development discussions. 0 siblings, 2 replies; 112+ messages in thread From: Eli Zaretskii @ 2024-12-11 16:58 UTC (permalink / raw) To: Gerd Möllmann; +Cc: pipcet, emacs-devel, ofv > From: Gerd Möllmann <gerd.moellmann@gmail.com> > Cc: "Pip Cet via \"Emacs development discussions.\"" <emacs-devel@gnu.org>, > Óscar Fuentes <ofv@wanadoo.es> > Date: Wed, 11 Dec 2024 16:33:18 +0100 > > Pip Cet <pipcet@protonmail.com> writes: > > > This may be a very stupid idea, but why not use a separate process? > > Not stupid at all. I thought about something similar in a different > context, namely if one could decouple the GUI part of Emacs from the > rest. If it can be done by two processes, it can also be done by two threads in the same process. Right? ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Gap buffer problem? 2024-12-11 16:58 ` Eli Zaretskii @ 2024-12-11 17:13 ` Gerd Möllmann 2024-12-11 17:45 ` Robert Pluim 2024-12-11 17:41 ` Pip Cet via Emacs development discussions. 1 sibling, 1 reply; 112+ messages in thread From: Gerd Möllmann @ 2024-12-11 17:13 UTC (permalink / raw) To: Eli Zaretskii; +Cc: pipcet, emacs-devel, ofv Eli Zaretskii <eliz@gnu.org> writes: >> From: Gerd Möllmann <gerd.moellmann@gmail.com> >> Cc: "Pip Cet via \"Emacs development discussions.\"" <emacs-devel@gnu.org>, >> Óscar Fuentes <ofv@wanadoo.es> >> Date: Wed, 11 Dec 2024 16:33:18 +0100 >> >> Pip Cet <pipcet@protonmail.com> writes: >> >> > This may be a very stupid idea, but why not use a separate process? >> >> Not stupid at all. I thought about something similar in a different >> context, namely if one could decouple the GUI part of Emacs from the >> rest. > > If it can be done by two processes, it can also be done by two threads > in the same process. Right? Yes, I think so. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Gap buffer problem? 2024-12-11 17:13 ` Gerd Möllmann @ 2024-12-11 17:45 ` Robert Pluim 2024-12-11 18:11 ` Gerd Möllmann 2024-12-11 19:08 ` Eli Zaretskii 0 siblings, 2 replies; 112+ messages in thread From: Robert Pluim @ 2024-12-11 17:45 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Eli Zaretskii, pipcet, emacs-devel, ofv >>>>> On Wed, 11 Dec 2024 18:13:41 +0100, Gerd Möllmann <gerd.moellmann@gmail.com> said: Gerd> Eli Zaretskii <eliz@gnu.org> writes: >>> From: Gerd Möllmann <gerd.moellmann@gmail.com> >>> Cc: "Pip Cet via \"Emacs development discussions.\"" <emacs-devel@gnu.org>, >>> Óscar Fuentes <ofv@wanadoo.es> >>> Date: Wed, 11 Dec 2024 16:33:18 +0100 >>> >>> Pip Cet <pipcet@protonmail.com> writes: >>> >>> > This may be a very stupid idea, but why not use a separate process? >>> >>> Not stupid at all. I thought about something similar in a different >>> context, namely if one could decouple the GUI part of Emacs from the >>> rest. >> >> If it can be done by two processes, it can also be done by two threads >> in the same process. Right? Gerd> Yes, I think so. But then you have to throw a lock over all the memory in the non-display thread that might affect redisplay (although come to think of it, youʼd probably need that even when using fork) Robert -- ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Gap buffer problem? 2024-12-11 17:45 ` Robert Pluim @ 2024-12-11 18:11 ` Gerd Möllmann 2024-12-11 19:08 ` Eli Zaretskii 1 sibling, 0 replies; 112+ messages in thread From: Gerd Möllmann @ 2024-12-11 18:11 UTC (permalink / raw) To: Robert Pluim; +Cc: Eli Zaretskii, pipcet, emacs-devel, ofv Robert Pluim <rpluim@gmail.com> writes: >>>>>> On Wed, 11 Dec 2024 18:13:41 +0100, Gerd Möllmann <gerd.moellmann@gmail.com> said: > > Gerd> Eli Zaretskii <eliz@gnu.org> writes: > >>> From: Gerd Möllmann <gerd.moellmann@gmail.com> > >>> Cc: "Pip Cet via \"Emacs development discussions.\"" <emacs-devel@gnu.org>, > >>> Óscar Fuentes <ofv@wanadoo.es> > >>> Date: Wed, 11 Dec 2024 16:33:18 +0100 > >>> > >>> Pip Cet <pipcet@protonmail.com> writes: > >>> > >>> > This may be a very stupid idea, but why not use a separate process? > >>> > >>> Not stupid at all. I thought about something similar in a different > >>> context, namely if one could decouple the GUI part of Emacs from the > >>> rest. > >> > >> If it can be done by two processes, it can also be done by two threads > >> in the same process. Right? > > Gerd> Yes, I think so. > > But then you have to throw a lock over all the memory in the > non-display thread that might affect redisplay (although come to think > of it, youʼd probably need that even when using fork) > > Robert Well, it depends. Assume you have a solution that works in a second process. That solution wouldn't use things in the first process because it can't. Now move that code of the second process to the first process, and make two threads out of the two process, and replace process communication with inter-thread message passing like in an actor model. ^ permalink raw reply [flat|nested] 112+ messages in thread
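The transformation described above (two processes become two threads, and inter-process communication becomes inter-thread message passing) can be sketched roughly as follows. This is only an illustration of the actor-style idea, not Emacs code: `redisplay_actor`, `run_demo`, and the message format are made-up names, and upper-casing a string merely stands in for real display work.

```python
import queue
import threading


def redisplay_actor(inbox, outbox):
    """Hypothetical 'redisplay' thread: owns its state, sees only messages."""
    shown = []                       # private to this thread, never shared
    while True:
        msg = inbox.get()            # blocks until the main thread sends something
        if msg is None:              # sentinel message: shut down
            break
        kind, payload = msg
        if kind == "text":
            shown.append(payload.upper())   # stand-in for producing glyphs
    outbox.put(shown)                # the result also travels as a message


def run_demo():
    to_display, from_display = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=redisplay_actor,
                              args=(to_display, from_display))
    worker.start()
    for line in ("foo", "bar"):
        # Payloads are immutable strings, so nothing mutable is shared
        # between the two threads and no lock on shared data is needed.
        to_display.put(("text", line))
    to_display.put(None)
    worker.join()
    return from_display.get()
```

The sketch works only because every piece of data the worker needs is sent to it explicitly; it says nothing about how expensive that would be for real Lisp data and buffer text.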
* Re: Gap buffer problem? 2024-12-11 17:45 ` Robert Pluim 2024-12-11 18:11 ` Gerd Möllmann @ 2024-12-11 19:08 ` Eli Zaretskii 1 sibling, 0 replies; 112+ messages in thread From: Eli Zaretskii @ 2024-12-11 19:08 UTC (permalink / raw) To: Robert Pluim; +Cc: gerd.moellmann, pipcet, emacs-devel, ofv > From: Robert Pluim <rpluim@gmail.com> > Cc: Eli Zaretskii <eliz@gnu.org>, pipcet@protonmail.com, > emacs-devel@gnu.org, ofv@wanadoo.es > Date: Wed, 11 Dec 2024 18:45:15 +0100 > > >>>>> On Wed, 11 Dec 2024 18:13:41 +0100, Gerd Möllmann <gerd.moellmann@gmail.com> said: > > Gerd> Eli Zaretskii <eliz@gnu.org> writes: > >> > >> If it can be done by two processes, it can also be done by two threads > >> in the same process. Right? > > Gerd> Yes, I think so. > > But then you have to throw a lock over all the memory in the > non-display thread that might affect redisplay No, you copy on write. Exactly like the OS does with forked process. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Gap buffer problem? 2024-12-11 16:58 ` Eli Zaretskii 2024-12-11 17:13 ` Gerd Möllmann @ 2024-12-11 17:41 ` Pip Cet via Emacs development discussions. 2024-12-11 19:04 ` Eli Zaretskii 2024-12-11 19:09 ` Gerd Möllmann 1 sibling, 2 replies; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-11 17:41 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Gerd Möllmann, emacs-devel, ofv "Eli Zaretskii" <eliz@gnu.org> writes: >> From: Gerd Möllmann <gerd.moellmann@gmail.com> >> Cc: "Pip Cet via \"Emacs development discussions.\"" <emacs-devel@gnu.org>, >> Óscar Fuentes <ofv@wanadoo.es> >> Date: Wed, 11 Dec 2024 16:33:18 +0100 >> >> Pip Cet <pipcet@protonmail.com> writes: >> >> > This may be a very stupid idea, but why not use a separate process? >> >> Not stupid at all. I thought about something similar in a different >> context, namely if one could decouple the GUI part of Emacs from the >> rest. > > If it can be done by two processes, it can also be done by two threads > in the same process. Right? AFAIU: No, not right. I may have misunderstood, but if the idea is to preserve a consistent state of all Lisp data and buffer text for redisplay to use, the easiest way to ensure that consistency is fork(). The other ways, such as copying all heap objects that might be used by redisplay (and adjusting all internal pointers in such heap objects to point to the copy rather than the original data), probably will end up either being a lot slower or being very specific to the system we're running on. I know that implementing fork() on Windows is very slow, and I don't know about a comparable snapshotting mechanism for Windows. To be honest, though, I'm a bit disappointed that GNU/Linux appears to make fork() take significant time that is proportional to the size of the mapped address space, even if it's never COW-faulted in. 
I'm pretty sure that could be avoided (and I hope the Linux kernel avoids doing it for swapped-out memory, not that anyone still does that). Concurrent access to Lisp data from several threads requires a locking mechanism (fine-grained or coarse) for all such data, and possibly requires rewriting addresses, which means no "ambiguous" references whatsoever. That's a lot harder than using MPS, which generously allows for ambiguous references. It's possible we could have gotten away with concurrent access by the redisplay machinery if we inhibited GC while the redisplay thread was busy inspecting our data, but inhibiting MPS GC is a lot harder and shouldn't be done for ordinary operations. Oh, and of course mmap() breaks fork()'s snapshotting magic. The reason I said this depends on a new GC is a bit subtle, by the way: the old GC does best if we sacrifice a lot of memory and only run it rarely, which we can usually get away with because RAM is cheap. With a fork()-based approach, memory usage comes with a performance penalty for every fork(), so we need to reduce both memory usage and GC time, which we can't do with non-incremental GC. The last reason it's difficult is that MPS isn't optimized for multi-thread settings: in an ideal world, "scanning" a memory area would use a secondary mapping of the memory, known only to the scanning code, so other threads could continue running while an area is being scanned. With MPS, there is only one mapping, so we need to stop all other threads while one thread un-mprotect()s a memory area to scan it. Unless MPS breaks POSIX threads in some spectacular way, fork() should still work, though. Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
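The "snapshotting" property of fork() mentioned above can be demonstrated with a small POSIX-only sketch. It is not Emacs code; `snapshot_demo` and the `state` dictionary are illustrative names. The child keeps seeing the data as it was at fork() time, even after the parent has modified its own copy.

```python
import os


def snapshot_demo():
    """Fork, mutate in the parent, and report what the child still sees."""
    state = {"buffer_text": "before"}
    go_r, go_w = os.pipe()      # parent -> child: "I have mutated my copy"
    res_r, res_w = os.pipe()    # child -> parent: what the child sees

    pid = os.fork()
    if pid == 0:
        # Child: works on a copy-on-write snapshot taken at fork() time.
        try:
            os.close(go_w)
            os.close(res_r)
            os.read(go_r, 1)    # wait until the parent has changed its state
            os.write(res_w, state["buffer_text"].encode())
        finally:
            os._exit(0)         # never return into the caller's code

    # Parent: modify the data, then let the child look at its own copy.
    os.close(go_r)
    os.close(res_w)
    state["buffer_text"] = "after"
    os.write(go_w, b"x")
    seen_by_child = os.read(res_r, 1024).decode()
    os.waitpid(pid, 0)
    os.close(go_w)
    os.close(res_r)
    return seen_by_child, state["buffer_text"]
```

Calling `snapshot_demo()` returns the child's view first and the parent's view second. The cost of the fork() itself, which the message above worries about, is not measured here.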
* Re: Gap buffer problem? 2024-12-11 17:41 ` Pip Cet via Emacs development discussions. @ 2024-12-11 19:04 ` Eli Zaretskii 2024-12-11 19:54 ` Pip Cet via Emacs development discussions. 2024-12-11 19:09 ` Gerd Möllmann 1 sibling, 1 reply; 112+ messages in thread From: Eli Zaretskii @ 2024-12-11 19:04 UTC (permalink / raw) To: Pip Cet; +Cc: gerd.moellmann, emacs-devel, ofv > Date: Wed, 11 Dec 2024 17:41:29 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: Gerd Möllmann <gerd.moellmann@gmail.com>, emacs-devel@gnu.org, ofv@wanadoo.es > > "Eli Zaretskii" <eliz@gnu.org> writes: > > > If it can be done by two processes, it can also be done by two threads > > in the same process. Right? > > AFAIU: No, not right. > > I may have misunderstood, but if the idea is to preserve a consistent > state of all Lisp data and buffer text for redisplay to use, the easiest > way to ensure that consistency is fork(). The other ways, such as > copying all heap objects that might be used by redisplay (and adjusting > all internal pointers in such heap objects to point to the copy rather > than the original data), probably will end up either being a lot slower > or being very specific to the system we're running on. How do you do the same in a forked process? The glyph matrices are not allocated once, they are reallocated constantly. Are you going to fork each time? And if you are, how is it different from copying stuff lazily within the same process, exactly like the OS does with forked processes? > I know that implementing fork() on Windows is very slow, and I don't > know about a comparable snapshotting mechanism for Windows. I'm not talking about Windows, I'm talking about Posix systems. Anyway, the fact that redisplay calls Lisp and Lisp calls back into redisplay all but kills this idea. Gerd's document has also other gotchas. We didn't just give up easily back when we discussed that. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Gap buffer problem? 2024-12-11 19:04 ` Eli Zaretskii @ 2024-12-11 19:54 ` Pip Cet via Emacs development discussions. 2024-12-11 20:26 ` Eli Zaretskii 0 siblings, 1 reply; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-11 19:54 UTC (permalink / raw) To: Eli Zaretskii; +Cc: gerd.moellmann, emacs-devel, ofv "Eli Zaretskii" <eliz@gnu.org> writes: >> Date: Wed, 11 Dec 2024 17:41:29 +0000 >> From: Pip Cet <pipcet@protonmail.com> >> Cc: Gerd Möllmann <gerd.moellmann@gmail.com>, emacs-devel@gnu.org, ofv@wanadoo.es >> >> "Eli Zaretskii" <eliz@gnu.org> writes: >> >> > If it can be done by two processes, it can also be done by two threads >> > in the same process. Right? >> >> AFAIU: No, not right. >> >> I may have misunderstood, but if the idea is to preserve a consistent >> state of all Lisp data and buffer text for redisplay to use, the easiest >> way to ensure that consistency is fork(). The other ways, such as >> copying all heap objects that might be used by redisplay (and adjusting >> all internal pointers in such heap objects to point to the copy rather >> than the original data), probably will end up either being a lot slower >> or being very specific to the system we're running on. > > How do you do the same in a forked process? The glyph matrices are > not allocated once, they are reallocated constantly. Are you going to > fork each time? Not necessarily "each time" (meaning once per frame/keystroke), but quite frequently, yes. > And if you are, how is it different from copying > stuff lazily within the same process, exactly like the OS does with > forked processes? It is very different indeed: Copying within a process involves changing the (virtual) addresses that the copied data is at (unless you use an architecture-specific implementation of TLS). 
The beauty of fork() is that the virtual addresses stay the same, so we don't need to adjust any pointers, which we cannot do because there are ambiguous references to Lisp data. IOW, no, you can't lazily create two copies of Lisp data in the same process. You have to do so eagerly, adjusting any and all pointers (and only those) in the Lisp data before the new data is read for the first time (because what you read might be a pointer, and then it needs to be adjusted). With fork(), you only have to make the copy when the data is being written to, by either process. (Of course you can just access all memory through some sort of API that translates addresses for you, but that would effectively mean we'd be running Emacs on a virtual machine and simulate fork() on it). > Anyway, the fact that redisplay calls Lisp and Lisp calls back into > redisplay all but kills this idea. Gerd's document has also other > gotchas. We didn't just give up easily back when we discussed that. I don't see why the redisplay process would not be able to call Lisp; it's a full Emacs process (with a single thread), except it doesn't have an FD or socket for the window system, and has an extra pipe to communicate with the parent process instead. It's true that the side effects of the called Lisp code won't be visible to the next redisplay process, but such side effects are perilous anyway, and avoiding them would seem to me to be a feature, not a bug. However, if such side effects are desired, we can use IPC to execute Lisp in the main process (some effort) or simply send a "this redisplay needs to happen synchronously" message to the main process, which would kill the current redisplay process and perform a synchronous redisplay (as not all operating systems support fork() reliably, we'll have to retain the ability to redisplay synchronously, either way). But, to be perfectly honest, I'm not sure redisplay is slowing me down the way traditional GC is. 
Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
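The pointer-adjustment argument above can be illustrated with a toy heap. This is a simplified model under stated assumptions, not how Emacs represents objects: the heap is a dictionary from made-up addresses to lists of tagged fields, and `copy_heap` is a hypothetical helper. Copying within one address space moves objects to new addresses, so every exact pointer has to be rewritten, while a plain integer that merely looks like an address must be left alone.

```python
def copy_heap(heap, offset):
    """Copy every object to address + offset, fixing up exact pointers."""
    forward = {addr: addr + offset for addr in heap}   # old address -> new address
    new_heap = {}
    for addr, fields in heap.items():
        new_fields = []
        for tag, value in fields:
            if tag == "ptr":
                # Exact reference: must be redirected to the copy.
                new_fields.append(("ptr", forward[value]))
            else:
                # Plain data: must NOT be touched, even if it looks like an address.
                new_fields.append((tag, value))
        new_heap[forward[addr]] = new_fields
    return new_heap


def demo():
    heap = {
        0x1000: [("int", 7), ("ptr", 0x2000)],   # points at the second cell
        0x2000: [("int", 0x1000), ("int", 0)],   # an integer that looks like an address
    }
    return copy_heap(heap, 0x100000)
```

The rewrite is only safe here because every field carries a tag saying whether it is a pointer. An ambiguous word, such as a value found on the C stack, has no such tag, which is the case the message above says cannot be adjusted. After fork() the addresses are unchanged in both processes, so no such pass is needed.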
* Re: Gap buffer problem? 2024-12-11 19:54 ` Pip Cet via Emacs development discussions. @ 2024-12-11 20:26 ` Eli Zaretskii 0 siblings, 0 replies; 112+ messages in thread From: Eli Zaretskii @ 2024-12-11 20:26 UTC (permalink / raw) To: Pip Cet; +Cc: gerd.moellmann, emacs-devel, ofv > Date: Wed, 11 Dec 2024 19:54:07 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: gerd.moellmann@gmail.com, emacs-devel@gnu.org, ofv@wanadoo.es > > "Eli Zaretskii" <eliz@gnu.org> writes: > > > Anyway, the fact that redisplay calls Lisp and Lisp calls back into > > redisplay all but kills this idea. Gerd's document has also other > > gotchas. We didn't just give up easily back when we discussed that. > > I don't see why the redisplay process would not be able to call Lisp; > it's a full Emacs process (with a single thread) So you are going to fork on each redisplay? And how will you pass back the results of Lisp evaluation, if the other process meanwhile changes the global state (as it's running concurrently)? > except it doesn't have > an FD or socket for the window system, and has an extra pipe to > communicate with the parent process instead. Do you have an estimation of the throughput that such a pipe will need to handle in order to support GUI display? What will you send through the pipe? If you send only some kind of commands, then the other process will need to generate the font glyphs in some way -- the same glyphs that the "redisplay" process already produced. And if you intend to send the pixels, that would be too much traffic, I think. And again, the global state of the receiving process could have changed, which means any high-level data might be useless (e.g., using a font that was unloaded). > It's true that the side effects of the called Lisp code won't be visible > to the next redisplay process, but such side effects are perilous > anyway, and avoiding them would seem to me to be a feature, not a bug. In Emacs, they are a feature, and are expected to work. 
You'd be surprised to see how many packages and user code rely on that. > But, to be perfectly honest, I'm not sure redisplay is slowing me down > the way traditional GC is. It's the other way around: the Lisp machine blocks user interaction, including the UI and display. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Gap buffer problem? 2024-12-11 17:41 ` Pip Cet via Emacs development discussions. 2024-12-11 19:04 ` Eli Zaretskii @ 2024-12-11 19:09 ` Gerd Möllmann 1 sibling, 0 replies; 112+ messages in thread From: Gerd Möllmann @ 2024-12-11 19:09 UTC (permalink / raw) To: Pip Cet; +Cc: Eli Zaretskii, emacs-devel, ofv Pip Cet <pipcet@protonmail.com> writes: > "Eli Zaretskii" <eliz@gnu.org> writes: > >>> From: Gerd Möllmann <gerd.moellmann@gmail.com> >>> Cc: "Pip Cet via \"Emacs development discussions.\"" <emacs-devel@gnu.org>, >>> Óscar Fuentes <ofv@wanadoo.es> >>> Date: Wed, 11 Dec 2024 16:33:18 +0100 >>> >>> Pip Cet <pipcet@protonmail.com> writes: >>> >>> > This may be a very stupid idea, but why not use a separate process? >>> >>> Not stupid at all. I thought about something similar in a different >>> context, namely if one could decouple the GUI part of Emacs from the >>> rest. >> >> If it can be done by two processes, it can also be done by two threads >> in the same process. Right? > > AFAIU: No, not right. > > I may have misunderstood, but if the idea is to preserve a consistent > state of all Lisp data and buffer text for redisplay to use, the easiest > way to ensure that consistency is fork(). The other ways, such as > copying all heap objects that might be used by redisplay (and adjusting > all internal pointers in such heap objects to point to the copy rather > than the original data), probably will end up either being a lot slower > or being very specific to the system we're running on. I may also be misunderstanding, but in principle, I agree with Eli. Say we have processes A and B communicating with each other. Take the code of A and move it to B, possibly with some automatic transformations if A and B have the same source code. Make two threads in the result process for A and B. Replace inter-process message passing with inter-thread message passing. Initial message may be "fork" transferring the world of thread A to thread B. 
But I also think too abstractly sometimes. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Gap buffer problem? 2024-12-11 9:35 ` Gerd Möllmann 2024-12-11 11:50 ` Pip Cet via Emacs development discussions. @ 2024-12-11 12:27 ` Pip Cet via Emacs development discussions. 2024-12-11 13:27 ` Gerd Möllmann 1 sibling, 1 reply; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-11 12:27 UTC (permalink / raw) To: Gerd Möllmann Cc: Pip Cet via "Emacs development discussions.", Óscar Fuentes Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Pip Cet <pipcet@protonmail.com> writes: > >> Gerd Möllmann <gerd.moellmann@gmail.com> writes: >> I also recall discussion somewhere (nullprogram.com, maybe) about >> multiple cursors and the gap buffer, and that's also a potential use >> case where the gap buffer would make things very slow. It was nullprogram.com, at https://nullprogram.com/blog/2017/09/07/. The title is "Gap Buffers Are Not Optimized for Multiple Cursors", which seems accurate to me. Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Gap buffer problem? 2024-12-11 12:27 ` Pip Cet via Emacs development discussions. @ 2024-12-11 13:27 ` Gerd Möllmann 2024-12-11 15:06 ` Marcus Harnisch 0 siblings, 1 reply; 112+ messages in thread From: Gerd Möllmann @ 2024-12-11 13:27 UTC (permalink / raw) To: Pip Cet Cc: Pip Cet via "Emacs development discussions.", Óscar Fuentes Pip Cet <pipcet@protonmail.com> writes: > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > >> Pip Cet <pipcet@protonmail.com> writes: >> >>> Gerd Möllmann <gerd.moellmann@gmail.com> writes: > >>> I also recall discussion somewhere (nullprogram.com, maybe) about >>> multiple cursors and the gap buffer, and that's also a potential use >>> case where the gap buffer would make things very slow. > > It was nullprogram.com, at https://nullprogram.com/blog/2017/09/07/. The > title is "Gap Buffers Are Not Optimized for Multiple Cursors", which > seems accurate to me. > > Pip Thanks! Added to my collection. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Gap buffer problem? 2024-12-11 13:27 ` Gerd Möllmann @ 2024-12-11 15:06 ` Marcus Harnisch 0 siblings, 0 replies; 112+ messages in thread From: Marcus Harnisch @ 2024-12-11 15:06 UTC (permalink / raw) To: emacs-devel On 11/12/2024 14.27, Gerd Möllmann wrote: > Pip Cet <pipcet@protonmail.com> writes: > >> It was nullprogram.com, at https://nullprogram.com/blog/2017/09/07/. The >> title is "Gap Buffers Are Not Optimized for Multiple Cursors", which >> seems accurate to me. > > Thanks! Added to my collection. You may be interested in this article, too, which references the blog post above: https://coredumped.dev/2023/08/09/text-showdown-gap-buffers-vs-ropes/ ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Gap buffer problem? 2024-12-11 5:27 ` Gap buffer problem? Gerd Möllmann 2024-12-11 8:50 ` Pip Cet via Emacs development discussions. @ 2024-12-11 14:22 ` Eli Zaretskii 2024-12-11 15:51 ` Gerd Möllmann 1 sibling, 1 reply; 112+ messages in thread From: Eli Zaretskii @ 2024-12-11 14:22 UTC (permalink / raw) To: Gerd Möllmann; +Cc: emacs-devel, ofv, pipcet > From: Gerd Möllmann <gerd.moellmann@gmail.com> > Cc: Óscar Fuentes <ofv@wanadoo.es>, Pip Cet > <pipcet@protonmail.com> > Date: Wed, 11 Dec 2024 06:27:43 +0100 > > Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> > writes: > > > To be fair, part of that may be the gap buffer problem rather than GC. > > Could you please tell more about the gap buffer problem? > > I've read a little about the tradeoffs between gap buffers, piece > tables, ropes, but I'm wondering if there is something concrete already > known for sure that is a performance problem in Emacs. Maybe a bug that > has been analyzed or something. > > (I'm asking because I just recently encountered a performance problem > when adding something to xdisp.c:27339 (with cc-mode, Eglot, Corfu), and > editing there was so slow that it was absolutely no fun, and that on > an M1 pro. Haven't investigated the reason.) Unless you have a huge (and I mean a HUGE) buffer, and some Lisp that moves point, then inserts a small number of characters, then moves point far away and again inserts a small number of characters, etc., I'd be very surprised if the gap buffer caused significant performance problems on a modern CPU. Can you profile that case and post the expanded profile? I'm always happy to be wrong about performance bottlenecks, and profiles are good at proving me wrong. ^ permalink raw reply [flat|nested] 112+ messages in thread
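The access pattern described above (insert a little, move point far away, insert a little again) is the one that makes a gap buffer copy a lot of text. The following toy gap buffer is a sketch for illustration only, not the Emacs implementation; `GapBuffer`, `chars_moved`, and `demo` are made-up names. It counts how many characters are copied when the gap has to move.

```python
class GapBuffer:
    """Minimal gap buffer that counts the characters copied by gap moves."""

    def __init__(self, text, gap_size=16):
        self.data = list(text) + [None] * gap_size
        self.gap_start = len(text)          # gap initially sits at the end
        self.gap_end = len(self.data)
        self.chars_moved = 0

    def _move_gap(self, pos):
        # Shift characters across the gap, one at a time, until it starts at pos.
        while self.gap_start > pos:
            self.gap_start -= 1
            self.gap_end -= 1
            self.data[self.gap_end] = self.data[self.gap_start]
            self.chars_moved += 1
        while self.gap_start < pos:
            self.data[self.gap_start] = self.data[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1
            self.chars_moved += 1

    def _grow(self, extra=16):
        # Open up fresh gap space at the current (empty) gap position.
        self.data[self.gap_end:self.gap_end] = [None] * extra
        self.gap_end += extra

    def insert(self, pos, ch):
        if self.gap_start == self.gap_end:
            self._grow()
        self._move_gap(pos)
        self.data[self.gap_start] = ch
        self.gap_start += 1

    def text(self):
        return "".join(self.data[:self.gap_start] + self.data[self.gap_end:])


def demo(n=1000):
    local = GapBuffer("x" * n)
    for i in range(10):
        local.insert(n + i, "a")            # keep typing where the gap already is

    scattered = GapBuffer("x" * n)
    for i in range(10):
        pos = 0 if i % 2 == 0 else len(scattered.text())
        scattered.insert(pos, "a")          # alternate between both ends
    return local.chars_moved, scattered.chars_moved
```

With the default size, typing in one place copies nothing, while alternating between the two ends copies roughly the whole text on every insertion. Whether that cost matters in practice for realistic buffer sizes is exactly what the requested profile would show.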
* Re: Gap buffer problem? 2024-12-11 14:22 ` Eli Zaretskii @ 2024-12-11 15:51 ` Gerd Möllmann 2024-12-11 17:06 ` Eli Zaretskii 0 siblings, 1 reply; 112+ messages in thread From: Gerd Möllmann @ 2024-12-11 15:51 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel, ofv, pipcet Eli Zaretskii <eliz@gnu.org> writes: >> From: Gerd Möllmann <gerd.moellmann@gmail.com> >> Cc: Óscar Fuentes <ofv@wanadoo.es>, Pip Cet >> <pipcet@protonmail.com> >> Date: Wed, 11 Dec 2024 06:27:43 +0100 >> >> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> >> writes: >> >> > To be fair, part of that may be the gap buffer problem rather than GC. >> >> Could you please tell more about the gap buffer problem? >> >> I've read a little about the tradeoffs between gap buffers, piece >> tables, ropes, but I'm wondering if there is something concrete already >> known for sure that is a performance problem in Emacs. Maybe a bug that >> has been analyzed or something. >> >> (I'm asking because I just recently encountered a performance problem >> when adding something to xdisp.c:27339 (with cc-mode, Eglot, Corfu), and >> editing there was so slow that it was absolutely no fun, and that on >> an M1 pro. Haven't investigated the reason.) > > Unless you have a huge (and I mean a HUGE) buffer, and some Lisp that > moves point, then inserts a small number of characters, then moves > point far away and again inserts a small number of characters, etc., > I'd be very surprised if the gap buffer caused significant performance > problems on a modern CPU. > > Can you profile that case and post the expanded profile? I'm always > happy to be wrong about performance bottlenecks, and profiles are good > at proving me wrong. Maybe I'll try to investigate that further at some point. Such things always tend to be so time consuming... ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Gap buffer problem? 2024-12-11 15:51 ` Gerd Möllmann @ 2024-12-11 17:06 ` Eli Zaretskii 2024-12-11 17:15 ` Gerd Möllmann 0 siblings, 1 reply; 112+ messages in thread From: Eli Zaretskii @ 2024-12-11 17:06 UTC (permalink / raw) To: Gerd Möllmann; +Cc: emacs-devel, ofv, pipcet > From: Gerd Möllmann <gerd.moellmann@gmail.com> > Cc: emacs-devel@gnu.org, ofv@wanadoo.es, pipcet@protonmail.com > Date: Wed, 11 Dec 2024 16:51:56 +0100 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Unless you have a huge (and I mean a HUGE) buffer, and some Lisp that > > moves point, then inserts a small number of characters, then moves > > point far away and again inserts a small number of characters, etc., > > I'd be very surprised if the gap buffer caused significant performance > > problems on a modern CPU. > > > > Can you profile that case and post the expanded profile? I'm always > > happy to be wrong about performance bottlenecks, and profiles are good > > at proving me wrong. > > Maybe I'll try to investigate that further at some point. Such things > always tend to be so time consuming... I meant profiling with "M-x profiler-start", then running your slow-down recipe. That should be easy and should not consume any significant time. Analyzing the profile could, but producing it shouldn't. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: Gap buffer problem? 2024-12-11 17:06 ` Eli Zaretskii @ 2024-12-11 17:15 ` Gerd Möllmann 0 siblings, 0 replies; 112+ messages in thread From: Gerd Möllmann @ 2024-12-11 17:15 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel, ofv, pipcet Eli Zaretskii <eliz@gnu.org> writes: >> From: Gerd Möllmann <gerd.moellmann@gmail.com> >> Cc: emacs-devel@gnu.org, ofv@wanadoo.es, pipcet@protonmail.com >> Date: Wed, 11 Dec 2024 16:51:56 +0100 >> >> Eli Zaretskii <eliz@gnu.org> writes: >> >> > Unless you have a huge (and I mean a HUGE) buffer, and some Lisp that >> > moves point, then inserts a small number of characters, then moves >> > point far away and again inserts a small number of characters, etc., >> > I'd be very surprised if the gap buffer caused significant performance >> > problems on a modern CPU. >> > >> > Can you profile that case and post the expanded profile? I'm always >> > happy to be wrong about performance bottlenecks, and profiles are good >> > at proving me wrong. >> >> Maybe I'll try to investigate that further at some point. Such things >> always tend to be so time consuming... > > I meant profiling with "M-x profiler-start", then run your slow-down > recipe. That should be easy and should not consume any significant > time. Analyzing the profile could, but producing it shouldn't. Plus making it reproducible, if it is. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-10 13:39 ` Óscar Fuentes 2024-12-10 14:39 ` Eli Zaretskii 2024-12-10 15:38 ` Pip Cet via Emacs development discussions. @ 2024-12-10 18:13 ` Gerd Möllmann 2 siblings, 0 replies; 112+ messages in thread From: Gerd Möllmann @ 2024-12-10 18:13 UTC (permalink / raw) To: Óscar Fuentes; +Cc: emacs-devel Óscar Fuentes <ofv@wanadoo.es> writes: > My perception of the past week or two using igc is that those pauses are > much less jarring, if perceptible at all. I need more time to make a > definitive judgment, though. Please make sure not to have --enable-checking=igc_debug and not to have --with-mps=debug. They are expensive, and I'm not talking about some dozen percent :-). ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-10 12:59 ` Eli Zaretskii 2024-12-10 13:39 ` Óscar Fuentes @ 2024-12-10 15:23 ` Pip Cet via Emacs development discussions. 2024-12-10 17:08 ` Eli Zaretskii 1 sibling, 1 reply; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-10 15:23 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Stefan Kangas, luangruo, ali_gnu2, emacs-devel On Tuesday, December 10th, 2024 at 12:59, Eli Zaretskii <eliz@gnu.org> wrote: > > From: Stefan Kangas stefankangas@gmail.com > > > Date: Mon, 9 Dec 2024 19:09:59 -0500 > > Cc: pipcet@protonmail.com, luangruo@yahoo.com, ali_gnu2@emvision.com, > > emacs-devel@gnu.org > > > > Eli Zaretskii eliz@gnu.org writes: > > > > > > From: Stefan Kangas stefankangas@gmail.com > > > > Date: Sun, 8 Dec 2024 23:59:14 -0500 > > > > Cc: luangruo@yahoo.com, ali_gnu2@emvision.com, emacs-devel@gnu.org > > > > > > > > Assuming that we are 100% sure that mpc will land, then I can agree that > > > > making any changes here is basically wasted effort. Unless, of course, > > > > the change would also simplify the mpc work (would it?). > > > > > > The igc branch already dropped WIDE_EMACS_INT support, so it only > > > supports USE_LSB anyway. > > > > I thought that WIDE_EMACS_INT will remain supported in non-MPS > > (i.e. "old GC") builds even after the igc merge? Am I mistaken? > > Probably, but who will want to give up igc to get back WIDE_EMACS_INT > (if indeed they are incompatible, which seems to be in disagreement)? It's !USE_LSB_TAG that's incompatible with MPS, not WIDE_EMACS_INT per se. I don't think anyone suggested that there is a fundamental problem if we force USE_LSB_TAG to 1 and enable WIDE_EMACS_INT. > > > > Leave the USE_LSB_TAG code as is, but set it to 1 in all configurations > > > > on master. > > > > > > That would put the WIDE_EMACS_INT configuration at risk, since that > > > configuration will need changes. 
> > > > That's why I proposed disabling it on master tentatively, with the > > option to revert the change if we don't like it. Setting a flag back to > > 0 is easy enough. But making the experiment I proposed might also > > demonstrate that we're fine, after all. > > I think we already know that we are "not fine"? Didn't someone say > that stack scanning is broken?

!USE_LSB_TAG && !WIDE_EMACS_INT: stack scanning is broken (but doesn't currently happen on actual machines)
USE_LSB_TAG && WIDE_EMACS_INT: stack scanning works (currently impossible, but trivial to enable)
USE_LSB_TAG && !WIDE_EMACS_INT: stack scanning works (this is the usual case)
!USE_LSB_TAG && WIDE_EMACS_INT: stack scanning works (this is Eli's situation)

So following Stefan's suggestion would fix the broken case. I've already reported that I tested this with the patch I posted and it appears to work just fine, with or without MPS. > > > My point is that all of that could be avoided entirely, given some > > > development decisions which basically drop !USE_LSB_TAG > > > configurations. > > > > Is your thinking here that we could merge MPS, wait, and then when it > > comes time to remove the old GC, we will get to drop !USE_LSB_TAG for > > free? If yes, couldn't that leave us waiting for a very long time > > indeed? > > Maybe so, but why is such a long wait a problem? GC works, and > works well. There are no pressing problems there, and we've lived > with it for many years virtually without changes. What's the urge to > make modifications there now, especially when there are chances we > will be dropping this GC at some point? The old !USE_LSB_TAG code, which is broken, interferes with GC development, both MPS and non-MPS. > IMO, our main task here is to develop the application levels of Emacs, > and infrastructure needed to enable such developments. We should only > invest efforts in stuff like GC and other basics if we see significant > issues, or could envision significant performance gains.
There are no > such issues or gains here, AFAIU. So diverting our humble resources > to such jobs is a mistake, IMO. Given how many GC developers we have already "lost", simplifying the GC code even a little so people can work with it is worth it, IMHO. And encouraging someone to invest resources into fixing a code path that will never again be used is a much greater mistake. Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
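For readers unfamiliar with the two tagging layouts being discussed, here is a generic sketch of the difference between keeping a type tag in the low bits of a word and keeping it in the high bits. The thread does not spell out why one layout is harder for stack scanning, so treat this only as an illustration of the general idea: the 3-bit tag width, the 64-bit word, the addresses, and all function names are assumptions made for the example, not Emacs definitions.

```python
TAG_BITS = 3                         # assumption: 3 tag bits, 8-byte aligned objects
TAG_MASK = (1 << TAG_BITS) - 1
WORD_BITS = 64                       # assumption: 64-bit words
SHIFT = WORD_BITS - TAG_BITS


def tag_lsb(addr, tag):
    assert addr & TAG_MASK == 0      # alignment leaves the low bits free
    return addr | tag


def untag_lsb(word):
    return word & ~TAG_MASK, word & TAG_MASK


def tag_msb(addr, tag):
    assert addr >> SHIFT == 0        # the address must fit below the tag bits
    return (tag << SHIFT) | addr


def untag_msb(word):
    return word & ((1 << SHIFT) - 1), word >> SHIFT


def points_into(word, start, end):
    """What a conservative scanner asks of a raw word: is it in [start, end)?"""
    return start <= word < end
```

A low-bit-tagged word still lies inside the object it refers to, so a scanner that treats raw words as possible pointers can recognize it without knowing about tags. A high-bit-tagged word looks like an address far outside the heap until the tag is stripped, so the scanner has to know the tagging scheme.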
* Re: pdumper on Solaris 10 2024-12-10 15:23 ` Pip Cet via Emacs development discussions. @ 2024-12-10 17:08 ` Eli Zaretskii 2024-12-10 18:03 ` Gerd Möllmann 0 siblings, 1 reply; 112+ messages in thread From: Eli Zaretskii @ 2024-12-10 17:08 UTC (permalink / raw) To: Pip Cet; +Cc: stefankangas, luangruo, ali_gnu2, emacs-devel > Date: Tue, 10 Dec 2024 15:23:45 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: Stefan Kangas <stefankangas@gmail.com>, luangruo@yahoo.com, ali_gnu2@emvision.com, emacs-devel@gnu.org > > > > I thought that WIDE_EMACS_INT will remain supported in non-MPS > > > (i.e. "old GC") builds even after the igc merge? Am I mistaken? > > > > Probably, but who will want to give up igc to get back WIDE_EMACS_INT > > (if indeed they are incompatible, which seems to be in disagreement)? > > It's !USE_LSB_TAG that's incompatible with MPS, not WIDE_EMACS_INT per se. I don't think anyone suggested that there is a fundamental problem if we force USE_LSB_TAG to 1 and enable WIDE_EMACS_INT. That's not what Gerd says, AFAIU. But if you are right, then how about making the WIDE_EMACS_INT configuration on the igc branch use USE_LSB_TAG in the HAVE_MPS code branch? I can volunteer to test such a build, if that would help. > > Maybe so, but why is such a long wait a problem? GC works, and > > works well. There are no pressing problems there, and we've lived > > with it for many years virtually without changes. What's the urge to > > make modifications there now, especially when there are chances we > > will be dropping this GC at some point? > > The old !USE_LSB_TAG code, which is broken, interferes with GC development, both MPS and non-MPS. That work is on the igc branch. My objection is against doing that on master and/or with the "old" GC code. In the HAVE_MPS branch of the code, all the arguments I brought up against removing !USE_LSB_TAG are null and void, and I therefore have no objections to doing that in those parts of the code. 
> > IMO, our main task here is to develop the application levels of Emacs, > > and infrastructure needed to enable such developments. We should only > > invest efforts in stuff like GC and other basics if we see significant > > issues, or could envision significant performance gains. There are no > > such issues or gains here, AFAIU. So diverting our humble resources > > to such jobs is a mistake, IMO. > > Given how many GC developers we have already "lost", simplifying the GC code even a little so people can work with it is worth it, IMHO. And encouraging someone to invest resources into fixing a code path that will never again be used is a much greater mistake. Our perspectives are very different, so let's agree to disagree on this. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-10 17:08 ` Eli Zaretskii @ 2024-12-10 18:03 ` Gerd Möllmann 2024-12-10 19:34 ` Pip Cet via Emacs development discussions. 2024-12-11 14:13 ` Pip Cet via Emacs development discussions. 0 siblings, 2 replies; 112+ messages in thread From: Gerd Möllmann @ 2024-12-10 18:03 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Pip Cet, stefankangas, luangruo, ali_gnu2, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> Date: Tue, 10 Dec 2024 15:23:45 +0000 >> From: Pip Cet <pipcet@protonmail.com> >> Cc: Stefan Kangas <stefankangas@gmail.com>, luangruo@yahoo.com, ali_gnu2@emvision.com, emacs-devel@gnu.org >> >> > > I thought that WIDE_EMACS_INT will remain supported in non-MPS >> > > (i.e. "old GC") builds even after the igc merge? Am I mistaken? >> > >> > Probably, but who will want to give up igc to get back WIDE_EMACS_INT >> > (if indeed they are incompatible, which seems to be in disagreement)? >> >> It's !USE_LSB_TAG that's incompatible with MPS, not WIDE_EMACS_INT per se. I don't think anyone suggested that there is a fundamental problem if we force USE_LSB_TAG to 1 and enable WIDE_EMACS_INT. > > That's not what Gerd says, AFAIU. But if you are right, then how > about making the WIDE_EMACS_INT configuration on the igc branch use > USE_LSB_TAG in the HAVE_MPS code branch? I can volunteer to test such > a build, if that would help. If a Lisp_Object looks like this 0 32 64 +------------------+-------------------+ | tag | pointer | ... | +------------------+-------------------+ there is a chance it could be made to work, if ugly. That's USE_LSB_TAG == 1. If it looks like this 0 32 64 +------------------+-------------------+ | pointer | ... |tag | +------------------+-------------------+ it gets a lot more ugly. That's USE_LSB_TAG == 0. ^ permalink raw reply [flat|nested] 112+ messages in thread
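For readers who want the two diagrams above in concrete terms, here is a minimal sketch of both layouts for a 64-bit Lisp_Object holding a 32-bit pointer. It is not taken from lisp.h: every sketch_* name is hypothetical, and the 3 tag bits are an assumption meant to mirror GCTYPEBITS.

```c
#include <stdint.h>

/* Assumption: 3 tag bits, mirroring GCTYPEBITS.  All sketch_* names
   are hypothetical and exist only for this illustration.  */
enum { SKETCH_TAG_BITS = 3 };

/* USE_LSB_TAG == 1: the tag shares the low word with the pointer.
   PTR is assumed to be 8-byte aligned, so its low 3 bits are free.  */
static uint64_t
sketch_lsb_pack (uint32_t ptr, unsigned tag)
{
  return (uint64_t) ptr | tag;
}

/* USE_LSB_TAG == 0: the tag occupies the top bits of the 64-bit
   value, i.e. it lives in the high word, away from the pointer.  */
static uint64_t
sketch_msb_pack (uint32_t ptr, unsigned tag)
{
  return ((uint64_t) tag << (64 - SKETCH_TAG_BITS)) | ptr;
}

/* The two 32-bit halves, as a scanner that only sees 32-bit words
   would encounter them.  */
static uint32_t
sketch_low_word (uint64_t v)
{
  return (uint32_t) v;
}

static uint32_t
sketch_high_word (uint64_t v)
{
  return (uint32_t) (v >> 32);
}
```

In the first layout the low word alone carries both tag and pointer and the high word is zero; in the second, a reader has to look at both halves to recover the tag.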
* Re: pdumper on Solaris 10 2024-12-10 18:03 ` Gerd Möllmann @ 2024-12-10 19:34 ` Pip Cet via Emacs development discussions. 2024-12-10 19:59 ` Gerd Möllmann 2024-12-11 14:13 ` Pip Cet via Emacs development discussions. 1 sibling, 1 reply; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-10 19:34 UTC (permalink / raw) To: Gerd Möllmann Cc: Eli Zaretskii, stefankangas, luangruo, ali_gnu2, emacs-devel Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Eli Zaretskii <eliz@gnu.org> writes: > >>> Date: Tue, 10 Dec 2024 15:23:45 +0000 >>> From: Pip Cet <pipcet@protonmail.com> >>> Cc: Stefan Kangas <stefankangas@gmail.com>, luangruo@yahoo.com, ali_gnu2@emvision.com, emacs-devel@gnu.org >>> >>> > > I thought that WIDE_EMACS_INT will remain supported in non-MPS >>> > > (i.e. "old GC") builds even after the igc merge? Am I mistaken? >>> > >>> > Probably, but who will want to give up igc to get back WIDE_EMACS_INT >>> > (if indeed they are incompatible, which seems to be in disagreement)? >>> >>> It's !USE_LSB_TAG that's incompatible with MPS, not WIDE_EMACS_INT >>> per se. I don't think anyone suggested that there is a fundamental >>> problem if we force USE_LSB_TAG to 1 and enable WIDE_EMACS_INT. >> >> That's not what Gerd says, AFAIU. But if you are right, then how >> about making the WIDE_EMACS_INT configuration on the igc branch use >> USE_LSB_TAG in the HAVE_MPS code branch? I can volunteer to test such >> a build, if that would help. Thanks for the offer. I definitely think we should move away from USE_LSB_TAG=0 as much as possible, and if the only place where such a change would not be vetoed is scratch/igc + WIDE_EMACS_INT, we can at least fix it there. If any issues arise, of course, it will be more difficult to ascertain whether they were caused by the USE_LSB_TAG change or the IGC changes themselves. So I'll push that change in a bit, unless someone objects. 
> If a Lisp_Object looks like this > > 0 32 64 > +------------------+-------------------+ > | tag | pointer | ... | > +------------------+-------------------+ > > there is a chance it could be made to work, if ugly. That's USE_LSB_TAG > == 1. It does appear to work. I'm not sure how it is "ugly", to be honest, since MPS only sees 32-bit words, and that's the tagged pointer and 0. No changes required. > If it looks like this > > 0 32 64 > +------------------+-------------------+ > | pointer | ... |tag | > +------------------+-------------------+ > > it gets a lot more ugly. That's USE_LSB_TAG == 0. Given that gcc likes storing the two 32-bit words of a 64-bit integer in non-adjacent places on the stack, it would be quite expensive to get this working. And if we decided to do that, it would become a lot more complicated to change our tagging scheme (which we should do, some time after merging MPS, to speed up EQ by having a "may be EQ to a different object" tag or, ideally, bit: EQ could then be simplified to if (x == y) return true; else if (((x|y) & BIT) == 0) return false; <expensive non-inlined code here>) Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
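The EQ idea above can be sketched as a small self-contained toy. This is only an illustration under loud assumptions: the encoding (bit 0 as the "may be EQ to a different object" bit, bits 1..15 as a position, bits 16 and up as the symbol identity) is invented for the example and is not Emacs's tagging scheme, and all sketch_* names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sketch_obj;

/* Hypothetical encoding, NOT Emacs's: bit 0 marks objects that may be
   EQ to a bitwise-different object.  */
#define SKETCH_EXOTIC_BIT ((sketch_obj) 1)

#if defined __GNUC__
# define SKETCH_NOINLINE __attribute__ ((noinline))
#else
# define SKETCH_NOINLINE
#endif

/* A plain symbol: identity only, exotic bit clear.  */
static sketch_obj
sketch_bare_symbol (uint32_t id)
{
  return (sketch_obj) id << 16;
}

/* A symbol carrying a position: same identity, exotic bit set.  */
static sketch_obj
sketch_symbol_with_pos (uint32_t id, unsigned pos)
{
  return ((sketch_obj) id << 16)
	 | ((sketch_obj) (pos & 0x7fff) << 1)
	 | SKETCH_EXOTIC_BIT;
}

/* Cold path, kept out of line: compare identities, ignore positions.  */
static SKETCH_NOINLINE bool
sketch_eq_slow (sketch_obj x, sketch_obj y)
{
  return (x >> 16) == (y >> 16);
}

/* Hot path: two cheap tests before falling back to the slow code.  */
static inline bool
sketch_eq (sketch_obj x, sketch_obj y)
{
  if (x == y)
    return true;
  if (((x | y) & SKETCH_EXOTIC_BIT) == 0)
    return false;
  return sketch_eq_slow (x, y);
}
```

Note that the slow path is needed as soon as either argument has the bit set, which is why the test is on x|y rather than x&y: a bare symbol and a symbol with position differ bitwise but should still compare equal in this toy.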
* Re: pdumper on Solaris 10 2024-12-10 19:34 ` Pip Cet via Emacs development discussions. @ 2024-12-10 19:59 ` Gerd Möllmann 2024-12-10 20:17 ` Pip Cet via Emacs development discussions. 0 siblings, 1 reply; 112+ messages in thread From: Gerd Möllmann @ 2024-12-10 19:59 UTC (permalink / raw) To: Pip Cet; +Cc: Eli Zaretskii, stefankangas, luangruo, ali_gnu2, emacs-devel Pip Cet <pipcet@protonmail.com> writes: >> If a Lisp_Object looks like this >> >> 0 32 64 >> +------------------+-------------------+ >> | tag | pointer | ... | >> +------------------+-------------------+ >> >> there is a chance it could be made to work, if ugly. That's USE_LSB_TAG >> == 1. > > It does appear to work. I'm not sure how it is "ugly", to be honest, > since MPS only sees 32-bit words, and that's the tagged pointer and > 0. No changes required. I was just assuming it would end up ugly in some form. But I haven't thought about it much. WIDE_INT and 32-bits are in an SEP field for me :-). >> If it looks like this >> >> 0 32 64 >> +------------------+-------------------+ >> | pointer | ... |tag | >> +------------------+-------------------+ >> >> it gets a lot more ugly. That's USE_LSB_TAG == 0. > > Given that gcc likes storing the two 32-bit words of a 64-bit integer in > non-adjacent places on the stack, it would be quite expensive to get > this working. Yeah, that's for sure. Nightmare. > And if we decided to do that, it would become a lot more complicated to > change our tagging scheme (which we should do, some time after merging > MPS, to speed up EQ by having a "may be EQ to a different object" tag > or, ideally, bit: EQ could then be simplified to > > if (x == y) > return true; > else if (((x|y) & BIT) == 0) > return false; > > <expensive non-inlined code here>) Hm, interesting idea. One would have to try it out of course to know, but from a gut feeling, would you say one would notice a difference? I don't have an "educated" gut feeling wrt EQ. 
^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-10 19:59 ` Gerd Möllmann @ 2024-12-10 20:17 ` Pip Cet via Emacs development discussions. 2024-12-10 20:34 ` Gerd Möllmann 0 siblings, 1 reply; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-10 20:17 UTC (permalink / raw) To: Gerd Möllmann Cc: Eli Zaretskii, stefankangas, luangruo, ali_gnu2, emacs-devel Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Pip Cet <pipcet@protonmail.com> writes: > >>> If a Lisp_Object looks like this >>> >>> 0 32 64 >>> +------------------+-------------------+ >>> | tag | pointer | ... | >>> +------------------+-------------------+ >>> >>> there is a chance it could be made to work, if ugly. That's USE_LSB_TAG >>> == 1. >> >> It does appear to work. I'm not sure how it is "ugly", to be honest, >> since MPS only sees 32-bit words, and that's the tagged pointer and >> 0. No changes required. > > I was just assuming it would end up ugly in some form. But I haven't > thought about it much. WIDE_INT and 32-bits are in an SEP field for > me :-). > >>> If it looks like this >>> >>> 0 32 64 >>> +------------------+-------------------+ >>> | pointer | ... |tag | >>> +------------------+-------------------+ >>> >>> it gets a lot more ugly. That's USE_LSB_TAG == 0. >> >> Given that gcc likes storing the two 32-bit words of a 64-bit integer in >> non-adjacent places on the stack, it would be quite expensive to get >> this working. > > Yeah, that's for sure. Nightmare. > >> And if we decided to do that, it would become a lot more complicated to >> change our tagging scheme (which we should do, some time after merging >> MPS, to speed up EQ by having a "may be EQ to a different object" tag >> or, ideally, bit: EQ could then be simplified to >> >> if (x == y) >> return true; >> else if (((x|y) & BIT) == 0) >> return false; >> >> <expensive non-inlined code here>) > > Hm, interesting idea. 
One would have to try it out of course to know, > but from a gut feeling, would you say one would notice a difference? > I don't have an "educated" gut feeling wrt EQ. My gut feeling is that EQ happens so often that it's worth micro-optimizing. Andrea has started doing that by using __builtin_expect, but the assembler code we produce still looks very inefficient. In particular, we don't even perform a quick exit if the arguments are BASE_EQ, or attempt to move the cold code into its own function, which shouldn't be inlined (there are about 2000 locations GDB thinks correspond to EQ calls in my current Emacs, so that's a lot of duplicated code). I was going to suggest a patch to change that... Of course it's entirely possible that EQ just doesn't matter for performance. My entire post-MPS proposal is to have bignums, floats and symbols-with-position as the "exotic" tags that (may) need special handling in EQ. That leaves four tags for fixnums, strings, vectorlikes, symbols, and cons cells, which doesn't work out. I _think_ the least painful option is to give strings the "treat specially in EQ" bit, since comparing strings with EQ, while legal, is rare. (and, yes, this approach would use Lisp_Type_Unused0 and reduce fixnum range by one bit). Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-10 20:17 ` Pip Cet via Emacs development discussions. @ 2024-12-10 20:34 ` Gerd Möllmann 0 siblings, 0 replies; 112+ messages in thread From: Gerd Möllmann @ 2024-12-10 20:34 UTC (permalink / raw) To: Pip Cet; +Cc: Eli Zaretskii, stefankangas, luangruo, ali_gnu2, emacs-devel Pip Cet <pipcet@protonmail.com> writes: >> Hm, interesting idea. One would have to try it out of course to know, >> but from a gut feeling, would you say one would notice a difference? >> I don't have an "educated" gut feeling wrt EQ. > > My gut feeling is that EQ happens so often that it's worth > micro-optimizing. Andrea has started doing that by using > __builtin_expect, but the assembler code we produce still looks very > inefficient. In particular, we don't even perform a quick exit if the > arguments are BASE_EQ, or attempt to move the cold code into its own > function, which shouldn't be inlined (there are about 2000 locations GDB > thinks correspond to EQ calls in my current Emacs, so that's a lot of > duplicated code). > > I was going to suggest a patch to change that... > > Of course it's entirely possible that EQ just doesn't matter for > performance. The proof is in the pudding, I guess. > My entire post-MPS proposal is to have bignums, floats and > symbols-with-position as the "exotic" tags that (may) need special handling in > EQ. That leaves four tags for fixnums, strings, vectorlikes, symbols, > and cons cells, which doesn't work out. > > I _think_ the least painful option is to give strings the "treat > specially in EQ" bit, since comparing strings with EQ, while legal, is > rare. > > (and, yes, this approach would use Lisp_Type_Unused0 and reduce fixnum > range by one bit). Oops :-). ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-10 18:03 ` Gerd Möllmann 2024-12-10 19:34 ` Pip Cet via Emacs development discussions. @ 2024-12-11 14:13 ` Pip Cet via Emacs development discussions. 2024-12-11 17:43 ` Eli Zaretskii 1 sibling, 1 reply; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-11 14:13 UTC (permalink / raw) To: Gerd Möllmann Cc: Eli Zaretskii, stefankangas, luangruo, ali_gnu2, emacs-devel Pip Cet <pipcet@protonmail.com> writes: > Gerd Möllmann <gerd.moellmann@gmail.com> writes: >> Eli Zaretskii <eliz@gnu.org> writes: >>> That's not what Gerd says, AFAIU. But if you are right, then how >>> about making the WIDE_EMACS_INT configuration on the igc branch use >>> USE_LSB_TAG in the HAVE_MPS code branch? I can volunteer to test such >>> a build, if that would help. > > Thanks for the offer. I definitely think we should move away from > USE_LSB_TAG=0 as much as possible, and if the only place where such a > change would not be vetoed is scratch/igc + WIDE_EMACS_INT, we can at > least fix it there. If any issues arise, of course, it will be more > difficult to ascertain whether they were caused by the USE_LSB_TAG > change or the IGC changes themselves. > > So I'll push that change in a bit, unless someone objects. Just pushed it to the scratch/igc branch. It shouldn't have any effect on ordinary 64-bit builds; some of the code is to cater to the hypothetical big-endian 32-bit use case, and technically the x86 MPS weak pointer "optimization" could bite us again, but recent GCC does not generate the precise instructions that MPS emulates, so I'll risk it. As I've just explained, bug reports for the WIDE_EMACS_INT case will be difficult to deal with, as there are two major changes; let's see what happens, but I suspect we'll end up having to ask users to build a !MPS + USE_LSB_TAG + WIDE_EMACS_INT configuration. Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-11 14:13 ` Pip Cet via Emacs development discussions. @ 2024-12-11 17:43 ` Eli Zaretskii 0 siblings, 0 replies; 112+ messages in thread From: Eli Zaretskii @ 2024-12-11 17:43 UTC (permalink / raw) To: Pip Cet; +Cc: gerd.moellmann, stefankangas, luangruo, ali_gnu2, emacs-devel > Date: Wed, 11 Dec 2024 14:13:11 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: Eli Zaretskii <eliz@gnu.org>, stefankangas@gmail.com, luangruo@yahoo.com, ali_gnu2@emvision.com, emacs-devel@gnu.org > > Pip Cet <pipcet@protonmail.com> writes: > > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > >> Eli Zaretskii <eliz@gnu.org> writes: > >>> That's not what Gerd says, AFAIU. But if you are right, then how > >>> about making the WIDE_EMACS_INT configuration on the igc branch use > >>> USE_LSB_TAG in the HAVE_MPS code branch? I can volunteer to test such > >>> a build, if that would help. > > > > Thanks for the offer. I definitely think we should move away from > > USE_LSB_TAG=0 as much as possible, and if the only place where such a > > change would not be vetoed is scratch/igc + WIDE_EMACS_INT, we can at > > least fix it there. If any issues arise, of course, it will be more > > difficult to ascertain whether they were caused by the USE_LSB_TAG > > change or the IGC changes themselves. > > > > So I'll push that change in a bit, unless someone objects. > > Just pushed it to the scratch/igc branch. It shouldn't have any effect > on ordinary 64-bit builds; some of the code is to cater to the > hypothetical big-endian 32-bit use case, and technically the x86 MPS > weak pointer "optimization" could bite us again, but recent GCC does not > generate the precise instructions that MPS emulates, so I'll risk it. The "normal" (i.e. 
without WIDE_EMACS_INT) 32-bit MS-Windows build is now broken: CCLD temacs.exe GEN ../etc/DOC /bin/mkdir -p ../etc make -C ../lisp update-subdirs make[3]: Entering directory `/d/gnu/git/emacs/feature/lisp' make[3]: Leaving directory `/d/gnu/git/emacs/feature/lisp' cp -f temacs.exe bootstrap-emacs.exe rm -f bootstrap-emacs.pdmp ./temacs --batch -l loadup --temacs=pbootstrap \ --bin-dest '/d/usr/bin/' --eln-dest '/d/usr/lib/emacs/31.0.50/' Loading loadup.el (source)... Dump mode: pbootstrap Using load-path (d:/gnu/git/emacs/feature/lisp d:/gnu/git/emacs/feature/lisp/emacs-lisp d:/gnu/git/emacs/feature/lisp/progmodes d:/gnu/git/emacs/feature/lisp/language d:/gnu/git/emacs/feature/lisp/international d:/gnu/git/emacs/feature/lisp/textmodes d:/gnu/git/emacs/feature/lisp/vc) Loading emacs-lisp/debug-early... Loading emacs-lisp/byte-run... Loading emacs-lisp/backquote... Loading subr... lisp.h:1273: Emacs fatal error: assertion failed: !FIXNUM_OVERFLOW_P (n) Backtrace: 0124c3b2 Makefile:1018: recipe for target `bootstrap-emacs.pdmp' failed make[2]: *** [bootstrap-emacs.pdmp] Error 3 Below I show the backtrace and some data from GDB. Does such a build work on GNU/Linux? Let me know if I can provide more data for debugging this. Thread 1 hit Breakpoint 1, terminate_due_to_signal (sig=sig@entry=22, backtrace_limit=backtrace_limit@entry=2147483647) at emacs.c:432 432 { (gdb) bt #0 terminate_due_to_signal (sig=sig@entry=22, backtrace_limit=backtrace_limit@entry=2147483647) at emacs.c:432 #1 0x00b1b53a in die ( msg=msg@entry=0x10e4889 <i_fwd+849> "!FIXNUM_OVERFLOW_P (n)", file=file@entry=0x10e478d <i_fwd+597> "lisp.h", line=line@entry=1273) at alloc.c:8377 #2 0x00bd7830 in make_fixnum (n=<optimized out>) at lisp.h:1273 #3 0x00bdaf04 in make_fixnum (n=<optimized out>) at lisp.h:1273 #4 weak_hash_table_entry (entry=...) 
at igc.c:4111 #5 0x00b514e9 in WEAK_HASH_INDEX (h=<optimized out>, idx=<optimized out>) at fns.c:5487 #6 0x00b52c4b in weak_hash_lookup_with_hash (h=h@entry=0xb0542b8, key=key@entry=XIL(0xb059b43), hash=hash@entry=make_fixnum(14948)) at fns.c:5722 #7 0x00b60f85 in Fputhash (key=XIL(0xb059b43), value=XIL(0xb059cfb), table=XIL(0xb0542bd)) at fns.c:6555 #8 0x00b9bc8a in exec_byte_code (fun=<optimized out>, args_template=771, args_template@entry=0, nargs=<optimized out>, nargs@entry=0, args=<optimized out>, args@entry=0x0) at lisp.h:791 #9 0x00b9e082 in Fbyte_code (bytestr=<optimized out>, vector=XIL(0xb059e9d), maxdepth=make_fixnum(4)) at bytecode.c:325 #10 0x00b4be63 in eval_sub (form=form@entry=XIL(0xb059c6b)) at eval.c:2610 #11 0x00b87f70 in readevalloop (readcharfun=readcharfun@entry=XIL(0x60a0), infile0=infile0@entry=0x767f048, sourcename=sourcename@entry=XIL(0xb059a34), printflag=printflag@entry=false, unibyte=unibyte@entry=XIL(0), readfun=readfun@entry=XIL(0), start=start@entry=XIL(0), end=<optimized out>, end@entry=XIL(0)) at lread.c:2540 #12 0x00b889e5 in Fload (file=XIL(0xb0598fc), noerror=XIL(0), nomessage=XIL(0), nosuffix=XIL(0), must_suffix=<optimized out>) at lisp.h:1226 #13 0x00b4be1a in eval_sub (form=form@entry=XIL(0xb0598eb)) at eval.c:2618 #14 0x00b87f70 in readevalloop (readcharfun=readcharfun@entry=XIL(0x60a0), infile0=infile0@entry=0x767f638, sourcename=sourcename@entry=XIL(0xb04a314), printflag=printflag@entry=false, unibyte=unibyte@entry=XIL(0), readfun=readfun@entry=XIL(0), start=start@entry=XIL(0), end=<optimized out>, end@entry=XIL(0)) at lread.c:2540 #15 0x00b889e5 in Fload (file=XIL(0xb049f84), noerror=XIL(0), nomessage=XIL(0), nosuffix=XIL(0), must_suffix=<optimized out>) at lisp.h:1226 #16 0x00b4be1a in eval_sub (form=form@entry=XIL(0xb049fab)) at eval.c:2618 #17 0x00b4dd99 in Feval (form=XIL(0xb049fab), lexical=lexical@entry=XIL(0x20)) at eval.c:2463 #18 0x00aa69a1 in top_level_2 () at lisp.h:1226 #19 0x00b4613b in 
internal_condition_case ( bfun=bfun@entry=0xaa6943 <top_level_2>, handlers=handlers@entry=XIL(0x60), hfun=hfun@entry=0xab0496 <cmd_error>) at eval.c:1618 #20 0x00aa70b0 in top_level_1 (ignore=XIL(0)) at lisp.h:1226 #21 0x00b46055 in internal_catch (tag=tag@entry=XIL(0xc540), func=func@entry=0xaa7087 <top_level_1>, arg=arg@entry=XIL(0)) at eval.c:1297 #22 0x00aa675f in command_loop () at lisp.h:1226 #23 0x00ab0054 in recursive_edit_1 () at keyboard.c:760 #24 0x00ab0342 in Frecursive_edit () at keyboard.c:843 #25 0x00cf4375 in main (argc=<optimized out>, argv=<optimized out>) at emacs.c:2646 (gdb) up #1 0x00b1b53a in die ( msg=msg@entry=0x10e4889 <i_fwd+849> "!FIXNUM_OVERFLOW_P (n)", file=file@entry=0x10e478d <i_fwd+597> "lisp.h", line=line@entry=1273) at alloc.c:8377 8377 terminate_due_to_signal (SIGABRT, INT_MAX); (gdb) #2 0x00bd7830 in make_fixnum (n=<optimized out>) at lisp.h:1273 1273 eassert (!FIXNUM_OVERFLOW_P (n)); (gdb) #3 0x00bdaf04 in make_fixnum (n=<optimized out>) at lisp.h:1273 1273 eassert (!FIXNUM_OVERFLOW_P (n)); (gdb) #4 weak_hash_table_entry (entry=...) at igc.c:4111 4111 return make_fixnum (entry.intptr >> 1); (gdb) p entry $1 = { intptr = 4294967295, fixnum = make_fixnum(6) } (gdb) up #5 0x00b514e9 in WEAK_HASH_INDEX (h=<optimized out>, idx=<optimized out>) at fns.c:5487 5487 return XFIXNUM (weak_hash_table_entry (h->strong->index[idx])); (gdb) up #6 0x00b52c4b in weak_hash_lookup_with_hash (h=h@entry=0xa8542b8, key=key@entry=XIL(0xa859b43), hash=hash@entry=make_fixnum(14948)) at fns.c:5722 5722 for (ptrdiff_t i = WEAK_HASH_INDEX (h, start_of_bucket); (gdb) up #7 0x00b60f85 in Fputhash (key=XIL(0xa859b43), value=XIL(0xa859cfb), table=XIL(0xa8542bd)) at fns.c:6555 6555 ptrdiff_t i = weak_hash_lookup_with_hash (wh, key, hash); ^ permalink raw reply [flat|nested] 112+ messages in thread
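As a reading aid for the assertion above, the arithmetic can be sketched as follows. This does not reproduce igc.c or lisp.h: the sketch_* names are hypothetical, and it assumes a 32-bit EMACS_INT with 2 bits reserved for the fixnum tag (GCTYPEBITS - 1), which should be checked against the actual build. It makes no claim about what value igc.c intended to store in entry.intptr.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumptions (not verified against this build): EMACS_INT is 32 bits
   wide and fixnums give up 2 bits to the tag.  */
enum { SKETCH_INTTYPEBITS = 2 };

#define SKETCH_MOST_POSITIVE_FIXNUM \
  ((int64_t) INT32_MAX >> SKETCH_INTTYPEBITS)
#define SKETCH_MOST_NEGATIVE_FIXNUM (-1 - SKETCH_MOST_POSITIVE_FIXNUM)

/* Hypothetical stand-in for FIXNUM_OVERFLOW_P on such a build.  */
static bool
sketch_fixnum_overflow_p (int64_t n)
{
  return n < SKETCH_MOST_NEGATIVE_FIXNUM
	 || n > SKETCH_MOST_POSITIVE_FIXNUM;
}

/* The expression from igc.c:4111 as shown in the backtrace,
   "entry.intptr >> 1", applied to an unsigned 32-bit value.  */
static int64_t
sketch_entry_to_index (uint32_t intptr_value)
{
  return (int64_t) (intptr_value >> 1);
}
```

Under those assumptions, the intptr value GDB printed (4294967295) shifts down to 2147483647, which is well above the largest fixnum of 536870911, so the eassert in make_fixnum would fire; the hash value 14948 from frame #6 is comfortably in range.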
* Re: pdumper on Solaris 10 2024-12-09 4:59 ` Stefan Kangas 2024-12-09 14:39 ` Eli Zaretskii @ 2024-12-09 16:21 ` Pip Cet via Emacs development discussions. 1 sibling, 0 replies; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-09 16:21 UTC (permalink / raw) To: Stefan Kangas; +Cc: Eli Zaretskii, luangruo, ali_gnu2, emacs-devel "Stefan Kangas" <stefankangas@gmail.com> writes: > Eli Zaretskii <eliz@gnu.org> writes: >>> Date: Sun, 08 Dec 2024 17:37:50 +0000 >>> From: Pip Cet <pipcet@protonmail.com> >>> Cc: luangruo@yahoo.com, ali_gnu2@emvision.com, emacs-devel@gnu.org >>> >>> "Eli Zaretskii" <eliz@gnu.org> writes: >>> >>> >> So let's remove it, and switch WIDE_EMACS_INT builds to USE_LSB? >>> > >>> > That'd be a waste of effort. >>> >>> It'd be a good investment of effort today, in exchange for making the GC >>> code significantly easier to understand and maintain in the future. It >>> would certainly not be without its benefits, so calling it a "waste of >>> effort" is unfair. >> >> I disagree. We've lived with this GC code for a long time, and I >> don't see any complications due to !USE_LSB. And if we are going to >> switch to igc at some point, investment in GC is even less sensible. > > Assuming that we are 100% sure that MPS will land, then I can agree that Even if MPS does land on master, the old GC will remain in place for a very long time, so I don't think we should declare the old GC a do-not-touch zone just yet. > making any changes here is basically wasted effort. Unless, of course, > the change would also simplify the MPS work (would it?). I believe it would, yes. > On the other hand, IIUC, we have some way to go with making the merging > of the MPS branch a guarantee. While I'm an enthusiastic supporter of > the great work that's being done on the MPS branch, isn't hedging our > bets prudent until that work is done? > > Or am I misunderstanding how close we are to merging the MPS branch? 
My current understanding is that Eli expressed requirements for how things like the signalling issue should be fixed. While I have a solution that appears to work, it doesn't meet these requirements. >>> If performance and wasted memory aren't issues, then it's a tradeoff >>> between leaving old code untouched and simplifying it to enable future >>> development. >> >> The existing code doesn't preclude nor interfere with future >> development. So yes, leaving working code untouched is the preference >> here. > > Based on my limited mucking around in the GC, it does interfere somewhat > because you do need to understand both configurations, at least on a > high level, and once you do you need to mentally filter that stuff out > when reading the code. So I think I'd appreciate the simplification, at > least. I agree with Stefan here. Also, let's keep in mind that !USE_LSB_TAG in its original use case is currently broken, and has been for a long time. > If the only known drawbacks are stability concerns, we could also > consider an intermediate step along these lines: > > Leave the USE_LSB_TAG code as is, but set it to 1 in all configurations > on master. See what issues crop up, if any. If anything does come up, > ask Pip Cet to fix it (he volunteered, IIUC), and if things are starting > to look too hairy, revert WIDE_EMACS_INT back to !USE_LSB_TAG. If > nothing too bad comes up, we can then consider removing the associated > code in Emacs 32. I think that would be a good approach! I'd just like to add that stability concerns go both ways: it's a good reason to move the very few remaining users of !USE_LSB_TAG to use the same code (and experience the same problems) as all other users, rather than splitting what time we have for GC work between the two code paths. Pip ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-08 16:49 ` Eli Zaretskii 2024-12-08 17:37 ` Pip Cet via Emacs development discussions. @ 2024-12-08 18:47 ` Pip Cet via Emacs development discussions. 2024-12-09 1:13 ` Po Lu 2 siblings, 0 replies; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-08 18:47 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, ali_gnu2, emacs-devel Pip Cet <pipcet@protonmail.com> writes: >>> > In fact, one of my strongest reservations about the igc branch is that >>> > it will most probably force me to lose WIDE_EMACS_INT. >>> >>> I believe that problem is exclusively due to the fact that >>> WIDE_EMACS_INT implies USE_LSB=0. Dropping !USE_LSB should allow us to >>> use WIDE_EMACS_INT normally in MPS builds, I think. >> >> No, there's also a built-in assumption in MPS about the size of a >> word. > > That's very vague. If there is an assumption that EMACS_INT == > mps_word_t, it would certainly not be built into MPS, which doesn't know > about EMACS_INT at all. But as it is, I have no idea where you even > suspect this "built-in" assumption is made. FWIW, my MPS branch works fine in this constellation (32-bit x86, WIDE_EMACS_INT, USE_LSB_TAG) on GNU/Linux. If there is an issue, it must be quite subtle. Or specific to mingw32, which would mean it has to wait until some day, if ever, that toolchain becomes available on the internet again. 
commit 4370e866d8557b55c948e740d119e170338b91fd
Author: Pip Cet <pipcet@protonmail.com>
Date:   Sun Dec 8 17:45:12 2024 +0000

    try enabling MPS for WIDE_EMACS_INT + USE_LSB_TAG builds

diff --git a/src/igc.c b/src/igc.c
index 4589cfd0085..e97277d962c 100644
--- a/src/igc.c
+++ b/src/igc.c
@@ -70,9 +70,6 @@
 #ifndef USE_LSB_TAG
 # error "USE_LSB_TAG required"
 #endif
-#ifdef WIDE_EMACS_INT
-# error "WIDE_EMACS_INT not supported"
-#endif
 #if USE_STACK_LISP_OBJECTS
 # error "USE_STACK_LISP_OBJECTS not supported"
 #endif
diff --git a/src/lisp.h b/src/lisp.h
index d4638fa160c..0541f8f901b 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -280,7 +280,7 @@ #define VAL_MAX (EMACS_INT_MAX >> (GCTYPEBITS - 1))
  b. slower, because it typically requires extra masking.
  So, USE_LSB_TAG is true only on hosts where it might be useful. */
 DEFINE_GDB_SYMBOL_BEGIN (bool, USE_LSB_TAG)
-#define USE_LSB_TAG (VAL_MAX / 2 < INTPTR_MAX)
+#define USE_LSB_TAG 1
 DEFINE_GDB_SYMBOL_END (USE_LSB_TAG)
 
 /* Mask for the value (as opposed to the type bits) of a Lisp object. */

^ permalink raw reply related [flat|nested] 112+ messages in thread
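To see why WIDE_EMACS_INT has so far implied !USE_LSB_TAG, it may help to evaluate the pre-patch definition from the diff above, USE_LSB_TAG = (VAL_MAX / 2 < INTPTR_MAX) with VAL_MAX = EMACS_INT_MAX >> (GCTYPEBITS - 1), for a few host configurations. The following is a sketch only: the function name is hypothetical, GCTYPEBITS is assumed to be 3, and the maxima are passed in as parameters rather than taken from a real build.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: GCTYPEBITS is 3, as in current lisp.h.  */
enum { SKETCH_GCTYPEBITS = 3 };

/* Hypothetical helper: evaluate the old USE_LSB_TAG expression for a
   host described by the maxima of EMACS_INT and intptr_t.  */
static bool
sketch_use_lsb_tag (int64_t emacs_int_max, int64_t intptr_max)
{
  int64_t val_max = emacs_int_max >> (SKETCH_GCTYPEBITS - 1);
  return val_max / 2 < intptr_max;
}
```

With 32-bit EMACS_INT and 32-bit pointers the expression is true; with 64-bit EMACS_INT and 32-bit pointers (the WIDE_EMACS_INT case) it is false, which is the configuration the patch overrides by hard-coding 1; with 64-bit EMACS_INT and 64-bit pointers it is true again.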
* Re: pdumper on Solaris 10 2024-12-08 16:49 ` Eli Zaretskii 2024-12-08 17:37 ` Pip Cet via Emacs development discussions. 2024-12-08 18:47 ` Pip Cet via Emacs development discussions. @ 2024-12-09 1:13 ` Po Lu 2 siblings, 0 replies; 112+ messages in thread From: Po Lu @ 2024-12-09 1:13 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Pip Cet, ali_gnu2, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> As 32-bit systems go away, it will become harder to write Lisp code that >> works correctly in !WIDE_EMACS_INT 32-bit builds, so we may well have to >> make WIDE_EMACS_INT the default at some point. > > If you are trying to convince me to switch to 64-bit development > environment, you are wasting your time. I have my very good reasons, > and don't plan on doing so any time soon. > > And 64-bit Windows supports 32-bit code perfectly for my needs. Moreover, 32-bit Android systems remain widespread, and many people build 32-bit Emacs binaries for more optimal memory utilization. In fact Android OEMs sometimes install 32-bit operating systems on 64-bit capable hardware to optimize memory usage, and consequently, speaking of the demise of 32-bit configurations is a pointless exercise. And armv7 is not nearly so register-starved as 32-bit x86... ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-08 13:52 ` Pip Cet via Emacs development discussions. 2024-12-08 14:52 ` Eli Zaretskii @ 2024-12-09 1:08 ` Po Lu 1 sibling, 0 replies; 112+ messages in thread From: Po Lu @ 2024-12-09 1:08 UTC (permalink / raw) To: Pip Cet; +Cc: Eli Zaretskii, ali_gnu2, emacs-devel Pip Cet <pipcet@protonmail.com> writes: > The only platforms that "need" to use !USE_LSB are those that cannot > guarantee 8-byte alignment for static objects, which is why I asked > about those. If those exist, we should have received bug reports > indicating that !WIDE_EMACS_INT builds don't work on such platforms. > > In particular, WIDE_EMACS_INT shouldn't imply !USE_LSB. That it > currently does is a very questionable optimization at best (fixnum > manipulation may be very slightly faster with !USE_LSB, but pointer > manipulation will be slower and requires extra registers, which is an > issue on i386). > > For example, NILP() would only need to look at a single 32-bit word for > the WIDE_EMACS_INT + USE_LSB configuration. I strongly suspect that > effect alone would make WIDE_EMACS_INT + USE_LSB faster than > WIDE_EMACS_INT + !USE_LSB (of course, the relevant optimization would > have to be made first). > > (Of course, WIDE_EMACS_INT is almost always a bad deal, anyway. As far > as I can tell, the justification for its continued existence is that > some C code assumes buffer positions are fixnums (and, because we expose > fixnum-ness to Lisp, some broken Lisp code might do that, too). If we > had implemented fixnums to be transparent, we could simply remove > WIDE_EMACS_INT, but that mistake has been made...) Why is WIDE_EMACS_INT a bad deal? Its effect is just as you describe: it enables 32-bit systems to access files larger than the standard fixnum limit on those systems. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-08 12:17 ` pdumper on Solaris 10 Pip Cet via Emacs development discussions. 2024-12-08 13:05 ` Eli Zaretskii @ 2024-12-09 0:58 ` Po Lu 2024-12-09 3:28 ` Eli Zaretskii 2024-12-09 1:01 ` Po Lu 2 siblings, 1 reply; 112+ messages in thread From: Po Lu @ 2024-12-09 0:58 UTC (permalink / raw) To: Pip Cet; +Cc: ali_gnu2, emacs-devel Pip Cet <pipcet@protonmail.com> writes: > But while we're talking about rare and unusual systems, !USE_LSB builds > are currently broken except for the WIDE_EMACS_INT case, because the > stack scanning code makes no attempt to remove MSB tags. It may be time > to simply remove MSB tag support, unless there are systems around that > actually fail to align static objects to 8-byte boundaries (but such > systems would have been broken for a while now). Aren't the MS-DOS builds !USE_LSB? ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-09 0:58 ` Po Lu @ 2024-12-09 3:28 ` Eli Zaretskii 0 siblings, 0 replies; 112+ messages in thread From: Eli Zaretskii @ 2024-12-09 3:28 UTC (permalink / raw) To: Po Lu; +Cc: pipcet, ali_gnu2, emacs-devel > From: Po Lu <luangruo@yahoo.com> > Cc: ali_gnu2@emvision.com, emacs-devel@gnu.org > Date: Mon, 09 Dec 2024 08:58:32 +0800 > > Pip Cet <pipcet@protonmail.com> writes: > > > But while we're talking about rare and unusual systems, !USE_LSB builds > > are currently broken except for the WIDE_EMACS_INT case, because the > > stack scanning code makes no attempt to remove MSB tags. It may be time > > to simply remove MSB tag support, unless there are systems around that > > actually fail to align static objects to 8-byte boundaries (but such > > systems would have been broken for a while now). > > Aren't the MS-DOS builds !USE_LSB? No. ^ permalink raw reply [flat|nested] 112+ messages in thread
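[Editorial illustration, not part of any message in the thread.] The remark that the stack scanning code "makes no attempt to remove MSB tags" is easier to follow with a toy model of the two tagging schemes. Everything below is an assumption made for illustration: 32-bit words, three tag bits, and the hypothetical names tag_lsb, tag_msb, untag_lsb, untag_msb and word_points_into; none of it is Emacs's actual implementation. The point it tries to show is that an LSB-tagged word stays within 7 bytes of the object's address, so a conservative scanner that treats the raw word as a candidate address still lands inside the object, whereas an MSB-tagged word is a very different number until the tag is masked off.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: 32-bit Lisp words with 3 tag bits.  All names are
   hypothetical and chosen for this sketch only.  */
enum { TAG_BITS = 3, WORD_BITS = 32, MSB_SHIFT = WORD_BITS - TAG_BITS };
#define LSB_TAG_MASK   ((uint32_t) ((1u << TAG_BITS) - 1))
#define MSB_VALUE_MASK ((uint32_t) ((1u << MSB_SHIFT) - 1))

/* LSB scheme: the tag lives in the low bits, which are free only if
   the object is aligned to 8 bytes.  */
static uint32_t
tag_lsb (uint32_t addr, uint32_t tag)
{
  assert ((addr & LSB_TAG_MASK) == 0);
  return addr | tag;
}

static uint32_t
untag_lsb (uint32_t word)
{
  return word & ~LSB_TAG_MASK;
}

/* MSB scheme: the tag lives in the high bits, so the address itself
   must fit in the remaining 29 bits.  */
static uint32_t
tag_msb (uint32_t addr, uint32_t tag)
{
  assert ((addr & ~MSB_VALUE_MASK) == 0);
  return (tag << MSB_SHIFT) | addr;
}

static uint32_t
untag_msb (uint32_t word)
{
  return word & MSB_VALUE_MASK;
}

/* What a conservative scanner asks about each word it finds: does it
   point somewhere inside the object at [start, start + size)?  */
static int
word_points_into (uint32_t word, uint32_t start, uint32_t size)
{
  return word >= start && word - start < size;
}
```

In this model, for a 16-byte object at 0x1000 with tag 5, word_points_into accepts the raw LSB-tagged word 0x1005 but rejects the raw MSB-tagged word 0xA0001000; the MSB word is recognized only after untag_msb is applied, which is the step the quoted message says is missing.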
* Re: pdumper on Solaris 10 2024-12-08 12:17 ` pdumper on Solaris 10 Pip Cet via Emacs development discussions. 2024-12-08 13:05 ` Eli Zaretskii 2024-12-09 0:58 ` Po Lu @ 2024-12-09 1:01 ` Po Lu 2024-12-09 13:11 ` Pip Cet via Emacs development discussions. 2 siblings, 1 reply; 112+ messages in thread From: Po Lu @ 2024-12-09 1:01 UTC (permalink / raw) To: Pip Cet; +Cc: ali_gnu2, emacs-devel Pip Cet <pipcet@protonmail.com> writes: > "Po Lu" <luangruo@yahoo.com> writes: > >> pdumper-dumped binaries appear to crash in an x86 Solaris 10 zone, >> though I don't really use this configuration and I'm not interested in >> trying the portable dumper on sparc: >> >> core 'core' of 7021: ../../src/bootstrap-emacs -batch --no-site-file --no-site-lisp -f batc >> 00007fffaf433dc2 ???????? () >> 00007fffaf5eb3d7 ???????? () >> 00007fffaf5ec590 ???????? () >> 00007fffae3f351a _lwp_kill () + a >> 00007fffae3981b9 raise () + 19 >> 00000000008baf90 terminate_due_to_signal () + c0 >> 000000000090236e ???????? () >> 0000000000902334 deliver_thread_signal () + 74 >> 00000000009023b0 deliver_fatal_thread_signal () + 10 >> 00000000009024ef handle_sigsegv () + 4f >> 00007fffae3edd16 __sighndlr () + 6 >> 00007fffae3e25e2 call_user_handler () + 252 >> 00007fffae3e280e sigacthandler () + ee >> 00007fffaf5ea82d ???????? () >> ffffffffffffffff ???????? () >> 00000000009c77e7 lisp_align_malloc () + 4d7 >> 00000000009c9dd2 make_float () + 42 >> 00000000009d2e9d init_alloc () + d >> 00000000008bd373 main () + bb3 >> 00000000006d15ab ???????? () > > FWIW, this issue doesn't appear to happen on a "fresh" Solaris 10 > install, in a qemu virtual machine, on x86. I used the > sol-10-u11-ga-x86-dvd.iso image, installed to a new disk, then installed > OpenCSW and built Emacs from the master branch with and without > CFLAGS="-m64" (plus the linker path selection). Both builds appear to > work. 
That's a very different configuration from a Solaris 10 zone, which is a modern Solaris 11 kernel hosting a Solaris 10 userspace with a number of compatibility libraries loaded into running processes. ^ permalink raw reply [flat|nested] 112+ messages in thread
* Re: pdumper on Solaris 10 2024-12-09 1:01 ` Po Lu @ 2024-12-09 13:11 ` Pip Cet via Emacs development discussions. 0 siblings, 0 replies; 112+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-09 13:11 UTC (permalink / raw) To: Po Lu; +Cc: ali_gnu2, emacs-devel "Po Lu" <luangruo@yahoo.com> writes: > That's a very different configuration from a Solaris 10 zone, which is a > modern Solaris 11 kernel hosting a Solaris 10 userspace with a number of > compatibility libraries loaded into running processes. Thanks for explaining. My understanding is cfarm210 is a Solaris 10 zone (but on sparc64), and the problem doesn't appear there, so it might be specific to Solaris 10 zones on Solaris 11 on x86. I'm not sure whether I can build such a zone to try reproducing it. Pip ^ permalink raw reply [flat|nested] 112+ messages in thread