From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Suppressing native compilation (short and long term)
Date: Sat, 15 Oct 2022 11:51:06 +0300
Message-ID: <83czat79mt.fsf@gnu.org>
References: <87ill8paw7.fsf@trouble.defaultvalue.org>
 <83o7uzivey.fsf@gnu.org>
 <3ac9d2b9632f75018327a1bcde0c373f152c404a.camel@gmail.com>
 <835ygob7ja.fsf@gnu.org> <834jw7a2ym.fsf@gnu.org>
Cc: rlb@defaultvalue.org, emacs-devel@gnu.org
To: Liliana Marie Prikler, Andrea Corallo
In-Reply-To: (message from Liliana Marie Prikler on Fri, 14 Oct 2022 21:46:09 +0200)

> From: Liliana Marie Prikler
> Cc: rlb@defaultvalue.org, emacs-devel@gnu.org
> Date: Fri, 14 Oct 2022 21:46:09 +0200
>
> > > > When you encounter bugs in native compilation, please report them
> > > > to us, so we could fix them.  As of now, we are not aware of any
> > > > such bugs that were reported and haven't been fixed.  So if you
> > > > still have such problem, please report them ASAP.
> > > Of course, that's the intention, but this fix will only make it
> > > into the next Emacs release.  Thus, if you're between releases, you
> > > still need a workaround.
> >
> > If the fix is urgent, why can't you patch the sources when you
> > prepare your distribution?
> Guix prides itself in being a package manager that can work around many
> failures (even as the proper workaround to bugs is discussed in mailing
> lists).
> The fact, that the solutions to this issue is "compile 28.1
> without native-comp" or "use Emacs 27" does not reflect that
> particularly well.

I think this answers a different question.  I asked why you cannot
patch the Emacs you distribute when you consider a fix to be important
enough to not wait until the next Emacs release.  My point is that
reporting bugs in a timely fashion will help us fix them early on, and
you will then have the possibility of backporting the fixes to a
released Emacs and distributing an updated package with the fix, if
you think that's important enough.

> > > A particular candidate known to cause issues with the currently
> > > packaged 28.1 is [1].
> >
> > Where's the description of the actual problem with natively compiling
> > that package?  And would you please submit a bug report with the
> > details, if you know them?
> I am not personally affected, so I can't. I could direct people to the
> Emacs mailing lists, but it seems people in other threads have already
> started debugging. Do you still wish me to do so?

Which threads are you alluding to here?  Your [1] is just a reference
to the ido-completing-read-plus package, and I don't see the
description of the problems with native-compilation on that site.  So
yes, I'd like to hear a description of the problem in that case.

> > > > Why isn't it sufficient to use no-native-compile?  It just means
> > > > that on some architectures the corresponding file will be loaded
> > > > as byte-compiled, and thus will be slightly slower (how much
> > > > slower depends on the code, so if you are worried, my
> > > > recommendation is first to measure the difference -- you might be
> > > > surprised).
> > > Because it'd require a distro-wide fix to address something that
> > > e.g. only happens on some AMD CPUs.
> >
> > I'm asking why doing so is a problem?  Did you measure the effect on
> > performance and found it to be unacceptable in some cases?
> Isn't performance one of the main reasons to use native compilation?

On average, yes.  But it depends on what the original Lisp code does.
We've found that in some cases the performance gains are minimal, and
in at least one very special case we found that native-compilation
produces slightly slower code.  Which is why I asked the question
above: it is quite possible that the (hopefully, few) packages where
you need to avoid native-compilation for now don't gain enough
performance from native-compilation to justify any more elaborate
measures.  And this is a temporary measure anyway, because those
problems will eventually be fixed, whether in the packages themselves
or in Emacs core.

> Note that I am talking in hypotheticals here when mentioning the AMD
> thing, i.e. we could very well imagine a performance-critical Emacs
> package having a native-compilation bug (I imagine those to be
> particularly likely for those trailing unreleased Emacs versions,
> though thankfully I don't think we've encountered one so far.)

Let's not be bothered by hypothetical cases until they actually
emerge.  When there are specific situations where this happens and
performance gains from native-compilation are critical, we can always
look for specific solutions for those cases, something that is
impossible without concrete cases.

> > OK, so why is this relevant to the issue of disabling?  Those who
> > choose ahead-of-time compilation will never see async JIT
> > compilation, and those who selected not to do ahead-of-time will
> > naturally see JIT compilation, as they've chosen.  What is the
> > problem here?
> The problem is that I can't meaningfully choose the "I don't want JIT
> for stuff I haven't AOT'd" option, especially not combined with "but I
> do want to load what I have AOT'd".

As I already explained, this mode of operation doesn't make sense to
me, and is currently not supported for that reason.
I fail to see why people would want native-compilation for some parts
of Emacs, but not for others.  I haven't yet seen a valid use case
where that would make sense as the desired, clean, and non-kludgey
solution.

Only one valid use case has been brought up to this date, where it
would be desirable to delay JIT native-compilation temporarily: when
the user runs a laptop on batteries.  We will probably provide a
solution for that, which will automatically re-enable JIT compilation
when AC power is connected.  This would be a clean, non-kludgey
solution for that case.

None of the problems you describe are of that nature.  They all sound
like someone wants to arbitrarily disable native-compilation in some
cases, but not others, where reasonable solutions already exist.

And if you still disagree, then let's agree to disagree, because we
are just repeating the same arguments over and over again.

> > > > If a package is a single file or a small number of files, those
> > > > users can add the no-native-compile cookies in those files.
> > > This is not trivial in the case where the Elisp code is placed in
> > > system-managed storage and thus requires elevated privileges to
> > > modify (as is the default in most package managers, I assume).  Of
> > > course, you can copy the file to your $HOME, but editing it with a
> > > broken Emacs is rather painful.
> >
> > Using broken packages is always painful, and native compilation
> > doesn't change that.
> Using broken packages normally doesn't result in the OOM killer firing
> off.

It could, rarely.  And which problem of native-compilation caused the
OOM killer?  Where is that problem described in enough detail for us
to investigate it?  Was it reported to the Emacs bug-tracker, and if
so, what is the bug number, please?  IOW, we'd definitely want to
avoid such catastrophic failures, but we need the details to
investigate and fix them.
I can tell you that I've been using Emacs 28 with JIT
native-compilation enabled for the best part of this year, and have
yet to see any problems even approaching the one mentioned above.  So
such problems are quite exceptional, and need to be reported with
every possible detail for us to be able to fix them quickly.  They are
definitely not a reason to disable native-compilation.  We generally
try to provide at least a workaround for critical problems, once we
have enough detail to understand what's going on, so reporting a
problem quickly will in many cases yield a quick solution that doesn't
hamper unrelated parts of the user's usage patterns.

> > Packages provided by a distribution and installed into directories
> > where users cannot easily write should be well tested by the
> > distributor.
> I think you're underestimating the number of breakages that can happen
> in a rolling release model. Not every distro is as stable as Debian,
> but the joke's still on you because despite Debian's hard requirements,
> they still ended up encountering this bug.

Sure, that's understandable.  But each new problem that is found and
reported should cause the corresponding package to be updated with a
fix.  I don't see why such problems are deemed as reasons to disable
native-compilation for the entire Emacs session, or for requirements
that they be "fixed" in core.  Bugs should be fixed where their root
cause is.

> > You mean, you find the loading of preloaded *.eln files at startup
> > annoying?  Then you should know that this is the best solution we
> > found for dumping Emacs with natively-compiled preloaded code.
> No, I find it annoying that Emacs supposes it has a writable eln-cache
> always.

The user's home directory should always be writable.  This is required
by many Emacs features regardless of native-compilation.  For example,
saving customizations writes to a subdirectory of the user's home
directory, as does desktop.el or save-place etc.
If this is a problem during installation of packages, which runs at
root level, the installation procedure can tweak
native-comp-eln-load-path to make sure there's a writable directory
there, or point HOME to a non-existent directory.

> This is not the case in typical package manager scenarios and
> it also isn't the case when users choose to make (parts of) their $HOME
> read-only, which is a supported configuration in Nix and Guix.

Users make ~/.emacs.d/ read-only?  Then how do they use all the
features, some of which were mentioned above, that write to that
directory?

> I can't think of a good reason why one would want to assume this
> invariant.

If this use case is supported by pointing the relevant variables, like
save-place-file, eshell-directory-name, desktop-dirname, etc., to
non-default places, then they can do the same with
native-comp-eln-load-path.  If this is not what you mean, please
describe how Nix and Guix support this use case where parts of $HOME
are read-only, and let's see how native-compilation should support it.

> > If you know of a better solution that doesn't suffer from any fatal
> > issues we found with the alternatives, please suggest such solutions,
> > and we will definitely consider them.
> I haven't read the discussions around the alternatives, but couldn't
> you just generate one trampoline per function which you use as soon as
> it's advised?

And then re-generate it again each time the advised function is called
again?

> Also, how come advice isn't breaking byte-compilation in exactly the
> same manner?

Andrea, can you please answer that?  I have only a very general
understanding of why trampolines are needed for native-compilation.

> > As I told earlier, disabling loading of native code made no sense to
> > us while Emacs 28 was in development; it still doesn't.  Either one
> > wants native-compilation, or one doesn't.
> > Making Emacs code more
> > complicated and harder to maintain due to features that make no sense
> > to us is a non-starter.  I see no problem with having to use a
> > separate build, since building a release tarball takes a minute or so
> > on a modern system.  And distros should definitely have a build
> > without native-compilation on offer, for a variety of valid reasons.
> I don't think that asking distros to package every Emacs variant twice
> is a great idea. At Guix, we prefer to offer the most complete version
> of a package, so we ship with native compilation enabled.

I think this is a mistake.  Native-compilation is not for everyone.
It requires GCC and Binutils to be installed, and who says every Emacs
user wants that?

More generally, when we add optional features, we don't consider
whether having them all in the same build will make sense.  For
example, ImageMagick support has some advantages and some (quite
serious, IMO) disadvantages, so always providing it because it's "the
most complete version" doesn't necessarily make sense for the users.

> > > While bytecode performance on such machines might too be slow (but
> > > perhaps tolerable for the task), ahead-of-time compilation, perhaps
> > > with offloading, is preferable.
> >
> > I recommend against this, because it is impossible to rely on AOT
> > installations to never compile at run time.  Users cannot rely on
> > that, and should be advised accordingly.
> But why can't they?

Trampolines are one reason.  I'm sure there are others.  Again, we
didn't design native-compilation support in Emacs to be switchable on
and off at run time, so it's small wonder that it doesn't work
reliably.  It would be a surprise if it did.

> > > For another, it can cause bugs like [2].
> >
> > That bug by itself (the cause of massive launching of async
> > subprocesses) was never explored or described in that thread?
> > It seems like the discussion switched to looking for ways of disabling
> > native-compilation right away, without a good understand of what was
> > happening.  Or did I mis something?  Async compilation by default
> > never launches more subprocesses than half the execution units of the
> > CPU, so what is described there should be carefully investigated and
> > the findings described.
> It'd be weird if someone found a counterexample to the above statement.

I don't understand this comment, sorry.

> > The other problem in that discussion, with warnings during async JIT
> > compilation is well-known, was reported several times, and the
> > culprit is always in the 3rd-part packages being compiled, which
> > should be fixed.  In any case, those are just warnings in almost all
> > cases, so their only adverse effect is annoyance (that can be
> > suppressed by clicking the button in the message).
> I read no such problem in that discussion. Do we read the same thread?

I hope so.  I referred to this:

  https://issues.guix.gnu.org/issue/57878#13

> > Again, I see no reason to blame the upstream project for these
> > issues. They should be solved by the offending 3rd-party packages,
> > and the distro should ideally uncover and fix them before they get to
> > users (I presume that you build and compile the add-on packages you
> > offer?).
> I'd like to tap at the "rolling release distro is not Debian" sign, but
> again, stable distros like Debian are experiencing issues with native
> compilation.

Once again: no one expects all the issues to be found in advance, but
when a new issue in a package is found, I do expect the distro to fix
it and publish an updated package.  I do not expect the distro to come
back to the upstream project and ask for knobs to deal with bugs in
3rd-party packages uncovered by the latest Emacs features.

> > > Which defcustom?
> >
> > Begin with those described in the ELisp manual, in the
> > "Native-Compilation Variables" node.
> > And my recommendation is to
> > review _all_ of the defcustoms in comp.el
> The only one I found is setting native-comp-speed to -1. Is that the
> solution? It doesn't appear to be.

It _is_ a solution, for one class of problems with native-compilation.
Another solution is to tweak native-comp-eln-load-path.  Yet another
one is to temporarily point HOME to another, perhaps non-existent,
directory.

> > To summarize: native compilation in a build which supports it is
> > ubiquitous, and is not designed to be disabled except by
> > no-native-compile on a file by file level.  If a more general
> > disabling is needed for some reason, users should simply use a build
> > without native-compilation.  It's the same as various toolkit builds:
> > if the toolkit is broken or doesn't fit the user's needs, those users
> > should install a build with a different toolkit.
> Pardon my French, but that thinking in and of itself is broken. Native
> compilation is not a choice in which you pick the one that most suits
> your fancy from a range of options – it could be that if you allowed
> the user to choose between libgccjit, clang and some other compilers
> that shall not be named, not that I recommend you implement this. As
> such, I think users who do want to use native compilation should get
> some more say in when, where, and what to compile.

We hear the users and their complaints, and fix stuff that we think
belongs to Emacs.  This was and will always be the case.  In this
case, I'm not convinced that the issues you describe justify a new
knob in Emacs.  You've described several issues which either already
have solutions (which you reject), or should be solved elsewhere, not
in Emacs.  And if that still doesn't convince you, let's agree to
disagree.
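
For concreteness, here is a rough sketch of how the existing knobs
named above (native-comp-eln-load-path, native-comp-speed, and the
no-native-compile cookie) might be used.  It is an untested
illustration, not a recommended recipe: it assumes Emacs 28 and that
the code runs from early-init.el, and the variable my/eln-cache-dir
and the directory it names are made up for the example.

```elisp
;;; early-init.el --- sketch only  -*- lexical-binding: t -*-

;; Hypothetical writable location (an assumption for illustration);
;; substitute whatever directory the installation procedure or the
;; user's setup actually provides.
(defvar my/eln-cache-dir
  (expand-file-name "emacs-eln-cache/" temporary-file-directory))

;; Tweak native-comp-eln-load-path so that a writable directory comes
;; first.  The boundp guard keeps this harmless in a build without
;; native-compilation.
(when (boundp 'native-comp-eln-load-path)
  (make-directory my/eln-cache-dir t)
  (push my/eln-cache-dir native-comp-eln-load-path))

;; For one class of problems: as I understand the docstring, -1 keeps
;; functions in bytecode form instead of compiling them natively.
(setq native-comp-speed -1)

;; For a single problematic file, the cookie goes on the first line of
;; that file, not in the init file, e.g.:
;;
;;   ;;; some-package.el --- ...  -*- no-native-compile: t -*-
```

Whether any of these is appropriate depends on the actual problem,
which is one more reason to have it described in a bug report first.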