From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail From: Eric Abrahamsen Newsgroups: gmane.emacs.bugs Subject: bug#33653: 27.0.50; Change Gnus obarrays-as-hash-tables into real hash tables Date: Tue, 26 Mar 2019 12:58:32 -0700 Message-ID: <87ftr9tnkn.fsf@ericabrahamsen.net> References: <8736raz3ec.fsf@ericabrahamsen.net> <87y392xoht.fsf@ericabrahamsen.net> <87imxzxa4s.fsf@ericabrahamsen.net> <87a7hn3h3w.fsf@ericabrahamsen.net> <87woknisuu.fsf@ericabrahamsen.net> <8736naj1qz.fsf@ericabrahamsen.net> Mime-Version: 1.0 Content-Type: text/plain Injection-Info: blaine.gmane.org; posting-host="blaine.gmane.org:195.159.176.226"; logging-data="96065"; mail-complaints-to="usenet@blaine.gmane.org" User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) To: 33653@debbugs.gnu.org Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Tue Mar 26 21:02:27 2019 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane.org Original-Received: from lists.gnu.org ([209.51.188.17]) by blaine.gmane.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:256) (Exim 4.89) (envelope-from ) id 1h8sHB-000OqI-5D for geb-bug-gnu-emacs@m.gmane.org; Tue, 26 Mar 2019 21:02:25 +0100 Original-Received: from localhost ([127.0.0.1]:37248 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1h8sHA-0005vF-5I for geb-bug-gnu-emacs@m.gmane.org; Tue, 26 Mar 2019 16:02:24 -0400 Original-Received: from eggs.gnu.org ([209.51.188.92]:44226) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1h8sGu-0005rx-A9 for bug-gnu-emacs@gnu.org; Tue, 26 Mar 2019 16:02:10 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1h8sGq-0003bQ-Bi for bug-gnu-emacs@gnu.org; Tue, 26 Mar 2019 16:02:06 -0400 Original-Received: from debbugs.gnu.org ([209.51.188.43]:47062) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1h8sGo-0003aG-20 for bug-gnu-emacs@gnu.org; Tue, 26 Mar 2019 16:02:02 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1h8sGn-0006Zx-TI for bug-gnu-emacs@gnu.org; Tue, 26 Mar 2019 16:02:01 -0400 X-Loop: help-debbugs@gnu.org In-Reply-To: <8736raz3ec.fsf@ericabrahamsen.net> Resent-From: Eric Abrahamsen Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Tue, 26 Mar 2019 20:02:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 33653 X-GNU-PR-Package: emacs X-Debbugs-Original-To: bug-gnu-emacs@gnu.org Original-Received: via spool by submit@debbugs.gnu.org id=B.155363050825266 (code B ref -1); Tue, 26 Mar 2019 20:02:01 +0000 Original-Received: (at submit) by debbugs.gnu.org; 26 Mar 2019 20:01:48 +0000 Original-Received: from localhost ([127.0.0.1]:60606 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h8sGZ-0006ZR-Oe for submit@debbugs.gnu.org; Tue, 26 Mar 2019 16:01:48 -0400 Original-Received: from eggs.gnu.org ([209.51.188.92]:57849) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h8sGY-0006ZD-58 for submit@debbugs.gnu.org; Tue, 26 Mar 2019 16:01:46 -0400 Original-Received: from lists.gnu.org ([209.51.188.17]:49636) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1h8sGS-0003SZ-Vf for submit@debbugs.gnu.org; Tue, 26 Mar 2019 16:01:41 -0400 Original-Received: from eggs.gnu.org ([209.51.188.92]:43979) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1h8sGR-0005jZ-HC for bug-gnu-emacs@gnu.org; Tue, 26 Mar 2019 16:01:40 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1h8sDb-0002VF-93 for bug-gnu-emacs@gnu.org; Tue, 26 Mar 2019 15:58:44 -0400 Original-Received: from [195.159.176.226] (port=52564 helo=blaine.gmane.org) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1h8sDb-0002Uv-0X for bug-gnu-emacs@gnu.org; Tue, 26 Mar 2019 15:58:43 -0400 Original-Received: from list by blaine.gmane.org with local (Exim 4.89) (envelope-from ) id 1h8sDZ-000KeL-ED for bug-gnu-emacs@gnu.org; Tue, 26 Mar 2019 20:58:41 +0100 X-Injected-Via-Gmane: http://gmane.org/ Cancel-Lock: sha1:VJ447eenNV4Uf89u/2AeNKR3RQM= X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.51.188.43 X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.org gmane.emacs.bugs:156837 Archived-At: Andy Moreton writes: > On Mon 25 Mar 2019, Eric Abrahamsen wrote: > >> Andy Moreton writes: >>> Other notes from reading the code: >>> >>> 1) In `gnus-gnus-to-quick-newsrc-format' you ignore the contents of >>> `gnus-newsrc-alist' when saving "newsrc.eld", and replace it with the >>> details from `gnus-newsrc-hashtb'. Why ? The rest of the gnus code >>> appears to treat `gnus-newsrc-alist' as the single source of truth, >>> with the hash tables being used only for faster access to it. >> >> Eventually I would like to reduce the number of data structures so that >> groups are held in `gnus-newsrc-hashtb', and ordering is kept in >> `gnus-group-list', and that's it. `gnus-newsrc-alist' would only be used >> when persisting to disk. My next proposed change (once I've recovered my >> confidence) is to turn groups into actual objects, in which case the >> alist would really just be a kind of serialization format. > > ok, but if the hash table and the alist reference the same lisp objects > then the existing setup is not so bad. There's nothing wrong with the existing setup here! I'm just laying the groundwork for the next round of changes. >> The hash table ought to be in sync with the rest of the data structure >> -- if it isn't, that's another bug. >> >>> 2) In `gnus-gnus-to-quick-newsrc-format' you dropped the code to remove >>> the dummy group from `gnus-newsrc-alist'. Why ? This internal dummy >>> group is now saved in "newsrc.eld", which is not needed. >> >> This was an error. (Though in my case, I've had the dummy group in my >> newsrc.eld for months, and it hasn't done any harm. I don't know why >> it's necessary.) > > I agree that it's harmless, it just seemed to be an unnecessary change. > I expect it is there to ensure that the alist is never empty to avoid > the need to check that everywhere. Yes, for sure, and I didn't mean to leave it in there, that's just a bug. >>> 3) The format of the entries in `gnus-newsrc-hashtb' has changed, >>> removing the second element. Why? >> >> Because the old `gnus-gethash' call returned a slice of >> `gnus-newsrc-alist', where the second element was actually the group >> *before* the group you wanted, and the third element was the cdr of >> `gnus-newsrc-alist', starting with the group you wanted. This was >> undocumented, and took a bit to figure out. Now, the gethash call just >> gives you the group. Ideally, in the next set of changes, it will give >> you an object. > > ok, so it sounds like the old code was trying hard to use the same lisp > objects in both the hash table and the alist. That avoids alloacting > twice the storage, and ensures that the hash table and the alist stay in > sync. Yes, I'm looking forward to when the groups are actual objects, . But in the meantime, the lisp objects are still the same, I haven't doubled storage (at least `eq' returns t, so I assume that's what that means). >>> 4) You changed several hash tale sizesfrom 4096 to 4000, and 1024 to >>> 1000. Why? >> >> My understanding is that using a prime number is significant when it >> comes to vector access, but that the hash table implementation is >> higher-level, where a prime number is no longer significant. If that's >> incorrect I would like to know! > > None of these numbers are prime. The old numbers are powers of two, > which are reasonable for allocation sizes on a binary machine. Again, > not a terribly important change, but not really needed either. Yes, sorry, powers of two. It's true it's a do-nothing change, I guess I've been erring on the side of trying to make the code more self-explanatory, less "odd". Probably this change didn't need to be made, but it seems about as un-risky as a code change could be. >>> Your patch contains several logical changes that would be easier to >>> understand (and bisect) as a series of patches with one logical change >>> in each patch: >>> - code layout changes >>> - add missing doc strings and code comments >>> - change hash table implementation >>> - change format of `gnus-newsrc-hashtb' entries >>> - change usage of `gnus-group-change-level' >>> - change coding of group names >>> While it can take extra work to split things up, the end result is much >>> easier to understand. >> >> In principle I agree with this completely. In practice I found it >> extraordinarily difficult to touch one part of Gnus without running into >> knock-on repercussions. > > Agreed. I find that this sort of work usually goes in two phases: some > exploratory programming to decide on the path forward and the eventual > goal, followed by rearranging the changes into a logical set of > (bisectable) patches that get to that goal in small, self-contained > steps. The second half is extra work, but worth it to make it easier for > other people to review the patches (and to simplify any archaeology > needed by a later maintainer). > > If changing gnus is hard, then perhaps writing new tests to document > what you have discovered about the code will help to ensure that later > changes do not change the observed behaviour. The previous commit did add some new Gnus tests -- perhaps the first! I'm planning more, once Gnus group are actual objects. Everything's so interconnected now (and data-reliant) that it's very hard to write effective tests. >> The ultimate goal of the changes I have in mind for Gnus is to address >> exactly this: to make it more modular, to improve isolation of code >> paths, and to reduce the number of semi-redundant data structures. But >> the process is evidently even messier than I thought. I held back >> another commit to group name encoding in an attempt to keep things >> simple, but that seems to have made things even worse. > > A worthwhile goal - please do not get discouraged. Thank you, Eric