unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
From: Eric Abrahamsen <eric@ericabrahamsen.net>
To: 33653@debbugs.gnu.org
Subject: bug#33653: 27.0.50; Change Gnus obarrays-as-hash-tables into real hash tables
Date: Tue, 26 Mar 2019 12:58:32 -0700	[thread overview]
Message-ID: <87ftr9tnkn.fsf@ericabrahamsen.net> (raw)
In-Reply-To: <8736raz3ec.fsf@ericabrahamsen.net>

Andy Moreton <andrewjmoreton@gmail.com> writes:

> On Mon 25 Mar 2019, Eric Abrahamsen wrote:
>
>> Andy Moreton <andrewjmoreton@gmail.com> writes:
>>> Other notes from reading the code:
>>>
>>> 1) In `gnus-gnus-to-quick-newsrc-format' you ignore the contents of
>>>    `gnus-newsrc-alist' when saving "newsrc.eld", and replace it with the
>>>    details from `gnus-newsrc-hashtb'. Why ? The rest of the gnus code
>>>    appears to treat `gnus-newsrc-alist' as the single source of truth,
>>>    with the hash tables being used only for faster access to it.
>>
>> Eventually I would like to reduce the number of data structures so that
>> groups are held in `gnus-newsrc-hashtb', and ordering is kept in
>> `gnus-group-list', and that's it. `gnus-newsrc-alist' would only be used
>> when persisting to disk. My next proposed change (once I've recovered my
>> confidence) is to turn groups into actual objects, in which case the
>> alist would really just be a kind of serialization format.
>
> ok, but if the hash table and the alist reference the same lisp objects
> then the existing setup is not so bad.

There's nothing wrong with the existing setup here! I'm just laying the
groundwork for the next round of changes.

>> The hash table ought to be in sync with the rest of the data structure
>> -- if it isn't, that's another bug.
>>
>>> 2) In `gnus-gnus-to-quick-newsrc-format' you dropped the code to remove
>>>    the dummy group from `gnus-newsrc-alist'. Why ? This internal dummy
>>>    group is now saved in "newsrc.eld", which is not needed.
>>
>> This was an error. (Though in my case, I've had the dummy group in my
>> newsrc.eld for months, and it hasn't done any harm. I don't know why
>> it's necessary.)
>
> I agree that it's harmless, it just seemed to be an unnecessary change.
> I expect it is there to ensure that the alist is never empty to avoid
> the need to check that everywhere.

Yes, for sure, and I didn't mean to leave it in there, that's just a bug.

>>> 3) The format of the entries in `gnus-newsrc-hashtb' has changed,
>>> removing the second element. Why?
>>
>> Because the old `gnus-gethash' call returned a slice of
>> `gnus-newsrc-alist', where the second element was actually the group
>> *before* the group you wanted, and the third element was the cdr of
>> `gnus-newsrc-alist', starting with the group you wanted. This was
>> undocumented, and took a bit to figure out. Now, the gethash call just
>> gives you the group. Ideally, in the next set of changes, it will give
>> you an object.
>
> ok, so it sounds like the old code was trying hard to use the same lisp
> objects in both the hash table and the alist. That avoids alloacting
> twice the storage, and ensures that the hash table and the alist stay in
> sync.

Yes, I'm looking forward to when the groups are actual objects, . But in
the meantime, the lisp objects are still the same, I haven't doubled
storage (at least `eq' returns t, so I assume that's what that means).

>>> 4) You changed several hash tale sizesfrom 4096 to 4000, and 1024 to
>>> 1000. Why?
>>
>> My understanding is that using a prime number is significant when it
>> comes to vector access, but that the hash table implementation is
>> higher-level, where a prime number is no longer significant. If that's
>> incorrect I would like to know!
>
> None of these numbers are prime. The old numbers are powers of two,
> which are reasonable for allocation sizes on a binary machine. Again,
> not a terribly important change, but not really needed either.

Yes, sorry, powers of two. It's true it's a do-nothing change, I guess
I've been erring on the side of trying to make the code more
self-explanatory, less "odd". Probably this change didn't need to be
made, but it seems about as un-risky as a code change could be.

>>> Your patch contains several logical changes that would be easier to
>>> understand (and bisect) as a series of patches with one logical change
>>> in each patch:
>>>  - code layout changes
>>>  - add missing doc strings and code comments
>>>  - change hash table implementation
>>>  - change format of `gnus-newsrc-hashtb' entries
>>>  - change usage of `gnus-group-change-level'
>>>  - change coding of group names
>>> While it can take extra work to split things up, the end result is much
>>> easier to understand.
>>
>> In principle I agree with this completely. In practice I found it
>> extraordinarily difficult to touch one part of Gnus without running into
>> knock-on repercussions.
>
> Agreed. I find that this sort of work usually goes in two phases: some
> exploratory programming to decide on the path forward and the eventual
> goal, followed by rearranging the changes into a logical set of
> (bisectable) patches that get to that goal in small, self-contained
> steps. The second half is extra work, but worth it to make it easier for
> other people to review the patches (and to simplify any archaeology
> needed by a later maintainer).
>
> If changing gnus is hard, then perhaps writing new tests to document
> what you have discovered about the code will help to ensure that later
> changes do not change the observed behaviour.

The previous commit did add some new Gnus tests -- perhaps the first!
I'm planning more, once Gnus group are actual objects. Everything's so
interconnected now (and data-reliant) that it's very hard to write
effective tests.

>> The ultimate goal of the changes I have in mind for Gnus is to address
>> exactly this: to make it more modular, to improve isolation of code
>> paths, and to reduce the number of semi-redundant data structures. But
>> the process is evidently even messier than I thought. I held back
>> another commit to group name encoding in an attempt to keep things
>> simple, but that seems to have made things even worse.
>
> A worthwhile goal - please do not get discouraged.

Thank you,

Eric






  reply	other threads:[~2019-03-26 19:58 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-06 22:39 bug#33653: 27.0.50; Change Gnus obarrays-as-hash-tables into real hash tables Eric Abrahamsen
2018-12-06 22:46 ` Eric Abrahamsen
     [not found]   ` <m35zvzq34a.fsf@gnus.org>
2018-12-11 23:30     ` Eric Abrahamsen
2019-02-05  2:05     ` Eric Abrahamsen
2019-03-22  0:09       ` Eric Abrahamsen
2019-03-22  9:20         ` Robert Pluim
2019-03-22 17:21           ` Eric Abrahamsen
2019-03-22 19:54             ` Eric Abrahamsen
2019-03-22 21:07               ` Eli Zaretskii
2019-03-22 22:10                 ` Eric Abrahamsen
2019-03-23 14:52                   ` Andy Moreton
2019-03-23 16:14                     ` Eric Abrahamsen
2019-03-26 18:28                     ` Andy Moreton
2019-03-26 19:49                       ` Eric Abrahamsen
2019-03-22 22:40               ` Glenn Morris
2019-03-22 22:51                 ` Eric Abrahamsen
2019-03-23  6:46                 ` Eli Zaretskii
2019-03-24 22:29               ` Bastien
2019-03-24 23:40                 ` Eric Abrahamsen
2019-03-30 12:09                   ` Deus Max
2019-03-31 23:27                     ` Eric Abrahamsen
2019-04-01 22:39                       ` Deus Max
2019-04-02  5:23                         ` Eric Abrahamsen
2019-03-25  2:14         ` Katsumi Yamaoka
2019-03-25  2:35           ` Eric Abrahamsen
2019-03-25 14:45             ` Andy Moreton
2019-03-25 17:35               ` Eric Abrahamsen
2019-03-25 17:51                 ` Robert Pluim
2019-03-25 18:17                   ` Basil L. Contovounesios
2019-03-25 19:04                   ` Bastien
2019-03-25 20:15                 ` Andy Moreton
2019-03-26 19:58                   ` Eric Abrahamsen [this message]
2019-03-26 21:44           ` Eric Abrahamsen
2019-03-27  4:54             ` Katsumi Yamaoka
2019-03-27 18:47               ` Eric Abrahamsen
2019-03-27 21:27               ` Eric Abrahamsen
2019-03-27 22:10                 ` Eric Abrahamsen
2019-03-31 22:55                 ` Eric Abrahamsen
2019-04-01 20:18                   ` Adam Sjøgren
2019-04-01 20:57                     ` Eric Abrahamsen
2019-04-02 16:43                       ` Adam Sjøgren
2019-04-03 22:16                         ` Katsumi Yamaoka
2019-04-03 22:36                           ` Eric Abrahamsen
2019-04-05  4:25                             ` Katsumi Yamaoka
2019-04-05  6:44                               ` Katsumi Yamaoka
2019-04-05 11:02                                 ` Basil L. Contovounesios
2019-04-08  1:47                                   ` Katsumi Yamaoka
2019-04-05 20:18                                 ` Eric Abrahamsen
2019-04-08  1:58                                   ` Katsumi Yamaoka
2019-04-08  4:31                                     ` Katsumi Yamaoka
2019-04-11 21:29                                 ` Basil L. Contovounesios
2019-04-11 23:56                                   ` Katsumi Yamaoka
2019-04-12 11:05                                     ` Basil L. Contovounesios
2019-06-22 13:11                                       ` Lars Ingebrigtsen
2019-04-05 11:01                               ` Basil L. Contovounesios
2019-04-08  8:13               ` Katsumi Yamaoka
2019-04-08  8:57                 ` Andreas Schwab
2019-04-09  0:55                   ` Katsumi Yamaoka
2019-04-08 18:31                 ` Eric Abrahamsen
2019-04-09  0:55                   ` Katsumi Yamaoka
2019-04-09  2:01                     ` Eric Abrahamsen
2019-04-09  4:18                       ` Katsumi Yamaoka
2019-04-09  4:30                         ` Eric Abrahamsen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87ftr9tnkn.fsf@ericabrahamsen.net \
    --to=eric@ericabrahamsen.net \
    --cc=33653@debbugs.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).