From: Eric Abrahamsen
Newsgroups: gmane.emacs.bugs
Subject: bug#33653: 27.0.50; Change Gnus obarrays-as-hash-tables into real hash tables
Date: Thu, 06 Dec 2018 14:39:07 -0800
Message-ID: <8736raz3ec.fsf@ericabrahamsen.net>
To: 33653@debbugs.gnu.org

Here's the next thing: turning Gnus' obarrays-as-hash-tables into real
hash tables.

Gnus currently stores information about groups by coercing group names
to unibyte, interning them in custom obarrays, and then setting their
symbol-value to whatever value needs to be stored. I think all this was
written before Emacs had actual hash tables.

I think real hash tables are better for a few reasons:

1. Hash table lookups seem to be marginally faster than obarray
   lookups,
2. It's "less weird" for contributors and bug hunters,
3. It allows us to reduce the amount of encoding/decoding going on:
   group names can stay strings instead of being forced into symbols.

I've pushed a branch, scratch/gnus-hashtables, with two commits in it.

The first changes all the obarrays to hash tables. Apart from simply
replacing function calls, there were a few bumps.

1. Gnus uses `text-property-any' (which compares with `eq') to find
   and compare group-name text properties; I made a new
   `gnus-text-property-search' which behaves similarly, but compares
   with `equal'.
2. Some of the "hash tables" were simply storing the value t, i.e.
   they were only used for membership. I left these as hash tables,
   but I think in many cases simple string-in-list membership tests
   would suffice.
3. The hash table in gnus-async.el didn't appear to be doing anything
   -- it seemed to be redundant with `gnus-async-article-alist'. I've
   removed it, but it might need to be put back if I'm missing
   something.
4. The creation of `gnus-newsgroup-dependencies' was the most fiddly,
   and I added tests for that. I'm not entirely convinced that
   `gnus-thread-loop-p' behaves the way it's meant to: it appears to
   only check for direct loops between parent and child, not between
   parent and descendant.
5. The old return value of (gnus-gethash gnus-newsrc-hashtb) was kind
   of a slice into `gnus-newsrc-alist': it behaved a bit like
   `member', but also included the group *before* the group you
   wanted, as well as all those after, so you could traverse the list
   in either direction. The hash table no longer preserves the order
   of `gnus-newsrc-alist' (this ordering wasn't actually used in many
   places), and instead there's a new variable `gnus-group-list' which
   records the proper sort order of the groups.

I feel fairly confident that all this is working okay.

The second commit I do *not* feel very confident about, and it's more
of a "let's see how things break" attempt. In theory, we should now be
able to limit group name encoding/decoding to the boundaries of Gnus --
reading/writing active files, and talking to servers. Within Gnus, the
group names can remain decoded strings.

In the second commit I've just gone in and pulled out all the decoding
I can find, changed all the 'raw-text encoding options to 'undecided,
and deleted the `mm-disable-multibyte's. I have no confidence that I've
covered all the bases, though I have been using this branch for a
couple of weeks, with nnml, nnmaildir, and nnimap groups with multibyte
names, and so far nothing has broken. I'm less confident about nntp.

I'll continue working on this, and I hope I can get some feedback here,
particularly on the second commit.

Thanks,
Eric
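
For readers unfamiliar with the obarray idiom, the difference described
at the top of this message can be sketched roughly as follows. This is
an illustration only, not code from the scratch/gnus-hashtables branch:
every `my-' variable and function is hypothetical, and `obarray-make'
is used here for brevity; the actual Gnus code may construct its
obarrays differently.

```elisp
;;; -*- lexical-binding: t -*-
;; Illustrative sketch only; the `my-' names are hypothetical, not Gnus code.
(require 'obarray)

;; Old style: an obarray used as a hash table.  The key has to become
;; a symbol, and the stored value lives in that symbol's value cell.
(defvar my-group-obarray (obarray-make 31))

(defun my-obarray-put (name value)
  "Store VALUE under group NAME in `my-group-obarray'."
  (set (intern name my-group-obarray) value))

(defun my-obarray-get (name)
  "Return the value stored under group NAME, or nil if there is none."
  (let ((sym (intern-soft name my-group-obarray)))
    (and sym (boundp sym) (symbol-value sym))))

;; New style: a real hash table keyed directly by the group-name string.
;; `:test #'equal' lets two distinct but equal strings reach the same entry.
(defvar my-group-table (make-hash-table :test #'equal))

(defun my-table-put (name value)
  "Store VALUE under group NAME in `my-group-table'."
  (puthash name value my-group-table))

(defun my-table-get (name)
  "Return the value stored under group NAME, or nil if there is none."
  (gethash name my-group-table))
```

With the hash table, (my-table-get (copy-sequence "nnml:inbox")) finds
an entry stored under "nnml:inbox" even though the key is a different
string object, because the table's test is `equal'. The same reasoning
applies to comparing group names held in text properties: once the
names are strings rather than symbols, an `eq' comparison is no longer
enough.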