From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors" Newsgroups: gmane.emacs.bugs Subject: bug#68244: hash-table improvements Date: Mon, 08 Jan 2024 19:33:13 -0500 Message-ID: References: <170438379722.3921.9312235725296561206@vcs2.savannah.gnu.org> <20240104155642.B4A99C00344@vcs2.savannah.gnu.org> <8d49ebdc-9da7-4e70-a080-d8e892b980b6@gutov.dev> <08314177-5AE9-4352-94A0-641830B4094D@gmail.com> <19265EA5-E6F3-446C-AD9B-763693CF0A48@gmail.com> Reply-To: Stefan Monnier Mime-Version: 1.0 Content-Type: text/plain Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="29362"; mail-complaints-to="usenet@ciao.gmane.io" User-Agent: Gnus/5.13 (Gnus v5.13) Cc: Dmitry Gutov , Eli Zaretskii , 68244@debbugs.gnu.org To: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Tue Jan 09 01:34:35 2024 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1rN04h-0007Nq-BI for geb-bug-gnu-emacs@m.gmane-mx.org; Tue, 09 Jan 2024 01:34:35 +0100 Original-Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rN046-00075f-7x; Mon, 08 Jan 2024 19:33:58 -0500 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rN044-00075R-LN for bug-gnu-emacs@gnu.org; Mon, 08 Jan 2024 19:33:56 -0500 Original-Received: from debbugs.gnu.org ([2001:470:142:5::43]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rN044-0005Lo-7s for bug-gnu-emacs@gnu.org; Mon, 08 Jan 2024 19:33:56 -0500 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1rN04A-0003q5-6R for bug-gnu-emacs@gnu.org; Mon, 08 Jan 2024 19:34:02 -0500 X-Loop: help-debbugs@gnu.org Resent-From: Stefan Monnier Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Tue, 09 Jan 2024 00:34:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 68244 X-GNU-PR-Package: emacs Original-Received: via spool by 68244-submit@debbugs.gnu.org id=B68244.170476041714717 (code B ref 68244); Tue, 09 Jan 2024 00:34:02 +0000 Original-Received: (at 68244) by debbugs.gnu.org; 9 Jan 2024 00:33:37 +0000 Original-Received: from localhost ([127.0.0.1]:37894 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rN03h-0003pG-OM for submit@debbugs.gnu.org; Mon, 08 Jan 2024 19:33:37 -0500 Original-Received: from mailscanner.iro.umontreal.ca ([132.204.25.50]:23082) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rN03c-0003ou-9A for 68244@debbugs.gnu.org; Mon, 08 Jan 2024 19:33:32 -0500 Original-Received: from pmg2.iro.umontreal.ca (localhost.localdomain [127.0.0.1]) by pmg2.iro.umontreal.ca (Proxmox) with ESMTP id 04BEF80CC9; Mon, 8 Jan 2024 19:33:16 -0500 (EST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=iro.umontreal.ca; s=mail; t=1704760394; bh=MifJ4B7Iwja96TOed25M0tt9iwL8CBCTXLIEp2YSoq8=; h=From:To:Cc:Subject:In-Reply-To:References:Date:From; b=m+W6h4UGQR7SIvHgddNPiapkbJhVi6nEZIjeR51y/dBrMWCuMx5SL4IAQe211V62O edMeM4NcTALpVBDnpTkjnA1bOvJPAJwOA0WBa2vYxvCILdKrpdXOd0phtp87oBZiiH 9sPhm0bVs/g4vJ5iFeu9NSieDQrVwwcb7B/jssEEy9t9nKiw4m0V3eAx3tk66gmwI7 MO9n1S/GAZR+a4P879vAkBgn3hDxCbGU7srM/5qz5mjBPkyi2y0VnJfHQfimusXuA7 eK5v+ba4Itcn0p9nA7ldaQDzXI9Ogox4zGw1a+EaCPUl/wfC4W+OX7oUNjlXUU34sA FT04Ec7I++pww== Original-Received: from mail01.iro.umontreal.ca (unknown [172.31.2.1]) by pmg2.iro.umontreal.ca (Proxmox) with ESMTP id D8FF7809A5; Mon, 8 Jan 2024 19:33:14 -0500 (EST) Original-Received: from pastel (65-110-221-238.cpe.pppoe.ca [65.110.221.238]) by mail01.iro.umontreal.ca (Postfix) with ESMTPSA id A98AC120BE4; Mon, 8 Jan 2024 19:33:14 -0500 (EST) In-Reply-To: <19265EA5-E6F3-446C-AD9B-763693CF0A48@gmail.com> ("Mattias =?UTF-8?Q?Engdeg=C3=A5rd?="'s message of "Mon, 8 Jan 2024 19:26:21 +0100") X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Original-Sender: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Xref: news.gmane.io gmane.emacs.bugs:277587 Archived-At: > The left columns are for the standard hash tables with remainder-based range > reduction, the right ones with Knuth reduction. > 'size' is the table size, 'idxsiz' the index vector size, '1st%' the portion > of entries accessed with a single lookup, 'avg' the average number of > accesses and 'max' the maximum number of accesses required. > > The old code looks perfect (no collisions!), but the new shiny code is a disappointment. Ah, that could explain it, indeed. BTW, what is the "per-entry byte-size" of your new code? The old code had about 6 words per entry, IIRC. > We could try switching to a high-quality hash function (or family thereof), > like Murmur or Jenkins. Then range reduction is just a matter of masking off > the required number of bits. I don't see a strong need for it. BTW, I see in the Knuth reduction you extract the bits 32..32+N of the multiplication. Any reason not to use the top N bits instead (so we're not limited to 32 bits, for example)? Stefan