From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: Mark H Weaver Newsgroups: gmane.lisp.guile.user Subject: Re: Comparing two hash tables for equality? Date: Sun, 26 Aug 2018 18:49:00 -0400 Message-ID: <87o9dovj4z.fsf@netris.org> References: <2418985.EcWt8OQBW1@aleksandar-ixtreme-m5740> NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: text/plain X-Trace: blaine.gmane.org 1535323747 3406 195.159.176.226 (26 Aug 2018 22:49:07 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Sun, 26 Aug 2018 22:49:07 +0000 (UTC) User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) Cc: guile-user@gnu.org To: Aleksandar Sandic Original-X-From: guile-user-bounces+guile-user=m.gmane.org@gnu.org Mon Aug 27 00:49:02 2018 Return-path: Envelope-to: guile-user@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fu3qA-0000lG-8Z for guile-user@m.gmane.org; Mon, 27 Aug 2018 00:49:02 +0200 Original-Received: from localhost ([::1]:50639 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fu3sG-0004GW-KJ for guile-user@m.gmane.org; Sun, 26 Aug 2018 18:51:12 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:56684) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fu3ri-0004GM-Q8 for guile-user@gnu.org; Sun, 26 Aug 2018 18:50:39 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fu3rf-0004I5-Ja for guile-user@gnu.org; Sun, 26 Aug 2018 18:50:38 -0400 Original-Received: from world.peace.net ([64.112.178.59]:45134) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fu3rf-0004GN-Dq for guile-user@gnu.org; Sun, 26 Aug 2018 18:50:35 -0400 Original-Received: from mhw by world.peace.net with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.89) (envelope-from ) id 1fu3re-0007jY-2l; Sun, 26 Aug 2018 18:50:34 -0400 In-Reply-To: <2418985.EcWt8OQBW1@aleksandar-ixtreme-m5740> (Aleksandar Sandic's message of "Sun, 26 Aug 2018 12:01:17 +0200") X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 64.112.178.59 X-BeenThere: guile-user@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: General Guile related discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-user-bounces+guile-user=m.gmane.org@gnu.org Original-Sender: "guile-user" Xref: news.gmane.org gmane.lisp.guile.user:14770 Archived-At: Hi Aleksandar, Aleksandar Sandic writes: > I have been looking through the reference manual, but there does not seem to > be a procedure for comparing two hash tables for equality. The procedure > 'equal?' only returns true if the two tables are also 'eq?': > > scheme@(guile-user)> (define h1 (make-hash-table)) > scheme@(guile-user)> (define h2 (make-hash-table)) > scheme@(guile-user)> (equal? h1 h2) > $1 = #f > scheme@(guile-user)> (equal? h1 h1) > $2 = #t > > Is this the intended behaviour? Yes. That might seem surprising. I have a few things to say on that, in no particular order: An equality test on hash tables needs to know how to compare the keys and how to compare the values. There's no way to pass those additional arguments to 'equal?', so it can't do that job. Hash table implementations typically don't offer an equality test on the hash tables themselves, and I don't recall anyone ever asking for such an operation before now. I guess that's because in the use case where hash tables are beneficial -- when the tables are very large -- comparing them for equality is expensive. While hash tables are indispensible for certain use cases, they also have very significant downsides, and I tend to avoid them whenever possible. Their most severe downside, in my opinion, is that they are fundamentally incompatible with a functional programming style. They invariably force all code that uses them to be written in an imperative style. I cannot stress enough how important this is. It's also not possible to efficiently create a new hash table based on an existing one, but with one or more additional entries. In Scheme, you can efficiently 'cons' or 'append' some entries onto the front of an existing list without modifying the original list. That includes association lists. To do the same with a hash table, you must make a copy of the entire hash table and then add the new entries to the copy. There can be no sharing of storage between multiple hash tables, as can be done with lists, association lists, or balanced trees. Even if you don't need sharing, hash tables also use more space than those other data structures. So, I would ask yourself whether the benefits of hash tables truly outweigh the downsides in your particular use case. If you're comfortable sharing details, I'd be curious to hear what you're trying to accomplish here, at a higher level. Maybe there's a better way. Anyway, if you really need to compare hash tables for equality, here's some suggested code that's simpler than the code you wrote, and should be quite a bit more efficient as well. In particular, this code performs no heap allocation: --8<---------------cut here---------------start------------->8--- (use-modules (ice-9 match)) (define (hash-table-subset? val=? h1 h2) "H1 and H2 must be hash tables which use equal? as the equality test for keys. Return #t if and only if for every entry (K . V1) in H1, there exists an entry (K . V2) in H2 such that (VAL=? V1 V2), otherwise return #f." (hash-fold (lambda (k v1 equal-so-far) (and equal-so-far (match (hash-get-handle h2 k) ((_ . v2) (val=? v1 v2)) (#f #f)))) #t h1)) (define (hash-table=? val=? h1 h2) "H1 and H2 must be hash tables which use equal? as the equality test for keys. Return #t if H1 and H2 have the same keys (in the sense of equal?) and each key has the same value (in the sense of val=?), otherwise return #f." (and (hash-table-subset? val=? h1 h2) (hash-table-subset? val=? h2 h1))) (define (equal*? a b) (or (equal? a b) (and (hash-table? a) (hash-table? b) (hash-table=? equal*? a b)))) --8<---------------cut here---------------end--------------->8--- You might not need 'equal*?' here. I included it only to match more closely the behavior of your code. For most purposes, 'hash-table=?' is actually more flexible than 'equal*?', as long as the "type" of values stored in a given hash table are known. If the values stored in the hash table are numbers, then 'val=?' should be '='. If the values are themselves hash tables, then 'val=?' should be (lambda (a b) (hash-table=? foo=? a b)) for some value of 'foo=?'. In some other cases, 'val=?' should be 'equal?'. If you truly need hash tables whose values could contain either hash tables or some other type, then something like 'equal*?' might be the right tool. However, keep in mind that 'equal?' is usually not the desired equality test for numbers, because (equal? 1 1.0) => #false and (equal? +0.0 -0.0) => #false. 'equal?' tests for _operational_ equivalence, which makes distinctions important for memoization and some other purposes, whereas '=' tests for mathematical equality and is usually what you want for numbers. Regards, Mark