* bug#39634: All keyowrds hash to the same value
2020-02-25 20:56 ` Andy Wingo
@ 2020-02-25 22:13 ` lloda
2020-02-26 21:02 ` Ludovic Courtès
2020-03-06 14:42 ` Ludovic Courtès
2 siblings, 0 replies; 8+ messages in thread
From: lloda @ 2020-02-25 22:13 UTC (permalink / raw)
To: Andy Wingo; +Cc: 39634, Ludovic Courtès, Rob Browning
> On 25 Feb 2020, at 21:56, Andy Wingo <wingo@igalia.com> wrote:
>
> On Thu 20 Feb 2020 17:19, Ludovic Courtès <ludo@gnu.org> writes:
>
>> Of all the scm_tc7_ values listed in ‘scm.h’, the following are not
>> explicitly listed (so they go to the default case that hashes the first
>> word):
>
> Reformatting your list so I can check one by one :)
>
>> variable,
>> hashtable,
>> fluid,
>> dynamic_state,
>> frame,
>> atomic_box,
>> program,
>> vm_cont,
>> weak_set,
>> weak_table,
>> port
>
> No equal? implementation, so should hashq() instead.
>
>> bytevector,
>> array,
>> bitvector,
>
> These have equal? implementations, and what's more, a bitvector can
> equal? an array... I think we have another bug!
>
> ;; Project 2d array as 1d array (scm_tc7_array)
> (define x
> (make-shared-array #2b((#t #t #t)) (lambda (i) (list 0 i)) '(0 2)))
> ;; scm_tc7_bitvector
> (define y #*111)
>
> (equal? x y) ;; => #t
> (equal? (hash x #xffffffff) (hash y #xffffffff)) ;; => #f
>
> Similarly for 1-d scm_tc7_array versus regular vectors, bytevectors,
> etc.
>
> Fixing this will not be straightforward... I think basically 1d arrays
> need some special hashing logic so that e.g. vectors and 1d arrays hash
> to the same thing.
I cannot check at the moment but I think that use of make-shared-array is special cased to return a bitvector because the shared array and the root happen to be equivalent. So your x isn't a scm_tc7_array but a scm_tc7_bitvector. The same is true for the other vector types. You can see that if you make a shared array with bounds '(0 1) instead of '(0 2) for example, or non-unit step or non-zero lower bound or anything that cannot be represented as a root vector.
Now it is true that functionally a root vector and a 1d array with the same bounds and the same elements are equivalent even if the array has non-unit stride and so on, but we had that logic before were you could use the root vector functions on arrays and it was an absolute mess. I think there should be a logic to hash n-d arrays that extends to 1-d arrays so there's no need to make special cases. All vectors can be treated as 1-d arrays so that should work fine for those too.
Partially related, I have a series of patches on wip-vector-cleanup to make sure that the various vector implementations don't depend on arrays (as they still do on some cases) but rather the other way around, strictly. I haven't posted about it b/c it changes a few interfaces and I haven't figured out the deprecation route.
regards
>
>> stringbuf,
>> values,
>
> These are never exposed to Scheme, and never compared using equal?
> AFAIU. No need for special cases.
>
> Basically I think the tc7 case should default to hashq, and include
> special cases for the ones that have equal? implementations or which
> have read syntax.
>
> Sound right to you?
>
> Andy
>
>
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* bug#39634: All keyowrds hash to the same value
2020-02-25 20:56 ` Andy Wingo
2020-02-25 22:13 ` lloda
@ 2020-02-26 21:02 ` Ludovic Courtès
2020-03-06 14:42 ` Ludovic Courtès
2 siblings, 0 replies; 8+ messages in thread
From: Ludovic Courtès @ 2020-02-26 21:02 UTC (permalink / raw)
To: Andy Wingo; +Cc: 39634, Rob Browning
Hi Andy!
Andy Wingo <wingo@igalia.com> skribis:
> On Thu 20 Feb 2020 17:19, Ludovic Courtès <ludo@gnu.org> writes:
>
>> Of all the scm_tc7_ values listed in ‘scm.h’, the following are not
>> explicitly listed (so they go to the default case that hashes the first
>> word):
>
> Reformatting your list so I can check one by one :)
>
>> variable,
>> hashtable,
>> fluid,
>> dynamic_state,
>> frame,
>> atomic_box,
>> program,
>> vm_cont,
>> weak_set,
>> weak_table,
>> port
>
> No equal? implementation, so should hashq() instead.
>
>> bytevector,
>> array,
>> bitvector,
>
> These have equal? implementations, and what's more, a bitvector can
> equal? an array... I think we have another bug!
>
> ;; Project 2d array as 1d array (scm_tc7_array)
> (define x
> (make-shared-array #2b((#t #t #t)) (lambda (i) (list 0 i)) '(0 2)))
> ;; scm_tc7_bitvector
> (define y #*111)
>
> (equal? x y) ;; => #t
> (equal? (hash x #xffffffff) (hash y #xffffffff)) ;; => #f
>
> Similarly for 1-d scm_tc7_array versus regular vectors, bytevectors,
> etc.
Oops.
> Fixing this will not be straightforward... I think basically 1d arrays
> need some special hashing logic so that e.g. vectors and 1d arrays hash
> to the same thing.
OK.
>> stringbuf,
>> values,
>
> These are never exposed to Scheme, and never compared using equal?
> AFAIU. No need for special cases.
Agreed.
> Basically I think the tc7 case should default to hashq, and include
> special cases for the ones that have equal? implementations or which
> have read syntax.
>
> Sound right to you?
It does, yes.
Thanks for looking into it,
Ludo’.
^ permalink raw reply [flat|nested] 8+ messages in thread
* bug#39634: All keyowrds hash to the same value
2020-02-25 20:56 ` Andy Wingo
2020-02-25 22:13 ` lloda
2020-02-26 21:02 ` Ludovic Courtès
@ 2020-03-06 14:42 ` Ludovic Courtès
2020-03-06 15:43 ` Andy Wingo
2 siblings, 1 reply; 8+ messages in thread
From: Ludovic Courtès @ 2020-03-06 14:42 UTC (permalink / raw)
To: Andy Wingo; +Cc: 39634, Rob Browning
[-- Attachment #1: Type: text/plain, Size: 469 bytes --]
Andy Wingo <wingo@igalia.com> skribis:
>> variable,
>> hashtable,
>> fluid,
>> dynamic_state,
>> frame,
>> atomic_box,
>> program,
>> vm_cont,
>> weak_set,
>> weak_table,
>> port
>
> No equal? implementation, so should hashq() instead.
Here’s a patch for these, let me know what you think!
(Of course longer term it’d be nice to have a centralized way to declare
each tc7 with its equal and hash methods.)
Ludo’.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/x-patch, Size: 1053 bytes --]
diff --git a/libguile/hash.c b/libguile/hash.c
index c51a794c9..9cb8fcedd 100644
--- a/libguile/hash.c
+++ b/libguile/hash.c
@@ -1,4 +1,4 @@
-/* Copyright 1995-1997,2000-2001,2003-2004,2006,2008-2015,2017-2018
+/* Copyright 1995-1997,2000-2001,2003-2004,2006,2008-2015,2017-2018,2020
Free Software Foundation, Inc.
This file is part of Guile.
@@ -331,6 +331,22 @@ scm_raw_ihash (SCM obj, size_t depth)
h ^= scm_raw_ihash (scm_syntax_module (obj), depth);
return h;
}
+
+ /* The following tc7s have no 'equal?' implementation. Thus, just
+ fall back to 'hashq'. */
+ case scm_tc7_variable:
+ case scm_tc7_hashtable:
+ case scm_tc7_fluid:
+ case scm_tc7_dynamic_state:
+ case scm_tc7_frame:
+ case scm_tc7_atomic_box:
+ case scm_tc7_program:
+ case scm_tc7_vm_cont:
+ case scm_tc7_weak_set:
+ case scm_tc7_weak_table:
+ case scm_tc7_port:
+ return scm_raw_ihashq (SCM_UNPACK (obj));
+
case scm_tcs_cons_imcar:
case scm_tcs_cons_nimcar:
if (depth)
^ permalink raw reply related [flat|nested] 8+ messages in thread
* bug#39634: All keyowrds hash to the same value
2020-03-06 14:42 ` Ludovic Courtès
@ 2020-03-06 15:43 ` Andy Wingo
2020-03-06 16:19 ` Ludovic Courtès
0 siblings, 1 reply; 8+ messages in thread
From: Andy Wingo @ 2020-03-06 15:43 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: 39634, Rob Browning
On Fri 06 Mar 2020 15:42, Ludovic Courtès <ludo@gnu.org> writes:
> Andy Wingo <wingo@igalia.com> skribis:
>
>>> variable,
>>> hashtable,
>>> fluid,
>>> dynamic_state,
>>> frame,
>>> atomic_box,
>>> program,
>>> vm_cont,
>>> weak_set,
>>> weak_table,
>>> port
>>
>> No equal? implementation, so should hashq() instead.
>
> Here’s a patch for these, let me know what you think!
LGTM!
Andy
^ permalink raw reply [flat|nested] 8+ messages in thread
* bug#39634: All keyowrds hash to the same value
2020-03-06 15:43 ` Andy Wingo
@ 2020-03-06 16:19 ` Ludovic Courtès
0 siblings, 0 replies; 8+ messages in thread
From: Ludovic Courtès @ 2020-03-06 16:19 UTC (permalink / raw)
To: Andy Wingo; +Cc: 39634-done, Rob Browning
Andy Wingo <wingo@igalia.com> skribis:
> On Fri 06 Mar 2020 15:42, Ludovic Courtès <ludo@gnu.org> writes:
>
>> Andy Wingo <wingo@igalia.com> skribis:
>>
>>>> variable,
>>>> hashtable,
>>>> fluid,
>>>> dynamic_state,
>>>> frame,
>>>> atomic_box,
>>>> program,
>>>> vm_cont,
>>>> weak_set,
>>>> weak_table,
>>>> port
>>>
>>> No equal? implementation, so should hashq() instead.
>>
>> Here’s a patch for these, let me know what you think!
>
> LGTM!
Pushed as c5d3b45c9f903cc9726b4bd7e3c13db56bafd587, thanks!
Ludo'.
^ permalink raw reply [flat|nested] 8+ messages in thread