Analyzing Hash Collisions
By Stephen Compall - Posted on December 28th, 2007
Tagged:
This is a little debugging helper for analyzing hash functions in GST by looking at collisions in LookupTables. I just copied one of the more-or-less identical #findIndex: implementations and added a counter to it, then a nice method for slicing up a whole table for analysis.
For example,
st> Dictionary methodDictionary analyzeCollisions! 0->(#keys #add: #at: #removeAllKeys: #keyAtValue: #atAll: #keysDo: #associationAt: #removeKey: #values #occurrencesOf: #remove: #remove:ifAbsent: #removeKey:ifAbsent: #deepCopy #keyAtValue:ifAbsent: ) 1->(#at:put: #includesAssociation: #associationsDo: #associationAt:ifAbsent: #do: ) 2->(#addAll: #at:ifAbsent: ) 3->(#at:ifAbsentPut: #removeAllKeys:ifAbsent: #reject: ) 4->(#at:ifPresent: #storeOn: ) 6->(#includesKey: ) 7->(#includes: ) 8->(#rehash #findIndex: ) 9->(#keysAndValuesDo: ) 13->(#collect: ) 14->(#whileGrowingAt:put: ) 15->(#select: ) 16->(#inspect ) 17->(#keysClass ) 18->(#= ) 19->(#hash ) 22->(#printOn: ) 26->(#findKeyIndex: ) 27->(#hashFor: #withAllSuperspaces ) 29->(#findElementIndex: )
Here's the code:
!LookupTable methodsFor: 'debugging'!
collisionsAt: key
"Answer the number of collisions before finding the given key, or
nil if not present."
| index size element collisions |
self beConsistent.
index := ((self hashFor: key) scramble
bitAnd: (size := self primSize) - 1) + 1.
collisions := 0.
[(element := self primAt: index) isNil ifTrue: [^nil].
element = key ifTrue: [^collisions].
collisions := 1 + collisions.
index == size ifTrue: [index := 1] ifFalse: [index := index + 1]]
repeat
!
analyzeCollisions
"Print an analysis of collisions for my keys on the transcript."
| colls scolls |
colls := LookupTable new.
self keysDo: [:k |
(colls at: (self collisionsAt: k)
ifAbsentPut: [OrderedCollection new]) add: k].
scolls := SortedCollection sortBlock: [:a :b | a key < b key].
colls keysAndValuesDo: [:k :v | scolls add: k -> v asArray].
scolls do: [:each | each printNl]
! !
