Re: hashcode calculation for a Collection of objects
Roedy Green wrote On 08/24/07 01:29,:
> On Fri, 10 Aug 2007 14:09:45 -0400, Eric Sosman <Eric.Sosman@sun.com>
> wrote, quoted or indirectly quoted someone who said :
>
>>> 4. it does not tend to collapse the range of the digest into a
>>> narrower band.
>>
>> Equally, it does not expand the range. If you
>> XOR a bunch of bytes, you get one of 256 possible
>> values, never more.
>
> I said "it does not collapse the range", not that "it expands it".
Yes, that's what you said. And you're right, and I
did not contradict you.
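For anyone following along, here is a quick sketch of that "never
more" point. The class name and the million-key experiment are mine,
purely for illustration: XOR-folding the bytes of random 16-byte keys
can never produce more than 256 distinct codes.

import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class XorFoldDemo {
    // Fold every byte into a single accumulator with XOR.
    static int xorHash(byte[] data) {
        int h = 0;
        for (byte b : data) {
            h ^= (b & 0xFF);    // h never leaves 0..255
        }
        return h;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        Set<Integer> distinct = new HashSet<Integer>();
        for (int i = 0; i < 1000000; i++) {
            byte[] key = new byte[16];
            rnd.nextBytes(key);
            distinct.add(xorHash(key));
        }
        // Prints at most 256, no matter how many keys we feed it.
        System.out.println("distinct hash codes: " + distinct.size());
    }
}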
> Further, XOR depends on every single bit used to compute it. Change a
> bit and the result changes.
>
> Various multiply, shift, modulus etc tend to collapse the range.
... but if I compute a 32-bit value, even if two-thirds
of its bits are "collapsed" in the sense of being directly
computable from the others, my "eleven-bits-of-entropy" hash
still gets eight times better scattering than an 8-bit value
of whatever pedigree. A computation that yields an 8-bit
result is "pre-collapsed," even if it is careful to preserve
as many degrees of freedom as it still retains.
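To put a number on that "eight times better scattering" claim, here is
a throwaway sketch; the class, the 16-byte random keys, and the
4096-bucket table are all my own invention for the sake of the
experiment. It pits the 8-bit XOR fold against a String.hashCode()-style
multiply-and-add hash that has been deliberately masked down to eleven
bits, and counts how many table slots each one can reach.

import java.util.Random;

public class ScatterDemo {
    // 8-bit XOR fold: at most 256 possible codes.
    static int xorHash(byte[] data) {
        int h = 0;
        for (byte b : data) h ^= (b & 0xFF);
        return h;
    }

    // Multiply-and-add in the style of String.hashCode(), then
    // deliberately masked to 11 bits to mimic a 32-bit code whose
    // other bits are "collapsed" (computable from these eleven).
    static int elevenBitHash(byte[] data) {
        int h = 1;
        for (byte b : data) h = 31 * h + b;
        return h & 0x7FF;               // keep 11 degrees of freedom
    }

    // How many of the table's buckets get hit by 100000 random keys?
    static int bucketsTouched(int tableSize, boolean useXor) {
        Random rnd = new Random(1);
        boolean[] hit = new boolean[tableSize];
        for (int i = 0; i < 100000; i++) {
            byte[] key = new byte[16];
            rnd.nextBytes(key);
            int h = useXor ? xorHash(key) : elevenBitHash(key);
            hit[h % tableSize] = true;  // both hashes are non-negative
        }
        int n = 0;
        for (boolean b : hit) if (b) n++;
        return n;
    }

    public static void main(String[] args) {
        int tableSize = 4096;
        // The XOR fold can touch at most 256 buckets; the 11-bit hash
        // can touch up to 2048, eight times as many.
        System.out.println("8-bit XOR fold : " + bucketsTouched(tableSize, true) + " of " + tableSize);
        System.out.println("11-bit hash    : " + bucketsTouched(tableSize, false) + " of " + tableSize);
    }
}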
It reminds me a little bit of the GIF/JPEG wars, now
(happily) mostly a thing of the past. "JPEG is lossy!"
shrieked the GIF camp -- and then the JPEG fans pointed
out that turning an image into GIF *starts* by discarding
two-thirds (typically) of the input. True, it zealously
guards the remaining one-third in a way JPEG does not, but
the "collapse" has already occurred and cannot be undone.
--
Eric.Sosman@sun.com