Re: hashCode() for Custom classes
Eric Sosman wrote:
> It is usually impossible to guarantee unique hashCode()
> values, because hashCode() returns an int and there are "only"
> four billion ints. Four billion may sound like a lot, but
> consider: How many different Strings exist? Let's see: there's
> one empty String, 64K one-character Strings, 4G two-character
> Strings, 256T three-character Strings, ... There are clearly
> a *lot* more Strings than there are int values, so there aren't
> enough unique ints to go 'round.
Correct. Not only is it impossible to guarantee unique hashes, it's
unnecessary and not even really desirable. Hashes are a hack. They exist to
speed up lookups by cutting down the number of equals() calls needed. If
calling equals() on every candidate were fast enough, we'd never use
hashCode().
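
To make the non-uniqueness concrete, here is a minimal Java sketch. It relies
only on the String.hashCode() formula documented in the String Javadoc; the
class and method names (HashCollisionDemo, sameHashDifferentValue) are made up
for illustration.

```java
class HashCollisionDemo {
    // True when two Strings share a hash code but are not equal,
    // i.e. a genuine collision.
    static boolean sameHashDifferentValue(String a, String b) {
        return a.hashCode() == b.hashCode() && !a.equals(b);
    }

    public static void main(String[] args) {
        // 'A'*31 + 'a' = 65*31 + 97 = 2112
        // 'B'*31 + 'B' = 66*31 + 66 = 2112
        System.out.println("Aa".hashCode());                       // prints 2112
        System.out.println("BB".hashCode());                       // prints 2112
        System.out.println(sameHashDifferentValue("Aa", "BB"));    // prints true
    }
}
```

Two distinct two-character Strings already collide, so equals() remains the
final word on whether two keys are the same.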
The purpose of a hash code is to map a large set of wide inputs to a smaller
set of narrower values. ints are faster to compare than Strings, by enough to
notice. If hashes were unique, they'd need to be as numerous and as wide as
the inputs, and then we'd just use the original values without encoding them
in the first place.
The hash comparison doesn't replace the equality test; it supplements it.
Hash indexing is only step one. Step two, if more than one key is in the same
hash bucket, is to walk the bucket's list, comparing each denizen against the
candidate with equals(). A good hash is one with a very low probability of
landing more than one key in any given bucket under real-life conditions.
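
The two steps can be sketched roughly as follows. This is a toy illustration
under my own assumptions, not how java.util.HashMap is actually written; the
class BucketLookupDemo and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Toy hash set of Strings with separate chaining (illustration only).
class BucketLookupDemo {
    private final List<List<String>> buckets = new ArrayList<>();

    BucketLookupDemo(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // Step one: reduce the hash code to a bucket index.
    // floorMod keeps the index non-negative for negative hash codes.
    private int indexFor(String key) {
        return Math.floorMod(key.hashCode(), buckets.size());
    }

    // Adds the key unless an equal key is already present.
    boolean add(String key) {
        if (contains(key)) {
            return false;
        }
        buckets.get(indexFor(key)).add(key);
        return true;
    }

    // Step two: walk the bucket, comparing each occupant with equals().
    boolean contains(String key) {
        for (String candidate : buckets.get(indexFor(key))) {
            if (candidate.equals(key)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        BucketLookupDemo set = new BucketLookupDemo(16);
        set.add("Aa");
        // "BB" has the same hash code as "Aa", so it lands in the same
        // bucket, but equals() tells the two apart.
        System.out.println(set.contains("Aa")); // prints true
        System.out.println(set.contains("BB")); // prints false
    }
}
```

The hash only narrows the search to one bucket; equals() still decides the
match, which is why colliding keys cost time but not correctness.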
--
Lew