operator== on the unordered containers
Why is there no operator== defined for the unordered containers?
Assuming a good hash function it is possible to define one for
unordered_set and unordered_map that takes O(N) time (where N = the size
of the collection):
template <class X, class H, class P, class A>
bool operator==(const unorderd_set<T, H, P, A> &lhs,
const unordered_set<T, H, P, A> &rhs)
{
if (lhs.size() != rhs.size()) return false;
for (auto it1 = lhs.begin(); it1!= lhs.end(); ++it1) {
auto it2 = rhs.find(*it1);
if ((it2 == rhs.end()) || (*it1 != *it2)) return false;
}
return true;
}
Whether this is possible for unordered_multiset and unordered_multiset
depends on how unordered we want the unordered container to be.
Consider two hash_multimap<int, char> objects h1 and h2, such that h1
contains the elements { {1, 'a'}, {1, 'b'} and h2 contains the same
elements in the opposite order. The efficiency of operator== on
hash_multimap and hash_multiset depends on whether s1 and s2 should be
considered equal.
If the order of elements with equivalent keys matters, then the
unordered_multiset and unordered_multimap can still be compared in O(N)
time:
template <class X, class H, class P, class A>
bool operator==(const unorderd_multiset<T, H, P, A> &lhs,
const unordered_multiset<T, H, P, A> &rhs)
{
if (lhs.size() != rhs.size()) return false;
for auto it1= lsh.begin(); it1!= lhs.end(); /*No increment*/) {
if (lhs.count(*it1) != rhs.count(*it1)) return false;
auto rng = rhs.equal_range(*it);
for (auto it2 = rng.first; it2 != rng.second; ++it1, ++it2) {
if (*it1 != *it2) return false;
}
}
return true;
}
If the order of elements with equivalent keys doesn't matter then
unordered_multiset and unordered_multimap can be compared in O(Nk) time
where k is the maximum number of elements with equivalent keys.
template <class X, class H, class P, class A>
bool operator==(const unorderd_multiset<T, H, P, A> &lhs,
const unordered_multiset<T, H, P, A> &rhs)
{
if (lhs.size() != rhs.size()) return false;
for auto it1= lsh.begin(); it1!= lhs.end(); /*No increment*/) {
if (lhs.count(*it1) != rhs.count(*it1)) return false;
//Can't just search for all the elements equivalent to *it1 in
rhs because the multiplicities might differ.
auto rng = rhs.equal_range(*it);
list<unordered_multiset::const_iterator> rh_list;
for (auto it2 = rh_rng.first; it2 != rng.second; ++ it2) {
rh_list.insert(it);
}
for (/*No init */ ; !rh_list.empty(); ++it1) {
auto it2 = rh_list.begin();
for (/*No init */; (it2 != rh_list.end()) && (*it1 != *it2);
++it2) {}
if (it2 == rh_list.end()) { //None found equal to *it1
return false;
} else {
rh_list.erase(it2); // Make sure we don't match the
same element twice
}
}
}
}
Note that all these equality operators can blow up if the unordered
container has a bad hash function. But if an unordered container has a
bad hash function then ALL of it's operations become inefficient. I
still think it's worthwhile to be able to compare two hashed containers
for equality.
Joe Gottman
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]