Re: Algorithm to find pairs in array [can not order]
daniel_nordlund_1982@hotmail.com wrote:
obaqueiro@gmail.com skreiv:
At the end I believe given the restrictions a O(n*n) is the best I can
achieve. Not that it is wrong but I thought something better could be
done.
Your code is extremely inefficient if you have many matching pairs -
not O(n*n)!! This is due to the following reasons:
- You restart the whole algorithm once you remove two elements!!
- You use remove(i) with ArrayLists - causes all subsequent elements to
be shifted!
And using a LinkedList would not solve this because that would make the
indexed access slow.
- You are using a bad algorithm.
Indeed. It is even worse than I thought at first.
It is easy to fix the first two points, and it might make a huge
difference in execution time. You can for instance mark each offer that
should be removed. Don't compare marked offers and at the end you have
a separate loop to copy the unmarked offers to a new ArrayList.
If you still need it faster you can make the algo O(n log n) quite
easily by making option types comparable (doesn't matter how), then you
just sort the offers for their option type - this is O(n log n). For
two offers with the same option type put hold before write (or vice
versa). After the sorting you can go through the sorted list and remove
duplicates as they will be next to eachother. This stage is O(n) if
done properly. Random matching of pairs is easy to implement in the
compare function.
In conjunction with this. note that the standard Collections.sort is
stable. That is, if two items are equal in the sort order, they appear
in the result in the same order as in the input. As far as I can tell,
order only needs to be preserved among options for the same direction
for the same contract.
If the current algorithm is anywhere near to livable performance, the
O(n log n) performance from those ideas should be fine. However, if the
list is very long, consider the following:
Have a pair of HashMap instances, each mapping from contract to a queue
of offers, one for unmatched holds and one for unmatched writes.
Do a single scan of the input list. I'll describe processing for a write
- exchange "write" and "hold" to get the hold processing.
Look up the contract in the holds map. If there is queue, poll to get
the head item. If that is non-null, you have a match.
If no match, add the offer to the holds map, making it the tail item of
the queue for its contract. That may involve creating a new queue and
inserting in the map.
As you go along, the unmatched offers can be appended to the tail of a list.
This reduces the cost from O(n log n) to O(n). However, unless the list
is very long, there will be little or no gain in net performance because
the constant factor will be bigger, and the simpler approach is better
just because it is simpler.
Patricia