Re: Faster than Array search?
Roedy Green wrote:
On Mon, 14 Jan 2008 03:07:27 -0800 (PST), Sanny
<softtanks@hotmail.com> wrote, quoted or indirectly quoted someone who
said :
if (array1[]=='K') here 'K' is not unique It may happen more that 2
points have value 'K'.
Array1={'a','m', 'x', 'K', 'L', 'L', 'K', 'K'...........'X','b', 'a'}
So same character may be present at many places in the array.
then simple indexing won't work. What you could do is
have an array indexed by a-z A-Z. At that index is a tiny array giving
you the list of elements that match that letter.
You then have to compute the index like this:
If I understand Sanny's code correctly, it can be paraphrased as:
1) do some magic in function1() that returns an integer value
called 'hh' in the range 0..100
2) if the value has one of a set of arbitrary values then pass 'hh'
to function2() which does more magic and returns some value
which is added to a variable called 'ff'
The array is never searched.
The test for the arbitrary value is to use 'hh' as a direct index into
an array that contains an element corresponding to each of the 100
possible values of 'hh' and to compare the indexed element with a
literal constant.
AFAIK direct indexing is the fastest way to access an array, so
something else must be eating up the CPU cycles.
Sanny, some questions:
- presumably you're using some sort of profiler, so does it
measure time spent in a code clause, or just in the entire line?
- how do you know that the wait time is in the condition?
It matters whether the profiling tool is actually measuring the time
spent in each statement with a nanosecond timer or if it merely looks
at your program every few microseconds and increments an array indexed
on the line number of the statement that the instruction pointer
happens to be in. The second type used to be most common and probably
still are.
If it does the second, then it can be fooled in some circumstances
and, while it may be OK for saying how much time is spent in your
loop, its less reliable for saying exactly which statement(s) in the
loop are eating the CPU cycles.
- do you get the same answer if you rewrite the line like:
if (array1[hh]=='K')
{
ff+=function2(hh);
}
which would let a line or source statement oriented profiler
distinguish time spent in function2 from time in the condition test?
Using a String, StringBuffer, Vector or List could be as fast as
directly indexing an array but it is very unlikely to be faster because:
- String.charAt() and StringBuffer.charAt() involve method calls,
so they will be slower even if their internal implementation is
an indexed array access.
- Vectors and Lists are likely to be implemented as more complex
structures than mere arrays. Accessing them consists of
a method call plus a more complex, and hence slower, access
procedure than an indexed array access.
- Sets are unsuitable because they can't hold duplicate values.
They also add overheads: HashSet has to be slower than indexing
because there's an overhead in calculating the hash and the TreeSet
has a tree walking overhead.
--
martin@ | Martin Gregorie
gregorie. | Essex, UK
org |