Re: Improve Java Code

From:

Patricia Shanahan <pats@acm.org>

Newsgroups:

comp.lang.java.programmer

Date:

Thu, 10 Jan 2008 09:46:21 -0800

Message-ID:

<fm5lhk$2t06$1@ihnp4.ucsd.edu>

Patricia Shanahan wrote:

Sanny wrote:
....

So If I Use

u1=i+1;u2=j+1;
if (Arr1[u1*3+u2]=='B'){// is True 25% times only.
if ((u1<9)&&(u2<9)) {
ux=u1*3+u2;
mz++;BigARRAY[mz]=ux;
}
}

Will above code will be faster?

It depends on things like the predictability of the branches. A single
mispredicted branch can cost a lot more than all the integer arithmetic
in your code snippet. In this sort of code, you want to put highly
predictable if statements on the outside, so that you only execture
unpredictable ones if you really have to.

You do need a testbed program that extracts the relevant portions of
your application in which you can measure proposed changes.

Most people in this thread seem to be primarily concerned with the
integer arithmetic, so maybe I should explain my preoccupation with the
conditional branches.

Most processors now have deep, wide pipelines. That means that at any
given time there are dozens of instructions at some stage of being
processed, going through a sort of production line. As long as the work
keeps flowing smoothly, multiple instructions complete each cycle.
Evaluating some small integer expression can be tucked into the corners
of memory accesses, and is practically free.

Now consider what happens when a conditional branch enters the pipeline.
The processor has to decide what to put in the pipeline after it. Should
it carry on fetching the next instruction in memory order, or should it
switch to fetching the branch target and its memory order successors?

The processor does not find out whether it guessed right or not until
the branch condition has been evaluated, much later on in the pipeline.
If it guessed wrong, all the work it has done on later instructions is
wasted. It has to flush out its pipeline and start fetching the other
list of instructions. At that point, it has wasted dozens of instruction
execution opportunities.

To compensate for this, processors take heroic measures to try to
predict branches based on history, but are not always successful. The
easiest are branches that almost always go the same way, such as an
error detection branch or the return to the top of a loop with a high
iteration count. Branches that have no pattern to whether they are taken
or not taken, and go either way about equally often, are really expensive.

We are dealing with a short code snippet with three conditional
branches, including one that depends on a value fetched from an array.
They may all have a high correct prediction rate, in which case each
branch is of similar cost to e.g. an integer add, and small tweaks to
the integer arithmetic will make measurable differences in the code
performance. At the other extreme, the Arr1 test could be being
mispredicted so often that the only thing that matters for the loop
performance is doing it as infrequently as possible.

Only measurements can tell.

Patricia