Re: Array optimizing problem in C++?
On 23 mar, 09:58, "jason.cipri...@gmail.com"
<jason.cipri...@gmail.com> wrote:
Also, wrt James' original post:
void
smooth( double* dest,
double const* src,
size_t len )
{
for ( size_t i = 1 ; i < len - 1 ; ++ i ) {
dest[ i - 1 ] = (src[ i - 1 ] + src[ i ] + src[ i + 1 ]) =
/ 3 ;
}
}
The compiler cannot arrange to use the src[ i + 1 ] value
from the preceding pass through the loop as the src[ i ]
value in the current pass, since the write to dest[ i - 1 ]
might have changed it. In Java, it can, since two arrays
are either identical or disjoint.
I am not sure what you would expect in either language. I
believe that James had been expecting it to cache [i+1] in an
fpu register, and use that instead of accessing the value the
next time through.
Exactly. It's a very common optimization.
As it stands right now, VS at least (I didn't check GCC)
generates assembler instructions that are more equivalent to
this:
fpu_register = src[i - 1]
fpu_register += src[i]
fpu_register += src[i + 1]
fpu_register /= 3
dest[i - 1] = fpu_register
Using fld, fadd, fdiv, and fstp on Intel machines. It never
loads src[i + 1] anyways. I have not tested this or done any
research, but I suspect this is still a bit faster than:
fpu_register = src[i - 1]
fpu_register += other_fpu_register
other_fpu_register = src[i + 1]
fpu_register += other_fpu_register
fpu_register /= 3
dest[i - 1] = fpu_register
The fastest solution would pre-charge the first two values, and
only read one new value each time through the loop. (Don't
forget that memory bandwidth will be the limiting factor for
this type of loop.) The Intel's stack architecture may make
optimizing this somewhat more difficult, but it should still be
possible. You'll end up with more instructions, but less memory
accesses and better run time.
But of course, you can't do this in C++, because dest[ i -1 ]
might modify one of the values you're holding in register. (Of
course, some compilers do generate two versions of the loop,
with code which checks for aliasing first, and uses one or the
other, depending on whether aliasing is present or not. But
this isn't the usual case.)
--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orient=E9e objet/
Beratung in objektorientierter Datenverarbeitung
9 place S=E9mard, 78210 St.-Cyr-l'=C9cole, France, +33 (0)1 30 23 00 34