Re: Java vs C++ speed (IO & Revisited Again)
Razii wrote:
On Sat, 22 Mar 2008 23:40:56 +0100, "Bo Persson" <bop@gmb.dk> wrote:
Yes, obviously the default file buffering is not optimal. Let us
"fix" that:
char Cache[150000000];
int main()
{
    std::ifstream src("bible.txt");
    std::ofstream dst("output.txt");
    dst.rdbuf()->pubsetbuf(Cache, sizeof Cache);
    clock_t start=clock();
    //etc.
For a 119 MB file, I got times like
C:\>CopyFile
Time for reading and writing file: 3750 ms
C:\>CopyFile
Time for reading and writing file: 3718 ms
C:\>CopyFile
Time for reading and writing file: 3703 ms
C:\>CopyFile
Time for reading and writing file: 3703 ms
C:\>CopyFile
Time for reading and writing file: 3766 ms
For Java it was
Time for reading and writing files: 2219 ms (java)
Time for reading and writing files: 2156 ms (java)
Time for reading and writing files: 2250 ms (java)
Time for reading and writing files: 2453 ms (java)
The compiler options were C:\>cl /O2 CopyFile.cpp
Why the difference?
I used this
#include <ctime>
#include <fstream>
#include <iostream>
char Cache[150000000];
int main(int argc,char *argv[])
{
    std::ifstream src("bible3.txt");
    std::ofstream dst("output.txt");
    clock_t start=clock();
    dst.rdbuf()->pubsetbuf(Cache, sizeof Cache);
    dst << src.rdbuf();
    clock_t endt=clock();
    std::cout << "Time for reading and writing file: "
              << double(endt-start)/CLOCKS_PER_SEC * 1000 << " ms\n";
    return 0;
}
Is that what you meant?
No, now you removed the read buffer. :-)
#include <ctime>
#include <fstream>
#include <iostream>
char Cache[150000000];
int main()
{
    std::ifstream src("bible.txt");
    std::ofstream dst("output.txt");
    dst.rdbuf()->pubsetbuf(Cache, sizeof Cache);
    clock_t start=clock();
    // dst << src.rdbuf();
    while(src.good())
    {
        char Buffer[1000];
        src.read(Buffer, sizeof Buffer);
        dst.write(Buffer, src.gcount());
    }
    clock_t endt=clock();
    std::cout << "Time for reading and writing file: "
              << double(endt-start)/CLOCKS_PER_SEC * 1000 << " ms\n";
    return 0;
}
That gets me about 800 ms on my machine. It turns into 5800 ms if I
add a dst.close() before the final clock() call. It is totally
I/O-bound and has nothing to do with the languages involved. If you
have a faster hard disk, I bet you will get 700 instead of 800 ms.
The problem with
// dst << src.rdbuf();
is that it reads the file character by character, checking for EOF
each time (at least with the standard library used here). The read()
and write() functions transfer whole blocks instead.
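To make the block-wise idea concrete, here is a minimal sketch of the same read()/gcount()/write() loop as a reusable function. The names copy_in_blocks and copy_string are hypothetical helpers made up for illustration, not anything from the programs in this thread, and the default block size is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): copy src to dst in
// fixed-size blocks and return the number of bytes copied.
std::size_t copy_in_blocks(std::istream& src, std::ostream& dst,
                           std::size_t block_size = 4096)
{
    assert(block_size > 0);  // precondition: an empty block makes no progress
    std::vector<char> block(block_size);
    std::size_t total = 0;
    // read() sets failbit on the short final block, so gcount() is
    // checked as well to avoid dropping the tail of the input.
    while (src.read(block.data(),
                    static_cast<std::streamsize>(block.size()))
           || src.gcount() > 0)
    {
        dst.write(block.data(), src.gcount());
        total += static_cast<std::size_t>(src.gcount());
    }
    return total;
}

// In-memory demonstration (also hypothetical): pass a string
// through the helper using string streams instead of files.
std::string copy_string(const std::string& text, std::size_t block_size)
{
    std::istringstream in(text);
    std::ostringstream out;
    copy_in_blocks(in, out, block_size);
    return out.str();
}
```

The loop condition is slightly different from while(src.good()), but it should copy the same bytes, including a final block that is shorter than block_size.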
So, I shaved 50% off the execution time by using the Buffer and a more
efficient read(). Then I got another 80% reduction by cheating in the
benchmark: without dst.close(), the bulk of the work (flushing the
huge Cache to disk) moves to the destructor of dst, which runs after
the final clock() call. Note that I wrote "fix" in quotes in the
previous message.
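For comparison, a measurement that does not hide the flush could look roughly like the sketch below. It is only an illustration: timed_copy_ms is a hypothetical name, std::chrono::steady_clock and binary mode are my choices rather than what the programs above use, and close() only hands the data to the operating system, so the OS may still be caching it when the clock stops.

```cpp
#include <cassert>
#include <chrono>
#include <fstream>
#include <string>

// Hypothetical helper (not from the thread): copy in_path to out_path
// and return the elapsed wall-clock time in milliseconds, with the
// final flush of the output stream inside the timed region.
double timed_copy_ms(const std::string& in_path, const std::string& out_path)
{
    using clock_type = std::chrono::steady_clock;
    const auto start = clock_type::now();
    {
        // Assumption: binary mode, so no newline translation is timed.
        std::ifstream src(in_path, std::ios::binary);
        std::ofstream dst(out_path, std::ios::binary);
        char buffer[1000];  // same block size as the loop above
        while (src.read(buffer, sizeof buffer) || src.gcount() > 0)
            dst.write(buffer, src.gcount());
        dst.close();  // flush buffered output while the clock is running
    }
    const auto stop = clock_type::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling timed_copy_ms("bible.txt", "output.txt") would then report a figure that includes the work the destructor was doing for free in the 800 ms version.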
Benchmarks are hard.
Bo Persson