Re: byte stream vs char stream buffer
On 5/11/2014 7:02 AM, Robert Klemme wrote:
> - Reading in 1k chunks from the application level is already close to
> optimal.
Good test suite here all around, Robert. I just wanted to address this
one point quickly. I'm not sure if this is new information or not, but
in actual use I've found that it helps to allocate large buffers when
you're reading a large text file that must be processed quickly.
Sample:
ByteBuffer bb = ByteBuffer.allocateDirect( 64 * 1024 );  // 64k direct buffer
RandomAccessFile f = new RandomAccessFile( "blet", "rw" );
FileChannel fc = f.getChannel();
readToCapacity( fc, bb );   // fill the buffer (static method below)
bb.flip();                  // switch the buffer from filling to draining
ByteBuffer b2 = bb.slice().asReadOnlyBuffer();
Allocating a 64k direct byte buffer seems to improve text-processing
speed by about an order of magnitude: files that take minutes to
process drop to seconds when large direct byte buffers are used. It
seems important to get that first big chunk of bytes in before doing
anything else (like converting to chars).
The input file for this was about 600k, so if you're reading much
smaller files the difference might not show up.
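In case it helps, the char conversion afterward was roughly along these
lines (a sketch from memory, not the original code; the UTF-8 charset
is an assumption on my part):

// continues from the sample above; needs java.nio.CharBuffer and
// java.nio.charset.StandardCharsets
CharBuffer chars = StandardCharsets.UTF_8.decode( b2 );  // one-shot decode of the chunk
// chars is now ready for whatever per-character processing you need

Decoding one big chunk at a time is where the buffer size seems to pay
off, versus converting lots of small reads.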
This is just a sample; I can't seem to find the original, so I hope
it's accurate. Here's the static method used by the code above:
public static void readToCapacity( FileChannel fc, ByteBuffer buf )
    throws IOException
{
    final int capacity = buf.capacity();
    int totalRead = 0;
    int bytesRead = 0;
    // keep reading until the buffer is full or we hit end-of-file
    while( (bytesRead = fc.read( buf )) != -1 ) {
        totalRead += bytesRead;
        if( totalRead == capacity ) break;
    }
}
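And since the test file was around 600k against a 64k buffer, the
processing loop was something like the following (again a sketch, not
the original; it assumes single-byte/ASCII text, since a naive
per-chunk decode could split a multi-byte sequence at a chunk
boundary):

// needs java.io.RandomAccessFile, java.io.IOException, java.nio.ByteBuffer,
// java.nio.CharBuffer, java.nio.channels.FileChannel,
// java.nio.charset.StandardCharsets
ByteBuffer bb = ByteBuffer.allocateDirect( 64 * 1024 );
try( RandomAccessFile f = new RandomAccessFile( "blet", "r" );  // placeholder file name
     FileChannel fc = f.getChannel() ) {
    while( fc.position() < fc.size() ) {
        bb.clear();                  // reset for the next 64k chunk
        readToCapacity( fc, bb );    // fill (or partially fill, at end-of-file)
        bb.flip();
        CharBuffer chars = StandardCharsets.UTF_8.decode( bb );
        // ... process chars ...
    }
}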