Re: Reading from a socket: first characters, then octets
On 2009-07-15 08:39:00 -0400, ram@zedat.fu-berlin.de (Stefan Ram) said:
When a client sends an HTTP PUT request to my web server,
I start to read characters from the socket:
this.br =
new java.io.BufferedReader
( new java.io.InputStreamReader
( socket.getInputStream(), "ANSI_X3.4-1968" ));
What about
this.s = new java.io.BufferedInputStream(socket.getInputStream());
this.r = new java.io.InputStreamReader(s, encoding);
But sometimes, after the initial text, the socket will
change to emit binary data (octets), that is, it will
send me the actual data of the PUT request during the
same transmission (TCP session).
The read()-method of the InputStreamReader will give
me a converted character. But I need unconverted octets
from this point on. With the above code, Java will
convert some octets and modify their values, so that
the binary data will become corrupted.
When I then start to use the read()-method of the
socket.getInputStream(), I might miss some octets which were
already read by the InputStreamReader or the BufferedReader.
But I also want to have buffering, because I/O usually is
slower without buffering.
By doing the buffering before conversion, you're guaranteed that only
the bytes you've already read as characters will have been consumed
from the buffer when you begin reading bytes, rather than those bytes
plus up to a whole buffer page. You could, for example:
int c;
while ((c = r.read()) != -1 && c != '.') ; // consume up to the first dot, or stop at end of stream
readSomeBytes(s); // consume some binary data
and be assured that readSomeBytes would pick up at the first byte after
the code unit that was read as a '.' character.
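For what it's worth, here is a minimal sketch of a more conservative variant. The InputStreamReader documentation notes that the reader may read ahead more bytes from the underlying stream than the current read needs, so this sketch leaves the Reader out and decodes the ASCII header lines by hand from the same BufferedInputStream. The class name HeaderThenBody and the helpers readAsciiLine and readBody are made up for illustration, and the canned request in main is only an assumed stand-in for socket.getInputStream().

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class HeaderThenBody {
    private static final String CONTENT_LENGTH = "Content-Length:";

    // Reads octets up to and including the next LF and decodes them as US-ASCII.
    // A trailing CR is dropped. Returns null if the stream ends before any octet is read.
    public static String readAsciiLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            line.write(b);
        }
        if (b == -1 && line.size() == 0) {
            return null;
        }
        byte[] raw = line.toByteArray();
        int len = raw.length;
        if (len > 0 && raw[len - 1] == '\r') {
            len--;
        }
        return new String(raw, 0, len, StandardCharsets.US_ASCII);
    }

    // Reads exactly n unconverted octets, or throws if the stream ends first.
    public static byte[] readBody(InputStream in, int n) throws IOException {
        byte[] body = new byte[n];
        int off = 0;
        while (off < n) {
            int got = in.read(body, off, n - off);
            if (got == -1) {
                throw new EOFException("stream ended after " + off + " of " + n + " octets");
            }
            off += got;
        }
        return body;
    }

    public static void main(String[] args) throws IOException {
        // Canned request: ASCII header, blank line, then three octets that are not valid ASCII text.
        byte[] payload = { (byte) 0x00, (byte) 0xE9, (byte) 0xFF };
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        wire.write("PUT /x HTTP/1.1\r\nContent-Length: 3\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
        wire.write(payload);

        // One buffered stream serves both the text part and the binary part.
        InputStream s = new BufferedInputStream(new ByteArrayInputStream(wire.toByteArray()));

        int length = 0;
        String line;
        while ((line = readAsciiLine(s)) != null && !line.isEmpty()) {
            if (line.regionMatches(true, 0, CONTENT_LENGTH, 0, CONTENT_LENGTH.length())) {
                length = Integer.parseInt(line.substring(CONTENT_LENGTH.length()).trim());
            }
        }

        byte[] body = readBody(s, length);
        System.out.println(Arrays.equals(body, payload)); // prints true
    }
}
```

Since every octet passes through the one BufferedInputStream, the I/O stays buffered, and the only octets consumed before the body are the ones that were actually returned as header text.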
-o