Re: Parse binary stream to long ints

From:

James Kanze <james.kanze@gmail.com>

Newsgroups:

comp.lang.c++

Date:

Wed, 27 Feb 2008 06:33:33 -0800 (PST)

Message-ID:

<8c8d6521-1e25-4e4b-a609-1eb4b99144f0@n77g2000hse.googlegroups.com>

On Feb 27, 5:23 am, Micah Cowan <mi...@cowan.name> wrote:

John Brawley wrote:

Apparently I'm trying to go backwards in computer language
development: the language (C++ for ex.) goes to great
lengths to _hide_ what I really want, from the coder.

Well, the main reason that C++ hides such things is that it
wants to continue to support platforms for which it may not be
able to guarantee these things.

There's that, and the fact that they're just a distraction to
the programmer much of the time. One of the peculiarities of
C++, however, is that it makes it a point of honor to allow you
to access the lowest levels when appropriate. Such code won't
necessarily be portable, of course, because such low level
abstractions do vary between hardware. But nothing says that
C++ can only be used for 100% portable applications.

Thanks for the response.
If anyone knows a simple way to get a linear series of
random bits from a disk file, reading those bits as a
number, I'd appreciate knowing about it...

For my part, I probably wouldn't be worrying a whole hell of a
lot about portability in this case, and would simply read them
directly into ints (or whatnot). This is made much easier by
the fact that, in this case, byte ordering is irrelevant
(unless you want the same input to parse the same way on
various implementations).

If he's using a truly random source, there's no way he could
tell, since he can't get the same input on two different
implementations. About the only thing that might cause problems
is different formats for integral types. He can minimize this
by using unsigned integral types (most of the differences
concern representation of negative numbers), but at least one
implementation (using 48-bit signed-magnitude ints) required 8
bits of an integral value to be 0, or it treated the value as
floating point. (I don't know if it ever had a C++
implementation, or even a C, but it would have been interesting;
there was no hardware support for unsigned.)

You'd of course be using
istream::read() rather than the >> operator.

The usual way to do more portable reads (though usually used
for values where it actually matters what format you read it
in) is to read it in as a series of bytes, and construct the
int therefrom, perhaps via a series of bitshifts, so that you
can remain ignorant of the host byte ordering.

The C++ FAQ Lite has a lot to say about
serialization: http://www.parashift.com/c++-faq-lite/serialization.html...
probably much more than you need for this, but very useful
info at any rate.

The problem with most naïve deserialization schemes is that they
introduce a certain randomness (e.g. due to byte order, etc.).
In his case, I hardly think that that can be considered a real
problem---the results won't be any more random than the original
data.

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34