Re: converting char to float (reading binary data from file)
On May 30, 6:38 pm, gpderetta <gpdere...@gmail.com> wrote:
On May 28, 10:06 pm, James Kanze <james.ka...@gmail.com> wrote:
On May 28, 12:11 pm, gpderetta <gpdere...@gmail.com> wrote:
On May 28, 10:30 am, James Kanze <james.ka...@gmail.com> wrote:
On May 27, 12:07 pm, gpderetta <gpdere...@gmail.com> wrote:
In particular, the Boost.Serialization binary format is
primarily used by Boost.MPI (which obviously is a wrapper
around MPI) for inter process communication. I think that
the idea is that the MPI layer will take care of
marshaling between peers and thus resolve any
representation difference. I think that in practice most
(but not all) MPI implementations just assume that peers
use the same layout format (i.e. same CPU/compiler/OS) and
just copy bytes back and forth over the network. In a sense the
distributed program is logically a single run of the same
program, even if in practice it consists of different processes
running on different machines, so your observation still
holds.
If the programs are not running on different machines,
what's the point of marshalling? Just put the objects in
shared memory. Marshalling is only necessary if the data is
to be used in a different place or time (networking or
persistency). And a different place or time means a
different machine (sooner or later, in the case of time).
Well, MPI programs run on large clusters of, usually,
homogeneous machines, connected via LAN.
That's original. I don't think I've ever seen a cluster of
machines where every system in the cluster was identical.
I think that for MPI it is common. Some vendors even sell
shrink-wrapped clusters in a box (something like a big closet
with thousands of identical computers-on-a-board, each running
the same OS image). Even custom-built MPI clusters are
fairly homogeneous (i.e. at least the same architecture and OS
version).
I think that you work mostly on server-side applications, while
MPI is more common in high performance computing.
I realized that much, but I wasn't aware that it was that common
even on high performance computing. The high performance
computing solutions I've seen have mostly involved a lot of
CPU's using the same memory, so marshalling wasn't an issue.
(But I'm not much of an expert in the domain, and I've not seen
that many systems, so what I've seen doesn't mean much.)
[...]
The real question, however, doesn't concern just the machines.
If all of the machines are running a single executable, loaded
from the same shared disk, it will probably work. If not, then
sooner or later, some of the machines will have different
compiles of the program, which may or may not be binary
compatible. In practice, the old rule always holds: identical
copies aren't. (Remember, binary compatibility can be lost just
by changing options, or using a newer version of the compiler.)
Yep, one needs to be careful, but at least with the compiler I use,
options that change the ABI are explicitly documented as such.
Lucky guy:-). For the most part, what the options actually do
is well documented, and if you understand a bit about what it
means at the hardware level, you can figure out which ones are
safe, and which aren't. But it's far from explicit.
Note that this can be a problem just trying to statically link
libraries; you don't need marshalling at all to get into
trouble. (Or rather: you don't want to have to marshall every
time you pass an std::vector to a function in another module.)
Probably a much bigger problem are differences in third party
libraries between machines (i.e. do not expect the layout of
objects you do not control to stay stable).
That's another problem entirely, and affects linking more than
marshalling. The problem is that compilers may change
representation between versions, etc.
The same program will spawn multiple copies of itself on every
machine in the cluster, and every copy communicates via
message passing. So you have one logical program which is
partitioned on multiple machines. I guess that most MPI
implementations do not bother (in fact I do not even know if
it is required by the standard) to convert messages to a
machine-agnostic format before sending them to another peer.
Well, I don't know much about that context. In my work, we have
a heterogeneous network, with PC's under Windows as clients, and
either PC's under Linux or Sparcs under Solaris as servers (and
high level clients). And that more or less corresponds to what
I've seen elsewhere as well.
Where I work, clusters are composed of hundreds of very
different machines, but all use the same architecture and
exact same OS version (so that we can copy binaries around and
not have to worry about library incompatibilities). We do not
use MPI though, but have an in-house communication framework
which does take care of marshaling in a (mostly) system
agnostic format.
Yes. We do something more or less like this for the clients:
they're all PC's under Windows, and we use a lowest common
denominator which should work for all Windows systems. Our
machines are geographically distributed, however, so
realistically, ensuring exactly the same version of the OS
isn't possible.
For the servers, economic considerations result in a decision to
move from Solaris on Sparc to Linux on PC, at least for all but
the most critical systems. Similarly, economic considerations
mean that the entire fleet won't be upgraded at the same instant.
Are you saying that if a decision comes to upgrade the
architecture, you change all of the machines in a cluster at
once? (But maybe... I can imagine that all of the machines in a
cluster still cost less than one supercomputer. And if you were
using a supercomputer, and wanted to upgrade, you'd change it
all at once. I guess it's just a different mindset.)
--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34