Re: Refactoring question
On Dec 16, 8:15 pm, Brian <c...@mailvault.com> wrote:
On Dec 16, 3:12 am, James Kanze <james.ka...@gmail.com> wrote:
On 15 Dec, 23:21, Brian <c...@mailvault.com> wrote:
[...]
Which do you want: to ensure that both bytes are in the same
buffer, or that the buffer is as full as possible?
I guess adding a Reserve function would be one way to address
this. I'm not sure the buffering has to be uniform, but perhaps a
Reserve function would be useful in avoiding the overflow check
on each byte.
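(Roughly, such a Reserve might look like the sketch below. It reuses the
buf_/bufsize_/index_/SendStoredData members from the Receive code further
down; the exact interface, and the assumption that SendStoredData resets
index_ to 0, are only guesses here.)

    // Sketch only: make sure at least n bytes of space are free,
    // so that subsequent appends can skip the per-byte overflow check.
    void
    Reserve(unsigned int n)
    {
        if (n > bufsize_ - index_) {
            SendStoredData();   // assumed to flush buf_[0..index_) and reset index_ to 0
        }
        // only sufficient if n <= bufsize_
    }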
Or modify the "Receive" function to check after each byte (or to
copy all that fits, then a second copy for what's left). Or specify
clearly that whether buffers are full or not isn't specified.
I just tried a version of Receive that copies all that fits and then
does a second copy for the balance. The lines marked with a plus sign
are new and are the only thing that changed.
void
Receive(void const* data, unsigned int dlen)
{
    if (dlen > bufsize_ - index_) {
        memcpy(buf_ + index_, data, bufsize_ - index_);               // + top off the current buffer
        data = static_cast<char const*>(data) + (bufsize_ - index_);  // + (void* arithmetic isn't standard C++)
        dlen -= bufsize_ - index_; index_ = bufsize_;                 // + note the buffer is now full
        SendStoredData();   // assumed to write buf_[0..index_) and reset index_ to 0
        if (dlen > bufsize_) {
            PersistentWrite(data, dlen);
            return;
        }
    }
    memcpy(buf_ + index_, data, dlen);
    index_ += dlen;
}
The resulting executable is just 200 bytes more, but the time
is over 30% slower than without the change.
That doesn't sound right. The difference is far too big.
But the real difference would be downstream: by filling every buffer
to the maximum, you need fewer buffers, which means that downstream,
there are fewer buffers to handle.
I'm using a buffer of size 4096, and the only things going into the
buffer are 4-byte integers. I also tried it with this:
if (bufsize_ - index_ > 0) {
}
around the three added lines, but that didn't help. I find this
result disappointing as philosophically I could persuade myself that
always filling up buffers makes sense.
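(Presumably the guarded variant just wrapped the added lines, something
like the sketch below; the guard only skips the zero-length memcpy when
the buffer happens to be exactly full.)

    if (bufsize_ - index_ > 0) {
        memcpy(buf_ + index_, data, bufsize_ - index_);               // +
        data = static_cast<char const*>(data) + (bufsize_ - index_);  // +
        dlen -= bufsize_ - index_; index_ = bufsize_;                 // +
    }
    SendStoredData();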
Perhaps I'll have one configuration for files and TCP and another for
UDP. Asking files and TCP to pay for making UDP happy is
unreasonable.
I'm fairly sure that you're worrying about the wrong things, and that
the difference won't be significant in a real application.
The second form is needed for correctness; both forms put the
same amount of information onto the stream, and the ints themselves
are in the same sequence (though with a different byte order) in either
case. In "Effective TCP/IP Programming" it says, "There is no
such thing as a 'packet' for a TCP application. An application
with a design that depends in any way on how TCP packetizes the
data needs to be rethought."
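(To illustrate that quote with a sketch, assuming POSIX sockets, which the
code in this thread doesn't actually use: a reader on a TCP stream has to
loop until it has all 4 bytes of an integer, because the sender's write and
segment boundaries aren't visible to it.)

    #include <cstdint>
    #include <cstring>
    #include <sys/socket.h>
    #include <arpa/inet.h>

    // Hypothetical helper: accumulate exactly 4 bytes from a TCP socket,
    // however the sender's output happened to be packetized.
    bool readUint32(int sock, std::uint32_t& value)
    {
        unsigned char b[4];
        std::size_t got = 0;
        while (got < sizeof b) {
            ssize_t n = recv(sock, b + got, sizeof b - got, 0);
            if (n <= 0) return false;           // peer closed or error
            got += static_cast<std::size_t>(n);
        }
        std::uint32_t net;
        std::memcpy(&net, b, sizeof net);
        value = ntohl(net);                     // big-endian on the wire to host order
        return true;
    }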
That's TCP. Applications don't talk to one another in TCP; they use
some higher level protocol. (And of course, they may also use other
protocols, like UDP, for the lower level.)
I want to support UDP, but it shouldn't be allowed to have such a
big role that it hinders the performance of other protocols.
At the application level, it's neither TCP nor UDP, but an application
level protocol.
[...]
I think you have to define a higher level protocol to begin
with. (And although it's possible, and I've seen at least some
formats which do so, I'm not convinced that there's any
advantage in supporting different representations.)
I've no idea what you mean by that last sentence.
The impression I have here (but I don't see the entire
context---only what you've posted) is that you're putting the cart
before the horse. Before writing a single line of code, you should
specify the protocol, exactly.
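(As a purely invented example of what "specified exactly" can mean,
independent of TCP or UDP: say every message is a 4-byte big-endian length
followed by that many bytes of payload. The sending side then reduces to
something like this sketch.)

    #include <cstdint>
    #include <vector>

    // Invented framing: [4-byte big-endian payload length][payload bytes]
    std::vector<unsigned char> frame(std::vector<unsigned char> const& payload)
    {
        std::uint32_t n = static_cast<std::uint32_t>(payload.size());
        std::vector<unsigned char> msg;
        msg.push_back(static_cast<unsigned char>(n >> 24));
        msg.push_back(static_cast<unsigned char>(n >> 16));
        msg.push_back(static_cast<unsigned char>(n >> 8));
        msg.push_back(static_cast<unsigned char>(n));
        msg.insert(msg.end(), payload.begin(), payload.end());
        return msg;
    }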
I'm not that rigid (or omniscient). I like to have some fun once in a
while. Anyway, I'm happy with my approach to things. It doesn't make
perfect sense I guess, but eventually I get to a good place.
If it's for fun, or even more or less a learning experience, fine.
--
James Kanze