Re: Why is java consumer/producer so much faster than C++
 
On Mon, 23 Jul 2012 06:57:03 -0700 (PDT)
Zoltan Juhasz <zoltan.juhasz@gmail.com> wrote:
On Sunday, 22 July 2012 17:59:20 UTC-4, Melzzzzz  wrote:
I have played little bit with new C++11 features and compared,
java performance to c++.
My take on this:
I guess the Java BlockingQueue is a lock-free implementation 
of the blocking circular-buffer concept. Your implementation 
uses a global, single lock around operations, the _m mutex, 
which essentially serializes your code (plus the synchronization 
overhead). Your code might be executed concurrently, but not in 
parallel for sure.
This is implementation of put/take methods of Java queue:
public void put(E e) throws InterruptedException {
 244:         if (e == null) throw new NullPointerException();
 245:         final E[] items = this.items;
 246:         final ReentrantLock lock = this.lock;
 247:         lock.lockInterruptibly();
 248:         try {
 249:             try {
 250:                 while (count == items.length)
 251:                     notFull.await();
 252:             } catch (InterruptedException ie) {
 253:                 notFull.signal(); // propagate to non-interrupted
thread 254:                 throw ie;
 255:             }
 256:             insert(e);
 257:         } finally {
 258:             lock.unlock();
 259:         }
 260:     }
310:     public E take() throws InterruptedException {
 311:         final ReentrantLock lock = this.lock;
 312:         lock.lockInterruptibly();
 313:         try {
 314:             try {
 315:                 while (count == 0)
 316:                     notEmpty.await();
 317:             } catch (InterruptedException ie) {
 318:                 notEmpty.signal(); // propagate to non-interrupted thread
 319:                 throw ie;
 320:             }
 321:             E x = extract();
 322:             return x;
 323:         } finally {
 324:             lock.unlock();
 325:         }
 326:     }
 327: 
What is wrong with your code? That you compare a naive, 
inefficient blocking queue implementation, with an industrial 
strength solution. 
Heh, I don't think so. This implementation is pretty classic.
It does well for blocking queue, but this test is too
much for it. It gets contented with mutex/condition_variable
calls, so does Java, but for some reason, much less.
- look up a lock-free, single producer / single consumer 
circular-buffer implementation
Java ArrayBlockingQueue use circular buffer but is
not lock free, rather implementation is identical,
as I see (one mutex, two condition variables)
- use rvalue semantics to move items around, no need for 
ptrs; at very least specialize the container for built-in types
- eliminate noise (i.e. rand)
The most important part: profile your code to find out 
where you spend your time, whether you have some kind of 
contention etc. Performance optimization is a tricky business, 
especially in a multi-threaded environment, naive 
implementations usually produce catastrophic results (see above).
Agreed. I think that I am close. Somehow (probably because it
is faster? c++ puts much more pressure on queue than Java does).
Perhaps someone can confirm or deny this.