Re: [Algorithm] Sum of Primes < 1000000

From:

Patricia Shanahan <pats@acm.org>

Newsgroups:

comp.lang.java.programmer

Date:

Mon, 15 Jan 2007 16:04:17 GMT

Message-ID:

<5CNqh.10265$pQ3.4386@newsread4.news.pas.earthlink.net>

Philipp wrote:

Lew wrote:

Patricia Shanahan wrote:

This, I think, is the big message from this thread. Simply implementing
a very well-known algorithm changed the performance from somewhere
between 1 minute and 10 minutes to a fraction of a second. Tweaking the
code got a further factor of about 3.

It is very, very unlikely that any amount of micro-optimization applied
to the original algorithm could have got the performance of a simple,
direct sieve implementation.

Some years back I tried to make this point to my project manager on a
contract I was working. The optimization there was of a database
structure, where the difference for one report was between O(n^2) and
O(n). I was forbidden to make the algorithmic optimization, then got
in trouble because the report was too slow once they loaded all the
data. It took months and lawyers to straighten that one out.

I don't know why so many project managers seem to ignore this kind of
reasoning. Most are not as egregious, but the problem remains. Not
only do we as engineers need to know about the big O, but we have a
fiduciary responsibility to communicate such reasoning to our
customers (including employers).

Big O is just how the system scales, not the absolute running time. If
you have an amount of entries which will not vary by a factor of 10 in a
close future, it would be wise to check if the O(n^2) is not much faster
than the O(n) algorithm on these entries...

I did that once, and created a lot of trouble. It was a system for
managing and applying patch cards when booting an operating system. I
was told that there would never be more than order tens of cards at a
time, before the OS would be recompiled with source code changes.

I did a very simple implementation that I knew was O(n^2), but that I
had tested for a couple of hundred cards, several times the number
required by the specification, and at that size it ran about as fast as
the card reader.

Over the years, the number of cards increased to about a thousand, and
it became a real bottleneck. I tried to explain the issue, and get
permission to rework the code to use e.g. a tree structure or a hash
table instead of a linear search...

A lot depends on your confidence that reimplementation will be permitted
if the problem outgrows the O(n^n) solution.

Patricia