Poor, Edmund W wrote:
As a programmer with well over a quarter century of
experience, I
applaud this effort by Nick to identify and slay the "dragon" of
N-squared performance.
As a computer science student in merely my first year, I apologise if
any of the following may come across as condescending or patronising,
but I feel that I should point out a few fallacies and inaccuracies in
your posting.
When a process on a file takes time proportion to the
file size, we call
that N.
Firstly, you mean "linear", not "proportional" (nor do you mean
*"proportion"). (Actually, technically, you mean "asymptotically
linear", but I won't go into the difference now.) The correct notation
for this is O(n) (normally pronounced "order N"), with n being the input
size. See [[Big O notation]].
When time is proportion to log N, we call that log N
performance
(quicksort, which uses a partition scheme and then sorts the partitions,
has log N performance).
You are talking about O(n log n), not O(log n). In-place sort algorithms
cannot be faster than O(n log n); I could prove this right here, but I
can't be bothered.
[[Heapsort]] and [[Merge Sort]] are well-known examples of O(n long n)
sort algorithms. [[Quick Sort]], however, is O(n^2) (due to its
unfortunate worst-case scenario).
A process whose elapsed time grows in proportion to
the SQUARE of the
file size is out of control! It's called an N-squared or N^2 process,
and is to be avoided like the plague.
There are numerous situations where an O(n^2) algorithm is appropriate
to use. Quicksort, for example, is often applied where the worst-case
scenario is guaranteed to not occur. Sometimes there may not even be a
better algorithm.
You also appear to suggest that O(n^2) is the worst you can get;
Warshall's algorithm (which computes the transitive closure of a
directed graph) is O(n^3), and I think this is the fastest we can get
for that.
Insertion sort and Bubble sort are
examples, and have long been shunned for all but small numbers of items.
Insertion sort is *extremely* fast in situations where the data is
"almost sorted" and you are pretty sure the worst case never occurs (you
can get an average of O(n) operation). It has often been used to sort
the polygons you wish to render in a 3D scene by their distance from the
viewer; when you move the camera only slightly, these distances only get
out of order a little bit. (Of course, by now there are much more clever
algorithms for this sort of thing; just today we learnt one which
doesn't even require sorting the polygons.)
Nick, if you and Magnus can pull this off, to
Brion's satisfaction, you
will gain the everlasting gratitude of the entire Wikipedia community.
More power to you!
I completely agree. :-)
Greetings,
Timwi