On Sun, Jul 14, 2002 at 03:50:20AM +0200, Axel Boldt wrote:
> Some special pages are still moderately slow (particularly "wanted"
> and "random page"), but the real time hogs now are very long pages
> with lots of links.
>
> I looked at the random page code, and right now it fetches a complete
> list of all article IDs, in order to pick one out randomly. There must
> be a better way to do this. This should be an O(log n) operation, not O(n).
It can be. If you do a COUNT(*) without any conditions, MySQL looks it up
in the index, so that's very fast. Then you can pick a random number n
under this upper bound and ask for the nth record with LIMIT. MySQL will
use the index for that, so it should be O(log n). If the record doesn't
satisfy the conditions (not the right namespace), you simply guess again,
but of course it will usually be fine the first time.
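
Roughly, the two queries would look something like this (I'm assuming the
article table is cur with cur_id, cur_namespace and cur_title columns, and
that the random offset r is picked in the application code):

    -- Total row count; with no WHERE clause MySQL can answer this from
    -- its table statistics, so it is effectively free.
    SELECT COUNT(*) FROM cur;

    -- Pick a random r in [0, count) and fetch the r-th record,
    -- ordered by the primary key so the index can be used.
    SELECT cur_id, cur_namespace, cur_title
    FROM cur
    ORDER BY cur_id
    LIMIT r, 1;

    -- If cur_namespace is not the article namespace, pick a new r
    -- and run the second query again.

Only a sketch, of course; the exact table and column names may be off.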
The "wanted" special page will get a lot
faster if we implement
Jan's idea of a table recording the number of broken links to every
unwritten article.
Well, I'm back from my short trip, so if Lee tells me how I can get
started, I will. Actually, after that, if everybody agrees, I may begin
to work on the TeX extension.
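
For the "wanted" page, the counts table could be something as simple as
this (table and column names are just what I'd pick, nothing official):

    -- One row per unwritten article that something links to.
    CREATE TABLE wantedcount (
        wc_title  VARCHAR(255) NOT NULL,  -- title of the missing article
        wc_count  INT NOT NULL,           -- number of pages linking to it
        PRIMARY KEY (wc_title),
        INDEX (wc_count)
    );

    -- The special page then reduces to one cheap, index-backed query:
    SELECT wc_title, wc_count
    FROM wantedcount
    ORDER BY wc_count DESC
    LIMIT 50;

On save we would only have to increment or decrement the counts for titles
whose broken-link status actually changed, instead of scanning all the
links every time somebody views the page.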
> If long pages with lots of links cause trouble, maybe we should revive
> the caching idea of the current code: the cur table gets another
> column cur_cache where we store the rendered HTML. When displaying an
> article, we simply pump out the contents of cur_cache, or, if cur_cache
> is empty, we render, display and store in cur_cache. If a newly saved
> article necessitates the updating of links, we junk the cur_cache of
> all affected articles.
A good idea.
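
If I understand it correctly, it would amount to something like this (the
column type and the exact invalidation query are just my guess at how it
might be done):

    -- Extra column on cur holding the rendered HTML.
    ALTER TABLE cur ADD COLUMN cur_cache MEDIUMTEXT;

    -- On view: if cur_cache is non-empty, send it out as-is; otherwise
    -- render the wikitext, send it, and store the result:
    UPDATE cur SET cur_cache = '...rendered html...' WHERE cur_id = 123;

    -- On a save that turns links live (or breaks them), junk the cache
    -- of every affected article:
    UPDATE cur SET cur_cache = '' WHERE cur_id IN (17, 42, 123);

The ids and the HTML are of course placeholders filled in by the software.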
-- Jan Hidders