On Mon, 2003-03-31 at 02:07, Andre Engels wrote:
> If I remember correctly, there are 1000 pages being selected once a
> day or so, which are then cycled through. So once you get a
> significant portion of these 1000 pages, you would indeed often be
> getting the same pages again.

No, that's not correct. Random selection is made from the set of all
articles. Each page is assigned a random number. The set of pages is
sorted by that number, and a random index into the sorted set is
selected. The selected page is then assigned a new random number, so it
should not be reselected even if the same random index comes up on a
subsequent random load.
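
For illustration, the scheme above could be sketched roughly like this
in Python. This is not the actual wiki code: the class and method names
are made up for the example, and it sorts an in-memory dict where the
real thing would use the database.

```python
import random


class RandomPagePicker:
    """Hypothetical sketch of the selection scheme described above."""

    def __init__(self, titles, rng=None):
        self.rng = rng or random.Random()
        # Each page is assigned a random number up front.
        self.page_random = {title: self.rng.random() for title in titles}

    def pick(self):
        # Sort the pages by their random number, then pick a random
        # index into that sorted list.
        ordered = sorted(self.page_random, key=self.page_random.get)
        index = self.rng.randrange(len(ordered))
        title = ordered[index]
        # Give the selected page a new random number so that it moves
        # elsewhere in the ordering before the next selection.
        self.page_random[title] = self.rng.random()
        return title
```

Calling pick() repeatedly draws from the whole set of pages each time,
and the reassignment step moves the page that was just returned to a
new position in the ordering.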

If that's not random enough, it may be due to an allegedly defective
RAND() function in MySQL 3.x.

At one time, random selection was made by picking 1000 articles at
random, then picking random indexes from that queue for the next X
random selections (with a 1 in 800 chance of resetting the queue on
each random load). This was abandoned because the queue system was
problematic: a high probability of duplicates, a too-slow queue refill
operation, deleted pages sometimes coming up, the queue not updating on
smaller wikis, etc.
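
To give a rough feel for the duplicate problem with the old queue, here
is a small Python sketch of the usual birthday-problem calculation. It
assumes each selection is an independent, uniform draw from a fixed
queue, which is a simplification of what the old code did; the function
name is made up for the example.

```python
def duplicate_probability(draws, queue_size=1000):
    """Chance that `draws` uniform picks from the queue repeat a page."""
    # Probability that every pick so far has been distinct.
    all_distinct = 1.0
    for already_seen in range(draws):
        all_distinct *= 1.0 - already_seen / queue_size
    return 1.0 - all_distinct
```

Under that assumption the chance of a repeat climbs quickly with the
number of selections, long before the queue is exhausted.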

-- brion vibber (brion @ pobox.com)