I am not familiar with the PHP code, so this may already be done:
In the English Wikipedia 9.5% of links (362K out of 3800K total) consist
of references to years between 0 and 2000.
Therefore two lines of code might save quite a bit of link checking.
if link is 3 or 4 digits and numeric value < 2005 then found = true
Perl version would look something like this (PHP will be similar):
if (($language eq "en") && ($link =~ /^\d{3,4}$/) && ($link < 2005))
{ $found = 1 ; }
----
Checking if "|$link|" is found in
"|Race (US Census)|Square mile|Census|Population density|United States
Census Bureau|Asia|Geographic references|Native American|African
American|Hispanic|Latino|United States|"
would save a further 10% or more of checks (390K),
assuming all articles are viewed equally often, which of course is not
true.
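A minimal Perl sketch of the fixed-list check described above (the PHP version should look similar). The names @common_titles, $known and link_known are hypothetical and only illustrate the idea; the titles are the ones listed in this message.

```perl
use strict;
use warnings;

# Hypothetical list of very frequently linked titles (taken from this message).
my @common_titles = (
    "Race (US Census)", "Square mile", "Census", "Population density",
    "United States Census Bureau", "Asia", "Geographic references",
    "Native American", "African American", "Hispanic", "Latino",
    "United States",
);

# Build the "|title|title|...|" string once, outside the per-link loop.
my $known = "|" . join("|", @common_titles) . "|";

# Returns 1 if the link is in the fixed list (no further lookup needed), else 0.
# Wrapping the link in "|" ensures only whole titles match, not substrings.
sub link_known {
    my ($link) = @_;
    return index($known, "|$link|") >= 0 ? 1 : 0;
}

print link_known("Census"), "\n";         # a whole title from the list
print link_known("Census Bureau"), "\n";  # only part of a listed title
```

A hash keyed on the titles would work equally well; the delimited string is shown only because it mirrors the "|$link|" test above.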
Erik Zachte