Okay, I think I've managed to semi-automate this now. In the future,
updates should appear within a day of new backup dumps going up.
http://www.wikipedia.org/wikistats/EN/Sitemap.htm
This is generated from the backup dumps by Erik Zachte's processing
scripts:
http://members.chello.nl/epzachte/Wikipedia/Statistics/
(To change the language of the page, change the "EN" in the URL to "NL",
etc. Most languages are not translated yet; contact Erik for info. All
versions should show the same data set.)
Erik: I made some slight tweaks to support filtering the data through
bzip2 as it reads; I don't know if that will work on Windows, though it
ought to. Patch attached.
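
For illustration, the bzip2 filtering can be sketched on its own roughly as follows. This is a minimal sketch, not the code in the attached patch: the helper name open_dump and the lexical filehandle are hypothetical, and it assumes a bzip2 executable on the PATH. It uses "or" rather than "||" after open, since "||" binds more tightly than the comma and would apply to the file name argument instead of to the result of open. The list form of the pipe open avoids shell quoting of the file name, but may not be available on older Windows builds of Perl.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Erik's script): open a dump file for
# reading, decompressing through an external bzip2 process when the file
# name ends in ".bz2".
sub open_dump {
    my ($file_in) = @_;
    my $fh;
    if ($file_in =~ /\.bz2$/) {
        # List-form pipe open: bzip2 is run directly, without a shell,
        # so the file name needs no extra quoting.
        open($fh, '-|', 'bzip2', '-dc', $file_in)
            or die "Input file $file_in could not be opened: $!\n";
    } else {
        open($fh, '<', $file_in)
            or die "Input file $file_in could not be opened: $!\n";
    }
    return $fh;
}

# Usage: count the lines of a dump named on the command line, if any.
if (@ARGV) {
    my $fh = open_dump($ARGV[0]);
    my $count = 0;
    $count++ while <$fh>;
    # For a pipe, close reports a non-zero exit status from bzip2
    # (for example when the compressed file is missing or corrupt).
    close($fh) or warn "Reading $ARGV[0] did not finish cleanly\n";
    print "$count lines\n";
}
```

Note that with a pipe, a missing or damaged .bz2 file typically shows up when reading or closing the handle rather than at open time, so checking the result of close is worthwhile.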
-- brion vibber (brion @ pobox.com)
51,52c51,54
< $file_in_old = $path_in . $dumpdate . "_old_table.sql" ;
< $file_in_cur = $path_in . $dumpdate . "_cur_table.sql" ;
---
> #$file_in_old = $path_in . $dumpdate . "_old_table.sql" ;
> #$file_in_cur = $path_in . $dumpdate . "_cur_table.sql" ;
> $file_in_old = $path_in . $language . "/old_table.sql.bz2" ;
> $file_in_cur = $path_in . $language . "/cur_table.sql.bz2" ;
160c162,166
< open "FILE_IN", "<", $file_in || abort ("Input file " . $file_in . " could not be opened.") ;
---
> if($file_in =~ /.bz2$/) {
> open "FILE_IN", "-|", "bzip2 -dc \"$file_in\"" || abort ("Input file " . $file_in . " could not be opened.") ;
> } else {
> open "FILE_IN", "<", $file_in || abort ("Input file " . $file_in . " could not be opened.") ;
> }