Saqib Kadri wrote:
download.wikimedia.org says enwiki 20070908 has 10,218,632 pages, and
when I run mwdumper it says it inserted 5,654,236 pages into the
database. But MySQL is only showing about 2.6 million rows. I ran
mwdumper twice. The first run gave 2,639,569/2,678,371/2,615,000 rows
for the page/revision/text tables, respectively. The second run gave
2,583,864/2,510,365/2,615,000 rows.
Which number of rows is correct - 10 million, 5 million, or 2.6 million?
Did something go wrong with the DB insert? Note that this is on a Linux
machine with MySQL 4.1.22, and there is plenty of space on the hard drive.
Thanks.
The 10,218,632 number includes redirects, if that helps. There must be
more than 2.6 million pages by now, as articles alone account for 2
million, so my guess is that 5,654,236 is the number of non-redirect pages.
None of which explains why there are only half as many rows in the
tables, of course.
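
One way to check that guess would be to count the pages in the dump
itself, independently of mwdumper and MySQL. Below is a minimal sketch
in Python, not anything mwdumper does. It assumes the dump is the usual
MediaWiki XML export with <page> elements, and that a redirect can be
recognised either by a <redirect> child element or by page text starting
with "#REDIRECT"; both rules are assumptions about the dump, and
count_pages and SAMPLE are made-up names for illustration.

```python
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def count_pages(stream):
    """Return (total_pages, redirect_pages) for a MediaWiki XML export.

    Assumption: a page is a redirect if it has a <redirect> child or if
    its text begins with '#REDIRECT' (case-insensitive).
    """
    total = 0
    redirects = 0
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event != "end" or local(elem.tag) != "page":
            continue
        total += 1
        is_redirect = False
        for child in elem.iter():
            name = local(child.tag)
            if name == "redirect":
                is_redirect = True
            elif name == "text":
                text = (child.text or "").lstrip().upper()
                if text.startswith("#REDIRECT"):
                    is_redirect = True
        if is_redirect:
            redirects += 1
        root.clear()  # drop finished pages so memory use stays flat
    return total, redirects


# Tiny made-up sample standing in for the real dump.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">
  <page><title>A</title><revision><text>Some article.</text></revision></page>
  <page><title>B</title><revision><text>#REDIRECT [[A]]</text></revision></page>
  <page><title>C</title><revision><text>Another article.</text></revision></page>
</mediawiki>"""

total, redirects = count_pages(io.BytesIO(SAMPLE))
print(total, redirects, total - redirects)
```

For the real dump you would pass a file object opened with
bz2.open(path, "rb") instead of the sample. If the guess is right, the
redirect count should come out near 10,218,632 - 5,654,236 = 4,564,396.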
-Gurch