[HipHop] starting import

Paul Tarjan pt at fb.com
Thu Jun 13 22:54:22 UTC 2013



On 6/13/13 2:49 PM, "Chad" <innocentkiller at gmail.com> wrote:

>On Thu, Jun 13, 2013 at 10:49 AM, Chad <innocentkiller at gmail.com> wrote:
>> On Thu, Jun 13, 2013 at 2:09 AM, Paul Tarjan <pt at fb.com> wrote:
>>> I've started an import of the database dump. I had to move the db from
>>> /var/lib/mysql to /home/mysql since we only have 10 gigs of storage on /.
>>> The import seems like it will take 3.5 days at 112 pages per second with
>>> 30M pages, if it stays steady.
>>>
>>> Here is the command I'm running:
>>>
>>> bzcat ~/dump/enwiki-20130503-pages-articles.xml.bz2 | perl ~/bin/mwimport \
>>>   | mysql -f -u<user> -p<pass> --default-character-set=utf8 my_wiki
>>>
>>
>> I think enwiki is a little bigger than we need for this round of
>> testing. I was thinking of just doing something like simplewiki or
>> mediawiki.org content.
>>
>> Also, /home is shared storage across instances, so the performance
>> probably isn't as good as we'd get on /.
>>
>> I'm going to stop this import, flush all the data from the wiki, and
>> do the import of simplewiki *today*.
>>
>
>Well, the import seems to still be running fine. I'll just let it finish.

The last line printed was

  4549000 pages ( 72.942/s),   4549000 revisions ( 72.942/s) in 62365 seconds

So it has slowed down quite a bit from the initial 112 pages per second. At
~73 pages/s, the remaining ~25.5M pages put the ETA at 4 more days now.
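Back-of-the-envelope, assuming the ~30M page total mentioned earlier in the
thread and that the current rate holds:

  echo '(30000000 - 4549000) / 72.942 / 86400' | bc -l   # ~4.04 days remaining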


Can we try to get it running with a partial import? Are the tables that
control the look and feel at the end of the dump? I can easily stop the
import and restart it from the point where it stopped.
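One untested sketch for resuming: re-run the same pipeline, but have awk drop
the pages that were already imported before mwimport sees them, taking the
skip count straight from the last progress line above. This assumes each
<page> element starts on its own line (article text is XML-escaped in the
dump, so a literal <page> should only be markup) and that mwimport just needs
the <siteinfo> header plus the remaining pages; the existing mysql -f should
keep going past duplicate-key errors from any overlap:

  bzcat ~/dump/enwiki-20130503-pages-articles.xml.bz2 \
    | awk -v skip=4549000 '
        /<page>/ { n++ }              # count each page as it starts
        n == 0 || n > skip { print }  # keep the header, then pages past the skip point
      ' \
    | perl ~/bin/mwimport \
    | mysql -f -u<user> -p<pass> --default-character-set=utf8 my_wiki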

Paul



