Hello again,

In trying to retrieve the images for the Hebrew Wikipedia ZIM I'm making, I tried running Emmanuel's script mirrorMediawikiPages.pl.  My command line was this:

./mirrorMediawikiPages.pl --sourceHost=he.wikipedia.org --destinationHost=localhost --useIncompletePagesAsInput --sourcePath=w

After working for more than 20 hours, and still in the stage of populating the @pages array with the incomplete pages, it aborted with "out of memory".  The machine has 4GB of physical memory, and the last time I checked -- several hours before it aborted -- the script was consuming 3.6GB.

Is there a way to do this in several large chunks, without specifying each individual page?  How do you do it?
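What I have in mind is something along these lines (just a sketch -- the pages.txt file and the --pageList flag are my own guesses at how this might work, not options I know the script to have):

```shell
# Stand-in for the real input: a file of incomplete-page titles,
# one per line (here just numbered placeholders for demonstration).
seq 1 25000 > pages.txt

# Split the list into chunks of 10,000 titles so that each mirror
# run only has to hold one chunk's worth of state in memory.
split -l 10000 pages.txt chunk_

# Process one chunk at a time.  The --pageList flag below is
# hypothetical -- I don't know whether the script accepts a page
# list file, which is really my question.
for f in chunk_*; do
    echo "would mirror $(wc -l < "$f") pages from $f"
    # ./mirrorMediawikiPages.pl --sourceHost=he.wikipedia.org \
    #     --destinationHost=localhost --sourcePath=w --pageList="$f"
done
```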

Thanks in advance,

    Asaf Bartov
    Wikimedia Israel

--
Asaf Bartov <asaf@forum2.org>