[Foundation-l] WikiX Image Synchronization Utility Posted
Jeffrey V. Merkey
jmerkey at wolfmountaingroup.com
Sun Jul 2 08:41:31 UTC 2006
Since the image files on Wikimedia.org are in a constant state of flux,
and the last 80GB image archive is eight months old (made in November
2005), I wrote a program in C that scans an XML dump of the English
Wikipedia and outputs a bash script that uses cURL in a non-obtrusive
manner to download any missing images from Wikimedia Commons and
Wikipedia. The program runs in the background and is low impact.
Invoke it from your /MediaWiki/images/ directory as:
/wikix/wikix < enwiki-<date>-pages-articles.xml > image_sh &
./image_sh >& image.log &
After it outputs the full download script, the program will invoke cURL
(if installed); the script will resync any remote wiki with the master
images on Wikipedia and Wikimedia Commons.
Enjoy.
The source code and makefiles are attached as text (since the program
is small), and a tar.gz is also available from
ftp.wikigadugi.org/wiki/images
Jeff V. Merkey
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Makefile
Url: http://lists.wikimedia.org/pipermail/foundation-l/attachments/20060702/8341bb1e/attachment-0001.diff