[Foundation-l] WikiX Image Synchronization Utility Posted

Jeffrey V. Merkey jmerkey at wolfmountaingroup.com
Sun Jul 2 08:41:31 UTC 2006


Since the image files on Wikimedia.org are in a constant state of flux 
and the last 80 GB archive of images is eight months old (made in
November 2005), I wrote a program in C that scans an XML dump of the 
English Wikipedia, then constructs and outputs
a bash script that uses curl in a non-obtrusive manner to download any 
missing images from Wikimedia Commons and
Wikipedia.  The program runs in the background and is low impact.
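
For the curious, each entry in the generated script boils down to 
deriving the image's hashed a/ab directory from the MD5 of its 
filename (the layout MediaWiki uses for its image store) and fetching 
it with curl.  A minimal sketch of that derivation, assuming a 
hypothetical filename already normalized the way MediaWiki does 
(spaces replaced with underscores):

#!/bin/bash
# Sketch only: derive MediaWiki's a/ab hashed path from the MD5 of
# the filename, then fetch the file from Commons with curl.
name="Example_photo.jpg"                      # hypothetical filename
hash=$(printf '%s' "$name" | md5sum | cut -c1-2)
dir="${hash:0:1}/$hash"
mkdir -p "$dir"
curl -s -o "$dir/$name" \
    "http://upload.wikimedia.org/wikipedia/commons/$dir/$name"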


Invoke it from your /MediaWiki/images/ directory as:

/wikix/wikix < enwiki-<date>-pages-articles.xml > image_sh
chmod +x image_sh
./image_sh >& image.log &

Once it has written the full download script, the program will invoke 
curl (if installed) to run it, resyncing any remote wiki with the
master images on Wikipedia and Wikimedia Commons.  
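
The non-obtrusive part amounts to fetching only the files that are 
missing locally and pacing the requests.  A hedged sketch of that loop 
(the actual delay and curl flags wikix emits may differ), reading one 
normalized filename per line from a hypothetical missing.txt:

#!/bin/bash
# Sketch only: skip images already present, throttle between fetches.
while read -r name; do
    hash=$(printf '%s' "$name" | md5sum | cut -c1-2)
    dir="${hash:0:1}/$hash"
    if [ ! -f "$dir/$name" ]; then
        mkdir -p "$dir"
        curl -s -o "$dir/$name" \
            "http://upload.wikimedia.org/wikipedia/commons/$dir/$name"
        sleep 1    # stay low impact on the servers
    fi
done < missing.txt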

Enjoy.

The source code and makefiles are attached as text (since the program 
is small), and the tar.gz is also available from

ftp.wikigadugi.org/wiki/images

Jeff V. Merkey

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Makefile
Url: http://lists.wikimedia.org/pipermail/foundation-l/attachments/20060702/8341bb1e/attachment-0001.diff 