Erik Moeller wrote in gmane.science.linguistics.wikipedia.technical:
> To address this, a while ago I wrote a very basic Perl script that
> makes a dump of a wiki's images that exist only on Commons. It's in
> /home/erik/extractdb.pl. I'm sure it could be done a lot faster, though.
in fact, it should be slower rather than faster: that's the idea of
trickle. otherwise it's too much load on albert.
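something along these lines, as a minimal sketch (not the real
extractdb.pl): the table names come from the MediaWiki schema, but the
database credentials, the download URL, and the two-second delay are
all made up.

  #!/usr/bin/perl
  # sketch: find images used on this wiki that exist only on Commons,
  # then fetch them with a pause between requests to keep load low.
  use strict;
  use warnings;
  use DBI;
  use LWP::Simple qw(getstore);

  my $dbh = DBI->connect('DBI:mysql:database=wikidb', 'wikiuser', 'secret',
                         { RaiseError => 1 });

  # images referenced in imagelinks but absent from the local image
  # table must be served from Commons.
  my $names = $dbh->selectcol_arrayref(q{
      SELECT DISTINCT il_to
      FROM imagelinks
      LEFT JOIN image ON img_name = il_to
      WHERE img_name IS NULL
  });

  mkdir 'commons-dump' unless -d 'commons-dump';

  for my $name (@$names) {
      # hypothetical download URL; the real layout hashes filenames
      # into subdirectories.
      getstore("http://commons.wikimedia.org/wiki/Special:Filepath/$name",
               "commons-dump/$name");
      sleep 2;    # trickle: deliberately slow rather than fast
  }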
> Ideally, such a solution could be used to create combined dumps that
> include *all* the images used in a particular wiki.
yes.
> From a legal standpoint, we have to be careful about distributing image
> dumps separately from the metadata that includes the licensing
> information, as many licenses prohibit this.
can we just include the image description pages in the image dumps? that
shouldn't increase their size by a large amount, i wouldn't have thought.
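as a rough sketch, something like this could hang off the download loop
above. action=raw fetches a page's wikitext, and Image: is the namespace
where description pages live, but the exact URL layout here is an
assumption.

  # sketch only: save each image's description page next to the image
  # so the licensing metadata travels with the dump.
  use LWP::Simple qw(getstore);
  use URI::Escape qw(uri_escape);

  sub save_description {
      my ($name) = @_;
      my $title = 'Image:' . uri_escape($name);
      getstore("http://commons.wikimedia.org/w/index.php?" .
               "title=$title&action=raw",
               "commons-dump/$name.wiki");
  }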
> Erik
kate.