Hello,
The image copyright problem is still largely unresolved. The English
Wikipedia has about 90,000 images, and the other Wikipedias have a smaller
but growing number.
I saw on [[:en:Wikipedia:Image_copyright_tags]] that some users have
started categorizing images using templates like {{PD}}, {{GFDL}} and so
on placed in the image description pages. I think this is very useful for
semi-automatic sorting of the various images for different purposes (and
outright deletion of the illegal ones).
Since we are about to do the same on the Italian wiki, I wrote a small
Perl script to read a database dump and write out several image lists,
one for each template, listing which images contain that particular
template, and an "unassigned" list for images without any template. Each
list is in wiki format, ready to be copied and pasted into a Wikipedia
page if needed. We plan to use the tool on the Italian wiki (it:) to
generate lists of images still to be categorized.
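
The sorting step described above can be sketched roughly as follows. This is a minimal Python illustration, not the Perl script itself: it assumes the image description pages have already been pulled out of the dump as (title, text) pairs, and the names classify, to_wiki_list and known_tags are hypothetical. The bullet-list output is only one guess at what "wiki format" might look like.

```python
import re
from collections import defaultdict

# Matches a template call such as {{PD}} or {{GFDL|some argument}}.
TEMPLATE_RE = re.compile(r"\{\{\s*([^{}|]+?)\s*(?:\|[^{}]*)?\}\}")

def classify(pages, known_tags):
    """Group image titles by the copyright templates found in their text.

    pages: iterable of (title, wikitext) pairs (assumed already extracted).
    known_tags: set of template names to look for (matched case-sensitively).
    Returns a dict mapping each tag, or "unassigned", to a sorted title list.
    """
    lists = defaultdict(list)
    for title, text in pages:
        found = {m.group(1) for m in TEMPLATE_RE.finditer(text)} & known_tags
        for tag in found:
            lists[tag].append(title)
        if not found:
            # No known copyright template on this description page.
            lists["unassigned"].append(title)
    return {tag: sorted(titles) for tag, titles in lists.items()}

def to_wiki_list(titles):
    """Render titles as a wiki bullet list of links to the image pages."""
    return "\n".join(f"* [[:Image:{t}]]" for t in titles)

if __name__ == "__main__":
    sample = [
        ("A.jpg", "A photo. {{PD}}"),
        ("B.png", "{{GFDL}} and {{PD}}"),
        ("C.gif", "no tag here"),
    ]
    for tag, titles in classify(sample, {"PD", "GFDL"}).items():
        print(f"== {tag} ==")
        print(to_wiki_list(titles))
```

Note that in this sketch an image carrying two known templates lands in both lists, so the list lengths can add up to more than the number of images.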
Of course all this can be done with a few clever SQL queries, but not all
of us have access to the DB or have MySQL installed.
In case anyone wants to use it, the URL is
http://www.tommasoconforti.com/wiki/tools/images.pl.gz
The number of lines in each generated file is the number of images with
that particular template (images.pl is the script itself). For example,
this is the situation for the Aug 28 English dump, showing that a bit less
than 25% of the images have a proper template:
$ wc -l *
     73 CopyrightedFreeUse
    229 CopyrightedFreeUseProvided
     13 CrownCopyright
  10373 GFDL
     56 GPL
      2 LGPL
   5864 PD
      1 PD_USGov
    104 PermissionAndFairUse
     33 Sovietpd
     66 copyrighted
   3503 fairuse
      1 freefairusein
    237 images.pl
    253 noncommercial
    137 noncommercialProvided
  64568 unassigned
    131 unknown
   1984 unverified
     18 verifieduse
  87646 total
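
As a rough check of the percentage, the arithmetic can be redone from the wc output above. This Python sketch rests on two assumptions that the listing does not confirm: that each image appears in exactly one list, and that "unknown" and "unverified" do not count as proper templates. The 237 lines of images.pl are subtracted because that file is the script, not an image list.

```python
# Line counts copied from the wc output above.
total = 87646
script_lines = 237   # images.pl, the script itself
unassigned = 64568
unknown = 131
unverified = 1984

# Assumption: every image is listed exactly once.
images = total - script_lines
tagged = images - unassigned

# Assumption: "unknown" and "unverified" are not proper templates.
proper = tagged - unknown - unverified

print(f"any template:    {tagged / images:.1%}")   # 26.1%
print(f"proper template: {proper / images:.1%}")   # 23.7%
```

Under these assumptions about 26% of the images carry some template and about 24% carry a proper one, which is consistent with "a bit less than 25%".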
Processing the dump takes a while, especially if it must be decompressed
on the fly.
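
For anyone writing a similar tool, decompressing while reading (rather than unpacking the dump to disk first) might look like the Python sketch below. The helper names open_dump and count_lines are hypothetical, and the UTF-8 encoding with replacement of undecodable bytes is an assumption about the dump, not something stated here.

```python
import gzip

def open_dump(path, encoding="utf-8"):
    """Open a dump for line-by-line reading, decompressing .gz on the fly."""
    if path.endswith(".gz"):
        # Text mode: gzip decompresses and decodes as lines are requested.
        return gzip.open(path, "rt", encoding=encoding, errors="replace")
    return open(path, "r", encoding=encoding, errors="replace")

def count_lines(path):
    """Stream through the dump without loading it all into memory."""
    with open_dump(path) as dump:
        return sum(1 for _ in dump)
```

Reading this way keeps memory use flat, but the decompression cost is paid on every run, which matches the slowdown mentioned above.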
Alfio