On Sun, Mar 30, 2008 at 4:10 PM, Brianna Laugher
<brianna.laugher(a)gmail.com> wrote:
Hi,
There is an interesting Firefox extension called Zemanta that works
with some blogging platforms to suggest images matching the blog post
you are typing. One of the sources it uses is Commons.
See this post (and its comments) for a description of how it works and
what it's lacking:
<http://brianna.modernthings.org/article/97/zemanta-wikimedia-commons-for-bloggers>
In particular,
"If you have an idea how to correctly capture wikipedia images
attribution (something that would assure at least 50% correct coverage
from 2.8M images), please help us! ;)"
Really, we can't blame people too much for not providing attribution,
when we don't give that information in a standard way, or give a
standard way of accessing it.
Now is as good a time as any to formally write up an API that we can
recommend to other people. Aside from the MediaWiki API, there are
three main things I can think of that often need to be automated:
* identify any "problem tags" (files with deletion markers shouldn't
be used or indexed by third parties)
* extract license name(s) and URL for a given file
* extract author attribution string for a given file
So I propose we put our heads together and figure out the most robust
algorithm for each of these, and provide some sample code for each.
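As a rough starting point for the first two items, here is a minimal
Python sketch that works on the raw wikitext of a file description
page. It is only an illustration under stated assumptions: the template
lists are small hand-picked samples rather than a complete inventory of
what Commons uses, the function names are made up for this example, and
licenses passed as parameters (as in {{self|...}}) are not handled.

```python
import re

# Assumption: illustrative, incomplete samples. The real sets of deletion
# and license templates on Commons are much larger and change over time.
PROBLEM_TEMPLATES = {"delete", "speedydelete", "copyvio",
                     "no license", "no source", "no permission"}
LICENSE_URLS = {
    "gfdl": "http://www.gnu.org/copyleft/fdl.html",
    "cc-by-sa-3.0": "http://creativecommons.org/licenses/by-sa/3.0/",
    "cc-by-3.0": "http://creativecommons.org/licenses/by/3.0/",
}

# Matches the name part of a template call: "{{Name}}" or "{{Name|...".
TEMPLATE_NAME = re.compile(r"\{\{\s*([^|{}]+?)\s*(?=\||\}\})")


def template_names(wikitext):
    """Return the normalised (lower-case) names of all templates used."""
    return [m.group(1).lower().replace("_", " ")
            for m in TEMPLATE_NAME.finditer(wikitext)]


def problem_tags(wikitext):
    """Return the problem templates found; an empty list means none seen."""
    return sorted(set(template_names(wikitext)) & PROBLEM_TEMPLATES)


def licenses(wikitext):
    """Return (license name, URL) pairs for the known license templates."""
    return [(name, LICENSE_URLS[name])
            for name in template_names(wikitext) if name in LICENSE_URLS]


sample = ("{{Information\n|Description=A cat\n|Author=[[User:Example]]\n}}\n"
          "{{GFDL}}\n{{Cc-by-sa-3.0}}\n{{delete|reason=test}}")
print(problem_tags(sample))  # ['delete']
print(licenses(sample))
```

A third party could skip any file for which problem_tags() returns a
non-empty list, and treat an empty licenses() result as "unknown"
rather than as "free to use".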
I made a start here:
<http://commons.wikimedia.org/wiki/Commons:API>
Contributions and feedback welcome...
cheers,
Brianna
--
They've just been waiting in a mountain for the right moment:
http://modernthings.org/
_______________________________________________
Commons-l mailing list
Commons-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/commons-l
I already started something some time ago on
<http://commons.wikimedia.org/wiki/Commons:Machine_readability>. It
allows you to extract all information provided by the {{Information}}
template and some other templates. It's not yet finished; I'm still
thinking about the easiest way to fetch license information.
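
To illustrate the general idea of reading the {{Information}} fields,
here is a minimal Python sketch that parses them out of raw wikitext.
This is not the code behind the page linked above; the function name
extract_information is hypothetical, and the sketch assumes the
template is well-formed (balanced braces and brackets).

```python
import re


def extract_information(wikitext):
    """Return the {{Information}} parameters as a dict, or {} if absent."""
    match = re.search(r"\{\{\s*information\b", wikitext, re.IGNORECASE)
    if match is None:
        return {}
    parts, current, depth = [], [], 0
    i = match.start() + 2  # skip the opening braces of the template
    while i < len(wikitext):
        pair = wikitext[i:i + 2]
        if pair in ("{{", "[["):
            # Entering a nested template or link: pipes inside are not ours.
            depth += 1
            current.append(pair)
            i += 2
        elif pair in ("}}", "]]"):
            if depth == 0:
                break  # closing braces of {{Information}} itself
            depth -= 1
            current.append(pair)
            i += 2
        elif wikitext[i] == "|" and depth == 0:
            # A top-level pipe separates two parameters.
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(wikitext[i])
            i += 1
    parts.append("".join(current))

    fields = {}
    for part in parts[1:]:  # parts[0] is the template name
        if "=" in part:
            key, value = part.split("=", 1)
            fields[key.strip().lower()] = value.strip()
    return fields


sample = ("{{Information\n|Description=A [[cat|kitten]] sleeping\n"
          "|Source=Own work\n|Date=2008-03-30\n"
          "|Author=[[User:Example|Example]]\n|Permission={{GFDL}}\n}}")
print(extract_information(sample)["author"])  # [[User:Example|Example]]
```

The values come back as wikitext, so a caller still has to decide how
to turn something like [[User:Example|Example]] into a plain
attribution string.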
Bryan