Re: [Commons-l] Automated identification of images on commons

26 Sep 2011

  On 9/20/2011 9:36 AM, とある白い猫 wrote:
...
  The conference has no such tool yet, at least nothing we can
 use tomorrow, but they are able to tell pretty accurately what the images
 are. I am going to ask whether they would be interested in
 providing Commons with such a service. The relevant website is
 http://www.imageclef.org/2011/Wikipedia for Commons, but
 http://www.imageclef.org/2011/Plants is also interesting (even though
 it has had nothing to do with Commons so far). It could very well be used
 for Commons and Wikispecies alike.

   -- とある白い猫  (To Aru Shiroi Neko)

         I've made some attempt to map images on Wikimedia Commons to 
distinct concepts from DBpedia;  see

http://ookaboo.com/

       This could be useful for forming a training set,  but I haven't 
yet got around to releasing a public dump of the data.  I have about 1 
million things classified and could certainly extend the strategies used 
to get more.

       Unless there's been a really unprecedented breakthrough,  I'd 
think that the application of machine vision to Wikimedia faces the 
problem of getting enough training data.  If you had thousands or tens 
of thousands of photos that were labeled 'cat' or 'not cat',  or 'member 
of plant species X' or 'not member of plant species X',  you could train a 
classifier to make the distinction.  However,  if you've got two or 
three bad photos of a particular plant (which is what you have most of 
the time in Commons) you don't have enough training data to generalize.
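
       To make the training-data point concrete,  here is a minimal 
sketch,  not anything ImageCLEF or Ookaboo actually runs.  It assumes 
each photo has already been reduced to a 50-number feature vector,  fakes 
those vectors with Gaussian noise around two class means,  and uses a 
plain nearest-centroid classifier as a stand-in for a real one.  DIM,  
SHIFT and all function names are made up for illustration.

```python
import random

DIM = 50      # assumed length of a feature vector (hypothetical)
SHIFT = 0.3   # assumed per-feature gap between the two class means


def sample(label, rng):
    """Draw one synthetic feature vector for class 0 ('not X') or 1 ('X')."""
    mean = SHIFT if label == 1 else 0.0
    return [rng.gauss(mean, 1.0) for _ in range(DIM)]


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train(n_per_class, rng):
    """Fit a nearest-centroid classifier from n_per_class examples per label."""
    return [centroid([sample(label, rng) for _ in range(n_per_class)])
            for label in (0, 1)]


def accuracy(centroids, rng, n_test=2000):
    """Fraction of fresh synthetic examples assigned to the right class."""
    correct = 0
    for i in range(n_test):
        label = i % 2
        x = sample(label, rng)
        guess = 0 if sq_dist(x, centroids[0]) <= sq_dist(x, centroids[1]) else 1
        correct += guess == label
    return correct / n_test


def compare(seed=0):
    """Accuracy with 3 versus 3000 training examples per class."""
    rng = random.Random(seed)
    small = accuracy(train(3, rng), rng)
    large = accuracy(train(3000, rng), rng)
    return small, large


if __name__ == "__main__":
    small, large = compare()
    print(f"3 examples per class:    {small:.3f}")
    print(f"3000 examples per class: {large:.3f}")
```

       With only three examples per class the estimated centroids are 
mostly noise,  so the classifier should land much closer to coin-flipping 
than the one trained on thousands.  Real photos are far messier than 
this toy data,  which only makes the gap worse.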

       If you've got a specific mission,  say genitals recognition, I 
think you can make progress,  but to attack the general problem you need 
to go big with your training sets.
