Commons-l May 2016

commons-l@lists.wikimedia.org

24 participants
12 discussions

Re: [Commons-l] GSoC 2016 | Porting catimages to pywikibot-core

by Fæ

Replies in-line.

On 24 May 2016 at 06:57, Dr. Trigon <dr.trigon(a)surfeu.ch> wrote:
>> * incomplete uploads resulting from server failures. Checksum
>> comparisons would mean re-downloading files, which would be
>> unnecessarily bandwidth expensive, but local image analysis would
>> highlight these.
>
> What about local checksum comparison?

Yes, we have SHA1 values for the Commons-hosted images; however, a local checksum is not normally available from the source (e.g. NYPL), which means re-downloading the original to do the comparison. As some of my uploads are over 100 MB for one page, it's an expensive solution.

>> * uploads that are mostly blank pages in old scanned books. I have a
>> simple detection process, but it would be neat to have a more common
>> standard way of doing this.
>
> Depends on the format. For PDF you can try to use Poppler/poppler-utils or
> MuPDF. For images it will be a bit more involved ... but interesting.

Formats are normally JPEG or TIFF. My blank detection analyses pixel colour deviations over parts of the image to deduce whether it looks blank. This uses the basic Python Imaging Library rather than any sophisticated math. It can happen pre-upload by testing a client-side image. See <https://commons.wikimedia.org/wiki/User:Fae/Project_list/Internet_Archive#B…>

...
>> Hi Fæ,
>>
>> Thanks a lot for the ideas!
>> The ideas you mentioned are awesome, and something I'll definitely look
>> into!
>>
>> The second and third ideas mentioned are, I believe, doable within the
>> scope of my GSoC. For the first idea to be implemented, as you mentioned,
>> local image analysis would be needed, which we've not planned (but I'll add
>> it to the "to plan" list :) ). Currently we're planning on downloading the
>> image and performing the analysis on ToolsLab or a personal computer.
>>
>> Thank you for the project list! I was looking for a good dataset to test
>> things out on and this will be immensely helpful.
>>
>> Regards
>> Abdeali JK
>>
>> On Wed, May 18, 2016 at 5:25 PM, Fæ <faewik(a)gmail.com> wrote:
>>>
>>> (Just replying on Commons-l with a non-tech observation. If more tech
>>> stuff arises I'll add it to Phabricator instead)
>>>
>>> This looks like a useful contained project, though a lot to be done in
>>> 12 weeks. :-)
>>>
>>> I was not familiar with catimages.py. It would be great if using the
>>> module for the preparation or housekeeping of large batch uploads were
>>> easy and not time consuming to try. As Commons grows we are seeing
>>> more donations over 10,000 images and have had a few with over 1 million.
>>> Uploads of this size make manual categorization a huge hurdle, so
>>> automatic 'tagging' of image characteristics would be a useful way of
>>> breaking down such a large batch to highlight the more interesting
>>> outliers or mistakes, which can then be prioritized on a backlog for
>>> human review.
>>>
>>> For example, in my upload projects I have problems detecting:
>>> * incomplete uploads resulting from server failures. Checksum
>>> comparisons would mean re-downloading files, which would be
>>> unnecessarily bandwidth expensive, but local image analysis would
>>> highlight these.
>>> * uploads that are mostly blank pages in old scanned books. I have a
>>> simple detection process, but it would be neat to have a more common
>>> standard way of doing this.
>>> * distinguishing between scans with diagrams and line
>>> drawings/cartoons, printed old photographs, newsprint and text pages.
>>>
>>> It would be great if the testing routines you use during the project
>>> could tackle any of these and be written up as practical case studies.
>>>
>>> As well as the Phabricator write-up/tracking of the project, it would
>>> be useful to have an on-wiki Commons or Mediawiki user guide. Perhaps
>>> this can be sketched out as you go along during the project, giving an
>>> insight into what other users or amateur Python programmers might do
>>> to customize or make better use of the module? Having an easier-to-find
>>> manual might avoid others going off on their own tangents using
>>> various off-the-shelf image modules, when they could just plug in
>>> catimages with a smallish amount of configuration.
>>>
>>> P.S. If you would like to test the tool on some large collections with
>>> predictable formats, try looking through
>>> <https://commons.wikimedia.org/wiki/User:Fae/Project_list>. The 1/2
>>> million images in the book plates project would be an interesting
>>> sample set.
>>>
>>> Thanks,
>>> Fae
>>>
>>> On 18 May 2016 at 02:53, Abdeali Kothari <abdealikothari(a)gmail.com>
>>> wrote:
>>> > Hi,
>>> >
>>> > I'm a student from Chennai, India and my project is going to be related
>>> > to performing image processing on the images on commons.wikimedia to
>>> > automate categorization. DrTrigon had made the script catimages.py a
>>> > few years ago, which was made in the old pywikipedia-bot framework.
>>> > I'll be working towards updating the script to the pywikibot-core
>>> > framework, updating its dependencies, and using newer techniques when
>>> > possible.
>>> >
>>> > catimages.py is a script that analyzes an image using various computer
>>> > vision algorithms and allots categories to the image on commons. For
>>> > example, consider algorithms that detect faces, barcodes, etc. The
>>> > script uses these to categorize images to Category:Unidentified People,
>>> > Category:Barcode, and so on.
>>> >
>>> > If you have any suggestions and categorizations you think might be
>>> > useful to you, drop in at #gsoc-catimages on freenode or my talk
>>> > page[0]. You can find out more about me on User:AbdealiJK[1] and about
>>> > the project at T129611[2].
>>> >
>>> > Regards
>>> >
>>> > [0] - https://commons.wikimedia.org/wiki/User_talk:AbdealiJK
>>> > [1] - https://meta.wikimedia.org/wiki/User:AbdealiJK
>>> > [2] - https://phabricator.wikimedia.org/T129611
>>> >
>>> > _______________________________________________
>>> > Commons-l mailing list
>>> > Commons-l(a)lists.wikimedia.org
>>> > https://lists.wikimedia.org/mailman/listinfo/commons-l
>>>
>>> --
>>> faewik(a)gmail.com https://commons.wikimedia.org/wiki/User:Fae
>>> Personal and confidential, please do not circulate or re-quote.
>
> Dr. Trigon

--
faewik(a)gmail.com https://commons.wikimedia.org/wiki/User:Fae
Personal and confidential, please do not circulate or re-quote.
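The blank-page check described in this message (looking at pixel colour deviations over parts of the image) could be sketched roughly as follows. This is not Fæ's actual script: the function names, the 4 x 4 tiling and both thresholds are assumptions chosen for illustration, and the input is assumed to be a greyscale grid of 0-255 values (for instance, a scan loaded with an imaging library and converted to greyscale beforehand).

```python
from statistics import pstdev


def tile_deviations(pixels, tiles=4):
    """Split a greyscale grid into tiles x tiles blocks and return the
    population standard deviation of the pixel values in each block."""
    height, width = len(pixels), len(pixels[0])
    deviations = []
    for ty in range(tiles):
        for tx in range(tiles):
            y0, y1 = ty * height // tiles, (ty + 1) * height // tiles
            x0, x1 = tx * width // tiles, (tx + 1) * width // tiles
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            if block:  # very small images can yield empty blocks
                deviations.append(pstdev(block))
    return deviations


def looks_blank(pixels, tiles=4, max_deviation=8.0, min_blank_fraction=0.9):
    """Return True when most blocks are nearly uniform in brightness.

    max_deviation and min_blank_fraction are illustrative guesses,
    not values taken from the thread.
    """
    deviations = tile_deviations(pixels, tiles)
    if not deviations:
        raise ValueError("image has no pixels")
    blank_blocks = sum(1 for d in deviations if d <= max_deviation)
    return blank_blocks / len(deviations) >= min_blank_fraction


# A uniformly light page versus a page covered in high-contrast marks.
empty_page = [[250] * 16 for _ in range(16)]
busy_page = [[0 if (x + y) % 2 else 255 for x in range(16)] for y in range(16)]
print(looks_blank(empty_page), looks_blank(busy_page))
```

Working per block rather than over the whole image means a single stain or page number only affects one block, so a mostly empty page can still be flagged for review.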

8 years

Commons Picture of the Year 2015 round 2 voting has started

by Pine W

Commons Picture of the Year 2015 round 2 voting has started: https://commons.wikimedia.org/wiki/Commons:Picture_of_the_Year/2015. There are many excellent finalists. Pine

8 years

Fwd: [Wikimediauk-l] 3rd party Dictionary of Species taking a feed from Commons via Wikidata

by David Gerard

---------- Forwarded message ----------
From: Robin Owain <info(a)cymruwales.com>
Date: 19 May 2016 at 15:11
Subject: [Wikimediauk-l] 3rd party Dictionary of Species taking a feed from Commons via Wikidata
To: wikimediauk-l(a)lists.wikimedia.org

Hi all

The Dictionary of Welsh species has now become an Illustrated Dictionary. Take a look:

http://www.llennatur.com/Drupal7/llennatur/?q=node/6#Gl%C3%B6yn

It also includes clickbacks to corresponding articles on Wikipedia. All thanks to Wikimedia UK and the University of Bangor.

There are a couple of thousand images missing on Commons, and if anyone can help, there's a Wikiproject here to fill the gaps.

Best regards
Robin

_______________________________________________
Wikimedia UK mailing list
wikimediauk-l(a)wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikimediauk-l
WMUK: https://wikimedia.org.uk

8 years

GSoC 2016 | Porting catimages to pywikibot-core

by Abdeali Kothari

Hi,

I'm a student from Chennai, India and my project is going to be related to performing image processing on the images on commons.wikimedia to automate categorization. DrTrigon had made the script catimages.py a few years ago, which was made in the old pywikipedia-bot framework. I'll be working towards updating the script to the pywikibot-core framework, updating its dependencies, and using newer techniques when possible.

catimages.py is a script that analyzes an image using various computer vision algorithms and allots categories to the image on commons. For example, consider algorithms that detect faces, barcodes, etc. The script uses these to categorize images to Category:Unidentified People, Category:Barcode, and so on.

If you have any suggestions and categorizations you think might be useful to you, drop in at #gsoc-catimages on freenode or my talk page[0]. You can find out more about me on User:AbdealiJK[1] and about the project at T129611[2].

Regards

[0] - https://commons.wikimedia.org/wiki/User_talk:AbdealiJK
[1] - https://meta.wikimedia.org/wiki/User:AbdealiJK
[2] - https://phabricator.wikimedia.org/T129611
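As a rough illustration of the idea in this message (detected image features leading to category names), here is a minimal sketch. It does not reflect how catimages.py is actually written: the feature labels, the mapping table and the function name are all hypothetical, and only the two category names come from the message itself.

```python
# Hypothetical feature labels mapped to the category names quoted above.
CATEGORY_FOR_FEATURE = {
    "face": "Category:Unidentified People",
    "barcode": "Category:Barcode",
}


def categories_for(detected_features):
    """Return the sorted, de-duplicated categories for the given feature labels.

    Labels without an entry in the mapping are ignored.
    """
    return sorted(
        {
            CATEGORY_FOR_FEATURE[feature]
            for feature in detected_features
            if feature in CATEGORY_FOR_FEATURE
        }
    )


# Two faces and an unmapped label still produce a single category.
print(categories_for(["face", "face", "tree"]))
```

Keeping the mapping in one table would let contributors suggest new categorizations without touching the detection code, assuming the detectors report plain labels like these.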

8 years

acknowledged with thanks.

by Sunder Thadani

I am proud to be a part of such a huge family of commons. thanks sunderkt(a)gmail.com https://www.youtube.com/Thesunderkt

8 years

Co-ordinates for a path

by Richard Symonds

All,

I have some videos of the seabed of the Dogger Bank, which includes some footage of wrecks on the bed, marine life, and parts of prehistoric settlements. I have the exact co-ordinates of the videos - however, because it's a video, the co-ordinates change over time, and the *moving* co-ordinates of the file can't really be entered into Commons - or can they? Can anyone help with this?

*What's the best way to record the co-ordinates if they move over the duration of the video?*

Richard Symonds
Wikimedia UK
0207 065 0992

Wikimedia UK is a Company Limited by Guarantee registered in England and Wales, Registered No. 6741827. Registered Charity No.1144513. Registered Office 4th Floor, Development House, 56-64 Leonard Street, London EC2A 4LT. United Kingdom. Wikimedia UK is the UK chapter of a global Wikimedia movement. The Wikimedia projects are run by the Wikimedia Foundation (who operate Wikipedia, amongst other projects).

*Wikimedia UK is an independent non-profit charity with no legal control over Wikipedia nor responsibility for its contents.*

8 years

Re: [Commons-l] Co-ordinates for a path

by reguyla＠gmail.com

Because it doesn't work. Probably because my account is globally blocked to prevent me from improving the projects and to enforce my bullshit abusive ban on enwp.

Sent from my T-Mobile 4G LTE device

------ Original message ------
From: Nahid Sultan
Date: Tue, May 10, 2016 6:28 AM
To: Wikimedia Commons Discussion List;
Subject: Re: [Commons-l] Co-ordinates for a path

There is an 'Unsubscribe' button at the bottom of every mail. Why don't you use that?

---
Nahid Sultan
User:NahidSultan on all Wikimedia Foundation's public wikis
Member of Wikimedia ombudsman commission
Secretary, Wikimedia Bangladesh
http://wikimedia.org.bd
Facebook | Nahid Sultan
Twitter | @nahidunlimited

Date: Tue, 10 May 2016 03:22:22 -0700
From: reguyla(a)gmail.com
To: richard.symonds(a)wikimedia.org.uk; commons-l(a)lists.wikimedia.org
Subject: Re: [Commons-l] Co-ordinates for a path

Take me off these spam lists. Since editors aren't wanted on the WMF projects and the WMF wants to enable bully behavior by admins, I don't want to be spammed with this crap anymore.

Sent from my T-Mobile 4G LTE device

------ Original message ------
From: Richard Symonds
Date: Tue, May 10, 2016 5:18 AM
To: Wikimedia Commons Discussion List;
Subject: [Commons-l] Co-ordinates for a path

All,

I have some videos of the seabed of the Dogger Bank, which includes some footage of wrecks on the bed, marine life, and parts of prehistoric settlements. I have the exact co-ordinates of the videos - however, because it's a video, the co-ordinates change over time, and the moving co-ordinates of the file can't really be entered into Commons - or can they? Can anyone help with this?

*What's the best way to record the co-ordinates if they move over the duration of the video?*

Richard Symonds
Wikimedia UK
0207 065 0992

Wikimedia UK is a Company Limited by Guarantee registered in England and Wales, Registered No. 6741827. Registered Charity No.1144513. Registered Office 4th Floor, Development House, 56-64 Leonard Street, London EC2A 4LT. United Kingdom. Wikimedia UK is the UK chapter of a global Wikimedia movement. The Wikimedia projects are run by the Wikimedia Foundation (who operate Wikipedia, amongst other projects).

Wikimedia UK is an independent non-profit charity with no legal control over Wikipedia nor responsibility for its contents.

_______________________________________________
Commons-l mailing list
Commons-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/commons-l

8 years

Re: [Commons-l] Co-ordinates for a path

by reguyla＠gmail.com

Take me off these spam lists. Since editors aren't wanted on the WMF projects and the WMF wants to enable bully behavior by admins, I don't want to be spammed with this crap anymore.

Sent from my T-Mobile 4G LTE device

------ Original message ------
From: Richard Symonds
Date: Tue, May 10, 2016 5:18 AM
To: Wikimedia Commons Discussion List;
Subject: [Commons-l] Co-ordinates for a path

All,

I have some videos of the seabed of the Dogger Bank, which includes some footage of wrecks on the bed, marine life, and parts of prehistoric settlements. I have the exact co-ordinates of the videos - however, because it's a video, the co-ordinates change over time, and the moving co-ordinates of the file can't really be entered into Commons - or can they? Can anyone help with this?

*What's the best way to record the co-ordinates if they move over the duration of the video?*

Richard Symonds
Wikimedia UK
0207 065 0992

Wikimedia UK is a Company Limited by Guarantee registered in England and Wales, Registered No. 6741827. Registered Charity No.1144513. Registered Office 4th Floor, Development House, 56-64 Leonard Street, London EC2A 4LT. United Kingdom. Wikimedia UK is the UK chapter of a global Wikimedia movement. The Wikimedia projects are run by the Wikimedia Foundation (who operate Wikipedia, amongst other projects).

Wikimedia UK is an independent non-profit charity with no legal control over Wikipedia nor responsibility for its contents.

8 years

Size of images donated

by Romaine Wiki

Hi all,

A professional photo agency is offering us (Wikimedia Belgium) a donation of images of art works. As a start, they are offering these images at 595 x 842 pixels at 72 dpi. This is almost double the thumbnail size used on Wikipedia. My own (not the most modern) smartphone makes images at 5312 × 2988 pixels at 72 dpi.

Seeing the size of these images, I think the resolution is too low. My question is: what minimum quality should we ask for?

Thanks!
Romaine
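To put the two sizes in this message side by side, the arithmetic can be sketched as below. The helper names are made up for illustration, and the smartphone figure is read as 5312 × 2988 pixels (treating the dots in the message as thousands separators, which is an assumption).

```python
def megapixels(width, height):
    """Total pixel count expressed in millions of pixels."""
    return width * height / 1_000_000


def print_size_inches(width, height, dpi):
    """Physical size in inches when printed at the given dots per inch."""
    return width / dpi, height / dpi


offered = megapixels(595, 842)    # about 0.5 megapixels
phone = megapixels(5312, 2988)    # about 15.9 megapixels
page = print_size_inches(595, 842, 72)

print(f"offered: {offered:.2f} MP, phone: {phone:.2f} MP")
print(f"offered image at 72 dpi: {page[0]:.2f} x {page[1]:.2f} inches")
```

On these numbers the offered files hold roughly one thirtieth of the pixels of the smartphone photo, and at 72 dpi they cover about 8.26 x 11.69 inches, which is close to an A4 page.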

8 years

Re: [Commons-l] Size of images donated

by reguyla＠gmail.com

Please disenroll me from this list. If neither the WMF nor its communities want editors, and they want to support bullies because they are admins, I don't want the WMF's spam.

Sent from my T-Mobile 4G LTE device

------ Original message ------
From: Romaine Wiki
Date: Thu, May 5, 2016 7:51 AM
To: Wikimedia Commons Discussion List; Affiliates discussion list; Wikimedia Chapters cultural partners coordination;
Subject: [Commons-l] Size of images donated

Hi all,

A professional photo agency is offering us (Wikimedia Belgium) a donation of images of art works. As a start, they are offering these images at 595 x 842 pixels at 72 dpi. This is almost double the thumbnail size used on Wikipedia. My own (not the most modern) smartphone makes images at 5312 × 2988 pixels at 72 dpi.

Seeing the size of these images, I think the resolution is too low. My question is: what minimum quality should we ask for?

Thanks!
Romaine

8 years
