Thanks for sharing this, Dominic. It would be interesting to chat about this sometime. I’ve
run into the same issue with relying on stats.grok.se in the past with Linkypedia. It
became untenable so I stopped showing article view stats.
I wonder if it would be worthwhile to get together (Skype, hangout?) to chat about the
current state of the analytics hadoop cluster, and how we might collectively push the
Foundation in the right direction? In my few conversations with them they seemed focused
on building out general tools, rather than specific (and very useful) tools like
stats.grok.se. But we should be able to right that ship, no?
I know that the Europeana project are looking to collect statistics as part of their Wiki
GLAM Toolset project [1,2]. It might be good to rope some of them into the conversation if
Magnus isn’t already connected with them.
//Ed
[1]
On Feb 17, 2014, at 7:22 PM, Dominic McDevitt-Parks <mcdevitd(a)gmail.com> wrote:
Magnus' blog post today is a must-read for anyone in the GLAM community who wants to
understand why analytics has been such a challenge for us:
http://magnusmanske.de/wordpress/?p=173
Quoting the last three paragraphs in full:
[...]
Like others, I have tried to get the Foundation to provide the page view data in a more
accessible and local (as in toolserver/Labs) way. Like others, I failed. The last
iteration was a video meeting with the Analytics team (newly restarted, as the previous
Analytics team didn’t really work out for a reason; I didn’t inquire too deeply), which
ended with a promise to get this done Real Soon Now™, and the generous offer to use the
page view data from their hadoop cluster. Except the cluster turned out to be empty; I
then was encouraged to import the view data myself. (No, this is not a joke. I have the
emails to prove it.) As much as I enjoy working with and around the Wikiverse, I do have
neither the time, the bandwidth, nor the inclination to do your paid jobs for you, thank
you very much.
As the sophisticated reader might have picked up at this point, the entire topic is
rather frustrating for myself and others, and being unable to offer a patchy, error-prone
data set to GLAMs who have released hundreds of thousands of files under a free license
into Commons is, quite frankly, disgraceful. The requirement for the Foundation is not
unreasonable; providing what Henrik has been doing for years on his own would be quite
sufficient. Not even that is required; myself and others have volunteered to write
interfaces if the back-end data is provided in a usable form.
Of the tools I try to provide in the GLAM realm, some don’t really work at the moment due
to the constraints described above; some work so-so, kept running with a significant
amount of manual fixing. Adding 100.000 Wellcome Trust images may be enough for them to
come to a grinding halt. And when all the institutions who so graciously have contributed
free content to the Wikiverse come a-running, I will make it perfectly clear that there is
only the Foundation to blame.
And as many of us have said before, thanks to Magnus for doing as much as he has been able
to within current circumstances.
Dominic
_______________________________________________
GLAM mailing list
GLAM(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/glam