Hello Giovanni,
thanks for the pointer to the Click datasets.
I'd have to look at the complete dataset to see how many of those
requests touch Wikipedia.
Then, one of the requirements for accessing the data is:
"The Click Dataset is large (~2.5 TB compressed), which requires that it be
transferred on a physical hard drive. You will have to provide the drive as
well as pre-paid return shipment. "
I have to check whether this is possible, and how long it would take to ship
a hard drive from Switzerland and get it back.
I'll let you know!
Best,
Valerio
On Wed, Sep 17, 2014 at 4:09 PM, Giovanni Luca Ciampaglia <
gciampag(a)indiana.edu> wrote:
Valerio,
I didn't know such data existed. As an alternative, perhaps you could have
a look at our click datasets, which contain requests to the Web at large
(i.e., not just Wikipedia) generated from within the campus of Indiana
University over a period of several months. HTH
http://carl.cs.indiana.edu/data/#click
Cheers
G
Giovanni Luca Ciampaglia
✎ 919 E 10th ∙ Bloomington 47408 IN ∙ USA
☞ http://www.glciampaglia.com/
✆ +1 812 855-7261
✉ gciampag(a)indiana.edu
2014-09-17 9:53 GMT-04:00 Valerio Schiavoni <valerio.schiavoni(a)gmail.com>:
Hello,
just bumping my email from last week, since I have not received any
answer so far.
Should I consider that dataset to be lost?
I've also contacted the researchers who partially released it, but making
it publicly available is tricky for them due to its size (12 TB), a size
that might well be within the norm for what Wikipedia's servers handle
daily.
Thanks again,
Valerio
On Wed, Sep 10, 2014 at 4:15 AM, Valerio Schiavoni <
valerio.schiavoni(a)gmail.com> wrote:
Dear Wikimedia Foundation,
in the context of an EU research project [1], we are interested in
accessing Wikipedia access traces.
In the past, such traces were given to other groups for research
purposes [2].
Unfortunately, only a small percentage (10%) of that trace has been made
available.
We are interested in accessing the totality of that same trace (or even
better, a more recent one, but the same one will do).
If this is not the correct mailing list for such requests, could anyone
please redirect me to the correct one?
Thanks again for your attention,
Valerio Schiavoni
Post-Doc Researcher
University of Neuchatel, Switzerland
[1] http://www.leads-project.eu
[2] http://www.wikibench.eu/?page_id=60
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l