[Foundation-l] Where do our readers come from? Q&A

Milos Rancic millosh at gmail.com
Sat Jan 16 17:15:28 UTC 2010


On Fri, Jan 15, 2010 at 11:39 PM, Erik Zachte <erikzachte at infodisiac.com> wrote:
> Q: Nikola Smolenski / Milos Rancic
> At Wikipedia Page Views By Country - Breakdown [1] and Wikipedia Page Views
> By Country - Trends [2] could you include more languages (ideally all
> languages)?
> Some of the numbers go below 0.1% of the population, but some are not
> mentioned even though they are larger than 0.5% of the population.
>
> [1] http://tinyurl.com/yhp3an7
> [2] http://tinyurl.com/yzga2hm
>
> A:
> Yes, on some reports I do include smaller percentages for the largest
> Wikipedias, as those represent significant numbers of page views.
> I used different (and arbitrary) thresholds per report. The arbitrariness
> could change, but I want to plead for a notoriety threshold:
>
> Here is a much more extended version of the breakdown report [1] (for this
> discussion only)
> It shows up to 50 Wikipedias per country.
> An extra column shows the total number of records for this country/language
> (for the 6 month period) on which the percentage is based.
> As you can see for the smallest countries that number is so low that it is
> no longer significant.
>
> Let us say we cut off not at 1%, but at an (arbitrary) absolute threshold of
> x logged records per country/language pair (per row).
> Let us say we cut off at average 5 records per month. Everything below that
> threshold in the test report is in dark red.
> Personally I think this is still way too much detail for a general report.
> Not because of Kb's but information overload.
>
> [1] http://tinyurl.com/yjwoyre
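
The cutoff proposed in the quoted answer (an average of 5 logged records
per month over the 6-month period, per country/language pair) can be
sketched roughly as follows. This is only an illustration: the Row type,
its field names and the function name are hypothetical, and the real
report may store and count its records differently.

```python
from typing import NamedTuple


class Row(NamedTuple):
    """One country/language pair (hypothetical layout, not the report's)."""
    country: str
    language: str
    records: int  # logged records over the whole reporting period


def split_by_threshold(rows, months=6, min_avg_per_month=5):
    """Split rows into (kept, flagged) using an absolute record cutoff.

    A row is flagged when its record count is below
    months * min_avg_per_month, i.e. below 30 with the defaults.
    """
    cutoff = months * min_avg_per_month
    kept = [r for r in rows if r.records >= cutoff]
    flagged = [r for r in rows if r.records < cutoff]
    return kept, flagged


# Made-up numbers, for illustration only.
sample = [
    Row("Serbia", "sr", 4200),
    Row("Serbia", "en", 3100),
    Row("Serbia", "hu", 30),
    Row("Serbia", "eo", 12),
]
kept, flagged = split_by_threshold(sample)
```

With the defaults, a pair with exactly 30 records is kept and anything
below that lands in the flagged list (the "dark red" rows of the test
report).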

Detailed statistics are valuable in two important ways:
* The first is chapter-related. I want to know more about trends in
Serbia, so that I can (1) analyze what is going on and what WM RS
did, and (2) organize a media event based on the statistics.
* The second is of general sociolinguistic value. To some extent I can
trace where the speakers of a language live and what the rate of
Internet adoption (actually, Wikipedia adoption) is, and compare all
of that with, say, GDP, number of inhabitants and so on.

It would be great if you set up a periodic job that generates such
statistics at the end of every month. For example, I would really like
to know about the trends over the past 6 months.

I noticed in your quarterly report that the share of Serbian-language
page views in Serbia is rising. This is very important because it shows
one (or both) of two things: the quality of Serbian Wikipedia is rising,
and/or Internet adoption among those who don't know English well enough
is rising. If the number of visits to English Wikipedia is stable, it
is the second; if the number of visits is lower than before, it is the
first; and so on.
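
As a rough sketch of the reasoning above, assuming we already know that
the Serbian share has gone up: compare visits to English Wikipedia from
Serbia between two periods. The function name, the labels and the 5%
tolerance for "stable" are all my assumptions, not something taken from
the reports.

```python
def interpret_share_rise(en_before, en_after, tolerance=0.05):
    """Suggest why the Serbian share rose, from English Wikipedia visits.

    en_before, en_after: visits to English Wikipedia from Serbia in the
    earlier and the later period. tolerance is the relative change still
    treated as "stable" (an arbitrary assumption).
    """
    if en_before <= 0:
        raise ValueError("en_before must be positive")
    change = (en_after - en_before) / en_before
    if abs(change) <= tolerance:
        # English visits are stable: new readers who prefer Serbian.
        return "adoption"
    if change < 0:
        # English visits dropped: readers moved to Serbian Wikipedia.
        return "quality"
    # English visits grew as well: this comparison alone cannot tell.
    return "unclear"


# Made-up numbers, for illustration only.
print(interpret_share_rise(1000, 1010))
print(interpret_share_rise(1000, 800))
```

Both effects can of course happen at once, so the label is a hint about
the dominant one rather than a conclusion.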

Also, I would like to know whether it is seasonal: which numbers
reflect tourists, and which reflect the behavior of the general
population.

So, while such statistics are truly information overload for a general
report, they are very valuable for more specific reports.


