[Foundation-l] wikistats (was Information flow)

Lars Aronsson lars at aronsson.se
Thu Aug 25 18:01:26 UTC 2005


Erik Zachte wrote:
> So considering I could only start on this >> two weeks ago << I'm making
> fair progress.
> 
> By the way, many people would like to see more frequent dumps.

This progress is excellent news!  I'm looking forward to it.  Now 
that wikistats has a long history, it should be continued as soon 
as possible, so we can watch trends over time.

However, it is also possible that the MediaWiki software could 
produce some statistics more directly.  Statistics is essentially 
the reduction of data volumes into useful numbers (e.g. reducing 
the list of article names to the article count), and the closer to 
the source such reductions can take place, the more efficient.  
The dilemma is that this reduction is irreversible: you cannot 
reconstruct the full information from the reduced data. This 
is where the board or its officers can provide insights into what 
statistics are really useful for managing the project.
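
To illustrate the irreversibility, here is a minimal Python sketch. 
The article titles and the function name reduce_to_count are made up 
for the example; nothing here is MediaWiki code. Two different lists 
of articles reduce to the same count, so the count alone cannot tell 
you which list it came from.

```python
def reduce_to_count(titles):
    """Reduce a list of article titles to a single number (hypothetical helper)."""
    return len(titles)

# Two invented snapshots of a wiki's article list.
snapshot_a = ["Sweden", "Stockholm", "Uppsala"]
snapshot_b = ["Norway", "Oslo", "Bergen"]

count_a = reduce_to_count(snapshot_a)
count_b = reduce_to_count(snapshot_b)

# Same reduced value, different underlying data: the reduction
# cannot be undone from the count alone.
same_count_different_data = (count_a == count_b) and (snapshot_a != snapshot_b)
```

This is why the choice of which numbers to keep has to be made before 
the reduction, not after.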

Page visit counters were an example of such direct statistics, 
and they were also the first to be (blindly, in my opinion) disabled 
during the performance problems in 2002-2003.  Perhaps they were 
never so useful anyway.  Right now (on wikipedia-l) people are 
browsing through the "what links here" list to find the number of 
links.  This work could be saved by presenting a "select count(*)" 
at the top of the [[Special:Whatlinkshere]] page, at virtually no 
extra cost.  Such counts could be presented also for the lists of 
user contributions, pages belonging to a category, etc.
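
As a rough sketch of the kind of count meant here, the following 
Python example runs a "select count(*)" against an in-memory SQLite 
database. The table "links", its columns from_title and to_title, 
the index, and the sample rows are all assumptions made up for the 
illustration; they are not the actual MediaWiki schema, and MediaWiki 
itself is not written in Python.

```python
import sqlite3

def build_demo_db():
    """Create a small in-memory link table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE links (from_title TEXT, to_title TEXT)")
    # An index on the link target lets the database count matching
    # rows without scanning the whole table.
    conn.execute("CREATE INDEX links_to_idx ON links (to_title)")
    conn.executemany(
        "INSERT INTO links (from_title, to_title) VALUES (?, ?)",
        [
            ("Stockholm", "Sweden"),
            ("Uppsala", "Sweden"),
            ("Scandinavia", "Sweden"),
            ("Scandinavia", "Norway"),
        ],
    )
    return conn

def count_links_to(conn, target):
    """Return the number of pages that link to the given title."""
    row = conn.execute(
        "SELECT COUNT(*) FROM links WHERE to_title = ?", (target,)
    ).fetchone()
    return row[0]

conn = build_demo_db()
sweden_count = count_links_to(conn, "Sweden")
```

The same pattern, with a different WHERE clause, would give the counts 
for user contributions or category members.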

Collecting statistics from full database dumps is a slow and heavy 
process.  We could do better.  But only if we know which stats to 
collect.


-- 
  Lars Aronsson (lars at aronsson.se)
  Aronsson Datateknik - http://aronsson.se


