[Wikipedia-l] Wikipedia is an encyclopedia

Stephen Forrest stephen.forrest at gmail.com
Fri Apr 8 02:13:42 UTC 2005


On Mar 10, 2005 5:20 PM, Stirling Newberry
<stirling.newberry at xigenics.net> wrote:

> More than you are aware of, in that so far wikipedia has only had to
> deal with "natural" challenges of software, social organization and
> server load factor. Wait until you start having to deal with the
> problems of organized attempts to extract value from wikipedia, it will
> generate server loads and social problems which are, if the experience
> of e-tailers like EBay and Amazon are any indication two to four orders
> of magnitude above current peak loads. And I am not kidding.

Wikipedia is different from Amazon and eBay in that it is possible to
download its entire content in a single compressed bundle (or a small
set of these, one for each wiki).  Presumably some way can be found to
section off the bandwidth consumed by downloading these bundles from
the bandwidth used for reading and editing the online version (i.e.
have a host of mirrors only for these downloads).

Data miners and others trying to extract value from reading Wikipedia
have a strong incentive to download a bundle and run their datamining
scripts locally, rather than accessing the current Wikipedia
page-by-page over the Net.  It's faster, easier to code, and less
likely to give offense.
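
As a rough illustration of what "run locally" could look like, here is
a minimal Python sketch. It assumes the bundle is a bzip2-compressed
XML export with <page>, <title> and <text> elements; that layout, the
file name in the usage note, and the helper names (open_dump,
iter_pages, count_mentions) are my own assumptions for the example,
not a description of the actual dump format.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def open_dump(path):
    """Open a bzip2-compressed dump for streaming reads (hypothetical helper)."""
    return bz2.open(path, "rb")


def iter_pages(stream):
    """Yield (title, text) for each <page>, without loading the whole file."""
    title, text = None, None
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text or ""
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text or ""
            title, text = None, None
            elem.clear()  # drop the finished page's children to save memory


def count_mentions(stream, needle):
    """Count pages whose text contains the given substring."""
    return sum(1 for _title, text in iter_pages(stream) if needle in text)
```

With a downloaded bundle saved as, say, dump.xml.bz2 (a made-up name),
something like count_mentions(open_dump("dump.xml.bz2"), "PageRank")
would scan every page without sending a single request to the live
site.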

So, what we need to worry about is mostly people doing wide-scale
editing with bots for nefarious purposes (e.g. improving PageRank).

Steve


