Hr. Daniel Mikkelsen wrote:
> I know ideas for a distributed Wikipedia have been
> discussed here before,
I hate to kill good ideas and discussion, and I'd love to work out
the technical details of a distributed solution. But it's also
extremely frustrating to work on a complicated solution when it
turns out that nobody really needs it, so let's run through the old
arguments again, like a checklist.
Yes, application-level distribution has been discussed, but even
though nobody thinks it is outright impossible, it has never been
implemented, because it doesn't solve any problem that we have.
The first question that a proposal must answer is: What problem are
you trying to solve?
Wikipedia itself is the solution to a problem: How to create a free
encyclopedia, especially without getting bogged down in the editorial
process of Nupedia. For this, standard components are used, such as
Apache, PHP, and MySQL. We could invent our own programming language
or database, but we don't, because that could get us sidetracked and
make us lose sight of the original goal. However, Wikipedia has developed
its own wiki software, abandoning the existing UseModWiki software.
So there is a fine line between inventing the new and using the
existing. The innovations in the Wikipedia PHP script that set it
apart from UseModWiki are mainly aimed at supporting the editorial
process (namespaces, statistics, uploads, etc.) or the user
experience (skins), not at performance architecture.
As far as I know, there are no distributed wikis in the world.
Wikipedia is the world's biggest wiki, and would be the first to need
such a solution, but so far runs at a single site.
One "popular" problem to discuss is work load and response times.
However, with the current software architecture (PHP + Apache + MySQL)
it is possible to distribute the work load over a large number of
computers at a single site without any application-level distribution.
It is also possible to identify software bottle necks and improve the
performance without adding hardware. Wikipedia has been "slow"
before, when it had far less work load than today, and some people
thought this was the end of the road, but software improvements showed
that it was possible to handle the load.
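To make that concrete, here is a minimal sketch (not the actual
Wikipedia code) of how one PHP front end could spread database work
over several MySQL servers at a single site: all writes go to one
master, reads go to any replica. The host names, credentials and
query are made up for illustration.

  <?php
  // Sketch: single-site scaling with replicated MySQL servers.
  // Writes go to one master; reads are balanced over replicas.
  $writeHost = 'db-master.internal';
  $readHosts = array('db-slave1.internal', 'db-slave2.internal');

  function db_connect($forWrite) {
      global $writeHost, $readHosts;
      // Writes must hit the master so the replicas stay consistent;
      // reads can go to any replica, picked at random.
      $host = $forWrite ? $writeHost
                        : $readHosts[array_rand($readHosts)];
      return mysqli_connect($host, 'wikiuser', 'secret', 'wikidb');
  }

  // A page view is a read, so it never touches the master.
  $db  = db_connect(false);
  $res = mysqli_query($db,
      "SELECT cur_text FROM cur WHERE cur_title = 'Foo'");
  ?>

Several identical web servers can run the same script behind a
simple load balancer, so capacity grows by adding machines, still
at one site and under one administration.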
In addition to the technical challenge, application-level distribution
brings many new problems with contract law and administration: Who
runs each site? What are the legal contracts between those
responsible for each site? What to do if a site becomes unavailable
or goes out of business? Etc. Those problems are avoided by sticking
to a single site. If you propose a distributed solution, you would
have to include answers to these administrative and legal questions.
Still, there is a possibility that new kinds of problems can be solved
by a distributed architecture. What about the Russian users who have
to pay extra for accessing international websites, but can use Russian
websites within their flat-fee subscription plan? If this problem can
be addressed by a distributed server inside Russia, you would have to
answer who would run it and what it would cost to keep it updated when
Internet traffic across the border is not flat rate.
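To give a feel for what "keeping it updated" would involve, here is
a hypothetical sketch of a read-only mirror that pulls only the
pages changed since its last sync, so cross-border traffic grows
with the edit rate rather than with the number of Russian readers.
The URLs and endpoints below do not exist anywhere; they are pure
assumptions for the sake of the example.

  <?php
  // Hypothetical mirror update: fetch only pages changed since
  // the last sync.  main-site.example.org and its /changes and
  // /raw endpoints are invented for this sketch.
  $since   = trim(file_get_contents('last_sync.txt'));
  $changed = file('http://main-site.example.org/changes?since=' . $since);

  foreach ($changed as $title) {
      $title = trim($title);
      $text  = file_get_contents(
          'http://main-site.example.org/raw?title=' . urlencode($title));
      // Store each page as a plain file; a real mirror would
      // insert it into its own MySQL database instead.
      file_put_contents('pages/' . md5($title) . '.txt', $text);
  }

  file_put_contents('last_sync.txt', date('YmdHis'));
  ?>

Whether such incremental updates cost less than the users' own
international traffic is exactly the kind of question a proposal
would have to answer.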
--
Lars Aronsson (lars@aronsson.se)
Aronsson Datateknik -
http://aronsson.se/