On Mon, Sep 12, 2005 at 04:05:33PM +0200, Lars Aronsson wrote:
> Tomasz Wegrzanowski wrote:
> > For Wikipedia, dictd is not very useful.
> > For Wiktionaries it is - this way content from the Wiktionary
> > becomes just one of many dict data feeds your dict client may use.
>
> What are the scalability issues if wiktionary.org were to run a
> central dictd server for the entire world to use? How many
> clients can connect simultaneously, and how much traffic should be
> expected from the average client?
dictd servers typically use read-only preprocessed databases,
and are therefore easy to replicate. Just dump the Wiktionary once a day,
preprocess it into a suitable format, and copy it over NFS or something.
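The nightly job could be something as simple as this Python sketch
(the dump filename, the XML namespace, and the flat output format are
all just assumptions - a tool like dictfmt would do the final
conversion into dictd's index/data files):

    import gzip
    import xml.etree.ElementTree as ET

    DUMP = "enwiktionary-pages-articles.xml.gz"        # assumed filename
    NS = "{http://www.mediawiki.org/xml/export-0.3/}"  # varies by dump version

    with gzip.open(DUMP, "rb") as dump, \
            open("wiktionary.txt", "w", encoding="utf-8") as out:
        for _, elem in ET.iterparse(dump):
            if elem.tag == NS + "page":
                title = elem.findtext(NS + "title") or ""
                text = elem.findtext(NS + "revision/" + NS + "text") or ""
                if title and ":" not in title:    # skip Talk:, User: etc. pages
                    flat = " ".join(text.split())  # flatten wikitext to one line
                    out.write(f":{title}:{flat}\n")
                elem.clear()                       # keep memory bounded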
The computational cost and traffic overhead are low (though unlike HTTP
there's no free on-the-fly gzip compression) - it's really just a very
simple protocol in which clients only read data from servers.
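To give an idea of how simple, here is roughly what one lookup looks
like on the wire - a minimal Python sketch with no error handling
(dict.org is a public server; "!" means "search all databases, stop at
the first match"):

    import socket

    def dict_command(host, command, port=2628):
        """Send one DICT protocol (RFC 2229) command and collect the reply.

        A minimal sketch - no timeouts, no real error handling.
        """
        with socket.create_connection((host, port)) as sock:
            rfile = sock.makefile("rb")
            rfile.readline()                      # 220 greeting banner
            sock.sendall(command.encode("utf-8") + b"\r\n")
            lines, in_text = [], False
            while True:
                line = rfile.readline().decode("utf-8", "replace").rstrip("\r\n")
                if in_text:                       # body lines, "." terminates
                    if line == ".":
                        in_text = False
                    else:
                        lines.append(line)
                    continue
                lines.append(line)
                if line[:3] in ("151", "152"):    # a text block follows
                    in_text = True
                elif line[:3] in ("250", "550", "551", "552"):
                    break                         # final status, we're done
            sock.sendall(b"QUIT\r\n")
            return lines

    print("\n".join(dict_command("dict.org", "DEFINE ! wiki")))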
The only operations that are at all expensive are the various advanced
match strategies, like "match by regular expression" or "match by
substring". However, the RFC (RFC 2229) only requires that we implement
exact match and prefix match, so the expensive strategies could simply
be disabled if they ever became a problem.
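With the same little client from above, the two mandated strategies
look like this; both can be answered from a sorted headword index,
which is what keeps them cheap:

    # Only "exact" and "prefix" are mandatory in RFC 2229; optional
    # strategies like "re" and "substring" would scan every headword.
    print("\n".join(dict_command("dict.org", "MATCH ! prefix wikt")))
    print("\n".join(dict_command("dict.org", "MATCH ! exact wiki")))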
Maintaining one more daemon and the processing scripts would
probably be the largest part of the cost.