[Foundation-l] Re: Cluster report, September-November, 2005

Anthere anthere9 at yahoo.com
Sun Nov 27 14:20:27 UTC 2005


Domas Mituzas wrote:
> Hello, just a shameless copy-paste from meta
> (http://meta.wikimedia.org/wiki/Cluster_report%2C_September-November%2C_2005)
> 
> These months were yet again amazing in Wikimedia growth history.
> Since September, request rates have doubled, lots of information has
> been added, modified and expanded, and more users have come.
> To deal with that, the site had to improve both its software and
> hardware platforms again.
> 
> Of course, more hardware was thrown at the problem.
> In mid-September three new database servers (thistle, ixia, lomaria)
> were added to the pool, removing an ancient type of hardware from
> service.
> With current data growth rates, the 'old' 4GB-RAM boxes could not keep
> up with operation, except in a quite limited role.
> 40 dual-Opteron application servers have been deployed, conserving our
> limited colocation space as well as providing lots of performance for
> the buck.
> One batch of them (20) was deployed just this week.
> They're equipped with larger drives and more memory, which allows us to
> place various unplanned services on them (9 Apache servers are storing
> old revisions as well); some servers participate in the shared memory
> pool, running memcached.
> 
> One of the really efficient purchases was the $12k image server
> 'amane', providing us with storage space and even the ability to back
> up at current loads.
> It is now running a highly efficient and lightweight HTTP server,
> lighttpd.
> So far images are being served, but the growth of Wikimedia Commons
> will force us to find a really scalable and reliable way to handle lots
> of media.
> 
> Additionally, 10 more application servers are ordered together with a
> new Squid cache server batch.
> These 10 single-Opteron boxes will have 4 small and fast disks and
> should enable efficient caching of content.
> 
> As all this gear was bought with donated money, we really appreciate
> the community's help here, thank you!
> 
> The Yahoo-supplied cluster in Seoul, Korea has finally gone into
> action, bringing cached content closer to Asian locations, as well as
> hosting master databases and an application cluster for the Japanese,
> Thai, Korean and Malaysian Wikipedias.
> 
> For internal load balancing, Perlbal was replaced by LVS, and we've got
> a nice flashy donated load balancing device that may be deployed into
> operation soon as well.
> LVS has to be handled with care, and several tiny misconfiguration
> incidents seriously affected site performance.
> Lately the cluster has become quite big and complex, and we now need
> more sophisticated and extensive sanity checks and test cases.
> 
> There is lots of work going on in establishing more failover
> capabilities: we will be having two active links to our main ISP in
> Florida.
> The static HTML dump is (becoming) nice and usable and may help us in
> case of serious crashes. It can be served from the Amsterdam cluster as
> well!
> 
> As over the last several days we managed to bring the cluster into
> quite proper working shape, it's now important to fix everything and
> prepare for more load, more growth and yet another expansion.
> We hope that, with the help of the community, we will be able to solve
> all our performance and stability issues and avoid being Lohipedia :)
> 
> Lots of various problems have been solved so far in order to achieve
> what we have now, and lots of low-hanging fruit has been picked.
> What is being dealt with now is complex and needs manpower and fresh
> ideas as well.
> 
> Discussions are always welcome in #wikimedia-tech on Freenode (except
> during serious downtimes :).
> 
> And, of course, thanks Team (or rather, Family)! It is amazing to work
> together!
> 
> Cheers,
> Domas


A big cheer for Domas, Brion, Tim, Mark, Kate and all the others helping
the site keep up (see: http://meta.wikimedia.org/wiki/Developer)

On a more personal note, thanks a lot for the report Dammit :-)

I take this opportunity to remind editors that they can find information
in places such as
http://wikimediafoundation.org/wiki/Hardware_and_hosting_report

See also
http://meta.wikimedia.org/wiki/Hardware_ordered_September_14%2C_2005
http://meta.wikimedia.org/wiki/Hardware_ordered_October_6%2C_2005
http://meta.wikimedia.org/wiki/Hardware_ordered_October_18%2C_2005
http://meta.wikimedia.org/wiki/Hardware_ordered_November_15%2C_2005
for the latest orders. This does not show the daily work of maintenance,
installation and improvement though...

Thanks again.

Ant



