We have looked into the cause of wp's intermittent lockups a bit by
watching vmstat output on the different machines. All machines, including
the DB, seem to lock up at the same time, with bi/bo (blocks in/out) and
in (interrupts) dropping to 0 or near 0.
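
To spot these stalls without staring at the terminal, the vmstat output
could be logged on each machine and scanned afterwards. The following is
only a rough sketch: the function names, the threshold of 5 and the sample
rows are made up for illustration and are not measurements from our boxes.
It reads the column names from the vmstat header line and flags every
sample where bi, bo and in are all at or below the threshold.

```python
def parse_vmstat(lines):
    """Yield one dict per vmstat sample row, keyed by column name."""
    columns = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if "bi" in fields and "bo" in fields:
            columns = fields  # header row; vmstat repeats it periodically
            continue
        if columns is None or len(fields) != len(columns):
            continue  # banner line ("procs ---memory--- ...") or malformed row
        try:
            yield dict(zip(columns, (int(f) for f in fields)))
        except ValueError:
            continue  # non-numeric row, skip it


def find_stalls(lines, threshold=5):
    """Return indices of samples where bi, bo and in are all <= threshold."""
    return [
        i for i, row in enumerate(parse_vmstat(lines))
        if all(row[key] <= threshold for key in ("bi", "bo", "in"))
    ]


# Invented sample in vmstat's layout; the numbers are NOT real data.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0      0  81234  10240 402112    0    0   120   340  950 1200 35 10 50  5
 0  3      0  81200  10240 402112    0    0     0     0    2   15  0  0  0 100
 2  0      0  80990  10240 402300    0    0    98   410 1010 1350 40 12 45  3
"""

if __name__ == "__main__":
    print(find_stalls(SAMPLE.splitlines()))  # second sample (index 1) is flagged
```

If the logs from all machines are taken with the same interval and start
time, matching indices across machines would show whether the stalls
really coincide.
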
The obvious conclusion is that this is related to NFS. As a first measure I
would propose moving /apache/common onto the local hard disk on the Apaches.
At the moment every PHP script is checked against the NFS server for changes
on every execution, which is probably the largest part of the load.
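
One way to check this assumption before moving anything would be to time
stat() calls on the NFS-mounted tree and on a local copy. The sketch below
is only an illustration: the helper names and the limit of 1000 files are
my own choices, and NFS attribute caching on the client can make repeated
stats look cheaper than they are under real load.

```python
import os
import time
from itertools import islice


def iter_files(root):
    """Yield the path of every regular directory entry below root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


def mean_stat_seconds(root, limit=1000):
    """Average wall-clock time of one os.stat() call over up to `limit`
    files below root, or None if no files were found."""
    paths = list(islice(iter_files(root), limit))
    if not paths:
        return None
    start = time.perf_counter()
    for path in paths:
        os.stat(path)
    return (time.perf_counter() - start) / len(paths)
```

Running mean_stat_seconds("/apache/common") on one of the Apaches and
comparing it with the same call on a local copy of the tree (wherever that
copy ends up) should give a rough idea of how much each check costs.
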
Additionally, zwinger is a massive single point of failure right now. If it
goes down, we're hosed. I'm currently trying to set up Coda
(http://www.coda.cs.cmu.edu/) and heartbeat on my LAN. Both are used in the
high-availability setups described by the LVS project
(http://www.linuxvirtualserver.org/HighAvailability.html) and should cover
our needs, both performance- and HA-wise.
--
Gabriel Wicke