Shaihulud and I investigated the recent load balancing weirdness a bit
and came across some slow ping times to machines on the secondary
switch. Summaries of two runs of 250 pings each from suda, run at the
same time:
maurus: rtt min/avg/max/mdev = 0.138/2.063/25.486/4.660 ms
coronelli: rtt min/avg/max/mdev = 0.125/0.328/2.023/0.213 ms
More samples between various hosts also showed that the really slow ping
times only occur between machines on the primary switch and the newer
ones on the secondary one (which all hang off a single port on the
primary switch).
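
To put a number on the gap, here is a rough Python sketch that parses the
two summary lines quoted above and compares them. The regex and the
parse_rtt helper are just illustrative names, and it assumes the usual
Linux ping summary format shown in the samples.

```python
import re

# Summary lines copied verbatim from the two ping runs quoted above.
SAMPLES = {
    "maurus": "rtt min/avg/max/mdev = 0.138/2.063/25.486/4.660 ms",
    "coronelli": "rtt min/avg/max/mdev = 0.125/0.328/2.023/0.213 ms",
}

# Matches the "rtt min/avg/max/mdev = a/b/c/d ms" summary line.
RTT_RE = re.compile(
    r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_rtt(line):
    """Return the four summary values of a ping run as floats (in ms)."""
    match = RTT_RE.search(line)
    if match is None:
        raise ValueError("not a ping summary line: %r" % line)
    keys = ("min", "avg", "max", "mdev")
    return dict(zip(keys, (float(v) for v in match.groups())))


stats = {host: parse_rtt(line) for host, line in SAMPLES.items()}

# How much worse is the slow host on average and in the worst case?
avg_ratio = stats["maurus"]["avg"] / stats["coronelli"]["avg"]
max_ratio = stats["maurus"]["max"] / stats["coronelli"]["max"]

print("avg rtt ratio maurus/coronelli: %.1f" % avg_ratio)
print("max rtt ratio maurus/coronelli: %.1f" % max_ratio)
```

The minimum times are nearly identical, so the difference is in the
average, the maximum and the deviation, which looks more like
intermittent queueing than a uniformly slow path.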
The load balancing is based on ICP response times, so network
congestion can temporarily override it. I'm not sure whether the GigE
switch has already been ordered; if not, I'd recommend ordering it as
soon as possible.
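
A toy Python sketch of why the latency matters for the balancing. It
assumes the balancer simply prefers whichever host answers fastest, and
that the answer time is processing time plus network round trip. The
processing times and the uncongested round trip are made up for
illustration; only the two "measured" averages come from the ping runs
above. pick_backend is a hypothetical helper, not anything in the real
setup.

```python
def pick_backend(processing_ms, network_rtt_ms):
    """Return the host with the lowest total response time (ms)."""
    total = {
        host: processing_ms[host] + network_rtt_ms[host]
        for host in processing_ms
    }
    return min(total, key=total.get)


# Hypothetical processing times: maurus is the less loaded machine.
processing_ms = {"maurus": 0.5, "coronelli": 1.5}

# Hypothetical case: same network round trip to both hosts.
uncongested = {"maurus": 0.3, "coronelli": 0.3}

# Average round trips from the two ping runs above.
measured = {"maurus": 2.063, "coronelli": 0.328}

# With equal network latency the less loaded host is picked.
print(pick_backend(processing_ms, uncongested))

# With the measured latency the extra round trip outweighs the load
# difference, so the more heavily loaded host is picked instead.
print(pick_backend(processing_ms, measured))
```

With these numbers the choice flips from maurus to coronelli purely
because of the network delay, which is the kind of override described
above.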
--
Gabriel Wicke