Alex J. Avriette wrote:
> > Order 64-bit capable machines, of course, for their security
> > advantages, and speed advantages when running 64-bit software. Move
> > to 64-bit software as soon as reasonably possible.
> Want to voice 100% approval of this. Nobody wants to hear that
> Postgres compiles natively in 64-bit and makes use of it (I measured a
> 40% increase in row-insert performance on UltraSPARC, among other
> things), but it bears mentioning. I don't know whether MySQL is
> n64'able. I can help build a 32/64 or a 64-bit gcc if anyone would
> like help. Opteron, Opteron, Opteron. I don't think we need 64-bit
> Xeons or Itaniums/Itanium2s.
However, consider the global supplier situation for Opterons. Alas, Dell
don't do them. Who would be the right supplier for this?
> > [...] vastly more reliable systems deployment. Consider Debian as
> > both an operating system and a deployment system: at the moment,
> > different machines run different operating system flavours, making
> > sysadminning harder.
> RedHat, of course, has its own products for these purposes. But I'm
> pretty OS agnostic.
> > Databases are a different issue: you can't apply the same commodity
> > thinking; however, try to order DB machines in identical multiples,
> > too, for the same reasons. Clearly the DB machines will need to be
> > hand-crafted. I don't know much about databases on this sort of
> > scale... but I imagine the Wikipedia developers do.
Anyone know what an 8-way 848 Opteron costs these days?
> First, multiplicity. Second, fast disk: hardware RAID controllers and
> tablespaces which allow you to put your indices on the really fast
> disk and your "big data" on the slower, cheaper, bigger disk. Is SAN
> attachment an option, or are we sticking with NFS? SAN over 1 Gbit
> Ethernet (which is where we are now) is not so bad. I mean, it could
> be worse. Oh, and don't forget the Gbytes and Gbytes of RAM!
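For Postgres specifically, that index/data split can be sketched with
tablespaces. The mount points and the table here are hypothetical, just
to show the shape of it:

```sql
-- Hypothetical mounts: fast RAID-10 for indices, bigger/slower disk for data.
CREATE TABLESPACE fastdisk LOCATION '/mnt/raid10/pg';
CREATE TABLESPACE bigdisk  LOCATION '/mnt/raid5/pg';

-- Bulk rows go on the big disk; the hot index lives on the fast disk.
CREATE TABLE revision (id bigint, title text, body text) TABLESPACE bigdisk;
CREATE INDEX revision_title_idx ON revision (title) TABLESPACE fastdisk;
```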
> > [...] for your servers, and hence your data and sysadmin sanity. An
> > ounce of prevention is again worth a pound of cure.
> Amen.
> > Finally, you might consider buying a cheap radio clock on a per-site
> > basis, if your colo does not already provide you with a local
> > stratum-1 server.
> Am I missing something? Can we not just sync with
> {tick,tock}.usno.navy.mil? If usno.navy.mil goes down, we have much
> bigger problems than a toasted wikipedia.
Consider connectivity going down at a remote site (takes just one
backhoe at the site 1/4 mile down the road where all of your supposedly
"diverse" fibers join the same duct, or simply a router going mad, or
someone hitting the Big Red Power-Off Button at ********* *****
[substitute your national Achilles Heel IXP]), rather than Global
Thermonuclear War. Trust me, a local clock is a good thing. That's why
it's such a useful service to have onsite.
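As a sketch, a per-site ntpd config could keep the public stratum-1
servers as upstreams and add a local reference clock so time holds
through a connectivity outage. The refclock lines below assume a generic
NMEA GPS clock on ntpd's type-20 driver; adjust to whatever hardware is
actually bought:

```
# Public upstreams -- fine while the link is up.
server tick.usno.navy.mil iburst
server tock.usno.navy.mil iburst
# Local GPS/radio refclock (ntpd generic NMEA driver, type 20),
# so the site stays at stratum 1 even when the fibre is cut.
server 127.127.20.0
fudge  127.127.20.0 refid GPS
```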
> Nobody's mentioned backups. Automated backups are hard to do, and
> manual backups require somebody to go there and swap tapes. Is a tape
> changer in the cards for us? We'd need to go with LTO or
> something.... and that's pretty ugly, money-wise. Anyone?
> aa
Now, that's a very good point. However, it might be better to do what
Linus does and "let millions of people mirror it everywhere". This might
be something to ask organizations like the UK Mirror Service, the
Internet Archive and Google to do on a formal basis. This way, the
backups are off-site too. The current archive is 50G. If we take a week
to back it up, that's a data rate of 50e9*8/(86400*7) ≈ 661 kbps < 1
Mbps. So Wikipedia could perform complete backups weekly to three
different sites at a cost of under 3 Mbps sustained. That's cheaper than
the cost of the tape media.
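A quick sanity check of that arithmetic (taking the 50 GB figure from
above, decimal units):

```python
# Bandwidth needed to push the full archive to one mirror in a week.
archive_bits = 50e9 * 8        # 50 GB archive, in bits
week_seconds = 7 * 86400       # one week, in seconds

rate_kbps = archive_bits / week_seconds / 1e3
print(f"{rate_kbps:.0f} kbps per mirror")                   # ~661 kbps
print(f"{3 * rate_kbps / 1e3:.1f} Mbps for three mirrors")  # ~2.0 Mbps
```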
-- Neil