Brion Vibber wrote:
On Sat, 12 Jul 2003, Timwi wrote:
* BLOBs that store article text are combined in the same table as
meta-data (e.g. date, username of a change, change summary, minor flag,
etc.). This is bad because variable-length fields like BLOBs negatively
affect the performance of reading the table.
How much of a difference does this make when we're usually taking single
rows found via an index?
Hm. Good point. I haven't thought about it in that much detail -- I've
only been "taught" this from experience on LJ. I think it has something
to do with hard disk seeks and stuff -- very technical. Regardless,
though, it should be clear that at least *updating* a table with
variable-length fields is quite a lot more complex than updating one
without.
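
To make the split concrete, here is a minimal sketch of what separating the text from the meta-data could look like. It uses Python's built-in sqlite3 module purely as a stand-in for the real database, and every table and column name is an assumption made up for illustration, not the actual schema. SQLite will not show the performance difference discussed above; the sketch only shows the layout, where history listings read the fixed-shape table and the BLOB is fetched by key only when needed.

```python
import sqlite3

# Hypothetical table and column names; the real schema may differ.
SCHEMA = """
CREATE TABLE revision (
    rev_id      INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    username    TEXT NOT NULL,
    summary     TEXT NOT NULL DEFAULT '',
    is_minor    INTEGER NOT NULL DEFAULT 0,
    changed_at  TEXT NOT NULL
);
CREATE TABLE revision_text (
    rev_id  INTEGER PRIMARY KEY REFERENCES revision(rev_id),
    body    BLOB NOT NULL
);
"""


def open_db():
    """Create an in-memory database with the two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def save_revision(conn, title, username, summary, is_minor, changed_at, body):
    """Insert the meta-data row first, then the text row under the same id."""
    cur = conn.execute(
        "INSERT INTO revision (title, username, summary, is_minor, changed_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (title, username, summary, int(is_minor), changed_at),
    )
    rev_id = cur.lastrowid
    conn.execute(
        "INSERT INTO revision_text (rev_id, body) VALUES (?, ?)",
        (rev_id, body),
    )
    conn.commit()
    return rev_id


def history(conn, title):
    """Meta-data only: this query never touches revision_text."""
    return conn.execute(
        "SELECT rev_id, username, summary, is_minor FROM revision "
        "WHERE title = ? ORDER BY rev_id",
        (title,),
    ).fetchall()


def fetch_text(conn, rev_id):
    """Fetch the article text for one revision, or None if it is unknown."""
    row = conn.execute(
        "SELECT body FROM revision_text WHERE rev_id = ?", (rev_id,)
    ).fetchone()
    return row[0] if row else None
```

The cost of this layout is a second lookup whenever the text is actually displayed, so whether it pays off depends on how often meta-data is read without the text.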
* Store translated website text, so translators don't have to dig
through PHP code and submit a file to the mailing list.
We certainly could do this, though there are performance concerns whenever
the idea is brought up. Caching the strings in shared memory may alleviate
this.
It's worked perfectly on LJ ever since it was introduced. Additionally,
I'm keeping MemCacheD in mind while thinking this all through. It would
probably keep all the text in memory all the time.
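
As a rough illustration of the caching idea, here is a minimal read-through cache for interface strings. A plain in-process dictionary stands in for MemCacheD or shared memory, and the `load` callback, the English fallback and the class name are all assumptions for the sake of the example rather than anything LJ or the current code actually does.

```python
class StringCache:
    """Read-through cache for interface strings, keyed by (language, key)."""

    def __init__(self, load, fallback_lang="en"):
        self._load = load              # hypothetical loader: (lang, key) -> str or None
        self._fallback = fallback_lang
        self._cache = {}
        self.misses = 0                # counts lookups that had to hit the loader

    def get(self, lang, key):
        """Return the string for lang, falling back to the default language."""
        if (lang, key) in self._cache:
            return self._cache[(lang, key)]
        self.misses += 1
        text = self._load(lang, key)
        if text is None and lang != self._fallback:
            text = self._load(self._fallback, key)
        if text is None:
            text = key                 # last resort: show the key itself
        self._cache[(lang, key)] = text
        return text

    def invalidate(self, key):
        """Drop every cached language variant of key after it has been edited."""
        for cached in [c for c in self._cache if c[1] == key]:
            del self._cache[cached]
```

After the first request for a given string, later page views are served from memory, so the database is only consulted again when a string is edited and invalidated.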
There's been some talk of adapting the translation system we use at some
of Esperanto cxe Interreto's sites, such as http://lernu.net/, so the
interface and source-file-scanner don't have to be written from scratch.
Well, one of the problems I find with that system is that it is too
easily vandalisable. If you vandalise a single Wiki page, that doesn't
matter too much, since it can be reverted within a few minutes. But if
you vandalise, say, the wording of the "Edit this page" link, it would
affect everybody who visits Wikipedia in the minutes it takes someone
to revert it.
I can see several ways of doing this:
* Restrict access to a select few people. Of course, this is
un-wiki-like and not really much better than modifying LanguageXY.php.
* Make changes in the translatable strings not take effect until they
have been kept unchanged for 24 hours. Some trusted few could be given
the privilege to be able to change the strings directly (in case, for
example, an act of vandalism goes unnoticed for 24 hours).
However, for this to work, the changes in the translation system
should appear on the Recent Changes pages with the Wiki pages. Perhaps
they *should* be Wiki pages with their own namespace (String:XYZ?).
Which in turn would deviate from the concept that lernu.net uses.
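
The 24-hour rule from the second option could be sketched roughly as follows. This is only one possible reading of it: the class and function names are invented, timestamps are plain seconds, and I am assuming that any new edit restarts the clock and that setting a string back to its live wording simply cancels the pending change.

```python
from dataclasses import dataclass
from typing import Optional

HOLD_SECONDS = 24 * 60 * 60  # a change must stand unchanged this long


@dataclass
class TranslatableString:
    live: str                              # wording currently shown to visitors
    pending: Optional[str] = None          # proposed wording, not yet visible
    pending_since: Optional[float] = None  # when the pending wording was proposed


def propose(s, text, now, trusted=False):
    """Record an edit. Trusted users change the live wording directly."""
    if trusted:
        s.live, s.pending, s.pending_since = text, None, None
    elif text == s.live:
        # Reverting to the live wording cancels whatever was pending.
        s.pending, s.pending_since = None, None
    else:
        # Any new edit replaces the pending wording and restarts the clock.
        s.pending, s.pending_since = text, now


def visible(s, now):
    """Return the wording to display, promoting a pending edit once it is old enough."""
    if s.pending is not None and now - s.pending_since >= HOLD_SECONDS:
        s.live, s.pending, s.pending_since = s.pending, None, None
    return s.live
```

With this shape, a vandalised "Edit this page" string never reaches visitors as long as someone reverts it within the holding period, while a trusted user can still push an urgent fix through immediately.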
I haven't looked at the table structure used, but I imagine it's a fairly
straightforward language-key-string triplet set.
I don't know about lernu.net either, but as for LJ, it's a little bit
more complex than that. If you're interested, LJ's database is here:
http://www.livejournal.com/doc/server/ljp.dbschema.ref.html
The tables beginning with "ml_" are the ones pertaining to the
translation system.
* A global table for bidirectional inter-wiki links. People should not
have to add the same link to so many articles.
There's an experimental table for interwiki links, but it's not entirely
the best setup. It's questionable whether bidirectional is really right,
though, as there's not always a 1:1 matchup between articles.
Colour me ignorant, but why shouldn't there always be a 1:1 matchup?
Maybe there isn't now, but articles certainly can (and perhaps they
should) be changed to comply. Or do you know of a particularly striking
example where it should not?
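
To show what I mean by a global table, here is a minimal in-memory sketch. Articles covering the same topic are collected into one group, so a link added once is visible from every member. The class and its methods are invented for illustration, and the sketch deliberately bakes in the 1:1 assumption under discussion: it refuses to put two articles of the same language into one group.

```python
class InterwikiTable:
    """Groups of articles on the same topic, at most one article per language."""

    def __init__(self):
        self._group_of = {}   # (lang, title) -> group id
        self._members = {}    # group id -> {lang: title}
        self._next_id = 1

    def link(self, a, b):
        """Declare that articles a and b, each a (lang, title) pair, match."""
        articles = {a, b}
        old_groups = {self._group_of[x] for x in (a, b) if x in self._group_of}
        for gid in old_groups:
            articles.update(self._members[gid].items())

        # Check the 1:1 assumption before changing anything.
        by_lang = {}
        for lang, title in articles:
            if by_lang.setdefault(lang, title) != title:
                raise ValueError(f"two {lang} articles in one group")

        # Merge everything into a fresh group.
        for gid in old_groups:
            del self._members[gid]
        gid = self._next_id
        self._next_id += 1
        self._members[gid] = by_lang
        for article in articles:
            self._group_of[article] = gid

    def links_for(self, article):
        """Return {lang: title} for every other language in the article's group."""
        gid = self._group_of.get(article)
        if gid is None:
            return {}
        own_lang = article[0]
        return {l: t for l, t in self._members[gid].items() if l != own_lang}
```

The ValueError branch is exactly where a missing 1:1 matchup would surface, so a real design would have to decide what to do there instead of just rejecting the link.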
Oh, by the way. How would you prefer to do the conversion from the old
database to the new? Myself, I thought perhaps we could have the
software do this whenever an article is edited by a user. This way, we
don't have to take Wikipedia down for the time it takes to convert the
entire database.
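
A minimal sketch of the convert-on-edit idea, with plain dictionaries standing in for the old and new tables. The row shapes and the `convert` callback are hypothetical. Reads check the new store first and convert old rows on the fly; only a save actually moves an article across.

```python
class LazyMigratingStore:
    """Serve articles from the new store, migrating old rows when they are edited."""

    def __init__(self, old, new, convert):
        self.old = old          # title -> row in the old layout
        self.new = new          # title -> row in the new layout (a dict)
        self.convert = convert  # hypothetical: old row -> new-layout dict

    def read(self, title):
        """Return the article in the new layout, or None if it does not exist."""
        if title in self.new:
            return self.new[title]
        if title in self.old:
            # Convert for this request only; nothing is written back on read.
            return self.convert(self.old[title])
        return None

    def save(self, title, text):
        """Write the edit to the new store and retire the old row, if any."""
        row = self.read(title) or {"title": title}
        self.new[title] = dict(row, text=text)
        self.old.pop(title, None)

    def remaining(self):
        """Number of articles still waiting in the old layout."""
        return len(self.old)
```

One consequence worth noting: articles that nobody ever edits would stay in the old format indefinitely, so presumably a slow background pass would still be needed to finish the job.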
Are you all still convinced that adapting the current code to all these
radical changes is easier than rewriting it all from scratch? :-)
Yes, certainly.
Okay then. I'll take your word for it and learn some PHP. I'll create a
preliminary SQL table-creation script for all this tomorrow. Which is
really today, but I should really go to bed first...
Good night,
Timwi