On Wednesday 13 August 2003 09:18, tarquin wrote:
Nicholas Knight wrote:
By now it's probably obvious where I'm
going with this. Could one of these
methods (either storing a parsed and non-parsed version or the approach I
took with "de-parsing") be used for some performance gain on Wikipedia's
webserver?
De-parsing strikes me as a rather odd way to do it.
It's odd, no doubt about that; it just happens to be the best fit in my
case. :)
Furthermore, Jimbo has often remarked that disk space is not a problem.
(He may come to regret that remark when we hit a million articles... but
hey! ;)
In general I wouldn't expect it to be a problem. It's just a concern on my
personal server, for which I don't have much in the way of funds available
for upgrading. Thought I'd throw it out there anyway, as it struck me as a
rather elegant solution for cases where disk space is a problem. :)
I would suggest we consider semi-parsing.
Save two versions of the article:
a) the wikitext
b) the wikitext parsed into HTML, with wikilinks still as [[link]]. Note
that this would not be a fully-formed HTML document, just a fragment,
since it would have no head section or enclosing tags.
Upon page read, it's b) that is inserted into the delivered page. Links
are parsed live, since their status as existing / stub / ghost depends
on the state of the database at that moment.
Oops! Right, forgot about that, since it's not applicable to my script
("all the world's a blog" syndrome? ;)). The 'semi-parsing' solution
seems perfect to me.
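
For what it's worth, here's a rough Python sketch of how I picture the
semi-parsing idea working. Everything in it is made up for illustration
(the function names, the in-memory "database", the stub threshold), and
the semi_parse step is a toy stand-in; real wikitext parsing is obviously
far more involved. It just shows the shape of it: the expensive parse
happens once at save time, and only the cheap wikilink pass runs on each
page read, since link status depends on the database at that moment.

import re

# Matches [[Target]] and [[Target|label]].
WIKILINK = re.compile(r"\[\[([^]|]+)(?:\|([^]]+))?\]\]")

articles = {}  # title -> {"wikitext": ..., "semi_html": ...}

def semi_parse(wikitext):
    """Render everything to an HTML fragment EXCEPT [[wikilinks]].
    (Toy example: only handles '''bold'''.)"""
    return re.sub(r"'''(.+?)'''", r"<b>\1</b>", wikitext)

def save_article(title, wikitext):
    # Store both versions: (a) raw wikitext for editing,
    # (b) the semi-parsed HTML fragment for fast page reads.
    articles[title] = {"wikitext": wikitext,
                       "semi_html": semi_parse(wikitext)}

def link_class(target):
    # Existing / stub / ghost depends on the database *right now*,
    # which is exactly why links can't be baked into the cached copy.
    if target not in articles:
        return "ghost"      # page does not exist yet
    if len(articles[target]["wikitext"]) < 100:
        return "stub"       # arbitrary threshold, just for the sketch
    return "existing"

def render_page(title):
    def expand(m):
        target = m.group(1)
        label = m.group(2) or target
        return ('<a class="%s" href="/wiki/%s">%s</a>'
                % (link_class(target), target, label))
    # Per-request work is only this one regex pass over the fragment.
    return WIKILINK.sub(expand, articles[title]["semi_html"])

save_article("Foo", "'''Foo''' links to [[Bar]] and [[Baz|the Baz page]].")
print(render_page("Foo"))
# -> <b>Foo</b> links to <a class="ghost" href="/wiki/Bar">Bar</a> and
#    <a class="ghost" href="/wiki/Baz">the Baz page</a>.

The nice property is that editing article X never invalidates article Y's
cached fragment; Y's links to X simply come out ghost/stub/existing as
appropriate on the next read.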