Ivan Krstic wrote:
> Brion Vibber wrote:
>> The parser cache iirc does have to parse _differently_ for users with
>> different options, so I don't know how much duplication there is.
>
> This is probably worth looking into. With duplication and constant
> growth working against you, 40GB isn't as much as it seems. I'm more
> interested in your opinion about whether the one parser hit on edit
> would solve the parser performance issues altogether, since it takes any
> guesswork out of the picture and scales very easily.

I'm sure it would help some, though I don't know how much as I don't
have the hit ratio numbers.
Making it parse on edit _only_ would require a number of specific
things, though, such as:
* Parsing has to be independent of user options and settings. Things
like math rendering options need to alter only a later stage of output
than what is cached.
* Variable substitutions like the current date and number of articles
must be kept for later. Note that people like to use the date variables
in links for 'X of the day' type features; this causes some niggling
trouble with link consistency.
* Template substitution must similarly be delayed. If the templates are
pre-parsed, though, it could be easy to just grab the template's parse
tree and stick it into the appropriate place. Caveat: templates have
parameters, with the same kinds of problems as variable substitutions
(they are often used in links).
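
To make the "keep it for later" idea concrete, here is a rough sketch in
Python (not MediaWiki's actual code). It parses once into a
user-independent node list, leaves variables as placeholders, and fills
them in at render time. The names Text, Variable, parse_on_edit and
render are made up for illustration, and the handling of {{CURRENTDAY}}
and friends is greatly simplified. Templates and their parameters would
need another placeholder type along the same lines.

```python
import re
from dataclasses import dataclass
from datetime import date


# Hypothetical node types for a cached, user-independent parse result.
@dataclass(frozen=True)
class Text:
    value: str


@dataclass(frozen=True)
class Variable:
    name: str  # e.g. "CURRENTDAY"; resolved at render time, not parse time


# Simplified: only matches bare upper-case variables such as {{CURRENTDAY}}.
_TOKEN = re.compile(r"\{\{([A-Z]+)\}\}")


def parse_on_edit(wikitext):
    """Split text into literal runs and deferred variable placeholders."""
    nodes = []
    pos = 0
    for match in _TOKEN.finditer(wikitext):
        if match.start() > pos:
            nodes.append(Text(wikitext[pos:match.start()]))
        nodes.append(Variable(match.group(1)))
        pos = match.end()
    if pos < len(wikitext):
        nodes.append(Text(wikitext[pos:]))
    return nodes


def render(nodes, today, article_count):
    """Late stage: substitute current values into the cached node list."""
    values = {
        "CURRENTDAY": str(today.day),
        "CURRENTYEAR": str(today.year),
        "NUMBEROFARTICLES": str(article_count),
    }
    out = []
    for node in nodes:
        if isinstance(node, Text):
            out.append(node.value)
        else:
            # Unknown variables are passed through unchanged.
            out.append(values.get(node.name, "{{" + node.name + "}}"))
    return "".join(out)
```

The cached node list is built once per edit. Rendering the same list on
different days (or with a different article count) gives different
output without re-parsing. Links built from such variables are where
the consistency trouble comes in, and this sketch does not address that.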
And of course, if we change the parsing rules we need to be able to
clear out the cache, either through automatic versioning or some other
scheme. I tend to favor versioning: check the cached entry for
currentness at load time, and re-save it then if necessary. That's how
we deal with most such things.
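
A minimal sketch of that versioning scheme, again in Python and under
assumed names: load_parsed, the dict-based cache and the integer version
are all hypothetical, and a real cache would live in memcached or the
database rather than in a dict.

```python
def load_parsed(cache, key, wikitext, parse, version):
    """Return the cached parse result, re-parsing if the entry is stale.

    `version` is a hypothetical counter bumped whenever parsing rules change.
    """
    entry = cache.get(key)
    if entry is not None and entry["version"] == version:
        return entry["tree"]  # entry is current, no parse needed
    # Missing or written by older parsing rules: re-parse and save now.
    tree = parse(wikitext)
    cache[key] = {"version": version, "tree": tree}
    return tree
```

Nothing has to walk the cache and purge it when the rules change; stale
entries are simply replaced the next time they are loaded.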
-- brion vibber (brion @ pobox.com)