Nick Pisarro wrote:
Speaking on a purely theoretical basis, an important
part of writing or
tuning a program written in something like PHP is to understand what
operations cost the most when executed in the particular language. Also in
almost any language, repeatedly scanning, copying and concatenating large
strings, like 100,000 byte articles is really costly. [.....]
Database tuning is quite important as well, though I get the feeling
you've been on top of that.
I find it *highly* interesting that so many people appear to think that
most of the performance problems are related to CPU usage rather than
database queries.
I can't claim to have a lot of experience with this, but it seems
highly unlikely to me. Concatenating two strings, even kilobyte-sized
ones, is *negligible* compared to database queries that search the
entire 'old' table, or create temporary tables, or do ridiculous amounts
of joins and that stuff.
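To put a rough number on that claim, here is a minimal timing sketch (in
Python rather than PHP, and purely illustrative; the point is the order of
magnitude, not the exact figure):

```python
# Rough illustration: concatenating two 100 KB strings is a
# microsecond-scale operation, whereas even a fast database query
# typically costs on the order of milliseconds.
import timeit

a = "x" * 100_000  # a 100 KB "article"
b = "y" * 100_000

# Time 1,000 concatenations of the two 100 KB strings.
seconds = timeit.timeit(lambda: a + b, number=1000)
print(f"one 200 KB concatenation: ~{seconds / 1000 * 1e6:.1f} microseconds")
```

On typical hardware this prints a figure in the tens of microseconds, i.e.
orders of magnitude below the cost of a full-table scan.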
Since I'm afraid I don't know much about the database and the codebase,
I can't say a lot about this, but it seems highly foolish and very
detrimental to database performance to have separate 'cur' and 'old'
tables serving entirely the same function, while at the same time
sticking the BLOB that stores the actual article text into the same
table as the article's meta-information!
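What I mean by separating the text out is something like the following
sketch (hypothetical table and column names, not MediaWiki's actual
schema; SQLite stands in for MySQL):

```python
# Sketch: keep article text in its own table so that queries over the
# revision meta-information never have to touch the large text rows.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Narrow metadata table: rows stay small, so scans and index lookups
# over revision history are cheap.
cur.execute("""CREATE TABLE revision (
    rev_id INTEGER PRIMARY KEY,
    page_title TEXT,
    rev_timestamp TEXT,
    text_id INTEGER)""")

# Wide text table, read only when the article body is actually needed.
cur.execute("CREATE TABLE text (text_id INTEGER PRIMARY KEY, body BLOB)")

cur.execute("INSERT INTO text VALUES (1, ?)", ("x" * 100_000,))
cur.execute("INSERT INTO revision VALUES (1, 'Example', '20050101000000', 1)")

# Listing revisions touches only the narrow table...
rows = cur.execute("SELECT page_title, rev_timestamp FROM revision").fetchall()
# ...and the 100 KB body is fetched on demand.
body = cur.execute("SELECT body FROM text WHERE text_id = 1").fetchone()[0]
print(rows, len(body))
```

With the BLOB in the same table as the metadata, every history listing
drags the full article text through the storage engine for nothing.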
The database schema also uses far too many character strings (VARCHARs)
with indexes on them in places where a numerical ID would do. "Hint
tables", of which "recentchanges" is one, are also often discouraged, as
updating them on every edit is more costly than reading a
general-purpose table like... uhm... 'articlehistories', provided its
indexes are well-chosen.
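The VARCHAR-vs-ID point can be sketched like this (again hypothetical
table names, with SQLite standing in for MySQL): the indexed-string
variant repeats the full title in every row and in the index, while the
numeric variant stores each title once and joins on a small integer.

```python
# Sketch: replacing an indexed VARCHAR foreign key with a numeric ID.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Discouraged variant: every link row repeats the full title string,
# and the index has to store those strings as well.
cur.execute("CREATE TABLE links_by_title (from_title TEXT, to_title TEXT)")
cur.execute("CREATE INDEX idx_lt ON links_by_title(to_title)")

# Suggested variant: titles live once in 'page'; links carry integer IDs.
cur.execute("CREATE TABLE page (page_id INTEGER PRIMARY KEY, title TEXT UNIQUE)")
cur.execute("CREATE TABLE links_by_id (from_id INTEGER, to_id INTEGER)")
cur.execute("CREATE INDEX idx_li ON links_by_id(to_id)")

cur.execute("INSERT INTO page (title) VALUES ('Main_Page'), ('Sandbox')")
cur.execute("INSERT INTO links_by_id VALUES (2, 1)")

# A "what links here" query resolves the title to an ID once, then
# scans a compact integer index instead of comparing strings per row.
(pid,) = cur.execute(
    "SELECT page_id FROM page WHERE title = 'Main_Page'").fetchone()
froms = cur.execute(
    "SELECT from_id FROM links_by_id WHERE to_id = ?", (pid,)).fetchall()
print(froms)
```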
But I've said all of this in August and September last year already.
Greetings,
Timwi