Claudio V wrote:
Is there a way to run fast queries (perhaps using
indexes) using the "old" table in my replication of
the English Wikipedia database (MySQL 4.0.20)?
You'll either have to make sure your query can make use of the existing
indexes, or create the necessary indexes yourself.
select distinct old_user, old_user_text
from old
where old_user > 0
and old_title like 'History_of_%';
Use explain:
+-------+-------+--------------------------+-----------+---------+------+------+-----------------------------+
| table | type | possible_keys | key | key_len | ref
| rows | Extra |
+-------+-------+--------------------------+-----------+---------+------+------+-----------------------------+
| old | range | old_title,user_timestamp | old_title | 255 | NULL
| 1 | where used; Using temporary |
+-------+-------+--------------------------+-----------+---------+------+------+-----------------------------+
Your query is slow because it uses a temporary table.
I'm surprised that this sort of query runs so
slowly,
considering that when on the live Wikipedia, you click
on "User Contributions" and get a list of responses
with lightning speed.
mysql> explain SELECT
-> old_namespace,old_title,old_timestamp,old_comment,
-> old_minor_edit,old_user_text FROM old
-> WHERE old_user_text='Timwi' ORDER BY inverse_timestamp
-> LIMIT 100;
+-------+------+--------------------+--------------------+---------+-------+------+------------+
| table | type | possible_keys | key | key_len | ref
| rows | Extra |
+-------+------+--------------------+--------------------+---------+-------+------+------------+
| old | ref | usertext_timestamp | usertext_timestamp | 255 |
const | 199 | where used |
+-------+------+--------------------+--------------------+---------+-------+------+------------+
See? No temporary table.
A major difference that I can see between your query and this one is
that yours uses DISTINCT, but you'll have to experiment a bit to see if
that's the only reason.
Also, type=ref with ref=const is faster than type=range, but I assume
with what your query is trying to achieve you can't avoid type=range.
Timwi