On Tue, Feb 11, 2003 at 12:35:37AM -0800, Brion Vibber wrote:
On Sat, 2003-02-08 at 14:26, Brion Vibber wrote:
The watchlist:
SELECT DISTINCT
cur_id,cur_namespace,cur_title,cur_comment,
cur_user,cur_user_text,cur_timestamp,cur_minor_edit,cur_is_new
FROM cur,watchlist
WHERE wl_user=7457 AND wl_title=cur_title
AND (cur_namespace=wl_namespace OR cur_namespace=wl_namespace+1)
ORDER BY inverse_timestamp LIMIT 100;
+-----------+------+----------------------------------------------+-----------+---------+--------------------+------+----------------------------------------------------------+
| table     | type | possible_keys                                | key       | key_len | ref                | rows | Extra                                                    |
+-----------+------+----------------------------------------------+-----------+---------+--------------------+------+----------------------------------------------------------+
| watchlist | ref  | wl_user                                      | wl_user   |       4 | const              | 1730 | where used; Using index; Using temporary; Using filesort |
| cur       | ref  | cur_namespace,cur_title,name_title_timestamp | cur_title |     255 | watchlist.wl_title |    1 | where used                                               |
+-----------+------+----------------------------------------------+-----------+---------+--------------------+------+----------------------------------------------------------+
EXPLAIN doesn't give any different results if the namespace+1 is taken
out, so adding near-dupe rows to cover talk pages wouldn't help.
Strange. Did you force it to use the index name_title_timestamp with USE INDEX?
MySQL sometimes chooses a smaller index than it should.
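
To make the suggestion concrete, here is a rough sketch of the same query with
an index hint on cur. It assumes the index really is called
name_title_timestamp, as in the possible_keys column above. Note that
USE INDEX only restricts which indexes the optimizer considers, so the plan
still needs to be checked with EXPLAIN.

```sql
-- Sketch only: the watchlist query from above, with a hint on cur.
-- name_title_timestamp is taken from the EXPLAIN output; nothing else changes.
EXPLAIN
SELECT DISTINCT
  cur_id, cur_namespace, cur_title, cur_comment,
  cur_user, cur_user_text, cur_timestamp, cur_minor_edit, cur_is_new
FROM cur USE INDEX (name_title_timestamp), watchlist
WHERE wl_user=7457 AND wl_title=cur_title
  AND (cur_namespace=wl_namespace OR cur_namespace=wl_namespace+1)
ORDER BY inverse_timestamp LIMIT 100;
```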
I'm not quite sure how to match up the indexes right. Note that
watchlist has a composite index on wl_user, wl_namespace, and
wl_title... but I don't know if it can match those up if part 1 of the
index is a constant, and parts 2 and 3 match 1 and 2 of the other
table's index.
They don't have to match. All that is required is that the columns on which
it joins watchlist with cur are a prefix of some index on cur. So without
the namespace+1 part the name_title_timestamp index should do, and with it
probably also (but I'm not sure, because I don't know how MySQL handles the
OR). As always, EXPLAIN will tell you if it uses the right indices or not.
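
For example, a cut-down version without the namespace+1 part could be checked
like this. I am assuming here that name_title_timestamp starts with
(cur_namespace, cur_title), which the name suggests but which I have not
verified. If that holds, the join columns form a prefix of that index.

```sql
-- Sketch under the assumption that name_title_timestamp begins with
-- (cur_namespace, cur_title). The join below uses exactly those two columns.
EXPLAIN
SELECT cur_id, cur_namespace, cur_title, cur_timestamp
FROM cur, watchlist
WHERE wl_user=7457
  AND cur_namespace=wl_namespace
  AND cur_title=wl_title
ORDER BY inverse_timestamp LIMIT 100;
```

If the key column for cur still shows cur_title instead of
name_title_timestamp, the optimizer picked the smaller index.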
Some prolific users have three thousand or more titles in their watchlist;
the total number of distinct watched pages on the en.wiki is 30409.
Is there any way we could do this without temporary tables & filesorts &
whatnot?
Yes, there is. You could add an extra inverse_timestamp column to the
watchlist that (very redundantly) stores the inverse_timestamp of the
watched page. Obviously this would have to be updated each time the page is
updated (and when the page is added to the watchlist). If you then have an
index on (wl_user, inverse_timestamp) and make sure MySQL uses it, then it
won't have to sort.
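
Roughly, it could look like the sketch below. The column name
wl_inverse_timestamp, the index name wl_user_inverse_timestamp, the CHAR(14)
type and the literal values are all made up for illustration; adjust them to
whatever cur.inverse_timestamp really is. Talk pages (the namespace+1 case)
are left out to keep it short.

```sql
-- Hypothetical names and type; not part of the current schema.
ALTER TABLE watchlist
  ADD COLUMN wl_inverse_timestamp CHAR(14) NOT NULL DEFAULT '';

CREATE INDEX wl_user_inverse_timestamp
  ON watchlist (wl_user, wl_inverse_timestamp);

-- To be run on every page save (and when a page is added to a watchlist).
-- The timestamp value and the title are placeholders.
UPDATE watchlist
  SET wl_inverse_timestamp='79969788916462'
  WHERE wl_namespace=0 AND wl_title='Some_page';

-- The watchlist query, now ordered by the column in watchlist itself.
SELECT cur_id, cur_namespace, cur_title, cur_comment,
       cur_user, cur_user_text, cur_timestamp, cur_minor_edit, cur_is_new
FROM watchlist USE INDEX (wl_user_inverse_timestamp), cur
WHERE wl_user=7457
  AND cur_namespace=wl_namespace
  AND cur_title=wl_title
ORDER BY wl_inverse_timestamp LIMIT 100;
```

Whether MySQL then really reads the rows in index order and skips the
filesort is something EXPLAIN has to confirm; I have not tried it on the live
tables.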
-- Jan Hidders