I've set up an indexing update daemon for the Lucene search indexes,
which accepts updates sent from a MediaWiki extension hook over XML-RPC.
One of the main reasons for doing it this way is that the page text
storage changes in MediaWiki 1.5 would make it much harder for a
separate app to get the text by poking directly at the database. Another,
of course, is that it'll just be easier to push out updates and to handle
page deletions.
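
As a rough illustration of the push model (not the daemon's actual code), here is a minimal Python sketch using only the standard library. The method names `updater.updatePage` and `updater.deletePage`, the argument layout, and the in-memory `index` dict are all assumptions made up for the example; the real daemon's interface is described on the notes page. The point is only that the page text travels inside the XML-RPC request, so the indexer never has to read the database, and a deletion is just another call.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Hypothetical in-memory stand-in for the search index: (dbname, title) -> text.
index = {}


def update_page(dbname, title, text):
    """Add or replace a page; the text arrives in the request itself."""
    index[(dbname, title)] = text
    return True


def delete_page(dbname, title):
    """Drop a page from the index; returns False if it was not indexed."""
    return index.pop((dbname, title), None) is not None


def start_daemon():
    """Start a toy XML-RPC listener on a free local port and return it."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    # Method names are illustrative assumptions, not the real daemon's API.
    server.register_function(update_page, "updater.updatePage")
    server.register_function(delete_page, "updater.deletePage")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    server = start_daemon()
    host, port = server.server_address
    # This is the role the extension hook plays on page save or delete.
    proxy = ServerProxy(f"http://{host}:{port}/")
    proxy.updater.updatePage("examplewiki", "Sandbox", "new page text")
    proxy.updater.deletePage("examplewiki", "Sandbox")
    server.shutdown()
    server.server_close()
```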
Some quick notes:
http://meta.wikimedia.org/wiki/User:Brion_VIBBER/MWDaemon#Updater_daemon
Right now it's just running, indexing new updates. I'll re-pull old
pages and copy out the indexes later today, and set up the regular
pause-and-copy job.
The maintenance script 'luceneUpdate.php' can be used to poke the daemon
for status, to flush indexes, and to stop/start the update applier
thread for queued updates.
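
To make the stop/start idea concrete, here is a small Python sketch of a queue with a pausable applier thread. It is an assumption about the general shape, not the daemon's implementation: the class name `UpdateApplier`, its methods, and the dict standing in for the index are hypothetical, and flushing indexes to disk is left out. While the applier is stopped, incoming updates simply pile up in the queue, which is the property a pause-and-copy job would rely on.

```python
import queue
import threading


class UpdateApplier:
    """Hypothetical sketch: queue updates and apply them on a worker thread."""

    def __init__(self):
        self.pending = queue.Queue()
        self.index = {}  # stand-in for the real index
        self.applied = 0
        self._running = threading.Event()
        self._thread = None

    def enqueue(self, title, text):
        """Accept an update at any time, whether or not the applier runs."""
        self.pending.put((title, text))

    def start(self):
        """Start applying queued updates; a no-op if already running."""
        if self._thread is not None and self._thread.is_alive():
            return
        self._running.set()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self):
        """Pause applying; queued updates stay queued until start()."""
        self._running.clear()
        if self._thread is not None:
            self._thread.join()
            self._thread = None

    def status(self):
        """Report roughly what a status poke might return."""
        return {
            "running": self._running.is_set(),
            "queued": self.pending.qsize(),
            "applied": self.applied,
        }

    def _run(self):
        while self._running.is_set():
            try:
                # Short timeout so a stop() request is noticed promptly.
                title, text = self.pending.get(timeout=0.05)
            except queue.Empty:
                continue
            self.index[title] = text
            self.applied += 1
            self.pending.task_done()


if __name__ == "__main__":
    applier = UpdateApplier()
    applier.enqueue("Sandbox", "queued while stopped")
    print(applier.status())
    applier.start()
    applier.pending.join()  # wait until the queue has drained
    applier.stop()
    print(applier.status())
```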
If this somehow breaks the site overnight, somebody check for the
include of MWSearchUpdateHook.php in CommonSettings and take it out.
The daemon's log file is in /usr/local/mwsearch/MWUpdater.log on maurus.
-- brion vibber (brion @ pobox.com)