Thanks for this. I've added a note about the rvcontinue parameter to the
API wiki docs, since it was previously only mentioned on the
auto-generated MediaWiki API documentation page:
http://www.mediawiki.org/wiki/API:Query_-_Properties#Parameters
To give me an idea of what is reasonable: would requesting 100 revisions
every 10 seconds be OK?
Rob
On 30/11/2009, at 3:15 PM, Tim Starling wrote:
Robert Carter wrote:
I've been experimenting with the parameters to Special:Export to
retrieve the whole history of an article. I haven't been able to get
more than 1000 revisions (from en wikipedia).
Does anyone know of a way to obtain the full history of an article?
Those huge 7z dumps seem impractical to work with just to extract the
data for a single page.
You can use api.php with rvprop=content and rvcontinue to fetch the
text of all revisions of a page. Please do this in a single thread
with a substantial delay between requests, since this is a very
expensive operation for our servers. Do not attempt to do it for a
large number of pages; for that, use the XML download instead. Do not
do it regularly, and do not set up a web gateway that allows users to
initiate these requests.
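To illustrate the approach described above, here is a minimal sketch of a
single-threaded fetcher that pages through a full revision history with
rvcontinue and sleeps between requests. It targets the standard
action=query / prop=revisions API; the endpoint URL, batch size, delay,
and the rvslots parameter (which newer MediaWiki versions expect) are my
assumptions, not anything specified in this thread, so adjust them for
the wiki and API version you are actually talking to.

```python
# Sketch: fetch every revision of one page via api.php, one request at a
# time, pausing between batches. Parameter names follow the standard
# MediaWiki query API; BATCH_SIZE and DELAY_SECONDS are assumed values.
import json
import time
import urllib.parse
import urllib.request

API = "https://en.wikipedia.org/w/api.php"  # assumed endpoint
DELAY_SECONDS = 10  # substantial delay between requests, per Tim's note
BATCH_SIZE = 50     # revisions per request

def build_params(title, rvcontinue=None):
    """Query parameters for one batch of revisions of `title`."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp|content",
        "rvlimit": str(BATCH_SIZE),
        "rvslots": "main",  # required by newer MediaWiki for content
    }
    if rvcontinue:
        # Resume where the previous response left off.
        params["rvcontinue"] = rvcontinue
    return params

def fetch_history(title):
    """Yield revision dicts for `title`, newest first, single-threaded."""
    rvcontinue = None
    while True:
        url = API + "?" + urllib.parse.urlencode(build_params(title, rvcontinue))
        req = urllib.request.Request(
            url, headers={"User-Agent": "history-fetch-sketch/0.1"}
        )
        with urllib.request.urlopen(req) as resp:
            data = json.load(resp)
        for page in data["query"]["pages"].values():
            yield from page.get("revisions", [])
        cont = data.get("continue")
        if not cont:
            break  # no more batches
        rvcontinue = cont["rvcontinue"]
        time.sleep(DELAY_SECONDS)  # be polite between requests

if __name__ == "__main__":
    for rev in fetch_history("Paris"):
        print(rev["revid"], rev["timestamp"])
```

Keeping the loop serial and sleeping between batches is the point here:
one page's history can run to thousands of revisions, and parallelizing
or removing the delay is exactly what the request above asks us to avoid.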
-- Tim Starling
_______________________________________________
MediaWiki-l mailing list
MediaWiki-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l