I want to be able to mirror an article from our wiki
onto a separate
website, leaving out the navigation links, tabs and other material
that isn't specific to my article. Is there a technique for doing this?
Not that I am aware of, but (depending on how badly you want this
behaviour), and assuming you always want the latest "live" version,
you could have the separate website call Special:Export (e.g.
http://en.wikipedia.org/wiki/Special:Export/test ), parse the
returned XML to get the raw wiki text, strip out the links, and then
display it in whatever way you want.
e.g. in PHP you could do all of the above, apart from removing the
links and displaying the result, like so:
======================================
function getWebArticleText($title) {
    // build the URL (encode the title so spaces etc. don't break it)
    $url = "http://en.wikipedia.org/wiki/Special:Export/" . rawurlencode($title);
    // init cURL resource
    $ch = curl_init();
    // set URL to pull
    curl_setopt($ch, CURLOPT_URL, $url);
    // return the pulled web text into a variable
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    // retrieve the web text
    $str = curl_exec($ch);
    // if we encountered an error, then log it and exit
    if (curl_error($ch)) {
        trigger_error("Curl error #: " . curl_errno($ch) . " - " . curl_error($ch));
        print "Curl error #: " . curl_errno($ch) . " - " . curl_error($ch) . " - exiting.\n";
        exit();
    }
    // close the cURL resource
    curl_close($ch);
    // we just want the article wiki text found in the <text> element
    // of the [[Special:Export]] XML output
    $parser = xml_parser_create('UTF-8');
    xml_parse_into_struct($parser, $str, $val, $ind);
    xml_parser_free($parser);
    // if we got valid data
    if (isset($ind['TEXT'][0])) {
        $text = $val[$ind['TEXT'][0]]['value'];
    }
    // otherwise we got invalid data (most likely the article was
    // deleted, or possibly Wikipedia may be malfunctioning)
    else {
        $text = "";
    }
    // return the article's wiki text
    return $text;
}
======================================
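For example, to fetch and print the raw wiki text of an article (the
title "Test" here is just an illustration):
======================================
$wikiText = getWebArticleText("Test");
if ($wikiText === "") {
    print "No article text found.\n";
} else {
    print $wikiText;
}
======================================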
Then you'll need some regexes to remove the links, and you'll have
to work out how to display the result (given that it's wiki text,
and you presumably want HTML).
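For the link removal, here's a rough sketch; the patterns only cover
the common [[...]] and [http://...] forms, and will miss images,
templates, nested links and other markup:
======================================
// strip the common wiki link forms from raw wiki text;
// a rough starting point, not a complete wiki-text parser
function stripWikiLinks($text) {
    // [[Target|Label]] -> Label
    $text = preg_replace('/\[\[[^\]|]*\|([^\]]*)\]\]/', '$1', $text);
    // [[Target]] -> Target
    $text = preg_replace('/\[\[([^\]]*)\]\]/', '$1', $text);
    // [http://example.com Label] -> Label
    $text = preg_replace('/\[(?:https?|ftp):\/\/\S+ ([^\]]*)\]/', '$1', $text);
    // [http://example.com] -> removed entirely
    $text = preg_replace('/\[(?:https?|ftp):\/\/\S+\]/', '', $text);
    return $text;
}
======================================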
All the best,
Nick.