I want to be able to mirror an article from our wiki
onto a separate
website, leaving out the navigation links, tabs and other material
that isn't specific to my article. Is there a technique for doing this?
Not that I am aware of, but (depending on how badly you want this
behaviour), and assuming you always want the latest "live" version,
you could have the separate website call Special:Export (e.g.
http://en.wikipedia.org/wiki/Special:Export/test ), parse the
returned XML to get the raw wiki text, strip out the links, and then
display it in whatever way you want.
e.g. in PHP you could do all of the above, apart from removing the
links and displaying the result, like so:
======================================
function getWebArticleText($title) {
    // build the URL (encode the title so spaces etc. don't break it)
    $url = "http://en.wikipedia.org/wiki/Special:Export/" . rawurlencode($title);
    // init cURL resource
    $ch = curl_init();
    // set URL to pull
    curl_setopt($ch, CURLOPT_URL, $url);
    // return the pulled web text into a variable
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    // retrieve the web text
    $str = curl_exec($ch);
    // if we encountered an error, then log it and exit
    if (curl_error($ch)) {
        trigger_error("Curl error #: " . curl_errno($ch) . " - " . curl_error($ch));
        print "Curl error #: " . curl_errno($ch) . " - " . curl_error($ch) . " - exiting.\n";
        exit();
    }
    // close the cURL resource
    curl_close($ch);
    // we just want the article wiki text found in the <text> element
    // of the [[Special:Export]] XML output
    $parser = xml_parser_create('UTF-8');
    xml_parse_into_struct($parser, $str, $val, $ind);
    xml_parser_free($parser);
    // if we got valid data
    if (isset($ind['TEXT'][0])) {
        $text = $val[$ind['TEXT'][0]]['value'];
    }
    // otherwise we got invalid data (most likely the article was
    // deleted, or possibly Wikipedia may be malfunctioning)
    else {
        $text = "";
    }
    // return the article's wiki text
    return $text;
}
======================================
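For example, to fetch and print the raw wiki text of an article (the
title "Test" here is just an illustration):
======================================
$wikiText = getWebArticleText("Test");
if ($wikiText === "") {
    print "No article text found.\n";
} else {
    print $wikiText;
}
======================================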
Then you'll need some regexes to remove the links, and you'll have
to work out how to display the result (given that it's wiki text,
and you presumably want HTML).
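For the link removal, here's a rough sketch; the patterns only cover
the common [[...]] and [http://...] forms, and will miss images,
templates, nested links and other markup:
======================================
// strip the common wiki link forms from raw wiki text;
// a rough starting point, not a complete wiki-text parser
function stripWikiLinks($text) {
    // [[Target|Label]] -> Label
    $text = preg_replace('/\[\[[^\]|]*\|([^\]]*)\]\]/', '$1', $text);
    // [[Target]] -> Target
    $text = preg_replace('/\[\[([^\]]*)\]\]/', '$1', $text);
    // [http://example.com Label] -> Label
    $text = preg_replace('/\[(?:https?|ftp):\/\/\S+ ([^\]]*)\]/', '$1', $text);
    // [http://example.com] -> removed entirely
    $text = preg_replace('/\[(?:https?|ftp):\/\/\S+\]/', '', $text);
    return $text;
}
======================================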
All the best,
Nick.