Hi Michael and all,
The first thing we implemented was exactly this idea: a proxy using
the Wikipedia API.
The proxy is here:
http://mementoproxy.lanl.gov/wiki/timegate/(wikipedia URI)
For example:
http://mementoproxy.lanl.gov/wiki/timegate/http://en.wikipedia.org/wiki/Clo…
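For illustration, here is a minimal sketch of how a client might build a datetime-negotiated request against that TimeGate. The URL pattern (proxy prefix followed by the original URI) follows the example above, but the header name X-Accept-Datetime is my assumption, in the spirit of the experimental X- headers mentioned later in this mail; the deployed proxy may expect something different.

```python
# Hypothetical sketch of a Memento client request against the LANL TimeGate
# proxy. The "X-Accept-Datetime" header name is an assumption, not confirmed
# against the deployed proxy.
import calendar
import datetime
from email.utils import formatdate

TIMEGATE = "http://mementoproxy.lanl.gov/wiki/timegate/"

def timegate_request(original_uri, when):
    """Return (url, headers) for a datetime-negotiated request.

    The TimeGate URL is simply the proxy prefix followed by the
    original resource URI, as in the example above.
    """
    timestamp = calendar.timegm(when.timetuple())
    headers = {"X-Accept-Datetime": formatdate(timestamp, usegmt=True)}
    return TIMEGATE + original_uri, headers

# Ask for the state of a page as of noon UTC, 1 June 2009:
url, headers = timegate_request(
    "http://en.wikipedia.org/wiki/Memento",
    datetime.datetime(2009, 6, 1, 12, 0, 0),
)
```

The client would then issue a GET to that URL with those headers and follow the redirect the TimeGate returns.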
We have also implemented proxies for the Internet Archive, Archive-It,
WebCitation.org and several others, as proof-of-concept pieces for the
research.
There are several reasons why a native implementation is better for all
concerned:
1. The browser somehow needs to know where the proxy is, rather than being
natively redirected to the correct page. For a few websites and a few
proxies this is tolerable, but even one proxy per CMS would be an
impossible burden to maintain, let alone one proxy per website!
2. If the website redirected to the proxy, rather than the client knowing
where to go, the client would have to trust that the proxy behaves
correctly. In a native implementation, you're never redirected off-site.
3. The proxy will redirect back to the appropriate history page; however,
that page doesn't know that it's being treated as a Memento, and will not
issue the X-Datetime-Validity or X-Archive-Interval headers. This makes it
difficult (though not impossible) for the client to detect that it has been
redirected correctly.
4. The offsite redirection adds at least two extra HTTP transactions per
resource, slowing down retrieval. In the native implementation the main
page redirects directly to the history page. With the proxy, the browser
goes to the main page, then either knows of or is redirected to the proxy;
the proxy makes one or more API calls to fetch the history for the page and
calculate the right revision, and then redirects the client back there.
5. We don't have to maintain the proxies :)
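To make point 4 concrete, here is a back-of-the-envelope model (my own illustration, not measurements from either system) counting the round trips before the client finally sees the old revision:

```python
# Rough model of the HTTP transactions in each flow. Each list entry is one
# round trip that the client or the proxy must complete before the old
# revision is displayed. The exact counts are illustrative, not measured.

proxy_flow = [
    "client GETs the original page",
    "client GETs the TimeGate on the proxy (known a priori, or via redirect)",
    "proxy calls the site's history API to pick the right revision",
    "client follows the redirect back to the selected old revision",
]

native_flow = [
    "client GETs the original page, which negotiates on datetime itself",
    "client follows the redirect straight to the right history page",
]

extra_transactions = len(proxy_flow) - len(native_flow)
```

Under these assumptions the proxy route costs at least two extra transactions per resource, and more if the proxy needs several API calls to locate the revision.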
So for Wikimedia installations the native approach is better as it's
trusted, faster, and involves fewer API calls. For the client it's better
as it's faster and doesn't require extra intelligence or a list of proxies.
For the proxy maintainer it's better as they're no longer needed.
I hope that helps clarify things,
Rob Sanderson
(Also at Los Alamos with Herbert Van de Sompel)
Michael Dale wrote:
Instead of writing it as an extra header to the HTTP protocol ... why don't
they write it as a proxy to Wikimedia (or any other site they want to
temporally proxy)? Getting a new HTTP header out there is not an easy task:
at best a small percentage of sites will support it, and then you need to
deploy clients and write user interfaces that support it as well.
If viewing old versions of sites is something interesting to them, it is
probably best to write an interface, a Firefox extension or Greasemonkey
script, that provides a "temporal" interface of their liking for the
MediaWiki API (presumably the "history" button fails to represent their
vision?) ... non-MediaWiki sites could go through the Wayback Machine.
If the purpose is to support searching or archiving, then it's probably
best to proxy the MediaWiki API through a proxy that they set up, one that
supports those temporal requests across all sites (i.e. an enhanced
interface to the Wayback Machine?).
--michael