Hi Tim,
If there's a problem with viewing past versions of the main page, that's
perfectly okay -- it can be excluded from the set of resources that support
datetime content negotiation, just like the Special: pages.
I admit to not following the second issue completely. A regular robot would
never send the X-Accept-Datetime header to jump back in time, so that's okay.
A regular robot would also respect the history page policy and not crawl
backwards either, as you say. A robot that did send X-Accept-Datetime would
end up crawling old revision pages without ever hitting a history list, but
couldn't that also be forbidden via robots.txt, if the revision pages were
excluded too?
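To make the robots.txt idea concrete, here is a rough sketch of how a well-behaved crawler would treat such an exclusion. Everything specific in it is an assumption for illustration: the example.org host, the PastWebBot agent name, and the choice of /w/ as the path that serves old revisions are hypothetical, not actual Wikimedia configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the script path assumed to serve old
# revisions (/w/index.php?...&oldid=...), leave article paths crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /w/
"""


def allowed(url: str, agent: str = "PastWebBot") -> bool:
    """Return True if the hypothetical rules above permit fetching url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


# Illustrative URLs only; the host and revision id are made up.
current = "http://example.org/wiki/Main_Page"
old_rev = "http://example.org/w/index.php?title=Main_Page&oldid=12345"

print(allowed(current))  # current page is outside the disallowed path
print(allowed(old_rev))  # old revision falls under Disallow: /w/
```

This only covers robots that honour robots.txt; a crawler that ignores it would still have to be handled some other way.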
However, it will probably be a long time before people write past-web
crawlers, and a use case for doing so at all is pretty hard to come up
with. :)
Hope this addresses your concerns!
Rob
On Thu, Nov 12, 2009 at 5:15 PM, Tim Starling <tstarling(a)wikimedia.org> wrote:
Daniel Kinzler wrote:
Hi all
The Memento Project <http://www.mementoweb.org/> (including the Los Alamos
National Laboratory (!) featuring Herbert Van de Sompel of OpenURL fame) is
proposing a new HTTP header, X-Accept-Datetime, to fetch old versions of a
web resource. They already wrote a MediaWiki extension for this
<http://www.mediawiki.org/wiki/Extension:Memento> - which would of course be
particularly interesting for use on Wikipedia.
Do you think we could have this for Wikimedia projects? I think that would be
very nice indeed. I recall that ways to look at last week's main page have
been discussed before, and I see several issues:
You can't view the main page as it was in the past, because users
routinely upload temporary images to display there, so that they can
be protected, and then delete them once they're off the page.
Also, we can't have people crawling Wikipedia while requesting old
versions, because of the excessive disk seeking and CPU usage that
would generate. That's why the history page has a robot policy of
noindex, nofollow.
-- Tim Starling
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l