Hi
I am currently trying to build a cache for "mwoffliner": one cache for images
(thumbnails) and one for Parsoid output. For the images/thumbnails
it's pretty straightforward thanks to the "last-modified" header.
Unfortunately, for the Parsoid output, this seems to be more
complicated. Gabriel's htmldumper relies only on the oldid value, but
I'm not really satisfied with this approach because I want to be able to
download a new version of the HTML for the same oldid if necessary (for
example if the HTML output was improved with a Parsoid fix).
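
To make the requirement concrete, here is a minimal sketch of the cache decision I have in mind. All names (CachedHtml, needs_refetch, server_rendered_at) are hypothetical and not part of mwoffliner or htmldumper, and it assumes the server exposes some render timestamp (for instance a "last-modified" value, which Parsoid does not send today):

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class CachedHtml:
    """Hypothetical cache entry for one Parsoid rendering."""
    oldid: int
    rendered_at: datetime  # when the cached HTML was rendered
    html: str


def needs_refetch(entry: Optional[CachedHtml],
                  server_rendered_at: Optional[datetime]) -> bool:
    """Decide whether the HTML for an oldid should be downloaded again.

    server_rendered_at is an assumed server-supplied timestamp; when it is
    missing we can only fall back to the oldid-only behaviour.
    """
    if entry is None:
        return True   # nothing cached for this oldid yet
    if server_rendered_at is None:
        return False  # no timestamp available: trust the oldid match
    # Same oldid, but the server holds a newer rendering.
    return server_rendered_at > entry.rendered_at
```

With an oldid-only key, the last branch can never trigger, which is exactly the limitation described above.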
There is an "age" header, but I don't really understand how it
fundamentally differs from "last-modified". Do we have the same information
here, just presented in another way? If yes, why is that better than
"last-modified"?
There is also the "x-varnish" header, but this is IMO internal
information I should not rely on (and BTW, from time to time we get responses
with two "x-varnish" header entries, which looks pretty weird to me - see
PS).
Finally, my question: could we introduce a "last-modified" HTTP header for the Parsoid output?
Regards
Emmanuel
PS: Here is an example of a request that returns two "x-varnish" headers:
$ curl -I "http://parsoid-lb.eqiad.wikimedia.org/dewiki/Almer%C3%ADa?oldid=133672544"
HTTP/1.1 200 OK
X-Powered-By: Express
Vary: Accept-Encoding
Access-Control-Allow-Origin: *
Cache-Control: s-maxage=2592000
content-revision-id: 133672544
X-Parsoid-Performance: duration=4063; start=1416051524354
Content-Type: text/html; charset=UTF-8
X-Varnish: 735376643 735208307
Via: 1.1 varnish
Date: Sat, 15 Nov 2014 12:03:47 GMT
X-Varnish: 1047669169
Age: 1499
Via: 1.1 varnish
Connection: keep-alive
X-Cache: cp1058 hit (6), cp1058 frontend miss (0)
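
For what it's worth, here is a rough sketch of how the "Date" and "Age" values from this response could be combined. It assumes the standard HTTP meaning of "Age" (seconds the response has spent in caches since it came from the origin), so "Date" minus "Age" only approximates when Varnish obtained the HTML from Parsoid; the function name cached_since is made up for illustration:

```python
from datetime import datetime, timedelta
from email.utils import parsedate_to_datetime


def cached_since(date_header: str, age_header: str) -> datetime:
    """Approximate time at which the cache obtained the response.

    Computed as the "Date" header minus "Age" seconds. This is an estimate
    derived from cache metadata, not a true "last-modified" value.
    """
    served_at = parsedate_to_datetime(date_header)  # timezone-aware for "GMT"
    return served_at - timedelta(seconds=int(age_header))


# Values copied from the response above.
stored = cached_since("Sat, 15 Nov 2014 12:03:47 GMT", "1499")
print(stored.isoformat())  # 2014-11-15T11:38:48+00:00
```

This works only while the object stays in Varnish; once it is evicted and re-rendered the estimate resets, which is one reason an explicit "last-modified" header would be more reliable.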
--
Kiwix - Wikipedia Offline & more
* Web: http://www.kiwix.org
* Twitter: https://twitter.com/KiwixOffline
* more: http://www.kiwix.org/wiki/Communication