I'd like to propose defining *one* request header to be used for all
analytics purposes. It can hold key/value pairs and be set client-side
where applicable. Varnish can append to it where needed, with later keys
overriding earlier ones. Then we can log that one header across all
HTTP/caching clusters without having to change the log stream all the
time and without wasting much space, and configuration changes on the
caching edge are kept to a minimum as well.
Agreed. Instrumentation should ideally never get in the way of production
performance, so if we can cut or optimize header use for logging without
it being too onerous, we'll happily do so. AFAIK, the reasons that custom
HTTP headers are used at all are:
- They're accessible from varnishncsa without code modifications;
- Varnish and/or other parties in the request chain can munge the values
prior to logging to save bytes (examples being X-CS, which replaces the
semantic carrier name with a [vastly shorter] numeric code, and the
proposed X-MF-Mode header, which avoids the need to log the whole Cookie
header for post-processing).
Ideally, none of this should need to make a trip to the client. I don't
recall seeing anything in the Varnish docs providing a way to send values
exclusively to the loggers, but if there is one, that's an easy win, and it
wouldn't require any changes to our parsing pipeline.
If that's not possible, it makes sense to collapse the various headers into
a single KV field. That would require changes on our side, including to all
downstream consumers of the log stream (of which there are surprisingly
many), so it's not a trivial move.
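
To give a rough idea of what the consumer-side change might look like, here
is a minimal sketch of parsing such a KV field in Python. The wire format is
an assumption on my part (pairs separated by ";", key and value separated by
"="), since nothing in this thread has fixed one, and the function name and
the keys in the usage note are purely hypothetical. The one thing it takes
from the proposal is that later keys override earlier ones.

```python
def parse_kv_header(value, pair_sep=";", kv_sep="="):
    """Parse a logged KV field such as 'a=1;b=2' into a dict.

    Assumed format (not settled anywhere in this thread): pairs are
    separated by pair_sep, and key and value by kv_sep. Later keys
    override earlier ones, as in the proposal.
    """
    result = {}
    for pair in value.split(pair_sep):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty segments, e.g. a trailing ';'
        key, sep, val = pair.partition(kv_sep)
        key = key.strip()
        if not sep or not key:
            continue  # skip malformed pairs rather than fail the whole line
        result[key] = val.strip()  # a later occurrence replaces an earlier one
    return result
```

For example, if a client sent "mode=a;carrier=123" and Varnish appended
";mode=b", the parsed field would carry mode=b and carrier=123. Malformed
pairs are skipped rather than raising, on the assumption that a log
consumer would prefer to keep the rest of the line.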
--
David Schoonover
dsc(a)wikimedia.org