On Tue, Sep 1, 2015 at 5:54 PM, Gergo Tisza <gtisza(a)wikimedia.org> wrote:
> Rate limiting / UA policy enforcement has to be done in Varnish, since API
> responses can be cached there and so the requests don't necessarily reach
> higher layers (and we wouldn't want to vary on user agent).

The cost / benefit trade-offs for Varnish cache hits are fairly different
from those of cache misses. Especially for in-memory (frontend) hits, it
might be cheaper overall to just send a regular response than to add
rate-limiting overhead to every cache hit.
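
To make the idea concrete, here is a rough sketch (in Python rather than
VCL, purely for illustration) of rate limiting that only charges cache
misses. The names TokenBucket and handle, the per-client key, and the
rate / capacity values are all made up for this example; none of this is
existing Varnish or MediaWiki code.

```python
class TokenBucket:
    """Hypothetical per-client token bucket (illustrative only)."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start with a full bucket
        self.updated = now        # time of the last refill

    def allow(self, now):
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def handle(client, cache_hit, buckets, now, rate=1.0, capacity=2.0):
    """Return an HTTP status: cache hits are never counted or throttled."""
    if cache_hit:
        # Cheap path: no bucket lookup, no bookkeeping.
        return 200
    if client not in buckets:
        buckets[client] = TokenBucket(rate, capacity, now)
    # Only misses, which reach the backend, consume a token.
    return 200 if buckets[client].allow(now) else 429
```

With these assumed values, a client can make two back-to-back misses
before getting a 429, while its cache hits keep being served without
touching the limiter state at all.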