On Wed, Sep 2, 2015 at 1:21 AM, Gabriel Wicke <gwicke(a)wikimedia.org> wrote:
> On Tue, Sep 1, 2015 at 5:54 PM, Gergo Tisza <gtisza(a)wikimedia.org> wrote:
>> Rate limiting / UA policy enforcement has to be done in Varnish, since API
>> responses can be cached there and so the requests don't necessarily reach
>> higher layers (and we wouldn't want to vary on user agent).
> The cost / benefit trade-offs for Varnish cache hits are fairly different
> from those of cache misses. Especially for in-memory (frontend) hits it
> might overall be cheaper to send a regular response, rather than adding
> rate limit overheads to each cache hit.
Yeah, I was mostly thinking of uncacheable API accesses. If we can
cache a response, we don't mind it (as much) in terms of load/abuse.
Having the simpler outer check in Varnish, though, keeps the big load
from anonymous spikes off the applayer for those uncacheable
requests.
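
To make the shape of that outer check concrete, here is a rough sketch in
Python. It is only an illustration: the real check would live in Varnish
itself, and the names here (TokenBucket, OuterCheck, the rate and burst
numbers, the returned strings) are all made up for the example. Cache hits
skip the limiter entirely; only requests that would go through to the
applayer consume a token.

```python
class TokenBucket:
    """Hypothetical per-client token bucket (not an existing Varnish API)."""

    def __init__(self, rate, burst):
        self.rate = rate      # tokens refilled per second
        self.burst = burst    # maximum number of stored tokens
        self.tokens = burst   # start with a full bucket
        self.last = 0.0       # timestamp of the previous check

    def allow(self, now):
        # Refill according to elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class OuterCheck:
    """Rate-limits only requests that would reach the applayer."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.buckets = {}  # client key (e.g. IP address) -> TokenBucket

    def handle(self, client, cache_hit, now):
        if cache_hit:
            # Cached responses are cheap: no limiter overhead at all.
            return "serve-from-cache"
        bucket = self.buckets.get(client)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.burst)
            self.buckets[client] = bucket
        if bucket.allow(now):
            return "pass-to-applayer"
        return "reject-429"


check = OuterCheck(rate=1.0, burst=2.0)
print(check.handle("198.51.100.7", cache_hit=False, now=0.0))
print(check.handle("198.51.100.7", cache_hit=True, now=0.0))
```

The timestamp is passed in explicitly so the behaviour is easy to reason
about; a real limiter would read the clock itself and would also need to
expire idle buckets.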