On 9/15/20 9:43 AM, Alex Ezell wrote:
> Do we use levels for any of these error log outputs? That is, are they
> classified on output as High, Medium, Low, Info, or something like that?

To an extent, yes. We have separate channels for PHP errors and
exceptions, for example, and although I don't think we currently
differentiate in logstash, we could plausibly draw a further
distinction between PHP error levels. Intuitively, a low number of PHP
notices probably indicates something of lower severity than a high
number of fatals, and so forth.
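
To make that intuition concrete, here is a rough sketch of weighting log
entries by PHP error level. It is purely illustrative and not something we
run today: the `SEVERITY_WEIGHT` table, its values, and the
`severity_score` helper are all hypothetical; only the level names
(E_ERROR, E_WARNING, E_NOTICE, E_DEPRECATED) are real PHP constants.

```python
from collections import Counter

# Hypothetical weights per PHP error level; the numbers are illustrative
# assumptions, not values taken from any existing configuration.
SEVERITY_WEIGHT = {
    "E_ERROR": 100,      # fatal errors
    "E_WARNING": 10,
    "E_NOTICE": 1,
    "E_DEPRECATED": 1,
}


def severity_score(levels):
    """Sum the weight of every logged level; unknown levels count as 1."""
    counts = Counter(levels)
    return sum(SEVERITY_WEIGHT.get(level, 1) * n for level, n in counts.items())


# A handful of notices should score far lower than a burst of fatals.
few_notices = ["E_NOTICE"] * 5
many_fatals = ["E_ERROR"] * 50
print(severity_score(few_notices) < severity_score(many_fatals))
```

A score like this would at best be a rough sorting aid for whoever is
looking at the logs.
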
Teasing out more detail about reported error severity could be a useful
exercise, but I'm not sure it would result in much more meaningful
signals than we currently have about production health. Serious
problems can manifest as trivial-seeming notices, some issues start out
that way and cascade over time, and generally any form of recurring
logspam needs human evaluation before we can easily say much more than
"this is a problem".

> Or do we have to triage each of them as we examine
> them?

Yeah. There are doubtless a lot of ways to improve the tooling we use
for that process, but right now I think it would be most helpful if we
just had more eyes _routinely_ on the logs and the workboard. (See
Tyler's earlier and much more detailed/thoughtful response to this thread.)
--
Brennen Bearnes
Release Engineering