Hi Risker!
On Wed, Feb 16, 2022 at 5:52 PM Risker <risker.wp(a)gmail.com> wrote:
> Thank you very much for sharing this data, Tyler (and
> to the team that researched and analysed it, as well). I think it shows that the train
> has been pretty successful in mitigating the issues it was intended to improve.
I think so, too :)
> I note the data points that show there has been a
> significant and clear trend toward fewer comments per patch. This would be worth
> investigating further. Is the total number of reviews pretty consistent, or is it
> increasing or decreasing? Is it possible that developers have become more proficient at
> writing patches to standard, and thus fewer comments are required? Or could it be that,
> because more time is invested in writing patches (assuming that more patches = more time
> writing them), there is less time for review?
I'll preface my comments with the caveat: I am (definitely) not a data
scientist.
I think we need to investigate more to say anything definitive. And I
love that this data enables us to have a conversation about what to
investigate next.
The comments per patch trend comes from the number of comments per
patch averaged over a whole train. Outliers could be affecting the
average (for instance, there is one patch[0] from 2015 with 354
comments).
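To illustrate how much a single outlier can move a per-train average,
here is a minimal sketch. The comment counts are made up for
illustration (only the 354 figure comes from the data), and
`summarize` is a hypothetical helper, not part of the train-stats
tooling. Comparing the mean against the median is one way to check
whether outliers are driving the trend.

```python
from statistics import mean, median


def summarize(comment_counts):
    """Return (mean, median) of comments per patch for one train."""
    return mean(comment_counts), median(comment_counts)


# Made-up comment counts for a hypothetical train.
typical = [0, 1, 2, 2, 3, 4, 5, 1, 0, 2]

# The same train plus one patch with 354 comments.
with_outlier = typical + [354]

print("without outlier:", summarize(typical))
print("with outlier:   ", summarize(with_outlier))
```

In this toy example the mean jumps from 2 to 34 comments per patch
while the median stays at 2, so a median-based view would be less
sensitive to patches like that one.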
Another possible explanation is: as we've added more bots over time,
my simple tools to filter out bot noise are proving insufficient.
I've only begun to explore this trend[1]. I'll keep folks posted and I
invite others to explore along with me!
Thanks!
-- Tyler
[0]:
<https://data.releng.team/train?sql=select+*+from+patch+order+by+comments+desc>
[1]:
<https://gitlab.wikimedia.org/thcipriani/train-stats#a-look-at-comments-per-patch>