I'd be careful about using numbers in triage right now. The numbers are a
little misleading, as error logging is only enabled on smaller wikis.
Also, if an error results in data loss but only impacts a small number of
people, I would say that's worse than a benign error that occurs for lots.
We rolled out to Spanish, German and Japanese Wikipedia yesterday, so these
numbers will start becoming more useful, but English Wikipedia will
severely skew these numbers when we finally enable it.
On Tue, Sep 22, 2020, 9:59 AM Ed Sanders <esanders(a)wikimedia.org> wrote:
Speaking specifically about the new JavaScript error logging, and
specifically to Alex's point about triaging these tasks, it would be very
helpful if the reports included some indication of how often the error is
occurring.
For example, VisualEditor is loaded several hundred thousand times per
day. If an error has occurred 4 times in the last 30 days (based on a
recent example) then it is probably very low priority.
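
As a rough sketch of that arithmetic, the count can be turned into a rate per million loads. The 300,000 loads per day below is an assumed stand-in for "several hundred thousand", and the function name is hypothetical, not part of any existing reporting tool.

```python
def errors_per_million_loads(error_count, days, loads_per_day):
    """Return the number of errors per million loads over the window."""
    total_loads = days * loads_per_day
    if total_loads <= 0:
        raise ValueError("total loads must be positive")
    return error_count / total_loads * 1_000_000


# Assumption: "several hundred thousand" loads per day taken as 300,000.
rate = errors_per_million_loads(error_count=4, days=30, loads_per_day=300_000)
print(f"{rate:.2f} errors per million loads")
```

A rate like this only says how common an error is, not how harmful it is, so it would be one input to triage rather than the whole answer.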
On Thu, 17 Sep 2020 at 16:40, C. Scott Ananian <cananian(a)wikimedia.org>
wrote:
ACN -- for what it's worth, I've been working for the foundation for a
while now, and I can report from the inside that the trend is definitely in
a positive direction. There is a lot more internal focus on addressing
code debt and giving maintenance a fair spot at the table. (In fact, my
entire team is now sitting inside 'maintenance', apparently; we used to
be 'platform evolution'.) This email thread is one visible aspect of that
focus on code quality, not just features.
That said, the one aspect which hasn't improved much in my time at the
foundation has been the tendency of teams to work in silos. This thread
also seems to be a symptom of that: a bunch of production issues are being
dropped on the floor ('not resolved in over a month') because they are
falling between the silos and nobody knows who is best able to fix them.
There are knowledge/expertise gaps among the silos as well: someone
qualified to fix a DB issue might be at sea trying to track down a front
end bug, and vice versa. A number of generalists in the org could
technically tackle a bug no matter where it lies, but it will take them
much longer to grok an unfamiliar codebase than it would for someone more
familiar with that silo. So bug triage is an increasingly technical task
in its own right.
This thread, as I read it sitting inside the org, isn't so much asking for
more attention to be paid to maintenance -- we're winning that battle,
internally -- as it is a plea for those folks on the edges of their silos
to keep an eye out for these things which are currently falling between
them and help with the triage.
--scott, speaking only for myself and my view here
On Wed, Sep 16, 2020 at 11:25 PM AntiCompositeNumber <
anticompositenumber(a)gmail.com> wrote:
> There is an impression among many community members, myself included,
> that Foundation development generally prioritizes new features over
> fixing existing problems. Foundation teams will sprint for a few
> months to put together a minimum viable product, release it, then move
> on to the new hotness, leaving user requests, bugfixes, and the like
> behind. It often seems that the only way to get a bug fixed is to get
> a volunteer developer to look at it. This is likely unintentional, but
> it happens nonetheless.
>
> Putting a higher priority within the Foundation on cleaning up old
> toys before taking out new ones is necessary for the long-term
> stability of the projects.
>
> ACN
>
> On Wed, Sep 16, 2020 at 9:05 PM Dan Andreescu <dandreescu(a)wikimedia.org>
> wrote:
> >
> > >
> > > For example, of the 30 odd backend errors reported in June, 14 were still
> > > open a month later in July [1], and 12 were still open – three months later
> > > – in September. The majority of these haven't even yet been triaged,
> > > assigned or otherwise acknowledged. And meanwhile we've got more
> > > (non-JavaScript) stuff from July, August and September adding pressure. We
> > > have to do better.
> > >
> > > -- Timo
> > >
> >
> > This feels like it needs some higher level coordination. Like perhaps
> > managers getting together and deciding production issues are a priority and
> > diverting resources dynamically to address them. Building an awesome new
> > feature will have a lot less impact if the users are hurting from growing
> > disrepair. It seems to me like if individual contributors and maintainers
> > could have solved this problem, they would have by now. I'm a little
> > worried that the only viable solution right now seems like heroes stepping
> > up to fix these bugs.
> >
> > Concretely, I think expanding something like the Core Platform Team's
> > clinic duty might work. Does anyone have a very rough idea of the time it
> > would take to tackle 293 (wow we went up by a dozen since this thread
> > started) tasks?
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
--
(http://cscott.net)