[QA] how often to run browser tests?

Tue Mar 25 19:40:35 UTC 2014

There has been some confusion and lack of consensus about this for some
time.  I'd like to discuss the issue of frequency so that we at least have
a shared understanding, even if we might disagree on the details.

The idea of running tests for every commit is for unit tests only. And by
"unit tests" I mean tests that do not connect to a database, do not touch
a filesystem, and do not exercise a UI. These sorts of microtests are very
fast and very reliable, and they should run on every commit.

Integration tests are likely to talk to databases and/or filesystems, but
not to the UI.  You might want to run these after changes to databases, to
APIs, architecture, that sort of thing.  Maybe hourly, to take advantage of
the automatic db updates in the shared test environments.  Again, these
sorts of tests should be pretty reliable.

UI tests are a little different. For one thing, they take a long time to
run, on the order of many minutes.  I think we have one suite right now
(for better or worse) that takes more than an hour to run. If you have a UI
test suite that takes an hour to run, and your repo is getting five commits
per hour, it does not make any kind of sense to run the UI test suite after
every commit.
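
To make the arithmetic above concrete, here is a rough sketch. It assumes a
single runner that tests one commit at a time and a steady commit rate; the
function name and parameters are made up for illustration and do not
describe our actual Jenkins setup.

```python
def backlog_after(hours, commits_per_hour, suite_hours):
    """Commits still waiting for a UI run after `hours` of steady commits.

    Assumes one runner that tests a single commit per suite run
    (an illustrative simplification, not a description of a real CI setup).
    """
    arrived = commits_per_hour * hours   # commits pushed so far
    tested = hours / suite_hours         # suite runs that fit in that time
    return max(0, arrived - tested)


# One working day with the numbers from the paragraph above:
# five commits per hour and a one-hour suite.
print(backlog_after(hours=8, commits_per_hour=5, suite_hours=1))
```

Under those assumptions the queue grows by four commits every hour, so a
per-commit UI run can never catch up.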

UI tests are also by their nature flaky.  Some suites in some repos can be
relied on to run green all the time (assuming that the test environment
isn't flaky), but UI tests that are complex enough to be useful will simply
not be green all the time.  In this case, running UI tests very often
creates noise that some human has to sort through.

So my own opinion is that UI test suites should be run often enough to be
effective, but no more often than that. The vast majority of bugs that I
report because of UI tests I find first thing in the morning after the
overnight run of all of the test suites.  Those bugs tend to be introduced
in the afternoon and evening of the previous day.  We run a daytime build
of the UI tests also, but I find that much less valuable for identifying
problems than the overnight run.

And that is about all of the bandwidth that our existing processes can
support, and more frequent builds would not improve the situation in any
meaningful way. If the UI test builds ran five times per day, that would be
five times as many test results to interpret, five times as many failed
tests waiting for bug fixes, five times as many timeouts when beta labs
flakes out.  We all have other things we need to do.

I do think it makes sense to evaluate how often particular test suites
should run, but I would prefer that we not make blanket statements like
"tests should run after every commit" in the case of UI tests and
integration tests.
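
One way to write down a per-suite policy instead of a blanket rule is a
small lookup table. The sketch below only restates the frequencies suggested
in this message (hourly for integration tests is a "maybe"); the trigger
names and the function are hypothetical, not part of any existing tool.

```python
# Hypothetical per-suite policy: which kind of trigger starts which suite.
SCHEDULE = {
    "unit": {"commit"},            # fast and reliable, so every commit
    "integration": {"hourly"},     # maybe hourly, after the automatic db updates
    "ui": {"nightly", "daytime"},  # overnight run plus one daytime build
}


def suites_for(trigger):
    """Return the names of the suites that should start for a given trigger."""
    return sorted(name for name, triggers in SCHEDULE.items() if trigger in triggers)


print(suites_for("commit"))   # only the unit tests run on a commit
print(suites_for("nightly"))  # only the UI suites run overnight
```

Changing how often one suite runs then means editing one entry, which keeps
the discussion about a particular suite rather than about all tests at once.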