
Bug Tracker


#14536 closed bug (wontfix)

Opened November 13, 2013 07:12PM UTC

Closed December 06, 2013 05:55PM UTC

Dealing with flaky tests

Reported by: jzaefferer
Owned by:
Priority: undecided
Milestone: None
Component: unfiled
Version: 1.10.2
Keywords:
Cc:
Blocked by:
Blocking:
Description

For example, testing anything related to focus is difficult and unreliable. Tests that are flaky have very questionable value in the first place.

Suggestion: Add a “Run flaky tests” checkbox via QUnit.config.urlConfig. When it is explicitly enabled, the flaky tests run; by default, and in particular on TestSwarm, they won't.

(from the Amsterdam meeting)
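
A rough sketch of what that could look like, assuming QUnit 1.x, where entries pushed onto QUnit.config.urlConfig become toolbar checkboxes and the parsed URL parameters are exposed on QUnit.urlParams (the "flaky" id and the sample test below are illustrative, not existing code):

    // Register a "Run flaky tests" checkbox in the QUnit toolbar.
    QUnit.config.urlConfig.push({
        id: "flaky",
        label: "Run flaky tests",
        tooltip: "Include tests that fail intermittently, e.g. focus-related tests."
    });

    // Gate an individual flaky test on the checkbox / URL parameter.
    if ( QUnit.urlParams.flaky ) {
        test( "focus is restored after closing the dialog", function() {
            // ...focus-sensitive assertions would go here...
            ok( true, "placeholder assertion" );
        });
    }

On TestSwarm the parameter would simply never be set, so these tests would not run there unless someone opted in.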

Attachments (0)
Change History (10)

Changed November 13, 2013 07:14PM UTC by scottgonzalez comment:1

When the checkbox isn't checked (default), there should be a warning saying that some tests were skipped, so that devs don't forget there are other tests when running locally.
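
Sketched under the same assumptions (flakyTest and the counter are hypothetical helpers, not part of QUnit), the warning could be produced by the wrapper that does the skipping:

    // Hypothetical helper: run the test only when the checkbox is enabled,
    // otherwise just count it as skipped.
    var skippedFlaky = 0;

    function flakyTest( name, callback ) {
        if ( QUnit.urlParams.flaky ) {
            test( name, callback );
        } else {
            skippedFlaky++;
        }
    }

    // After the run, remind developers that some tests were not executed.
    QUnit.done(function() {
        if ( skippedFlaky && window.console ) {
            console.warn( skippedFlaky + " flaky test(s) were skipped; check \"Run flaky tests\" to include them." );
        }
    });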

Changed November 13, 2013 07:41PM UTC by timmywil comment:2

I really don't like the idea that tests would be skipped in testswarm. If they are skippable, they are removable. If they are not removable, but they are flaky, they should be fixed.

Changed November 13, 2013 07:44PM UTC by scottgonzalez comment:3

Well, obviously nobody has a fix for them, and nobody wants to actually delete them. If either of those were the case, this wouldn't have been discussed several times over the past year or so ;-)

Changed November 13, 2013 07:47PM UTC by dmethvin comment:4

I feel like we should keep them in there as a sign that we wish they *would* work reliably, but have some facility in QUnit to indicate when they've failed without getting a fail for the whole test. Does that sound wishy-washy enough?

Also, I want to have my cake and eat it too. And a pony.

Changed November 13, 2013 07:52PM UTC by scottgonzalez comment:5

Well, a skip feature in QUnit has been turned down several times already. Besides, you want to be able to easily opt-in to running these tests, without going and deleting the skipped marker for each individual test.

Changed November 13, 2013 07:59PM UTC by timmywil comment:6

> Well, obviously nobody has a fix for them, and nobody wants to actually delete them.

I don't think either of these things are true. In my experience, it's not that the flaky tests are not fixable. It is about the lack of time to spend stabilizing these tests. Also, I'm sure we could find some tests that we are willing to delete if, again, we took the time to do so. Nonetheless, neither lack of time nor the desire to see green on TestSwarm is a good enough reason to have a config flag to skip any tests. The problem remains at the test level. No excuse suffices to make it otherwise.

On a related note, I would also not mind a pony.

Changed November 14, 2013 10:48AM UTC by jzaefferer comment:7

> some facility in QUnit to indicate when they've failed without getting a fail for the whole test

That sounds like a great plugin.
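
One rough way such a plugin could behave (softAssert and the failure list are made-up names, not an existing QUnit API): record the failure and report it after the run instead of failing the test.

    // Hypothetical "soft" assertion: never fails the test, but remembers
    // the message so it can be surfaced as a warning later.
    var flakyFailures = [];

    function softAssert( condition, message ) {
        if ( !condition ) {
            flakyFailures.push( message );
        }
        // Always pass, so the test (and the swarm run) stays green.
        ok( true, message + ( condition ? "" : " (flaky failure recorded)" ) );
    }

    QUnit.done(function() {
        if ( flakyFailures.length && window.console ) {
            console.warn( "Flaky assertions that failed this run:\n" + flakyFailures.join( "\n" ) );
        }
    });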

Changed November 28, 2013 08:03AM UTC by m_gol comment:8

Replying to [comment:6 timmywil]:

> > Well, obviously nobody has a fix for them, and nobody wants to actually delete them.
>
> I don't think either of these things are true. In my experience, it's not that the flaky tests are not fixable. It is about the lack of time to spend stabilizing these tests.

I don't fully agree. Some tests tend to fail when run on TestSwarm but not necessarily locally, and some of them pass if run separately. If I test locally, it's acceptable for me to see one or two failed tests if I can then click the "rerun" button for each of them and see whether they really fail. It would be better to have them marked as flaky so that they turn the page yellow instead of red, and so that I know which tests I have to re-run to make sure they're not broken by the patch. Integrating such a solution into our test setup would be difficult, and it would seriously bump the overall test time.

Of course, it would be more difficult to maintain such a situation if we had too many of those tests, so you have a point.

Changed December 06, 2013 05:55PM UTC by timmywil comment:10

resolution: → wontfix
status: new → closed

Some tests would inevitably end up in the flaky category undeservedly and we'd end up with inconspicuous regressions. The core team will deal with flaky tests at the test level on a case-by-case basis.