A flaky test — one that passes and fails without any code change — is worse than no test at all, because it trains the team to ignore red builds. Here's how to hunt flakiness down.
The usual suspects
- Hard waits. sleep(2000) is a bet against the network. Replace it with web-first assertions that wait for a condition.
- Race conditions. Asserting before the UI has settled, or before an async request resolves.
- Shared state. Tests that depend on data left behind by other tests.
- Time and locale. Hardcoded dates, timezones, or currency formatting.
- Animations. Elements that are mid-transition when the click fires.
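The time-and-locale item is easy to underestimate, so here is a minimal sketch of how it bites. It assumes a JavaScript runtime with standard `Intl` support; the helper name `formatDay` and the sample values are made up for illustration. The same instant lands on different calendar days depending on the time zone, so a test that relies on the runner's defaults can pass on a laptop and fail in CI.

```typescript
// A fixed instant: 2024-01-15 23:30 UTC. Date.UTC keeps it independent of
// the machine's own time zone.
const instant = new Date(Date.UTC(2024, 0, 15, 23, 30));

// Fragile alternative (not used here): instant.toLocaleDateString() picks up
// the runner's default locale and time zone, so its output varies by machine.

// Deterministic: locale and time zone are stated explicitly.
// formatDay is a hypothetical helper, not part of any library.
function formatDay(date: Date, timeZone: string): string {
  return new Intl.DateTimeFormat("en-US", {
    timeZone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
  }).format(date);
}

const inUtc = formatDay(instant, "UTC"); // "01/15/2024"
const inTokyo = formatDay(instant, "Asia/Tokyo"); // "01/16/2024"

// Same idea for currency: name the locale and the currency.
const price = new Intl.NumberFormat("en-US", {
  style: "currency",
  currency: "USD",
}).format(1234.5); // "$1,234.50"
```

An assertion written against `inUtc` holds on any machine; one written against the default-locale output only holds where it was authored.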
Fixes that stick
    // Brittle: fixed wait, then assert.
    await page.waitForTimeout(2000);
    expect(await page.locator(".cart-count").textContent()).toBe("1");

    // Resilient: auto-retrying, web-first assertion.
    await expect(page.locator(".cart-count")).toHaveText("1");

- Use auto-retrying assertions (toHaveText, toBeVisible) instead of manual waits.
- Make each test set up and tear down its own data so it can run in isolation and in any order.
- Seed deterministic data — freeze time, fix the locale, control randomness.
- Quarantine, don't ignore. Tag a known-flaky test, track it, and fix it; never skip and forget.
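One way to read "seed deterministic data" is to pass every source of variation into the code that builds test data. The sketch below is an illustration under assumptions, not a prescribed setup: `mulberry32` is a small, widely circulated seeded generator chosen here for brevity, and `Clock`, `frozenClock`, `Order`, and `makeOrder` are hypothetical names.

```typescript
// Small seeded PRNG (mulberry32). The same seed yields the same sequence
// on every run, unlike Math.random().
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// A clock is just a function returning "now"; tests pass a frozen one.
type Clock = () => Date;
const frozenClock: Clock = () => new Date("2024-01-15T12:00:00Z");

interface Order {
  id: string;
  createdAt: string;
  quantity: number;
}

// Hypothetical factory: randomness and time arrive as parameters, so the
// result depends only on the arguments.
function makeOrder(random: () => number, now: Clock): Order {
  return {
    id: "order-" + Math.floor(random() * 1e6),
    createdAt: now().toISOString(),
    quantity: 1 + Math.floor(random() * 5), // 1 to 5 inclusive
  };
}

// Two tests that each build their own data from the same seed get
// identical records, regardless of run order or wall-clock time.
const first = makeOrder(mulberry32(42), frozenClock);
const second = makeOrder(mulberry32(42), frozenClock);
```

Because each test constructs its own generator and clock, nothing leaks between tests, which also covers the isolation point above.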
Measure it
Track flake rate in CI. If a test fails then passes on retry, that's a signal, not a success. A suite the team trusts is one where red always means broken.
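As a rough sketch of what tracking could look like, the snippet below computes a per-test flake rate from CI records. The record shape (`TestRun`), the function names, and the sample data are assumptions for illustration; a real pipeline would read these from whatever the CI system exports. A run counts as flaky only when it failed at least once and then passed on retry; a run that fails every attempt is a genuine red.

```typescript
// One CI execution of one test, with the outcome of each attempt in order.
interface TestRun {
  test: string;
  attempts: ("pass" | "fail")[];
}

interface FlakeStats {
  runs: number;
  flakyRuns: number;
  flakeRate: number;
}

// Flaky: at least one failed attempt, and the final attempt passed.
function isFlaky(run: TestRun): boolean {
  const last = run.attempts[run.attempts.length - 1];
  return run.attempts.some((a) => a === "fail") && last === "pass";
}

// Aggregate flaky runs over total runs, keyed by test name.
function flakeRates(runs: TestRun[]): Record<string, FlakeStats> {
  const stats: Record<string, FlakeStats> = {};
  for (const run of runs) {
    const s = stats[run.test] ?? { runs: 0, flakyRuns: 0, flakeRate: 0 };
    s.runs += 1;
    if (isFlaky(run)) s.flakyRuns += 1;
    s.flakeRate = s.flakyRuns / s.runs;
    stats[run.test] = s;
  }
  return stats;
}

// Hypothetical sample data.
const sampleRuns: TestRun[] = [
  { test: "adds item to cart", attempts: ["pass"] },
  { test: "adds item to cart", attempts: ["fail", "pass"] },
  { test: "adds item to cart", attempts: ["pass"] },
  { test: "adds item to cart", attempts: ["fail", "fail", "pass"] },
  { test: "checkout", attempts: ["fail", "fail"] },
];

const rates = flakeRates(sampleRuns);
// rates["adds item to cart"].flakeRate is 0.5 (2 flaky runs out of 4)
// rates["checkout"].flakeRate is 0 (it failed consistently, so it is broken, not flaky)
```

Sorting tests by this rate gives a concrete quarantine list, and watching the number over time shows whether fixes are landing.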