Why Playwright Tests Break So Often

Selector drift is the most common cause

A button gets renamed from 'Submit' to 'Save changes'. A modal becomes a slide-over panel. A form moves from one step to two. A component library update changes the internal DOM structure of a dropdown. None of these changes breaks the product, but any of them can break tests that were written against the old DOM shape.

This is the core problem with selector-first authoring. When tests are written by finding and targeting specific DOM nodes, they silently couple themselves to implementation details that will change. The test passes today not because the user journey works, but because the current DOM happens to match what the test was written against.

Better selectors help, but the more durable fix is to author tests at a level where selectors are an implementation detail managed by the framework, not the primary artifact engineers write and maintain.

Timing and asynchrony create hidden fragility

Modern web applications are full of asynchronous behavior: loading states, route transitions, animated overlays, debounced search inputs, re-rendering after validation, and lazy-loaded components. Playwright's built-in auto-waiting checks that a target element is visible, stable, and enabled before acting on it, but it cannot wait for application-specific conditions it has not been told about, such as a pending API response or a debounce timer.

This is why teams encounter flaky tests after a 'nothing meaningful changed' deploy. A slightly slower API response, a new loading spinner, a modal that now fades in instead of appearing instantly — any of these can be enough to break a test that was not written to handle the new behavior.

Flakiness is expensive. A test suite that fails intermittently trains engineers to ignore failures, which defeats the entire purpose of the suite.

Common sources of timing-related fragility include:

- Animations and CSS transitions that delay interactive state
- Debounced inputs that require a delay before the UI responds
- Components that re-render after form validation or API response
- Loading spinners or skeleton screens that briefly obscure target elements
- Route transitions that temporarily unmount and remount components
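
One way to handle the debounced-input case is to wait for the response the input eventually triggers, instead of sleeping for a fixed time. The sketch below is illustrative only: the "Search" name, the `/api/search` endpoint, and the `progressbar` spinner are assumptions, not details of any real application. `ResponseLike`, `SearchLocator`, and `SearchPage` are local stand-ins for the small subset of Playwright's `Response`, `Locator`, and `Page` used here, so the snippet type-checks on its own; in a real suite the `page` fixture should satisfy the same shape.

```typescript
// Local structural stand-ins for the subset of Playwright's API used below.
// They exist only to keep this sketch self-contained.
interface ResponseLike {
  url(): string;
  ok(): boolean;
}

interface SearchLocator {
  fill(value: string): Promise<void>;
  waitFor(options?: { state?: 'attached' | 'detached' | 'visible' | 'hidden' }): Promise<void>;
}

interface SearchPage {
  getByRole(role: string, options?: { name?: string }): SearchLocator;
  waitForResponse(predicate: (response: ResponseLike) => boolean): Promise<ResponseLike>;
}

// Types a query into a debounced search box and returns once results have settled.
// '/api/search', the "Search" name, and the progressbar spinner are assumptions.
async function searchAndSettle(page: SearchPage, query: string): Promise<void> {
  // Start listening before typing so a fast response cannot be missed.
  const searchResponse = page.waitForResponse(
    (response) => response.url().includes('/api/search') && response.ok(),
  );

  await page.getByRole('searchbox', { name: 'Search' }).fill(query);

  // The debounce delay is covered by waiting for the request it eventually fires.
  await searchResponse;

  // Only then wait for the loading indicator to disappear.
  await page.getByRole('progressbar').waitFor({ state: 'hidden' });
}
```

The ordering matters: registering the response wait before the `fill` call avoids a race, and waiting for the spinner only after the response arrives avoids passing the "hidden" check before the spinner has even appeared.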

Intent is usually missing from the test artifact

Many Playwright suites only encode mechanics. Click this locator. Fill that selector. Wait for this node. Six months later, when a test fails, nobody on the team can immediately say what user behavior the test was protecting — they have to reverse-engineer the intent from the implementation.

That makes maintenance slow and risky. Every failure requires rediscovering the purpose of the test before the team can decide whether to fix it, update it, or delete it. And when engineers cannot easily understand the purpose of a test, they are more likely to simply delete it when it gets in the way.

A test suite where the intent is clearly expressed in a readable, reviewable artifact is easier to maintain, easier to trust, and much harder for the team to silently let decay.

How to reduce breakage over time

The most durable Playwright suites share a common pattern: they are authored around stable user intent rather than fragile DOM detail. They use labels, roles, and accessibility attributes as selectors where possible. They keep scenario definitions readable enough that non-engineering stakeholders can follow and review them. And they are versioned in the repo alongside the product changes they are meant to protect.
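
As a rough illustration of what authoring against roles and labels looks like, the sketch below starts a sign-in flow using `getByRole`, `getByLabel`, and `getByTestId`, which are Playwright's built-in locator methods. The control names ('Sign in', 'Email', 'Continue') and the `account-menu` test id are made up for illustration. `RoleLocator` and `RolePage` are local stand-ins for the parts of Playwright's `Locator` and `Page` used here, declared so the snippet stands alone.

```typescript
// Local structural stand-ins for the subset of Playwright's API used below.
interface RoleLocator {
  click(): Promise<void>;
  fill(value: string): Promise<void>;
  waitFor(options?: { state?: 'attached' | 'detached' | 'visible' | 'hidden' }): Promise<void>;
}

interface RolePage {
  getByRole(role: string, options?: { name?: string }): RoleLocator;
  getByLabel(text: string): RoleLocator;
  getByTestId(testId: string): RoleLocator;
}

// Hypothetical sign-in flow; every name below is an assumption.
async function signIn(page: RolePage, email: string): Promise<void> {
  // Role plus accessible name: unaffected by class renames and layout changes.
  await page.getByRole('button', { name: 'Sign in' }).click();

  // Label text: tied to what the user reads, not to the input's position.
  await page.getByLabel('Email').fill(email);
  await page.getByRole('button', { name: 'Continue' }).click();

  // Deliberate test hook for an element with no stable user-facing name.
  await page.getByTestId('account-menu').waitFor({ state: 'visible' });
}
```

None of these locators mention CSS classes or element position, so a modal turning into a slide-over panel would not require changing them as long as the labels stay the same.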

In Assert, the Markdown scenario is the durable artifact. It describes what the user does and what they expect to see, in plain English. The Playwright layer is generated from that scenario and can be regenerated when the UI changes — without requiring the team to rewrite the human-readable intent.

The brittle pattern — and what to do instead

// Brittle: tightly coupled to DOM structure
await page.locator('.header > div:nth-child(2) button').click()
await page.locator('.modal .form input:nth-child(3)').fill('hi@example.com')

// Better: intent-level, label-based
- Click "Sign in"
- Fill "email" with "hi@example.com"
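
To make the contrast concrete, here is a small sketch of how intent-level lines like the two above could be parsed into structured actions that a generator might later resolve to label-based locators. This is not Assert's implementation or grammar: the `Step` type, `parseStep`, `parseScenario`, and the two supported verbs are hypothetical and cover only the syntax shown above.

```typescript
// A structured form of one scenario line (hypothetical shape).
type Step =
  | { kind: 'click'; target: string }
  | { kind: 'fill'; target: string; value: string };

// Matches: - Click "Label"
const CLICK = /^-\s*Click\s+"([^"]+)"\s*$/;
// Matches: - Fill "Label" with "value"
const FILL = /^-\s*Fill\s+"([^"]+)"\s+with\s+"([^"]*)"\s*$/;

// Parses a single scenario line, throwing on anything it does not recognise.
function parseStep(line: string): Step {
  const trimmed = line.trim();

  const fill = FILL.exec(trimmed);
  if (fill) {
    const [, target = '', value = ''] = fill;
    return { kind: 'fill', target, value };
  }

  const click = CLICK.exec(trimmed);
  if (click) {
    const [, target = ''] = click;
    return { kind: 'click', target };
  }

  throw new Error(`Unrecognised step: ${line}`);
}

// Parses every list item in a Markdown scenario, ignoring other lines.
function parseScenario(markdown: string): Step[] {
  return markdown
    .split('\n')
    .filter((line) => line.trim().startsWith('-'))
    .map(parseStep);
}
```

The parsed steps carry only labels and values. How a label such as "Sign in" maps to a DOM node is decided later, which is what lets the scenario text stay unchanged when the markup underneath it changes.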

FAQ

Is Playwright the cause of brittle tests?

Rarely. Playwright is well-designed and handles asynchrony better than most frameworks. Brittleness almost always comes from authoring decisions: writing against unstable DOM structure, encoding too much layout detail, or failing to wait for the right conditions. The framework is sound; the problem is how tests are written.

What is the single biggest cause of broken Playwright tests?

Selector churn. Tests that target specific DOM nodes, especially through CSS class names, positional selectors like nth-child, or component internals, tend to break whenever the UI evolves. The most resilient selectors are those tied to stable product-level concepts: button labels, field names, ARIA roles, and deliberate test hooks.

How do you reduce Playwright maintenance overhead without rewriting everything?

Start by auditing the most frequently broken tests and identifying whether they fail due to selectors, timing, or unclear intent. Fix selector issues first — stable labels and data attributes go a long way. Then look at whether the intent of each test is legible to someone who did not write it. If it is not, that test will be expensive to maintain indefinitely.
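
The audit step can be as simple as tallying recent failures per test and per cause. The sketch below assumes failures have already been exported into records with a test name and a manually assigned cause. The `FailureRecord` shape and `rankBrokenTests` are hypothetical and not part of Playwright or Assert.

```typescript
type Cause = 'selector' | 'timing' | 'intent';

// One observed failure, with the cause assigned during triage (assumed shape).
interface FailureRecord {
  test: string;
  cause: Cause;
}

interface TestTally {
  test: string;
  total: number;
  byCause: Record<Cause, number>;
}

// Groups failures by test and returns the most frequently broken tests first.
function rankBrokenTests(failures: FailureRecord[]): TestTally[] {
  const tallies = new Map<string, TestTally>();

  for (const { test, cause } of failures) {
    let tally = tallies.get(test);
    if (!tally) {
      tally = { test, total: 0, byCause: { selector: 0, timing: 0, intent: 0 } };
      tallies.set(test, tally);
    }
    tally.total += 1;
    tally.byCause[cause] += 1;
  }

  // Highest failure count first; ties are ordered by test name for stable output.
  return Array.from(tallies.values()).sort(
    (a, b) => b.total - a.total || a.test.localeCompare(b.test),
  );
}
```

Reading the top few entries shows where to start: a test dominated by `selector` failures is a candidate for label-based locators, while one dominated by `timing` failures needs explicit waits on the right conditions.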

Put the workflow in your repo, not in a chat transcript

Assert is strongest when scenarios become durable project assets: readable Markdown in the repo, generated execution underneath, and result inspection in the dashboard.
