How Playwright screenshot testing works
Playwright's toHaveScreenshot() assertion captures a screenshot of a page or element and compares it to a stored baseline image. On the first run there is no baseline yet, so Playwright writes one and reports the missing snapshot; the comparison only starts once a baseline exists. On subsequent runs, it diffs the current screenshot against the baseline and fails if the difference exceeds a configurable threshold.
The comparison is pixel-level. Differences in text rendering, animation state, dynamic content (timestamps, user-specific data), or font anti-aliasing across operating systems will all produce failures. Playwright provides options to mask dynamic regions and configure tolerance, but tuning these is ongoing work.
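To make "fails if the difference exceeds a threshold" concrete, here is a deliberately simplified sketch of a ratio-based pixel comparison. It is not Playwright's comparator, which works on perceived colour difference and is more involved; the function name compareRgba, the DiffResult type, and the channelTolerance parameter are all hypothetical and exist only for illustration.

```typescript
// Simplified illustration only: NOT Playwright's actual comparison algorithm.
// All names here (DiffResult, compareRgba, channelTolerance) are hypothetical.

interface DiffResult {
  diffPixels: number; // number of pixels that differ
  diffRatio: number;  // diffPixels divided by the total pixel count
  passed: boolean;    // true when diffRatio is within the allowed ratio
}

// Compares two RGBA buffers of identical dimensions, pixel by pixel.
function compareRgba(
  baseline: Uint8Array,
  current: Uint8Array,
  maxDiffPixelRatio: number,
  channelTolerance = 0,
): DiffResult {
  if (baseline.length !== current.length || baseline.length % 4 !== 0) {
    throw new Error('Buffers must hold RGBA data of identical dimensions');
  }

  const totalPixels = baseline.length / 4;
  let diffPixels = 0;

  for (let i = 0; i < baseline.length; i += 4) {
    // A pixel counts as different if any channel deviates beyond the tolerance.
    for (let c = 0; c < 4; c++) {
      if (Math.abs(baseline[i + c] - current[i + c]) > channelTolerance) {
        diffPixels++;
        break;
      }
    }
  }

  const diffRatio = totalPixels === 0 ? 0 : diffPixels / totalPixels;
  return { diffPixels, diffRatio, passed: diffRatio <= maxDiffPixelRatio };
}
```

With a 2x2 image in which one pixel differs, the ratio is 0.25, so a limit of 0.1 fails and a limit of 0.25 passes. The same arithmetic explains why a small anti-aliasing change across a large text block can push a strict assertion over its limit.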
- toHaveScreenshot() compares the page (the viewport by default, or the full scrollable page with fullPage: true) or a specific element
- Baseline images are committed to the repo and updated explicitly
- Tolerance is configurable per assertion (maxDiffPixels, maxDiffPixelRatio, threshold) and can also be given project-wide defaults in the config
- Masking dynamic regions prevents false positives from timestamps and user data
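The points above can be sketched in a single spec file. This is a minimal example rather than a recommended setup: the /pricing route, the test ids, the screenshot names, and the tolerance values are all assumptions chosen for illustration, and the relative URL assumes a baseURL is set in your Playwright config.

```typescript
import { test, expect } from '@playwright/test';

test('pricing page matches its baseline', async ({ page }) => {
  // Hypothetical route; relies on baseURL being configured.
  await page.goto('/pricing');

  // Page-level comparison. Without fullPage, only the viewport is captured.
  await expect(page).toHaveScreenshot('pricing.png', {
    fullPage: true,
    // Freeze CSS animations and transitions so their state cannot cause a diff.
    animations: 'disabled',
    // Masked elements are painted over before comparison, which hides
    // timestamps and user-specific data. The test id is hypothetical.
    mask: [page.getByTestId('last-updated')],
    // Allow up to 1% of pixels to differ (illustrative value).
    maxDiffPixelRatio: 0.01,
  });

  // Element-level comparison: only the located element is captured.
  await expect(page.getByTestId('plan-card')).toHaveScreenshot('plan-card.png', {
    // Absolute pixel budget instead of a ratio (illustrative value).
    maxDiffPixels: 50,
  });
});
```

Element-level assertions tend to be less brittle than full-page ones, because unrelated changes elsewhere on the page cannot affect the captured region.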
When visual regression testing is worth it
Screenshot testing pays off on components or pages with complex visual layouts that are difficult to assert on with DOM-based checks — data visualisations, rich text editors, custom chart components, marketing landing pages with strict brand requirements.
It pays off less on application UIs with frequent small changes, internationalised content, or any dynamic data. Every intentional UI change requires updating baseline images, which creates CI noise and desensitises engineers to screenshot failures.
- High value: design system component libraries, marketing pages, chart/visualisation components
- Lower value: application screens with dynamic data, rapidly iterated features
- Consider: third-party visual testing services (Percy, Chromatic) add better diffing and review workflows
Behaviour-based testing as the primary layer
Assert's approach focuses on behavioural correctness — does the user flow work? — rather than pixel accuracy. For most product teams, catching a broken checkout flow or a missing error message is more valuable than catching a 2px layout shift.
Visual regression and behavioural E2E testing are complementary, not competing. Use Playwright's snapshot assertions selectively for visual-critical components and rely on semantic, behaviour-based tests for coverage of application logic.
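As a rough illustration of that split, the plain Playwright test below asserts on behaviour first and adds one visual check for a single component. It is not Assert's scenario format; the route, labels, button name, error text, and test id are hypothetical.

```typescript
import { test, expect } from '@playwright/test';

test('checkout rejects an invalid card number', async ({ page }) => {
  // Hypothetical route and form labels; relies on baseURL being configured.
  await page.goto('/checkout');
  await page.getByLabel('Card number').fill('1234');
  await page.getByRole('button', { name: 'Pay now' }).click();

  // Behavioural assertions: the user sees an error and stays on the page.
  // These survive restyling, copy tweaks elsewhere, and small layout shifts.
  await expect(page.getByRole('alert')).toContainText('card number');
  await expect(page).toHaveURL(/\/checkout/);

  // One selective visual assertion for a component where appearance matters.
  await expect(page.getByTestId('order-summary')).toHaveScreenshot('order-summary.png');
});
```

If the order summary is restyled on purpose, only the last assertion needs a new baseline; the behavioural assertions keep passing unchanged.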
FAQ
How do I update Playwright screenshot baselines?
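Run npx playwright test --update-snapshots to regenerate baseline images. Recent Playwright versions also accept a mode for the flag (all, changed, missing, none), so check which one your version applies by default. To update a single test, combine the flag with a filter, for example npx playwright test checkout.spec.ts --update-snapshots or npx playwright test -g "order summary" --update-snapshots. Commit the updated images to your repo, and review the image diff carefully before committing: the whole point of the baseline is to catch unintended changes.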
Why do Playwright screenshots fail on CI but not locally?
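Font rendering, anti-aliasing, and pixel density differ between operating systems and environments. Screenshots taken on macOS will differ from those taken in a Linux CI container, even if the pages are identical. By default Playwright also includes the browser and platform in each baseline's file name, so baselines generated on macOS are not the ones a Linux CI run looks for. The standard fix is to generate and update baselines inside a Docker container that matches your CI environment.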
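A config along the following lines can make the platform split explicit and stop CI from silently creating baselines. Treat it as a sketch under assumptions: the __screenshots__ directory name and the tolerance value are arbitrary, and reading process.env.CI relies on your CI provider setting that variable.

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Keep the platform and project in the baseline path so images generated
  // on Linux (CI, Docker) and macOS (local) never overwrite each other.
  // The directory layout is an arbitrary choice for this example.
  snapshotPathTemplate:
    '{testDir}/__screenshots__/{platform}/{projectName}/{testFilePath}/{arg}{ext}',

  expect: {
    toHaveScreenshot: {
      // Project-wide default tolerance (illustrative value);
      // individual assertions can still override it.
      maxDiffPixelRatio: 0.01,
    },
  },

  // On CI, never write missing baselines: a missing image should fail the run
  // instead of being created from whatever CI happened to render.
  updateSnapshots: process.env.CI ? 'none' : 'missing',
});
```

With this in place, baselines are only ever written by a deliberate local or containerised run, which keeps the "updated explicitly" rule enforceable.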
Is Playwright visual testing the same as Percy or Chromatic?
No. Playwright's built-in screenshot comparison is simpler and free: it compares pixels locally with no external dependency. Percy and Chromatic are commercial services that add cloud-based diffing, per-browser comparison, and review workflows. For teams with significant visual testing needs, the tooling and review UX of dedicated services are usually worth the cost.