
Implementation roadmap for the AI-driven testing stack. This file is the executable spec for the WSL/Ubuntu Claude Code (CC) session to work from.

Monorepo root: \\wsl$\Ubuntu-24.04\home\ta\projects\monorepo\

Template site: sites/Template/

| Tool | Purpose | Cost |
| --- | --- | --- |
| Playwright | E2E: Chromium, Firefox, WebKit | Free |
| Playwright MCP (@playwright/mcp) | Claude Code browser automation | Free |
| Vitest | Unit + component testing | Free |
| @testing-library/dom | DOM assertions for Vitest | Free |
| @axe-core/playwright | Accessibility testing | Free |
| Lighthouse CI | SEO + performance gates | Free |
| ZeroStep (free tier) | AI-resilient selectors (500 calls/mo) | Free |
| GitHub Actions | CI/CD pipeline (2,000 min/mo Linux) | Free |
| LambdaTest | Real device testing (when revenue arrives) | $15/mo |

Total: $0/month (rising to $15/month when real-device testing is needed)


Step 1: Install Playwright

```sh
cd ~/projects/monorepo
npm init playwright@latest
```

Accept defaults:

  • TypeScript
  • tests/ directory
  • Add GitHub Actions workflow: Yes
  • Install browsers: Yes

This creates:

  • playwright.config.ts
  • tests/example.spec.ts
  • .github/workflows/playwright.yml

Step 2: Configure playwright.config.ts

Replace the generated config with this 5-project setup:

```ts
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests/e2e',
  fullyParallel: true,
  forbidOnly: !!process.env.CI,
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 1 : undefined,
  reporter: [
    ['html'],
    ['json', { outputFile: 'test-results/results.json' }],
  ],
  use: {
    baseURL: 'http://localhost:4321', // Astro dev server default
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
  },
  projects: [
    {
      name: 'Desktop Chrome',
      use: { ...devices['Desktop Chrome'] },
    },
    {
      name: 'Desktop Firefox',
      use: { ...devices['Desktop Firefox'] },
    },
    {
      name: 'Desktop Safari',
      use: { ...devices['Desktop Safari'] },
    },
    {
      name: 'Mobile Chrome',
      use: { ...devices['Pixel 7'] },
    },
    {
      name: 'Mobile Safari',
      use: { ...devices['iPhone 15'] },
    },
  ],
  webServer: {
    command: 'npm run dev',
    url: 'http://localhost:4321',
    reuseExistingServer: !process.env.CI,
    timeout: 120 * 1000,
  },
});
```

Key decisions:

  • testDir: './tests/e2e' — separates E2E from unit tests
  • baseURL uses Astro’s default port 4321
  • trace: 'on-first-retry' — traces on failure for debugging without overhead
  • webServer auto-starts Astro dev server
  • JSON reporter enables Claude Code to parse results programmatically
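Because the JSON reporter writes test-results/results.json, a small script can extract failing spec titles for that programmatic loop. A minimal sketch, assuming the nested suites/specs shape (with an `ok` flag per spec) that Playwright's `json` reporter emits:

```typescript
// Sketch: pull failing spec titles out of a Playwright JSON report.
// Assumed shape (per the 'json' reporter): nested suites -> specs,
// where each spec carries an `ok` boolean.
interface PwSpec { title: string; ok: boolean; }
interface PwSuite { title: string; specs: PwSpec[]; suites?: PwSuite[]; }
interface PwReport { suites: PwSuite[]; }

function failingSpecs(report: PwReport): string[] {
  const failures: string[] = [];
  const walk = (suite: PwSuite): void => {
    // Record this suite's failing specs, then recurse into child suites.
    for (const spec of suite.specs) {
      if (!spec.ok) failures.push(`${suite.title} > ${spec.title}`);
    }
    for (const child of suite.suites ?? []) walk(child);
  };
  for (const suite of report.suites) walk(suite);
  return failures;
}

// Usage in a Node script:
// const report = JSON.parse(readFileSync('test-results/results.json', 'utf8'));
// console.log(failingSpecs(report));
```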

Step 3: Add Test Scripts to package.json

Add to monorepo root package.json:

```json
{
  "scripts": {
    "test": "npm run test:unit && npm run test:e2e",
    "test:unit": "vitest run",
    "test:e2e": "playwright test",
    "test:e2e:ui": "playwright test --ui",
    "test:e2e:debug": "playwright test --debug",
    "test:visual": "playwright test --update-snapshots",
    "test:a11y": "playwright test tests/a11y/",
    "test:lighthouse": "lhci autorun"
  }
}
```

Step 4: Set Up Playwright MCP Server for Claude Code


Add to Claude Code MCP config (~/.claude/mcp.json or project .mcp.json):

```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
```

This gives Claude Code direct browser automation capabilities — navigate pages, click elements, fill forms, take screenshots, and run accessibility audits autonomously.

Step 5: Write the First E2E Tests

Create tests/e2e/homepage.spec.ts:

```ts
import { test, expect } from '@playwright/test';

test.describe('Homepage', () => {
  test('loads successfully', async ({ page }) => {
    await page.goto('/');
    await expect(page).toHaveTitle(/./); // Has any title
    await expect(page.locator('h1')).toBeVisible();
  });

  test('navigation works', async ({ page }) => {
    await page.goto('/');
    // Use text-based locators (resilient to redesigns)
    const nav = page.getByRole('navigation');
    await expect(nav).toBeVisible();
  });

  test('no console errors', async ({ page }) => {
    const errors: string[] = [];
    page.on('console', msg => {
      if (msg.type() === 'error') errors.push(msg.text());
    });
    await page.goto('/');
    expect(errors).toEqual([]);
  });

  test('no 404 requests', async ({ page }) => {
    const failed: string[] = [];
    page.on('response', response => {
      if (response.status() === 404) failed.push(response.url());
    });
    await page.goto('/');
    // Wait for network idle
    await page.waitForLoadState('networkidle');
    expect(failed).toEqual([]);
  });

  test('visual regression', async ({ page }) => {
    await page.goto('/');
    await expect(page).toHaveScreenshot('homepage.png', {
      maxDiffPixelRatio: 0.001,
    });
  });
});
```
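The 'no console errors' test fails on any error, including third-party noise (analytics, extensions). One way to tolerate known-benign messages is an allowlist filter; a sketch, where the allowlist patterns are hypothetical examples rather than part of the plan:

```typescript
// Sketch: keep only console errors that do NOT match an allowlist of
// known-benign patterns, so only unexpected errors fail the test.
function unexpectedErrors(errors: string[], allow: RegExp[]): string[] {
  return errors.filter(msg => !allow.some(pattern => pattern.test(msg)));
}

// Hypothetical usage inside the 'no console errors' test:
// expect(unexpectedErrors(errors, [/favicon\.ico/])).toEqual([]);
```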

Create directory structure:

```text
tests/
├── e2e/      # Playwright E2E tests
│   └── homepage.spec.ts
├── unit/     # Vitest unit tests
├── a11y/     # Accessibility tests
└── visual/   # (screenshots auto-generated by Playwright)
```

Step 6: Set Up Vitest for Unit Testing

```sh
npm install -D vitest @testing-library/dom jsdom
```

Create vitest.config.ts (or add to existing astro.config.mjs):

```ts
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    include: ['tests/unit/**/*.test.ts'],
    environment: 'jsdom',
  },
});
```

Write a sample unit test in tests/unit/example.test.ts:

```ts
import { describe, it, expect } from 'vitest';

describe('Utility functions', () => {
  it('example test runs', () => {
    expect(1 + 1).toBe(2);
  });
});
```
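For a more realistic target than 1 + 1, unit tests usually exercise small pure helpers. A sketch of the kind of utility that belongs in tests/unit/ (slugify here is a hypothetical example, not an existing project function):

```typescript
// Hypothetical utility worth unit-testing: turn a page title into a URL slug.
function slugify(input: string): string {
  return input
    .toLowerCase()
    .normalize('NFKD')               // split accented chars into base + mark
    .replace(/[\u0300-\u036f]/g, '') // strip the combining marks
    .replace(/[^a-z0-9]+/g, '-')     // collapse non-alphanumeric runs to hyphens
    .replace(/^-+|-+$/g, '');        // trim leading/trailing hyphens
}

// In tests/unit/slugify.test.ts:
// expect(slugify('Café & Bistro Menu')).toBe('cafe-bistro-menu');
```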

Astro Container API (for component isolation testing):

```ts
import { experimental_AstroContainer as AstroContainer } from 'astro/container';
import { describe, it, expect } from 'vitest';

describe('Astro Component', () => {
  it('renders correctly', async () => {
    const container = await AstroContainer.create();
    // Import and render component:
    // const result = await container.renderToString(MyComponent, { props: {} });
    // expect(result).toContain('expected content');
  });
});
```

Step 7: Add Accessibility Testing with axe-core

```sh
npm install -D @axe-core/playwright
```

Create tests/a11y/accessibility.spec.ts:

```ts
import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

test.describe('Accessibility', () => {
  test('homepage has no a11y violations', async ({ page }) => {
    await page.goto('/');
    const results = await new AxeBuilder({ page })
      .withTags(['wcag2a', 'wcag2aa', 'wcag21a', 'wcag21aa'])
      .analyze();
    expect(results.violations.filter(v =>
      ['critical', 'serious'].includes(v.impact!)
    )).toEqual([]);
  });

  // Add one test per route
  // test('about page has no a11y violations', async ({ page }) => { ... });
});
```

Step 8: Generate Visual Regression Baselines

Visual regression is already covered in Step 5 (toHaveScreenshot()). To generate initial baselines:

```sh
npx playwright test --update-snapshots
```

This creates *.png files in tests/e2e/homepage.spec.ts-snapshots/ — one per browser project, per viewport. Commit these to git as baselines.

Tips:

  • Mask dynamic content to prevent false positives:

    await expect(page).toHaveScreenshot('homepage.png', {
      maxDiffPixelRatio: 0.001,
      mask: [page.locator('.dynamic-date'), page.locator('.animation')],
    });
  • Run npm run test:visual to update baselines after intentional design changes

Step 9: Set Up ZeroStep (Free Tier)

```sh
npm install -D zerostep
```

Sign up at https://zerostep.com for free tier (500 AI calls/month).

Add ZEROSTEP_TOKEN to .env and CI secrets.

Usage in tests:

```ts
import { ai } from 'zerostep';
import { test, expect } from '@playwright/test';

test('newsletter signup works', async ({ page }) => {
  await page.goto('/');
  // Natural language — survives UI redesigns
  await ai('Find the newsletter signup form and enter test@example.com', { page, test });
  await ai('Click the subscribe button', { page, test });
  await ai('Verify a success message appears', { page, test });
});
```

Step 10: Configure the GitHub Actions Workflow

Update .github/workflows/playwright.yml:

```yaml
name: Test Suite

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

jobs:
  test:
    timeout-minutes: 30
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Run unit tests
        run: npm run test:unit
      - name: Install Playwright browsers
        run: npx playwright install --with-deps
      - name: Run E2E tests
        run: npm run test:e2e
      - name: Upload test results
        uses: actions/upload-artifact@v4
        if: ${{ !cancelled() }}
        with:
          name: playwright-report
          path: playwright-report/
          retention-days: 7
      - name: Upload test traces
        uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: test-traces
          path: test-results/
          retention-days: 7
```

Step 11: Set Up Lighthouse CI

```sh
npm install -D @lhci/cli
```

Create lighthouserc.js in monorepo root:

```js
module.exports = {
  ci: {
    collect: {
      startServerCommand: 'npm run preview',
      startServerReadyPattern: 'Local',
      url: ['http://localhost:4321/'],
      numberOfRuns: 3,
    },
    assert: {
      assertions: {
        'categories:performance': ['error', { minScore: 0.9 }],
        'categories:seo': ['error', { minScore: 0.95 }],
        'categories:accessibility': ['error', { minScore: 0.95 }],
        'categories:best-practices': ['warn', { minScore: 0.9 }],
      },
    },
    upload: {
      target: 'temporary-public-storage', // Free, stores reports for 7 days
    },
  },
};
```

Add Lighthouse step to GitHub Actions (after E2E tests):

```yaml
      - name: Build for Lighthouse
        run: npm run build
      - name: Run Lighthouse CI
        run: npx lhci autorun
        env:
          LHCI_GITHUB_APP_TOKEN: ${{ secrets.LHCI_GITHUB_APP_TOKEN }}
```

Step 12: The Autonomous Test-Fix Loop

This is the workflow where Claude Code autonomously runs and fixes tests:

Protocol:

  1. Claude runs npm run test:e2e -- --reporter=json
  2. Parses JSON output for failures
  3. On failure: reads Playwright trace (test-results/ folder), analyzes screenshots
  4. Fixes the source code
  5. Re-runs only failed tests: npx playwright test --grep "test name"
  6. Repeats until green
  7. Documents all fixes in ChangeLog.md

With Playwright MCP: Claude Code can also directly navigate the running site, interact with elements, and verify behavior visually — not just through test code.
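Step 5 of the protocol can be mechanized: given the failing test titles, build the `--grep` re-run command. A sketch (titles are regex-escaped because Playwright treats the `--grep` argument as a pattern; the helper name is illustrative):

```typescript
// Sketch: build a `--grep` command that re-runs only the named failing tests.
// Each title is regex-escaped so it matches literally, then the titles are
// joined with `|` into one alternation pattern.
function rerunCommand(failingTitles: string[]): string {
  const escaped = failingTitles.map(t =>
    t.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'),
  );
  return `npx playwright test --grep "${escaped.join('|')}"`;
}

// e.g. rerunCommand(['no 404 requests', 'visual regression'])
//   -> npx playwright test --grep "no 404 requests|visual regression"
```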


LambdaTest: Real-Device Testing (when revenue arrives)

  • Free tier: 60 minutes/month of real device testing
  • Paid: $15/month for more capacity

```sh
npm install -D lambdatest-playwright
```

Add LambdaTest config to playwright.config.ts as additional projects:

```ts
// Only in CI for pre-release validation
...(process.env.LAMBDATEST ? [{
  name: 'Real Safari iOS',
  use: {
    connectOptions: {
      wsEndpoint: `wss://cdp.lambdatest.com/playwright?capabilities=${encodeURIComponent(JSON.stringify({
        browserName: 'pw-webkit',
        browserVersion: 'latest',
        'LT:Options': {
          platform: 'MacOS Ventura',
          build: process.env.GITHUB_SHA,
          user: process.env.LT_USERNAME,
          accessKey: process.env.LT_ACCESS_KEY,
        },
      }))}`,
    },
  },
}] : []),
```

Trigger: Only run on pre-release tags, not every commit.
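One way to express that trigger in a GitHub Actions workflow, assuming release tags shaped like v1.2.3 (the tag pattern, job name, and secret names here are illustrative, not prescribed by the plan):

```yaml
# Sketch: run the LambdaTest project only when a release tag is pushed.
on:
  push:
    tags: ['v*']

jobs:
  real-device:
    runs-on: ubuntu-latest
    env:
      LAMBDATEST: '1'   # enables the conditional project in playwright.config.ts
      LT_USERNAME: ${{ secrets.LT_USERNAME }}
      LT_ACCESS_KEY: ${{ secrets.LT_ACCESS_KEY }}
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npx playwright test --project="Real Safari iOS"
```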


Create tests/e2e/seo.spec.ts:

```ts
import { test, expect } from '@playwright/test';

const pages = ['/', '/about', '/contact']; // Add all routes

for (const path of pages) {
  test.describe(`SEO: ${path}`, () => {
    test('has required meta tags', async ({ page }) => {
      await page.goto(path);

      // Title exists and is non-empty
      const title = await page.title();
      expect(title.length).toBeGreaterThan(0);
      expect(title.length).toBeLessThanOrEqual(60);

      // Meta description
      const desc = page.locator('meta[name="description"]');
      await expect(desc).toHaveAttribute('content', /./);

      // H1 exists (exactly one)
      const h1s = await page.locator('h1').count();
      expect(h1s).toBe(1);

      // Canonical link
      const canonical = page.locator('link[rel="canonical"]');
      await expect(canonical).toHaveAttribute('href', /./);
    });

    test('all images have alt text', async ({ page }) => {
      await page.goto(path);
      const images = page.locator('img');
      const count = await images.count();
      for (let i = 0; i < count; i++) {
        const alt = await images.nth(i).getAttribute('alt');
        expect(alt, `Image ${i} missing alt text on ${path}`).toBeTruthy();
      }
    });
  });
}
```

Final directory structure:

```text
monorepo/
├── playwright.config.ts
├── vitest.config.ts
├── lighthouserc.js
├── .github/
│   └── workflows/
│       └── playwright.yml            # CI: unit → E2E → Lighthouse
├── tests/
│   ├── e2e/
│   │   ├── homepage.spec.ts          # Core E2E + visual regression
│   │   └── seo.spec.ts               # SEO meta validation
│   ├── a11y/
│   │   └── accessibility.spec.ts     # axe-core scans
│   └── unit/
│       └── example.test.ts           # Vitest unit tests
├── .mcp.json                         # Playwright MCP config for CC
└── sites/
    └── Template/                     # Template site (first target)
```

Feed this file to the WSL CC session with: “Read this plan and implement it step by step.”

  • Step 1: Install Playwright (npm init playwright@latest)
  • Step 2: Configure playwright.config.ts (5 projects)
  • Step 3: Add test scripts to package.json
  • Step 4: Configure Playwright MCP (.mcp.json)
  • Step 5: Write first E2E tests (homepage.spec.ts)
  • Step 6: Install Vitest, write sample unit test
  • Step 7: Install @axe-core/playwright, write a11y tests
  • Step 8: Generate visual regression baselines
  • Step 9: Set up ZeroStep free tier
  • Step 10: Configure GitHub Actions workflow
  • Step 11: Set up Lighthouse CI with score thresholds
  • Step 12: Verify the autonomous loop (CC → test → fix → re-run)
  • Verify: npm run test:e2e runs on all 5 browser projects
  • Verify: toHaveScreenshot() generates baseline screenshots
  • Verify: Claude Code can run tests via Playwright MCP
  • Verify: GitHub Actions triggers on push and reports pass/fail
  • Verify: Lighthouse CI enforces score thresholds

Reference: Playwright WebKit vs. Real Safari


Playwright WebKit covers ~85-90% of real Safari behavior. Known gaps:

  • iOS-specific touch/focus event ordering
  • Third-party cookie/storage partitioning
  • Autoplay policies
  • Memory pressure and background app behavior

For content/educational sites: Playwright WebKit is sufficient for 95%+ of dev cycles. Reserve LambdaTest for pre-release validation only.