Run 3: Test, Ship, Repeat
Where You Left Off
In Run 2, you used skills to bring consistency to your observation network — type-specific forms that follow the same patterns, a decomposed backlog that kept your team organized, and manual review that caught real failures against specific acceptance criteria. The platform works, and it looks like a coherent system rather than a collection of one-off features.
Then in Lift 3, you solved the problem that was nagging you: manual review doesn't scale. You learned why the two-week cliff happens — one change silently breaks something you already verified — and you saw the fix. Your acceptance criteria in Given/When/Then format are already test specifications. You handed them to AI, it generated automated tests, and you watched them run. You experienced the closed loop: criteria → test → fail → implement → pass. And you deployed your observation network to a live URL — tests gating deployment, verified work going live.
You also discussed how to make the TDD cycle the default — using a skill or your project context file so that AI follows red → green automatically when you hand it a story. That idea is your starting point for this run.
The Challenge
Your observation network is live and you have the beginnings of a safety net. Now build with confidence. Use the closed loop for every feature — write the acceptance criteria, have AI generate a failing test, implement until it passes, confirm existing tests still pass, and redeploy. The safety net grows with every feature.
This is where automated testing pays off: you can be ambitious without being reckless. Add substantial features — and know that your existing work is protected.
Write the test first. Implement second. Redeploy when tests are green.
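The closed loop can be sketched in miniature. This is a minimal Python illustration, not code from the starter project: the story, the `format_observation_title` helper, and the data are all hypothetical, chosen only to show the red → green sequence.

```python
# Hypothetical story: "Given an observation with a type and a location,
# When the feed renders its title,
# Then the title shows the type followed by the location."

# Step 1 (red): write the test first. Running it before the helper
# exists fails with a NameError, which confirms the test exercises
# real behavior rather than passing vacuously.
def test_feed_title_shows_type_and_location():
    observation = {"type": "avalanche", "location": "Little Cottonwood"}
    assert format_observation_title(observation) == "avalanche at Little Cottonwood"

# Step 2 (green): implement just enough to make the test pass.
def format_observation_title(observation):
    return f"{observation['type']} at {observation['location']}"
```

Once this test is green, the full suite runs to confirm nothing else broke, and only then does the work ship.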
Baseline Capabilities
- The closed loop is your default workflow — at least two features built using the full cycle: acceptance criteria → generate failing test → implement → tests pass. If you discussed making TDD the default in Lift 3 (via a skill or project context update), put that into practice now.
- Core features have automated test coverage — the observation submission form, the observation feed, and at least one Run 2 feature have automated tests that run and pass.
- New feature added with regression confidence — add a substantial feature to the platform and verify that all existing tests still pass after the change. The safety net catches what manual review would miss.
- Application redeployed with latest work — the live URL reflects everything you've built through Run 3, shipped because tests passed.
Stretch Goals
- Map view with end-to-end tests — display observations on a map by location, with automated tests that verify observations render at the correct coordinates and that selecting a map marker shows the observation detail
- Live data enrichment — auto-tag submitted observations with current danger ratings from the avalanche forecast API or current weather conditions from the NWS API, with tests validating that enrichment data appears correctly (your repository has API documentation for both)
- Observation validation test suite — comprehensive automated tests for form validation: required fields produce errors when blank, type-specific fields appear for the correct observation types, invalid values are rejected
- Test coverage across all observation types — each type-specific form (avalanche, snowpack, weather, red flag) has its own test suite verifying its unique fields and behaviors
- Visual regression — use end-to-end tests to verify the observation feed layout, card rendering, and form display look correct from the user's perspective
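A validation test suite like the one in the stretch goals might look like the following sketch. Everything here is an assumption for illustration: `validate_observation`, the field names, and the error messages are hypothetical, not from the starter repository.

```python
# Hypothetical validator: returns a list of error messages, empty when valid.
REQUIRED_FIELDS = ("type", "location", "date")
VALID_TYPES = ("avalanche", "snowpack", "weather", "red flag")

def validate_observation(form):
    errors = []
    # Required fields produce errors when blank
    for field in REQUIRED_FIELDS:
        if not form.get(field):
            errors.append(f"{field} is required")
    # Invalid values are rejected
    if form.get("type") and form["type"] not in VALID_TYPES:
        errors.append(f"unknown observation type: {form['type']}")
    return errors

def test_blank_required_fields_produce_errors():
    errors = validate_observation({})
    assert "type is required" in errors
    assert "location is required" in errors
    assert "date is required" in errors

def test_invalid_type_is_rejected():
    errors = validate_observation(
        {"type": "volcano", "location": "Alta", "date": "2024-01-15"}
    )
    assert errors == ["unknown observation type: volcano"]

def test_valid_observation_passes():
    form = {"type": "avalanche", "location": "Superior", "date": "2024-01-15"}
    assert validate_observation(form) == []
```

Each test maps to one line of the stretch goal: blanks rejected, bad values rejected, valid input accepted. The same pattern extends to type-specific fields.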
Tips
- Make the closed loop automatic first. If you discussed a TDD skill or context file update in Lift 3's team discussion and didn't yet create it, do so now — before you start building features. A simple addition to your project context file works: "When implementing a user story, always write a failing test from the acceptance criteria first, then implement until the test passes, then run the full test suite to check for regressions." Every feature you build after that benefits from the pattern.
- Write the test first, even when it feels backward. Seeing the test fail before implementation confirms you're testing the right thing. If the test passes before you've built anything, it's not testing what you think.
- Your acceptance criteria are already test blueprints. The Given/When/Then format maps directly: "Given" becomes the test setup, "When" becomes the action, "Then" becomes the assertion. When AI generates a test from your criteria, check that each part is represented. If the test doesn't mirror the structure of your criterion, it's testing something else.
- Use your sample observations as test fixtures. The pre-seeded observation data is designed for this — real Wasatch locations, real observation types, realistic field values. Tests that use this data verify against realistic scenarios.
- When a test fails, check the error message. A good failure message tells AI exactly what went wrong — "expected 'Salt Lake' but got 'undefined'" is actionable. A vague message like "assertion failed" leaves AI guessing and leads to the same spinning loop from Run 1. If your test failures aren't clear, ask AI to improve them.
- Redeploy after each batch of green tests. Keep the live URL current. Tests pass → save and sync → redeploy. Build the rhythm: verify it, then ship it.
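The Given/When/Then mapping and the clear-failure-message tip can be combined in one small sketch. This is illustrative Python, assuming a hypothetical `filter_feed_by_region` helper and made-up observation data, not the actual platform code.

```python
# Hypothetical criterion: "Given a published observation in Salt Lake,
# When the feed is filtered by region 'Salt Lake',
# Then that observation appears in the results."
# filter_feed_by_region is an illustrative stand-in for real feed code.
def filter_feed_by_region(observations, region):
    return [o for o in observations if o["region"] == region]

def test_feed_filter_matches_region():
    # Given: the setup mirrors the criterion's precondition
    observations = [
        {"title": "Wind slab on Superior", "region": "Salt Lake"},
        {"title": "Surface hoar at Tony Grove", "region": "Logan"},
    ]
    # When: the action under test
    results = filter_feed_by_region(observations, "Salt Lake")
    # Then: the assertion, with a message that tells AI exactly what
    # went wrong instead of a bare "assertion failed"
    titles = [o["title"] for o in results]
    assert titles == ["Wind slab on Superior"], (
        f"expected ['Wind slab on Superior'] but got {titles}"
    )
```

If any clause of the criterion has no counterpart in the test's setup, action, or assertion, the test is verifying something other than the story.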