# Run 1: The Forecast Intelligence Platform
## The Challenge
Build a multi-zone forecast intelligence platform. The deterministic pipeline ingests published forecasts, weather, and snowpack data across UAC zones, parses danger ratings, and routes alerts through configurable thresholds. The AI-powered analysis layer calls the Anthropic API to generate enriched analysis — contextual alerts, cross-zone briefings, risk factor extraction — that makes the published forecast accessible both to operations staff monitoring multiple zones and to backcountry travelers planning a trip.
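To make the deterministic half concrete, here is a minimal sketch of the first pipeline stage: parsing a snapshot of published forecasts into per-zone records. The field names (`zone_id`, `danger_rating`, `problems`) and the JSON shape are assumptions for illustration, not the actual schema of the pre-seeded data — check the repository's snapshot files for the real structure.

```python
import json
from dataclasses import dataclass, field

# North American danger scale, ordered least to most severe.
DANGER_SCALE = ["low", "moderate", "considerable", "high", "extreme"]

@dataclass
class ZoneForecast:
    zone: str
    danger: str                           # normalized danger rating, e.g. "considerable"
    problems: list[str] = field(default_factory=list)  # active avalanche problems

def parse_snapshot(raw: str) -> list[ZoneForecast]:
    """Parse a JSON forecast snapshot into validated per-zone records."""
    records = []
    for entry in json.loads(raw):
        danger = entry["danger_rating"].lower()
        if danger not in DANGER_SCALE:
            raise ValueError(f"unknown danger rating: {danger!r}")
        records.append(ZoneForecast(entry["zone_id"], danger, entry.get("problems", [])))
    return records
```

Normalizing ratings against a fixed scale up front means every downstream consumer (dashboard sorting, alert thresholds, AI prompts) can rank zones with a simple `DANGER_SCALE.index(record.danger)` comparison.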
You have the Director's Toolkit from Lift 1: a verified deployment pipeline, a context architecture for managing complexity across subsystems, skills that encode your team's role gaps as AI roles, and delegation patterns for running parallel workstreams. Put all of it to work.
Your system should feel like something an operations manager would actually open at the start of each day, or a backcountry skier would check before a trip. Not a code demo — a tool.
## Who You're Building For
Forecast operations staff — people who manage the UAC forecasting program across all 9 zones. They don't write the daily forecast — the zone forecasters do that. Operations staff care about the big picture: which zones have the highest danger right now, whether any zones are missing data, where field resources should be deployed today, and what patterns are emerging across the range. Their pain point is the cognitive load of monitoring 9 zones × 3 data sources without a synthesized view.
Backcountry recreationists — skiers, snowboarders, and mountaineers planning a trip into the backcountry. They check the forecast before heading out but may not have the expertise to parse a technical forecast discussion about persistent weak layers and faceted snow. They need the forecast translated into actionable trip-planning information: what's the danger, what are the key concerns, and what terrain should they avoid.
The forecasters who publish the daily danger ratings are the domain experts whose assessments are your system's ground truth. Their published forecasts are the authoritative input — your system makes that expertise accessible to the people downstream.
## What You're Starting With
From Lift 1, your team has:
- A verified deployment pipeline — you deployed a landing page and confirmed continuous delivery works
- A context architecture discussion — what goes in the project root vs. path-scoped rules vs. subdirectory context
- A role gap analysis — which missing team roles (PM, QA, designer, tech lead) could be filled by skills
- A visibility problem framing — the question isn't "does the code run" but "does the AI-generated analysis accurately reflect the published forecast"
Your repository has pre-seeded data for all 9 UAC zones: forecast snapshots, NWS weather data, SNOTEL snowpack readings, zone configurations, and alert thresholds. API documentation covers all five data sources for live integration.
## Baseline Capabilities
- Continuous delivery from the start — the deployment pipeline is verified and you're shipping to your live URL throughout the run, not just at the end. Every significant feature goes live as soon as it works.
- Context architecture implemented — your project has a root-level context file using the bootstrap pattern, plus at least one path-scoped rule or subdirectory context for a specific subsystem (data ingestion, analysis, dashboard, tests — whichever your team decides warrants scoped context)
- At least one skill filling a role gap — your team identified missing roles in Lift 1's discussion. Turn at least one into a working skill: decomposition (PM), test generation (QA), UI component conventions (designer), or code review patterns (tech lead). The skill is version-controlled and used during the run.
- Multi-zone data ingestion and display — the engine loads forecast data for multiple UAC zones from the pre-seeded snapshot and displays current conditions on a dashboard. An operations manager opening the dashboard sees danger ratings, active avalanche problems, and zone-level summaries without clicking into each zone individually.
- At least one AI-powered analysis component — the system does more than parse and display forecast data. It calls the Anthropic API at runtime to generate something the raw feed doesn't provide on its own: a cross-zone synthesis briefing for operations staff, a plain-language summary for backcountry travelers, a contextual alert message that explains why an alert fired, or risk factor extraction from the forecast discussion text. This is the layer that will need evaluation in Run 2 — build it now so you have something to measure.
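The cross-zone briefing variant of that AI layer can be sketched with nothing but the standard library and the public Anthropic Messages API. The endpoint, headers, and request shape below follow Anthropic's published HTTP API; the model id is a placeholder your team should pin deliberately, and the prompt wording is only a starting point — in Run 2 this is exactly the string you'll want under evaluation.

```python
import json
import os
import urllib.request

API_URL = "https://api.anthropic.com/v1/messages"

def build_briefing_request(zone_summaries: dict[str, str]) -> dict:
    """Assemble a Messages API payload asking for a cross-zone synthesis briefing."""
    conditions = "\n".join(
        f"- {zone}: {summary}" for zone, summary in sorted(zone_summaries.items())
    )
    return {
        "model": "claude-sonnet-4-5",  # placeholder; pin whichever model your run standardizes on
        "max_tokens": 1024,
        "messages": [{
            "role": "user",
            "content": "Write a morning operations briefing synthesizing today's "
                       f"per-zone avalanche conditions:\n{conditions}",
        }],
    }

def generate_briefing(zone_summaries: dict[str, str]) -> str:
    """Call the Anthropic Messages API; requires ANTHROPIC_API_KEY in the environment."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_briefing_request(zone_summaries)).encode(),
        headers={
            "x-api-key": os.environ["ANTHROPIC_API_KEY"],
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["content"][0]["text"]
```

Keeping payload construction separate from the network call pays off immediately: the prompt builder is a pure function you can unit-test today and evaluate against golden rubrics in Run 2 without spending API calls.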
## Stretch Goals
- Cross-source data correlation — combine forecast, weather, and snowpack data to surface relationships: wind events that may be loading specific aspects, temperature trends that suggest warming cycles, snowpack changes that correlate with increased danger
- Configurable alert engine — alerts fire based on configurable thresholds (danger level, rate of change, multi-problem combinations). Use the pre-seeded alert threshold configuration as a starting point, with the ability to adjust rules
- AI-powered zone analysis — use the Anthropic API to generate a richer interpretation for individual zones, contextualizing the published danger rating with weather forecast, snowpack trends, and active avalanche problems
- Parallel workstream execution — multiple team members building independent subsystems simultaneously: one on data ingestion, one on the dashboard, one on analysis, one on alerting. Separate conversations, independent delegation contracts, work that integrates through the shared codebase
- Live API integration — supplement pre-seeded data with live API calls for real-time conditions, with caching to protect the public APIs
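A configurable alert engine like the one in the stretch goals can stay very small. This sketch assumes rules parameterized by a minimum danger rating and a minimum count of active problems — the pre-seeded threshold configuration may express rules differently (rate-of-change triggers, specific problem combinations), so treat the `AlertRule` fields as illustrative.

```python
from dataclasses import dataclass

# North American danger scale, ordered least to most severe.
DANGER_SCALE = ["low", "moderate", "considerable", "high", "extreme"]

@dataclass
class AlertRule:
    name: str
    min_danger: str            # fire at or above this rating
    min_problem_count: int = 0

def evaluate(rule: AlertRule, danger: str, problems: list[str]) -> bool:
    """Return True when a zone's current state crosses the rule's thresholds."""
    severe_enough = DANGER_SCALE.index(danger) >= DANGER_SCALE.index(rule.min_danger)
    return severe_enough and len(problems) >= rule.min_problem_count

def route_alerts(rules, zone_states):
    """Yield (zone, rule name) for every rule that fires, for downstream routing."""
    for zone, (danger, problems) in zone_states.items():
        for rule in rules:
            if evaluate(rule, danger, problems):
                yield zone, rule.name
```

Because rules are plain data, "the ability to adjust rules" reduces to loading a different list of `AlertRule` values from config — no code change needed when operations staff tighten a threshold.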
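For the live-integration stretch goal, "caching to protect the public APIs" can be a single time-based cache wrapped around each fetch. This is a generic sketch, not tied to any specific NWS or SNOTEL client; the injectable clock exists purely so the cache is testable without real waiting.

```python
import time

class TTLCache:
    """Tiny time-to-live cache so live API calls hit public endpoints sparingly."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch() only on miss or expiry."""
        now = self.clock()
        hit = self._store.get(key)
        if hit and hit[0] > now:
            return hit[1]
        value = fetch()
        self._store[key] = (now + self.ttl, value)
        return value
```

A usage pattern might be `cache.get_or_fetch("nws:salt-lake", lambda: fetch_nws(zone))` with a TTL of a few minutes, so refreshing the dashboard repeatedly never hammers the upstream API.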
## Tips
- Deploy first, build second. You verified the pipeline in Lift 1. Keep using it. The first feature you build should be live before you start the second. Continuous delivery means continuous — not "deploy at the end."
- Your context architecture matters immediately. This is a multi-subsystem project. A monolithic context file will degrade as the project grows. Set up your layered architecture before the codebase gets complex — it's easier to scope context to subsystems when the subsystems are small.
- Build against the snapshot, not the APIs. The multi-zone snapshot contains everything you need for Run 1. Live APIs add latency, rate limit risk, and failure modes you don't want to debug during a build sprint. Wire in live data as a stretch goal once the engine works against static data.
- The golden datasets are your future answer key. You won't build eval harnesses until Run 2, but knowing the golden datasets exist should shape how you build now. The expected outputs (danger ratings, avalanche problems, alert decisions) serve double duty: direct comparison targets for testing the deterministic pipeline, and rubric criteria for evaluating whether your AI-generated analysis accurately reflects what the forecaster identified.
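Even before Run 2's eval harnesses, the "direct comparison targets" half of that double duty is a few lines of code. This sketch assumes both the pipeline output and the golden dataset reduce to per-zone dictionaries of expected values (danger ratings, alert decisions); the real golden files may nest more structure, but the comparison pattern is the same.

```python
def diff_against_golden(actual: dict, golden: dict) -> list[str]:
    """Compare per-zone pipeline output against golden expected outputs.

    Returns a human-readable mismatch line per failure; an empty list means
    the deterministic pipeline matches the forecasters' ground truth.
    """
    failures = []
    for zone, expected in golden.items():
        got = actual.get(zone)
        if got is None:
            failures.append(f"{zone}: missing from pipeline output")
        elif got != expected:
            failures.append(f"{zone}: expected {expected!r}, got {got!r}")
    return failures
```

Writing this comparison now, while building against the snapshot, means the Run 2 eval harness starts from a working assertion layer instead of from scratch.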