Reflection 2

A facilitation guide for the team debrief following Run 2. These questions are designed to spark discussion — not every question needs to be covered. Pick the ones that resonate with what you observed during the run.

What You Built

  • What skills did your team create? Walk through one — what problem did it solve, and how did it change the consistency of AI's output?
  • Did you do the process with AI first — refining until the output was right — and then capture it as a skill? Or did you skip straight to "create a skill for X"? What was the difference in quality?
  • Compare something you built in Run 1 (without skills) to something you built in Run 2 (with skills). What's different about the result? What's different about how you got there?
  • Did you find that adding a new feature broke something that was already working? What happened, and how did you handle it?
  • Did you ask AI to check its own work — whether the codebase was well-organized, whether anything should be cleaned up before adding more? If so, what did it find? If not, what might have been building up underneath features that "worked"?

What You Practiced

  • How did decomposition change the way you worked compared to Run 1? Did having a managed backlog of independently shippable pieces change how your team made decisions about what to build next?
  • When you reviewed against acceptance criteria, did you catch failures that you would have missed with a "looks good" check? What did a specific pass/fail call teach you about the quality of what AI produced?
  • Lift 2 described the spinning loop — re-prompting in circles without clear criteria. Did you catch yourself in the loop during this run? What pulled you out?

How You Worked

  • How did your team divide the work this time? Did the decomposed backlog change how you organized — could different people take different stories, or did you still need to mob?
  • Did your skills help the team stay aligned, or did different team members still produce different conventions?

Looking Ahead

  • Manual review works, but it doesn't scale. You walked through acceptance criteria by hand for every feature. How many features do you have now — and did you re-check the earlier ones after adding new ones? The faster you ship without automated checks, the more likely a new change quietly breaks something you already verified. What would it take to make that verification automatic?
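
One concrete answer to that last question: each acceptance criterion you walked through by hand can become an assertion that re-runs automatically every time a new feature lands. A minimal sketch, assuming a hypothetical `slugify` feature and made-up criteria (not from your run):

```python
# A minimal sketch of turning manual acceptance criteria into an automated
# check. The feature here (a hypothetical slugify helper) and its criteria
# are illustrative, not taken from any team's actual backlog.

def slugify(title: str) -> str:
    """Example feature under test: turn a title into a URL slug."""
    return "-".join(title.lower().split())

def check_acceptance() -> str:
    """Acceptance criteria as pass/fail assertions, not a 'looks good' read."""
    # Criterion: lowercase, words joined with hyphens
    assert slugify("Hello World") == "hello-world"
    # Criterion: extra whitespace collapsed
    assert slugify("  Extra   Spaces ") == "extra-spaces"
    return "all criteria pass"

print(check_acceptance())
```

Once criteria live in a script like this, re-checking every earlier feature after each new change is one command instead of a manual walkthrough, which is exactly the scaling problem the question points at.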