Building in Public: AI-Powered Coaching

Here's the scene. I'm standing at my desk in New York, mid-thoracic rotation, holding an isometric against the desk edge. Jon is on Zoom from Costa Rica, surrounded by plants and natural light, coaching me through a movement I've been refining for a while now. In the background, AI agents are parsing the transcript of our last session, extracting every move into structured documentation.

Jon and I have been doing this for a few years: twice-weekly sessions where I train at my standing desk while he coaches remotely. Recently, we decided to build a system around it. We're loading the 12 most recent sessions into a pipeline that extracts, structures, and publishes everything Jon teaches.

Three things are happening at once: I'm getting coached, I'm working on the product, and the coaching session itself is generating the data the product needs. That wasn't the original plan. But it's turning out to be the most interesting thing I've built.

Two dials, not one

Most fitness systems have one dimension: hard or easy. You're either "working out" or you're not. I think that binary is a big reason many knowledge workers don't exercise: they can't justify stopping deep work for a gym session.

Jon taught me something different. Every move operates on two independent dials.

The Two-Dial Model

Intention = how much attention you're devoting to the workout (20% to 100%)

Effort = how hard you're pushing physically (gentle to near-max)

20% intention, gentle effort: Ankle rolls during a meeting. Nobody notices.
20% intention, high effort: Isometric hold against the desk while reading email. Invisible strength work.
100% intention, gentle effort: Mindful mobility flow. Full focus, low load.
100% intention, max effort: Full BFR protocol. This is a workout.
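
One way to picture the two dials as data is sketched below. This is a hypothetical illustration, not the schema the site uses: the Setting class, its field names, and the fits_moment helper are assumptions made for the example. Only the four rows come from the list above.

```python
from dataclasses import dataclass

# Hypothetical model of the two dials; class and field names are illustrative.
@dataclass(frozen=True)
class Setting:
    intention: int  # percent of attention on the workout, 20 to 100
    effort: str     # "gentle", "high", or "max"
    example: str

# The four combinations listed in the post.
SETTINGS = [
    Setting(20, "gentle", "Ankle rolls during a meeting"),
    Setting(20, "high", "Isometric hold against the desk while reading email"),
    Setting(100, "gentle", "Mindful mobility flow"),
    Setting(100, "max", "Full BFR protocol"),
]

def fits_moment(available_attention: int) -> list[Setting]:
    """Return every setting whose intention fits the attention you can spare."""
    return [s for s in SETTINGS if s.intention <= available_attention]

# With only 20% attention to spare, two settings are still available.
for s in fits_moment(20):
    print(f"{s.intention}% intention, {s.effort} effort: {s.example}")
```

The point of the sketch is that the list is never empty for any attention level from 20% up, which is the "there's always a setting" claim in code form.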

This is the unlock. There is always a setting that fits your current moment. You're never "too busy" to train — you just dial down intention and keep working. And because you're moving throughout the day, you never have to find an hour to "go work out."

Years in, I haven't missed a single workout. Not because I'm disciplined. Because there's nothing to skip.

How we built it: three layers

Here's the technical problem. I need to take a 45-minute coaching transcript and extract every move into structured documentation — with setup instructions, progressions, dosage guidance, and intention profiles for four different levels. That's at least five processing steps. And LLMs are probabilistic.

If each step is 90% accurate (and 90% is generous) and the errors are independent, you get 0.9 × 0.9 × 0.9 × 0.9 × 0.9 ≈ 0.59, or 59% success across five steps. That's barely better than a coin flip. Not acceptable when you're documenting movement instructions that people will follow with their bodies.
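
The arithmetic is easy to check. The snippet below is a small sketch that assumes each step succeeds or fails independently of the others, which is a simplification; chain_success is a name made up for this example.

```python
def chain_success(per_step: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    return per_step ** steps

# Five chained steps at 90% each.
print(f"{chain_success(0.9, 5):.0%}")   # 59%

# The same chain at 99% per step, for comparison.
print(f"{chain_success(0.99, 5):.0%}")  # 95%
```

Per-step reliability compounds, so small gains at each step matter a lot once the steps are chained.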

The solution came from a framework I picked up watching Nick Saraev's YouTube channel on AI automation: split the system into three layers.

DOE Architecture

Directive (what to do) — Markdown SOPs that define goals, inputs, tools, outputs, and edge cases. Like instructions you'd give a good employee.

Orchestration (decision-making) — AI reads directives, calls scripts in the right order, handles errors, and updates directives with learnings. This is the judgment layer.

Execution (doing the work) — Deterministic Python and JavaScript scripts. Extract moves, generate Word docs, build React pages, deploy to Vercel. Reliable, testable, fast.

The key insight: push complexity into deterministic code. Let AI handle routing, judgment, and error recovery. Keep the business logic in scripts that do the same thing every time. When something breaks, the AI doesn't just retry. It reads the error, fixes the script, updates the directive, and tests against prior transcripts.
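
Here's a toy sketch of how the three layers relate. It is not our actual code: the function names, the DIRECTIVE dictionary, and the "MOVE:" line convention are all invented for illustration, and a plain loop stands in for the AI orchestrator.

```python
from typing import Any, Callable

PREFIX = "MOVE:"  # hypothetical marker; real transcripts are not tagged like this

# Execution layer: deterministic functions that do the same thing every time.
def extract_moves(transcript: str) -> list[str]:
    return [line[len(PREFIX):].strip()
            for line in transcript.splitlines()
            if line.startswith(PREFIX)]

def render_doc(moves: list[str]) -> str:
    if not moves:
        raise ValueError("no moves extracted")
    return "\n".join(f"- {m}" for m in moves)

# Directive layer: a plain description of the goal and the steps to run.
DIRECTIVE = {
    "goal": "document moves from a transcript",
    "steps": ["extract_moves", "render_doc"],
}

STEPS: dict[str, Callable[[Any], Any]] = {
    "extract_moves": extract_moves,
    "render_doc": render_doc,
}

# Orchestration layer: runs the steps in order and surfaces failures
# with enough context to fix the script, instead of blindly retrying.
def orchestrate(directive: dict, payload: Any) -> dict:
    for name in directive["steps"]:
        try:
            payload = STEPS[name](payload)
        except Exception as err:
            return {"ok": False, "failed_step": name, "error": str(err)}
    return {"ok": True, "output": payload}

print(orchestrate(DIRECTIVE, "MOVE: thoracic rotation\nsmall talk\nMOVE: ankle rolls"))
print(orchestrate(DIRECTIVE, "small talk only"))
```

The second call fails inside render_doc and returns the step name and error message, which is the information an orchestrator would need to repair the script and update the directive.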

We have 9 directives, 24 execution scripts, and a growing knowledge base that gets smarter with every session. The AI orchestrates. The code executes. The directives improve.

Self-annealing: errors make it stronger

Every system has bugs. Most systems try to minimize them. Ours tries to learn from them.

We keep a correction log. Every time a human reviews an extraction and finds an error, it gets logged: what was wrong, why, and what the fix is. Then that fix propagates — into the extraction prompts, the validation scripts, and the directives themselves.
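
A correction log can be as simple as a list of structured entries. The sketch below is an assumption about shape, not our real format: the Correction fields mirror the three things we record (what was wrong, why, and the fix), and prompt_rules is a hypothetical helper showing how a fix could propagate into an extraction prompt.

```python
from dataclasses import dataclass, asdict

# Hypothetical entry format: one record per human-reviewed error.
@dataclass
class Correction:
    what_was_wrong: str
    why: str
    fix: str

def append_correction(log: list[dict], entry: Correction) -> list[dict]:
    """Add one reviewed error to the log as a plain dict."""
    log.append(asdict(entry))
    return log

def prompt_rules(log: list[dict]) -> str:
    """Turn logged fixes into rule lines that could be appended to a prompt."""
    return "\n".join(f"- {c['fix']}" for c in log)

log: list[dict] = []
append_correction(log, Correction(
    what_was_wrong='"Add more tension" classified as a progression',
    why="Progression and dosage were being confused",
    fix="If body position doesn't change, it's dosage",
))
print(prompt_rules(log))
```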

The biggest one so far: progression vs. dosage. A progression is a new body position — you change what you're doing. Dosage is the same position with more effort — you change how hard you're doing it. The AI kept confusing them. "Add more tension" was getting classified as a progression when it's actually a dosage increase.

Fix: we updated the extraction prompt with an explicit decision boundary ("if body position doesn't change, it's dosage"), reran it against both previously processed transcripts, and verified that it classified correctly. Then we updated the directive so future sessions start with this rule baked in.
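
The decision boundary is simple enough to express as a deterministic check. This is a sketch under assumptions: classify_change and the regression cases are made up for illustration (only "Add more tension" comes from a real error), and it presumes something upstream has already decided whether the body position changed.

```python
def classify_change(position_changed: bool) -> str:
    """Decision boundary from the fix: same body position means dosage."""
    return "progression" if position_changed else "dosage"

# Hypothetical regression cases: (cue, position_changed, expected label).
CASES = [
    ("Add more tension", False, "dosage"),
    ("Switch to a different body position", True, "progression"),
]

def run_regression(cases: list[tuple[str, bool, str]]) -> list[str]:
    """Return the cues whose classification does not match the expected label."""
    return [cue for cue, changed, expected in cases
            if classify_change(changed) != expected]

print(run_regression(CASES))  # an empty list means every case passed
```

Keeping cases like these around means every later prompt change can be checked against the errors we've already fixed.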

The system is now stronger. Not just bigger — stronger. That's the difference between a database and a learning system.

"We're at the forefront of a lot of this stuff. We're experimenting a little bit... on paper, I know that you're playing with the hypoxic training, pressurization of the cerebrospinal fluid. But then ultimately, we try it. We see how it works for you... and that's the only real way to figure it out."
— Jon, Session 8

Jon's describing his coaching philosophy, but he's also describing the entire system. You can architect all you want on paper. The real learning happens when you run it, find the failures, and let them make the system smarter.

Dead ends

Not everything worked.

Auto-summaries. We started with auto-generated call summaries as the basis for documentation. They were decent for recall ("oh right, we did that move") but useless for teaching. They flattened Jon's layered coaching into bullet points. Setup, progression, stacking, dosage — all collapsed into "do this movement." You can't teach someone a move from that. We had to build a real extraction pipeline instead.

Flat documentation. The first format was a 1-pager per move. Too much information for someone who already knows the move (they just need a glance). Too little for someone learning it (they need setup, execution, feel targets, modifications). We evolved to progressive disclosure — three layers of the same content:

Glance — 5-second scan. Name, one-liner summary, three bullets. For when you've done it 20 times.
Guided — Follow-along level. Setup, execution, progressions. For developing competence.
Full — Complete reference. Everything including coach notes, visualization cues, common mistakes, and why the move works biomechanically.

Same move, three depths. The workout pages let you toggle between them.
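
Progressive disclosure can be sketched as one record filtered three ways. The field names below are guesses based on the descriptions above, the sample values are placeholders rather than real coaching content, and view is a hypothetical helper, not a function from our codebase.

```python
# Which fields each depth shows; "full" shows everything on the record.
DEPTH_FIELDS = {
    "glance": ["name", "summary", "bullets"],
    "guided": ["name", "setup", "execution", "progressions"],
}

def view(move: dict, depth: str) -> dict:
    """Return the subset of a move record shown at the given depth."""
    if depth == "full":
        return dict(move)
    return {k: move[k] for k in DEPTH_FIELDS[depth] if k in move}

# Placeholder record; the values stand in for real documentation.
move = {
    "name": "Thoracic rotation",
    "summary": "placeholder one-liner",
    "bullets": ["placeholder", "placeholder", "placeholder"],
    "setup": "placeholder setup",
    "execution": "placeholder execution",
    "progressions": ["placeholder progression"],
    "coach_notes": "placeholder notes",
    "common_mistakes": ["placeholder mistake"],
}

for depth in ("glance", "guided", "full"):
    print(depth, sorted(view(move, depth)))
```

Because all three views read from the same record, a correction to the move only has to be made once.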

Merging bilateral asymmetry. Early on, the extraction was combining left and right sides into one instruction. Jon teaches them differently on purpose — the left side of your body has different patterns than the right. When we merged them, we lost coaching nuance that matters. Fixed: preserve bilateral cues as separate entries.
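
One way to keep sides from merging is to make the side part of the key. This is a sketch under assumptions: the Cue class and group_by_side helper are invented for the example, and the cue texts are placeholders, not Jon's actual instructions.

```python
from dataclasses import dataclass

# Hypothetical cue record; "side" is "left" or "right".
@dataclass(frozen=True)
class Cue:
    move: str
    side: str
    text: str

def group_by_side(cues: list[Cue]) -> dict[tuple[str, str], list[str]]:
    """Group cue texts by (move, side) so left and right never collapse together."""
    grouped: dict[tuple[str, str], list[str]] = {}
    for cue in cues:
        grouped.setdefault((cue.move, cue.side), []).append(cue.text)
    return grouped

cues = [
    Cue("Thoracic rotation", "left", "placeholder left-side cue"),
    Cue("Thoracic rotation", "right", "placeholder right-side cue"),
]
for key, texts in group_by_side(cues).items():
    print(key, texts)
```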

What's live right now

This isn't a concept. It's running.

Project Status

Sessions being loaded: 12 from the most recent coaching cycle

Sessions processed so far: 2 of 12

Moves documented: 14 (7 full, 7 stubs)

Directives: 9 SOPs

Execution scripts: 24 Python/JS scripts

Website: Static HTML + React on Vercel. Interactive workout pages with intention slider.

Blog posts: This one, plus a deep dive on BDNF and eye training

The workout pages are the most interesting output. Each one is a React app with an intention slider — drag it from 20% to 100% and the move cards dynamically show what each move looks like at that attention level. It's the two-dial concept made interactive.
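
The slider logic can be sketched as a lookup: snap the slider value down to the nearest defined level and show that level's profile. The real pages are React; this Python version is only an illustration. The post mentions four levels between 20% and 100%, but the middle values in LEVELS, the function names, and the sample profiles are assumptions.

```python
from bisect import bisect_right

# Assumed levels; only the 20 and 100 endpoints come from the post.
LEVELS = [20, 50, 80, 100]

def active_level(slider: int) -> int:
    """Clamp the slider to the valid range, then snap down to a defined level."""
    clamped = max(LEVELS[0], min(LEVELS[-1], slider))
    return LEVELS[bisect_right(LEVELS, clamped) - 1]

def card_text(profiles: dict[int, str], slider: int) -> str:
    """Pick the profile text a move card would show at this slider position."""
    return profiles[active_level(slider)]

# Placeholder profiles for one move.
profiles = {20: "profile at 20%", 50: "profile at 50%",
            80: "profile at 80%", 100: "profile at 100%"}
print(card_text(profiles, 65))
```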

The site is static HTML deployed on Vercel with no build step. The React components are transpiled in-browser via Babel. It's simple on purpose — fast to iterate, nothing to break.

Right now, everything is free. We're putting it out there because we think the workouts are genuinely useful and we want to see how people use them. We'll figure out what this becomes as we go — maybe it stays free, maybe some of it becomes a product. For now, we're just building and sharing.

Where this goes

Short-term: process the remaining 10 sessions. Each one adds to the knowledge base — new moves, refinements to existing ones, injury adaptations, coaching evolution. The system gets better with each transcript because the correction log and the directives keep improving.

Longer-term: the architecture isn't specific to fitness coaching. The DOE pattern — directive-driven SOPs, AI orchestration, deterministic execution — works for any domain where an expert's knowledge needs to be captured, structured, and delivered in multiple formats. We're starting with movement coaching because that's what we know. But the pattern is the thing.

The system learns from its mistakes. And we're building it all in the open.

This is the first post about that process. More to come, including the failures.

Try the workout

See what AI-powered coaching extraction looks like in practice.

Free Workout