Aught Hour · May 9, 2026 · CPO Council · Tenni Theurer

Shape the Field: AI Agents in Product

Speaker notes and outline

Format: Talk + interactive web deck (15-20 min)

Audience: CPO Council - product executives exploring AI agent adoption

Key thesis: Building great agent systems is product work. The PMs who build agent infrastructure around themselves will operate at a visibly different level - and the gap widens fast.

Opening (1 min)

"I've been running AI agents as my actual operating system for about a year now - across startups, an advisory role, and now a large enterprise. Not experimenting. Operating. And I built the whole thing with a product manager's lens - backlogs, feedback loops, interaction contracts, value checks. That framing turned out to matter more than the AI itself. I want to share what I learned across three phases."

Frame the three acts:

  1. What I was building when I first started (the solo phase)
  2. How my thinking evolved (the shift to enterprise)
  3. What the space looks like 12 months from now

Act 1: Early Use Cases (3 min)

Key message: Agents started as task automation. The interesting part is when they stopped being tools and became domain-specific operating systems.

Where everyone starts

Task automation. Auto-bookers, calendar sync, receipt scanners. The agent does what a human would do, just faster. This is where most orgs are right now.

Where it gets interesting: agents for real businesses

The turning point: memory

"All of this was for me and small teams. Full control, fast iteration. Then I took a role at a large company and everything broke."

Act 2: What Changed (7-8 min)

Key message: The pattern survived the transition. The implementation had to be completely rewritten. Three weeks to get it load-bearing.

The 3-week bootstrap

Week | Mode        | What happened
1    | Bootstrap   | Setup. Basic triage. AI captures, doesn't really act. Mostly trust-building.
2    | OS Overhaul | Rewrote the system mid-week. AI started doing real work through the system. First skills emerged.
3    | Leverage    | Load-bearing for daily ops and strategic synthesis. Started building infrastructure for other people to use.

The shape: capture -> automate -> leverage -> share.

What "Share" actually looked like (Week 3)

The Manager OS: 8 use cases

Not a chatbot. A manager OS. Two repos: one shareable, one private. AI handles inputs and drafts; the human owns judgment and the send button.

  1. Daily intelligence triage - One sweep of email, chat, calendar, transcripts, shared files, work items, service health
  2. Meeting prep & capture - Auto-prep before, auto-capture after. Action items reconcile back to a unified backlog
  3. Drafting with context - Replies pulled from full thread history, KB, and prior interactions. Written in your voice. Click-to-copy HTML draft, never auto-send
  4. People intelligence - Every shared artifact updates a running profile: what they own, what they care about, friction points
  5. Knowledge base - LLM-maintained pages over curated raw sources. Humans curate inputs, AI structures the knowledge
  6. Strategic synthesis - Briefs, POV docs, framings generated from accumulated context, not from scratch
  7. Backlog & follow-through - Unified store for todos, drafts, delegated items, watch items. Auto-resolves when downstream evidence shows up
  8. Operational hygiene - Privacy guards on every commit, nightly lint, session memory rotation
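
The auto-resolve behavior in item 7 can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the system described in the talk: `BacklogItem`, `Kind`, `resolve_when`, and `reconcile` are hypothetical names, and plain substring matching stands in for whatever evidence check the real system uses.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    # The four item types named in the outline.
    TODO = "todo"
    DRAFT = "draft"
    DELEGATED = "delegated"
    WATCH = "watch"


@dataclass
class BacklogItem:
    item_id: str
    kind: Kind
    summary: str
    # Assumption: phrases that count as downstream evidence the item is done.
    resolve_when: list[str] = field(default_factory=list)
    resolved: bool = False
    evidence: str | None = None


def reconcile(backlog: list[BacklogItem], new_inputs: list[str]) -> list[BacklogItem]:
    """Mark open items resolved when a new input contains one of their evidence phrases."""
    newly_resolved = []
    for item in backlog:
        if item.resolved:
            continue
        for text in new_inputs:
            lowered = text.lower()
            if any(phrase.lower() in lowered for phrase in item.resolve_when):
                item.resolved = True
                item.evidence = text  # keep the evidence so a human can audit the call
                newly_resolved.append(item)
                break
    return newly_resolved


backlog = [
    BacklogItem("b1", Kind.DELEGATED, "Pricing deck from Sam", ["pricing deck attached"]),
    BacklogItem("b2", Kind.TODO, "Reply to vendor", ["vendor reply sent"]),
]
done = reconcile(backlog, ["FYI: pricing deck attached, see v3"])
print([item.item_id for item in done])
```

Storing the matched input alongside the item keeps the human in the loop: an auto-resolved item can be reopened if the evidence turns out to be wrong.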

What works (the honest version)

Where it breaks (the honest version)

"That last point - single-user shape - is the unsolved problem. And it tells you where this is going."

Act 3: 12 Months Out (4-5 min)

Key message: The patterns are clear. The infrastructure isn't there yet. Here's what closes the gap.

Prediction 1: Memory becomes the moat, not the model

Proved this twice - once solo (250+ sessions), once at enterprise scale. Every product org will have access to the same foundation models. The differentiation is what your agent knows about this user, this workflow, this org.

CPO implication: Your agent strategy should start with "what's the memory architecture?" not "which model?"

Prediction 2: The Manager OS is the first real agent product category

Not "chat with your data." Not "autocomplete in your IDE." A system that does daily triage, preps you for meetings, drafts in your voice, tracks follow-through, and maintains a running model of your people and priorities. I built it by hand. It works. Someone will productize it.

CPO implication: If you're building agent products, look at what managers actually do all day. That's the TAM, not developer productivity.

Prediction 3: The ceiling is judgment, not automation

5-10x on inputs and drafts. 2-3x on actual leverage. Those are the honest numbers. Products that respect this boundary will win. Products that pretend the AI replaces judgment will fail.

CPO implication: Design for the 2-3x, not the 5-10x. Build your products around the handoff, not the automation.

Prediction 4: Agent-native PMs will create an uncomfortable performance gap

The PMs who build agent systems around themselves will operate at a visibly different level. This isn't a tool adoption curve. It's a skill gap. The difference between a PM who uses agents and a PM who builds agent systems is the difference between someone who uses Excel and someone who builds models.

CPO implication: You need to decide if this is something you encourage, require, or let happen organically. In 12 months, you'll be able to tell which PMs built agent systems and which didn't - just from the quality of their work.


Close (1 min)

The shape, one more time: capture -> automate -> leverage -> share.

Every CPO in this room can start week 1 tomorrow. The question isn't whether agents work in product leadership - they do. The question is whether you're willing to invest the trust-building to get to the point where it's load-bearing.

The hard part isn't the AI. It's designing the system around the AI - the memory, the contracts, the feedback loops, the human-agent boundaries. Building great agent systems is product work. It's PM work. The people best equipped to lead this transition are already in your org.


Anticipated Questions

"What about hallucination / trust?"

The handoff pattern handles this. AI never sends - it drafts. Investigation before draft means replies are grounded in real thread history. Trust is a design problem, not a model problem.
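
One way to picture the handoff pattern in code is below. It is a sketch under assumptions, not the actual system: `Draft`, `draft_reply`, `send`, and `HumanApprovalRequired` are hypothetical names, and `compose` is a stand-in for the model call. The point is structural: the agent side can only produce drafts, and drafting refuses to run without thread history to ground it.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Draft:
    thread_id: str
    body: str
    sources: tuple[str, ...]  # the thread messages the draft was grounded in


class HumanApprovalRequired(Exception):
    """Raised when something tries to send without a named human approver."""


def draft_reply(
    thread_id: str,
    thread_history: list[str],
    compose: Callable[[list[str]], str],
) -> Draft:
    # Investigation before draft: no history means no draft.
    if not thread_history:
        raise ValueError("no thread history; refusing to draft an ungrounded reply")
    return Draft(thread_id, compose(thread_history), tuple(thread_history))


def send(draft: Draft, approved_by: str | None) -> str:
    # The human owns the send button: an approver must be named explicitly.
    if not approved_by:
        raise HumanApprovalRequired(f"draft for {draft.thread_id} needs a human approver")
    return f"sent to thread {draft.thread_id} (approved by {approved_by})"


draft = draft_reply(
    "t1",
    ["Can we move the review to Friday?"],
    compose=lambda history: "Friday works.",
)
print(send(draft, approved_by="tenni"))
```

Because `Draft` carries its sources, a reviewer can check what the reply was based on before approving it.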

"Does it scale beyond one person?"

Not yet - that's the honest answer. The patterns are transferable but the infrastructure isn't forkable. That's prediction #2 - someone will productize this.

"What model do you use?"

Claude (Opus), but that's the least interesting part. The value is in the memory architecture, the interaction contracts, and the workflow design.

"How do you handle sensitive data?"

Privacy as structure: two-repo split, PII guards on every commit, session memory rotation. Make leaks structurally impossible rather than relying on vigilance.
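
A commit-time PII guard could look something like the sketch below. The patterns and the names `PII_PATTERNS`, `find_pii`, and `guard_commit` are assumptions made for illustration; the talk does not specify what the real guard checks, and a handful of regexes will miss plenty of sensitive data.

```python
import re

# Assumed patterns for illustration only; a real guard would need a broader set.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_pii(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_label) for every line that matches a pattern."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PII_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, label))
    return findings


def guard_commit(staged_text: str) -> int:
    """Return 0 if the staged text looks clean, 1 if the commit should be blocked."""
    return 1 if find_pii(staged_text) else 0


staged = "Notes from 1:1\nReach Sam at sam@example.com\n"
print(find_pii(staged), guard_commit(staged))
```

Wired into a git pre-commit hook, a non-zero return would stop the commit, so the check runs every time rather than depending on someone remembering to look.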

"What should I try first?"

Daily intelligence triage. One sweep of everything that came in overnight. Lowest risk, highest immediate value, builds the trust that unlocks everything else.