Work That Survives the Night: Beads for Multi-Session Tasks

Compacting conversation. Your agent just forgot some of what you discussed. Not everything — there’s a summary somewhere — but enough that you’re now re-explaining context. You could dig up that summary, but nobody does. Too much friction. (Context exhaustion isn’t just a productivity problem — it’s also a security vulnerability.)

Plan mode helps: the plan sits in a file, you can reread it. But plans assume linear execution. Step 3 reveals something unexpected. Steps 4-10 need replanning. While you’re mid-execution. Context full of step 3’s details. Getting messier.

There’s another problem. While working on feature X, you notice a potential bug. Or a refactor opportunity. You can’t stop — you’ll lose the thread. You can’t ignore it — you’ll forget. Mental notes are worth exactly what you paid for them.

Plan mode gives you a document. Beads gives you a graph of work with dependencies that the agent understands.

Steve Yegge built it after iterating through several approaches to agent memory. Instead of one big plan, you get atomic tasks with explicit dependencies. The whole tree persists in your repository, survives session boundaries, syncs with git.

Tasks small enough to survive context loss. Blocking relationships that prevent premature work. That’s the model.

The Beads Mental Model

Installation is two steps.

First, install the bd CLI (one time, system-wide):

curl -fsSL https://raw.githubusercontent.com/steveyegge/beads/main/scripts/install.sh | bash

Then add the plugin to Claude Code (see the plugins post for context):

/plugin marketplace add https://github.com/steveyegge/beads
/plugin install beads@beads-marketplace

The plugin wraps the CLI with slash commands and hooks — but the CLI does the actual work.

The workflow loop:

bd ready                              # What can I work on?
bd show <id>                          # Review the task
bd update <id> --status=in_progress   # Claim it
# ... work ...
bd close <id>                         # Done
bd sync                               # Persist task state to git

Figure: The beads workflow. Ready, TDD loop with discoveries filed as new tasks, quality gates, then close and sync.

What makes this different from a todo list: dependencies are first-class. When you mark a task as blocking another, bd ready won’t show the blocked task. An agent that pulls its work from bd ready never picks it up until the prerequisites are done.
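To make the "ready" idea concrete, here is a minimal Python sketch of the rule as described above: a task is ready when it is open and every task blocking it is closed. This is a toy model, not how bd is implemented; the `Task` class, its fields, and the status strings are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Task:
    """Toy stand-in for a beads issue; not the real bd data model."""
    id: str
    status: str = "open"  # assumed values: "open", "in_progress", "closed"
    blocked_by: set[str] = field(default_factory=set)


def ready(tasks: dict[str, Task]) -> list[str]:
    """Return ids of open tasks whose blockers are all closed."""
    return sorted(
        t.id
        for t in tasks.values()
        if t.status == "open"
        and all(tasks[b].status == "closed" for b in t.blocked_by)
    )


# Mirrors the article's example: tests depend on the implementation.
tasks = {
    "feature": Task("feature"),
    "tests": Task("tests", blocked_by={"feature"}),
}
print(ready(tasks))  # ['feature']

tasks["feature"].status = "closed"
print(ready(tasks))  # ['tests']
```

The blocked task exists the whole time. It just stays out of the ready list until its blocker closes.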

A real project with 24 open issues:

bd stats

Summary:
  Total Issues:           270
  Open:                   24
  Blocked:                23
  Ready to Work:          1

23 out of 24 open issues are blocked. Only one is ready.

Dependencies in Practice

A multi-iteration feature. Each iteration depends on the previous:

bd graph --compact

LAYER 0 (ready)
└── ○ [EPIC] Iteration 5: Template Creation

LAYER 1
├── ○ Define input models
└── ○ [EPIC] Iteration 6: Job Submission

LAYER 2
├── ○ Implement validation endpoint
└── ○ [EPIC] Iteration 7: Job Cancellation

LAYER 3
└── ○ [EPIC] Iteration 8: Polish + E2E Tests

Work organized in execution layers. Layer 0 is ready now. Layers 1-3 are blocked — the agent sees them, knows they exist, but can’t claim them until their dependencies close.

Figure: Dependency layers. Layer 0 is ready now; Layers 1-3 are blocked until their dependencies resolve.
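The layering can be sketched in a few lines of Python. This is an illustration of the idea, not bd's actual algorithm: repeatedly peel off the tasks that have no remaining blockers. The short ids are made up, and the edge from "input-models" to "iter5" is an assumption, since the graph output does not show which task blocks it.

```python
from __future__ import annotations


def layers(blocked_by: dict[str, set[str]]) -> list[list[str]]:
    """Group task ids into execution layers.

    Layer 0 has no blockers; each later layer is blocked only by tasks in
    earlier layers. Every blocker must also appear as a key.
    Raises ValueError if the dependencies contain a cycle.
    """
    remaining = {task: set(deps) for task, deps in blocked_by.items()}
    result: list[list[str]] = []
    while remaining:
        current = sorted(t for t, deps in remaining.items() if not deps)
        if not current:
            raise ValueError("dependency cycle")
        result.append(current)
        for t in current:
            del remaining[t]
        for deps in remaining.values():
            deps.difference_update(current)
    return result


graph = {
    "iter5": set(),
    "input-models": {"iter5"},  # assumed edge, not shown in the article
    "iter6": {"iter5"},
    "validation-endpoint": {"input-models", "iter5"},
    "iter7": {"iter6"},
    "iter8": {"iter7"},
}
for depth, ids in enumerate(layers(graph)):
    print(f"LAYER {depth}: {ids}")
# LAYER 0: ['iter5']
# LAYER 1: ['input-models', 'iter6']
# LAYER 2: ['iter7', 'validation-endpoint']
# LAYER 3: ['iter8']
```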

Creating dependencies:

bd dep add <blocked-task> <blocking-task>
# Example: tests depend on implementation
bd dep add beads-tests beads-feature

Checking what’s stuck:

bd blocked

🚫 Blocked issues (23):
[P1] Implement validation endpoint
  Blocked by: [Define input models, Template Creation epic]

[P1] Job Submission epic
  Blocked by: [Template Creation epic]

bd dep cycles catches circular dependencies before they cause problems.
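For intuition about what a cycle check involves, here is a small depth-first search in Python. It is a generic sketch and says nothing about how bd dep cycles works internally; the function name and the dict-of-sets input are assumptions for illustration.

```python
from __future__ import annotations


def find_cycle(blocked_by: dict[str, set[str]]) -> list[str] | None:
    """Return one dependency cycle as a list of ids, or None if there is none."""
    UNSEEN, ACTIVE, DONE = 0, 1, 2
    state = {task: UNSEEN for task in blocked_by}
    path: list[str] = []

    def visit(task: str) -> list[str] | None:
        state[task] = ACTIVE
        path.append(task)
        for dep in sorted(blocked_by.get(task, ())):
            if state.get(dep, UNSEEN) == ACTIVE:
                # dep is already on the current path: we looped back to it.
                return path[path.index(dep):] + [dep]
            if state.get(dep, UNSEEN) == UNSEEN:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[task] = DONE
        return None

    for task in sorted(blocked_by):
        if state[task] == UNSEEN:
            found = visit(task)
            if found:
                return found
    return None


print(find_cycle({"tests": {"feature"}, "feature": set()}))  # None
print(find_cycle({"a": {"b"}, "b": {"c"}, "c": {"a"}}))      # ['a', 'b', 'c', 'a']
```

A cycle means every task in it waits on another task in it, so none of them would ever show up as ready.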

Handling Discoveries

You’re implementing feature X. You notice that function Y has a bug. Old you: interrupt and lose context, or carry on and forget.

New workflow:

bd create --title="Bug: function Y returns wrong value on empty input" --type=bug

Filed. Tracked. You continue working on X. The bug won’t be forgotten, won’t pollute your current context.

Session Boundaries

Before ending your session:

bd sync    # Persist task state to git (not your code — just beads metadata)

Tomorrow:

bd ready   # What's next?

The agent picks up where it left off. No re-explaining context. No hunting for “where was I?”

Close reasons preserve rationale:

bd show completed-task

Close reason: Implemented with retry logic. Added regression test.
Commit: a1b2c3d

The close reason links the decision to the commit. Months later, that context is still there.

TodoWrite vs Beads

Claude has built-in task tracking (TodoWrite). When do you use what?

TodoWrite is ephemeral — lost on compaction. Good for single-session checklists. “Do these 5 things right now.”

Beads persists. Survives restarts, syncs to git. Dependencies enforce execution order. “Track this work across days.”

The antipattern: Heavy TodoWrite use within a beads task usually means the task wasn’t decomposed enough. If you’re adding 15 TodoWrite items to track sub-steps, split the beads task instead.

Rule of thumb: If you might end the session before finishing, use beads. If it’s a quick checklist within a claimed task, TodoWrite is fine.

Beads and Documentation

Requirements live in docs. Tasks reference them — don’t duplicate.

Documentation (PRODUCT_SPEC.md, README, design docs) = source of truth for requirements. What should the system do? What are the acceptance criteria?

Beads tasks = implementation work. The task description points to the relevant doc section. Requirements drift when duplicated; references don’t.

Example flow:

1. Requirements in PRODUCT_SPEC.md define the feature
2. Epic in beads references that section
3. Tasks implement specific acceptance criteria
4. Close reasons document what was done and why

Wrapping Beads in a Workflow

Beads on its own is a task tracker. The real value comes from wrapping it in a structured workflow.

Here’s what I use:

Before claiming a task — validate requirements. Does the task description tell me exactly what “done” looks like? Can I write test assertions from it? If not, I update the task or ask for clarification. Don’t start coding against fuzzy requirements.

While working — TDD loop. Write failing test, implement, pass, refactor. Discoveries become new beads tasks, not scope creep on the current one.

Before closing — verify. Did I actually meet the acceptance criteria? Do tests pass? Is the code committed? Only then close.

This turns beads from a task tracker into a development discipline that prevents the common failure modes: starting work you don’t understand, scope creep mid-task, closing without verification.

The workflow lives in your .claude/rules/ directory. Example rule file:

# .claude/rules/workflows/beads-workflow.md

## Before Claiming a Task
1. Run `bd show <id>` — read the full description
2. Check: Can I write test assertions from this?
3. If unclear: update task notes or ask for clarification
4. Only then: `bd update <id> --status=in_progress`

## Before Closing
1. Quality gates pass? `npm run quality`
2. Code committed? `git status`
3. Acceptance criteria met?
4. Only then: `bd close <id>`

Quality gates were covered in the previous post — the agent runs lint, typecheck, and tests before you can close.

The agent reads these rule files in every session. You encode your standards once, and the agent follows them every time.
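If you would rather enforce the "Before Closing" checks in a script than trust a checklist, a wrapper along these lines is one option. It is a sketch under assumptions: `close_if_clean` and `run` are hypothetical helpers, `npm run quality` is the project-specific gate from the rule file above, and a clean `git status --porcelain` is used as a stand-in for "code committed". The acceptance-criteria check stays a judgement call and is not automated here.

```python
import subprocess
from typing import Callable, Sequence


def run(cmd: Sequence[str]) -> subprocess.CompletedProcess:
    """Run a command and capture its output as text."""
    return subprocess.run(list(cmd), capture_output=True, text=True)


def close_if_clean(task_id: str, runner: Callable = run) -> bool:
    """Close a task only if quality gates pass and nothing is uncommitted.

    Returns True when `bd close` ran and succeeded, False otherwise.
    """
    # 1. Quality gates (project-specific command, taken from the rule file).
    if runner(["npm", "run", "quality"]).returncode != 0:
        return False
    # 2. Any output from --porcelain means uncommitted or untracked changes.
    if runner(["git", "status", "--porcelain"]).stdout.strip():
        return False
    # 3. Only then close the task.
    return runner(["bd", "close", task_id]).returncode == 0
```

The `runner` parameter exists so the gate logic can be exercised with a fake command runner, without npm, git, or bd installed.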

Advanced Features

Beads has capabilities I haven’t fully explored yet — useful when you have multiple agents or want automated workflows:

Swarms/Molecules: Coordinating parallel agents working on epic children (see Part 6 for sub-agent patterns)
Relationship types: Beyond blocks, there’s relates_to, tracks, and others
Labels and states: Workflow automation based on task state changes
Merge slots: Serializing conflict resolution when multiple agents work in parallel

I’ll cover these when I’ve used them more. For now, master the core workflow — it’s already a significant upgrade from plan mode.

The Shift

Without beads: “Where was I? What was I doing? Let me re-read the plan… wait, it’s outdated now.”

With beads: bd ready → here’s what’s next, with full context, dependencies already resolved.

The loop (ready → claim → work → close → sync) becomes muscle memory after a day or two. The graph of work survives the night.

Fifth in a series on agentic development. Previously: Stop Using Claude Code Like a Copilot, MCP Setup, Plugins, and Verification Patterns. Next: sub-agents for parallel exploration without context pollution.