How It Works

BookForge uses a 5-stage pipeline to convert non-fiction books into structured, verified agent skills. Each stage has a specific job, and the output of one feeds directly into the next.

Pipeline Overview

┌──────────┐    ┌─────────────┐    ┌────────────┐    ┌──────────┐    ┌──────────┐
│ Extract  │───→│ Decompose   │───→│ Synthesize │───→│ Verify   │───→│ Optimize │
│          │    │             │    │            │    │          │    │          │
│ Book →   │    │ Score →     │    │ Generate   │    │ Test     │    │ Tune     │
│ chapters │    │ skill units │    │ SKILL.md   │    │ output   │    │ triggers │
└──────────┘    └─────────────┘    └────────────┘    └──────────┘    └──────────┘

Stage 1: Extract

Parse the book into structured chapters and sections. The extractor handles PDF, EPUB, and other common formats, producing a normalized representation of the book's content hierarchy.

Input: A book file (PDF, EPUB)
Output: Structured chapter/section tree with text content
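As a rough illustration of what a "structured chapter/section tree" could look like, here is a minimal sketch. The `Section` class, its fields, and the sample book are hypothetical and chosen for illustration; they are not BookForge's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """One node in the content hierarchy (hypothetical shape, not BookForge's own)."""
    title: str
    text: str = ""
    children: list["Section"] = field(default_factory=list)

    def walk(self, depth: int = 0):
        """Yield (depth, section) pairs in reading order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# A made-up book used only to exercise the structure.
book = Section("Example Book", children=[
    Section("Chapter 1", children=[
        Section("1.1 Preparation", text="Before the meeting, list your goals."),
        Section("1.2 Opening", text="State the agenda first."),
    ]),
    Section("Chapter 2", text="Closing remarks."),
])

outline = [(depth, s.title) for depth, s in book.walk()]
```

Keeping the text on the nodes themselves means a later stage can score any subtree without going back to the source file.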

Stage 2: Decompose

Not every section of a book makes a good skill. The decomposer rates each topic's overall "skill density" by scoring it from 1 to 5 on each of six dimensions:

Skill Density — how much actionable procedure does this section contain?
Digital Actionability — can an AI agent actually do this, or is it purely physical/social?
Output Verifiability — can you check whether the skill was applied correctly?
Trigger Clarity — is it obvious when this skill should activate?
Reuse Frequency — will this come up often enough to justify a skill?
Composability — does this work well alongside other skills?

Topics scoring below 3 on any dimension are filtered out, so every surviving topic totals at least 18 of a possible 30. The remaining topics are grouped into skill units: coherent bundles of knowledge that belong together.

Input: Structured chapter/section tree
Output: Scored and grouped skill units (threshold: total score 18+)
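The filtering rule can be sketched in a few lines. The dimension names and the two thresholds come from the description above; the function name, the dictionary layout, and the sample scores are assumptions made for illustration.

```python
DIMENSIONS = (
    "skill_density",
    "digital_actionability",
    "output_verifiability",
    "trigger_clarity",
    "reuse_frequency",
    "composability",
)
MIN_PER_DIMENSION = 3  # from the text: below 3 on any dimension is filtered out
MIN_TOTAL = 18         # from the text: total score 18+


def passes(scores: dict[str, int]) -> bool:
    """Return True if a topic clears both thresholds (hypothetical helper)."""
    values = [scores[d] for d in DIMENSIONS]
    if any(not 1 <= v <= 5 for v in values):
        raise ValueError("scores must be on the 1-5 rubric")
    return min(values) >= MIN_PER_DIMENSION and sum(values) >= MIN_TOTAL


# Made-up scores: one balanced topic, one with a single weak dimension.
strong = dict.fromkeys(DIMENSIONS, 4)
lopsided = {**dict.fromkeys(DIMENSIONS, 5), "digital_actionability": 2}
```

Here `strong` passes, while `lopsided` is rejected despite a total of 27, because one dimension falls below 3. The per-dimension floor is the stricter of the two rules.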

Stage 3: Synthesize

Each skill unit becomes a SKILL.md file. The synthesizer generates:

Frontmatter with a description tuned for agent triggering
A structured body with When to Use, Checklist, Process, Key Principles, and Examples
WHY reasoning for every step — not just what to do, but why it matters
Scripts for automatable tasks
Reference files for deep-dive material

The synthesizer generalizes terminology. Author-specific jargon is replaced with domain-standard language so the skill works for anyone, not just readers of that specific book.

Input: Skill units with source content
Output: Complete SKILL.md packages (body, scripts, references)
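To make the shape of the output concrete, the sketch below assembles a SKILL.md from a description and the five body sections named above. It assumes YAML-style frontmatter delimited by `---`, a `name` field, and `##` headings; none of these layout details are confirmed by this page, and `render_skill` is a hypothetical helper, not the synthesizer itself.

```python
import json

SECTIONS = ("When to Use", "Checklist", "Process", "Key Principles", "Examples")


def render_skill(name: str, description: str, body: dict[str, str]) -> str:
    """Assemble frontmatter plus the five named body sections (assumed layout)."""
    missing = [s for s in SECTIONS if s not in body]
    if missing:
        raise ValueError(f"missing sections: {missing}")
    lines = [
        "---",
        f"name: {name}",
        # json.dumps yields a double-quoted string, so colons in the text stay safe.
        f"description: {json.dumps(description)}",
        "---",
        "",
    ]
    for section in SECTIONS:
        lines += [f"## {section}", "", body[section].strip(), ""]
    return "\n".join(lines)


# Placeholder content; a real skill would carry the generated text and WHY reasoning.
skill_md = render_skill(
    "meeting-prep",
    "Use when preparing an agenda for a meeting.",
    {s: f"(placeholder for {s})" for s in SECTIONS},
)
```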

Stage 4: Verify

Every skill goes through two kinds of testing:

Structural checks — does the SKILL.md conform to the spec? Are all required sections present? Is the body under 500 lines? Do script references resolve?
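A structural check of this kind might look like the following sketch. It assumes `---`-delimited frontmatter and `#`-style headings, and it treats the five body sections listed under Stage 3 as the required ones; the actual spec may differ. Resolving script references is left out because it depends on the package layout. `structural_problems` is a hypothetical name.

```python
REQUIRED_SECTIONS = ("When to Use", "Checklist", "Process", "Key Principles", "Examples")
MAX_BODY_LINES = 500  # from the text: the body must be under 500 lines


def structural_problems(skill_md: str) -> list[str]:
    """Return human-readable problems; an empty list means the checks pass."""
    parts = skill_md.split("---\n", 2)
    if len(parts) != 3 or parts[0] != "":
        return ["missing frontmatter block"]
    frontmatter, body = parts[1], parts[2]

    problems = []
    if "description:" not in frontmatter:
        problems.append("frontmatter has no description")

    body_lines = body.splitlines()
    if len(body_lines) >= MAX_BODY_LINES:
        problems.append(f"body has {len(body_lines)} lines, limit is under {MAX_BODY_LINES}")

    # Collect heading text regardless of heading level.
    headings = {line.lstrip("# ").strip() for line in body_lines if line.startswith("#")}
    for section in REQUIRED_SECTIONS:
        if section not in headings:
            problems.append(f"missing section: {section}")
    return problems


# Two made-up documents: one complete, one with most sections missing.
good = "---\ndescription: Use when planning a meeting.\n---\n" + "".join(
    f"## {s}\n\ntext\n\n" for s in REQUIRED_SECTIONS
)
bad = "---\ndescription: x\n---\n## Process\n\ntext\n"
```

Returning a list of problems rather than a single boolean lets one run report every failure at once.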

Functional testing — the skill is tested against real tasks using a with-skill vs without-skill baseline comparison. An agent attempts the same task twice: once with the skill installed, once without. The outputs are compared to measure whether the skill actually improves results.
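The page does not say how the two outputs are compared. One simple possibility, shown here purely as an assumption, is to have a grader assign each output a numeric score and then summarize the paired differences. The function name, the 1-10 scale, and the scores below are all invented for illustration.

```python
from statistics import mean


def compare_runs(with_skill: list[int], without_skill: list[int]) -> dict[str, float]:
    """Summarize paired grader scores for the same tasks under both conditions."""
    if len(with_skill) != len(without_skill) or not with_skill:
        raise ValueError("need one score per task in both conditions")
    deltas = [w - b for w, b in zip(with_skill, without_skill)]
    return {
        "mean_delta": mean(deltas),                           # average improvement per task
        "win_rate": sum(d > 0 for d in deltas) / len(deltas),  # share of tasks the skill improved
    }


# Hypothetical scores on a 1-10 scale for four tasks.
summary = compare_runs(with_skill=[9, 7, 8, 6], without_skill=[6, 7, 5, 7])
```

Pairing scores by task keeps an easy task from masking a regression on a hard one.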

Input: Generated SKILL.md packages
Output: Verified skills with test results

Stage 5: Optimize

The final stage tunes two things:

Description optimization — the frontmatter description determines when an agent triggers the skill. BookForge runs 20 evaluation queries against each skill's description to measure trigger accuracy: does the skill activate when it should, and stay quiet when it shouldn't?
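Trigger accuracy over a query set can be computed as in this sketch. It assumes each evaluation query is labeled with whether the skill should fire and records whether it did. The even split between the two kinds of query, and the results themselves, are made up for the example.

```python
def trigger_accuracy(results: list[tuple[bool, bool]]) -> dict[str, float]:
    """results holds (should_trigger, did_trigger) pairs, one per evaluation query."""
    if not results:
        raise ValueError("need at least one evaluation query")
    correct = sum(should == did for should, did in results)
    missed = sum(should and not did for should, did in results)      # stayed quiet when needed
    spurious = sum(did and not should for should, did in results)    # fired when not needed
    return {
        "accuracy": correct / len(results),
        "missed_triggers": missed,
        "false_triggers": spurious,
    }


# 20 hypothetical queries: 10 that should activate the skill, 10 that should not.
results = (
    [(True, True)] * 9 + [(True, False)]
    + [(False, False)] * 9 + [(False, True)]
)
report = trigger_accuracy(results)
```

Reporting missed and false triggers separately matters because they call for opposite fixes: a missed trigger suggests the description is too narrow, a false trigger that it is too broad.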

Frontmatter tuning — model recommendations, context window requirements, and allowed-tools lists are calibrated based on the skill's complexity and tool needs.

Input: Verified skills
Output: Production-ready skills with optimized triggering

Why This Matters

The bottleneck in the agent skills ecosystem is authorship. Today, only developers who deeply understand both a domain and agent tooling can write effective skills. That limits the entire ecosystem to what a small group of people find time to build.

BookForge removes that bottleneck. The world's non-fiction knowledge — negotiation tactics, architecture patterns, management frameworks, scientific methods — is already written down in books. BookForge distills it into a format agents can use directly.

The result: agent capabilities grow at the speed of book processing, not at the speed of manual skill authoring.