Chapter 13: LLM Agents & MCP Integration

Duration: 15 minutes
Prerequisites: Chapter 0 (Setup) completed

Goals & Purpose

Faultbox is designed for two types of users: human engineers and LLM agents. Both write specs, run tests, and fix code — but LLM agents need structured output and tool integration instead of human-readable text.

This chapter teaches you to:

  • Set up Claude Code integration in one command
  • Use custom slash commands for fault injection workflows
  • Connect via MCP for native tool-use in any LLM agent
  • Parse structured JSON output for automated code-test-fix loops

After this chapter, your LLM agent can autonomously generate specs from docker-compose, run fault tests, analyze failures, and suggest fixes.

Quick setup

One command sets up everything:

faultbox init --claude

This creates:

.claude/commands/
├── fault-test.md         # /fault-test slash command
├── fault-generate.md     # /fault-generate slash command
└── fault-diagnose.md     # /fault-diagnose slash command
.mcp.json                 # MCP server auto-config

Open Claude Code in your project. The slash commands and MCP tools are available immediately.

Custom slash commands

/fault-test — Run tests

Type /fault-test in Claude Code. It finds your .star spec, runs all tests with --format json, and reports:

  • Pass/fail count
  • Failure reasons with replay commands
  • Diagnostics with fix suggestions

/fault-generate — Generate specs

Type /fault-generate. It detects your project setup:

  • Has docker-compose.yml? Generates spec from compose
  • Has a Go/Node/Python service? Generates a starter spec
  • Has an existing spec? Runs faultbox generate for failure scenarios

/fault-diagnose — Analyze failures

Type /fault-diagnose after a test failure. It reads the JSON output, finds the relevant source code, and suggests specific fixes based on the diagnostic codes:

| Diagnostic | What it means |
| --- | --- |
| FAULT_FIRED_BUT_SUCCESS | Fault hit but service didn’t return error: missing error handling |
| FAULT_NOT_FIRED | Wrong syscall variant or path filter |
| SERVICE_CRASHED | Unhandled error caused panic |
| TIMEOUT_DURING_FAULT | Possible infinite retry loop |
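
An agent or script that consumes these codes can treat the table above as a lookup. The following is a minimal sketch, not part of Faultbox itself: the names `DIAGNOSTIC_HINTS` and `triage` are hypothetical, and it assumes each diagnostic is a dict with a `code` key and an optional `suggestion` key, as in the JSON output shown later in this chapter.

```python
# Hints keyed by diagnostic code, transcribed from the table above.
DIAGNOSTIC_HINTS = {
    "FAULT_FIRED_BUT_SUCCESS": "Fault hit but service didn't return error: missing error handling",
    "FAULT_NOT_FIRED": "Wrong syscall variant or path filter",
    "SERVICE_CRASHED": "Unhandled error caused panic",
    "TIMEOUT_DURING_FAULT": "Possible infinite retry loop",
}


def triage(diagnostics):
    """Return (code, hint) pairs for a list of diagnostic dicts.

    Codes that are not in the table fall back to the diagnostic's own
    "suggestion" field, and then to a generic placeholder.
    """
    results = []
    for diag in diagnostics:
        code = diag.get("code", "UNKNOWN")
        hint = DIAGNOSTIC_HINTS.get(code) or diag.get("suggestion", "No hint available.")
        results.append((code, hint))
    return results
```

Keeping the fallback to `suggestion` means codes outside this table, such as the `ASSERTION_MISMATCH` entry in the sample output, still produce something actionable.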

MCP server

The MCP (Model Context Protocol) server exposes Faultbox as native tools for any compatible LLM agent.

How it works

LLM Agent ──(JSON-RPC)──→ faultbox mcp ──→ run tests, parse results
                                         ──→ generate specs
                                         ──→ analyze topology

The .mcp.json file (created by faultbox init --claude) auto-configures the connection:

{
  "mcpServers": {
    "faultbox": {
      "command": "faultbox",
      "args": ["mcp"]
    }
  }
}

Available tools

| Tool | Description |
| --- | --- |
| run_test | Run all tests, return structured JSON results |
| run_single_test | Run a specific test by name |
| list_tests | Discover test functions in a .star file |
| generate_faults | Run failure scenario generator |
| init_from_compose | Generate spec from docker-compose.yml |
| init_spec | Generate starter spec for a binary |
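
To make the JSON-RPC arrow in the diagram concrete, here is a rough sketch of the messages an MCP client would send to discover and invoke these tools. The `tools/list` and `tools/call` methods come from the MCP specification. The helper names are hypothetical, and so is the `name` argument passed to `run_single_test`, since the tools' input schemas are not documented here. A real session also begins with an `initialize` handshake, which this sketch omits.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids should be unique within a session


def rpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request object."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def list_tools():
    """MCP request asking the server which tools it offers."""
    return rpc_request("tools/list")


def call_tool(name, arguments=None):
    """MCP request invoking one tool by name."""
    return rpc_request("tools/call", {"name": name, "arguments": arguments or {}})


if __name__ == "__main__":
    print(json.dumps(list_tools()))
    # "name" is a hypothetical argument key; read the real input schema
    # from the tools/list response before relying on it.
    print(json.dumps(call_tool("run_single_test", {"name": "test_write_failure"})))
```

In practice an MCP-capable agent builds these messages for you; the sketch only shows what travels over the connection.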

Example: agent workflow

An LLM agent building a microservice uses Faultbox like this:

1. Agent writes service code (Go, Rust, Node, etc.)
2. Agent writes docker-compose.yml
3. Agent calls init_from_compose → gets faultbox.star
4. Agent calls run_test → gets structured results
5. Test fails: FAULT_FIRED_BUT_SUCCESS on write path
6. Agent reads diagnostic → missing error handling in /data endpoint
7. Agent fixes the code → adds proper error return
8. Agent calls run_test again → all pass
9. Agent commits the fix with confidence

No human intervention. The structured JSON output and diagnostics give the agent enough context to fix issues autonomously.
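
The loop in steps 4 to 8 can be sketched as a small driver. This is an illustration under assumptions, not Faultbox code: `fix_loop`, `run_tests`, and `apply_fix` are hypothetical names, with `run_tests` standing in for a call to `run_test` and `apply_fix` standing in for the agent's code-editing step. It relies only on the `fail`, `tests`, and `result` fields shown in the JSON structure later in this chapter.

```python
def fix_loop(run_tests, apply_fix, max_rounds=5):
    """Run tests, hand failures to a fixer, repeat until green or out of rounds.

    run_tests: callable returning the parsed JSON report (a dict).
    apply_fix: callable taking the list of failed test dicts.
    Returns (passed, rounds_used), where a round is one test run.
    """
    for round_no in range(1, max_rounds + 1):
        report = run_tests()
        failed = [t for t in report.get("tests", []) if t.get("result") == "fail"]
        if not failed and report.get("fail", 0) == 0:
            return True, round_no
        # Skip the fix on the last round: it could not be verified anyway.
        if round_no < max_rounds:
            apply_fix(failed)
    return False, max_rounds
```

Bounding the number of rounds keeps an agent from looping forever on a failure it cannot fix, at which point a human can take over.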

Structured JSON output

For CI pipelines and programmatic consumption:

faultbox test faultbox.star --format json

JSON goes to stdout; human-readable output goes to stderr. Parse with jq:

# Check if all tests passed
faultbox test spec.star --format json | jq '.fail == 0'

# Get failed test names
faultbox test spec.star --format json | jq '.tests[] | select(.result=="fail") | .name'

# Get diagnostics
faultbox test spec.star --format json | jq '.tests[].diagnostics[]'
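
The same three checks can be done from a script instead of jq. The sketch below assumes `faultbox` is on the PATH and that stdout carries only the JSON document, as described above. The function names are hypothetical, and the exit code is deliberately ignored because its meaning is not covered in this chapter.

```python
import json
import subprocess


def run_spec(spec="faultbox.star"):
    """Run the spec and return the parsed JSON report.

    stdout carries the JSON and stderr the human-readable output,
    so the two streams are captured separately.
    """
    proc = subprocess.run(
        ["faultbox", "test", spec, "--format", "json"],
        capture_output=True,
        text=True,
    )
    return json.loads(proc.stdout)


def all_passed(report):
    """Counterpart of jq '.fail == 0'."""
    return report["fail"] == 0


def failed_names(report):
    """Counterpart of jq '.tests[] | select(.result=="fail") | .name'."""
    return [t["name"] for t in report["tests"] if t["result"] == "fail"]


def all_diagnostics(report):
    """Counterpart of jq '.tests[].diagnostics[]', tolerating a missing list."""
    return [d for t in report["tests"] for d in (t.get("diagnostics") or [])]
```

A typical use is `report = run_spec("spec.star")` followed by `failed_names(report)`.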

JSON structure

{
  "version": 2,
  "pass": 3,
  "fail": 1,
  "tests": [
    {
      "name": "test_write_failure",
      "result": "fail",
      "reason": "assert_eq failed: 200 != 503",
      "failure_type": "assertion",
      "seed": 42,
      "replay_command": "faultbox test spec.star --test write_failure --seed 42",
      "faults": [
        {
          "service": "db",
          "syscall": "write",
          "action": "deny",
          "errno": "EIO",
          "hits": 3,
          "label": "disk failure"
        }
      ],
      "syscall_summary": {
        "db": {"total": 45, "faulted": 3, "breakdown": {"write": 20, "read": 15}},
        "api": {"total": 30, "faulted": 0, "breakdown": {"write": 10, "connect": 5}}
      },
      "diagnostics": [
        {
          "level": "error",
          "code": "ASSERTION_MISMATCH",
          "message": "assert_eq failed: 200 != 503",
          "suggestion": "Check the service's error handling logic."
        }
      ]
    }
  ]
}
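
As an illustration of how these fields fit together, the following sketch turns one entry of `tests` into a short plain-text summary. `describe_failure` is a hypothetical helper, and it assumes the field names shown in the sample above; optional fields are read defensively because the sample does not say which ones are always present.

```python
def describe_failure(test):
    """Return plain-text lines summarising one failed test entry."""
    lines = [f"{test['name']}: {test.get('reason', 'no reason given')}"]

    # The replay command reruns the test with the same seed.
    if "replay_command" in test:
        lines.append(f"  replay: {test['replay_command']}")

    # One line per injected fault: where it was applied and how often it hit.
    for fault in test.get("faults", []):
        parts = [f"{fault['service']}.{fault['syscall']}", fault["action"]]
        if fault.get("errno"):
            parts.append(fault["errno"])
        parts.append(f"hits={fault['hits']}")
        lines.append("  fault: " + " ".join(parts))

    # One line per diagnostic, preferring the suggestion over the raw message.
    for diag in test.get("diagnostics", []):
        text = diag.get("suggestion") or diag.get("message", "")
        lines.append(f"  [{diag['level']}] {diag['code']}: {text}")

    return lines
```

Applied to the sample entry, this yields the failure reason, the replay command, a `db.write deny EIO hits=3` fault line, and the `ASSERTION_MISMATCH` suggestion.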

Docker (for CI agents)

LLM agents in CI environments (GitHub Actions, GitLab CI) can use the Docker image:

docker run --privileged -v "$(pwd)":/workspace -w /workspace \
  ghcr.io/faultbox/faultbox test faultbox.star --format json

Or use the GitHub Action:

- uses: faultbox/faultbox/.github/actions/test@main
  with:
    spec: faultbox.star

The action installs Faultbox, runs tests, posts a summary, and uploads JSON results as an artifact.

Manual MCP setup

For clients other than Claude Code, configure the MCP server manually.

Cursor: Add to .cursor/mcp.json:

{
  "mcpServers": {
    "faultbox": {
      "command": "faultbox",
      "args": ["mcp"]
    }
  }
}

Claude Desktop: Add to claude_desktop_config.json:

{
  "mcpServers": {
    "faultbox": {
      "command": "faultbox",
      "args": ["mcp"]
    }
  }
}
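
If the config file already lists other MCP servers, the faultbox entry has to be merged in rather than pasted over the file. Here is a small sketch of one way to do that; the function names are hypothetical, and the only assumption about the file is the `mcpServers` layout shown above.

```python
import copy
import json
from pathlib import Path

# The server entry from the snippets above.
FAULTBOX_SERVER = {"command": "faultbox", "args": ["mcp"]}


def add_faultbox_server(config):
    """Return a copy of an MCP config dict with the faultbox entry added.

    Other servers and unrelated top-level keys are preserved.
    """
    merged = copy.deepcopy(config)
    merged.setdefault("mcpServers", {})["faultbox"] = copy.deepcopy(FAULTBOX_SERVER)
    return merged


def update_config_file(path):
    """Read a JSON config (or start empty), add faultbox, and write it back."""
    path = Path(path)
    config = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_faultbox_server(config), indent=2) + "\n")
```

For example, `update_config_file(".cursor/mcp.json")` creates the file if it is missing and leaves any existing servers untouched.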

What you learned

  • faultbox init --claude sets up Claude Code integration in one command
  • The /fault-test, /fault-generate, and /fault-diagnose slash commands run tests, generate specs, and analyze failures
  • MCP server exposes 6 tools for native LLM agent integration
  • --format json provides structured output for automated workflows
  • Diagnostics give agents enough context to fix code autonomously
  • Docker image and GitHub Action for CI integration

What’s next

You’ve completed the Faultbox tutorial. You now know how to:

  • Inject syscall-level and protocol-level faults
  • Write temporal assertions on internal behavior
  • Explore concurrent interleavings
  • Monitor invariants across all tests
  • Use containers with real infrastructure
  • Generate failure scenarios automatically
  • Integrate with LLM agents for automated testing

Next steps:

  • Add Faultbox to your CI pipeline
  • Write scenario() functions for your critical paths
  • Run faultbox generate to discover untested failure modes
  • Use /fault-diagnose to fix issues found by the generator