Faultbox — syscall interception and fault injection

Fault injection for distributed systems.
Intercept syscalls and protocol messages to test how your services behave under failure.

faultbox.star
api = service("api", binary="./api", http="localhost:8080")
db  = service("db",  binary="./db",  tcp="localhost:5432")

def test_write_failure(t):
    fault(db, write=deny("EIO"))
    resp = api.http.post("/orders", json={"item": "widget"})
    assert_eq(resp.status, 503, "API should return 503 when DB fails")

$ faultbox test faultbox.star

PASS  test_write_failure  (0.42s)
 fault(db, write=deny("EIO"))
 POST /orders 503
 assert_eq(resp.status, 503)

Install

curl -fsSL https://faultbox.io/install.sh | sh

The script detects your platform, downloads the latest release, and verifies its checksum. You can also build from source.

Why Faultbox

Syscall-level injection

Deny, delay, or hold any syscall via seccomp-notify. No eBPF, no ptrace, no code changes. Faultbox automatically expands syscall families: write covers write, writev, and pwrite64.
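To make family expansion concrete, here is a minimal Python sketch. It is not Faultbox's implementation: the table `SYSCALL_FAMILIES` and the helpers `expand_family` and `expand_rules` are hypothetical names, and only the `write` entry comes from the text above.

```python
# Hypothetical family table; only the "write" entry is taken from the text above.
SYSCALL_FAMILIES = {
    "write": ["write", "writev", "pwrite64"],
}


def expand_family(name):
    """Return every concrete syscall a rule name should cover.

    Names without a known family expand to themselves.
    """
    return list(SYSCALL_FAMILIES.get(name, [name]))


def expand_rules(rules):
    """Expand a {family: action} mapping into a {syscall: action} mapping."""
    expanded = {}
    for family, action in rules.items():
        for syscall in expand_family(family):
            expanded[syscall] = action
    return expanded


if __name__ == "__main__":
    # The action is an opaque label here; the sketch only shows the fan-out.
    print(expand_rules({"write": "deny(EIO)"}))
```

The point of expanding up front is that a single rule still applies when the service switches from one write variant to another.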

Protocol-level injection

Inject faults at HTTP, gRPC, Postgres, MySQL, Redis, Kafka, NATS, MongoDB, AMQP, and Memcached protocol level. Target specific queries, paths, or topics via transparent proxy.

Deterministic exploration

hold() and release() control syscall ordering across services. The --explore mode walks all interleavings automatically, and seed-based replay makes failures reproducible.
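The idea of walking interleavings and replaying one from a seed can be sketched in a few lines of Python. This is an illustration under assumptions, not how --explore is implemented: the function names and the event labels are made up, and the sketch handles only two services.

```python
import random


def interleavings(a, b):
    """Yield every merge of sequences a and b that preserves each one's order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    # Either the next event comes from a ...
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    # ... or it comes from b.
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def pick_schedule(a, b, seed):
    """Pick one interleaving pseudo-randomly; the same seed gives the same pick."""
    schedules = list(interleavings(a, b))
    return random.Random(seed).choice(schedules)


if __name__ == "__main__":
    api_events = ["api:connect", "api:write"]   # hypothetical event labels
    db_events = ["db:accept", "db:fsync"]
    for schedule in interleavings(api_events, db_events):
        print(schedule)
```

The number of schedules grows combinatorially with the number of held syscalls, which is why replaying a single failing schedule from its seed is useful once exploration has found it.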

Starlark specs

Topology, faults, and assertions in one .star file. No YAML. No separate config language. The spec is executable code.

Two modes

Run local binaries with binary= or real infrastructure (Postgres, Redis, Kafka) in Docker containers with image=.

Event log & traces

Every intercepted syscall is recorded with a vector clock. Temporal assertions are available: assert_eventually(), assert_never(), assert_within(). Traces can be visualized in ShiViz.
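For readers new to vector clocks, the following Python sketch shows the standard update rules and the happens-before test they enable. It is a generic illustration, not Faultbox's event-log code; the function names and the service names "api" and "db" are assumptions.

```python
def tick(clock, node):
    """Return a copy of clock with node's counter advanced by one (local event)."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(local, received, node):
    """Receive rule: take the element-wise maximum, then tick the receiver."""
    merged = dict(local)
    for name, count in received.items():
        merged[name] = max(merged.get(name, 0), count)
    return tick(merged, node)


def happened_before(a, b):
    """True if the event stamped a causally precedes the event stamped b."""
    nodes = set(a) | set(b)
    no_greater = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    strictly_less = any(a.get(n, 0) < b.get(n, 0) for n in nodes)
    return no_greater and strictly_less


def concurrent(a, b):
    """True if neither event causally precedes the other."""
    return not happened_before(a, b) and not happened_before(b, a)


if __name__ == "__main__":
    api_send = tick({}, "api")              # api sends a request
    db_recv = merge({}, api_send, "db")     # db receives it
    db_local = tick({}, "db")               # an unrelated event on db
    print(happened_before(api_send, db_recv))
    print(concurrent(api_send, db_local))
```

Ordering events this way, rather than by wall-clock timestamps, is what lets a trace from several processes be checked for causal order.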

How it works

1. Write a spec: define topology, faults, and assertions in a single .star file.
2. Start services: the runtime launches binaries or containers and installs seccomp filters.
3. Intercept syscalls: the kernel pauses processes on target syscalls and asks Faultbox what to do.
4. Inject & assert: deny, delay, or hold syscalls, then verify your service handles it.

Powered by seccomp-notify — no ptrace, no eBPF, no code instrumentation. Faults are injected in the kernel, invisible to the target process.
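The decision step in that flow can be pictured with a small Python simulation. It models only the lookup a supervisor might perform when the kernel reports a paused syscall; it makes no real seccomp calls, and the `Decision` type, the `RULES` table, and both functions are hypothetical.

```python
import errno
from dataclasses import dataclass


@dataclass
class Decision:
    action: str          # "allow", "deny", or "delay"
    error: int = 0       # errno reported to the caller when action == "deny"
    delay_ms: int = 0    # pause length when action == "delay"


# Hypothetical rule table keyed by (service, syscall).
RULES = {
    ("db", "write"): Decision("deny", error=errno.EIO),
    ("db", "fsync"): Decision("delay", delay_ms=500),
}


def decide(service, syscall):
    """Look up what to do with one intercepted syscall; the default is to let it run."""
    return RULES.get((service, syscall), Decision("allow"))


def raw_result(decision):
    """Raw kernel-level return value for a denied syscall, or None if the real call runs.

    A denied syscall returns -errno at the kernel boundary; libc surfaces that
    to the program as -1 with errno set.
    """
    if decision.action == "deny":
        return -decision.error
    return None


if __name__ == "__main__":
    for service, syscall in [("db", "write"), ("db", "fsync"), ("api", "read")]:
        print(service, syscall, decide(service, syscall))
```

From the target's point of view a denied write looks like an ordinary I/O error, which is why no code changes are needed in the service under test.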

Supported protocols

HTTP, gRPC, PostgreSQL, MySQL, Redis, Kafka, NATS, MongoDB, AMQP, Memcached, TCP

Built for LLM agents

LLM agents write code. But who tests what happens when the database crashes, the network drops, or the disk fills up? Faultbox closes the loop.

1. Agent writes code

Your LLM agent builds a microservice. It writes handlers, connects to Postgres, adds Redis caching.

2. Faultbox generates tests

One command from docker-compose. Every dependency gets fault scenarios: disk failures, network drops, slow queries.

faultbox init --from-compose

3. Structured feedback

JSON output with diagnostics: "write fault fired 3 times but service returned 200 — missing error handling in the persist path."

4. Agent fixes the bug

The agent reads the diagnostic, finds the code, adds error handling. Runs tests again. All pass. Commits with confidence.

MCP native: Built-in MCP server with 6 tools. Claude, Cursor, and any MCP client connect directly.
One command setup: faultbox init --claude creates slash commands and MCP config. Zero configuration.
Actionable diagnostics: Not just "test failed" but structured hints that tell the agent exactly what to fix and where.
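As a rough illustration of what an actionable diagnostic involves, the Python sketch below encodes one rule: a fault fired, yet the service still reported success. The function name, its parameters, and the wording of the hint are assumptions; Faultbox's actual JSON schema and diagnostic rules are not shown here.

```python
def diagnose(fault_name, times_fired, response_status):
    """Return a hint when a fault fired but the service still reported success.

    Returns None when the fault never fired or the service surfaced an error.
    """
    if times_fired > 0 and 200 <= response_status < 300:
        return (
            f"{fault_name} fault fired {times_fired} times but service returned "
            f"{response_status}: possible missing error handling"
        )
    return None


if __name__ == "__main__":
    print(diagnose("write", 3, 200))   # hint: the error was swallowed
    print(diagnose("write", 3, 503))   # None: the service reported the failure
```

A hint of this shape gives an agent two things a bare failure does not: which fault was active and what the service did instead of failing.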

Every LLM agent writing microservices needs to answer one question:
"What happens when things break?"

Faultbox is that answer.
