Errno Reference
When injecting faults with deny(), you specify an errno — the error code
the kernel returns to the target process. This reference lists the most
useful errnos for fault injection testing, grouped by failure scenario.
Quick reference
| Errno | Code | Meaning | Common use |
|---|---|---|---|
| EIO | 5 | Input/output error | Disk corruption, hardware failure |
| ENOSPC | 28 | No space left on device | Disk full |
| EROFS | 30 | Read-only file system | Mounted read-only, immutable volume |
| ENOENT | 2 | No such file or directory | Missing file, deleted config |
| EACCES | 13 | Permission denied | Wrong file permissions |
| EPERM | 1 | Operation not permitted | Missing capability, security policy |
| ECONNREFUSED | 111 | Connection refused | Service down, port not listening |
| ECONNRESET | 104 | Connection reset by peer | Remote service crashed mid-request |
| ETIMEDOUT | 110 | Connection timed out | Firewall silently dropping packets, network congestion |
| EHOSTUNREACH | 113 | No route to host | Network partition, host down |
| ENETUNREACH | 101 | Network is unreachable | Interface down, routing failure |
| EAGAIN | 11 | Resource temporarily unavailable | Socket buffer full, non-blocking I/O |
| ENOMEM | 12 | Cannot allocate memory | Memory pressure, OOM conditions |
| EMFILE | 24 | Too many open files | File descriptor exhaustion |
| ENFILE | 23 | Too many open files in system | System-wide fd limit |
| EEXIST | 17 | File exists | Lock file contention, create-exclusive |
| ENOTEMPTY | 39 | Directory not empty | Cleanup failure |
| ENOSYS | 38 | Function not implemented | Missing kernel feature, seccomp block |
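Note: Errno codes shown are for Linux on amd64 and arm64, which share the same numbering. Some other architectures (for example MIPS and SPARC) number several of these differently, notably the network errnos, so refer to errnos by name rather than by number.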
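To confirm the codes in the table on the system you are actually targeting, a short Python check along these lines can help. This is a minimal sketch: `describe_errno` is a hypothetical helper, not part of Faultbox, and it only reports what the local C library knows.

```python
import errno
import os


def describe_errno(name):
    """Return (code, message) for an errno name on this system, or None if unknown."""
    code = getattr(errno, name, None)
    if not isinstance(code, int):
        return None  # not an errno constant on this platform
    return code, os.strerror(code)


if __name__ == "__main__":
    # Print the local numbering for a few names from the table above.
    for name in ("EIO", "ENOSPC", "ECONNREFUSED", "ENOSYS"):
        print(name, describe_errno(name))
```

Run it on the same architecture as the target process, since the numbering is defined per platform.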
Disk & storage failures
EIO — I/O error
fault(db, write=deny("EIO"), run=scenario)
Simulates: Disk corruption, bad sectors, SAN disconnection, NFS timeout. The most generic I/O error — the storage layer failed but doesn’t say why.
What to test:
- Does the service retry or fail fast?
- Is the error surfaced to the caller (not swallowed)?
- Does partial write leave corrupted state?
ENOSPC — No space left on device
fault(db, write=deny("ENOSPC"), run=scenario)
Simulates: Disk full, volume quota exceeded, WAL growth beyond capacity. One of the most common production failures — logs or data fill the disk.
What to test:
- Does the service return a meaningful error (not just “internal error”)?
- Can the service still respond to healthchecks?
- Does it stop accepting writes gracefully?
EROFS — Read-only file system
fault(db, write=deny("EROFS"), run=scenario)
Simulates: Filesystem remounted read-only after corruption detection, immutable container layers, read-only volume mount.
What to test:
- Does the service distinguish “read-only” from “broken”?
- Can it still serve read requests?
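The three storage errnos above call for different reactions, and the checklists are easier to satisfy when the service makes that distinction explicit. The following is a sketch under assumptions: the action names and the `STORAGE_POLICY` table are hypothetical, chosen only to illustrate the idea.

```python
import errno

# Hypothetical policy table: the action a service takes per storage errno.
STORAGE_POLICY = {
    errno.EIO: "fail_fast",         # storage state unknown, do not retry blindly
    errno.ENOSPC: "reject_writes",  # keep serving reads and healthchecks
    errno.EROFS: "read_only_mode",  # storage is intact but not writable
}


def storage_action(exc):
    """Map an OSError from the write path to a named action.

    Errnos outside the policy table are re-raised, so unexpected
    failures are surfaced instead of being swallowed.
    """
    action = STORAGE_POLICY.get(exc.errno)
    if action is None:
        raise exc
    return action
```

A fault injection run can then assert that each injected errno leads to the intended action and not to a generic internal error.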
File access failures
ENOENT — No such file or directory
fault(db, openat=deny("ENOENT"), run=scenario)
Simulates: Missing config file, deleted data directory, unmounted volume, symlink target removed.
What to test:
- Does the service fail with a clear error message naming the missing file?
- Does it retry or fail immediately?
EACCES — Permission denied
fault(db, openat=deny("EACCES"), run=scenario)
Simulates: Wrong file ownership after deployment, restrictive SELinux/AppArmor policy, missing group membership.
What to test:
- Does the error message mention permissions (not just “failed to open”)?
- Can the service recover if permissions are fixed?
EPERM — Operation not permitted
fault(db, openat=deny("EPERM"), run=scenario)
Simulates: Missing Linux capability (e.g., CAP_CHOWN when changing file ownership),
seccomp policy blocking the operation, mandatory access control denial.
EACCES vs EPERM:
EACCES is “you don’t have permission for this specific resource.” EPERM is “you’re not allowed to do this operation at all.” In practice, many programs don’t distinguish them.
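A service that does distinguish these cases can produce the clear error messages the checklists in this section ask for. A minimal sketch, assuming a hypothetical helper `open_failure_message` and message wording chosen for illustration:

```python
import errno


def open_failure_message(exc, path):
    """Build an operator-facing message for a failed open(), keyed on errno."""
    if exc.errno == errno.ENOENT:
        return f"missing file: {path}"
    if exc.errno == errno.EACCES:
        return f"permission denied on {path}: check ownership and mode bits"
    if exc.errno == errno.EPERM:
        return f"operation not permitted on {path}: check capabilities and security policy"
    # Anything else keeps the kernel's own description.
    return f"failed to open {path}: {exc.strerror}"
```

Injecting each of the three errnos on openat and checking the logged message is a quick way to verify that the file name and the cause both reach the operator.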
Network failures
ECONNREFUSED — Connection refused
fault(api, connect=deny("ECONNREFUSED"), run=scenario)
Simulates: Target service not running, port not listening, service crashed during deployment.
What to test:
- Does the caller return 503 (not 500)?
- Does it retry with backoff?
- Does the error message name the target service?
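The first two checks can be sketched together in caller code: retry a refused connection with exponential backoff, then report 503 once the attempts are used up. The function name, the status tuple, and the default values below are assumptions for illustration only.

```python
import time


def call_with_backoff(connect, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call connect(); on ECONNREFUSED retry with exponential backoff.

    Returns (status, result). After the final refusal it reports 503
    rather than letting the error surface as a generic 500.
    """
    for attempt in range(attempts):
        try:
            return 200, connect()
        except ConnectionRefusedError:
            if attempt == attempts - 1:
                return 503, "upstream refused connection"
            # Delay doubles each round: base, 2*base, 4*base, ...
            sleep(base_delay * (2 ** attempt))
```

Passing `sleep` as a parameter keeps the backoff testable without real waiting; a test can record the requested delays instead.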
ECONNRESET — Connection reset by peer
fault(api, read=deny("ECONNRESET"), run=scenario)
Simulates: Remote service crashed mid-response, load balancer killed the connection, TCP RST from firewall.
What to test:
- Does the caller handle partial reads?
- Does it retry the full request (idempotent) or fail?
ETIMEDOUT — Connection timed out
fault(api, connect=deny("ETIMEDOUT"), run=scenario)
Simulates: Firewall silently dropping packets (no RST), network congestion, DNS resolution timeout.
Tip: For testing timeout behavior, delay("5s") is often more realistic than deny("ETIMEDOUT"). A deny returns instantly, whereas a real timeout makes the caller wait.
EHOSTUNREACH — No route to host
fault(api, connect=deny("EHOSTUNREACH"), run=scenario)
Simulates: Network partition, host down, routing table misconfiguration.
ENETUNREACH — Network is unreachable
fault(api, connect=deny("ENETUNREACH"), run=scenario)
Simulates: Interface down, default route missing, VPN disconnected.
Resource exhaustion
EAGAIN — Resource temporarily unavailable
fault(db, write=deny("EAGAIN"), run=scenario)
Simulates: Socket send buffer full, non-blocking I/O would block, file lock temporarily held by another process.
What to test:
- Does the caller retry?
- Is there a retry limit to prevent infinite loops?
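A bounded retry loop for EAGAIN might look like the sketch below. Python raises `BlockingIOError` when a non-blocking write would block; `write_with_retry_limit` and its `wait` hook are hypothetical names used for illustration.

```python
import os


def write_with_retry_limit(fd, data, max_attempts=5, wait=lambda: None):
    """Write to a non-blocking fd, retrying on EAGAIN at most max_attempts times.

    Returns the number of bytes written. Re-raises BlockingIOError once the
    limit is reached, so a permanently full buffer cannot spin forever.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return os.write(fd, data)
        except BlockingIOError:
            if attempt == max_attempts:
                raise
            wait()  # hypothetical hook: sleep, poll(), or yield to an event loop
```

With a 100% EAGAIN fault on write, the expected outcome is that the caller gives up after `max_attempts` tries and reports the failure.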
ENOMEM — Out of memory
fault(db, write=deny("ENOMEM"), run=scenario)
Simulates: Memory pressure, mmap failure, large allocation rejected.
EMFILE — Too many open files
fault(db, openat=deny("EMFILE"), run=scenario)
Simulates: File descriptor exhaustion in the process. Common when connection pools or file handles leak.
What to test:
- Does the service report fd exhaustion clearly?
- Can it still handle healthcheck requests?
ENFILE — Too many open files in system
fault(db, openat=deny("ENFILE"), run=scenario)
Simulates: System-wide fd limit hit. Affects all processes on the host.
Data integrity
fsync failures
fault(db, fsync=deny("EIO"), run=scenario)
Simulates: Postgres fsync failure — data written to page cache but
not persisted to disk. This is how real data loss happens: the write
succeeds but the sync fails, and the application thinks data is durable.
What to test:
- Does the database detect the sync failure?
- Does it refuse to confirm the transaction?
- Does it enter a crash-safe recovery state?
Critical: Postgres historically retried a failed fsync, and the retry could report success even though the data had already been lost.
Current releases panic on fsync failure by default and recover from the WAL instead. This is exactly the kind of bug Faultbox was built to find.
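The same rule applies to any application that claims durability: a failed fsync means the write must not be acknowledged, and retrying the fsync is not a fix. A minimal sketch, assuming a hypothetical `durable_append` helper with an injectable `fsync` so the failure path can be exercised in a unit test:

```python
import os


def durable_append(path, data, fsync=os.fsync):
    """Append data and return True only if fsync succeeded.

    A failed fsync is treated as fatal for this write: the OSError
    propagates and the caller must not acknowledge the data. The fsync
    is deliberately not retried, since a later call may succeed without
    the lost pages ever having reached disk.
    """
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    try:
        os.write(fd, data)
        fsync(fd)  # raises OSError (e.g. EIO) when persistence failed
        return True
    finally:
        os.close(fd)
```

Under an injected fsync fault, the property to check is that no caller ever sees `True` for data that was not synced.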
Filesystem edge cases
EEXIST — File exists
fault(db, openat=deny("EEXIST"), run=scenario)
Simulates: Lock file already held by another process, create-exclusive
(O_EXCL) failing because the file was already created, PID file from a
previous crashed instance.
What to test:
- Does the service handle “already exists” differently from “can’t create”?
- Does it clean up stale lock files?
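The create-exclusive pattern this errno targets can be sketched as follows. `try_acquire_lock` is a hypothetical helper; it treats "already exists" as a normal outcome and lets every other creation failure propagate.

```python
import os


def try_acquire_lock(path):
    """Create a lock file atomically; return True if this process now holds it.

    O_CREAT | O_EXCL makes the kernel fail with EEXIST when the file is
    already there, which Python surfaces as FileExistsError. Any other
    OSError ("can't create") propagates unchanged.
    """
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))  # record the owner so stale locks can be inspected
    return True
```

Writing the PID into the file gives a later cleanup step something to check when deciding whether a lock is stale.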
ENOTEMPTY — Directory not empty
fault(db, openat=deny("ENOTEMPTY"), run=scenario)
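Simulates: Trying to remove a directory that still has files (e.g., cleanup of temp directories, log rotation removing old dirs). On a real kernel this errno comes from directory removal or rename calls (rmdir, unlinkat, rename) rather than from openat.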
ENOSYS — Function not implemented
fault(db, openat=deny("ENOSYS"), run=scenario)
Simulates: Running on a kernel that doesn’t support a specific syscall, seccomp policy blocking the operation entirely, missing filesystem feature.
What to test:
- Does the service fall back to an alternative?
- Does it report a clear “unsupported” error?
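A fallback on ENOSYS can be as small as the sketch below. `call_with_enosys_fallback` is a hypothetical helper; `primary` and `fallback` stand for whatever fast path and portable path the service has.

```python
import errno


def call_with_enosys_fallback(primary, fallback):
    """Run primary(); if the kernel reports ENOSYS, run fallback() instead.

    Other errors propagate, so a genuine failure is not masked as
    "unsupported".
    """
    try:
        return primary()
    except OSError as exc:
        if exc.errno != errno.ENOSYS:
            raise
        return fallback()
```

Only ENOSYS triggers the alternative path here, which keeps an injected EIO or EACCES visible as a real error.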
Using errnos not listed here
Linux has ~130 errnos. This reference covers the most common ones for fault injection. If you need an errno not listed here:
Step 1: Find the errno name. Run on your target Linux system:
# List all errnos:
python3 -c "import errno; print('\n'.join(f'{v}: {k}' for k,v in sorted(errno.errorcode.items())))"
# Or search for a specific error:
grep -r "EDEADLK\|ELOOP\|ENOLCK" /usr/include/asm-generic/errno*.h
Step 2: Use it directly in Faultbox — any valid Linux errno name works:
fault(db, write=deny("EDEADLK"), run=scenario) # resource deadlock
fault(db, openat=deny("ELOOP"), run=scenario) # too many symlinks
fault(db, write=deny("EDQUOT"), run=scenario) # disk quota exceeded
Faultbox passes the errno string to the kernel — if Linux recognizes it, it works. No configuration needed.
Combining errnos with probability
Not all failures are 100%. Use probability for intermittent errors:
# 10% of writes fail — tests retry logic
fault(db, write=deny("EIO", probability="10%"), run=scenario)
# 50% connection failures — tests circuit breaker
fault(api, connect=deny("ECONNREFUSED", probability="50%"), run=scenario)
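When choosing a probability, it helps to estimate how often the retry logic itself will be exhausted. The arithmetic below is a sketch that assumes each call fails independently with the configured probability; how Faultbox draws its random decisions is not specified here.

```python
def prob_all_attempts_fail(p_fail, attempts):
    """Probability that every one of `attempts` independent tries fails."""
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    return p_fail ** attempts


if __name__ == "__main__":
    # With a 10% fault and 3 attempts, all three fail with probability 0.1 ** 3.
    print(prob_all_attempts_fail(0.10, 3))
```

Under that assumption, a 10% write fault with three attempts exhausts the retries in roughly one operation per thousand, so a scenario needs enough operations to hit that path at all.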
Combining errnos with delay
Real failures often start with slowness before errors:
# Slow then broken — cascade simulation
fault(db,
    write=delay("2s"),
    fsync=deny("EIO"),
    run=scenario,
)