Benchmarks
Measured, reproducible performance
Every number on this page comes from BENCHMARK.md in the fast-langgraph repo, and you can regenerate all of them yourself in a few minutes.
- **737×** — Checkpoint serialization (vs Python deepcopy, 235 KB state)
- **45.9×** — Sustained state updates (quick workload, 1,000 steps)
- **2.77×** — End-to-end graph execution (20 nodes, 50 iterations, with checkpointing)
- **9.78×** — LLM response caching (90% hit rate)
Test system: Python 3.12.3 · Linux 6.14 · x86_64 · generated 2025-12-10
Checkpoint serialization
Rust's biggest win. Speedup scales with state complexity.
| State size | Rust | Python (deepcopy) | Speedup |
|---|---|---|---|
| 3.8 KB | 0.35 ms | 15.29 ms | 43× |
| 35.0 KB | 0.29 ms | 52.00 ms | 178× |
| 235.5 KB | 0.28 ms | 206.21 ms | 737× |
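To see what the Python baseline side of this table is measuring, here is a minimal sketch that times `copy.deepcopy` on a nested, agent-state-like dict. The state shape and sizes are assumptions for illustration, not the exact fixtures from BENCHMARK.md.

```python
# Hypothetical baseline: time copy.deepcopy on a nested dict that
# roughly resembles agent state (messages plus scratch data).
import copy
import json
import time

def make_state(n_messages: int) -> dict:
    """Build a nested dict loosely shaped like agent state."""
    return {
        "messages": [
            {"role": "user", "content": "x" * 200, "meta": {"step": i}}
            for i in range(n_messages)
        ],
        "scratch": {str(i): list(range(10)) for i in range(n_messages)},
    }

state = make_state(200)
print(f"approx size: {len(json.dumps(state)) / 1024:.1f} KB")

start = time.perf_counter()
for _ in range(10):
    copy.deepcopy(state)
elapsed_ms = (time.perf_counter() - start) / 10 * 1000
print(f"deepcopy: {elapsed_ms:.2f} ms per copy")
```

Because `deepcopy` walks every object in the tree, its cost grows with state complexity, which is why the speedup column climbs as state gets larger.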
Sustained state updates
Simulating real LangGraph execution with continuous state updates.
| Workload | Steps | Rust | Python | Speedup |
|---|---|---|---|---|
| Quick | 1,000 | 1.83 ms | 83.98 ms | 45.9× |
| Medium | 100 | 0.57 ms | 7.56 ms | 13.2× |
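The workload shape here is a loop of small merges into a dict-based state, similar to what happens between graph nodes. This sketch mirrors the step counts in the table, but the update body is an assumption, not the benchmark's exact code.

```python
# Sketch of a "sustained state updates" workload: apply a stream of
# small deltas to a dict-based state, one per step.
import time

def run_workload(steps: int) -> float:
    """Return elapsed milliseconds for `steps` state updates."""
    state = {"counter": 0, "messages": []}
    start = time.perf_counter()
    for i in range(steps):
        # Each step merges a small delta and appends a message.
        state = {**state, "counter": state["counter"] + 1}
        state["messages"].append({"step": i})
    return (time.perf_counter() - start) * 1000

print(f"quick (1,000 steps): {run_workload(1_000):.2f} ms")
```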
End-to-end graph simulation
20 nodes, 50 iterations with checkpointing.
| Metric | Time |
|---|---|
| Rust total | 9.11 ms |
| Python total | 25.26 ms |
| Speedup | 2.77× |
LLM caching
| Scenario | Time |
|---|---|
| Without cache | 108.48 ms |
| With cache (90% hit rate) | 11.09 ms (9.78× faster) |
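The mechanism behind this speedup is the usual caching pattern: a lookup table in front of a slow call, so repeated prompts skip the expensive path. The sketch below is an illustrative model with a stand-in cache and a simulated slow call, not fast-langgraph's actual interface; the hot/cold split is an assumption chosen to give a high hit rate.

```python
# Illustrative response cache: a dict in front of a slow "LLM call".
import random
import time

random.seed(0)
cache: dict[str, str] = {}

def slow_llm(prompt: str) -> str:
    time.sleep(0.001)  # stands in for network + inference latency
    return f"response to {prompt!r}"

def cached_llm(prompt: str) -> str:
    # Miss: pay the slow call once. Hit: return instantly.
    if prompt not in cache:
        cache[prompt] = slow_llm(prompt)
    return cache[prompt]

# Roughly 90% of calls draw from a small hot set of 10 prompts,
# the rest are unique, so most calls after warm-up are cache hits.
prompts = [
    f"hot-{random.randrange(10)}" if random.random() < 0.9 else f"cold-{i}"
    for i in range(100)
]

start = time.perf_counter()
for p in prompts:
    cached_llm(p)
cached_ms = (time.perf_counter() - start) * 1000
print(f"with cache: {cached_ms:.2f} ms for 100 calls")
```

Only cache misses pay the slow-call cost, so total time approaches (miss count × call latency) as the hit rate rises.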
Feature summary
| Feature | Performance | Use case |
|---|---|---|
| Complex Checkpoint (250 KB) | 737× faster than deepcopy | Large agent state |
| Complex Checkpoint (35 KB) | 178× faster | Medium state |
| LLM Response Caching | 9.78× (90% hit rate) | Repeated prompts, RAG |
| Function Caching | 1.6× speedup | Expensive computations |
| In-Memory Checkpoint PUT | 1.4 μs/op | Fast state snapshots |
| LangGraph State Update | 1.4 μs/op | High-frequency updates |
Reproduce it yourself
```shell
git clone https://github.com/neul-labs/fast-langgraph
cd fast-langgraph
uv run python scripts/generate_benchmark_report.py

# Individual suites
uv run python scripts/benchmark_rust_strengths.py
uv run python scripts/benchmark_complex_structures.py
uv run python scripts/benchmark_all_features.py
cargo bench
```
A note on honesty: not every operation gets a Rust speedup. Python's native `{**a, **b}` dict merge is hand-optimized C and stays faster than our Rust equivalent for simple merges — so we don't use Rust there. fast-langgraph is about picking the right battles, not about blanket rewrites.
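You can check the shallow-merge claim yourself in a few lines. This snippet just times Python's native merge syntax with `timeit`; the dict sizes are arbitrary.

```python
# Quick check of the claim above: {**a, **b} is a shallow merge
# implemented in optimized C, so it is already very fast.
import timeit

a = {f"k{i}": i for i in range(50)}
b = {f"k{i}": i * 2 for i in range(25, 75)}

native = timeit.timeit(lambda: {**a, **b}, number=100_000)
print(f"{{**a, **b}}: {native / 100_000 * 1e6:.2f} µs per merge")

merged = {**a, **b}
assert merged["k30"] == 60  # keys present in both dicts take b's value
```

Note this is a *shallow* merge: nested objects are shared, not copied, which is exactly why it is so cheap compared with checkpoint-style deep copies.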
Hit a LangGraph scaling wall?
We help production teams squeeze every bottleneck out of LangGraph — checkpoints, state, LLM costs, memory. Honest audits. Measurable fixes.