The Rust performance layer for LangGraph
Drop-in accelerators that make production LangGraph graphs up to 737× faster at checkpoint serialization and 2.8× faster end-to-end. Full API compatibility. One line to enable.
Measured performance
Benchmarks run on Python 3.12 / Linux x86_64. See /benchmarks for the full report and reproduction instructions.
Production LangGraph has three real bottlenecks
LangGraph is great for building agents. But production workloads hit the same walls — every time. We profiled them, isolated them, and rewrote the hot paths in Rust.
Python deepcopy collapses on large state
Serializing 235 KB of graph state through Python's deepcopy takes 206 ms. Through RustSQLiteCheckpointer, the same operation takes 0.28 ms.
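You can see the deepcopy cost for yourself. The sketch below times a deep copy of a stand-in agent state (the state shape and sizes are illustrative, not the benchmark's actual payload):

```python
import copy
import time

# Stand-in for large agent state: ~400 messages of ~500 bytes each,
# roughly 200 KB of payload (illustrative, not the benchmark fixture).
state = {
    "messages": [{"role": "user", "content": "x" * 500} for _ in range(400)],
    "step": 0,
}

start = time.perf_counter()
snapshot = copy.deepcopy(state)  # what a pure-Python checkpointer does per step
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"deepcopy of ~200 KB state: {elapsed_ms:.2f} ms")
```

Every checkpoint pays this walk over every object in the state; the Rust path serializes the same bytes without touching the Python object graph.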
58% of time spent recreating thread pools
LangGraph builds a new ThreadPoolExecutor per invocation. Our shim caches them, eliminating the largest single source of per-invocation overhead.
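The caching strategy is simple to sketch. The names below (`_EXECUTORS`, `get_executor`) are illustrative, not the shim's actual API:

```python
from concurrent.futures import ThreadPoolExecutor

# One pool per configuration, created lazily and reused across invocations,
# instead of constructing a fresh ThreadPoolExecutor on every call.
_EXECUTORS: dict[int, ThreadPoolExecutor] = {}

def get_executor(max_workers: int = 8) -> ThreadPoolExecutor:
    if max_workers not in _EXECUTORS:
        _EXECUTORS[max_workers] = ThreadPoolExecutor(max_workers=max_workers)
    return _EXECUTORS[max_workers]

# Every invocation now reuses the same cached pool:
pool_a = get_executor(4)
pool_b = get_executor(4)
```

Constructing a pool means spawning OS threads; amortizing that across invocations is where the per-call overhead disappears.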
Repeated prompts waste API spend
Graphs loop, retry, and branch. The same prompts get answered multiple times. Our @cached decorator delivers a 10× speedup at a 90% hit rate — and a direct cost reduction.
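The effect is easy to demonstrate with a plain memoization cache standing in for @cached (the stand-in below uses functools.lru_cache and a fake LLM call; it is not fast-langgraph's implementation):

```python
from functools import lru_cache

calls = {"n": 0}

# Stand-in for an LLM call site; identical prompts hit the cache,
# not the API.
@lru_cache(maxsize=1024)
def ask_llm(prompt: str) -> str:
    calls["n"] += 1  # count real (uncached) API calls
    return f"answer to: {prompt}"

# A looping or retrying graph re-issues the same prompt repeatedly:
for _ in range(10):
    ask_llm("Summarize the ticket")
```

Ten identical invocations, one real call — the other nine are cache hits, which is the redundancy the decorator targets.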
One line. Zero code changes.
Automatic mode patches LangGraph transparently at import time.
# 1. install
$ pip install fast-langgraph
# 2. enable (anywhere before you build your graph)
import fast_langgraph
fast_langgraph.shim.patch_langgraph()
# 3. use LangGraph exactly as you already do
from langgraph.graph import StateGraph
# → ~2.8x end-to-end speedup, zero API changes
Guides
Production-tested walkthroughs from quickstart to deep optimization.
Cache LLM calls with @cached for a 10× speedup
LangGraph graphs re-issue the same LLM prompts constantly. The fast-langgraph @cached decorator drops onto your LLM call sites and eliminates redundant API spend.
Find LangGraph bottlenecks with GraphProfiler
Before you adopt fast-langgraph or any other optimization, measure. The GraphProfiler adds ~1.6 μs of overhead per operation and tells you exactly where your wall clock goes.
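The idea behind per-node attribution can be sketched in a few lines. This is a minimal illustration of the technique, not GraphProfiler's API (`timings` and `profiled` are made-up names):

```python
import time
from collections import defaultdict

# Accumulate wall-clock time per named node across invocations.
timings: dict[str, float] = defaultdict(float)

def profiled(name: str):
    def wrap(fn):
        def inner(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                timings[name] += time.perf_counter() - start
        return inner
    return wrap

@profiled("slow_node")
def slow_node(state: dict) -> dict:
    time.sleep(0.01)  # stand-in for real node work
    return state

slow_node({})
```

After a few runs, sorting `timings` tells you which node to optimize first — measure before you patch.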
Quickstart: enable fast-langgraph in under a minute
Install fast-langgraph, flip the shim on, and measure your first speedup. No code changes to your existing LangGraph application.
Technical articles
Deep-dives on what's slow in LangGraph and how we fixed it.
Why Python's deepcopy kills LangGraph at scale
The 737× speedup on checkpoint serialization isn't magic. It's the direct consequence of what Python's deepcopy actually does — and doesn't do — to your agent state.
Scaling LangGraph in production: the three real bottlenecks
Production LangGraph workloads hit three predictable bottlenecks: checkpointing, executor churn, and LLM redundancy. An architect-level look at the cost math at each stage.
Executor churn: the 58% problem in LangGraph invocations
Most LangGraph invocation overhead isn't in your nodes or your channels. It's in ThreadPoolExecutor construction on every single call — 58% of wall clock on short graphs.
Hit a LangGraph scaling wall?
We help production teams squeeze every bottleneck out of LangGraph — checkpoints, state, LLM costs, memory. Honest audits. Measurable fixes.