Tags: shim · automatic · executor

Automatic shim mode: zero-code-change acceleration

Neul Labs · Level: beginner · Read: 4 min
TL;DR

Call patch_langgraph() at startup. The library monkey-patches LangGraph's executor and apply_writes with faster equivalents. Combined speedup is ~2.8x end-to-end. Zero changes to your graph code.

The shim is the easy mode of fast-langgraph. It exists because we believe the first win should be free, and because profiling real workloads taught us a deep lesson: most of LangGraph's per-invocation overhead lives in places you wouldn't think to touch.

What it does

fast_langgraph.shim.patch_langgraph() replaces two things inside LangGraph at runtime:

  1. The executor construction path. LangGraph builds a fresh ThreadPoolExecutor every time you invoke a graph. We measured this taking up to 58% of wall clock on short graphs. The shim swaps in a cached executor that’s reused across invocations.
  2. The apply_writes function that applies channel updates at each super-step. We replace it with a Rust implementation that batches updates natively.

That’s it. Two surgical patches. The rest of LangGraph — your state, your nodes, your tools, your channels — runs exactly as before.
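The executor-caching idea boils down to "construct the pool once, reuse it". A minimal sketch in plain Python of the technique, not the library's implementation (both function names here are made up for illustration):

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache

# What LangGraph effectively does per invocation (simplified):
# a fresh thread pool, constructed and paid for on every graph invoke.
def build_executor_per_invoke():
    return ThreadPoolExecutor(max_workers=4)

# The caching idea: construct once, hand back the same pool every time.
@lru_cache(maxsize=1)
def get_cached_executor():
    return ThreadPoolExecutor(max_workers=4)

first = get_cached_executor()
second = get_cached_executor()
assert first is second  # same pool reused across "invocations"
assert build_executor_per_invoke() is not build_executor_per_invoke()
```

ThreadPoolExecutor only spawns its worker threads lazily on the first submit, so the per-invocation cost being avoided is object construction plus eventual thread startup, exactly the overhead the shim targets.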

Enable it

Two equivalent options:

Explicit (recommended when you control your entry point):

import fast_langgraph
fast_langgraph.shim.patch_langgraph()

# Now your normal LangGraph imports pick up the patched versions
from langgraph.graph import StateGraph

Environment variable (recommended for production and when you don’t control imports):

export FAST_LANGGRAPH_AUTO_PATCH=1
python your_app.py

The env var is checked as soon as fast_langgraph is imported for the first time. If it’s set, patch_langgraph() runs automatically.

Verify it’s working

fast_langgraph.shim.print_status()

Prints a table of what’s currently patched. If a component shows as not patched, the most likely cause is import order: LangGraph was imported and had its functions cached somewhere before the shim ran. Fix by ensuring fast_langgraph.shim.patch_langgraph() runs first in your entry point.
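The import-order pitfall is plain Python semantics, not anything LangGraph-specific: a name bound via `from module import f` keeps pointing at the original object even after the module attribute is patched. A self-contained demonstration (no LangGraph required):

```python
import types

# A stand-in module with one function, simulating a LangGraph internal.
mod = types.ModuleType("mod")
mod.f = lambda: "original"

cached = mod.f               # simulates `from langgraph... import f`
                             # happening BEFORE the shim runs
mod.f = lambda: "patched"    # simulates the shim patching afterwards

assert cached() == "original"  # the early binding missed the patch
assert mod.f() == "patched"    # lookups through the module see the patch
```

This is why patching first in your entry point matters: code that resolves the function through the module at call time picks up the patch, but any copy of the function reference taken before the patch keeps the slow original.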

The measured wins

| Component | Speedup | Source |
| --- | --- | --- |
| Executor caching | 2.3× | Reuses ThreadPoolExecutor across invocations |
| Rust apply_writes | 1.2× | Native-code channel batch updates |
| Combined | ~2.8× | Real-world end-to-end on graphs with checkpointing |

The 2.8× is conservative — it comes from our end-to-end benchmark with 20 nodes and 50 iterations. On shorter graphs (where executor setup dominates), the multiplier can be higher. On longer graphs (where node work dominates), it’s slightly lower.
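As a sanity check, the component numbers compose roughly multiplicatively (end-to-end composition is never exactly multiplicative, but the product should land near the measured figure):

```python
executor_caching = 2.3
rust_apply_writes = 1.2
combined = executor_caching * rust_apply_writes
print(round(combined, 2))  # 2.76, consistent with the measured ~2.8×
```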

Safety

The shim is a monkey-patch. That’s a red flag for some teams and we respect that. A few points:

  • Fully reversible. Call fast_langgraph.shim.unpatch_langgraph() to restore the original functions. Useful in test fixtures.
  • Upstream-tested. We run LangGraph’s own test suite against the shimmed implementation. 85 of 88 tests pass with zero failures — see compatibility.
  • Visible. print_status() tells you exactly what’s been replaced. No hidden magic.
  • Optional. If you don’t want automatic patching, use manual acceleration mode and import Rust components explicitly.
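The patch/unpatch pair follows the standard save-and-restore pattern for reversible monkey-patching. A minimal sketch of that pattern in plain Python (illustrative only; `target` and its `apply_writes` here are stand-ins, not the library's internals):

```python
import types

# A stand-in module holding a function we want to replace.
target = types.ModuleType("target")
target.apply_writes = lambda: "slow"

_originals = {}

def patch():
    """Save the original, then swap in the fast version."""
    _originals["apply_writes"] = target.apply_writes
    target.apply_writes = lambda: "fast"

def unpatch():
    """Restore the saved original exactly as it was."""
    target.apply_writes = _originals.pop("apply_writes")

patch()
assert target.apply_writes() == "fast"
unpatch()
assert target.apply_writes() == "slow"  # fully reversed
```

The same shape makes the shim easy to wrap in a test fixture: patch in setup, unpatch in teardown, and the original functions are back for the next test.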

When to graduate to manual mode

The shim gives you the first 2–3×. It won’t give you the headline 737× — that requires direct use of RustSQLiteCheckpointer and friends. Think of the shim as the onramp: use it to prove the library works with your stack, then selectively adopt manual components for the biggest remaining bottlenecks.

Most teams run in hybrid mode: shim enabled globally, plus explicit RustSQLiteCheckpointer and @cached wired into their hot paths.