Manual acceleration mode: direct Rust component usage

Neul Labs · Level: advanced · Read: 6 min
TL;DR

Import RustSQLiteCheckpointer, @cached, and langgraph_state_update directly. Wire them into your hot paths. This is how you get the big numbers.

The shim is the polite 2.8× win that requires no thought. Manual mode is what you reach for when 2.8× isn’t enough.

In manual mode, you explicitly import Rust-backed components and wire them into the places in your code where they’ll pay off most. Each component replaces exactly one thing: a checkpointer, a cached function, a state merge operation. There’s no magic, no monkey-patching, and no ambiguity about what’s running where.

The components

| Component | Replaces | Peak speedup | Import |
| --- | --- | --- | --- |
| RustSQLiteCheckpointer | SqliteSaver | 737× on large state | from fast_langgraph import RustSQLiteCheckpointer |
| @cached | Ad-hoc LLM caching | 9.78× on LLM-bound workloads | from fast_langgraph import cached |
| langgraph_state_update | Manual state merging | 46× on high-frequency updates | from fast_langgraph import langgraph_state_update |

Pattern 1: fast checkpointing

from fast_langgraph import RustSQLiteCheckpointer
from langgraph.graph import StateGraph

checkpointer = RustSQLiteCheckpointer("state.db")

graph = build_my_graph().compile(checkpointer=checkpointer)

A one-line swap. Same on-disk format as SqliteSaver, so existing threads migrate for free. Full walkthrough in the dedicated guide.

Pattern 2: LLM response caching

from fast_langgraph import cached

@cached(max_size=1000)
def call_llm(prompt: str) -> str:
    return llm.invoke(prompt)

Keyed by the argument tuple, LRU-evicted at max_size. Check stats with call_llm.cache_stats(). Full walkthrough in caching LLM calls.
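The actual cache lives in Rust, but the key-and-evict behavior is easy to pin down. Here is a minimal pure-Python sketch of those semantics — key on the argument tuple, evict the least-recently-used entry past max_size, expose hit/miss stats. The decorator name lru_cached and the exact stats dict shape are illustrative, not part of the fast_langgraph API.

```python
from collections import OrderedDict
from functools import wraps

def lru_cached(max_size=1000):
    """Sketch of @cached's semantics: key on the argument tuple,
    LRU-evict once the cache exceeds max_size."""
    def decorator(fn):
        cache = OrderedDict()
        stats = {"hits": 0, "misses": 0}

        @wraps(fn)
        def wrapper(*args, **kwargs):
            key = (args, tuple(sorted(kwargs.items())))
            if key in cache:
                stats["hits"] += 1
                cache.move_to_end(key)  # mark as most recently used
                return cache[key]
            stats["misses"] += 1
            result = fn(*args, **kwargs)
            cache[key] = result
            if len(cache) > max_size:
                cache.popitem(last=False)  # evict the LRU entry
            return result

        wrapper.cache_stats = lambda: dict(stats)
        return wrapper
    return decorator

@lru_cached(max_size=2)
def fake_llm(prompt: str) -> str:
    # stand-in for a real llm.invoke(prompt) call
    return f"response to {prompt!r}"

fake_llm("a"); fake_llm("a")   # second call is a cache hit
fake_llm("b"); fake_llm("c")   # "a" gets evicted at max_size=2
print(fake_llm.cache_stats())  # {'hits': 1, 'misses': 3}
```

The practical consequence of keying on the argument tuple: two prompts that differ by a single character are distinct entries, so normalize prompts before the call if you want them to share a slot.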

Pattern 3: high-frequency state updates

This is the one most teams don’t reach for but probably should.

If your graph has a node that accumulates messages or updates scratchpad state at high frequency (think streaming token events, continuous tool observations, or per-chunk state mutations), the cost of merging state dict updates in Python adds up fast.

from fast_langgraph import langgraph_state_update

def accumulate_messages_node(state):
    new_message = generate_next_message(state)
    return langgraph_state_update(
        state,
        {"messages": [new_message]},
        append_keys=["messages"],
    )

langgraph_state_update takes the current state dict, the partial update, and a list of keys whose values should be list-appended rather than replaced. It performs the merge in Rust and returns the new state — 1.4 μs per call, versus the ~10–30 μs a Python dict merge takes at non-trivial state sizes.
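For clarity on what "merge" means here, this is a pure-Python reference of the same semantics — append_keys get list-appended, everything else is replaced, and the input state is left untouched. The function name state_update_reference is ours; only the behavior mirrors the source description.

```python
def state_update_reference(state, update, append_keys=()):
    """Reference merge: keys listed in append_keys have their list
    values appended; all other keys are replaced outright."""
    merged = dict(state)  # shallow copy so the caller's state survives
    for key, value in update.items():
        if key in append_keys:
            merged[key] = list(merged.get(key, [])) + list(value)
        else:
            merged[key] = value
    return merged

state = {"messages": ["hi"], "step": 1}
new = state_update_reference(
    state, {"messages": ["there"], "step": 2}, append_keys=["messages"]
)
print(new)    # {'messages': ['hi', 'there'], 'step': 2}
print(state)  # original unchanged: {'messages': ['hi'], 'step': 1}
```

The Rust version wins by doing this merge without allocating intermediate Python lists, which is where the per-call microseconds go.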

On a graph that does 100 state updates per invocation, that’s ~2 ms of savings per invocation. Not headline-grabbing, but it compounds when your graph runs thousands of times.

Pattern 4: combining with the shim

Nothing stops you from running both modes at once. In fact, most production users do:

import fast_langgraph
fast_langgraph.shim.patch_langgraph()

from fast_langgraph import RustSQLiteCheckpointer, cached
from langgraph.graph import StateGraph

@cached(max_size=2000)
def call_llm(prompt: str) -> str:
    return llm.invoke(prompt)

graph = (
    StateGraph(MyState)
    .add_node("llm", llm_node_using_call_llm)
    # ...
    .compile(checkpointer=RustSQLiteCheckpointer("state.db"))
)

The shim handles the executor and apply_writes paths globally. The manual components take over the specific hot paths you’ve profiled as problems.

Picking your targets

Don’t adopt everything at once. Profile first. Adopt the component that attacks your largest bottleneck. Measure. Then the next one. Most teams get their first 2–3× from the shim alone, another 2–3× from RustSQLiteCheckpointer, and another chunk from @cached. The three together can move a graph from “barely usable in production” to “comfortably cheap.”
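"Profile first" can be as simple as wrapping a representative invocation in the standard library's cProfile and reading the top of the cumulative-time table. The run_workload function below is a stand-in for your real graph.invoke(...) call.

```python
import cProfile
import io
import pstats

def run_workload():
    # Stand-in for graph.invoke(...) on a representative input;
    # swap in your real graph call to see where the time goes.
    total = 0
    for i in range(100_000):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
run_workload()
profiler.disable()

# Print the 10 most expensive functions by cumulative time.
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(10)
print(out.getvalue())
```

If checkpoint writes dominate the table, reach for RustSQLiteCheckpointer; if repeated identical LLM calls dominate, reach for @cached; if dict merging inside nodes dominates, reach for langgraph_state_update.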

When manual mode is overkill

If you’re prototyping, if your state is tiny, or if your workload is LLM-call-bound anyway, the shim is enough. Manual mode is for teams who have already hit the easy wins and want to push further. There’s no prize for using every component.

Questions about your specific workload? We do production audits.