Walk-forward optimization in Python

Optimizing parameters on your whole dataset and reporting the result is the single most common way to lie to yourself in quant research. Walk-forward optimization is the antidote: it only ever scores parameters on data they were not tuned on. Here is how it works, in Python.

The overfitting trap

Give an optimizer enough parameters and it will find a setting that looks brilliant on any fixed history by fitting the noise rather than the signal. The resulting figure is in-sample performance, and it is rarely achievable live. The only honest measure is how parameters perform on data they have never seen.
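The trap is easy to reproduce with nothing but random numbers. The sketch below is a toy illustration, not part of Manifold-BT: each "candidate" is a pure-noise return stream standing in for one parameter setting, and the function name, candidate count, and bar count are all assumptions chosen for the demo. It picks the candidate with the best Sharpe ratio on the first half of the data, then scores that same candidate on the second half.

```python
import random
import statistics

def sharpe(returns):
    """Mean over standard deviation of per-bar returns (not annualized)."""
    sd = statistics.pstdev(returns)
    return statistics.fmean(returns) / sd if sd > 0 else 0.0

def best_of_noise(n_candidates=200, n_bars=500, seed=0):
    """Select the best in-sample candidate among streams that have no edge."""
    rng = random.Random(seed)
    split = n_bars // 2
    # Every candidate is zero-mean Gaussian noise: there is nothing to find.
    candidates = [[rng.gauss(0.0, 0.01) for _ in range(n_bars)]
                  for _ in range(n_candidates)]
    # "Optimize" by choosing the candidate with the highest first-half Sharpe.
    best = max(candidates, key=lambda r: sharpe(r[:split]))
    return sharpe(best[:split]), sharpe(best[split:])

in_sample, out_of_sample = best_of_noise()
print(f"in-sample Sharpe of the winner:     {in_sample:+.3f}")
print(f"out-of-sample Sharpe of the winner: {out_of_sample:+.3f}")
```

Because the winner is selected for its first-half score, that score is biased upward by construction, while the second-half score of the same stream has no reason to be anything but noise around zero. More candidates make the in-sample winner look better without improving what follows.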

How walk-forward works

Split the timeline into consecutive folds. In each fold, optimize the parameters on the in-sample portion, then freeze them and measure performance on the out-of-sample portion that follows. Slide forward and repeat. Stitch the out-of-sample segments together and you get an equity curve made entirely of unseen data, plus a record of how the optimal parameters drift over time. Stable optima across folds are a good sign; wildly different optima from fold to fold suggest the edge is fragile.
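The loop described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not how Manifold-BT implements it: `rolling_folds`, `momentum_score`, and `walk_forward` are hypothetical names, the data is synthetic noise with a small drift, and the objective is a toy momentum rule scored by total return.

```python
import random

def rolling_folds(n_bars, n_splits, train_size):
    """Yield (train, test) index ranges for a rolling walk-forward."""
    test_size = (n_bars - train_size) // n_splits
    if train_size <= 0 or test_size <= 0:
        raise ValueError("not enough bars for the requested folds")
    for k in range(n_splits):
        train_start = k * test_size
        train_end = train_start + train_size
        yield range(train_start, train_end), range(train_end, train_end + test_size)

def momentum_score(returns, lookback):
    """Toy objective: total return earned while the trailing sum is positive."""
    total = 0.0
    for t in range(lookback, len(returns)):  # first `lookback` bars are warmup
        if sum(returns[t - lookback:t]) > 0:
            total += returns[t]
    return total

def walk_forward(returns, param_grid, score, n_splits, train_size):
    """Optimize on each in-sample window, then score the frozen winner out of sample."""
    folds = []
    for train, test in rolling_folds(len(returns), n_splits, train_size):
        in_sample = returns[train.start:train.stop]
        out_of_sample = returns[test.start:test.stop]
        best = max(param_grid, key=lambda p: score(in_sample, p))  # tune
        folds.append({"params": best, "test": test,
                      "oos_score": score(out_of_sample, best)})    # freeze and test
    return folds

rng = random.Random(42)
returns = [rng.gauss(0.0002, 0.01) for _ in range(1000)]  # synthetic data
folds = walk_forward(returns, param_grid=[5, 10, 20, 40],
                     score=momentum_score, n_splits=5, train_size=400)
for f in folds:
    print(f["test"], f["params"], round(f["oos_score"], 4))
```

The test ranges are contiguous and never overlap a training window that was used to choose their parameters, which is what makes the stitched out-of-sample record honest. The per-fold `params` column is the drift record: on noise like this there is no reason for it to be stable.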

Walk-forward in Manifold-BT

Wrap the parameters you want tuned in mbt.param() and call run_walk_forward() with a wf_config that names the method (rolling or anchored), the fold count, and the metric to optimize. The Rust core optimizes every fold in parallel, so a full walk-forward across years of data finishes in seconds rather than minutes:

walk_forward.py
import manifoldbt as mbt
from manifoldbt.indicators import close, ema
from manifoldbt.helpers import time_range, Slippage, Interval

# Wrap periods in mbt.param so each fold can re-optimize them
fast_p = mbt.param("fast", default=20)
slow_p = mbt.param("slow", default=50)
fast, slow = ema(close, fast_p), ema(close, slow_p)

strategy = (
    mbt.Strategy.create("wf_ema")
    .signal("fast", fast)
    .signal("slow", slow)
    .size(mbt.when(fast > slow, 1.0, 0.0))
)

start, end = time_range("2020-01-01", "2026-01-01")
config = mbt.BacktestConfig(
    universe={"binance": ["BTCUSDT"]},
    time_range_start=start,
    time_range_end=end,
    bar_interval=Interval.hours(4),
    initial_capital=10_000,
    fees=mbt.FeeConfig.binance_perps(),
    slippage=Slippage.fixed_bps(2),
    warmup_bars=50,
)
store = mbt.ingest(provider="binance", symbol="BTCUSDT", symbol_id=1,
                   interval="4h", start="2020-01-01T00:00:00Z",
                   end="2026-01-01T00:00:00Z")

# Optimize in-sample, validate out-of-sample, fold by fold (Pro)
wf = mbt.run_walk_forward(
    strategy,
    wf_config={"method": "Rolling", "n_splits": 5, "optimize_metric": "sharpe"},
    config=config,
    store=store,
)

for fold in wf["folds"]:
    print(fold)                      # in- and out-of-sample metrics per fold
print(wf["best_params_per_fold"])    # how the optimum drifts over time

Walk-forward optimization is a Pro feature. The single-backtest and parameter-sweep workflows are available in the open-source core.
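Once you have per-fold optima, a quick stability check makes the drift easier to read than a raw printout. The helper below is a sketch that assumes the optima arrive as a list of dicts, one per fold, keyed by parameter name; the actual shape of wf["best_params_per_fold"] is not documented here, so adapt it as needed. `param_drift` is a hypothetical name and the example values are made up for illustration.

```python
import statistics

def param_drift(best_params_per_fold):
    """Summarize how each tuned parameter varies across folds.

    Assumes a list of dicts, one per fold, e.g. [{"fast": 20, "slow": 50}, ...].
    """
    summary = {}
    for name in best_params_per_fold[0]:
        values = [fold[name] for fold in best_params_per_fold]
        mean = statistics.fmean(values)
        spread = statistics.pstdev(values)
        summary[name] = {
            "min": min(values),
            "max": max(values),
            # Coefficient of variation: spread relative to the typical value.
            "cv": spread / abs(mean) if mean else float("inf"),
        }
    return summary

# Hypothetical per-fold optima, invented for this example
example = [{"fast": 18, "slow": 50}, {"fast": 20, "slow": 55}, {"fast": 22, "slow": 45}]
drift = param_drift(example)
for name, stats in drift.items():
    print(name, stats)
```

A small coefficient of variation means the optimizer keeps landing in the same neighborhood; a large one is the "wildly different each fold" case that signals a fragile edge.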

Frequently asked questions

What is walk-forward optimization?

Walk-forward optimization tunes a strategy's parameters on an in-sample window, then tests those parameters on the next, unseen out-of-sample window, and repeats across the data. The combined out-of-sample results estimate real-world performance.

Why is walk-forward better than a single backtest?

A single optimized backtest reports in-sample performance, which is inflated by overfitting. Walk-forward only ever scores parameters on data they were not tuned on, so it exposes strategies that only worked in hindsight.

What is the difference between anchored and rolling walk-forward?

Anchored keeps the in-sample window's start fixed and grows it over time; rolling slides a fixed-length window forward. Rolling adapts faster to regime change; anchored uses more history per fold.
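The difference is easiest to see in the window boundaries themselves. This sketch uses a hypothetical helper, `walk_forward_windows`, that returns half-open bar indices as (train_start, train_end, test_end); the sizes are arbitrary and it makes no claim about how Manifold-BT lays out its folds.

```python
def walk_forward_windows(n_bars, n_splits, train_size, method="rolling"):
    """Return [(train_start, train_end, test_end), ...] as half-open bar indices."""
    if method not in ("rolling", "anchored"):
        raise ValueError("method must be 'rolling' or 'anchored'")
    test_size = (n_bars - train_size) // n_splits
    if train_size <= 0 or test_size <= 0:
        raise ValueError("not enough bars for the requested folds")
    windows = []
    for k in range(n_splits):
        train_end = train_size + k * test_size
        # Anchored always starts at bar 0; rolling keeps the length fixed.
        train_start = 0 if method == "anchored" else train_end - train_size
        windows.append((train_start, train_end, train_end + test_size))
    return windows

for method in ("rolling", "anchored"):
    print(method, walk_forward_windows(100, 4, 40, method))
# rolling [(0, 40, 55), (15, 55, 70), (30, 70, 85), (45, 85, 100)]
# anchored [(0, 40, 55), (0, 55, 70), (0, 70, 85), (0, 85, 100)]
```

The out-of-sample segments are identical in both modes; only the training window changes. Rolling drops the oldest 15 bars each step, while anchored trains the last fold on 85 bars instead of 40.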

Run your first backtest

Install Manifold-BT and reproduce the backtest above in seconds. The Rust core runs years of bars in under a second, so you can sweep parameters instead of waiting.

$ pip install manifoldbt