pickuma.

Why Your Backtest Is Lying to You: Data Biases That Ruin Equity Research

Survivorship bias, look-ahead bias, point-in-time data, and the rest of the subtle ways your beautiful backtest is overstating returns. With concrete fixes for each.

Owen
Engineer · Investor
5 min read

You built a backtest. It returns 18% annualized over the last decade. The Sharpe is 1.4. The drawdown is acceptable. You’re already mentally apartment-shopping in the city you’ll move to after the strategy makes you wealthy.

It’s probably wrong. Not because your code has bugs — backtest code can be flawless and still produce numbers that have almost nothing to do with what would have happened if you’d traded the strategy live. The reasons are subtle and they have names.

Survivorship bias

The most famous one. If your historical universe is “S&P 500 today, backtested over ten years,” you’ve quietly excluded every company that was in the S&P 500 at some point during those ten years and was later removed, often because it shrank, was acquired, or failed.

The companies that died don’t show up in the dataset. Your historical “S&P 500” is actually “the winning subset of the S&P 500 visible from the future.” Returns are biased upward, sometimes by several percentage points a year for small-cap or emerging-market strategies.

Fix: use a point-in-time index membership database, or build one. CRSP and similar paid datasets handle this; free options require more work and willingness to scrape historical index constituent lists.
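As a rough sketch of what "point-in-time membership" means in practice, the lookup below answers "who was in the index on this date?" from a list of add and remove dates. The tickers, dates, and the `constituents_as_of` helper are all made up for illustration; a real table would come from a vendor or from scraped constituent lists.

```python
from datetime import date

# Hypothetical membership records: (ticker, date_added, date_removed or None).
# Tickers and dates are invented for illustration only.
MEMBERSHIP = [
    ("AAA", date(2005, 1, 3), None),
    ("BBB", date(2009, 6, 1), date(2016, 3, 18)),  # later dropped from the index
    ("CCC", date(2017, 9, 5), None),               # joined after 2015
]


def constituents_as_of(records, as_of):
    """Return the sorted tickers that were index members on `as_of`."""
    return sorted(
        ticker
        for ticker, added, removed in records
        if added <= as_of and (removed is None or as_of < removed)
    )


# A 2015 rebalance sees BBB (which later left) and does not see CCC.
print(constituents_as_of(MEMBERSHIP, date(2015, 1, 2)))
# Using today's list for 2015 would silently swap BBB out and CCC in.
print(constituents_as_of(MEMBERSHIP, date(2020, 1, 2)))
```

Calling the lookup once per rebalance date, rather than once at the top of the script, is the whole point: the universe changes as the backtest clock moves.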

Look-ahead bias

You computed the trailing-twelve-month P/E using today’s reported financials. Trouble: those financials weren’t available at the time. Earnings get reported 30–90 days after a quarter ends, and restatements happen after that.

If you’re ranking companies on March 31, 2018, you should be using financials that were actually filed and disclosed by March 31, 2018 — not the latest revision visible today.

Fix: use point-in-time fundamentals. EDGAR preserves the original filing dates, so it can be done, but it requires care. Most price-data vendors hand you the latest-available fundamentals with no filing timestamp, and that is where the bias comes from.
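One way to picture a point-in-time fundamentals lookup: store every filing with the date it was filed, then filter on that date before ranking. The sketch below assumes a hypothetical list of EPS records for a single company, with a restatement stored as an extra row; the figures and the `eps_known_on` helper are illustrative, not taken from any real filing.

```python
from datetime import date

# Hypothetical EPS records for one company: (fiscal_period_end, filed_on, eps).
# A restatement is another row for the same period with a later filed_on.
FILINGS = [
    (date(2017, 9, 30), date(2017, 11, 2), 1.10),
    (date(2017, 12, 31), date(2018, 2, 20), 1.25),
    (date(2017, 12, 31), date(2018, 6, 15), 0.95),  # restated after the fact
    (date(2018, 3, 31), date(2018, 5, 3), 1.30),
]


def eps_known_on(filings, as_of):
    """Map fiscal period -> EPS, using only filings dated on or before `as_of`."""
    known = {}
    # Process in filing order so a later filing overwrites an earlier one.
    for period_end, filed_on, eps in sorted(filings, key=lambda f: f[1]):
        if filed_on <= as_of:
            known[period_end] = eps
    return known


# On March 31, 2018 the Q1 2018 numbers are not filed yet, and the
# year-end figure is still the original 1.25, not the restated 0.95.
print(eps_known_on(FILINGS, date(2018, 3, 31)))
```

A dataset that keeps only the latest value per period would return 0.95 for year-end 2017 no matter which date you ask about, which is exactly the leak described above.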

Universe selection bias

A close cousin of survivorship. If you backtest on “stocks that have at least 10 years of clean fundamental data,” you’ve selected for stability — companies that didn’t go bankrupt, didn’t merge, didn’t get acquired, didn’t change reporting standards. Your universe is structurally biased toward survivors.

Fix: define your universe as a point-in-time set (“Russell 3000 as of the rebalance date”) and accept that some constituents will have noisy or incomplete data.

Slippage and transaction costs

You assumed you traded at the closing price with zero cost. In reality, large orders move the market, bid-ask spreads exist, and small-cap stocks can be impossible to fill at the quoted price.

A strategy that backtests at 15% gross often nets 8–10% once slippage and costs are charged. The smaller your average market cap and the higher your turnover, the worse the gap.

Fix: model slippage explicitly. A reasonable starting point is 20–50 basis points per round-trip for liquid stocks, 100+ bps for small caps. Charge it to every trade.
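To see how quickly those basis points add up, here is a deliberately simple cost model. It assumes the whole portfolio is turned over on every round-trip and that cost is a flat number of basis points; both are simplifications, and the `net_return` helper is a hypothetical name, not part of any backtesting library.

```python
def net_return(gross_return, round_trips_per_year, cost_bps_per_round_trip):
    """Annual return after charging a flat cost to every round-trip.

    Assumes each round-trip turns over the full portfolio, which overstates
    costs for strategies that only trade part of the book.
    """
    annual_cost = round_trips_per_year * cost_bps_per_round_trip / 10_000
    return gross_return - annual_cost


# 15% gross, rebalanced monthly, 50 bps per round-trip: about 9% net.
print(round(net_return(0.15, 12, 50), 4))
# Same strategy in small caps at 100 bps: about 3% net.
print(round(net_return(0.15, 12, 100), 4))
```

A more careful model would scale cost with order size relative to volume, but even this flat charge is enough to show which strategies only work when trading is free.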

Backtest overfitting

You ran 200 parameter combinations and the best one delivered 22%. Congratulations: you’ve discovered noise in your dataset, not signal. By construction, the best of 200 trials on the same data will look impressive even if the underlying strategy has no edge.
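A small simulation makes the point concrete. The sketch below generates 200 "strategies" whose yearly returns are pure zero-mean noise, then reports the best ten-year average alongside the average across all 200. The volatility, trial count, and the `best_of_noise` helper are assumptions chosen for illustration.

```python
import random
import statistics


def best_of_noise(n_trials=200, n_years=10, annual_vol=0.15, seed=0):
    """Simulate edge-free strategies and return (best average, mean of averages)."""
    rng = random.Random(seed)
    averages = []
    for _ in range(n_trials):
        # Each yearly return is drawn from zero-mean noise: there is no edge.
        yearly = [rng.gauss(0.0, annual_vol) for _ in range(n_years)]
        averages.append(statistics.mean(yearly))
    return max(averages), statistics.mean(averages)


best, typical = best_of_noise()
print(f"best of 200: {best:.1%} per year, average trial: {typical:.1%} per year")
```

The average trial sits near zero, as it should, while the winner typically looks like a strategy worth trading. Picking the top parameter set from a grid search selects for exactly this kind of luck.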

Fix: walk-forward testing. Develop on one window, test on a later out-of-sample window you never touched during parameter selection. If the out-of-sample period underperforms substantially, the rule was likely overfit. Bonferroni-style corrections for multiple-testing apply here too.
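The mechanics of walk-forward testing can be sketched as a generator of rolling index windows, where each test window starts strictly after its training window ends. The function names below are hypothetical, and the Bonferroni helper shows only the simplest form of the correction: divide the significance threshold by the number of trials.

```python
def walk_forward_windows(n_periods, train_size, test_size):
    """Yield (train, test) index ranges; each test range follows its train range."""
    start = 0
    while start + train_size + test_size <= n_periods:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # roll forward by one test window


def bonferroni_alpha(alpha, n_trials):
    """Per-trial significance threshold after a Bonferroni correction."""
    return alpha / n_trials


# Ten years of data, six to fit parameters, two to test, rolled forward.
for train, test in walk_forward_windows(10, 6, 2):
    print("fit on years", list(train), "-> test on years", list(test))

# With 200 parameter combinations, a 5% threshold becomes 0.025% per trial.
print(bonferroni_alpha(0.05, 200))
```

Parameters are chosen using only the training indices of each window; the test indices are scored once and never fed back into selection.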

Calendar and currency cleanup

Subtler than the rest, but important:

  • Time zones. “Closing price on date X” depends on which exchange and which time zone X refers to. Cross-listed names bite first.
  • Corporate actions. Stock splits and dividends have to be back-adjusted consistently. Yahoo Finance gives you adjusted prices; sometimes the adjustments are off for small caps or delisted names.
  • Currency. A non-USD strategy backtest needs FX adjustment at the same point in time as the prices, not at today’s rate.
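The currency point is easy to get wrong, so here is a minimal sketch of converting at the rate in force on each price date. The EUR prices, EURUSD rates, and the `usd_return` helper are invented for illustration.

```python
from datetime import date

# Hypothetical EUR closing prices and EURUSD rates on the same dates.
PRICES_EUR = {date(2018, 3, 29): 100.0, date(2019, 3, 29): 110.0}
EURUSD = {date(2018, 3, 29): 1.23, date(2019, 3, 29): 1.12}


def usd_return(prices, fx, start, end):
    """Return in USD, converting each price at the FX rate from its own date."""
    start_usd = prices[start] * fx[start]
    end_usd = prices[end] * fx[end]
    return end_usd / start_usd - 1.0


start, end = date(2018, 3, 29), date(2019, 3, 29)
local = PRICES_EUR[end] / PRICES_EUR[start] - 1.0
print(f"local return: {local:.2%}, USD return: {usd_return(PRICES_EUR, EURUSD, start, end):.2%}")
```

With these made-up numbers the 10% local gain shrinks to well under 1% in USD. Converting both prices at a single rate, such as today's, would cancel the FX move and report the 10% instead.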

The honest closing

A backtest is a hypothesis, not a result. The version of you who will trade the strategy live doesn’t have access to next year’s returns — only to today’s beliefs. The point of correcting for biases isn’t to make the backtest look worse; it’s to make the backtest tell you something true about what to expect.

If after correcting for all of the above, your strategy still shows a clean edge: trade a small amount of real money for six months and see if the live numbers match. If they don’t, you’ve learned the cheap version of the lesson everyone else learns the expensive way.
