Why Your Backtest Is Lying to You: Data Biases That Ruin Equity Research
Survivorship bias, look-ahead bias, point-in-time data, and the rest of the subtle ways your beautiful backtest is overstating returns. With concrete fixes for each.
You built a backtest. It returns 18% annualized over the last decade. The Sharpe is 1.4. The drawdown is acceptable. You’re already mentally apartment-shopping in the city you’ll move to after the strategy makes you wealthy.
It’s probably wrong. Not because your code has bugs — backtest code can be flawless and still produce numbers that have almost nothing to do with what would have happened if you’d traded the strategy live. The reasons are subtle and they have names.
Survivorship bias
The most famous one. If your historical universe is “S&P 500 today, backtested over ten years,” you’ve quietly excluded every company that was in the S&P 500 at some point and got booted out — usually because it failed.
The companies that died don’t show up in the dataset. Your historical “S&P 500” is actually “the winning subset of the S&P 500 visible from the future.” Returns are biased upward, sometimes by several percentage points a year for small-cap or emerging-market strategies.
Fix: use a point-in-time index membership database, or build one. CRSP and similar paid datasets handle this; free options require more work and willingness to scrape historical index constituent lists.
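A point-in-time membership lookup can be sketched as follows. This is a minimal illustration, assuming you have already assembled add/remove dates per ticker; the `Membership` record, the `HISTORY` rows, and `constituents_as_of` are hypothetical names and made-up data, not any vendor's schema.

```python
from datetime import date
from typing import NamedTuple, Optional


class Membership(NamedTuple):
    ticker: str
    added: date
    removed: Optional[date]  # None means the ticker is still in the index


# Hypothetical history; real rows would come from a paid dataset or
# scraped historical constituent lists.
HISTORY = [
    Membership("AAA", date(2010, 1, 4), None),
    Membership("BBB", date(2012, 6, 1), date(2016, 3, 15)),  # later dropped
    Membership("CCC", date(2017, 9, 1), None),               # added later
]


def constituents_as_of(history, as_of):
    """Tickers that were index members on `as_of` (removal date is exclusive)."""
    return sorted(
        m.ticker
        for m in history
        if m.added <= as_of and (m.removed is None or as_of < m.removed)
    )


universe_2015 = constituents_as_of(HISTORY, date(2015, 1, 2))
universe_2020 = constituents_as_of(HISTORY, date(2020, 1, 2))
print(universe_2015, universe_2020)
```

Calling the lookup at every rebalance date keeps dropped names like `BBB` in the periods when they were actually tradable, instead of pretending they never existed.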
Look-ahead bias
You computed the trailing-twelve-month P/E using today’s reported financials. Trouble: those financials weren’t available at the time. Earnings get reported 30–90 days after a quarter ends, and restatements happen after that.
If you’re ranking companies on March 31, 2018, you should be using financials that were actually filed and disclosed by March 31, 2018 — not the latest revision visible today.
Fix: use point-in-time fundamentals. EDGAR preserves the original filing dates, so it can be done, but it requires care. Many data vendors hand you the latest-available value for each fiscal period with no filing timestamp, and that missing timestamp is where the bias comes from.
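One way to picture the difference is to store every filing with its own filing date and filter on that date before ranking. The sketch below uses made-up EPS numbers and hypothetical helper names (`eps_known_on`, `latest_revision`); it assumes you have already extracted period end, filing date, and value from the filings.

```python
from datetime import date

# Hypothetical filings for one company: (period_end, filed_on, eps)
FILINGS = [
    (date(2017, 9, 30), date(2017, 11, 2), 1.10),
    (date(2017, 12, 31), date(2018, 2, 20), 1.25),
    (date(2017, 12, 31), date(2018, 8, 10), 0.95),  # later restatement
    (date(2018, 3, 31), date(2018, 5, 3), 1.30),
]


def eps_known_on(filings, as_of):
    """EPS for the most recent period, using only filings made by `as_of`."""
    visible = [f for f in filings if f[1] <= as_of]
    if not visible:
        return None
    # Newest period wins; within a period, the newest filing visible so far.
    _period_end, _filed_on, eps = max(visible, key=lambda f: (f[0], f[1]))
    return eps


def latest_revision(filings, period_end):
    """What a 'latest value' feed shows today for a period (look-ahead)."""
    versions = [f for f in filings if f[0] == period_end]
    return max(versions, key=lambda f: f[1])[2]


rank_date = date(2018, 3, 31)
point_in_time_eps = eps_known_on(FILINGS, rank_date)
lookahead_eps = latest_revision(FILINGS, date(2017, 12, 31))
print(point_in_time_eps, lookahead_eps)
```

On the March 31, 2018 rank date, the point-in-time lookup sees only the original fourth-quarter filing, while the "latest revision" lookup returns a restated figure that was not filed until months later.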
Universe selection bias
A close cousin of survivorship. If you backtest on “stocks that have at least 10 years of clean fundamental data,” you’ve selected for stability — companies that didn’t go bankrupt, didn’t merge, didn’t get acquired, didn’t change reporting standards. Your universe is structurally biased toward survivors.
Fix: define your universe as a point-in-time set (“Russell 3000 as of the rebalance date”) and accept that some constituents will have noisy or incomplete data.
Slippage and transaction costs
You assumed you traded at the closing price with zero cost. In reality, large orders move the market, bid-ask spreads exist, and small-cap stocks can be impossible to fill at the quoted price.
A strategy that backtests at a 15% gross return often nets only 8–10% once slippage and costs are charged. The smaller your average market cap and the higher your turnover, the worse the gap.
Fix: model slippage explicitly. A reasonable starting point is 20–50 basis points per round-trip for liquid stocks, 100+ bps for small caps. Charge it to every trade.
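A flat per-trade charge is the simplest version of this. The sketch below is one possible cost model, not a standard: it assumes turnover is expressed as the fraction of the portfolio replaced each period, and the function names and example numbers are illustrative.

```python
def net_returns(gross_returns, turnovers, round_trip_bps):
    """Subtract a flat trading cost from each period's gross return.

    `turnovers` is the fraction of the portfolio sold and rebought in each
    period (1.0 means the whole book made a round trip).
    """
    cost = round_trip_bps / 10_000  # basis points to a decimal fraction
    return [g - t * cost for g, t in zip(gross_returns, turnovers)]


def annualized(returns, periods_per_year):
    """Geometric annualized return from a list of per-period returns."""
    growth = 1.0
    for r in returns:
        growth *= 1 + r
    return growth ** (periods_per_year / len(returns)) - 1


# Illustrative inputs: 1.2% gross per month, half the book traded monthly,
# 50 bps charged per round trip.
gross = [0.012] * 12
turnover = [0.5] * 12
net = net_returns(gross, turnover, round_trip_bps=50)

gross_annual = annualized(gross, 12)
net_annual = annualized(net, 12)
print(f"gross {gross_annual:.2%}, net {net_annual:.2%}")
```

Re-running the same backtest with the small-cap figure (100+ bps) in place of 50 shows quickly whether the edge survives realistic costs.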
Backtest overfitting
You ran 200 parameter combinations and the best one delivered 22%. Congratulations: you’ve discovered noise in your dataset, not signal. By construction, the best of 200 trials on the same data will look impressive even if the underlying strategy has no edge.
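A small simulation makes the point concrete. The sketch below generates 200 strategies whose daily returns are pure noise with zero true edge, then reports the best one; the 1% daily volatility and the fixed seed are assumptions chosen only for illustration.

```python
import random

random.seed(7)  # fixed seed so the sketch is repeatable

N_STRATEGIES = 200
N_DAYS = 2520     # roughly ten years of trading days
DAILY_VOL = 0.01  # assumed 1% daily volatility, zero expected return


def annualized_mean_return(n_days, daily_vol):
    """Average daily return of a pure-noise strategy, scaled to one year."""
    total = sum(random.gauss(0.0, daily_vol) for _ in range(n_days))
    return total / n_days * 252


results = [annualized_mean_return(N_DAYS, DAILY_VOL) for _ in range(N_STRATEGIES)]
best = max(results)
average = sum(results) / len(results)
print(f"average of {N_STRATEGIES} trials: {average:+.2%}, best trial: {best:+.2%}")
```

None of these strategies has any edge by construction, yet the best trial still stands well clear of the average. That gap is what selecting the top parameter set hands you for free.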
Fix: walk-forward testing. Develop on one window, test on a later out-of-sample window you never touched during parameter selection. If the out-of-sample period underperforms substantially, the rule was likely overfit. Bonferroni-style corrections for multiple-testing apply here too.
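The windowing itself is easy to get wrong, so here is a minimal sketch of a rolling walk-forward splitter. The function name and window sizes are illustrative; the only property it is meant to guarantee is that every test window comes strictly after its training window.

```python
def walk_forward_splits(n_obs, train_size, test_size):
    """Yield (train, test) index ranges; each test window follows its train window."""
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # roll forward by one test window


# Example: 10 periods, fit on 4, evaluate on the next 2.
splits = list(walk_forward_splits(n_obs=10, train_size=4, test_size=2))
for train, test in splits:
    print(f"train {train.start}-{train.stop - 1}, test {test.start}-{test.stop - 1}")
```

Select parameters using only the training range of each split, then record performance on the matching test range. On the multiple-testing side, a plain Bonferroni correction for 200 trials at a 5% significance level would require each individual trial to clear 0.05 / 200 = 0.00025.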
Calendar and currency cleanup
Subtler than the rest, but important:
- Time zones. “Closing price on date X” depends on which exchange and which time zone X refers to. Cross-listed names bite first.
- Corporate actions. Stock splits and dividends have to be back-adjusted consistently. Yahoo Finance gives you adjusted prices; sometimes the adjustments are off for small caps or delisted names.
- Currency. A non-USD strategy backtest needs FX adjustment at the same point in time as the prices, not at today’s rate.
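For the currency point, the conversion can look up the rate that was in effect on the price date. The sketch below uses made-up EUR/USD rates and hypothetical helper names; falling back to the most recent earlier rate when a date is missing is an assumption you may want to change.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical EUR/USD closes (USD per 1 EUR), sorted by date.
FX_DATES = [date(2018, 3, 28), date(2018, 3, 29), date(2018, 4, 3)]
FX_RATES = [1.2310, 1.2300, 1.2270]


def usd_per_eur_as_of(as_of):
    """Most recent rate on or before `as_of`; never a later one."""
    i = bisect_right(FX_DATES, as_of)
    if i == 0:
        raise ValueError(f"no FX rate available on or before {as_of}")
    return FX_RATES[i - 1]


def to_usd(price_eur, price_date):
    """Convert a EUR price using the rate known on the price date."""
    return price_eur * usd_per_eur_as_of(price_date)


# No rate exists for March 30, so the March 29 rate is used.
value_usd = to_usd(100.0, date(2018, 3, 30))
print(round(value_usd, 2))
```

The same on-or-before lookup pattern applies to split and dividend adjustment factors: apply only the adjustments that had taken effect by the date you are valuing.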
The honest closing
A backtest is a hypothesis, not a result. The version of you who will trade the strategy live doesn’t have access to next year’s returns — only to today’s beliefs. The point of correcting for biases isn’t to make the backtest look worse; it’s to make the backtest tell you something true about what to expect.
If, after correcting for all of the above, your strategy still shows a clean edge, trade a small amount of real money for six months and see whether the live numbers match. If they don’t, you’ve learned the cheap version of the lesson everyone else learns the expensive way.