Building a Monte Carlo Retirement Simulator in Python
How to model thousands of possible market paths instead of trusting one average return — a working Python approach to retirement projections and the assumptions that make or break it.
A retirement projection built on a single number — “assume 7% per year” — answers the wrong question. It tells you what happens on one perfectly smooth path that the market has never actually delivered. Real returns arrive jagged: a strong decade, a flat stretch, a crash three years before you plan to stop working. A Monte Carlo simulation replaces that single path with thousands of them, and the output stops being a number and becomes a distribution. This walks through building one in Python and, more importantly, understanding what it can and cannot tell you.
Why a Single Average Return Lies to You
The standard projection uses the deterministic compound-growth formula: start with a balance, multiply by (1 + r) every year, subtract withdrawals. It is easy to put in a spreadsheet and it produces a clean curve. The problem is that the clean curve corresponds to a sequence of identical annual returns, and no such sequence has ever occurred.
The full set of returns over 30 years does determine your ending balance if you never withdraw — multiplication is commutative. But the moment you take money out, the order of returns matters. A bad year early in retirement forces you to sell more shares to fund the same withdrawal, and those shares are gone before any recovery can compound them. Two retirees with the identical 30-year average can end up with wildly different outcomes depending on when the bad years landed. A single-average projection cannot see this. It computes one path and hides the distribution of paths around it.
What a Monte Carlo Simulation Actually Does
The technique is mechanically simple. You assume a distribution for annual returns — for example, a mean and standard deviation estimated from long-run historical data. Then you:
- Draw a random return for each year of the projection horizon.
- Run your withdrawal logic year by year on that random sequence.
- Record whether the portfolio survived and what it ended at.
- Repeat thousands of times.
The result is not “your portfolio will be worth X.” It is “across 10,000 simulated futures, the portfolio survived in 87% of them, and here is the spread of ending balances.” That success rate, and the shape of the distribution, is the actual deliverable.
A Minimal Simulator in Python
The core is about 20 lines of NumPy. Everything else is interpretation.
import numpy as np
def simulate(start, withdrawal, years, mean, std, trials=10_000): # Draw a (trials x years) matrix of random annual returns returns = np.random.normal(mean, std, size=(trials, years)) balances = np.full(trials, float(start)) survived = np.ones(trials, dtype=bool)
for y in range(years): balances = balances * (1 + returns[:, y]) - withdrawal survived &= balances > 0 balances = np.clip(balances, 0, None)
return { "success_rate": survived.mean(), "p10": np.percentile(balances, 10), "p50": np.percentile(balances, 50), "p90": np.percentile(balances, 90), }
result = simulate( start=1_000_000, withdrawal=45_000, years=30, mean=0.05, std=0.12, # illustrative real-return assumptions)print(result)The numbers passed in here are illustrative, not a recommendation. A real run should use return and volatility assumptions you can defend, expressed in real (inflation-adjusted) terms so the withdrawal stays in constant dollars.
Reading the Output: Percentiles, Not Averages
The instinct is to look at the median ending balance and call it the answer. Resist it. The two outputs that matter are the success rate and the lower percentiles.
The success rate is the fraction of simulated paths where the portfolio never hit zero. A 95% success rate means 1 in 20 simulated futures ran out of money. Whether that is acceptable is a personal judgment, not a technical one — but you cannot make the judgment without the number.
The lower percentiles describe the bad-but-not-impossible cases. The P10 ending balance is the outcome that 90% of paths beat. If the P10 is a comfortable cushion, the plan is robust. If the P10 is zero, the plan depends on the market cooperating, and the median balance is cold comfort. This is the same reason you look at P99 latency instead of average latency when you operate a service: the average hides the tail, and the tail is what hurts.
Where the Model Is Wrong
A simulator is only as honest as its assumptions, and the minimal version above makes several that are convenient rather than accurate.
Normal distributions understate tail risk. Real market returns have fatter tails than a normal distribution — extreme years happen more often than the bell curve predicts. Drawing from np.random.normal produces crashes that are too mild and too rare. Using a Student’s t-distribution, or resampling from actual historical years (bootstrap), gives a more realistic tail.
Returns are not independent. The model treats each year as an independent draw. Real markets show some mean reversion over long horizons and volatility clustering over short ones. Independent draws miss both.
Fixed withdrawals are unrealistic. Almost nobody spends an identical inflation-adjusted amount through a 25% drawdown. Modeling a flexible rule — spending less in bad years — changes the success rate substantially and reflects how people actually behave.
Taxes and fees are missing. The minimal model ignores both. A 0.5% annual fee and the tax drag on a taxable account are real subtractions from the return every single year.
The point of building your own simulator is not that it beats established tools like cFIREsim or the research behind the 4% rule. It is that writing the loop forces you to make every assumption explicit. You cannot hand-wave the return distribution when you have to type it into np.random. That explicitness — seeing exactly which inputs the answer depends on — is worth more than the answer itself.
FAQ
How many trials should I run? +
Should I use historical returns or a fitted distribution? +
Is a homemade simulator good enough, or should I use a financial advisor's tools? +
Related reading
2026-05-21
The Behavioral Biases That Quietly Cost Investors the Most
Loss aversion, recency bias, overconfidence, and herding are predictable bugs in how people invest. Here is how each one works and how to build guardrails instead of relying on willpower.
2026-05-21
Bonds for Developers: What Fixed Income Actually Does in a Portfolio
A practical literacy guide to bonds, duration, and why fixed income plays a specific role in a portfolio — written for developers who already understand stock-index investing.
2026-05-21
Brokerage APIs for Algorithmic Trading: Alpaca, Interactive Brokers, and Tradier Compared
A developer-focused comparison of Alpaca, IBKR, and Tradier brokerage APIs — auth, paper trading, order types, rate limits, and honest caveats.
2026-05-21
Building a Stock Screener: From Data Source to a Ranked List
A pipeline-focused engineering guide to building a stock screener: universe selection, data sources, ranking metrics, storage, and the pitfalls that quietly break each stage.
2026-05-21
Dollar-Cost Averaging vs Lump-Sum Investing: What the Data Actually Says
Historical data favors deploying a lump sum immediately, but DCA reduces regret and worst-case outcomes. Here is how to think through the tradeoff rigorously.
Get the best tools, weekly
One email every Friday. No spam, unsubscribe anytime.