Optuna Tutorial: Automate Hyperparameter Tuning for ML Models in Python
How Optuna's define-by-run API, TPE sampler, and pruners automate hyperparameter tuning for scikit-learn, PyTorch, and TensorFlow models, with runnable Python code.
Grid search has a scaling problem. Tune four hyperparameters with five candidate values each and you have 625 model fits to run. Add a fifth parameter and it jumps to 3,125. Random search trims the count, but it still spends trials on regions of the search space a smarter method would have abandoned after the first few results. Optuna, the open-source optimization framework maintained by Preferred Networks, takes a different approach: it treats tuning as a sequential optimization problem and uses the outcome of past trials to decide what to evaluate next.
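The arithmetic above is easy to check by enumerating a grid directly. The sketch below is only an illustration: grid_size is a hypothetical helper, not part of any library, and it counts configurations rather than cross-validation fits.

```python
from itertools import product


def grid_size(n_params: int, values_per_param: int) -> int:
    """Count the configurations in a full grid by enumerating every combination."""
    grid = product(range(values_per_param), repeat=n_params)
    return sum(1 for _ in grid)


four_params = grid_size(4, 5)
five_params = grid_size(5, 5)
print(four_params, five_params)
```

With k-fold cross-validation, each configuration costs k fits, so the real bill is higher still.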
We ran Optuna against several scikit-learn and PyTorch models to see where it fits a real tuning workflow. This is what the framework does, where the speedups come from, and how to wire it into training code you already have.
How the define-by-run API works
Most tuning libraries make you declare the entire search space up front as a static dictionary. Optuna uses what it calls a define-by-run API: the search space is constructed dynamically while the objective function executes. You write a plain Python function, ask for parameter values inside it with suggest_* calls, and return a score.
    import optuna
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    # Example data so the snippet runs as-is; swap in your own X and y.
    X, y = load_breast_cancer(return_X_y=True)

    def objective(trial):
        # Each suggest_* call declares a parameter's range and samples a value from it.
        n_estimators = trial.suggest_int("n_estimators", 50, 500)
        max_depth = trial.suggest_int("max_depth", 2, 32)
        max_features = trial.suggest_float("max_features", 0.1, 1.0)

        clf = RandomForestClassifier(
            n_estimators=n_estimators,
            max_depth=max_depth,
            max_features=max_features,
        )
        # The returned value is the score Optuna optimizes.
        return cross_val_score(clf, X, y, cv=3).mean()

    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=100)
    print(study.best_params)

The payoff is conditional search spaces. Because the suggestions are ordinary Python calls, you can branch on them: pick suggest_categorical("classifier", ["svm", "rf"]) first, then suggest C and gamma only when the SVM branch runs. A static grid can’t express that without enumerating invalid combinations and wasting trials on them. Each call to study.optimize runs n_trials evaluations, and study.best_params and study.best_value hold the winner afterward. For multi-objective problems, such as trading accuracy against inference latency, return a tuple instead of a single score and pass directions=[...] (one entry per objective) to create_study; the results then live in study.best_trials, the Pareto-optimal set, rather than in best_params.
Pruners and samplers: the two levers
Optuna’s speed comes from two components you can swap independently.
The sampler decides which values to try. The default is the Tree-structured Parzen Estimator (TPE), which models the relationship between parameter values and scores, then draws new candidates from the regions that have performed well. Optuna also ships RandomSampler, GridSampler, CmaEsSampler, and NSGAIISampler for multi-objective work. You change the strategy with one argument: optuna.create_study(sampler=optuna.samplers.CmaEsSampler()).
The pruner stops unpromising trials before they finish. For any model trained iteratively — gradient-boosted trees, neural networks — you report an intermediate score after each step and let Optuna kill trials tracking well below the others.
    import optuna

    # build_model, train_one_epoch, and validate are placeholders for your own training code.
    def objective(trial):
        lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
        model = build_model(lr)

        for epoch in range(30):
            train_one_epoch(model)
            accuracy = validate(model)
            # Report the intermediate score so the pruner can compare it with other trials.
            trial.report(accuracy, epoch)
            if trial.should_prune():
                raise optuna.TrialPruned()
        return accuracy

    study = optuna.create_study(
        direction="maximize",
        pruner=optuna.pruners.MedianPruner(),
    )

MedianPruner cuts a trial when its best intermediate score so far is worse than the median of earlier trials' scores at the same step. SuccessiveHalvingPruner and HyperbandPruner allocate budget more aggressively. The log=True flag on suggest_float matters here too: learning rates and regularization strengths span orders of magnitude, and a log-uniform scale spreads trials evenly across them instead of clustering near the high end.
Wiring Optuna into PyTorch, TensorFlow, and scikit-learn
Optuna is framework-agnostic because the objective function is just Python; whatever runs inside it is up to you. For the iterative-training case, the optuna.integration module provides callbacks that handle the report and should_prune plumbing, with hooks for PyTorch Lightning, Keras, XGBoost, and LightGBM, so you don’t hand-write the pruning loop. In recent Optuna releases these integrations ship in the separate optuna-integration package, so install it alongside optuna if the import fails.
Two features matter once you move past a laptop notebook. First, storage: pass storage="sqlite:///optuna.db" to create_study and every trial is persisted to disk. Kill the process and resume with load_if_exists=True, and the study continues from where it stopped.
    study = optuna.create_study(
        study_name="rf-tuning",
        storage="sqlite:///optuna.db",
        direction="maximize",  # match the accuracy-returning objective
        load_if_exists=True,
    )

Second, parallelism: point multiple processes or machines at the same database backend (SQLite for a single box, PostgreSQL or MySQL for a cluster) and they share one study, each pulling trials and writing results back. There is no separate scheduler to stand up. The companion optuna-dashboard package reads the same storage and renders trial history, parameter importance, and optimization curves in the browser.
A tuning run that won’t waste your afternoon
Start small and let the data tell you where to spend. Define your objective, run 100 trials with the default TPE sampler and MedianPruner, and open the dashboard. The parameter-importance chart ranks which hyperparameters actually moved the score; often one or two dominate and the rest are noise. Freeze the parameters that don’t matter, narrow the ranges of the ones that do, and run another 100 trials on the smaller space. Two or three rounds of that loop usually beat hours of manual tuning, and because every trial sits in the storage backend, you can stop and resume between rounds without losing history. If the score stalls, swap the sampler before you reach for a bigger trial budget: CmaEsSampler often does better on smooth, continuous spaces.