Optuna Tutorial: Automate Hyperparameter Tuning for ML Models in Python
How Optuna's define-by-run API, TPE sampler, and pruners automate hyperparameter tuning for scikit-learn, PyTorch, and TensorFlow models, with runnable Python code.
Grid search has a scaling problem. Tune four hyperparameters with five candidate values each and you have 625 model fits to run. Add a fifth parameter and it jumps to 3,125. Random search trims the count, but it still spends trials on regions of the search space a smarter method would have abandoned after the first few results. Optuna, the open-source optimization framework maintained by Preferred Networks, takes a different approach: it treats tuning as a sequential optimization problem and uses the outcome of past trials to decide what to evaluate next.
We ran Optuna against several scikit-learn and PyTorch models to see where it fits a real tuning workflow. This is what the framework does, where the speedups come from, and how to wire it into training code you already have.
How the define-by-run API works
Most tuning libraries make you declare the entire search space up front as a static dictionary. Optuna uses what it calls a define-by-run API: the search space is constructed dynamically while the objective function executes. You write a plain Python function, ask for parameter values inside it with suggest_* calls, and return a score.
    import optuna
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    # Example dataset (an assumption); substitute your own features and labels.
    X, y = load_breast_cancer(return_X_y=True)

    def objective(trial):
        # Each suggest_* call both defines a parameter and samples a value for it.
        n_estimators = trial.suggest_int("n_estimators", 50, 500)
        max_depth = trial.suggest_int("max_depth", 2, 32)
        max_features = trial.suggest_float("max_features", 0.1, 1.0)

        clf = RandomForestClassifier(
            n_estimators=n_estimators,
            max_depth=max_depth,
            max_features=max_features,
        )
        # The returned value is the score Optuna optimizes.
        return cross_val_score(clf, X, y, cv=3).mean()

    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=100)
    print(study.best_params)

The payoff is conditional search spaces. Because the suggestions are ordinary Python calls, you can branch on them: call trial.suggest_categorical("classifier", ["svm", "rf"]) first, then suggest C and gamma only when the SVM branch runs. A static grid can’t express that without enumerating invalid combinations and wasting trials on them. Each call to study.optimize runs n_trials evaluations, and study.best_params and study.best_value hold the winner afterward. Return a tuple instead of a single score and the same API handles multi-objective problems, such as trading accuracy against inference latency.
Pruners and samplers: the two levers
Optuna’s speed comes from two components you can swap independently.
The sampler decides which values to try. The default is the Tree-structured Parzen Estimator (TPE), which splits past trials into a better-scoring group and a worse-scoring group, fits a density to the parameter values in each, and favors new candidates that look far more likely under the better group than under the worse one. Optuna also ships RandomSampler, GridSampler, CmaEsSampler, and NSGAIISampler for multi-objective work. You change the strategy with one argument: optuna.create_study(sampler=optuna.samplers.CmaEsSampler()).
The pruner stops unpromising trials before they finish. For any model trained iteratively — gradient-boosted trees, neural networks — you report an intermediate score after each step and let Optuna kill trials tracking well below the others.
    # build_model, train_one_epoch, and validate are placeholders
    # for your own training code.
    def objective(trial):
        lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
        model = build_model(lr)

        for epoch in range(30):
            train_one_epoch(model)
            accuracy = validate(model)
            # Report the intermediate score so the pruner can compare trials.
            trial.report(accuracy, epoch)
            if trial.should_prune():
                raise optuna.TrialPruned()
        return accuracy

    study = optuna.create_study(
        direction="maximize",
        pruner=optuna.pruners.MedianPruner(),
    )

MedianPruner cuts a trial when its best intermediate score so far is worse than the median of the scores earlier trials reported at the same step. SuccessiveHalvingPruner and HyperbandPruner allocate budget more aggressively. The log=True flag on suggest_float matters here too: learning rates and regularization strengths span orders of magnitude, and a log-uniform scale spreads trials evenly across them instead of clustering near the high end.
Wiring Optuna into PyTorch, TensorFlow, and scikit-learn
Optuna is framework-agnostic because the objective function is just Python: whatever runs inside it is up to you. For the iterative-training case, the optuna.integration module provides callbacks that handle the report and should_prune plumbing for you, with hooks for PyTorch Lightning, Keras, XGBoost, and LightGBM, so you don’t hand-write the pruning loop. In recent Optuna releases these integrations ship in the separate optuna-integration package, so install it alongside optuna if the import fails.
Two features matter once you move past a laptop notebook. First, storage: pass storage="sqlite:///optuna.db" to create_study and every trial is persisted to disk. Kill the process, call create_study again with the same study_name and load_if_exists=True, and the study continues from where it stopped.
    study = optuna.create_study(
        study_name="rf-tuning",
        storage="sqlite:///optuna.db",
        load_if_exists=True,
    )

Second, parallelism: point multiple processes or machines at the same database backend (SQLite for a single box, PostgreSQL or MySQL for a cluster) and they share one study, each pulling trials and writing results back. There is no separate scheduler to stand up. The companion optuna-dashboard package reads the same storage and renders trial history, parameter importance, and optimization curves in the browser.
A tuning run that won’t waste your afternoon
Start small and let the data tell you where to spend. Define your objective, run 100 trials with the default TPE sampler and MedianPruner, and open the dashboard. The parameter-importance chart ranks which hyperparameters actually moved the score; often one or two dominate and the rest are noise. Freeze the parameters that don’t matter, narrow the ranges of the ones that do, and run another 100 trials on the smaller space. Two or three rounds of that loop usually beat hours of manual tuning, and because every trial sits in the storage backend, you can stop and resume between rounds without losing history. If the score stalls, swap the sampler (CmaEsSampler often does better on smooth, continuous spaces) before you reach for a bigger trial budget.