Optuna Tutorial: Automate Hyperparameter Tuning for ML Models in Python

Grid search has a scaling problem. Tune four hyperparameters with five candidate values each and you have 625 model fits to run. Add a fifth parameter and it jumps to 3,125. Random search trims the count, but it still spends trials on regions of the search space a smarter method would have abandoned after the first few results. Optuna, the open-source optimization framework maintained by Preferred Networks, takes a different approach: it treats tuning as a sequential optimization problem and uses the outcome of past trials to decide what to evaluate next.
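The arithmetic is easy to verify. A minimal sketch, using placeholder candidate values rather than real hyperparameters, that counts the combinations an exhaustive grid would have to fit:

```python
from itertools import product

# Hypothetical grid: the five candidate values are placeholders.
candidates = [0, 1, 2, 3, 4]


def grid_size(n_params: int) -> int:
    """Count every combination an exhaustive grid search would fit."""
    return sum(1 for _ in product(candidates, repeat=n_params))


four = grid_size(4)
five = grid_size(5)
print(four, five)  # 625 3125
```

Each extra parameter multiplies the count by the number of candidate values, which is why the cost grows exponentially with the number of tuned parameters.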

We ran Optuna against several scikit-learn and PyTorch models to see where it fits a real tuning workflow. This is what the framework does, where the speedups come from, and how to wire it into training code you already have.

How the define-by-run API works

Most tuning libraries make you declare the entire search space up front as a static dictionary. Optuna uses what it calls a define-by-run API: the search space is constructed dynamically while the objective function executes. You write a plain Python function, ask for parameter values inside it with suggest_* calls, and return a score.

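import optuna
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Example data so the snippet runs as-is; swap in your own X and y.
X, y = load_breast_cancer(return_X_y=True)

def objective(trial):
    # Each suggest_* call defines one dimension of the search space.
    n_estimators = trial.suggest_int("n_estimators", 50, 500)
    max_depth = trial.suggest_int("max_depth", 2, 32)
    max_features = trial.suggest_float("max_features", 0.1, 1.0)

    clf = RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        max_features=max_features,
    )
    # The mean 3-fold cross-validation score is what Optuna maximizes.
    return cross_val_score(clf, X, y, cv=3).mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=100)
print(study.best_params)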

The payoff is conditional search spaces. Because the suggestions are ordinary Python calls, you can branch on them: pick suggest_categorical("classifier", ["svm", "rf"]) first, then suggest C and gamma only when the SVM branch runs. A static grid can’t express that without enumerating invalid combinations and wasting trials on them. Each call to study.optimize runs n_trials evaluations, and study.best_params and study.best_value hold the winner afterward. Return a tuple instead of a single score, pass a matching directions list to create_study (for example directions=["maximize", "minimize"]), and the same API handles multi-objective problems, such as trading accuracy against inference latency. A multi-objective study has no single winner, so read the Pareto-optimal trials from study.best_trials rather than best_params.

Pruners and samplers: the two levers

Optuna’s speed comes from two components you can swap independently.

The sampler decides which values to try. The default is the Tree-structured Parzen Estimator (TPE), which models the relationship between parameter values and scores, then draws new candidates from the regions that have performed well. Optuna also ships RandomSampler, GridSampler, CmaEsSampler, and NSGAIISampler for multi-objective work. You change the strategy with one argument: optuna.create_study(sampler=optuna.samplers.CmaEsSampler()).

The pruner stops unpromising trials before they finish. For any model trained iteratively — gradient-boosted trees, neural networks — you report an intermediate score after each step and let Optuna kill trials tracking well below the others.

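import optuna

# build_model, train_one_epoch, and validate are placeholders for your own training code.
def objective(trial):
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    model = build_model(lr)

    for epoch in range(30):
        train_one_epoch(model)
        accuracy = validate(model)
        # Report the intermediate score so the pruner can compare this trial with earlier ones.
        trial.report(accuracy, epoch)
        if trial.should_prune():
            raise optuna.TrialPruned()
    return accuracy

study = optuna.create_study(
    direction="maximize",
    pruner=optuna.pruners.MedianPruner(),
)
study.optimize(objective, n_trials=100)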

MedianPruner cuts a trial when its best intermediate score so far is worse than the median of what earlier trials reported at the same step. SuccessiveHalvingPruner and HyperbandPruner allocate budget more aggressively. The log=True flag on suggest_float matters here too: learning rates and regularization strengths span orders of magnitude, and a log-uniform scale spreads trials evenly across them instead of clustering near the high end. Sampled linearly, about 90% of draws from the range 1e-5 to 1e-1 would land above 1e-2.
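The effect of the log scale is easy to check with the standard library alone. This is an illustration of the idea, not Optuna's internal sampling code; the sample size and the 1e-3 threshold are arbitrary choices.

```python
import math
import random

rng = random.Random(0)
LOW, HIGH = 1e-5, 1e-1  # same range as the lr suggestion above
N = 10_000

# Linear scale: draw uniformly between LOW and HIGH.
linear = [rng.uniform(LOW, HIGH) for _ in range(N)]

# Log scale: draw the exponent uniformly, then exponentiate.
log_uniform = [
    10 ** rng.uniform(math.log10(LOW), math.log10(HIGH)) for _ in range(N)
]


def share_below(samples, threshold):
    """Fraction of samples that fall below the threshold."""
    return sum(s < threshold for s in samples) / len(samples)


linear_share = share_below(linear, 1e-3)
log_share = share_below(log_uniform, 1e-3)
print(f"below 1e-3: linear={linear_share:.3f}, log={log_share:.3f}")
```

On the linear scale roughly 1% of draws fall below 1e-3, even though that region covers half of the orders of magnitude in the range; on the log scale it receives about half of the draws.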

Wiring Optuna into PyTorch, TensorFlow, and scikit-learn

Optuna is framework-agnostic because the objective function is just Python, and whatever runs inside it is up to you. For the iterative-training case, the optuna.integration module provides callbacks that handle the report and should_prune plumbing for you, with hooks for PyTorch Lightning, Keras, XGBoost, and LightGBM so you don’t hand-write the pruning loop. In recent Optuna releases these integrations are distributed as the separate optuna-integration package, so install it alongside optuna if the import fails.

Two features matter once you move past a laptop notebook. First, storage: pass storage="sqlite:///optuna.db" to create_study and every trial is persisted to disk. Kill the process, rerun with the same study_name and load_if_exists=True, and the study continues from where it stopped.

study = optuna.create_study(
    study_name="rf-tuning",
    storage="sqlite:///optuna.db",
    direction="maximize",  # the default direction is "minimize"
    load_if_exists=True,
)

Second, parallelism: point multiple processes or machines at the same database backend and they share one study, each pulling trials and writing results back. SQLite is enough for a few workers on a single box, though its file locking can become a bottleneck as the worker count grows; use PostgreSQL or MySQL for a cluster. There is no separate scheduler to stand up. The companion optuna-dashboard package reads the same storage and renders trial history, parameter importance, and optimization curves in the browser.

A tuning run that won’t waste your afternoon

Start small and let the data tell you where to spend. Define your objective, run 100 trials with the default TPE sampler and MedianPruner, and open the dashboard. The parameter-importance chart ranks which hyperparameters actually moved the score; often one or two dominate and the rest are noise. Freeze the parameters that don’t matter, narrow the ranges of the ones that do, and run another 100 trials on the smaller space. Two or three rounds of that loop usually beat hours of manual tuning, and because every trial sits in the storage backend, you can stop and resume between rounds without losing history. If the score stalls, swap the sampler before you reach for a bigger trial budget; CmaEsSampler often does better on smooth, continuous spaces.

FAQ

Is Optuna free to use?

Yes. Optuna is open source under the MIT license with no paid tier or usage limits. You can run it locally, in CI, or across a cluster without a license cost, and the optuna-dashboard companion package is open source as well.

How is Optuna different from scikit-learn's GridSearchCV?

GridSearchCV evaluates every combination in a fixed grid and keeps no memory between fits. Optuna samples adaptively — the TPE sampler uses past trial scores to choose the next candidate — and can prune trials mid-training. For the same trial budget that adaptivity usually finds better parameters, and conditional search spaces aren't expressible in a flat grid at all.

Can Optuna tune models built outside Python?

Optuna is a Python library, so the objective function must be Python. But that function can launch a subprocess that trains a model in another language and parse the score it prints. Most teams keep the whole loop in Python because the framework integrations only cover Python ML libraries.
