Hyperparameter Tuning with Optuna

Tutorial — Binning Optimisation

NextStat's fast Rust inference engine makes it well suited for use inside black-box optimisation objectives. A single fit completes in ~1–50 ms (CPU) or ~0.5–5 ms (GPU), so Optuna can evaluate hundreds of configurations in minutes rather than hours.

This tutorial shows how to use Optuna to find the optimal histogram binning that maximises discovery significance Z₀. The same pattern works for any hyperparameter — cut thresholds, sample selections, systematic treatments, etc.

Installation

pip install nextstat optuna
# Optional: pip install optuna-dashboard  (for live monitoring)

The Problem: Optimal Binning

Histogram binning directly affects statistical power. Too few bins — you lose shape information. Too many — statistical fluctuations dominate. The optimal binning depends on your signal/background shapes, statistics, and systematics.

Traditionally this is done by hand or with simple grid scans. With Optuna + NextStat, we can search over binning parameters automatically using Bayesian optimisation (TPE).

Step 1: Define the Workspace Builder

First, write a function that builds a HistFactory workspace given binning parameters. This is the part that encodes your analysis-specific logic.

import json
import numpy as np

def build_workspace(n_bins: int, lo: float, hi: float) -> dict:
    """Build a simple 1-channel HistFactory workspace with given binning.

    In a real analysis, this would read ntuples and rebin with the
    given edges. Here we use analytic shapes for illustration.
    """
    edges = np.linspace(lo, hi, n_bins + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    width = edges[1] - edges[0]

    # Analytic signal: Gaussian peak at 0.5
    signal = 50.0 * np.exp(-0.5 * ((centers - 0.5) / 0.08) ** 2) * width
    # Analytic background: falling exponential
    background = 200.0 * np.exp(-2.0 * centers) * width

    # HistFactory workspace (pyhf format)
    return {
        "channels": [{
            "name": "SR",
            "samples": [
                {
                    "name": "signal",
                    "data": signal.tolist(),
                    "modifiers": [
                        {"name": "mu", "type": "normfactor", "data": None}
                    ],
                },
                {
                    "name": "background",
                    "data": background.tolist(),
                    "modifiers": [
                        {"name": "bkg_norm", "type": "normsys",
                         "data": {"hi": 1.05, "lo": 0.95}},
                    ],
                },
            ],
        }],
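        # Asimov-style observation: the data is exactly the expected s + b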
        "observations": [{
            "name": "SR",
            "data": (signal + background).tolist(),
        }],
        "measurements": [{
            "name": "meas",
            "config": {
                "poi": "mu",
                "parameters": [],
            },
        }],
        "version": "1.0.0",
    }
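
A quick sanity check that the builder produces sensible yields (numbers depend on the analytic shapes above):

ws = build_workspace(n_bins=10, lo=0.0, hi=1.0)
sig = sum(ws["channels"][0]["samples"][0]["data"])
bkg = sum(ws["channels"][0]["samples"][1]["data"])
print(f"S = {sig:.1f}, B = {bkg:.1f}, naive S/√B = {sig / bkg ** 0.5:.2f}")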

Step 2: Define the Optuna Objective

import nextstat

def objective(trial):
    """Optuna objective: maximise Z₀ by tuning binning."""

    # Search space
    n_bins = trial.suggest_int("n_bins", 3, 40)
    lo = trial.suggest_float("lo", 0.0, 0.3)
    hi = trial.suggest_float("hi", 0.7, 1.0)

    # Build workspace with trial's binning
    ws = build_workspace(n_bins=n_bins, lo=lo, hi=hi)

    # Load into NextStat
    try:
        model = nextstat.from_pyhf(ws)
    except Exception:
        return 0.0  # invalid config → zero significance

    # Hypothesis test at μ=0 → discovery significance Z₀
    hypo = nextstat.hypotest(model, mu=0.0)
    return float(hypo.significance)  # Z₀ in σ
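
Before launching a full study, it can help to sanity-check a single configuration with Optuna's FixedTrial (a standard Optuna utility; the parameter values below are illustrative):

import optuna

# Evaluate one hand-picked configuration without creating a study
trial = optuna.trial.FixedTrial({"n_bins": 20, "lo": 0.1, "hi": 0.9})
print(f"Z₀ = {objective(trial):.2f}σ")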

Step 3: Run the Study

import optuna

# Create study (maximise significance)
study = optuna.create_study(
    direction="maximize",
    study_name="nextstat-binning",
    sampler=optuna.samplers.TPESampler(seed=42),
)

# Run 200 trials (~10–30 seconds with NextStat)
study.optimize(objective, n_trials=200, show_progress_bar=True)

# Results
print(f"Best Z₀: {study.best_value:.3f}σ")
print(f"Best params: {study.best_params}")
# → Best Z₀: 3.142σ
# → Best params: {'n_bins': 15, 'lo': 0.12, 'hi': 0.92}
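
Because the search-space keys match build_workspace's signature, the best configuration can be rebuilt directly:

# Rebuild the winning configuration for downstream fits and plots
best_model = nextstat.from_pyhf(build_workspace(**study.best_params))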

Step 4: Analyse Results

# Optuna built-in visualisations
from optuna.visualization import (
    plot_optimization_history,
    plot_param_importances,
    plot_contour,
)

# Optimisation convergence
fig1 = plot_optimization_history(study)
fig1.show()

# Which parameters matter most?
fig2 = plot_param_importances(study)
fig2.show()

# 2D contour: n_bins vs lo
fig3 = plot_contour(study, params=["n_bins", "lo"])
fig3.show()
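
For analysis beyond the built-in plots, export the trials to a pandas DataFrame (standard Optuna API; parameter columns follow Optuna's params_* naming):

df = study.trials_dataframe()
print(df[["number", "value", "params_n_bins", "params_lo", "params_hi"]]
      .sort_values("value", ascending=False)
      .head())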

Step 5: Log to W&B (Optional)

import wandb
from nextstat.mlops import metrics_dict

wandb.init(project="nextstat-optuna")

def objective_with_logging(trial):
    n_bins = trial.suggest_int("n_bins", 3, 40)
    lo = trial.suggest_float("lo", 0.0, 0.3)
    hi = trial.suggest_float("hi", 0.7, 1.0)

    ws = build_workspace(n_bins=n_bins, lo=lo, hi=hi)
    model = nextstat.from_pyhf(ws)
    result = nextstat.fit(model)

    # Log fit metrics to W&B
    wandb.log({
        "trial": trial.number,
        "n_bins": n_bins,
        "lo": lo,
        "hi": hi,
        **metrics_dict(result, prefix="fit/"),
    })

    # Prune failed fits: returning 0.0 here would outrank every converged
    # trial, since -NLL is typically negative in a maximise study
    if not result.converged:
        raise optuna.TrialPruned()
    return -float(result.nll)
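
Run it like any other objective; wandb.finish() closes the run once the study completes:

study = optuna.create_study(direction="maximize")
study.optimize(objective_with_logging, n_trials=100)
wandb.finish()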

Advanced: GPU-Accelerated Objective

For models with many channels or bins, use GPU-accelerated ranking inside the objective to evaluate systematic impact alongside significance:

from nextstat.interpret import rank_impact

def objective_gpu(trial):
    n_bins = trial.suggest_int("n_bins", 5, 50)
    lo = trial.suggest_float("lo", 0.0, 0.2)
    hi = trial.suggest_float("hi", 0.8, 1.0)

    ws = build_workspace(n_bins=n_bins, lo=lo, hi=hi)
    model = nextstat.from_pyhf(ws)

    # Compute Z₀ via hypothesis test
    hypo = nextstat.hypotest(model, mu=0.0)
    z0 = float(hypo.significance)

    # Systematic impact for regularisation
    ranking = rank_impact(model, top_n=5)
    total_syst = sum(r["total_impact"] for r in ranking)

    # Multi-objective: maximise Z₀ while penalising systematic sensitivity
    return z0 - 0.1 * total_syst

Advanced: Parallel with Ray Tune

For distributed search across multiple machines, wrap the Optuna study with Ray Tune:

import optuna
from ray import tune
from ray.tune.search.optuna import OptunaSearch

def trainable(config):
    """Ray Tune trainable wrapping NextStat."""
    ws = build_workspace(
        n_bins=config["n_bins"],
        lo=config["lo"],
        hi=config["hi"],
    )
    model = nextstat.from_pyhf(ws)
    hypo = nextstat.hypotest(model, mu=0.0)
    tune.report({"z0": float(hypo.significance)})

search = OptunaSearch(
    metric="z0",
    mode="max",
    sampler=optuna.samplers.TPESampler(seed=42),
)

tuner = tune.Tuner(
    trainable,
    param_space={
        "n_bins": tune.randint(3, 40),
        "lo": tune.uniform(0.0, 0.3),
        "hi": tune.uniform(0.7, 1.0),
    },
    tune_config=tune.TuneConfig(
        num_samples=200,
        search_alg=search,
    ),
)

results = tuner.fit()
print(results.get_best_result("z0", "max").config)

Performance

Model Size           CPU Fit    GPU Fit    200 Trials
10 bins, 2 NP        ~1 ms      ~0.5 ms    ~1 sec
100 bins, 20 NP      ~10 ms     ~2 ms      ~4 sec
1000 bins, 100 NP    ~100 ms    ~10 ms     ~20 sec
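
To reproduce numbers like these for your own model, time repeated fits directly (a minimal sketch reusing build_workspace from Step 1):

import time

model = nextstat.from_pyhf(build_workspace(n_bins=100, lo=0.0, hi=1.0))

n = 100
t0 = time.perf_counter()
for _ in range(n):
    nextstat.fit(model)
print(f"mean fit time: {1e3 * (time.perf_counter() - t0) / n:.2f} ms")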

Tips

  • Pruning — Optuna's MedianPruner can stop unpromising trials early. Works well with NextStat since each trial is fast.
  • Multi-objective — Use optuna.create_study(directions=["maximize", "minimize"]) to jointly optimise Z₀ and computation time.
  • Persistence — Use optuna.create_study(storage="sqlite:///study.db") to save progress across sessions; the sketch after this list combines persistence with pruning and a multi-objective study.
  • Dashboard — Run optuna-dashboard sqlite:///study.db for real-time monitoring in the browser.
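
A minimal sketch combining persistence, pruning, and the multi-objective variant (study names are illustrative):

import optuna

# Persistent, prunable study: progress survives restarts and feeds
# optuna-dashboard sqlite:///study.db. The pruner only acts if the
# objective reports intermediate values via trial.report().
study = optuna.create_study(
    direction="maximize",
    study_name="nextstat-binning",
    storage="sqlite:///study.db",
    load_if_exists=True,
    pruner=optuna.pruners.MedianPruner(),
)

# Multi-objective variant: the objective must then return a tuple,
# e.g. (z0, wall_time_seconds)
mo_study = optuna.create_study(
    directions=["maximize", "minimize"],
    study_name="nextstat-binning-mo",
    storage="sqlite:///study.db",
)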