NextStat for Data Scientists
From sklearn to rigorous inference
You train models, evaluate metrics, and ship predictions. NextStat adds something most ML pipelines lack: statistically rigorous uncertainty quantification and hypothesis testing that accounts for every systematic you care about — built in Rust, callable from Python, and differentiable in PyTorch.
The Pitch in One Sentence
NextStat lets you replace ad-hoc error bars with profile likelihood intervals, replace grid search with Bayesian NUTS sampling, and train neural networks whose loss function is a full statistical test.
What You Already Know → What NextStat Calls It
| Your World | NextStat Term | Why It Matters |
|---|---|---|
| Feature importance | Ranking (impact plot) | Shows how each systematic shifts the result, with up/down bands |
| Loss function | Negative log-likelihood (NLL) | Profiling refits nuisance params automatically (no manual marginalisation) |
| Confidence interval | Profile likelihood interval | Asymptotically correct coverage (Wilks' theorem), no bootstrap resampling |
| p-value / significance | CLs / Z₀ (discovery significance) | Systematic uncertainties enter the test through profiled nuisance params |
| Hyperparameter tuning | Profile scan / Optuna integration | Scan over parameter of interest with full profiling |
| Cross-validation | Toy Monte Carlo | Sample pseudo-experiments to check coverage and bias |
| DataFrame | Arrow Table / Polars DataFrame | Zero-copy ingest via Arrow IPC — no serialisation overhead |
5-Minute Quickstart

```python
import nextstat

# 1. Load a model (pyhf JSON workspace — the "experiment spec")
model = nextstat.from_pyhf("workspace.json")

# 2. Fit (MLE) — like sklearn .fit() but with full uncertainty
result = nextstat.fit(model)
print(f"Best fit: {result.bestfit}")
print(f"Uncertainties: {result.uncertainties}")
print(f"Correlation matrix: {result.corr_matrix}")

# 3. Hypothesis test: is the signal real?
hypo = nextstat.hypotest(model, poi_value=1.0)
print(f"CLs = {hypo.cls:.4f}")  # < 0.05 → exclude at 95% CL

# 4. Feature importance: which systematics matter most?
ranking = nextstat.ranking(model)
for r in ranking[:5]:
    print(f"  {r.name}: +{r.impact_up:.3f} / {r.impact_down:.3f}")

# 5. Profile scan: likelihood landscape
scan = nextstat.scan(model, poi_values=[0, 0.5, 1.0, 1.5, 2.0])
# → scan.deltanll_values holds the -2ΔlogL curve
```
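The scan is also how you get the confidence-interval row from the table above: for one parameter of interest, the 68% interval is bounded by the points where the -2ΔlogL curve crosses 1.0 (Wilks' theorem). A minimal sketch with NumPy; `scan.poi_values` is an assumed attribute that echoes the input grid, while `deltanll_values` is the curve named in the comment above.

```python
import numpy as np

# Denser grid for a smoother curve; `model` is the workspace fit above
scan = nextstat.scan(model, poi_values=np.linspace(0.0, 2.0, 41).tolist())
mu = np.asarray(scan.poi_values)        # assumed attribute: echoes the input grid
q = np.asarray(scan.deltanll_values)    # the -2ΔlogL curve
inside = mu[q <= 1.0]                   # 68% CL: -2ΔlogL <= 1.0 for one POI
print(f"mu_hat ~ {mu[np.argmin(q)]:.2f}, "
      f"68% CL ~ [{inside.min():.2f}, {inside.max():.2f}]")
```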
Arrow / Polars: Your Data, Zero Copy

If your data lives in Polars, PyArrow, DuckDB, or Spark — NextStat ingests it directly via Arrow IPC with zero serialisation overhead.
```python
import polars as pl
import nextstat

df = pl.read_parquet("histograms.parquet")
table = df.to_arrow()

model = nextstat.from_arrow(table, poi="mu", observations={"SR": [10, 20]})
result = nextstat.fit(model)

# Export back to Arrow for downstream analysis
yields = nextstat.to_arrow(model, params=result.bestfit, what="yields")
```
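And because the export is a plain Arrow table, it drops straight back into your DataFrame stack. For example, assuming `to_arrow` returns a `pyarrow.Table`:

```python
# Back to Polars for downstream analysis (pl.from_arrow is standard Polars API)
yields_df = pl.from_arrow(yields)
print(yields_df.head())
```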
Differentiable Training: Loss = Statistical Test

The killer feature for ML: train a neural network where the loss function is the actual discovery significance (Z₀), with all systematic uncertainties profiled out. Gradients flow through the full statistical model via the envelope theorem.
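In symbols, the envelope theorem says that at the profiled optimum of the nuisance parameters θ̂(h), the gradient of the profiled NLL with respect to the histogram h is just the partial derivative evaluated there, so the backward pass never has to differentiate through the inner fit:

$$
\frac{\mathrm{d}}{\mathrm{d}h}\Big[\min_{\theta}\,\mathrm{NLL}(h,\theta)\Big]
\;=\;
\frac{\partial\,\mathrm{NLL}(h,\theta)}{\partial h}\bigg|_{\theta=\hat{\theta}(h)}
$$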
```python
import torch

from nextstat.torch import SignificanceLoss, SoftHistogram

# `model` is the statistical model from above; `classifier`, `dataloader`,
# and `optimizer` are your usual PyTorch objects.
# Your NN outputs continuous scores → soft histogram → Z₀ loss
loss_fn = SignificanceLoss(model, "signal")
soft_hist = SoftHistogram(bin_edges=torch.linspace(0, 1, 11))

for batch_x, batch_w in dataloader:
    optimizer.zero_grad()                      # reset grads each step
    scores = classifier(batch_x)
    histogram = soft_hist(scores, batch_w)
    loss = loss_fn(histogram.double().cuda())  # → scalar -Z₀ (drop .cuda() on CPU)
    loss.backward()                            # gradients to NN weights
    optimizer.step()
```
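Why a soft histogram at all? Hard bin assignment is piecewise constant, so its gradient is zero almost everywhere and nothing would reach the classifier weights. The usual trick, and a plausible reading of what `SoftHistogram` does (this sketch is illustrative only, not NextStat's actual implementation; `tau` is a made-up temperature parameter), replaces the hard bin indicator with a difference of sigmoids:

```python
import torch

def soft_histogram(x, w, edges, tau=0.05):
    """Differentiable histogram: sigmoid edges instead of hard bin assignment."""
    # s[i, k] ≈ 1 if x[i] > edges[k], smoothed by temperature tau
    s = torch.sigmoid((x[:, None] - edges[None, :]) / tau)   # (N, K+1)
    membership = s[:, :-1] - s[:, 1:]                        # (N, K) soft indicator
    return (w[:, None] * membership).sum(dim=0)              # weighted bin counts

scores = torch.rand(1000, requires_grad=True)  # stand-in for NN outputs
h = soft_histogram(scores, torch.ones(1000), torch.linspace(0, 1, 11))
h[-1].backward()  # d(count in last bin)/d(scores), nonzero near the top edge
```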
When to Use NextStat vs sklearn / statsmodels

| Task | Best Tool |
|---|---|
| Quick logistic regression on a clean dataset | sklearn |
| GLM with robust standard errors, detailed summary | statsmodels |
| Hypothesis test with multiple systematic uncertainties | NextStat |
| NN training where loss = discovery significance | NextStat |
| GPU-accelerated batch fitting (1000s of models) | NextStat |
| Bayesian posterior with NUTS + diagnostics | NextStat |
| Reproducible validation artifacts for audit | NextStat |
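For the Bayesian row, the call pattern mirrors `nextstat.fit`. A hypothetical sketch: the `nextstat.sample` entry point and its arguments are assumptions here, not confirmed API, so check the Python API reference for the real signature.

```python
# Hypothetical sampler call — names and arguments are assumptions
posterior = nextstat.sample(model, n_chains=4, n_warmup=500, n_samples=1000)
print(posterior.summary())  # convergence diagnostics, per the table row above
```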
Next Steps
- ML Overview — Full terminology bridge (Physics ↔ ML)
- Training Guide — End-to-end SignificanceLoss tutorial
- Python API — Complete API reference
- Arrow / Polars — Zero-copy data interchange
- Agentic Tools — LLM tool definitions for AI-driven analysis
- Server API — Self-hosted GPU inference for shared compute
- Glossary — HEP ↔ DS ↔ Quant ↔ Bio term mapping
