NextStat

Rust API Reference

The Rust API is organized as a Cargo workspace with multiple crates. Each crate has a focused responsibility following clean architecture principles.

Crate          Purpose
ns-core        Core types and traits (Model, FitResult, error handling)
ns-ad          Automatic differentiation (forward Dual, reverse Tape, Scalar trait)
ns-prob        Probability distributions and math (logpdf, cdf, transforms)
ns-unbinned    Event-level PDFs, normalizing flows, DCR surrogates, EventStore
ns-compute     Compute backends: SIMD, Apple Accelerate, CUDA, Metal
ns-root        Native ROOT file I/O (TH1, TTree, expression engine)
ns-translate   Format translators: pyhf JSON, HistFactory XML, HS3, TRExFitter, Arrow
ns-inference   Inference algorithms (MLE, NUTS, CLs, PK/NLME, churn, econometrics)
ns-viz         Plot-friendly artifacts (CLs curves, profiles, ranking, pulls, gammas)
ns-cli         nextstat CLI
ns-server      nextstat-server REST API (axum)
ns-wasm        WebAssembly bindings for browser playground
ns-zstd        Pure Rust Zstd decoder (optimized for ROOT decompression)
ns-py          Python bindings via PyO3/maturin

ns-core: Types & Traits

Defines LogDensityModel (NLL + gradient + metadata), FitResult (parameters, uncertainties, NLL, convergence info, termination_reason, final_grad_norm, initial_nll, n_active_bounds), and error types.

ns-prob: Probability Distributions

Reusable probability building blocks. Each module exports logpdf/nll: normal, poisson, exponential, gamma, beta, weibull, student_t, bernoulli, binomial, neg_binomial. Also: stable math helpers (log1pexp, sigmoid, softplus) and bijective transforms (ParameterTransform).

ns-translate: Model Loading

use ns_translate::pyhf::{
    HistFactoryModel, HistoSysInterpCode,
    NormSysInterpCode, Workspace,
};

let json = std::fs::read_to_string("workspace.json")?;
let ws: Workspace = serde_json::from_str(&json)?;

// Default interpolation (NextStat "smooth" defaults):
//   NormSys=Code4, HistoSys=Code4p.
// For strict HistFactory/pyhf defaults, use Code1/Code0:
let model = HistFactoryModel::from_workspace_with_settings(
    &ws,
    NormSysInterpCode::Code1,
    HistoSysInterpCode::Code0,
)?;

from_workspace() uses NextStat smooth defaults (Code4/Code4p). Use from_workspace_with_settings() to pick interpolation codes explicitly. See docs/pyhf-parity-contract.md for details.

ns-inference: Fitting

use ns_inference::mle::MaximumLikelihoodEstimator;

let mle = MaximumLikelihoodEstimator::new();
let result = mle.fit(&model)?;

println!("Parameters: {:?}", result.parameters);
println!("NLL: {}", result.nll);
println!("Uncertainties: {:?}", result.uncertainties);

GPU Flow Session (CUDA, feature-gated)

Orchestrates flow PDF evaluation + GPU NLL reduction for unbinned models with neural PDFs:

• GpuFlowSession — session: flow eval (CPU / CUDA EP) + GPU NLL reduction
  ├ new(config)           — create session, allocate GPU buffers
  ├ nll(logp_flat, params) — NLL from pre-computed logp_flat[n_procs × n_events]
  ├ nll_grad(params, eval_logp) — NLL + gradient (central finite differences)
  └ compute_yields(params)      — yield computation from parameter vector

• GpuFlowSessionConfig — processes, n_events, n_params, gauss_constraints
• FlowProcessDesc     — base_yield, yield_param_idx, yield_is_scaled

use ns_inference::gpu_flow_session::{GpuFlowSession, GpuFlowSessionConfig, FlowProcessDesc};

let config = GpuFlowSessionConfig {
    processes: vec![
        FlowProcessDesc {
            process_index: 0,
            base_yield: 100.0,
            yield_param_idx: Some(0),
            yield_is_scaled: true,
        },
    ],
    n_events: 50_000,
    n_params: 1,
    gauss_constraints: vec![],
    constraint_const: 0.0,
};

let mut session = GpuFlowSession::new(config)?;
let nll = session.nll(&logp_flat, &params)?;

Volatility Models (GARCH / Stochastic Volatility)

Financial time series volatility estimation in ns_inference::timeseries::volatility:

• garch11_fit(y, config) → Garch11Fit — Gaussian GARCH(1,1) MLE (L-BFGS-B)
  ├ Garch11Params { mu, omega, alpha, beta }
  ├ Garch11Config { optimizer, alpha_beta_max, init, min_var }
  └ Garch11Fit   { params, log_likelihood, conditional_variance, optimization }

• sv_logchi2_fit(y, config) → SvLogChi2Fit — approximate SV via log(χ²₁) + Kalman MLE
  ├ SvLogChi2Params { mu, phi, sigma }
  ├ SvLogChi2Config { optimizer, log_eps, init }
  └ SvLogChi2Fit   { params, log_likelihood, smoothed_h, smoothed_sigma, optimization }

use ns_inference::timeseries::volatility::{garch11_fit, Garch11Config};

let returns = vec![0.01, -0.02, 0.005, 0.03, -0.015];
let fit = garch11_fit(&returns, Garch11Config::default())?;
println!("omega={:.4} alpha={:.4} beta={:.4}",
    fit.params.omega, fit.params.alpha, fit.params.beta);

ns-root: ROOT I/O

use ns_root::RootFile;

let file = RootFile::open("data.root")?;
let tree = file.get_tree("events")?;

// Columnar extraction
let pt: Vec<f64> = file.branch_data(&tree, "pt")?;
let eta: Vec<f64> = file.branch_data(&tree, "eta")?;

// Compiled expressions
let expr = ns_root::CompiledExpr::compile(
    "pt > 25.0 && abs(eta) < 2.5"
)?;

ns-ad: Automatic Differentiation

Provides dual-number and tape-based automatic differentiation used internally by the optimizer for gradient computation.

ns-compute: Backend Abstraction

Abstracts over CPU (SIMD), Apple Accelerate (vDSP/vForce), CUDA, and Metal backends. The inference layer selects the best available backend at runtime.
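
The selection logic can be pictured as a compile-time/runtime cascade. This is an illustrative pattern only, not ns-compute's actual types or priority order:

```rust
// Illustrative: pick the best backend that is both compiled in and
// usable at runtime. ns-compute's real API may differ.
#[derive(Debug, PartialEq)]
enum Backend {
    Cuda,
    Metal,
    Accelerate,
    Simd,
}

fn cuda_runtime_ok() -> bool {
    false // placeholder: a real check probes the CUDA driver at runtime
}

fn best_available() -> Backend {
    if cfg!(feature = "cuda") && cuda_runtime_ok() {
        Backend::Cuda
    } else if cfg!(all(feature = "metal", target_os = "macos")) {
        Backend::Metal
    } else if cfg!(target_os = "macos") {
        Backend::Accelerate
    } else {
        Backend::Simd
    }
}
```

The point of the cascade is that callers never name a backend; they always get the fastest one the build and the host support.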

CudaFlowNllAccelerator (feature-gated)

GPU NLL reduction from externally-computed log-prob values (flow PDFs). Separates PDF evaluation from likelihood reduction, enabling mixed parametric+flow models.

• CudaFlowNllAccelerator
  ├ new(config: &FlowNllConfig)              — allocate GPU buffers, load PTX kernel
  ├ nll(logp_flat, yields, params) → f64     — host-upload: logp computed on CPU, uploaded to GPU
  ├ nll_device(d_logp_flat, yields, params)  — device-resident: CudaSlice<f64> from CUDA EP (zero-copy)
  └ is_available() → bool                    — runtime CUDA check

• FlowNllConfig — n_events, n_procs, n_params, gauss_constraints, constraint_const

ns-translate: Arrow / Parquet

Zero-copy columnar data interchange with the Arrow ecosystem. Feature-gated behind arrow-io.

// Ingest Arrow IPC → Workspace
use ns_translate::arrow::ingest::{from_arrow_ipc, ArrowIngestConfig};

let config = ArrowIngestConfig { poi: "mu".into(), ..Default::default() };
let workspace = from_arrow_ipc(&ipc_bytes, &config)?;

// Export Model → Arrow IPC
use ns_translate::arrow::export::yields_to_ipc;
let ipc = yields_to_ipc(&model, Some(&params))?;

// Parquet read/write (Zstd)
use ns_translate::arrow::parquet::{from_parquet, write_parquet};
let workspace = from_parquet("histograms.parquet", &config)?;

Schema: channel (Utf8), sample (Utf8), yields (List<Float64>), optional stat_error (List<Float64>).

ns-server: REST API

Self-hosted REST API for shared GPU inference, built on axum 0.8.

// Key exports
ns_server::state::AppState     // GPU lock, atomic counters, model cache
ns_server::pool::ModelPool     // LRU cache keyed by SHA-256 of workspace
ns_server::pool::ModelInfo     // Cached model metadata (id, name, params, age)

// Endpoints: /v1/fit, /v1/ranking, /v1/batch/fit,
//            /v1/models, /v1/health

ns-unbinned: Event-Level PDFs

Parametric PDFs: GaussianPdf, CrystalBallPdf, DoubleCrystalBallPdf, ExponentialPdf, ChebyshevPdf, ArgusPdf, VoigtianPdf, SplinePdf. Non-parametric: HistogramPdf, KdePdf, MorphingHistogramPdf, ProductPdf. Neural (feature neural): FlowPdf, DcrSurrogate. All implement the UnbinnedPdf trait.

ns-inference: Differentiable Layer

Zero-copy PyTorch integration for ML workflows:

• DifferentiableSession (CUDA)
  ├ nll_grad_signal(params, d_signal, d_grad_signal) — zero-copy gradient into PyTorch tensor
  └ signal_n_bins(), n_params(), parameter_init()

• ProfiledDifferentiableSession (CUDA)
  ├ profiled_q0_and_grad(d_signal) → (f64, Vec<f64>) — discovery q₀ + envelope-theorem gradient
  └ profiled_qmu_and_grad(mu_test, d_signal) → (f64, Vec<f64>) — exclusion qμ

• MetalProfiledDifferentiableSession (Metal, f32)
  ├ upload_signal(signal), profiled_q0_and_grad(), profiled_qmu_and_grad(mu_test)
  └ batch_profiled_qmu(mu_values) — multiple mu values with session reuse

ns-viz: Visualization Artifacts

Lightweight, dependency-free, JSON-serializable artifacts. All include schema_version.

ClsCurveArtifact         — Brazil band CLs exclusion curves
ProfileCurveArtifact     — −2Δln L profile scans
RankingArtifact          — NP impact on POI
PullsArtifact            — pull/constraint plots for all NPs
CorrArtifact             — correlation (+ covariance) matrix
DistributionsArtifact    — stacked pre/post-fit histograms with ratio panel
YieldsArtifact           — per-channel yield tables
GammasArtifact           — staterror (Barlow-Beeston) parameters
SeparationArtifact       — S/B separation metric per channel
PieArtifact              — sample composition fractions
UncertaintyBreakdownArtifact — grouped NP impact breakdown

ns-zstd: Pure Rust Zstd Decoder

Fork of ruzstd 0.8.2 optimized for ROOT file decompression. Fused decode+execute (single-pass), exponential match copy, static Huffman lookup tables. ~820 MB/s median throughput (2× original ruzstd).

Building & Testing

# Build entire workspace
cargo build --workspace

# Run all tests
cargo test --workspace --all-features

# Format and lint
cargo fmt --check
cargo clippy --workspace -- -D warnings

# Benchmarks
cargo bench --workspace