Unbinned event-level analysis:
why work with every event (and how to make it reproducible)
Binned analysis (HistFactory) aggregates events into histograms and constructs a Poisson likelihood over bins. This works beautifully — when you have enough data. But for narrow resonances, low statistics, and multidimensional observables, binning destroys information. Unbinned analysis works with each event directly. Here we present a compact “mini-article”: the extended likelihood, the runtime architecture, the configuration contract (JSON schema), event-level systematics (including horizontal morphing), and CI guardrails for toy studies.
Terminology note. “Unbinned” here refers to an event-level likelihood data model, not a different statistical paradigm. You still do MLE fits, profile scans, and toy studies — the difference is in how the data are represented and how the likelihood is evaluated.
2026-02-09 · 15 min read
1. Why Unbinned Analysis
The standard approach in high-energy physics is binned analysis: events are grouped into histogram bins, and a Poisson likelihood is built for each bin. Modifiers (NormSys, HistoSys, ShapeSys, etc.) describe systematic uncertainties. This approach has dominated for decades for good reasons: it scales well, is well understood, and has mature implementations (ROOT/RooFit, pyhf, NextStat).
But binning is a choice, and it has a cost. When you drop events into bins, you lose information about the exact observable value within a bin. In the limit of infinite statistics and infinitely fine bins, the loss approaches zero. In practice — it does not.
When binning hurts
- ›Narrow resonances. A signal of width σ ≈ 2 GeV in 5 GeV bins gets smeared. Sensitivity degrades because events from the peak core and from the tails land in the same bin with equal weight.
- ›Low statistics. With 50–200 events in the signal region, any binning choice is an artifact. Bins with 0–2 events make the likelihood sensitive to arbitrary bin edges and definitions. The unbinned likelihood avoids this discretization step entirely.
- ›Multidimensional observables. Binning in N dimensions grows as O(k^N): a 3D observable with 50 bins per axis needs 125,000 bins, most of them empty. The unbinned approach scales with the number of events, not the number of cells.
- ›Event-level shape systematics. When the template shape is built from a TTree, per-event reweighting and per-event observable shifts can be part of the model contract (weight_systematics and horizontal_systematics). In a binned workflow the same effects are typically approximated at the bin level.
Why unbinned likelihood is computationally expensive
The statistical rigor of the unbinned approach is paid for in compute. In a binned model, one NLL evaluation is a loop over bins. In an event-level model, it is a loop over events: for each event you evaluate a mixture over processes, Σ_p ν_p · f_p(x_i), and accumulate its logarithm.
As a first-order model, the cost of a single NLL evaluation scales like O(N_events × N_processes). Gradients add the cost of PDF derivatives with respect to shape parameters. This then multiplies by the number of optimizer iterations and line-search evaluations.
Practical takeaway: for millions of events, a naive implementation becomes unusable. This is why NextStat focuses on memory layout (SoA), batch PDF evaluation, and—where the model permits it—GPU acceleration.
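To make that scaling concrete, here is a back-of-envelope estimate in Python. The iteration count is an assumption chosen for illustration, not a NextStat benchmark:

# Back-of-envelope cost of one naive unbinned MLE fit (illustrative numbers only).
n_events = 1_000_000        # events in the channel
n_processes = 3             # signal + backgrounds in the mixture
n_nll_evals = 200           # optimizer iterations x line-search evaluations (assumed)

pdf_evaluations = n_events * n_processes * n_nll_evals
print(f"~{pdf_evaluations:.1e} PDF evaluations per fit")  # ~6.0e+08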
2. Mathematical Foundations
2.1 Binned likelihood (HistFactory)
In a standard binned HistFactory analysis the likelihood is built over histogram bins:
L_binned = ∏_bins Poisson(n_b | ν_b(θ)) × ∏_NP Constraint(θ_j)
where ν_b(θ) = Σ_samples μ_s · f_s(θ) · h_s,b
n_b — observed event count in bin b
θ — nuisance parameters (systematics)

2.2 Unbinned extended likelihood
In the unbinned approach the likelihood is built per event. For a channel with multiple processes (signal + backgrounds) we use the extended likelihood:
L_unbinned = e^(-ν_total) × ∏_events [ Σ_processes ν_p · f_p(x_i | θ) ]
× ∏_NP Constraint(θ_j)
where ν_total = Σ_processes ν_p — total expected event count
ν_p — expected event count for process p
f_p(x | θ) — normalized PDF of process p
x_i — observables of event i

The e^(-ν_total) factor is the extended term. It makes the likelihood sensitive not only to the shape of the distribution but also to the number of events, which is necessary for estimating the signal strength μ.
Key difference: in binned analysis, information about each event is projected onto a bin number (an integer). In unbinned analysis, each event contributes through the continuous PDF at its point x_i, so no within-bin information is discarded and the full likelihood-ratio information (Neyman–Pearson) remains available.
2.3 Negative log-likelihood (NLL)
In practice we minimize the NLL:
−log L = ν_total − Σ_events log( Σ_processes ν_p · f_p(x_i | θ) )
+ Σ_NP penalty(θ_j)

This is exactly the function NextStat passes to the L-BFGS-B optimizer via the LogDensityModel trait. Same optimizer, same profile scans, same hypothesis tests — the inference stack is shared between HistFactory and event-level models.
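As a concrete reference, here is a minimal NumPy sketch of this NLL for one channel with two processes and a single Gaussian-constrained nuisance parameter. The log-PDF callables are placeholders; the production evaluation lives in Rust behind LogDensityModel:

import numpy as np

def extended_nll(x, nu_sig, nu_bkg, logf_sig, logf_bkg, alpha):
    """-log L = nu_total - sum_i log(sum_p nu_p f_p(x_i)) + penalty(alpha).
    logf_sig/logf_bkg are normalized log-PDFs evaluated on the event array x."""
    mixture = nu_sig * np.exp(logf_sig(x)) + nu_bkg * np.exp(logf_bkg(x))
    return (nu_sig + nu_bkg) - np.log(mixture).sum() + 0.5 * alpha**2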
3. PDF Catalog
NextStat provides a catalog of parametric and non-parametric PDFs, each implementing the UnbinnedPdf trait with two key methods:
- ›log_prob_batch(events, params, out) — compute log f(x|θ) for all events in one call
- ›log_prob_grad_batch(events, params, out_logp, out_grad) — the same, plus the analytic gradient ∂log f / ∂θ
The batch interface is critical for performance: events are stored in columnar format (SoA), and PDFs can use SIMD instructions for parallel evaluation across all events.
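A NumPy analogue of the batch contract, assuming a Gaussian PDF. This is illustrative only: the real trait is Rust and writes into a caller-provided buffer, which is what out mimics here:

import numpy as np

def log_prob_batch(x: np.ndarray, mu: float, sigma: float, out: np.ndarray) -> None:
    """Fill `out` with log f(x | mu, sigma) for all events in one vectorized call."""
    np.subtract(x, mu, out=out)                   # out = x - mu
    out /= sigma                                  # out = (x - mu) / sigma
    np.square(out, out=out)                       # out = z^2
    out *= -0.5                                   # out = -z^2 / 2
    out -= np.log(sigma * np.sqrt(2.0 * np.pi))   # subtract the log normalization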
3.1 Parametric signal PDFs
- ›Gaussian [μ, σ] — normal distribution with mean μ and width σ. Suitable for clean resonances with symmetric detector resolution.
- ›Crystal Ball [μ, σ, α, n] — Gaussian core with a power-law tail on one side. Describes detector effects (energy leakage, dead material) in calorimetric measurements. Standard PDF for H→γγ, H→ZZ*→4ℓ, J/ψ→μμ.
- ›Double Crystal Ball [μ, σ, α_l, n_l, α_r, n_r] — power-law tails on both sides. Used when both left and right tails have non-Gaussian nature.
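For reference, a sketch of the standard single-sided Crystal Ball form that the bullet above describes. It is unnormalized; NextStat normalizes PDFs on the bounded observable support, which this sketch omits:

import numpy as np

def crystal_ball_unnorm(x, mu, sigma, alpha, n):
    """Gaussian core for t > -|alpha|, power-law tail below (unnormalized)."""
    t = (x - mu) / sigma
    a = (n / abs(alpha)) ** n * np.exp(-0.5 * alpha**2)
    b = n / abs(alpha) - abs(alpha)
    tail = a * np.maximum(b - t, 1e-12) ** (-n)   # clamp keeps the unused branch finite
    return np.where(t > -abs(alpha), np.exp(-0.5 * t**2), tail)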
3.2 Background PDFs
- ›Exponential [λ] — e^(λx) on a finite interval. Classic description of combinatorial background.
- ›Chebyshev [c₁, …, c_k] — order-k Chebyshev polynomial on [-1, 1], mapped to the observable support. Smooth, stable approximation of arbitrary background shapes. Order is determined by the number of parameters.
3.3 Template and non-parametric PDFs
- ›Histogram — piecewise-constant PDF from histogram bins. Allows using an MC template as a PDF in unbinned analysis: events are assigned the density of their bin.
- ›Histogram from TTree — same, but the histogram is built on-the-fly from a ROOT TTree.
- ›KDE (Kernel Density Estimation) — kernel density estimate from training events. Non-parametric — no analytic form needed. Bandwidth is an explicit parameter.
- ›Morphing Histogram — histogram with HistoSys shape systematics (up/down templates), interpolated over nuisance parameters.
- ›Morphing KDE — KDE with event-level weight systematics (weight_systematics). Describes shape dependence on nuisance parameters via per-event up/down weight expressions.
- ›Horizontal-morphing KDE (HorizontalMorphingKdePdf) — a KDE in which a nuisance parameter can shift kernel centers (horizontal systematics, horizontal_systematics). Implements normalization on bounded observable support, analytic gradients, and sample() for toy generation.
Why both template and parametric? Parametric PDFs (Crystal Ball, Gaussian) are ideal for signals when the shape is well understood physically. Template PDFs (Histogram, KDE) are for backgrounds when the analytic form is unknown and one must rely on MC simulation. Combining both types in one unbinned fit is standard practice.
4. EventStore: Columnar Storage
The unbinned likelihood requires evaluating f(x_i) for every event. With millions of events and dozens of PDF calls per optimizer iteration, memory layout is critical.
EventStore uses Structure-of-Arrays (SoA): each observable is stored as a contiguous f64 vector. This ensures:
- ›Cache locality — when evaluating a PDF on one observable, data is contiguous in memory
- ›SIMD-friendly — 4 events at a time (f64x4) without repacking
- ›Separate access — the "mass" PDF does not touch the "pt" column
A practical detail that becomes part of the contract: most event-level PDFs in NextStat are normalized on a bounded observable support. This means observables.bounds are not “plotting defaults” — they define the domain for normalization (especially important for KDE-based PDFs and horizontal morphing).
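In Python terms, the layout is just a dict of contiguous columns plus the bounds contract. This is an illustration of the idea, not the EventStore API:

import numpy as np

rng = np.random.default_rng(7)
n = 1_000_000
store = {                                  # SoA: one contiguous f64 column per observable
    "mass": rng.uniform(100.0, 150.0, n),
    "pt": rng.exponential(30.0, n),
}
bounds = {"mass": (100.0, 150.0)}          # normalization domain, not a plotting default
z = (store["mass"] - 125.0) / 2.0          # a "mass" PDF streams one column, never "pt"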
Data sources:
{
"data": {
"file": "data.root",
"tree": "CollectionTree",
"selection": "mass > 100 && mass < 150 && nJets >= 2"
},
"observables": [
{ "name": "mass", "bounds": [100, 150] },
{ "name": "BDT_score", "expr": "bdt_out", "bounds": [0, 1] }
]
}

NextStat natively reads ROOT TTrees (via ns-root) and supports expression-based observables (arbitrary expressions over branches) and selection cuts. No ROOT C++ dependency — pure Rust.
Parquet ingestion contract (nextstat_unbinned_events_v1)
For reproducibility and ecosystem interoperability, NextStat can also ingest event-level Parquet files. The contract is intentionally strict:
- ›One row per event. Each row represents a single event.
- ›Float64 columns for observables. Each observable is a dedicated Float64 column.
- ›Optional weights and channels. _weight: Float64 is an optional per-event weight column; _channel: Utf8 is an optional channel label for multi-channel files.
- ›Metadata (when written by NextStat). NextStat-written Parquet files embed key-value metadata such as nextstat.schema_version = "nextstat_unbinned_events_v1" and nextstat.observables (a JSON array of {name, bounds}).
If metadata is absent (e.g. files authored outside NextStat), the reader can still ingest the event table. In that case, observable bounds must be supplied by the spec; otherwise the ingestion falls back to (-inf, inf) bounds and the bounded-support normalization contract becomes the caller’s responsibility.
Observed-data weights. NextStat supports per-event weights as non-negative frequency weights. For ROOT sources use channels[].data.weight; for Parquet sources provide an optional _weight column. Each weight must be finite, >= 0, and the total weight sum must be positive.
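A minimal pyarrow sketch that writes a file satisfying this contract. It assumes pyarrow is installed; NextStat-written files may carry additional metadata keys beyond the two shown:

import json
import numpy as np
import pyarrow as pa
import pyarrow.parquet as pq

rng = np.random.default_rng(0)
mass = rng.uniform(100.0, 150.0, 1000)            # one row per event
schema = pa.schema(
    [pa.field("mass", pa.float64()), pa.field("_weight", pa.float64())],
    metadata={
        "nextstat.schema_version": "nextstat_unbinned_events_v1",
        "nextstat.observables": json.dumps([{"name": "mass", "bounds": [100.0, 150.0]}]),
    },
)
table = pa.table({"mass": mass, "_weight": np.ones_like(mass)}, schema=schema)
pq.write_table(table, "events.parquet")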
5. UnbinnedModel: Assembling the Likelihood
UnbinnedModel assembles the likelihood from channels, processes, and constraints. The structure intentionally mirrors the HistFactory architecture:
Model
├── Parameters (name, init, bounds, constraint?)
├── Channel "SR"
│ ├── data: ROOT TTree → EventStore
│ ├── observable: mass [100, 150]
│ ├── Process "signal"
│ │ ├── pdf: crystal_ball(mass_mu, mass_sig, alpha, n)
│ │ └── yield: 50 × μ, modified by NormSys(alpha_lumi)
│ └── Process "background"
│ ├── pdf: exponential(lambda)
│ └── yield: fixed(1000)
└── Channel "CR" (optional, control region)

Yield expressions
- ›Fixed(ν) — fixed event count. For well-known backgrounds.
- ›Parameter(index) — free parameter optimized directly. For data-driven background estimation.
- ›Scaled(base, μ) — base_yield × μ. Classic signal strength.
- ›Modified(base, modifiers) — yield with multiplicative modifiers: NormSys and WeightSys.
Rate modifiers
- ›NormSys(α, lo, hi) — piecewise-exponential interpolation. At α=0 the factor is 1; at α=+1 it is hi; at α=-1 it is lo. Describes normalization systematics (luminosity, cross-section).
- ›WeightSys(α, lo, hi, interp_code) — HistoSys-style interpolation (code0 — piecewise-linear, code4p — smooth polynomial in [-1,1] with linear extrapolation). For rate-only systematics with non-linear behavior.
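A sketch of both interpolation rules as described above. The piecewise-exponential NormSys and the piecewise-linear code0 are shown; code4p's smooth polynomial core is omitted for brevity:

def normsys_factor(alpha: float, lo: float, hi: float) -> float:
    """Piecewise-exponential: 1 at alpha=0, hi at alpha=+1, lo at alpha=-1."""
    return hi ** alpha if alpha >= 0.0 else lo ** (-alpha)

def weightsys_code0(alpha: float, lo: float, hi: float) -> float:
    """code0: piecewise-linear through (-1, lo), (0, 1), (+1, hi)."""
    return 1.0 + alpha * (hi - 1.0) if alpha >= 0.0 else 1.0 + alpha * (1.0 - lo)

assert abs(normsys_factor(1.0, 0.95, 1.05) - 1.05) < 1e-12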
Unified trait. UnbinnedModel implements the same LogDensityModel as HistFactoryModel. This means MLE, profile scans, CLs, ranking, toy studies — the entire inference stack from ns-inference — works for unbinned models without a single line of new code.
6. Typical Workflow: Resonance Search
Consider searching for a narrow resonance in an invariant mass spectrum. The signal is described by a Crystal Ball, the background by an exponential: 50 expected signal events, 1000 background events, and a 5% normalization systematic.
Step 1: JSON specification
{
"schema_version": "nextstat_unbinned_spec_v0",
"model": {
"poi": "mu",
"parameters": [
{ "name": "mu", "init": 1.0, "bounds": [0, 10] },
{ "name": "mass_mu", "init": 125.0, "bounds": [120, 130] },
{ "name": "mass_sigma", "init": 2.0, "bounds": [0.5, 5] },
{ "name": "alpha_cb", "init": 1.5, "bounds": [0.5, 5] },
{ "name": "n_cb", "init": 3.0, "bounds": [1, 20] },
{ "name": "lambda", "init": -0.02, "bounds": [-1, 0] },
{ "name": "alpha_lumi", "init": 0, "bounds": [-5, 5],
"constraint": { "type": "gaussian", "mean": 0, "sigma": 1 } }
]
},
"channels": [{
"name": "SR",
"data": { "file": "data.root", "tree": "events",
"selection": "mass > 100 && mass < 150" },
"observables": [{ "name": "mass", "bounds": [100, 150] }],
"processes": [
{
"name": "signal",
"pdf": { "type": "crystal_ball", "observable": "mass",
"params": ["mass_mu", "mass_sigma", "alpha_cb", "n_cb"] },
"yield": { "type": "scaled", "base_yield": 50, "scale": "mu",
"modifiers": [{ "type": "normsys", "param": "alpha_lumi",
"lo": 0.95, "hi": 1.05 }] }
},
{
"name": "background",
"pdf": { "type": "exponential", "observable": "mass",
"params": ["lambda"] },
"yield": { "type": "fixed", "value": 1000 }
}
]
}]
}

The configuration file is a contract: an unambiguous description of the model, data, and constraints that should be interpreted identically locally, in CI, and by third parties replicating the analysis. By default, NextStat reads the file as YAML; if the extension is .json, it expects JSON.
To avoid ambiguity, it helps to understand how the key sections map to the runtime.
- ›schema_version. Must be exactly nextstat_unbinned_spec_v0. This is the anchor for validation and compatibility.
- ›model.poi. The name of the parameter of interest (POI). It is required by profiling and toy commands (e.g. unbinned-scan and unbinned-fit-toys): this is what gets scanned and summarized.
- ›model.parameters. The single source of truth for parameter definitions. Names are referenced later from pdf.params, modifiers, and yield. bounds define the allowed domain; init is the optimizer start; constraint adds an NLL penalty (e.g. a Gaussian constraint for luminosity).
- ›channels. A channel bundles data + the processes that explain those data. Each channel defines:
  - ›data.file and data.tree for the ROOT file and TTree. Relative paths resolve relative to the configuration file directory.
  - ›data.selection, the event cut expression. It is evaluated on nominal branches; for horizontal systematics (see below) the selection is not recomputed.
  - ›observables, listing names and bounds. These bounds are part of the normalization contract for bounded-support PDFs.
- ›processes. A process is a PDF + an expected yield.
  - ›pdf sets the PDF type (e.g. crystal_ball) and the list of shape parameters in params.
  - ›yield sets the expected event count: fixed, parameterized, or scaled by the POI. A normsys modifier implements a HistFactory-like normalization systematic.
For non-parametric PDFs built directly from trees (histogram_from_tree and kde_from_tree), NextStat supports two event-level systematic mechanisms:
- ›weight_systematics. Up/down expressions define event-level weight ratios w_up / w_nom and w_down / w_nom. They can affect the shape (rebuilding the histogram/KDE weights) and/or the yield, depending on application flags.
- ›horizontal_systematics. Up/down expressions define the observable value at α = ±1. This is “horizontal” morphing: an event moves along the observable axis while its nominal weight is kept. A single nuisance parameter cannot simultaneously define shape via weights and via horizontal shifts.
Event-level systematics: weights vs horizontal shifts
Conceptually, these are two different levers: one changes how much an event contributes, the other changes where the event sits on the observable axis.
- ›Weight systematics (weight_systematics) modify the contribution of each event in template/KDE construction.
- ›Horizontal systematics (horizontal_systematics) modify the per-event observable value. For kde_from_tree this corresponds to morphing kernel centers; for histogram_from_tree it changes how events populate bins.
Minimal horizontal-systematics example in kde_from_tree:
{
"pdf": {
"type": "kde_from_tree",
"observable": "mass",
"source": {
"file": "mc.root",
"tree": "events",
"selection": "mass > 100 && mass < 150",
"weight": "w"
},
"horizontal_systematics": [{
"param": "alpha_mass_scale",
"up": "mass * 1.001",
"down": "mass * 0.999",
"interp": "code4p"
}]
}
}

A key invariant for predictability: the same nuisance parameter should not drive shape through both weight_systematics and horizontal_systematics. This prevents double-counting and ambiguous model interpretation.
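One way to enforce this invariant before a fit is a small spec lint. The helper below is hypothetical, written against the JSON layout shown in this article:

import json

def check_no_dual_shape_params(spec: dict) -> None:
    """Fail if any nuisance parameter appears in both weight_systematics
    and horizontal_systematics of the same process (hypothetical lint)."""
    for ch in spec.get("channels", []):
        for proc in ch.get("processes", []):
            pdf = proc.get("pdf", {})
            w = {s["param"] for s in pdf.get("weight_systematics", [])}
            h = {s["param"] for s in pdf.get("horizontal_systematics", [])}
            if w & h:
                raise ValueError(f"{ch['name']}/{proc['name']}: {sorted(w & h)} "
                                 "drive shape via both mechanisms")

with open("search.json") as f:
    check_no_dual_shape_params(json.load(f))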
If you want to validate a config before running a fit, print the schema and validate the specification:
nextstat config schema --name unbinned_spec_v0

Step 2: MLE fit
nextstat unbinned-fit --config search.json --output result.json

The result contains best-fit values for all parameters (including μ̂), Hessian uncertainties, the NLL at the minimum, and convergence status.
GPU mode (conservative subset)
NextStat supports GPU evaluation of NLL+gradient for a restricted subset of unbinned models. GPU is enabled explicitly via --gpu:
nextstat unbinned-fit --config search.json --output result.json --gpu metal
nextstat unbinned-fit --config search.json --output result.json --gpu cuda

- ›PDFs. Phase 1 GPU supports a closed set of 1D PDFs: gaussian, exponential, crystal_ball, double_crystal_ball, chebyshev (order ≤ 16), and histogram.
- ›Yield. Supported expressions are fixed, parameter, and scaled, with optional rate modifiers (normsys and weightsys).
- ›Data model. Exactly one observable per channel; observed data weights are not supported.
- ›Precision. cuda is f64; metal is f32 (Apple Silicon).
If your spec goes beyond these limits (KDE-based PDFs, template building from TTrees, morphing PDFs, or multi-observable channels), use the CPU path without --gpu. This is deliberate: start with a small, parity-tested GPU subset, then expand coverage.
Step 3: Profile scan over μ
nextstat unbinned-scan --config search.json \
--start 0 --stop 5 --points 51

At each scan point the POI is fixed while all other parameters are profiled. The result is a Δ(−2 log L) vs μ curve from which confidence intervals are extracted.
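Extracting the 68% interval from that curve is a crossing search. A sketch with a synthetic parabola standing in for a real scan output (the array names are assumptions):

import numpy as np

def interval_68(mu: np.ndarray, q: np.ndarray) -> tuple[float, float]:
    """Linear-interpolated crossings of Delta(-2 log L) = 1 around the minimum."""
    i = int(np.argmin(q))
    lo = np.interp(1.0, q[: i + 1][::-1], mu[: i + 1][::-1])  # left branch
    hi = np.interp(1.0, q[i:], mu[i:])                        # right branch
    return float(lo), float(hi)

mu = np.linspace(0.0, 5.0, 51)
q = ((mu - 1.2) / 0.4) ** 2              # stand-in for a real unbinned-scan curve
print(interval_68(mu, q))                # ≈ (0.8, 1.6)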
Step 4: Toy studies (coverage, bias, pulls)
nextstat unbinned-fit-toys --config search.json \
--n-toys 1000 --seed 42 --gen init

For each toy: draw N ~ Poisson(ν_total), sample N events from the mixture PDF, and run a full MLE fit. This validates coverage, bias, and pull distributions — critical checks on the statistical procedure.
For CI and automated validation, you can enforce guardrails: for example, require all toys to converge and constrain the mean/σ of the POI pull (see --require-all-converged, --max-abs-poi-pull-mean, --poi-pull-std-range). If guardrails fail, artifacts are still written and the command exits non-zero.
If you run toys on CUDA, an experimental fast path can also sample toys on GPU: --gpu cuda --gpu-sample-toys. This keeps the large toy event buffer device-resident and is designed to reduce host↔device traffic in toy-heavy workflows.
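The guardrail logic itself is simple. A sketch of the pull checks on arrays of toy results; the threshold values and array names here are assumptions, and the CLI flags above are the real interface:

import numpy as np

def pulls_pass(mu_hat, sigma_hat, mu_true, converged,
               max_abs_mean=0.1, std_range=(0.9, 1.1)) -> bool:
    """pull_i = (mu_hat_i - mu_true) / sigma_hat_i; gate on convergence, mean, std."""
    if not np.all(converged):
        return False
    pulls = (np.asarray(mu_hat) - mu_true) / np.asarray(sigma_hat)
    return (abs(pulls.mean()) <= max_abs_mean
            and std_range[0] <= pulls.std(ddof=1) <= std_range[1])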
Correctness gates (closure, coverage, RooFit reference)
Unbinned likelihood code is easy to get subtly wrong. NextStat treats correctness as a precondition for performance numbers. The main gates are:
- ›Closure + coverage regression tests. The Python suite validates recovery of truth on synthetic datasets and (optionally) frequentist coverage across toys. The source of truth for tolerances is tests/python/_tolerances.py, including UNBINNED_CLOSURE_PARAM_ATOL = 0.15, UNBINNED_CLOSURE_PARAM_RTOL = 0.10, UNBINNED_COVERAGE_1SIGMA_LO = 0.55, and UNBINNED_COVERAGE_1SIGMA_HI = 0.98.
- ›Independent RooFit reference. For canonical 1D models, NextStat compares fit results to RooFit using tests/validate_roofit_unbinned.py with tolerances NLL_ATOL = 0.5, PARAM_ATOL = 0.3, and PARAM_RTOL = 0.10.
RooFit reference run (requires ROOT on PATH):
python tests/validate_roofit_unbinned.py --cases gauss_exp,cb_exp --seed 42 --output tmp/validate_roofit_unbinned.json

Step 5: Hypothesis tests (toy-based CLs, qtilde)
For upper limits and CLs workflows in HEP, a standard test statistic is qtilde (q̃μ): a one-sided profile statistic that is clipped to zero when the fitted signal strength μ̂ exceeds the tested value μ.
In NextStat, toy-based CLs for unbinned models is implemented via a sampler hook in the inference layer: the CLs algorithm (qtilde + tail probabilities) is generic, while toy generation is provided as sample_toy(model, params, seed). For event-level models this typically maps to model.sample_poisson_toy(params, seed).
CLI wrapper (CPU):
nextstat unbinned-hypotest-toys --config search.json --mu 1.0 \
--n-toys 1000 --seed 42 --expected-set --threads 1The command generates two toy ensembles (b-only at μ=0 and s+b at μ=μ_test), computes the toy distributions of qtilde, and returns CLs, CLsb, and CLb, together with the observed q_obs and μ̂. The --expected-set option additionally returns a 5-point Brazil-style expected set (in the conventional order [2, 1, 0, -1, -2]).
A correctness requirement is that μ=0 lies within the POI bounds (otherwise the b-only hypothesis is undefined). For reproducibility, the b-only and s+b toy ensembles use separate deterministic seeds.
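The tail-probability step is easy to state in code. A sketch given the two qtilde toy ensembles; NextStat's implementation additionally returns CLsb, CLb, and the expected set, and handles edge cases:

import numpy as np

def cls_from_toys(q_obs: float, q_sb: np.ndarray, q_b: np.ndarray) -> float:
    """CLs = CLsb / CLb with CLsb = P(q >= q_obs | s+b), CLb = P(q >= q_obs | b-only)."""
    clsb = float(np.mean(q_sb >= q_obs))
    clb = float(np.mean(q_b >= q_obs))
    return clsb / clb if clb > 0.0 else float("inf")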
Limitations and reproducibility checklist
Unbinned analysis reduces information loss from binning, but it does not remove the need for an explicit protocol. Common pitfalls are about contracts and boundary conditions rather than “math errors”.
- ›Bounded support is mandatory. Always set observables.bounds. KDE-based PDFs rely on bounded-support normalization.
- ›Selection under horizontal systematics. data.selection is evaluated on nominal branches; it is not recomputed for horizontal_systematics. Treat this as a modeling decision.
- ›KDE needs enough events. With very small training samples, bandwidth choice and boundary effects can dominate.
- ›Performance scaling. Event-level NLL is O(events) per evaluation; at very high statistics, binned models can be faster.
- ›Validate configs early. Use nextstat config schema --name unbinned_spec_v0 and keep the config alongside JSON outputs.
7. Binned vs Unbinned: When to Use What
| Criterion | Binned (HistFactory) | Unbinned |
|---|---|---|
| Statistics | High (thousands+) | Any, especially low |
| Signal | Wide relative to bin | Narrow resonance |
| Dimensionality | 1D–2D | 1D–ND without curse of dimensionality |
| Systematics | Per-bin modifiers | Per-event + per-bin |
| Speed | O(bins) per iteration | O(events) per iteration |
| Ecosystem | pyhf, ROOT, NextStat | RooFit, zfit, NextStat |
In NextStat both approaches coexist and share the same optimizer. Switching between them is a matter of input format, not a tool change.
8. Current Status and Roadmap
Unbinned analysis in NextStat is in Phase 1: a fully working implementation with 10 PDFs, extended likelihood, CLI commands, and JSON specification. The article reflects NextStat 0.9.0 (git 2326065, 2026-02-10).
- ›Phase 1 (now) — 1D unbinned PDFs, MLE fit, profile scan, toy studies, JSON spec v0, conservative GPU subset for NLL+gradient
- ›Phase 2 (next) — multidimensional PDFs (2D Gaussian, conditional PDF), GPU-accelerated unbinned NLL, Python API
- ›Phase 3 — full unbinned HistFactory compatibility (unbinned channels in the same workspace as binned)
Related Reading
- ›Docs: Unbinned Analysis — API, PDF catalog, JSON specification
- ›Docs: HistFactory Models — binned analysis for comparison
- ›CLI Reference — all unbinned-fit / unbinned-scan / unbinned-fit-toys commands
- ›Article: How NextStat Makes HistFactory Differentiable in PyTorch
