NextStat

Optimizer Convergence & Best-NLL Philosophy

NextStat uses L-BFGS-B and targets the best NLL minimum by default. Differences from pyhf in best-fit parameters on large models (>100 parameters) are expected, documented behavior, not a bug.

Position: Best-NLL by Default

  • NextStat does not intentionally constrain the optimizer to match a specific external tool.
  • If L-BFGS-B finds a deeper minimum than pyhf's SLSQP, that is a correct result.
  • Objective parity is validated: NextStat and pyhf compute the same NLL at the same parameter point (agreement at the ~1e-13 to ~1e-9 level).
  • Differences come from the optimizer, not the model.

Typical Mismatch Scale

Model               Parameters   ΔNLL (NS − pyhf)   Reason
simple_workspace    2            0.0                Both converge
complex_workspace   9            0.0                Both converge
tchannel            184          −0.01 to −0.08     pyhf SLSQP premature stop
tHu                 ~200         −0.08              pyhf SLSQP premature stop
tttt                249          −0.01              pyhf SLSQP premature stop

Negative ΔNLL means NextStat finds a better (lower) minimum.
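
One way to see the premature stop directly, using only pyhf's public API, is to evaluate a finite-difference gradient of the NLL at pyhf's reported best fit. This is a minimal sketch, assuming a generic workspace.json; a sizeable gradient component at the reported minimum means the optimizer stopped before reaching stationarity.

import json

import numpy as np
import pyhf

ws = json.load(open("workspace.json"))
workspace = pyhf.Workspace(ws)
model = workspace.model()
data = workspace.data(model)

pars = np.asarray(pyhf.infer.mle.fit(data, model), dtype=float)

def nll(p):
    # pyhf reports -2 ln L; halve it to get the NLL
    return 0.5 * float(pyhf.infer.mle.twice_nll(p, data, model)[0])

# Central finite differences at pyhf's reported minimum
eps = 1e-6
grad = np.array([(nll(pars + eps * e) - nll(pars - eps * e)) / (2 * eps)
                 for e in np.eye(len(pars))])
print("max |grad_i| at pyhf best fit:", np.abs(grad).max())  # large => premature stop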

Parity Levels

Level 1: Objective Parity (P0, required)

NLL(params) matches between NextStat and pyhf at the same params. Tolerance: rtol=1e-6, atol=1e-8. Verified by golden tests on all fixture workspaces.
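
Concretely, Level 1 asserts the following. In this sketch, pyhf.infer.mle.twice_nll is pyhf's real API; the NextStat-side ns_model.nll(pars) accessor is an assumption for illustration (this page only documents nextstat.from_pyhf and nextstat.fit):

import json

import numpy as np
import pyhf
import nextstat

ws = json.load(open("workspace.json"))
workspace = pyhf.Workspace(ws)
model = workspace.model()
data = workspace.data(model)

pars = model.config.suggested_init()
pyhf_nll = 0.5 * float(pyhf.infer.mle.twice_nll(pars, data, model)[0])

ns_model = nextstat.from_pyhf(json.dumps(ws))
ns_nll = ns_model.nll(pars)  # hypothetical accessor; assumes pyhf parameter ordering

assert np.isclose(ns_nll, pyhf_nll, rtol=1e-6, atol=1e-8)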

Level 2: Fit Parity (P1, conditional)

Best-fit parameters match within tolerances: atol=2e-4 on parameters, atol=5e-4 on uncertainties. Agreement is full on small models (<50 params); on large models mismatches arise from the different optimizers. A mismatch is not a defect as long as the NS NLL ≤ the pyhf NLL.
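
A Level 2 check could then compare best-fit points directly. As in the previous sketch, result.parameters on the NextStat side is an assumed attribute (only result.nll appears on this page), and pyhf's parameter ordering is assumed on both sides:

import json

import numpy as np
import pyhf
import nextstat

ws = json.load(open("workspace.json"))
workspace = pyhf.Workspace(ws)
model = workspace.model()
data = workspace.data(model)

pyhf_pars = np.asarray(pyhf.infer.mle.fit(data, model))

ns_model = nextstat.from_pyhf(json.dumps(ws))
result = nextstat.fit(ns_model)
ns_pars = np.asarray(result.parameters)  # hypothetical attribute

if not np.allclose(ns_pars, pyhf_pars, atol=2e-4):
    # Expected on large models; acceptable whenever NS found a deeper minimum
    print("parameter mismatch; verify result.nll <= pyhf NLL")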

Level 3: Optimizer Compatibility (rejected)

Intentionally degrading the optimizer to match SLSQP is rejected — it is an artificial constraint with no scientific value.

How to Verify

# For users
import nextstat, json

ws = json.load(open("workspace.json"))
model = nextstat.from_pyhf(json.dumps(ws))
result = nextstat.fit(model)
print(f"NLL: {result.nll}")  # lower is better

# For developers (parity checks)
make pyhf-audit-nll   # Objective parity (must always pass)
make pyhf-audit-fit   # Fit parity (may differ on large models)

# Cross-eval diagnostic
python tests/diagnose_optimizer.py workspace.json

Warm-Start for pyhf Reproducibility

If a specific use case requires matching pyhf (e.g. reproducing a published result):

import pyhf, nextstat, json

# 1. Fit in pyhf
ws = json.load(open("workspace.json"))
workspace = pyhf.Workspace(ws)
model = workspace.model()
data = workspace.data(model)  # observed main data + auxdata
pyhf_pars = pyhf.infer.mle.fit(data, model)  # fit takes (data, model)

# 2. Warm-start NextStat from the pyhf point
ns_model = nextstat.from_pyhf(json.dumps(ws))
result = nextstat.fit(ns_model, init_pars=pyhf_pars.tolist())
# result.nll <= pyhf NLL: the fit starts at pyhf's solution, and the
# line search never accepts an uphill step, so the NLL can only improve

L-BFGS-B vs SLSQP

Aspect                 L-BFGS-B (NextStat)           SLSQP (pyhf/scipy)
Hessian                Quasi-Newton (m=10 history)   Rank-1 update
Bounds                 Native box constraints        Native box constraints
Convergence            max |proj grad| < pgtol       ||grad|| threshold
Scaling                O(m·n) per iteration          O(n²) per iteration
Large models (>100 params)   Robust                  Often premature stop
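
The contrast is reproducible with scipy alone, independent of NextStat or HistFactory models. This is a toy sketch on a bounded 50-dimensional Rosenbrock problem with scipy's default iteration limits; outcomes vary by machine, but with defaults SLSQP often hits its iteration cap at a higher objective value than L-BFGS-B.

import numpy as np
from scipy.optimize import minimize, rosen

n = 50
x0 = np.full(n, 2.0)          # common starting point for both methods
bounds = [(-5.0, 5.0)] * n    # box constraints, as in a typical fit

for method in ("L-BFGS-B", "SLSQP"):
    res = minimize(rosen, x0, method=method, bounds=bounds)
    print(f"{method:8s}  f*={res.fun:.3e}  nit={res.nit}  success={res.success}")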

Profile Scan Evidence

Fixture            NS vs pyhf |Δq(μ)|   NS vs ROOT |Δq(μ)|   ROOT fit
xmlimport          1e-7                 0.051                Converged
multichannel       4e-7                 3.4e-8               Converged
coupled_histosys   5e-6                 22.5                 FAILED (status=-1)
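
For context, the |Δq(μ)| columns compare the profile-likelihood test statistic evaluated at matching μ points in each tool. Below is a sketch of the pyhf side of such a scan; the μ grid is illustrative, and the NextStat call in the comment is hypothetical, not a documented API.

import json

import numpy as np
import pyhf
from pyhf.infer.test_statistics import qmu_tilde

ws = json.load(open("workspace.json"))
workspace = pyhf.Workspace(ws)
model = workspace.model()
data = workspace.data(model)

init = model.config.suggested_init()
bounds = model.config.suggested_bounds()
fixed = model.config.suggested_fixed()

for mu in np.linspace(0.0, 5.0, 11):  # illustrative grid
    q_pyhf = float(qmu_tilde(mu, data, model, init, bounds, fixed))
    # q_ns = nextstat.qmu_tilde(ns_model, mu)   # hypothetical NextStat analogue
    print(f"mu={mu:.2f}  q_pyhf={q_pyhf:.6f}")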