# Gymnasium RL Environment

*Status: experimental*
`nextstat.gym` provides an optional Gymnasium/Gym wrapper that treats a HistFactory workspace as an RL / design-of-experiments environment: the agent proposes updates to one sample's nominal yields (e.g. a signal histogram) and receives a NextStat metric as the reward.
## Installation

```bash
pip install nextstat gymnasium numpy
```

## Quick Start
```python
from pathlib import Path
from nextstat.gym import make_histfactory_env

ws_json = Path("workspace.json").read_text()
env = make_histfactory_env(
    ws_json,
    channel="singlechannel",
    sample="signal",
    reward_metric="q0",    # maximize discovery significance
    max_steps=64,
    action_scale=0.02,
    action_mode="logmul",  # multiplicative updates in log-space
    init_noise=0.0,
)

obs, info = env.reset(seed=123)
total = 0.0
for _ in range(64):
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)
    total += float(reward)
    if terminated or truncated:
        break
print("episode reward:", total)
```

## Reward Metrics
| Metric | Profiled? | Description |
|---|---|---|
| `nll` | No | -NLL at fixed parameters (fast; many steps per second) |
| `q0` / `z0` | Yes | Discovery test statistic / significance |
| `qmu` / `zmu` | Yes | -qμ / -sqrt(qμ), for upper-limit optimization |
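Metric names map directly to the `reward_metric` argument. A minimal sketch of switching modes, reusing `ws_json` from the Quick Start (only keyword arguments documented above are used; other options keep their defaults):

```python
from nextstat.gym import make_histfactory_env

# Fast, unprofiled reward: -NLL at fixed parameters.
fast_env = make_histfactory_env(
    ws_json, channel="singlechannel", sample="signal", reward_metric="nll"
)

# Profiled reward: each step runs an internal fit, so expect far fewer steps/sec.
disc_env = make_histfactory_env(
    ws_json, channel="singlechannel", sample="signal", reward_metric="z0"
)
```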
## Configuration
- `action_mode` – `"additive"` (direct delta) or `"logmul"` (multiplicative in log-space); see the sketch after this list
- `action_scale` – scale factor applied to actions (default `0.02`)
- `max_steps` – episode length before truncation
- `init_noise` – Gaussian noise added to the initial yields on reset
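The two action modes imply different update rules for the chosen sample's yields. A minimal NumPy sketch of the arithmetic suggested by the descriptions above (the actual internals of `nextstat.gym` may differ):

```python
import numpy as np

yields = np.array([5.0, 10.0])   # current nominal yields
action = np.array([0.5, -1.0])   # raw action sampled by the agent
action_scale = 0.02

# "additive": the scaled action is added directly to the yields
additive = yields + action_scale * action

# "logmul": multiplicative update in log-space, which keeps yields positive
logmul = yields * np.exp(action_scale * action)
```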
## Notes
- Profiled rewards (`q0`, `qmu`) run an optimization internally, so they are heavier per step than NLL mode.
- Compatible with both `gymnasium` (preferred) and legacy `gym`.
- The environment modifies the model in place by overriding one sample's nominal yields.
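Because the wrapper exposes the standard `Env` interface, any agent or search routine can drive it. A minimal random-search sketch (hypothetical driver code, not part of `nextstat`) that keeps the seed of the best-scoring rollout from the Quick Start's `env`:

```python
best_total, best_seed = float("-inf"), None
for seed in range(20):
    obs, info = env.reset(seed=seed)
    total = 0.0
    while True:
        obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
        total += float(reward)
        if terminated or truncated:
            break
    if total > best_total:
        best_total, best_seed = total, seed

print(f"best episode reward {best_total:.3f} (seed {best_seed})")
```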
