NextStat

Agentic Analysis

LLM Tool Definitions for AI-Driven Statistical Analysis

NextStat exposes 21 operations spanning 12+ statistical verticals as standardised tool definitions compatible with OpenAI function calling, LangChain, and the Model Context Protocol (MCP). AI agents (GPT-4o, Llama 4, Claude, local Ollama models) can programmatically discover and invoke regression fits, survival analysis, hypothesis tests, causal inference, time-series models, and more.

The Key Idea

A researcher says: "Fit a Cox PH model on this clinical trial data, then run Bayesian sampling with 4 chains and show me the posterior." The agent calls nextstat_survival_fit then nextstat_bayesian_sample — no script needed.
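
Replayed by hand, the agent's conversation reduces to two execute_tool calls. A minimal sketch of that sequence, with placeholder argument payloads (fit_args and sample_args are illustrative; the authoritative field names come from each tool's JSON Schema via get_tool_schema, documented below):

from nextstat.tools import execute_tool, get_tool_schema

# Step 0: inspect each tool's input schema to learn its argument names
print(get_tool_schema("nextstat_survival_fit"))
print(get_tool_schema("nextstat_bayesian_sample"))

# Placeholder payloads -- fill these in from the schemas printed above
fit_args = {}     # e.g. the trial data plus the model family (Cox PH)
sample_args = {}  # e.g. the model spec plus chain/draw settings (4 chains)

fit = execute_tool("nextstat_survival_fit", fit_args)              # step 1: Cox PH fit
posterior = execute_tool("nextstat_bayesian_sample", sample_args)  # step 2: NUTS posterior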

Available Tools

Tool                            Description

HEP / HistFactory
nextstat_fit                    MLE fit → best-fit params, uncertainties, NLL, convergence
nextstat_hypotest               Asymptotic CLs hypothesis test at a given μ
nextstat_hypotest_toys          Toy-based CLs hypothesis test
nextstat_upper_limit            95% CL upper limit via CLs scan
nextstat_ranking                Systematic impact ranking (feature importance)
nextstat_discovery_asymptotic   Discovery significance Z₀ (background-only test)
nextstat_scan                   Profile likelihood scan over μ values
nextstat_workspace_audit        Workspace compatibility audit
nextstat_read_root_histogram    Read a TH1 histogram from a ROOT file

Regression & Bayesian
nextstat_glm_fit                GLM fit (linear, logistic, Poisson, NB) → coefficients, SE
nextstat_bayesian_sample        NUTS sampling for any model → posterior, ESS, R̂, diagnostics

Survival & Clinical
nextstat_survival_fit           Cox PH / Weibull / Log-Normal AFT / Exponential MLE fit
nextstat_kaplan_meier           KM curve + optional log-rank test

Econometrics & Causal
nextstat_panel_fe               Panel fixed effects (within estimator) with cluster-robust SE
nextstat_did                    Difference-in-Differences (TWFE) → ATT with cluster-robust SE
nextstat_iv_2sls                IV / 2SLS with weak-instrument diagnostics
nextstat_aipw                   Doubly robust ATE/ATT (AIPW) with propensity diagnostics

Time Series & Other
nextstat_kalman                 Kalman filter / smoother / forecast on state-space models
nextstat_meta_analysis          Fixed/random-effects meta-analysis with I², Q, τ²
nextstat_churn_retention        Churn retention curve from tenure/event data
nextstat_chain_ladder           Insurance reserving (chain ladder / Mack with prediction errors)

OpenAI Function Calling

import json, openai
from nextstat.tools import get_toolkit, execute_tool

# 1. Get tool definitions (OpenAI-compatible JSON Schema)
tools = get_toolkit()

# 2. Send to the model
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Fit this workspace and tell me the signal strength"},
    ],
    tools=tools,
)

# 3. Execute the model's tool calls (tool_calls is None if it answered directly)
for call in response.choices[0].message.tool_calls or []:
    result = execute_tool(call.function.name, json.loads(call.function.arguments))
    print(json.dumps(result, indent=2))
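
To get a narrated answer, the standard OpenAI tool-calling loop feeds each result back as a role-"tool" message and calls the API again. A sketch of that second turn, reusing the variables from the example above:

# 4. Return tool outputs to the model and ask for a plain-language summary
messages = [
    {"role": "user", "content": "Fit this workspace and tell me the signal strength"},
    response.choices[0].message,  # the assistant turn that requested the tools
]
for call in response.choices[0].message.tool_calls or []:
    result = execute_tool(call.function.name, json.loads(call.function.arguments))
    messages.append({
        "role": "tool",
        "tool_call_id": call.id,
        "content": json.dumps(result),
    })

final = openai.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
print(final.choices[0].message.content)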

LangChain Integration

from nextstat.tools import get_langchain_tools
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain.agents import create_tool_calling_agent, AgentExecutor

tools = get_langchain_tools()  # list of StructuredTool
llm = ChatOpenAI(model="gpt-4o")

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a statistical analysis assistant with NextStat tools."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)

result = executor.invoke({
    "input": "What is the discovery significance of this workspace?"
})

MCP (Model Context Protocol)

For MCP-based tool servers (used by Windsurf, Cursor, Claude Desktop):

from nextstat.tools import get_mcp_tools, handle_mcp_call

# Register tools with your MCP server
tools = get_mcp_tools()  # list of {name, description, inputSchema}

# Handle incoming tool calls
result = handle_mcp_call("nextstat_fit", {"workspace_json": ws_str})
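
To serve these over MCP end to end, the definitions can be registered with a server. A minimal sketch, assuming the official mcp Python SDK's low-level Server API (the wiring below is ours, not part of NextStat):

import asyncio, json
import mcp.types as types
from mcp.server import Server
from mcp.server.stdio import stdio_server
from nextstat.tools import get_mcp_tools, handle_mcp_call

server = Server("nextstat")

@server.list_tools()
async def list_tools() -> list[types.Tool]:
    # get_mcp_tools() already yields the {name, description, inputSchema} shape MCP expects
    return [types.Tool(**t) for t in get_mcp_tools()]

@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
    result = handle_mcp_call(name, arguments)
    return [types.TextContent(type="text", text=json.dumps(result))]

async def main():
    async with stdio_server() as (read, write):
        await server.run(read, write, server.create_initialization_options())

asyncio.run(main())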

Local Agents (Ollama / vLLM)

The tool schemas work with any model that supports function calling. For local models via Ollama or vLLM:

import json, requests
from nextstat.tools import get_toolkit, execute_tool

tools = get_toolkit()

# Ollama with function calling ("stream": False so the reply parses as one JSON object)
response = requests.post("http://localhost:11434/api/chat", json={
    "model": "llama3.1",
    "messages": [{"role": "user", "content": "Fit this workspace"}],
    "tools": tools,
    "stream": False,
})

# Parse and execute (Ollama returns arguments as a dict, so no json.loads needed)
for call in response.json()["message"].get("tool_calls", []):
    result = execute_tool(call["function"]["name"], call["function"]["arguments"])
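
As with OpenAI, the tool outputs can go back to the model for a plain-language summary. A sketch assuming Ollama's role-"tool" convention for returning results (the exact tool-message fields have varied across Ollama versions):

# Rebuild the conversation: user turn, assistant turn (with tool_calls), tool results
reply = response.json()["message"]
messages = [{"role": "user", "content": "Fit this workspace"}, reply]
for call in reply.get("tool_calls", []):
    result = execute_tool(call["function"]["name"], call["function"]["arguments"])
    messages.append({"role": "tool", "content": json.dumps(result)})

final = requests.post("http://localhost:11434/api/chat", json={
    "model": "llama3.1", "messages": messages, "stream": False,
})
print(final.json()["message"]["content"])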

API Reference

Function                   Returns

get_toolkit()              OpenAI-compatible tool definitions (list of dicts)
execute_tool(name, args)   Execute a tool call → JSON-serialisable dict
get_langchain_tools()      LangChain StructuredTool instances
get_mcp_tools()            MCP-compatible tool definitions
get_tool_names()           List of available tool name strings
get_tool_schema(name)      JSON Schema for a specific tool

Server Mode (nextstat-server)

If you run nextstat-server, agents can fetch tool definitions and execute tools over HTTP, with no Python import needed:

  • Tool registry: GET /v1/tools/schema
  • Tool execution: POST /v1/tools/execute

The Python toolkit can also proxy through the server rather than running locally:

from nextstat.tools import get_toolkit, execute_tool

server_url = "http://127.0.0.1:3742"
tools = get_toolkit(transport="server", server_url=server_url)

out = execute_tool(
    "nextstat_hypotest",
    {"workspace_json": "...", "mu": 1.0, "execution": {"deterministic": True}},
    transport="server",
    server_url=server_url,
)
# server_url can also be supplied via the NEXTSTAT_SERVER_URL env var
# pass fallback_to_local=False to disable falling back to local execution
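
Non-Python agents can hit the two endpoints directly. A sketch using requests; the execution payload shape ({name, arguments}) is an assumption here, so confirm it against the server docs linked below:

import requests

server_url = "http://127.0.0.1:3742"

# Tool registry: the same definitions get_toolkit() returns
tools = requests.get(f"{server_url}/v1/tools/schema").json()

# Tool execution: the body shape below is illustrative
out = requests.post(f"{server_url}/v1/tools/execute", json={
    "name": "nextstat_fit",
    "arguments": {"workspace_json": "..."},
}).json()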

Full server docs: NextStat Server.