Now in Epiphron DRTx®

Governed AI research agents
for the investment workflow you already trust.

Build LLM-powered research agents once, version-control them, prove their accuracy with regression evals, enforce runtime guardrails, and trace every recommendation back to the exact tool call that produced it — all inside the same Epiphron platform that runs your data requests, transformations and investment systems integrations.

Build

Define an agent's persona once: model, system prompt, tool kit, output contract. Re-use it across every task that needs that thinking.


Govern

Snapshot agent state at publish time. Production runs pin to the published version — prompt edits don't silently change behavior.

Test

Author regression cases, run them on demand, gate publishing on a configurable pass-rate threshold. Treat agents like code.

Audit

Every recommendation traces back to the turn and tool call that emitted it. Every input, output, and token cost is captured immutably.

Build the persona

An LLM agent is a stored procedure with a system prompt.

Capture the agent's whole identity in one place: which model it talks to, how many turns it can take, the tool kit it can reach, the output contract it must satisfy, and the system prompt that frames every call.

  • System prompt: the persona instructions, saved verbatim in every published version.
  • Tool kit: a least-privilege checkbox list of approved tools the agent may invoke.
  • Output contract: SecuritiesList (validated against your SecId types) or free text.
  • Custom settings: parameters non-engineers can edit (sectors to focus on, sources to scan).
Add/Edit Agent Definition dialog
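For readers who think in code: a minimal sketch of what a persona like this could look like as data. The AgentDefinition class and every field name below are illustrative assumptions, not Epiphron's actual schema.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class AgentDefinition:
    """Illustrative shape of an agent persona (all field names are hypothetical)."""
    name: str
    model: str                       # which LLM the agent talks to
    max_turns: int                   # how many loop iterations it may take
    system_prompt: str               # saved verbatim in every published version
    tool_kit: tuple[str, ...]        # least-privilege list of approved tools
    output_contract: str             # e.g. "SecuritiesList" or "FreeText"
    custom_settings: dict = field(default_factory=dict)  # knobs non-engineers can edit

# Example: one sector-screening persona, reused across every task that needs it.
sector_screener = AgentDefinition(
    name="SectorScreener",
    model="claude-sonnet-4",
    max_turns=8,
    system_prompt="You are an equity research assistant. Recommend liquid names only.",
    tool_kit=("GetFundamentals", "GetPriceHistory"),
    output_contract="SecuritiesList",
    custom_settings={"sectors": ["Energy", "Utilities"], "sources": ["10-K", "transcripts"]},
)
```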
One pane per agent

Versions, evaluations, and guardrails — in one window.

The Manage Agent dashboard collapses the four governance dimensions of an agent into a single tabbed view. Quick KPIs up top tell you whether anything needs attention; tabs drill into version history, the eval suite, and bound guardrails.

  • Overview: active eval cases, active guardrails, last-run timestamp and status.
  • Versions: Published / Archived / Draft history with change notes per publish.
  • Evaluations: the agent's regression suite at a glance.
  • Guardrails: bound runtime safety checks and their per-binding configuration.
Manage Agent dashboard
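A sketch of the snapshot semantics behind the Versions tab, under the assumption that publishing deep-copies the draft definition. AgentVersionStore and its methods are illustrative, not the platform's API.

```python
import copy
from datetime import datetime, timezone

class AgentVersionStore:
    """Toy illustration of publish-time snapshots and version pinning."""

    def __init__(self):
        self.draft = None       # mutable working copy of the agent definition
        self.published = []     # immutable snapshots, newest last

    def publish(self, change_note: str) -> int:
        # Freeze the current draft: later prompt edits cannot reach this copy.
        snapshot = {
            "version": len(self.published) + 1,
            "published_at": datetime.now(timezone.utc).isoformat(),
            "change_note": change_note,
            "definition": copy.deepcopy(self.draft),
        }
        self.published.append(snapshot)
        return snapshot["version"]

    def resolve_for_production_run(self) -> dict:
        # Production runs pin to the latest published snapshot, never the draft.
        if not self.published:
            raise RuntimeError("Agent has never been published")
        return self.published[-1]
```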
Test before you ship

Regression tests for prompts, models, and tool kits.

Author canned input/expected-output pairs once and run them whenever you change anything. Grading modes cover both deterministic Mock-backed agents and stochastic real-LLM runs, and a pass-rate gate on publishing means you can't accidentally promote a regression.

  • ExactMatch: deterministic agents must return exactly the expected set.
  • SubsetMatch: a few specific names must appear; extras are allowed.
  • SecuritiesOverlap: Jaccard overlap above a configurable threshold (e.g. 60%).
  • Publish gate: block new releases until the eval pass rate clears the bar.
Evaluations window
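The grading modes above are simple enough to sketch. The Python below is an illustrative reading of how ExactMatch, SubsetMatch, SecuritiesOverlap, and the publish gate could score a run; the function names, the 60% default, and the 90% pass-rate bar are assumptions, not the shipped implementation.

```python
def exact_match(expected: set[str], actual: set[str]) -> bool:
    # Deterministic (Mock-backed) agents must return exactly the expected set.
    return actual == expected

def subset_match(required: set[str], actual: set[str]) -> bool:
    # A few specific names must appear; extra securities are allowed.
    return required <= actual

def securities_overlap(expected: set[str], actual: set[str], threshold: float = 0.6) -> bool:
    # Jaccard overlap between expected and actual must clear the configured threshold.
    union = expected | actual
    jaccard = len(expected & actual) / len(union) if union else 1.0
    return jaccard >= threshold

def publish_gate(case_results: list[bool], required_pass_rate: float = 0.9) -> bool:
    # Block publishing until the eval pass rate clears the bar.
    return bool(case_results) and sum(case_results) / len(case_results) >= required_pass_rate

# e.g. securities_overlap({"AAPL", "MSFT", "NVDA"}, {"AAPL", "NVDA", "AMD"})
# -> 2 shared / 4 in union = 0.5, which fails a 60% threshold.
```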
Govern at runtime

Guardrails on every production run, not just at design time.

Bind safety checks to an agent at one of three severities: Block aborts the run, Warn records the violation, and Note stamps the run for telemetry. Pre-input guardrails fire before the first LLM call; post-output guardrails validate every emitted security against a per-SecIdType regex.

  • SecIdValidation.Strict: reject malformed tickers, ISINs, and CUSIPs at the door.
  • TokenBudgetPerAgent: spend caps that throttle runaway costs in real time.
  • Per-binding config: the same guardrail can behave differently per agent.
  • Snapshot semantics: production runs use the bindings that existed at publish time, so nobody can loosen guardrails after release.
Guardrails window
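To make the mechanics concrete, here is a hypothetical sketch of a post-output SecId guardrail bound at different severities. The Severity enum, the regex patterns, and check_sec_ids are stand-ins invented for illustration, not the platform's real guardrail engine.

```python
import re
from enum import Enum

class Severity(Enum):
    BLOCK = "Block"   # abort the run
    WARN = "Warn"     # record the violation, keep going
    NOTE = "Note"     # stamp the run for telemetry only

# Hypothetical per-SecIdType patterns for a post-output validation guardrail.
SEC_ID_PATTERNS = {
    "Ticker": re.compile(r"[A-Z]{1,5}"),
    "ISIN":   re.compile(r"[A-Z]{2}[A-Z0-9]{9}\d"),
    "CUSIP":  re.compile(r"[A-Z0-9]{9}"),
}

class GuardrailViolation(Exception):
    pass

def check_sec_ids(securities: list[tuple[str, str]], severity: Severity) -> list[str]:
    """Validate (sec_id_type, value) pairs; behaviour depends on the bound severity."""
    violations = [
        f"{value!r} is not a valid {sec_type}"
        for sec_type, value in securities
        if not SEC_ID_PATTERNS[sec_type].fullmatch(value)
    ]
    if violations and severity is Severity.BLOCK:
        raise GuardrailViolation("; ".join(violations))  # Block: abort the run
    return violations  # Warn / Note: recorded against the run instead
```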
Watch the loop think

A streaming view of the agent's reasoning — turn by turn.

When you trigger a task, the live execution viewer streams every think → tool-call → tool-result event in real time. Token counts and a rolling cost estimate update as the run progresses. It's the fastest way to spot prompt drift, runaway turns, or a misbehaving tool.

  • [Thinking]: per-turn LLM calls with the target model identifier.
  • [ToolCalled]: the exact tool name and arguments the agent invoked.
  • [ToolResult]: a one-line summary of what came back; the full payload lives in the audit record.
  • Cooperative cancel: stop the agent mid-loop without losing the partial transcript.
Live Agent Execution viewer
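A rough idea of what consuming that stream looks like, assuming one simple event dictionary per Thinking / ToolCalled / ToolResult item. The event shapes and the follow_execution helper are invented for illustration, not the actual wire format.

```python
import threading
from typing import Iterable

def follow_execution(events: Iterable[dict], cancel: threading.Event) -> dict:
    """Consume a run's event stream with a rolling token count and cost estimate."""
    tokens, cost_usd, transcript = 0, 0.0, []
    for event in events:
        if cancel.is_set():
            break                              # cooperative cancel keeps the partial transcript
        transcript.append(event)
        kind = event["kind"]
        if kind == "Thinking":
            tokens += event["input_tokens"] + event["output_tokens"]
            cost_usd += event["cost_usd"]      # rolling estimate updates every turn
            print(f"[Thinking] {event['model']}  ({tokens} tok, ~${cost_usd:.4f})")
        elif kind == "ToolCalled":
            print(f"[ToolCalled] {event['tool']}({event['arguments']})")
        elif kind == "ToolResult":
            print(f"[ToolResult] {event['summary']}")
    return {"tokens": tokens, "cost_usd": cost_usd, "transcript": transcript}
```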
The audit trail your compliance team will love

Every recommendation traces back to the tool call that produced it.

Click any security in any execution and Epiphron tells you: which turn of the agent loop produced it, which tool call's output the security was extracted from, and the model's per-element confidence. Pair it with the immutable per-turn token log and you have regulatory-grade reproducibility — without hand-rolling your own observability stack.

  • Producing turn: the loop iteration that emitted the security.
  • Producing tool: which tool call's output the security came from.
  • Confidence: the LLM's per-element score, when provided.
  • Version pinning: historical runs stay tied to a published snapshot, so they remain interpretable forever.
Provenance viewer
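To make the provenance answer concrete, a hypothetical sketch of the lookup: the Provenance dataclass, the trace function, and the execution-record shape are assumptions chosen to mirror the bullets above, not the product's API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Provenance:
    """Hypothetical shape of a provenance answer for one recommended security."""
    security: str
    producing_turn: int          # loop iteration that emitted the security
    producing_tool: str          # tool call whose output it was extracted from
    confidence: Optional[float]  # LLM's per-element score, when provided
    agent_version: int           # pinned published version of the agent

def trace(execution: dict, security: str) -> Provenance:
    # Walk the immutable per-turn log and find where the security first appeared.
    for turn in execution["turns"]:
        for call in turn["tool_calls"]:
            if security in call["extracted_securities"]:
                return Provenance(
                    security=security,
                    producing_turn=turn["index"],
                    producing_tool=call["tool"],
                    confidence=call["extracted_securities"][security].get("confidence"),
                    agent_version=execution["agent_version"],
                )
    raise KeyError(f"{security} was not produced by this execution")
```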
Bring your own LLM stack

Anthropic, OpenAI, Google, AWS Bedrock, Azure OpenAI, xAI — or all of them.

A vendor-neutral connection layer with one keychain, one budget cap per provider, and a per-model cost catalog, so every execution's USD estimate is computed live. A built-in Mock provider gives you deterministic, free, offline regression runs.

  • Encrypted at rest: DPAPI-protected API keys, rotated without redeployment.
  • Monthly budget caps: per-provider USD ceilings that gate spend.
  • Per-model pricing: input/output token costs stored on the model row, not in code.
  • Mock provider: deterministic outputs for repeatable regression tests.
Providers and Models screen
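The live USD estimate is straightforward to sketch once pricing lives in the model catalog. Everything below, including the price figures, cap amounts, and helper names, is placeholder data used only to show the arithmetic.

```python
# Per-model pricing lives on the model row, not in code; figures here are placeholders.
MODEL_PRICING_USD_PER_MTOK = {
    "claude-sonnet-4": {"input": 3.00, "output": 15.00},
    "mock":            {"input": 0.00, "output": 0.00},   # deterministic, free, offline
}

MONTHLY_BUDGET_CAP_USD = {"anthropic": 500.00}   # per-provider ceiling
month_to_date_spend_usd = {"anthropic": 342.18}  # tracked from the per-turn token logs

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Live USD estimate for one execution, from the per-model cost catalog."""
    price = MODEL_PRICING_USD_PER_MTOK[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000

def within_budget(provider: str, run_cost: float) -> bool:
    # Gate spend: refuse the run if it would push the provider past its monthly cap.
    return month_to_date_spend_usd[provider] + run_cost <= MONTHLY_BUDGET_CAP_USD[provider]

# e.g. estimate_cost("claude-sonnet-4", 12_000, 2_500)
# -> (12,000 * 3 + 2,500 * 15) / 1,000,000 = $0.0735
```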