Test Case detail

The Test Case detail panel is where you drill into a single case - read its conversation, inspect the trace that produced it, edit the inputs, manage its assertions, and re-run it on demand. Deep-link to it from any test-case row in Testing Lab, or from a failure in Reports.

Don't confuse this with the Test Case list page (the index view under Trust Centre → Test Cases), which groups cases by type - Knowledge Base / Playground saved session / Scenario. This page documents the per-case detail panel that opens when you click into a single case.

Detail-panel layout

The panel uses a two-column layout: the conversation on the left, and the case metadata plus execution trace on the right. The layout varies slightly by tab; the Conversation tab, for example, keeps the trace inline.

The detail panel has three tabs:

Tab	What's on it
Conversation	The captured user ↔ agent turns, paired with the execution trace per turn (when a baseline exists).
Assertions	The checks ticked when the case was saved (turn-anchored + global).
Settings	Case name, source agent, initial state, datasets the case belongs to, and the Re-run button. A stale-baseline banner appears here when the underlying config has drifted (see below).

Conversation tab - inline traces

For cases captured via Trial run (i.e. cases with a baseline), each agent message is paired with the execution trace that produced it:

Tool calls - which tool fired, with what arguments, with what return value.
Memory updates - keys read / written during the turn, with before / after values.
Agent decisions - Context Expert routing pick, sub-agent push / pop.
Per-turn metrics - latency, token cost, evaluator scores.

Tip: For the full guide to what a trace contains and how to read it, see Reading an execution trace; for a step-by-step playbook when a case fails, see Debug a failing test case.

The trace is fetched lazily from `GET /v1/agentic-qa/testcases/:slug/trace?bot=<botId>`, and only when you switch to the Conversation tab. The result is cached with a 5-minute `staleTime`: baselines are immutable, so there is no point re-fetching sooner.
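The lazy-fetch behaviour can be sketched roughly as follows. This is a minimal illustration, not the product's client code: only the endpoint path, the `bot` query parameter, and the 5-minute window come from this page. The function names, the `"conversation"` tab identifier, and the example base URL are assumptions made for the sketch.

```typescript
// 5-minute freshness window, as described above.
const TRACE_STALE_TIME_MS = 5 * 60 * 1000;

// Hypothetical helper: builds the trace URL for a case slug and bot ID.
function buildTraceUrl(baseUrl: string, slug: string, botId: string): string {
  const root = baseUrl.replace(/\/+$/, ""); // drop any trailing slashes
  return `${root}/v1/agentic-qa/testcases/${encodeURIComponent(slug)}/trace?bot=${encodeURIComponent(botId)}`;
}

// Hypothetical helper: decides whether a network request is needed.
// - Never fetch unless the Conversation tab is active (lazy loading).
// - Fetch when nothing is cached yet, or when the cached copy is older
//   than the freshness window.
function shouldFetchTrace(
  activeTab: string,
  fetchedAt: number | null,
  now: number
): boolean {
  if (activeTab !== "conversation") return false;
  if (fetchedAt === null) return true;
  return now - fetchedAt >= TRACE_STALE_TIME_MS;
}
```

In a real client, a data-fetching library with a configurable stale time would typically handle the caching; the hand-rolled check above only makes the rule explicit.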

If the case predates the baseline-capture flow (v2 agents, legacy cases imported from elsewhere), the trace is unavailable and the tab shows the plain bubble list. There's no way to retrofit a baseline onto a legacy case - re-capture via Trial run if you need the inline traces.

💡 Try the Nexus AI Layer: "This test case is failing on turn 3. Explain what the bot did, why it picked the wrong agent, and what to change."

Assertions tab

This tab lists the assertions you ticked in the Assertion picker when the case was saved. Each assertion has the following fields:

Field	Notes
Type	`tool_called`, `memory_set`, `intent_triggered`, `max_turns`.
Anchor	`turnIndex` for turn-anchored assertions; `conversation` for global assertions.
Expected value	What the assertion checks for.
Enabled	Toggle - disable an assertion without removing it (useful if you're debugging a flake).

You can add an assertion manually on this tab (without going through the picker again) or remove one that's become too strict.
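As a rough mental model, the fields in the table above could be typed as shown below. The four type names are taken from the table; everything else (the `CaseAssertion` interface, the `kind` discriminator, the sample values, and both helper functions) is hypothetical and not the platform's actual schema.

```typescript
// Assertion types listed in the table above.
type AssertionType = "tool_called" | "memory_set" | "intent_triggered" | "max_turns";

// Assumed shape: an assertion is anchored either to one turn or to the whole conversation.
type Anchor = { kind: "turn"; turnIndex: number } | { kind: "conversation" };

interface CaseAssertion {
  type: AssertionType;
  anchor: Anchor;
  expected: string | number; // what the assertion checks for
  enabled: boolean;          // disabled assertions stay on the case but are skipped
}

// Only enabled assertions take part in a run.
function activeAssertions(all: CaseAssertion[]): CaseAssertion[] {
  return all.filter((a) => a.enabled);
}

// Human-readable label for where an assertion applies.
function describeAnchor(anchor: Anchor): string {
  return anchor.kind === "turn" ? `turn ${anchor.turnIndex}` : "whole conversation";
}

// Illustrative data only; tool and key names are made up.
const exampleAssertions: CaseAssertion[] = [
  { type: "tool_called", anchor: { kind: "turn", turnIndex: 2 }, expected: "lookup_order", enabled: true },
  { type: "memory_set", anchor: { kind: "turn", turnIndex: 3 }, expected: "order_id", enabled: false },
  { type: "max_turns", anchor: { kind: "conversation" }, expected: 6, enabled: true },
];
```

Modelling `enabled` as a flag rather than deleting the entry mirrors the toggle described above: a flaky assertion can be switched off while debugging and switched back on without re-entering its expected value.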

Settings tab - edit + re-run

Action	What it does
Edit inputs	Change the user messages or initial state. Marks the case `stale` until you re-capture the baseline.
Reassign to a dataset	Move the case between datasets, or add it to multiple datasets.
Re-run	Queues this single case immediately. Useful when iterating on a fix - no need to run the whole dataset.
Delete	Soft-deletes the case (flags it `deleted` rather than removing the record). There's no self-serve restore - treat deletion as final.

The Re-run button is the fastest feedback loop in the Trust Centre: tweak the bot config, click Re-run, watch the conversation + trace re-render with the new behaviour.
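The state changes behind Edit inputs and Delete can be pictured with a small sketch. It assumes a simplified, hypothetical record shape (`TestCaseRecord`) and helper names. The only behaviour taken from this page is that editing inputs marks the case stale until the baseline is re-captured, and that deletion flags the record rather than removing it.

```typescript
// Hypothetical, simplified case record for illustration.
interface TestCaseRecord {
  slug: string;
  userMessages: string[];
  stale: boolean;   // true once inputs no longer match the captured baseline
  deleted: boolean; // soft-delete flag; the record itself is kept
}

// Editing inputs invalidates the baseline, so the case becomes stale.
function editInputs(c: TestCaseRecord, userMessages: string[]): TestCaseRecord {
  return { ...c, userMessages, stale: true };
}

// Re-capturing the baseline clears the stale flag.
function recaptureBaseline(c: TestCaseRecord): TestCaseRecord {
  return { ...c, stale: false };
}

// Soft delete: flag the record instead of removing it.
function softDelete(c: TestCaseRecord): TestCaseRecord {
  return { ...c, deleted: true };
}

// Lists only the cases that have not been soft-deleted.
function visibleCases(all: TestCaseRecord[]): TestCaseRecord[] {
  return all.filter((c) => !c.deleted);
}
```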

When a case goes stale

If the bot config changed in ways that affect this case (an evaluator threshold shifted, a routing rule was added, a referenced variable was renamed), the detail panel surfaces a stale-baseline banner prompting a re-run. The same signal appears as a chip on the case's row in the Testing Lab table.

Two options to clear staleness:

Re-capture the baseline - open Trial run on this case, replay it, approve the new trace.
Accept the staleness - if you know the change was intentional and the case still tests the right thing, dismiss the chip. The case keeps running; the dashboard just doesn't promise the trace baseline is current.

Best practices

Open the Conversation tab first when a case fails. The trace pairing usually tells you why - wrong tool, missing memory write, the model misread an argument. Skip the tab if you're sure it's a flake.
Add assertions over time, not at creation. Start with "right agent fired"; once that's stable, add response-content checks; later, add latency / cost checks.
Re-run a single case before re-running the dataset. Faster iteration; cheaper queue cost.
Don't edit and re-baseline without thinking. A stale chip is a warning, not a bug. If the case was right and the bot is wrong, the case should keep failing - fix the bot, don't lower the bar.

💡 Try the Nexus AI Layer: "This case used to pass and is now failing after I added a Routing Logic rule. Show me the diff between the old baseline trace and the new run, and tell me whether to update the case or revert the rule."

Read next: Reports - see how this case scored in the latest run alongside the rest of its dataset.
