Tracing & analytics
On the Tracing & Analytics page you can review every LLM call, inspect a single call's full log, diagnose errors and slow responses, and check whether knowledge base retrieval is working as expected.
Tracing & Analytics records every call your AI agents make to a language model — the agent replying to a customer, or a background job such as smart template groups generating templates. For each call it keeps the tokens used, the latency, whether it succeeded or failed, which knowledge base passages were retrieved, and the full request and response sent to and received from the model. Use it to find out why an AI reply was slow, why a call failed, or why the knowledge base did or did not get used.
The overview cards
At the top, four cards summarize everything that matches your current filters and time range. They update as you change the filters.
| Card | What it shows |
|---|---|
| Calls | Total number of LLM calls in the current view |
| Tokens (in/out) | Input tokens / output tokens, shown as a pair |
| Avg latency | Average response time, in milliseconds or seconds |
| Failure rate | Share of calls that failed; highlighted in red above 5% |
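To make the card arithmetic concrete, here is a minimal sketch of how the four figures could be derived from a list of call records. The `CallRecord` shape and the `summarize` helper are hypothetical and not part of the product. The sketch also assumes that failed calls count toward the average latency, which this page does not state.

```python
from dataclasses import dataclass


@dataclass
class CallRecord:
    # Hypothetical shape of one logged LLM call; field names are assumptions.
    input_tokens: int
    output_tokens: int
    latency_ms: float
    succeeded: bool


def summarize(calls: list[CallRecord]) -> dict:
    """Compute the four overview figures for the calls in the current view."""
    total = len(calls)
    failed = sum(1 for c in calls if not c.succeeded)
    failure_rate = failed / total if total else 0.0
    return {
        "calls": total,
        "tokens_in": sum(c.input_tokens for c in calls),
        "tokens_out": sum(c.output_tokens for c in calls),
        # Assumption: every call, failed or not, counts toward the average.
        "avg_latency_ms": sum(c.latency_ms for c in calls) / total if total else 0.0,
        "failure_rate": failure_rate,
        # The card is highlighted above 5%, so the comparison is strict.
        "failure_highlighted": failure_rate > 0.05,
    }


sample = [
    CallRecord(1200, 300, 800.0, True),
    CallRecord(900, 0, 4200.0, False),
    CallRecord(1500, 450, 1000.0, True),
    CallRecord(700, 200, 2000.0, True),
]
summary = summarize(sample)
```

With this sample, one failure in four calls gives a failure rate of 25%, which is well above the 5% mark and would be highlighted.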
Filtering the logs
The following fields are available as filters:
| Filter | What it does |
|---|---|
| AI Agent | Limit to one agent, or All agents |
| Source | AI responder, Template supplement, or All sources |
| Provider | OpenAI, Gemini, Anthropic, or All providers |
| Model | A specific model under the chosen provider, or All models |
| Session ID | Find all calls for one conversation; matches by ID prefix |
| Status | All, Success, or Failed |
| Min latency (ms) | Show only calls at or above a latency threshold — useful for hunting slow calls |
| Time range | The date window to look at; defaults to the last 7 days |
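The filter semantics in the table can be sketched as a predicate over call records. This is an illustration only: the `Call` and `Filters` types, their field names, and the sample agent and model names are assumptions, and the time range filter is left out for brevity. The two details worth noting are that Session ID matches by prefix and that Min latency keeps calls at or above the threshold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Call:
    # Hypothetical record; field names are assumptions, not a documented schema.
    agent: str
    source: str
    provider: str
    model: str
    session_id: str
    succeeded: bool
    latency_ms: float


@dataclass
class Filters:
    # None stands for the "All ..." option of each filter.
    agent: Optional[str] = None
    source: Optional[str] = None
    provider: Optional[str] = None
    model: Optional[str] = None
    session_prefix: Optional[str] = None
    succeeded: Optional[bool] = None
    min_latency_ms: Optional[float] = None


def matches(call: Call, f: Filters) -> bool:
    """Return True when the call passes every filter that is set."""
    for name in ("agent", "source", "provider", "model"):
        wanted = getattr(f, name)
        if wanted is not None and getattr(call, name) != wanted:
            return False
    # Session ID matches by prefix, not by exact value.
    if f.session_prefix is not None and not call.session_id.startswith(f.session_prefix):
        return False
    if f.succeeded is not None and call.succeeded != f.succeeded:
        return False
    # "At or above" the threshold: a call exactly at the threshold is kept.
    if f.min_latency_ms is not None and call.latency_ms < f.min_latency_ms:
        return False
    return True


calls = [
    Call("Support Bot", "AI responder", "OpenAI", "model-a", "abc123-1", True, 850.0),
    Call("Support Bot", "AI responder", "OpenAI", "model-a", "abc123-2", False, 3000.0),
    Call("Sales Bot", "Template supplement", "Gemini", "model-b", "xyz789-1", True, 5200.0),
]
slow = [c for c in calls if matches(c, Filters(min_latency_ms=3000))]
one_session = [c for c in calls if matches(c, Filters(session_prefix="abc123"))]
failed_support = [c for c in calls if matches(c, Filters(agent="Support Bot", succeeded=False))]
```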
Log table fields
Each row is one LLM call. The columns are:
| Column | What it shows |
|---|---|
| Time | When the record was created |
| Source | AI responder (blue) or Template supplement (purple) |
| Agent / Group | The AI agent name, or the group for a template supplement |
| Model | The model name, with its provider as a label |
| Session | The conversation ID, truncated |
| Latency | Response time; calls that take 3 seconds or longer are highlighted |
| Tokens (in/out) | Input / output tokens for this call |
| Status | Success (green) or Failed (red) |
| Actions | A View button that opens the call detail |
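The highlighting rule for the Latency column can be expressed as a small check. The sketch below assumes latency is stored in milliseconds. The `format_latency` display rule (milliseconds under one second, seconds otherwise) is a guess at how the value might be rendered, not documented behavior.

```python
SLOW_THRESHOLD_MS = 3000  # "3 seconds or slower" from the Latency column


def is_slow(latency_ms: float) -> bool:
    """A call exactly at 3 seconds is already treated as slow."""
    return latency_ms >= SLOW_THRESHOLD_MS


def format_latency(latency_ms: float) -> str:
    # Assumed display rule: milliseconds below one second, seconds otherwise.
    if latency_ms < 1000:
        return f"{latency_ms:.0f} ms"
    return f"{latency_ms / 1000:.2f} s"
```

Note the difference from the Failure rate card: the latency highlight includes the boundary value, while the failure highlight applies only above 5%.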
Viewing one call in detail
To inspect a single call:
- Find the row you want, using the filters or paging.
- Click View at the end of the row.
- A dialog titled AI Call Detail opens with everything recorded for that call.
The detail dialog contains:
- The basics — agent, provider and model, latency, tokens, created time, and session ID.
- Error — shown only when the call failed; the full error text from the model or provider.
- Retrieval Hits (RAG) — which knowledge base passages the model used (see below).
- Request JSON and Response JSON — the full payloads sent to and received from the model, in scrollable code blocks. Either can be empty for a given call.
The retrieval (RAG) section
This section tells you whether — and how well — the agent's knowledge base was used for this call. It has three states:
- No retrieval this turn — the agent did not look anything up. Auto-injection is off and the model did not call the knowledge base search tool.
- Retrieval ran but returned 0 hits — the agent searched but found nothing relevant. The question is likely unrelated to any authorized knowledge base, or every candidate passage scored below the minimum match threshold.
- One or more hits — each hit is a card showing its source (auto-inject when injected automatically before the call, or tool when the model searched explicitly), the document title, a relevance score, and the matched passage text.
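The three states above can be sketched as a simple classification. The `Hit` and `Retrieval` types are hypothetical stand-ins for whatever the call log stores. In particular, the `ran` flag is an assumption that covers both automatic injection and an explicit tool search.

```python
from dataclasses import dataclass, field


@dataclass
class Hit:
    # Mirrors the fields shown on each hit card; names are assumptions.
    source: str  # "auto-inject" or "tool"
    title: str
    score: float
    passage: str


@dataclass
class Retrieval:
    # Hypothetical: whether any search ran for this call, and what came back.
    ran: bool
    hits: list[Hit] = field(default_factory=list)


def retrieval_state(r: Retrieval) -> str:
    """Map a call's retrieval data onto the three states of the RAG section."""
    if not r.ran:
        return "no retrieval"  # auto-injection off and no tool call
    if not r.hits:
        return "0 hits"  # searched, but nothing passed the match threshold
    return f"{len(r.hits)} hit(s)"


no_lookup = Retrieval(ran=False)
empty_search = Retrieval(ran=True)
found = Retrieval(ran=True, hits=[Hit("tool", "Refund policy", 0.82, "Refunds are ...")])
```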
Troubleshooting recipes
An AI agent is producing wrong or failed replies
Set AI Agent to the agent in question and Status to Failed. Open a failing row and read the Error block — it usually names the cause, such as a timeout, an exhausted quota, or an unavailable model. The Request JSON and Response JSON let you confirm exactly what was sent and returned.
Replies feel slow
Enter a value such as 3000 in Min latency (ms) to list only calls that took 3 seconds or longer, which are the same calls the table highlights, and watch the Avg latency card. Open a slow call to compare its input and output token counts against its latency.
Checking knowledge base retrieval
Pick an agent that has a knowledge base attached, open a call, and read the Retrieval Hits (RAG) section. No retrieval means auto-injection is off and the model chose not to search; 0 hits points at a query unrelated to your content or a match threshold that is too strict; a list of hits lets you confirm the right documents came back.
Following one whole conversation
Paste the conversation's Session ID into the filter to list every AI call for that session. A single conversation can contain several calls, so you can step through them in order to see how the agent's responses developed.
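If you have the same records available outside the page, stepping through a conversation amounts to a prefix match on the session ID followed by a sort on creation time. The sketch below uses plain dictionaries with assumed keys (`session_id`, `created_at`, `model`). It does not describe an export format or API of the product.

```python
from datetime import datetime


def session_timeline(calls: list[dict], session_prefix: str) -> list[dict]:
    """Return the calls of one conversation, oldest first."""
    selected = [c for c in calls if c["session_id"].startswith(session_prefix)]
    return sorted(selected, key=lambda c: c["created_at"])


# Hypothetical sample data; IDs, times, and model names are placeholders.
calls = [
    {"session_id": "abc123", "created_at": datetime(2024, 1, 1, 10, 5), "model": "model-a"},
    {"session_id": "zzz999", "created_at": datetime(2024, 1, 1, 10, 3), "model": "model-b"},
    {"session_id": "abc123", "created_at": datetime(2024, 1, 1, 10, 1), "model": "model-a"},
]
timeline = session_timeline(calls, "abc")
```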