Tracing & Analytics

Tracing & Analytics records every call your AI agents make to a language model — the agent replying to a customer, or a background job such as smart template groups generating templates. For each call it keeps the tokens used, the latency, whether it succeeded or failed, which knowledge base passages were retrieved, and the full request and response sent to and received from the model. Use it to find out why an AI reply was slow, why a call failed, or why the knowledge base did or did not get used.

Records are kept for 7 days

Logs older than seven days are removed automatically. The page defaults to the last 7 days; if you pick a wider range you will only ever see what is still within the retention window.
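
The way a wider range collapses to the retention window can be pictured as an intersection of two date ranges. The sketch below is only an illustration: the function name `visible_window` is hypothetical, and how the product actually applies the limit is an assumption.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=7)  # retention period stated in this section

def visible_window(start, end, now):
    """Return the part of [start, end] still inside the retention window, or None.

    Hypothetical helper: it only illustrates that a wider range cannot
    reveal records older than the retention period.
    """
    oldest_kept = now - RETENTION
    clamped_start = max(start, oldest_kept)
    clamped_end = min(end, now)
    if clamped_start > clamped_end:
        return None  # the whole requested range has already been removed
    return clamped_start, clamped_end

now = datetime(2024, 6, 30, 12, 0)  # sample "current time" for the example
# Asking for 30 days still yields only the most recent 7.
print(visible_window(now - timedelta(days=30), now, now))
```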

The overview cards

At the top, four cards summarize everything that matches your current filters and time range. They update as you change the filters.

Card	What it shows
Calls	Total number of LLM calls in the current view
Tokens (in/out)	Input tokens / output tokens, shown as a pair
Avg latency	Average response time, in milliseconds or seconds
Failure rate	Share of calls that failed; highlighted in red above 5%
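
As a rough mental model, the four cards are aggregates over the rows that match the current view. The sketch below assumes a hypothetical `CallRecord` shape and assumes that failed calls count toward the average latency; neither is confirmed by the product.

```python
from dataclasses import dataclass

@dataclass
class CallRecord:
    # Hypothetical record shape for illustration, not the product's schema.
    input_tokens: int
    output_tokens: int
    latency_ms: float
    failed: bool

def summarize(calls):
    """Compute the four overview figures for a list of calls."""
    total = len(calls)
    if total == 0:
        return {"calls": 0, "tokens_in": 0, "tokens_out": 0,
                "avg_latency_ms": None, "failure_rate": None, "highlight": False}
    failure_rate = sum(1 for c in calls if c.failed) / total
    return {
        "calls": total,
        "tokens_in": sum(c.input_tokens for c in calls),
        "tokens_out": sum(c.output_tokens for c in calls),
        # Assumption: every call, failed or not, counts toward the average.
        "avg_latency_ms": sum(c.latency_ms for c in calls) / total,
        "failure_rate": failure_rate,
        "highlight": failure_rate > 0.05,  # red only when strictly above 5%
    }

sample = [
    CallRecord(100, 10, 1000, False),
    CallRecord(200, 20, 2000, False),
    CallRecord(300, 30, 3000, True),
    CallRecord(400, 40, 4000, False),
]
print(summarize(sample))
```

With the sample data, one failure in four calls gives a failure rate of 25%, which would be highlighted.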

Filtering the logs

The following fields are available as filters:

Filter	What it does
AI Agent	Limit to one agent, or All agents
Source	AI responder, Template supplement, or All sources
Provider	OpenAI, Gemini, Anthropic, or All providers
Model	A specific model under the chosen provider, or All models
Session ID	Find all calls for one conversation; matches by ID prefix
Status	All, Success, or Failed
Min latency (ms)	Show only calls at or above a latency threshold — useful for hunting slow calls
Time range	The date window to look at; defaults to the last 7 days
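
Filters combine, so a row appears only if it passes every filter you have set. The sketch below illustrates three of the behaviors described in the table: prefix matching on Session ID, the "at or above" rule for Min latency, and the Status choice. The `Call` class and `matches` function are hypothetical names used for illustration only.

```python
from dataclasses import dataclass

@dataclass
class Call:
    # Hypothetical row shape for illustration, not the product's schema.
    agent: str
    session_id: str
    latency_ms: int
    failed: bool

def matches(call, agent=None, session_prefix=None, status=None, min_latency_ms=None):
    """Return True if the call passes every filter that is set (None means 'All')."""
    if agent is not None and call.agent != agent:
        return False
    # Session ID matches by prefix, so a partial ID is enough.
    if session_prefix and not call.session_id.startswith(session_prefix):
        return False
    if status == "Failed" and not call.failed:
        return False
    if status == "Success" and call.failed:
        return False
    # Min latency keeps calls at or above the threshold.
    if min_latency_ms is not None and call.latency_ms < min_latency_ms:
        return False
    return True

calls = [
    Call("Support", "abc123", 3500, False),
    Call("Support", "abd999", 800, True),
    Call("Sales", "abc777", 3000, False),
]
slow = [c.session_id for c in calls if matches(c, min_latency_ms=3000)]
print(slow)
```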

Log table fields

Each row is one LLM call. The columns are:

Column	What it shows
Time	When the record was created
Source	AI responder (blue) or Template supplement (purple)
Agent / Group	The AI agent name, or the group for a template supplement
Model	The model name, with its provider as a label
Session	The conversation ID, truncated
Latency	Response time; calls at 3 seconds or slower are highlighted
Tokens (in/out)	Input / output tokens for this call
Status	Success (green) or Failed (red)
Actions	A View button that opens the call detail

Viewing one call in detail

To inspect a single call:

1. Find the row you want, using the filters or paging.
2. Click View at the end of the row.
3. A dialog titled AI Call Detail opens with everything recorded for that call.

The detail dialog contains:

The basics — agent, provider and model, latency, tokens, created time, and session ID.
Error — shown only when the call failed; the full error text from the model or provider.
Retrieval Hits (RAG) — which knowledge base passages the model used (see below).
Request JSON and Response JSON — the full payloads sent to and received from the model, in scrollable code blocks. Either can be empty for a given call.

The retrieval (RAG) section

This section tells you whether — and how well — the agent's knowledge base was used for this call. It has three states:

No retrieval this turn — the agent did not look anything up. Auto-injection is off and the model did not call the knowledge base search tool.
Retrieval ran but returned 0 hits — the agent searched but found nothing relevant. The question is likely unrelated to any authorized knowledge base, or every candidate passage scored below the minimum match threshold.
One or more hits — each hit is a card showing its source (auto-inject when injected automatically before the call, or tool when the model searched explicitly), the document title, a relevance score, and the matched passage text.
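
The three states can be read as a small decision rule: did retrieval run at all, and if so, did it return anything. The sketch below is illustrative only. The function names, the dictionary keys, the document titles, and the score values are all made up for the example, and the score scale is an assumption.

```python
def retrieval_state(retrieval_ran, hits):
    """Classify a call into one of the three states described above."""
    if not retrieval_ran:
        return "No retrieval this turn"
    if len(hits) == 0:
        return "Retrieval ran but returned 0 hits"
    return "One or more hits"

def best_first(hits):
    """Order hits from highest to lowest relevance score."""
    return sorted(hits, key=lambda h: h["score"], reverse=True)

# Made-up sample hits; titles, scores, and passages are placeholders.
hits = [
    {"source": "tool", "title": "Refund policy", "score": 0.62, "passage": "..."},
    {"source": "auto-inject", "title": "Shipping FAQ", "score": 0.81, "passage": "..."},
]

print(retrieval_state(True, hits))
for hit in best_first(hits):
    print(hit["source"], hit["title"], hit["score"])
```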

A quick read on retrieval quality

Higher scores mean a closer match. When you are checking whether an agent is answering from the right material, confirm the documents that came back are the ones you expected and that their scores look reasonable.

Troubleshooting recipes

An AI agent is producing wrong or failed replies

Set AI Agent to the agent in question and Status to Failed. Open a failing row and read the Error block — it usually names the cause, such as a timeout, an exhausted quota, or an unavailable model. The Request JSON and Response JSON let you confirm exactly what was sent and returned.

Replies feel slow

Enter a value such as 3000 in Min latency (ms) to show only calls that took 3 seconds or longer, which are the same calls the table highlights, and watch the Avg latency card as you narrow the view. Open a slow call and compare its token counts and request size against its latency.

Checking knowledge base retrieval

Pick an agent that has a knowledge base attached, open a call, and read the Retrieval Hits (RAG) section. No retrieval means auto-injection is off and the model chose not to search; 0 hits points at a query unrelated to your content or a match threshold that is too strict; a list of hits lets you confirm the right documents came back.

Following one whole conversation

Paste the conversation's Session ID into the Session ID filter to list every AI call for that session. Because the filter matches by prefix, the start of the ID is enough. A single conversation can contain several calls, so you can step through them in order to see how the agent's responses developed.