Agent Analytics SDK

Early Access

This feature is in Early Access. During this time, aspects of the functionality may still be developed, and this documentation may not always be up to date. If you have any questions, contact Amplitude Support.

This page is the developer reference for the Amplitude AI SDK. For the product-level setup overview, refer to Set up Agent Analytics. For the product concepts and how Amplitude uses the data, refer to the Agent Analytics overview and Analyze agent results.

The timeline below shows what your instrumentation produces, followed by one example event with its shape and the call that emits it.

Timeline (events marked "from the SDK" arrive in real time during the session; events marked "from Amplitude" come from post-hoc enrichment):

  • Load: Viewed Page (browser SDK)
  • Turn 1: User Message, Tool Call, AI Response
  • Turn 2: User Message, Tool Call, AI Response
  • End: Session End
  • Also emitted: Span (inside a turn), Session Record, Evaluator Result × N

Example event: [Agent] AI Response (from the SDK), fired at 22:33:48

Identity:

  • [Agent] Session ID: 4ddcc6b2-1041-432a-aa8c-ebe3eccac40b
  • [Agent] Agent ID: support-chatbot
  • [Agent] Trace ID: b4f63d43-d752-4b1f-8489-d234ddf586b2

Event-specific:

  • $llm_message.text: I can help. Your subscription renews on Aug 15…
  • [Agent] Model: gpt-4o-mini
  • [Agent] Provider: openai
  • [Agent] Input Tokens: 1245
  • [Agent] Output Tokens: 87
  • [Agent] Latency Ms: 3420
  • [Agent] Cost USD: 0.0012

The [Agent] AI Response event closes the turn. It carries the eight fields the SDK doctor checks at setup: Session ID, Agent ID, Model, Provider, Latency Ms, Input Tokens, Output Tokens, and Cost USD. It's emitted by s.trackAiMessage(...) or a provider wrapper.

Prerequisites

  • An Amplitude project with Agent Analytics enabled.
  • The View Agent Analytics Objects permission. Admins grant access through role-based access control (RBAC).
  • An agent codebase to instrument in Node.js or Python (or a runtime that can call the Amplitude HTTP API).
  • The project's API key for the right data center. Agent Analytics runs in US and EU.

Install the SDK

bash
npm install @amplitude/ai @amplitude/analytics-node

To let an AI coding agent wire up the SDK, run this and paste the printed prompt into Cursor, Claude Code, Windsurf, GitHub Copilot, or Codex:

bash
npx amplitude-ai

The agent scans your codebase, identifies every LLM call site and the session lifecycle, then instruments them.

Initialize the SDK

Initialize once at your application entry point and reuse the instance. The recommended pattern is a bootstrap module that exports ai plus wrapped provider clients.

typescript
// src/lib/amplitude.ts
import { AmplitudeAI, AIConfig, OpenAI } from "@amplitude/ai";

export const ai = new AmplitudeAI({
  apiKey: process.env.AMPLITUDE_AI_API_KEY!,
  config: new AIConfig({ contentMode: "full", redactPii: true }),
});

export const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY!,
  amplitude: ai,
});

Import openai from this module instead of directly from 'openai'. Add more wrapped providers as needed.

To validate events during development without sending them, set dryRun: true (Node) or dry_run=True (Python) on AIConfig.

Instrument an agent session

Wrap each agent invocation in a session. The session correlates every event (user message, model response, tool calls, spans) into a single record.

An agent session is one job the user hands the agent, from start to finish: the unit of work with a real outcome. Set the sessionId from an ID you already track, rather than inventing a new one:

  • Chatbot or copilot: the conversation thread ID.
  • Coding agent: the task or work-session ID.
  • Support agent: the ticket ID.
  • Voice agent: the call ID.
  • Background or autonomous agent: the run or job ID.

An agent session isn't Amplitude's standard-analytics session. The agent session, [Agent] Session ID, is one job the user hands the agent. Amplitude's standard-analytics session, $session_id, is the user's app or web visit that powers Session Replay and product reports. Set the agent session from your own ID, and forward the standard-analytics session ID across the network boundary if you want to link the two.
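
The mapping from agent type to session ID in the list above can be sketched as a small helper. This is a minimal illustration, not part of @amplitude/ai: the AgentKind and JobIds types and the resolveSessionId function are hypothetical names, and the ID fields stand in for whatever your system already tracks.

```typescript
// Hypothetical shapes: these names are illustrative and not part of @amplitude/ai.
type AgentKind = "chatbot" | "coding" | "support" | "voice" | "background";

interface JobIds {
  threadId?: string;
  taskId?: string;
  ticketId?: string;
  callId?: string;
  runId?: string;
}

// Pick the ID your system already tracks for this kind of agent.
function resolveSessionId(kind: AgentKind, ids: JobIds): string {
  const byKind: Record<AgentKind, string | undefined> = {
    chatbot: ids.threadId,
    coding: ids.taskId,
    support: ids.ticketId,
    voice: ids.callId,
    background: ids.runId,
  };
  const id = byKind[kind];
  if (!id) {
    // Fail loudly rather than inventing an ID that would split one job into many sessions.
    throw new Error(`No tracked ID available for agent kind "${kind}"`);
  }
  return id;
}
```

The result would then be passed as the sessionId option when opening the session, for example agent.session({ userId, sessionId: resolveSessionId("support", { ticketId }) }).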

typescript
import { ai, openai } from "@/lib/amplitude";

const agent = ai.agent("chat-handler", {
  description: "Customer support chatbot",
});

export async function POST(req: Request) {
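  // threadId is a placeholder for the conversation ID your app already tracks.
  const { messages, userId, threadId } = await req.json();
  return agent.session({ userId, sessionId: threadId }).run(async (s) => {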
    s.trackUserMessage(messages[messages.length - 1].content);
    const response = await openai.chat.completions.create({
      model: "gpt-4o-mini",
      messages,
    });
    return Response.json(response);
  });
}

The Python SDK follows the same pattern with ai.agent(...).session(...). A session opened with run() closes when the callback returns. For sessions that span multiple requests, end them one of these ways:

  • Close it explicitly (recommended): Call trackSessionEnd() (Node) or track_session_end() (Python) when the job finishes, such as a closed ticket or a completed run. Server-side evaluation runs immediately.
  • Let the idle timeout close it: The timeout defaults to 30 minutes from the first user message, configurable per session with idleTimeoutMinutes (Node) or idle_timeout_minutes (Python). Raise it for jobs with long natural gaps, such as 240 for a support ticket worked over hours. Set it to -1 to disable the idle close, which keeps the session open until you end it explicitly, with a 90-day backstop.

When the same user returns with a new goal, start a new session with a new sessionId rather than continuing the old one.

Minimum viable instrumentation

Agent Analytics needs four fields to correlate events: the API key, a user identifier (userId or deviceId), an agentId, and a sessionId. The recommended pattern also adds provider wrappers, so the SDK automatically captures model, token, cost, and latency data.

Two identity rules keep a single user from splitting into two:

  • Don't pass a placeholder userId such as "anonymous", "", or a temporary ID. Omit the userId instead. Amplitude can't change a userId after it's set, so a placeholder creates a separate user that won't merge later.
  • Reuse the same deviceId across a pre-account session. If your backend generates a new deviceId per request, the merge breaks. Read the deviceId from the Browser SDK and forward it.
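
The two identity rules above can be enforced in one place before a session opens. The following is a sketch under stated assumptions: resolveIdentity and PLACEHOLDER_USER_IDS are hypothetical names, and the placeholder list contains only the examples named above, so extend it with whatever temporary IDs your app produces.

```typescript
// Hypothetical helper; not part of @amplitude/ai.
// Placeholder values that must never be sent as a userId. Extend as needed.
const PLACEHOLDER_USER_IDS = new Set<string>(["", "anonymous"]);

interface Identity {
  userId?: string;
  deviceId?: string;
}

function resolveIdentity(
  rawUserId: string | null | undefined,
  browserDeviceId: string | null | undefined,
): Identity {
  const identity: Identity = {};
  const trimmed = (rawUserId ?? "").trim();

  // Rule 1: omit the userId entirely instead of sending a placeholder.
  if (!PLACEHOLDER_USER_IDS.has(trimmed.toLowerCase())) {
    identity.userId = trimmed;
  }

  // Rule 2: reuse the deviceId forwarded from the browser; never mint one per request.
  if (browserDeviceId) {
    identity.deviceId = browserDeviceId;
  }

  if (identity.userId === undefined && identity.deviceId === undefined) {
    throw new Error("Need a real userId or a forwarded deviceId");
  }
  return identity;
}
```

Spreading the result into the session options, as in agent.session({ ...resolveIdentity(userId, deviceId), sessionId }), keeps a pre-account visitor on a single deviceId until a real userId exists.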

Auto-instrument provider calls

The SDK offers two zero-code paths for capturing provider activity.

Provider wrappers

Wrap the provider client at construction time. The wrapper forwards calls to the underlying client and records request, response, tokens, latency, and cost.

typescript
import { OpenAI } from "@amplitude/ai"; // the wrapper, not the "openai" package
import { ai } from "@/lib/amplitude";

const openaiWrapped = new OpenAI({ amplitude: ai });

If you can't change the construction site, use wrap(existingClient, ai) to instrument an existing client without modifying its creation.

patch()

Call patch({ amplitudeAI: ai }) once at startup for zero-code instrumentation. The SDK monkey-patches supported clients and auto-extracts [Agent] Tool Call events from message arrays for OpenAI Chat Completions, OpenAI Responses, and Anthropic Messages. Extracted tool calls land with latencyMs: 0 because execution timing isn't available through message inspection. Use tool() or trackToolCall() when you need real tool latency.
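
To see why extracted tool calls carry no timing, it helps to look at what a message array actually contains. The sketch below is an illustration only, not the SDK's implementation: extractToolCalls and its types are hypothetical, and the message shape follows the OpenAI Chat Completions format, where an assistant message lists tool_calls and a later tool message answers each one by tool_call_id.

```typescript
// Simplified OpenAI Chat Completions message shapes.
interface ChatToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string | null;
  tool_calls?: ChatToolCall[];
  tool_call_id?: string;
}

// Hypothetical result shape for this sketch.
interface ExtractedToolCall {
  name: string;
  input: string;
  output: string | null;
  latencyMs: 0;
}

function extractToolCalls(messages: ChatMessage[]): ExtractedToolCall[] {
  // Index tool results by the call ID they answer.
  const outputs = new Map<string, string | null>();
  for (const m of messages) {
    if (m.role === "tool" && m.tool_call_id) {
      outputs.set(m.tool_call_id, m.content);
    }
  }

  const calls: ExtractedToolCall[] = [];
  for (const m of messages) {
    if (m.role !== "assistant") continue;
    for (const tc of m.tool_calls ?? []) {
      calls.push({
        name: tc.function.name,
        input: tc.function.arguments,
        output: outputs.get(tc.id) ?? null,
        // The array records what was called, not when, so no duration can be derived.
        latencyMs: 0,
      });
    }
  }
  return calls;
}
```

Nothing in the array says when the tool started or finished, which is why measured latency needs a wrapper around the tool function itself.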

Track tools

The tool() higher-order function wraps a tool function so the SDK records each call:

typescript
import { tool } from "@amplitude/ai";

const searchProducts = tool(searchDB, { name: "search_products" });

// Inside session.run, call as usual:
const result = await searchProducts(query);
// [Agent] Tool Call event emitted with duration, success, input/output

For inline tool calls or unsupported flows, use s.trackToolCall(name, latencyMs, success, { input, output }) directly.
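
For the manual path, a small helper can measure the latency and report both successes and failures. This is a sketch rather than SDK code: timedToolCall is a hypothetical name, and ToolCallTracker is a minimal structural type that assumes only the trackToolCall(name, latencyMs, success, { input, output }) call shown above, with the options object treated as optional.

```typescript
// Minimal structural type for the one session method used here. The optional
// options object and the void return type are assumptions of this sketch.
interface ToolCallTracker {
  trackToolCall(
    name: string,
    latencyMs: number,
    success: boolean,
    options?: { input?: unknown; output?: unknown },
  ): void;
}

// Hypothetical helper: runs a tool, measures wall-clock time, and records the outcome.
async function timedToolCall<I, O>(
  s: ToolCallTracker,
  name: string,
  input: I,
  fn: (input: I) => Promise<O>,
): Promise<O> {
  const start = Date.now();
  try {
    const output = await fn(input);
    s.trackToolCall(name, Date.now() - start, true, { input, output });
    return output;
  } catch (err) {
    // Record the failure, then rethrow so the caller's error handling still runs.
    s.trackToolCall(name, Date.now() - start, false, { input });
    throw err;
  }
}
```

Inside a session callback this would read as const result = await timedToolCall(s, "search_products", query, searchDB), assuming the session object satisfies the structural type.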

Track spans

Spans wrap internal sub-operations such as vector lookups, reranks, guardrails, or any timed work that sits inside a turn. They emit [Agent] Span events and share the trace's identity.

typescript
import { observe } from "@amplitude/ai";

// As a higher-order function:
const runSubAgent = observe(
  async (prompt: string) => {
    return await subAgent.execute(prompt);
  },
  { name: "sub-agent-execution" },
);

// Or explicitly when you need error capture:
const start = Date.now();
try {
  const result = await subAgent.execute(prompt);
  s.trackSpan({
    name: "sub-agent-execution",
    latencyMs: Date.now() - start,
    inputState: { prompt: prompt.slice(0, 1000) },
    outputState: { response: result.slice(0, 1000) },
  });
} catch (e) {
  s.trackSpan({
    name: "sub-agent-execution",
    latencyMs: Date.now() - start,
    isError: true,
    errorType: (e as Error).name,
    errorMessage: (e as Error).message,
  });
  throw e;
}

Spans don't replace turn-level events

Agent Analytics turn counts and interaction views are driven by [Agent] User Message and [Agent] AI Response, not spans. If you only emit spans around internal steps, dashboards show traces with no turn-level analytics. Always emit the User Message / AI Response pair for each user-visible cycle, and use spans on top.

Manual instrumentation

For custom flows or unsupported providers, use the manual methods on the session object directly. Each maps to a single [Agent] event type.

For AI responses that don't go through a wrapper (proxies, custom gateways), pass usage from the completion response:

typescript
s.trackAiMessage(completedMessage.content, "gpt-4o", "openai", latencyMs, {
  inputTokens: usage.prompt_tokens,
  outputTokens: usage.completion_tokens,
  totalTokens: usage.total_tokens,
});

Pass the canonical provider model id (gpt-4o-mini, claude-sonnet-4-20250514), not an internal gateway label, so cost auto-calculates correctly.
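
If your gateway exposes its own model labels, one option is to translate them before calling trackAiMessage. The sketch below assumes a hand-maintained lookup; toCanonicalModelId is a hypothetical helper and the labels "fast-chat" and "smart-chat" are made up for illustration.

```typescript
// Hypothetical gateway labels mapped to canonical provider model IDs.
// The labels on the left are invented for this example.
const CANONICAL_MODEL_IDS = new Map<string, string>([
  ["fast-chat", "gpt-4o-mini"],
  ["smart-chat", "claude-sonnet-4-20250514"],
]);

function toCanonicalModelId(label: string): string {
  // Pass the value through unchanged when it is not a known gateway label,
  // which covers callers that already supply a canonical ID.
  return CANONICAL_MODEL_IDS.get(label) ?? label;
}
```

The model argument then becomes toCanonicalModelId(gatewayLabel) instead of the raw label.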

Send user feedback (scores)

Capture explicit user feedback, such as a thumbs up or down on a response or an optional rating, as an [Agent] Score event. Scores come only from your application; Amplitude's enrichment pipeline never generates them.

typescript
// Thumbs up/down on a specific AI response
ai.score({
  userId: "user-123",
  name: "user-feedback",
  value: 1.0,
  targetId: aiMessageId,
  targetType: "message",
  source: "user",
});

If you ingest events directly instead of using the SDK, send an [Agent] Score event with [Agent] Score Name set to your score name (for example, user-feedback).

Multi-agent architectures

Parent agents can delegate to child agents. Child agents inherit the parent's session, so all events stay correlated under one Session ID.

typescript
const orchestrator = ai.agent('shopping-agent', { description: 'Orchestrates shopping requests' });
const recipeAgent = orchestrator.child('recipe-agent', { description: 'Finds recipes' });

await orchestrator.session({ userId }).run(async (s) => {
  s.trackUserMessage(userInput);

  const result = await s.runAs(recipeAgent, async (cs) => {
    cs.trackUserMessage(delegatedQuery);
    return openai.chat.completions.create({ model: 'gpt-4o', messages: [...] });
  });
});

Wrap delegation calls with observe() or trackSpan if you want latency and error metrics on the dispatch itself, not only the child's LLM call.

Stream responses

Streaming sessions must stay open until the stream is fully consumed. Closing the session before the stream finishes drops the AI response event.

typescript
// WRONG: session ends before stream is consumed
return agent.session({ userId }).run(async (s) => {
  const stream = await openai.chat.completions.create({
    model: "gpt-4o",
    messages,
    stream: true,
  });
  return new Response(stream.toReadableStream());
});

// CORRECT: session stays open until stream completes
return agent.session({ userId }).run(async (s) => {
  const stream = await openai.chat.completions.create({
    model: "gpt-4o",
    messages,
    stream: true,
  });
  const readable = stream.toReadableStream();
  const [passthrough, forClient] = readable.tee();
  const reader = passthrough.getReader();
  (async () => {
    while (!(await reader.read()).done) {}
  })();
  return new Response(forClient);
});

With the Vercel AI SDK, flush in the onFinish callback:

typescript
const result = await streamText({
  model: openai("gpt-4o"),
  messages,
  onFinish: async () => {
    await ai.flush();
  },
});

When a session crosses the network boundary, pass Amplitude IDs through request headers so server-side events join the user's standard-analytics session ($session_id). Pass the value as the session's browserSessionId field:

typescript
const browserSessionId = req.headers.get("x-amplitude-session-id");
const deviceId = req.headers.get("x-amplitude-device-id");
const session = agent.session({ userId, browserSessionId, deviceId });

For cross-service propagation between back-end services, use injectContext() on the outbound side and extractContext(headers) on the inbound side.
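
On the browser side, the matching step is to attach those IDs to the outgoing request. The following sketch assumes you already have the values, for example from the Browser SDK's getSessionId() and getDeviceId(); the amplitudeHeaders function and its parameter type are hypothetical, and the header names match the server-side snippet above.

```typescript
// Hypothetical input shape: IDs read from the Browser SDK on the client.
interface BrowserAmplitudeIds {
  sessionId?: number; // the standard-analytics $session_id
  deviceId?: string;
}

// Build only the headers for which a value exists, so the server can tell
// "not provided" apart from an empty string.
function amplitudeHeaders(ids: BrowserAmplitudeIds): Record<string, string> {
  const headers: Record<string, string> = {};
  if (ids.sessionId !== undefined) {
    headers["x-amplitude-session-id"] = String(ids.sessionId);
  }
  if (ids.deviceId) {
    headers["x-amplitude-device-id"] = ids.deviceId;
  }
  return headers;
}
```

The returned object can be spread into the headers of the request that calls your agent endpoint.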

Supported providers and frameworks

Providers with native wrappers: OpenAI (Chat Completions + Responses), Anthropic, Azure OpenAI, Gemini (@google/generative-ai), Google Gen AI (@google/genai), Mistral, Bedrock (Converse APIs).

Agent frameworks with first-party integrations: LangChain, LlamaIndex, OpenAI Agents SDK, Anthropic Tool Use, Claude Agent SDK (ClaudeAgentSDKTracker), Anthropic Managed Agents, CrewAI (Python only).

Provider-specific notes

Vercel AI SDK

Provider wrappers instrument the underlying SDK (openai), not the Vercel abstraction. If only @ai-sdk/openai is present, either add openai as a direct dependency or fall back to patch(). For streaming responses, use onFinish to call await ai.flush() (refer to Stream responses).

Claude Agent SDK

Use ClaudeAgentSDKTracker from @amplitude/ai/integrations/claude-agent-sdk. Two things are required for the events to be useful: agentId on ai.agent() (identifies the AI feature in the LLM Usage Application Registry), and userId plus sessionId on agent.session() (ties events into a single interaction).

typescript
import { AmplitudeAI } from "@amplitude/ai";
import { ClaudeAgentSDKTracker } from "@amplitude/ai/integrations/claude-agent-sdk";
import { query } from "@anthropic-ai/claude-agent-sdk";

const ai = new AmplitudeAI({ apiKey: process.env.AMPLITUDE_AI_API_KEY! });
const agent = ai.agent({ agentId: "code-reviewer" });
const tracker = new ClaudeAgentSDKTracker();

await agent.session({ userId: "u1", sessionId: "sess-abc" }).run(async (s) => {
  for await (const message of query({
    prompt: "Analyze this codebase",
    options: { hooks: tracker.hooks(s) },
  })) {
    tracker.process(s, message);
  }
});

tracker.hooks(session) returns PreToolUse / PostToolUse hooks with precise tool latency. tracker.process(session, message) processes the message stream for AI responses and user messages.

Anthropic Managed Agents

Provider wrappers don't work, because LLM calls happen in Anthropic's cloud, not your code. Use manual tracking and poll client.beta.sessions.events.list(). Map event types to SDK methods:

Deduplicate events across polls, because events.list() returns previously-seen events:

typescript
const seenIds = new Set<string>(savedState.seenIds);
for (const event of response.data) {
  if (seenIds.has(event.id)) continue;
  seenIds.add(event.id);
  // track event
}

Measure latency as wall-clock time between session.status_running and the event's processed_at, not poll round-trip. events.list() doesn't include usage or token counts, so cost tracking requires the Anthropic Admin API.
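
The latency rule can be sketched as a pure function over the polled events. The ManagedEvent shape below is an assumption made for illustration (an id, a type, and an ISO 8601 processed_at string), latencyMsSinceRunning is a hypothetical helper, and the "agent.message" type used in the usage note is a placeholder. Check the actual event payloads before relying on any field name.

```typescript
// Assumed event shape for this sketch; verify against the real payloads.
interface ManagedEvent {
  id: string;
  type: string;
  processed_at: string; // ISO 8601 timestamp
}

// Wall-clock milliseconds between the most recent session.status_running
// event and the target event, or null when no running marker precedes it.
function latencyMsSinceRunning(
  events: ManagedEvent[],
  target: ManagedEvent,
): number | null {
  const targetTime = Date.parse(target.processed_at);
  let runningTime: number | null = null;
  for (const e of events) {
    if (e.type !== "session.status_running") continue;
    const t = Date.parse(e.processed_at);
    // Keep the latest running marker that is not after the target event.
    if (t <= targetTime && (runningTime === null || t > runningTime)) {
      runningTime = t;
    }
  }
  return runningTime === null ? null : targetTime - runningTime;
}
```

For a target event of type "agent.message", the result would be passed as the latencyMs argument to trackAiMessage; a null result means the running marker hasn't been seen yet and the event should wait for a later poll.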

OpenAI Assistants API

Provider wrappers don't auto-instrument the Assistants API (async / polling-based). Use manual tracking: trackUserMessage() when creating a message, trackAiMessage() when polling completion events.

MCP servers

The MCP protocol doesn't pass the originating user prompt to tools, so MCP servers can't capture it. Add an optional rationale parameter to each tool so the LLM can self-explain its intent and you keep usable session content.
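
A rationale parameter might look like the sketch below. The tool (search_orders), its customerId field, and the splitRationale helper are hypothetical; only the idea of an optional rationale string comes from the text above. MCP tools describe their inputs with JSON Schema, so the parameter is declared there and left out of required.

```typescript
// Input schema for a hypothetical MCP tool. `rationale` is optional by design:
// the tool must still work when the model omits it.
const searchOrdersInputSchema = {
  type: "object",
  properties: {
    customerId: { type: "string", description: "Customer whose orders to search." },
    rationale: {
      type: "string",
      description:
        "Optional. One sentence on why you are calling this tool and what the user asked for.",
    },
  },
  required: ["customerId"],
};

// Separate the rationale from the real arguments before running the tool,
// so it can be recorded as session content without reaching business logic.
function splitRationale(args: Record<string, unknown>): {
  rationale: string | null;
  toolArgs: Record<string, unknown>;
} {
  const { rationale, ...toolArgs } = args;
  const text = typeof rationale === "string" ? rationale.trim() : "";
  return { rationale: text === "" ? null : text, toolArgs };
}
```

When the rationale is present, it is a reasonable candidate for the text passed to trackUserMessage, since it is the closest available stand-in for the prompt the server never sees.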

Framework notes

Next.js (App Router)

Initialize the SDK in a server-side module, never a client component. Add @amplitude/ai to serverExternalPackages in next.config.ts. Wrap session creation inside each route handler; in serverless deployments call await ai.flush() before the handler returns so the runtime doesn't freeze before events ship.

Express / Fastify / Hono

Use the bundled middleware to attach ai to every request:

typescript
import { createAmplitudeAIMiddleware } from "@amplitude/ai";

app.use(
  createAmplitudeAIMiddleware({
    amplitudeAI: ai,
    userIdResolver: (req) => req.headers["x-user-id"] ?? null,
  }),
);

Edge runtimes and Cloudflare Workers

@amplitude/ai cannot bundle in Cloudflare Workers

The SDK depends on node:async_hooks, node:module, and node:crypto. Workers Builds rejects the upload even with nodejs_compat_v2 enabled. @amplitude/analytics-node is also incompatible (depends on Node's http).

The only safe import is import type { ... } from '@amplitude/ai/types', which is erased at compile time. For runtime tracking, use a fetch-based transport that constructs [Agent] events directly:

typescript
import type { AmplitudeClientLike, AmplitudeEvent } from "@amplitude/ai/types";

class FetchAmplitudeClient implements AmplitudeClientLike {
  private _apiKey: string;
  private _buffer: AmplitudeEvent[] = [];

  constructor(apiKey: string) {
    this._apiKey = apiKey;
  }

  track(event: AmplitudeEvent): void {
    this._buffer.push(event);
  }

  async flush(): Promise<void> {
    if (!this._buffer.length) return;
    const events = this._buffer.splice(0);
    try {
      const resp = await fetch("https://api2.amplitude.com/2/httpapi", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ api_key: this._apiKey, events }),
      });
      if (!resp.ok) console.error(`[Amplitude] Flush failed: ${resp.status}`);
    } catch (err) {
      console.error(`[Amplitude] Flush error: ${(err as Error).message}`);
    }
  }
}

export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext) {
    if (env.AMPLITUDE_TRACKING_DISABLED) return handleRequest(request, env);

    const transport = new FetchAmplitudeClient(env.AMPLITUDE_API_KEY);

    transport.track({
      event_type: "[Agent] User Message",
      user_id: userId,
      event_properties: {
        "[Agent] Session ID": sessionId,
        "[Agent] Agent ID": "my-agent",
        $llm_message: { text: content },
      },
    });

    // After the LLM call completes:
    transport.track({
      event_type: "[Agent] AI Response",
      user_id: userId,
      event_properties: {
        "[Agent] Session ID": sessionId,
        "[Agent] Agent ID": "my-agent",
        "[Agent] Model": model,
        "[Agent] Provider": "anthropic",
        "[Agent] Latency Ms": latencyMs,
        $llm_message: { text: responseText },
      },
    });

    // Non-blocking flush so events ship before the isolate terminates
    ctx.waitUntil(transport.flush());
    return new Response("ok");
  },
};

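Construct FetchAmplitudeClient per request to avoid buffer leakage between requests. Use crypto.randomUUID() to set each event's insert_id for deduplication, and gate tracking behind an AMPLITUDE_TRACKING_DISABLED env var so you can turn it off. The example assumes that userId, sessionId, content, model, latencyMs, and responseText come from your own request handling and LLM call, and that handleRequest is your existing handler.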

Ingest OpenTelemetry spans

For stacks that already emit OpenTelemetry GenAI spans (OpenLIT, Traceloop, OpenAI's OTel instrumentation), map them into [Agent] events:

  • AmplitudeGenAIExporter (inbound, production-ready): ingests GenAI semantic-convention spans and emits [Agent] events. Ignores non-GenAI spans, so it's safe in a mixed pipeline.
  • AmplitudeAgentExporter (outbound, experimental): converts Amplitude events into flat OTel spans for forwarding to other backends. Doesn't preserve trace hierarchy.

Choose a privacy mode

Set contentMode on AIConfig:

  • full (default): captures prompt and response text. redactPii: true is on by default and scrubs emails, phone numbers, SSNs, credit card numbers, IP addresses, and base64-encoded image data before events leave the process. The SDK tunes phone and SSN detection for US formats; add customRedactionPatterns or customRedactionFn for international locales.
  • metadata_only: token counts, latency, model, and cost only. No prompt or response text. Use for sensitive or regulated data.
  • customer_enriched: no text by default. Send pre-scored summaries through trackSessionEnrichment(). Designed for teams with existing evaluation stacks.

For managed-agent architectures, prefer full with redactPii: true. The managed API already stores message content server-side, so metadata_only adds no privacy benefit.

Manage cost and tokens

s.trackAiMessage(...) auto-calculates [Agent] Cost USD from the model name and token counts through the bundled Pydantic genai-prices catalog. Two things cause cost_usd: 0:

Unrecognized model name. Vertex AI aliases like claude-sonnet-4-6 won't match the canonical claude-sonnet-4-20250514. Internal gateway labels won't resolve. Brand-new models may not yet be in genai-prices. Pass the canonical provider id, or set totalCostUsd explicitly to override.

Incorrect inputTokens with prompt caching. The SDK expects inputTokens to be cache-inclusive (cached tokens are a subset, never additive). Provider conventions differ:

The built-in Anthropic, Bedrock, and Gemini wrappers handle this normalization for you. Manual trackAiMessage callers need to handle it themselves. Pass cacheReadTokens / cacheCreationTokens separately so the SDK applies the differential pricing.
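
As an illustration of the manual normalization, the sketch below converts an Anthropic Messages API usage object into cache-inclusive counts. In that API, input_tokens excludes cached tokens, which are reported separately as cache_read_input_tokens and cache_creation_input_tokens. The normalizeAnthropicUsage helper is hypothetical, and it assumes that "cache-inclusive" covers both cache reads and cache writes; confirm that against the SDK reference before depending on it.

```typescript
// Usage block as returned by the Anthropic Messages API.
interface AnthropicUsage {
  input_tokens: number; // excludes cached tokens
  output_tokens: number;
  cache_read_input_tokens?: number | null;
  cache_creation_input_tokens?: number | null;
}

// Option names follow the trackAiMessage options named in the text above.
interface NormalizedUsage {
  inputTokens: number; // cache-inclusive
  outputTokens: number;
  cacheReadTokens: number;
  cacheCreationTokens: number;
}

function normalizeAnthropicUsage(u: AnthropicUsage): NormalizedUsage {
  const cacheReadTokens = u.cache_read_input_tokens ?? 0;
  const cacheCreationTokens = u.cache_creation_input_tokens ?? 0;
  return {
    // Assumption: both cache reads and cache writes count toward the inclusive total.
    inputTokens: u.input_tokens + cacheReadTokens + cacheCreationTokens,
    outputTokens: u.output_tokens,
    cacheReadTokens,
    cacheCreationTokens,
  };
}
```

The normalized object can be spread into the options argument of trackAiMessage. Providers that already report a cache-inclusive prompt count need no addition, only the separate cache fields.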

When you need to compute cost yourself, call calculateCost({ modelName, inputTokens, outputTokens, cacheReadInputTokens, cacheCreationInputTokens }) and pass the result as totalCostUsd.

Shape message content

The first argument to trackUserMessage becomes $llm_message.text on [Agent] User Message. This is what session lists, segmentation, and enrichment treat as "what the user said". Two practical rules:

Do pass a short natural-language line as the message body. For example, the real prompt, or a canonical summary for headless jobs:

typescript
s.trackUserMessage(
  "Summarize the attached design doc and list open questions",
  {
    context: { structuredPayload: payloadRecord },
  },
);

Don't pass large JSON blobs as the message body. The product uses the JSON as the session title and breaks down charts by raw JSON:

typescript
// Session label becomes the JSON
s.trackUserMessage(JSON.stringify(payloadRecord));

Put structured segmentation dimensions in the context option (becomes [Agent] Context JSON, queryable in charts). For server-side enrichment to reason over structured data, also keep essential facts in content. Enrichments derive eval input primarily from turn text, not from [Agent] Context.

Instrument without the SDK

For unsupported runtimes (Java, Go, Ruby, edge environments), send events to the Amplitude HTTP API directly:

bash
curl -X POST https://api2.amplitude.com/2/httpapi \
  -H 'Content-Type: application/json' \
  -d '{
    "api_key": "YOUR_API_KEY",
    "events": [{
      "event_type": "[Agent] User Message",
      "user_id": "user-123",
      "event_properties": {
        "[Agent] Session ID": "sess-abc",
        "[Agent] Agent ID": "support-chatbot",
        "$llm_message": { "text": "How do I cancel my subscription?" }
      }
    }]
  }'

Use $llm_message.text for message content (the ingestion pipeline reads this property for interaction text). For the full event reference, refer to the HTTP event reference.
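
The same request can be assembled in code. The sketch below builds the body for an [Agent] AI Response event using the property names shown in the example event near the top of this page, and selects the HTTP API endpoint by data center. The helper names and the AiResponseInput type are hypothetical, and the code only builds the payload; sending it is left to whatever HTTP client your runtime provides.

```typescript
type Region = "US" | "EU";

// Amplitude HTTP V2 API endpoints for each data center.
function httpApiUrl(region: Region): string {
  return region === "EU"
    ? "https://api.eu.amplitude.com/2/httpapi"
    : "https://api2.amplitude.com/2/httpapi";
}

// Hypothetical input shape for this sketch.
interface AiResponseInput {
  userId: string;
  sessionId: string;
  agentId: string;
  text: string;
  model: string;
  provider: string;
  latencyMs: number;
  inputTokens: number;
  outputTokens: number;
}

function buildAiResponseBody(apiKey: string, r: AiResponseInput) {
  return {
    api_key: apiKey,
    events: [
      {
        event_type: "[Agent] AI Response",
        user_id: r.userId,
        event_properties: {
          "[Agent] Session ID": r.sessionId,
          "[Agent] Agent ID": r.agentId,
          "[Agent] Model": r.model,
          "[Agent] Provider": r.provider,
          "[Agent] Latency Ms": r.latencyMs,
          "[Agent] Input Tokens": r.inputTokens,
          "[Agent] Output Tokens": r.outputTokens,
          $llm_message: { text: r.text },
        },
      },
    ],
  };
}
```

POST JSON.stringify(buildAiResponseBody(...)) to httpApiUrl(region) with a Content-Type of application/json, as in the curl example.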

Verify your data

Run the doctor to validate env vars, installed dependencies, and the event-pipeline connection:

bash
npx amplitude-ai doctor

Then confirm events land in Amplitude:

  1. Open the project's Live Events stream.
  2. Send a test session from the instrumented code.
  3. Within seconds, an [Agent] AI Response event should appear with these properties populated:
    • [Agent] Session ID, [Agent] Agent ID
    • [Agent] Model, [Agent] Provider
    • [Agent] Latency Ms
    • [Agent] Input Tokens, [Agent] Output Tokens
    • [Agent] Cost USD

Test against a mock client

For CI, use MockAmplitudeAI from @amplitude/ai/testing to assert your events emit correctly:

typescript
import { AIConfig } from "@amplitude/ai";
import { MockAmplitudeAI } from "@amplitude/ai/testing";

const mock = new MockAmplitudeAI(new AIConfig({ contentMode: "full" }));
const agent = mock.agent("test-agent", { userId: "u1" });

await agent.session({ sessionId: "s1" }).run(async (s) => {
  s.trackUserMessage("hello");
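  // Token counts are needed for the token and cost assertions below to pass.
  s.trackAiMessage("response", "gpt-4o-mini", "openai", 150, {
    inputTokens: 12,
    outputTokens: 5,
  });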
});

mock.assertEventTracked("[Agent] User Message", { userId: "u1" });
mock.assertSessionClosed("s1");

// Data quality gate: every AI Response must carry the eight verification fields.
// `expect` comes from your test runner (for example, Jest or Vitest).
for (const e of mock.eventsOfType("[Agent] AI Response")) {
  const p = e.event_properties ?? {};
  expect(e.user_id || e.device_id).toBeTruthy();
  expect(p["[Agent] Session ID"]).toBeTruthy();
  expect(p["[Agent] Agent ID"]).toBeTruthy();
  expect(p["[Agent] Model"]).toBeTruthy();
  expect(p["[Agent] Provider"]).toBeTruthy();
  expect(p["[Agent] Latency Ms"]).toBeGreaterThan(0);
  expect(p["[Agent] Input Tokens"]).toBeGreaterThan(0);
  expect(p["[Agent] Output Tokens"]).toBeGreaterThan(0);
  expect(p["[Agent] Cost USD"]).toBeGreaterThan(0);
}

Keep this test in CI to catch silent instrumentation regressions. Problems such as bad model names or missing token counts produce broken dashboards without throwing at runtime.
