OBSERVABILITY

See everything your agents do

Distributed tracing, Prometheus metrics, and a full audit trail across all three pillars: tools, skills, and sandboxes. Monitor every tool call, sandbox execution, and skill invocation — with automatic secret masking built in.

MCP Gateway — Observability Dashboard

Real-time dashboard with tool call volume, success rates, and latency percentiles

HOW IT WORKS

Full visibility across tools, skills, and sandboxes

Distributed Tracing

Push traces to any OpenTelemetry-compatible backend — Datadog, Azure Application Insights, AWS CloudWatch, Google Cloud Operations, Grafana Cloud, New Relic, Splunk, or any custom OTLP endpoint. Auto-instruments FastAPI, SQLAlchemy, and HTTPX with zero code changes. Swap providers at runtime without restarting.

Telemetry Export settings — choose from 8 observability providers
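The runtime provider swap can be pictured as an exporter held behind a single mutable reference: repointing the reference reroutes all subsequent spans with no restart. The sketch below is illustrative only — the `TelemetryRouter` name and its methods are assumptions, not the gateway's actual internals.

```python
class TelemetryRouter:
    """Routes finished spans to whichever exporter is currently active.

    Swapping the exporter redirects all subsequent spans immediately;
    nothing needs to restart. (Illustrative sketch, not gateway internals.)
    """

    def __init__(self, exporter):
        self._exporter = exporter  # e.g. an OTLP, Datadog, or CloudWatch sender

    def swap(self, exporter):
        """Point new spans at a different backend at runtime."""
        self._exporter = exporter

    def export(self, span):
        self._exporter(span)


# Two stand-in backends that just collect spans in memory.
datadog_spans, grafana_spans = [], []
router = TelemetryRouter(datadog_spans.append)
router.export({"name": "tool_call", "tool": "github__create_pr"})
router.swap(grafana_spans.append)  # switch providers without restarting
router.export({"name": "tool_call", "tool": "slack__post_message"})
```

Each span lands in the backend that was active when it was exported, which is exactly the behavior a runtime swap needs.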

Prometheus Metrics

Pull-based metrics at /api/v1/metrics in Prometheus exposition format. HTTP request counts, latency percentile histograms (p50/p90/p95/p99), tool call success rates, active connections, and token usage — all with automatic UUID normalization to prevent high-cardinality explosion.

Latency percentile monitoring — p50 through p99 over 24 hours
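Because the endpoint speaks the standard Prometheus exposition format, any client can consume it. Here is a minimal, standard-library-only sketch of parsing a scraped payload (a real integration would let Prometheus or an existing client library do this):

```python
def parse_exposition(text):
    """Parse Prometheus exposition text into {metric{labels}: value}.

    Skips # HELP / # TYPE comment lines; on a sample line the value is
    always the last whitespace-separated token.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name_and_labels, value = line.rsplit(" ", 1)
        samples[name_and_labels] = float(value)
    return samples


# Sample payload in the same shape the /api/v1/metrics endpoint returns.
scrape = """\
# HELP http_requests_total Total HTTP requests
# TYPE http_requests_total counter
http_requests_total{endpoint="/api/v1/search",method="POST",status="200"} 156.0
# HELP server_status MCP server health
# TYPE server_status gauge
server_status{server="github",type="remote"} 1.0
"""
metrics = parse_exposition(scrape)
```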

Full Audit Trail

Every MCP tool call, sandbox command, and skill execution is logged to PostgreSQL with full-text search. See the operation type, source, arguments, response, latency, status, and user — with automatic secret masking for API keys, tokens, and passwords. Filter by status, source, date range, and export to CSV.

Request history — tool calls with arguments, responses, and latency
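The secret-masking step can be illustrated as a key-based redaction pass over tool-call arguments before they are written to the log. This is a sketch of the general technique, not the gateway's actual masking logic; the key patterns and the `***` mask string are made up for illustration.

```python
import re

# Key names treated as sensitive; any match is redacted before logging.
SENSITIVE_KEY = re.compile(r"(api[_-]?key|token|password|secret)", re.IGNORECASE)


def mask_secrets(arguments):
    """Return a copy of tool-call arguments with sensitive values redacted."""
    masked = {}
    for key, value in arguments.items():
        if SENSITIVE_KEY.search(key):
            masked[key] = "***"
        elif isinstance(value, dict):
            masked[key] = mask_secrets(value)  # mask nested argument objects too
        else:
            masked[key] = value
    return masked


safe = mask_secrets({
    "repo": "octocat/hello-world",
    "api_key": "sk-live-abc123",
    "auth": {"token": "ghp_xyz", "user": "octocat"},
})
```

Masking by key name at write time means the secret never reaches the audit store, so full-text search and CSV export stay safe by construction.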

Token Usage Analytics

Track LLM token consumption across every MCP server. Input tokens, output tokens, and total usage with time-range filtering and per-server breakdowns. Know exactly where your AI spend is going and optimize accordingly.

Token usage breakdown — input vs output tokens across servers
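The per-server breakdown amounts to summing input and output tokens over logged calls. A minimal sketch, assuming audit records carry `server`, `input_tokens`, and `output_tokens` fields (the field names are assumptions, not the documented schema):

```python
from collections import defaultdict


def per_server_totals(records):
    """Sum input/output tokens per MCP server from audit-style records."""
    totals = defaultdict(lambda: {"input": 0, "output": 0})
    for record in records:
        bucket = totals[record["server"]]
        bucket["input"] += record["input_tokens"]
        bucket["output"] += record["output_tokens"]
    # Add a derived total so callers can rank servers by overall spend.
    return {
        server: {**t, "total": t["input"] + t["output"]}
        for server, t in totals.items()
    }


usage = per_server_totals([
    {"server": "github", "input_tokens": 1200, "output_tokens": 300},
    {"server": "github", "input_tokens": 800, "output_tokens": 200},
    {"server": "slack", "input_tokens": 400, "output_tokens": 100},
])
```

Time-range filtering then reduces to selecting which records feed the aggregation.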

API REFERENCE

Complete Metrics API

Every metric is queryable. Build dashboards, set up alerts, or integrate with your existing monitoring stack.

GET /api/v1/metrics
# HELP http_requests_total Total HTTP requests
# TYPE http_requests_total counter
http_requests_total{endpoint="/api/v1/search",method="POST",status="200"} 156.0
# HELP tool_latency_seconds Tool execution latency
# TYPE tool_latency_seconds histogram
tool_latency_seconds_bucket{server="github",tool="create_pr",le="0.5"} 42.0
# HELP server_status MCP server health
# TYPE server_status gauge
server_status{server="github",type="remote"} 1.0
server_status{server="slack",type="remote"} 1.0
Prometheus exposition format
GET /api/v1/audit/tool-calls?status=error
{
  "items": [
    {
      "tool": "github__create_pr",
      "server": "github",
      "status": "error",
      "latency_ms": 2341,
      "error": "rate_limit_exceeded",
      "created_at": "2026-02-26T14:30:00Z"
    }
  ],
  "total": 3,
  "page": 1
}
Full-text search with pagination

Ready to see what your agents are doing?

Deploy MCP Gateway and get full visibility into every tool call, sandbox execution, skill invocation, and token spent.