OnPremiseAgent exposes a Prometheus-compatible metrics endpoint for comprehensive monitoring of your AI agent fleet. Track query latency, token usage, error rates, compliance violations, and resource utilization. Combine with Grafana for dashboards and Alertmanager for intelligent alerting — all within your infrastructure.
Authentication: Token, Service Account
Category: Observability
Compatible with: Prometheus, Thanos, Cortex, Mimir, VictoriaMetrics
Status: Available
Everything you need to integrate Prometheus into your on-premise agent workflows.
Standard /metrics endpoint exposing 50+ metrics including query latency, token usage, and error rates.
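For illustration, scraping the endpoint returns standard Prometheus text exposition format; the metric names below are hypothetical examples of what an agent-latency histogram might look like, not the product's documented names.

```
# HELP onpremiseagent_query_duration_seconds Agent query latency in seconds.
# TYPE onpremiseagent_query_duration_seconds histogram
onpremiseagent_query_duration_seconds_bucket{agent="support-bot",le="0.5"} 1274
onpremiseagent_query_duration_seconds_bucket{agent="support-bot",le="2"} 1391
onpremiseagent_query_duration_seconds_bucket{agent="support-bot",le="+Inf"} 1398
onpremiseagent_query_duration_seconds_sum{agent="support-bot"} 612.4
onpremiseagent_query_duration_seconds_count{agent="support-bot"} 1398
```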
Define custom metrics for business-specific KPIs like compliance scores and agent accuracy rates.
Pre-built Prometheus alert rules for common scenarios: high latency, error spikes, and exceeded token budgets.
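As a sketch of what such a rule could look like (the metric name and thresholds are illustrative assumptions, not the shipped rule set):

```yaml
groups:
  - name: onpremiseagent-alerts
    rules:
      - alert: HighQueryLatency
        # p95 latency over the last 5m, assuming a histogram metric
        # named onpremiseagent_query_duration_seconds (hypothetical).
        expr: >
          histogram_quantile(0.95,
            sum by (le) (rate(onpremiseagent_query_duration_seconds_bucket[5m])))
          > 2
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "p95 agent query latency above 2s for 10 minutes"
```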
Automatic service discovery for Kubernetes deployments — new agents are scraped without configuration changes.
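In a Kubernetes deployment, discovery is typically driven by Prometheus's `kubernetes_sd_configs`; a minimal sketch (the pod annotations are the common `prometheus.io/*` convention, assumed here rather than confirmed for this product):

```yaml
scrape_configs:
  - job_name: onpremiseagent-k8s
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Scrape only pods annotated prometheus.io/scrape: "true".
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Honor a custom metrics port if the pod declares one.
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
```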
Enable the Prometheus metrics endpoint in your OnPremiseAgent configuration (enabled by default on port 9090).
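With the endpoint enabled, a static scrape job is the simplest way to wire it into Prometheus; the hostname below is a placeholder for your agent host:

```yaml
scrape_configs:
  - job_name: onpremiseagent
    scrape_interval: 15s
    static_configs:
      - targets: ["agent-host.internal:9090"]  # default metrics port
```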
Track agent response times and availability against SLA targets with automatic alerting on violations.
Monitor token usage per agent and department to optimize costs and enforce budget limits.
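A per-agent cost breakdown like this is a natural PromQL aggregation; the counter name and label names below are hypothetical examples:

```promql
# Tokens consumed per agent over the last 24 hours
sum by (agent) (increase(onpremiseagent_tokens_total[24h]))

# Same breakdown rolled up by department for budget enforcement
sum by (department) (increase(onpremiseagent_tokens_total[24h]))
```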
Track compliance violation rates and policy enforcement metrics for regulatory reporting.
Query count, latency histograms, token usage, error rates, active agents, compliance scores, and resource utilization — plus custom metrics you define.
Yes. The metrics endpoint is fully compatible with any system that scrapes Prometheus-format metrics.
Combine Prometheus with these connectors for a complete integration stack.
Deploy on your own infrastructure with full data sovereignty. Get started in minutes.