Performance Monitor Agent

The Performance Monitor Agent tracks system performance metrics, identifies bottlenecks, and provides optimization recommendations.

📋 Overview

| Property | Value |
|----------|-------|
| Module | src.agents.monitoring.performance_monitor_agent |
| Class | PerformanceMonitorAgent |
| Author | UIP Team |
| Version | 1.0.0 |

🎯 Purpose

The Performance Monitor Agent provides:

  • Real-time performance tracking for all system components
  • Resource utilization monitoring (CPU, memory, I/O)
  • Latency analysis for API endpoints and database queries
  • Bottleneck identification and optimization recommendations
  • Historical trend analysis for capacity planning

📊 Metrics Collected

System Metrics

| Metric | Unit | Description |
|--------|------|-------------|
| cpu_usage | % | CPU utilization |
| memory_usage | MB | Memory consumption |
| disk_io | MB/s | Disk read/write rate |
| network_io | MB/s | Network throughput |

Application Metrics

| Metric | Unit | Description |
|--------|------|-------------|
| request_latency | ms | API response time |
| request_throughput | req/s | Requests per second |
| error_rate | % | Percentage of failed requests |
| active_connections | count | Concurrent connections |

Agent Metrics

| Metric | Unit | Description |
|--------|------|-------------|
| agent_execution_time | ms | Agent processing time |
| entities_processed | count | Entities per execution |
| queue_depth | count | Pending operations |

🔧 Architecture

┌─────────────────────────────────────────────┐
│          Performance Monitor Agent          │
├─────────────────────────────────────────────┤
│                                             │
│  ┌─────────┐  ┌─────────┐  ┌─────────────┐  │
│  │ System  │  │   App   │  │    Agent    │  │
│  │ Metrics │  │ Metrics │  │   Metrics   │  │
│  └────┬────┘  └────┬────┘  └──────┬──────┘  │
│       │            │              │         │
│       └────────────┼──────────────┘         │
│                    ▼                        │
│            ┌───────────────┐                │
│            │  Aggregator   │                │
│            └───────┬───────┘                │
│                    │                        │
│        ┌───────────┼────────────┐           │
│        ▼           ▼            ▼           │
│   ┌─────────┐ ┌──────────┐ ┌─────────┐      │
│   │Time-    │ │Prometheus│ │  Alert  │      │
│   │Series DB│ │  Export  │ │ Engine  │      │
│   └─────────┘ └──────────┘ └─────────┘      │
└─────────────────────────────────────────────┘
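The flow in the diagram (three metric collectors feeding one aggregator, which fans out to storage, export, and alerting) can be sketched roughly as follows. This is an illustration under stated assumptions, not the agent's actual implementation: the `Aggregator` class and its `add_collector`, `add_sink`, and `collect_once` methods are hypothetical names, and the collectors return fixed sample values.

```python
from typing import Callable, Dict, List

Metrics = Dict[str, float]


class Aggregator:
    """Hypothetical aggregator: merges collector output and fans it out to sinks."""

    def __init__(self) -> None:
        self._collectors: List[Callable[[], Metrics]] = []
        self._sinks: List[Callable[[Metrics], None]] = []

    def add_collector(self, collector: Callable[[], Metrics]) -> None:
        self._collectors.append(collector)

    def add_sink(self, sink: Callable[[Metrics], None]) -> None:
        self._sinks.append(sink)

    def collect_once(self) -> Metrics:
        # Merge the output of every collector into one snapshot.
        snapshot: Metrics = {}
        for collector in self._collectors:
            snapshot.update(collector())
        # Hand a copy of the snapshot to every sink (storage, export, alerting).
        for sink in self._sinks:
            sink(dict(snapshot))
        return snapshot


# Stand-in collectors with fixed values, for illustration only.
def system_metrics() -> Metrics:
    return {"cpu_usage": 42.0, "memory_usage": 512.0}


def app_metrics() -> Metrics:
    return {"request_latency": 120.0, "error_rate": 0.5}


def agent_metrics() -> Metrics:
    return {"queue_depth": 3.0}


stored: List[Metrics] = []  # stands in for the time-series DB sink

aggregator = Aggregator()
for collector in (system_metrics, app_metrics, agent_metrics):
    aggregator.add_collector(collector)
aggregator.add_sink(stored.append)

snapshot = aggregator.collect_once()
print(snapshot)
```

Keeping collectors and sinks as plain callables means a Prometheus exporter or an alert engine could be attached without changing the aggregator itself.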

🚀 Usage

Basic Monitoring

from src.agents.monitoring.performance_monitor_agent import PerformanceMonitorAgent

monitor = PerformanceMonitorAgent()

# Start monitoring
monitor.start()

# Get current metrics
metrics = monitor.get_metrics()
print(f"CPU Usage: {metrics['cpu_usage']}%")
print(f"Memory Usage: {metrics['memory_usage']}MB")
print(f"Request Latency: {metrics['avg_latency']}ms")

Track Specific Operations

# Track API endpoint performance
# (the await below must run inside an async function)
with monitor.track("api.cameras.list"):
    cameras = await get_cameras()

# Track database query
with monitor.track("db.neo4j.query"):
    results = neo4j.query(cypher)

# Get timing statistics
stats = monitor.get_stats("api.cameras.list")
print(f"Avg: {stats['avg_ms']}ms, P95: {stats['p95_ms']}ms")
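To show how a `track` context manager and the `avg_ms` / `p95_ms` statistics might fit together, here is a minimal sketch using only the standard library. `TimingTracker` is a hypothetical stand-in, not the agent's class, and the nearest-rank percentile is an assumption; the real agent may compute P95 differently.

```python
import math
import time
from collections import defaultdict
from contextlib import contextmanager
from typing import Dict, Iterator, List


class TimingTracker:
    """Hypothetical stand-in for the timing part of the monitor."""

    def __init__(self) -> None:
        self._samples_ms: Dict[str, List[float]] = defaultdict(list)

    @contextmanager
    def track(self, name: str) -> Iterator[None]:
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the duration even if the tracked block raises.
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            self._samples_ms[name].append(elapsed_ms)

    def get_stats(self, name: str) -> Dict[str, float]:
        samples = sorted(self._samples_ms.get(name, []))
        if not samples:
            raise KeyError(f"no samples recorded for {name!r}")
        # Nearest-rank percentile: the smallest sample that has at least
        # 95% of all samples at or below it.
        rank = max(1, math.ceil(0.95 * len(samples)))
        return {
            "count": float(len(samples)),
            "avg_ms": sum(samples) / len(samples),
            "p95_ms": samples[rank - 1],
        }


tracker = TimingTracker()
for _ in range(5):
    with tracker.track("demo.sleep"):
        time.sleep(0.001)

stats = tracker.get_stats("demo.sleep")
print(f"Avg: {stats['avg_ms']:.2f}ms, P95: {stats['p95_ms']:.2f}ms")
```

Because the timer wraps the whole `with` body, an awaited call inside it is measured as wall-clock time, including any time the coroutine spends suspended.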

Custom Metrics

# Register custom metric
monitor.register_metric(
    name="active_websockets",
    type="gauge",
    description="Number of active WebSocket connections"
)

# Update metric
monitor.set_metric("active_websockets", 42)

# Increment counter
monitor.increment_metric("requests_total")
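The difference between a gauge (set to an arbitrary value) and a counter (only ever incremented) can be sketched as below. `MetricRegistry` and `Metric` are hypothetical names for illustration. The sketch also assumes that counters must be registered before use, which may not match the agent's behaviour.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Metric:
    name: str
    type: str  # "gauge" or "counter"
    description: str = ""
    value: float = 0.0


class MetricRegistry:
    """Hypothetical registry; the agent's own validation rules may differ."""

    def __init__(self) -> None:
        self._metrics: Dict[str, Metric] = {}

    def register_metric(self, name: str, type: str, description: str = "") -> None:
        if type not in ("gauge", "counter"):
            raise ValueError(f"unsupported metric type: {type}")
        self._metrics[name] = Metric(name, type, description)

    def set_metric(self, name: str, value: float) -> None:
        metric = self._metrics[name]
        if metric.type != "gauge":
            raise TypeError(f"{name} is a {metric.type}; only gauges can be set")
        metric.value = value

    def increment_metric(self, name: str, amount: float = 1.0) -> None:
        metric = self._metrics[name]
        if metric.type != "counter":
            raise TypeError(f"{name} is a {metric.type}; only counters can be incremented")
        if amount < 0:
            raise ValueError("counters can only increase")
        metric.value += amount

    def get(self, name: str) -> float:
        return self._metrics[name].value


registry = MetricRegistry()
registry.register_metric(
    name="active_websockets",
    type="gauge",
    description="Number of active WebSocket connections",
)
registry.register_metric(name="requests_total", type="counter")

registry.set_metric("active_websockets", 42)
registry.increment_metric("requests_total")
registry.increment_metric("requests_total")
print(registry.get("active_websockets"), registry.get("requests_total"))
```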

⚙️ Configuration

# config/performance_monitor_config.yaml
performance_monitor:
  enabled: true
  collection_interval_seconds: 10

  # Metrics to collect
  system_metrics:
    - cpu_usage
    - memory_usage
    - disk_io
    - network_io

  # Application metrics
  app_metrics:
    track_endpoints: true
    track_database_queries: true
    track_agent_execution: true

  # Alerting thresholds
  thresholds:
    cpu_warning: 70
    cpu_critical: 90
    memory_warning: 80
    memory_critical: 95
    latency_warning_ms: 500
    latency_critical_ms: 2000

  # Export configuration
  export:
    prometheus:
      enabled: true
      port: 9091
    timeseries_db:
      enabled: true
      url: http://localhost:8086
      database: uip_metrics
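As an illustration of how the warning and critical thresholds might be applied, the sketch below classifies a reading as "ok", "warning", or "critical". The `classify` helper and the `LEVEL_KEYS` mapping are hypothetical, and treating a value equal to the threshold as a breach is an assumption; the threshold values are copied from the configuration.

```python
from typing import Dict, Tuple

# Values copied from the thresholds section of the configuration.
THRESHOLDS: Dict[str, float] = {
    "cpu_warning": 70,
    "cpu_critical": 90,
    "memory_warning": 80,
    "memory_critical": 95,
    "latency_warning_ms": 500,
    "latency_critical_ms": 2000,
}

# Hypothetical mapping from a metric family to its (warning, critical) keys.
LEVEL_KEYS: Dict[str, Tuple[str, str]] = {
    "cpu": ("cpu_warning", "cpu_critical"),
    "memory": ("memory_warning", "memory_critical"),
    "latency": ("latency_warning_ms", "latency_critical_ms"),
}


def classify(family: str, value: float, thresholds: Dict[str, float] = THRESHOLDS) -> str:
    """Return "ok", "warning", or "critical" for one reading.

    Assumes a reading equal to a threshold already counts as a breach.
    """
    warning_key, critical_key = LEVEL_KEYS[family]
    if value >= thresholds[critical_key]:
        return "critical"
    if value >= thresholds[warning_key]:
        return "warning"
    return "ok"


print(classify("cpu", 75))
print(classify("latency", 2500))
print(classify("memory", 10))
```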

📈 Dashboard Integration

Grafana Queries

# Average request latency
rate(request_latency_sum[5m]) / rate(request_latency_count[5m])

# CPU usage by service
avg(cpu_usage) by (service)

# Request throughput
sum(rate(requests_total[1m])) by (endpoint)

# Error rate
sum(rate(requests_failed_total[5m])) / sum(rate(requests_total[5m])) * 100
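To make the arithmetic behind these queries concrete, the sketch below computes the same ratios from two scrapes of the cumulative counters. It is a simplification: the `Scrape` structure and `summarize` helper are hypothetical, latency sums are assumed to be in milliseconds, and counter resets and Prometheus' extrapolation at window edges are ignored.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Scrape:
    """Cumulative counter values at one scrape (hypothetical structure)."""
    timestamp_s: float
    request_latency_sum: float  # assumed to be in milliseconds
    request_latency_count: float
    requests_total: float
    requests_failed_total: float


def rate(old: float, new: float, seconds: float) -> float:
    """Per-second increase of a counter, ignoring resets and extrapolation."""
    return (new - old) / seconds


def summarize(start: Scrape, end: Scrape) -> Dict[str, float]:
    """Mirror the latency, throughput, and error-rate queries for one window.

    Assumes at least one request was served in the window.
    """
    window = end.timestamp_s - start.timestamp_s
    sum_rate = rate(start.request_latency_sum, end.request_latency_sum, window)
    count_rate = rate(start.request_latency_count, end.request_latency_count, window)
    total_rate = rate(start.requests_total, end.requests_total, window)
    failed_rate = rate(start.requests_failed_total, end.requests_failed_total, window)
    return {
        "avg_latency_ms": sum_rate / count_rate,
        "throughput_rps": total_rate,
        "error_rate_pct": failed_rate / total_rate * 100,
    }


# Two made-up scrapes, five minutes apart.
start = Scrape(0.0, 10_000.0, 100.0, 100.0, 2.0)
end = Scrape(300.0, 46_000.0, 400.0, 400.0, 8.0)

summary = summarize(start, end)
print(summary)
```

With these sample values, 300 requests took 36,000 ms in total over the window, so the average latency comes out at about 120 ms, with roughly 1 request per second and a 2% error rate.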

Sample Dashboard

{
  "title": "UIP Performance Dashboard",
  "panels": [
    {
      "title": "Request Latency (P95)",
      "type": "graph",
      "targets": [
        {
          "expr": "histogram_quantile(0.95, rate(request_latency_bucket[5m]))"
        }
      ]
    },
    {
      "title": "Throughput",
      "type": "stat",
      "targets": [
        {
          "expr": "sum(rate(requests_total[1m]))"
        }
      ]
    }
  ]
}
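The P95 panel relies on `histogram_quantile`, which estimates a quantile from cumulative bucket counts by interpolating linearly inside the bucket that contains the target rank. The function below is a simplified sketch of that idea, not Prometheus' exact implementation: it assumes sorted cumulative buckets ending in a `+Inf` bucket, a lowest bucket starting at 0, and made-up bucket bounds in milliseconds.

```python
import math
from typing import List, Tuple


def histogram_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate quantile q from cumulative (upper_bound, count) buckets.

    Simplified: expects buckets sorted by upper bound, cumulative counts,
    and a final bucket whose upper bound is +Inf.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    if not buckets or not math.isinf(buckets[-1][0]):
        raise ValueError("the last bucket must have an upper bound of +Inf")

    total = buckets[-1][1]
    if total == 0:
        return math.nan

    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for upper, count in buckets:
        if count >= rank:
            if math.isinf(upper):
                # The quantile falls in the +Inf bucket: report the highest finite bound.
                return prev_bound
            if count == prev_count:
                # Empty bucket (only reachable when rank is 0): nothing to interpolate.
                return prev_bound
            # Linear interpolation between the bucket's lower and upper bound.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (upper - prev_bound) * fraction
        prev_bound, prev_count = upper, count
    return prev_bound


# Made-up cumulative latency buckets: (upper bound in ms, observations <= bound).
latency_buckets = [(100.0, 50.0), (250.0, 80.0), (500.0, 95.0), (1000.0, 99.0), (math.inf, 100.0)]

p50 = histogram_quantile(0.5, latency_buckets)
p90 = histogram_quantile(0.9, latency_buckets)
print(f"P50 ~ {p50:.1f} ms, P90 ~ {p90:.1f} ms")
```

The estimate is only as precise as the bucket layout: here the 90th percentile lands in the 250 to 500 ms bucket and is reported as about 417 ms, whatever the true distribution inside that bucket.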

🛡️ Performance Alerts

# Configure performance alerts
# (notify_ops and trigger_gc are assumed to be callables defined by the application)
monitor.add_alert(
    name="high_latency",
    condition=lambda m: m['avg_latency'] > 1000,
    severity="warning",
    action=lambda: notify_ops("High API latency detected")
)

monitor.add_alert(
    name="memory_critical",
    condition=lambda m: m['memory_usage'] > 95,
    severity="critical",
    action=lambda: trigger_gc()
)
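One way an alert engine might evaluate such conditions against each metrics snapshot is sketched below. `AlertEngine` and `evaluate` are hypothetical names. Firing an action only when a condition changes from false to true (rather than on every cycle) is a design assumption, not documented behaviour.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

Metrics = Dict[str, float]


@dataclass
class Alert:
    name: str
    condition: Callable[[Metrics], bool]
    severity: str
    action: Callable[[], None]


class AlertEngine:
    """Hypothetical engine: runs an action when an alert's condition becomes true."""

    def __init__(self) -> None:
        self._alerts: List[Alert] = []
        self._firing: Set[str] = set()

    def add_alert(self, name: str, condition: Callable[[Metrics], bool],
                  severity: str, action: Callable[[], None]) -> None:
        self._alerts.append(Alert(name, condition, severity, action))

    def evaluate(self, metrics: Metrics) -> List[str]:
        """Check every alert against one snapshot; return the names that fired."""
        fired: List[str] = []
        for alert in self._alerts:
            try:
                active = bool(alert.condition(metrics))
            except KeyError:
                # Metric missing from this snapshot: treat the alert as inactive.
                active = False
            if active and alert.name not in self._firing:
                self._firing.add(alert.name)
                alert.action()
                fired.append(alert.name)
            elif not active:
                # Condition cleared, so the alert may fire again later.
                self._firing.discard(alert.name)
        return fired


notifications: List[str] = []  # stands in for notify_ops / trigger_gc side effects

engine = AlertEngine()
engine.add_alert(
    name="high_latency",
    condition=lambda m: m["avg_latency"] > 1000,
    severity="warning",
    action=lambda: notifications.append("High API latency detected"),
)
engine.add_alert(
    name="memory_critical",
    condition=lambda m: m["memory_usage"] > 95,
    severity="critical",
    action=lambda: notifications.append("gc triggered"),
)

first = engine.evaluate({"avg_latency": 1500.0, "memory_usage": 40.0})
repeat = engine.evaluate({"avg_latency": 1500.0, "memory_usage": 40.0})
cleared = engine.evaluate({"avg_latency": 200.0, "memory_usage": 40.0})
again = engine.evaluate({"avg_latency": 1200.0, "memory_usage": 97.0})
print(first, repeat, cleared, again)
```

Tracking which alerts are already firing avoids sending the same notification on every collection interval while a problem persists.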

See the complete agents reference for all available agents.