Monitoring Blueprint
Complete Observability Stack
RedQueen is part of a comprehensive monitoring infrastructure built on Prometheus, OpenSearch, and Grafana, deployed through Terraform, and managed via Flux CD.
Monitoring Architecture
How data flows through the observability stack.
Stack Components
Industry-standard tools, custom-configured for your needs.
Prometheus
Metrics collection and alerting
- 30s scrape interval
- 10-year retention
- 100+ custom alerts
- kube-prometheus-stack
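As a rough sizing aid, the 30s scrape interval and 10-year retention listed above imply a fixed number of samples per time series. The sketch below is back-of-the-envelope arithmetic only. It assumes 365-day years and ignores compaction, downsampling, and any remote storage layer, none of which are described here. The function name is illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: 365-day years, no leap days


def samples_per_series(scrape_interval_s: int, retention_years: int) -> int:
    """Number of samples one series accumulates over the retention window."""
    if scrape_interval_s <= 0:
        raise ValueError("scrape interval must be positive")
    return (SECONDS_PER_YEAR * retention_years) // scrape_interval_s


if __name__ == "__main__":
    # 30s scrape interval, 10-year retention (values from the list above)
    print(samples_per_series(30, 10))  # 10512000
```

Multiplying this figure by the number of active series gives a first estimate of how many samples the retention window has to hold.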
OpenSearch
Log aggregation and search
- App, ALB, WAF logs
- Infrastructure logs
- Full-text search
- Dashboards UI
Grafana
Visualization and dashboards
- Pre-built dashboards
- OpenSearch datasource
- Anonymous access
- Alert visualization
100+ Alert Rules
Comprehensive alerting across all layers of your infrastructure.
OOM & Resources
ContainerOOMKilled
KubeContainerOOM
HighCPUUsage
HighMemoryUsage
Response Time
ResponseTime90thPercentile
AverageResponseTimeHigh
RequestLatencySpike
Error Rates
5xxErrorRateHigh
4xxErrorRateHigh
ErrorBudgetBurn
Saturation
RequestSaturation
RequestCountAnomaly
PodReplicasTooLow
Kubernetes
PodCrashLoopBackOff
PodNotReady
DeploymentReplicasMismatch
NodeNotReady
Infrastructure
DiskSpaceLow
NetworkLatencyHigh
CertificateExpiring
Log Indexes
Structured log storage in OpenSearch.
Application Logs
{env}-{cluster}-app-*
stdout/stderr from all pods
ALB Logs
{env}-{cluster}-alb-logs-*
Load balancer access logs
WAF Logs
{env}-{cluster}-waf-logs-*
Web Application Firewall events
Infrastructure Logs
{env}-{cluster}-infra-*
Kubernetes system components
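The four index patterns share one naming scheme: environment, cluster, a log-type suffix, and a trailing wildcard. A minimal sketch of that scheme follows. The helper name, the short log-type keys, and the example values `prod` and `main` are hypothetical; only the pattern shapes come from the list above.

```python
# Log-type suffixes as they appear in the index patterns listed above.
# The short keys on the left are illustrative, not part of the stack.
LOG_SUFFIXES = {
    "app": "app",
    "alb": "alb-logs",
    "waf": "waf-logs",
    "infra": "infra",
}


def index_pattern(env: str, cluster: str, kind: str) -> str:
    """Build the OpenSearch index pattern for one log type."""
    try:
        suffix = LOG_SUFFIXES[kind]
    except KeyError:
        raise ValueError(f"unknown log type: {kind!r}") from None
    return f"{env}-{cluster}-{suffix}-*"


if __name__ == "__main__":
    # "prod" and "main" are placeholder values for {env} and {cluster}.
    for kind in LOG_SUFFIXES:
        print(index_pattern("prod", "main", kind))
```

Generating the patterns from one table keeps dashboards and queries in step with the naming scheme if a new log type is added.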
Alert Routing
Different alerts go to different destinations.
OOM Alerts → Slack #ops
Performance Alerts (RT-tagged) → SNS → RedQueen → Slack
General Alerts → Slack #alerts
Cost Alerts → SNS → Email
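The routing table above can be read as a small decision function. The sketch below is illustrative only and is not the stack's actual Alertmanager configuration. The label keys `category` and `tag`, and the order in which the rules are checked, are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Alert:
    name: str
    # Hypothetical labels: "category" (oom, cost, ...) and "tag" (RT).
    labels: Dict[str, str] = field(default_factory=dict)


def route(alert: Alert) -> List[str]:
    """Return the ordered delivery path for an alert."""
    category = alert.labels.get("category")
    if category == "oom":
        return ["Slack #ops"]
    if alert.labels.get("tag") == "RT":
        # Performance alerts pass through RedQueen before reaching Slack.
        return ["SNS", "RedQueen", "Slack"]
    if category == "cost":
        return ["SNS", "Email"]
    # Everything else is treated as a general alert.
    return ["Slack #alerts"]


if __name__ == "__main__":
    print(route(Alert("ContainerOOMKilled", {"category": "oom"})))
    print(route(Alert("ResponseTime90thPercentile", {"tag": "RT"})))
    print(route(Alert("DiskSpaceLow")))
```

Falling through to the general channel means an alert with no recognized labels still reaches someone.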