Commit Graph

5 Commits

Author SHA1 Message Date
a3003d597f PERF(observability): adjust resources based on VPA
- Update blackbox-exporter cpu 15m→23m, memory 64Mi→100Mi
- Update grafana cpu 11m→23m, memory 425Mi→175Mi
- Update loki cpu 23m→63m, memory 462Mi→363Mi
- Update tempo cpu 50m→15m, memory 128Mi→100Mi
- Update thanos memory 128Mi→283Mi
- Update node-exporter memory 64Mi→100Mi
- Update kube-state-metrics memory 100Mi→105Mi
- Update opentelemetry-operator cpu 10m→11m, memory 256Mi→75Mi
- Update vpa memory 128Mi→100Mi
2026-01-10 14:33:40 +09:00
9e218a8adc PERF(observability): reduce replicas, add priority
- Reduce Prometheus replicas from 2 to 1
- Reduce Grafana replicas from 2 to 1
- Reduce Blackbox-exporter replicas from 2 to 1
- Move Loki, Thanos, Tempo to workers (remove tolerations)
- Add medium-priority to Prometheus, Loki, Thanos, Tempo
2026-01-10 13:15:03 +09:00
ffed27419a REFACTOR(blackbox-exporter): revert to http_2xx module
- Remove http_auth module workaround
- Authelia now bypasses internal cluster traffic
- All endpoints use standard http_2xx module
2026-01-09 21:42:35 +09:00
37c216c433 FIX(blackbox-exporter): handle Authelia-protected endpoints
- Add http_auth module accepting 401/403 status codes
- Apply http_auth to grafana, code-server, pgweb, velero-ui
- These services return 401 when accessed without authentication
2026-01-09 21:42:35 +09:00
884a38d8ad FEAT(blackbox-exporter): add external endpoint monitoring
- Add blackbox-exporter with prometheus-community Helm chart
- Configure HTTP probes for 25 external endpoints
- Include SSL certificate expiry alerting rules
- Add probe failure and slow response alerts
- Deploy 2 replicas with anti-affinity for HA
2026-01-09 21:42:35 +09:00
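
The per-component figures in commit a3003d597f can be summed to see the net effect of the VPA-based adjustment. The following is a minimal sketch, not part of the repository: the `CHANGES` table is copied from the commit message, while `parse_quantity` and `net_change` are hypothetical helper names introduced here. It assumes the values are per-container settings and counts each component once, so it ignores replica counts and the number of nodes running node-exporter. The log does not say whether the values are requests or limits.

```python
import re

# Values copied from the message of commit a3003d597f:
# (component, resource, old value, new value)
CHANGES = [
    ("blackbox-exporter", "cpu", "15m", "23m"),
    ("blackbox-exporter", "memory", "64Mi", "100Mi"),
    ("grafana", "cpu", "11m", "23m"),
    ("grafana", "memory", "425Mi", "175Mi"),
    ("loki", "cpu", "23m", "63m"),
    ("loki", "memory", "462Mi", "363Mi"),
    ("tempo", "cpu", "50m", "15m"),
    ("tempo", "memory", "128Mi", "100Mi"),
    ("thanos", "memory", "128Mi", "283Mi"),
    ("node-exporter", "memory", "64Mi", "100Mi"),
    ("kube-state-metrics", "memory", "100Mi", "105Mi"),
    ("opentelemetry-operator", "cpu", "10m", "11m"),
    ("opentelemetry-operator", "memory", "256Mi", "75Mi"),
    ("vpa", "memory", "128Mi", "100Mi"),
]


def parse_quantity(quantity):
    """Return the integer part of a quantity such as '23m' or '100Mi'.

    Only the two suffixes that appear in the commit message are handled:
    'm' (millicores) and 'Mi' (mebibytes). Anything else is rejected.
    """
    match = re.fullmatch(r"(\d+)(m|Mi)", quantity)
    if match is None:
        raise ValueError(f"unsupported quantity: {quantity!r}")
    return int(match.group(1))


def net_change(changes, resource):
    """Sum (new - old) over all entries for one resource type."""
    return sum(
        parse_quantity(new) - parse_quantity(old)
        for _, res, old, new in changes
        if res == resource
    )


print(f"cpu: {net_change(CHANGES, 'cpu'):+d}m")           # cpu: +26m
print(f"memory: {net_change(CHANGES, 'memory'):+d}Mi")    # memory: -354Mi
```

Under these assumptions the commit raises CPU by 26m and lowers memory by 354Mi across the listed components, with most of the memory reduction coming from grafana and opentelemetry-operator.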