Commit Graph

7 Commits

Author SHA1 Message Date
9e218a8adc PERF(observability): reduce replicas, add priority
- Reduce Prometheus replicas from 2 to 1
- Reduce Grafana replicas from 2 to 1
- Reduce Blackbox-exporter replicas from 2 to 1
- Move Loki, Thanos, Tempo to workers (remove tolerations)
- Add medium-priority to Prometheus, Loki, Thanos, Tempo
2026-01-10 13:15:03 +09:00
94af545120 REFACTOR(thanos): remove S3 storage integration
- Disable Store Gateway and Compactor
- Remove Sidecar objectStorageConfig
- Keep Thanos Query + Sidecar for HA query
- 3-day local retention is sufficient
2026-01-09 21:42:35 +09:00
14bd244b98 FIX(thanos): increase compactor memory to 256Mi
- Compactor was OOMKilled with 128Mi limit
- Set to 256Mi for stability during compaction
2026-01-09 21:42:35 +09:00
5089e8607d CHORE(resources): set memory limits equal to memory requests
Align memory limits with memory requests for guaranteed QoS class.
- prometheus, thanos (query, storegateway, compactor)
- alertmanager, tempo, goldilocks (dashboard, controller)
- node-exporter, opentelemetry-collector, vpa, kube-state-metrics
2026-01-09 21:42:35 +09:00
aecb15031d FEAT(grafana): add Thanos as default datasource
- Add Thanos Query as default Prometheus datasource
- Keep original Prometheus datasource as backup
- Thanos provides deduplicated metrics from HA Prometheus

REFACTOR(thanos): move all components to master node

- Add tolerations for control-plane:NoSchedule
- Add nodeSelector for control-plane node
- Affects: query, storegateway, compactor
- PVC will be recreated on master node (data in S3)

FIX(thanos): allow non-Bitnami images (quay.io/thanos)

FIX(thanos): correct nodeSelector value to 'true'
2026-01-09 21:41:52 +09:00
9b052b49cf FEAT(thanos): add Thanos for Prometheus HA
- Add Thanos Query, Store Gateway, Compactor
- Enable Prometheus Sidecar with S3 (MinIO) storage
- Configure OCI registry for Bitnami chart
- Fix Vault secret path and image settings
- Add nodeSelector for master node
2026-01-09 21:41:52 +09:00
6b576d6a16 FEAT(thanos): add Thanos for Prometheus HA and long-term storage
- Add Thanos Query, Store Gateway, Compactor
- Enable Prometheus Sidecar with S3 (MinIO) storage
- Configure Prometheus replicas: 2 with pod anti-affinity
- Add ExternalSecrets for MinIO credentials
- Retention: raw 7d, 5m downsampled 30d, 1h downsampled 90d
2026-01-09 21:41:52 +09:00