Commit Graph

12 Commits

Author SHA1 Message Date
d079b8582a PERF(platform): use 20% memory increase instead of VPA
- Update argocd controller memory 1700Mi→2040Mi (+20%)
- Update argocd server memory 138Mi→166Mi (+20%)
- Update argocd repo-server memory 1536Mi→1843Mi (+20%)
- Update cert-manager memory 96Mi→115Mi (+20%)
- Update cert-manager webhook memory 96Mi→115Mi (+20%)
- Update cert-manager cainjector memory 192Mi→230Mi (+20%)
2026-01-10 14:37:21 +09:00
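
The +20% figures in d079b8582a can be re-derived with a few lines of Python. This is only a sketch: the helper name `bump_memory` and the round-half-up rule are assumptions for illustration, not taken from the repository. The input values are the ones listed in the commit body.

```python
def bump_memory(mebibytes: int, percent: int = 20) -> int:
    """Increase a Mi value by `percent`, rounding half up to a whole Mi."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return (mebibytes * (100 + percent) + 50) // 100


# "Before" values as listed in the commit message (in Mi).
before = {
    "argocd controller": 1700,
    "argocd server": 138,
    "argocd repo-server": 1536,
    "cert-manager": 96,
    "cert-manager webhook": 96,
    "cert-manager cainjector": 192,
}

after = {name: bump_memory(mi) for name, mi in before.items()}

for name, mi in before.items():
    print(f"{name}: {mi}Mi -> {after[name]}Mi")
```

Under these assumptions the computed values agree with the six "after" figures in the commit (2040, 166, 1843, 115, 115, 230).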
26ca07623e PERF(platform): adjust resources based on VPA
- Update argocd controller cpu 126m→350m, memory 1700Mi→640Mi
- Update argocd server memory 138Mi→121Mi
- Update argocd repo-server cpu 15m→49m, memory 1536Mi→933Mi
- Update argocd-image-updater cpu 10m→15m, memory 64Mi→100Mi
- Update cert-manager cpu 23m→15m, memory 96Mi→100Mi
- Update cert-manager webhook cpu 23m→15m, memory 96Mi→100Mi
- Update cert-manager cainjector cpu 23m→15m, memory 192Mi→237Mi
2026-01-10 14:31:28 +09:00
57ef8ebca1 PERF(cert-manager): reduce replicas to 1
- Reduce cert-manager replicas to 1
- Reduce cainjector replicas to 1
- Reduce webhook replicas to 1
2026-01-10 13:31:46 +09:00
03ca19b771 FEAT(argocd): enable ServiceMonitor for metrics collection
- Add serviceMonitor.enabled: true to controller, server, repoServer
- Allows Prometheus to scrape ArgoCD metrics

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 03:36:29 +09:00
249e451990 FIX(cert-manager): enable ServiceMonitor for Prometheus
- Enable ServiceMonitor so Prometheus can scrape cert-manager directly
- Fix metrics missing from the Grafana dashboard after the OTel migration
- cert-manager uses the exported_namespace label, which requires a ServiceMonitor
2026-01-10 02:56:02 +09:00
f5ea1b9fc6 CHORE(cert-manager): increase cainjector memory
- Increase cainjector memory request and limit from 96Mi to 192Mi
- Maintain CPU request at 23m
2026-01-10 02:09:27 +09:00
a422382bc2 FIX(cert-manager): increase memory to prevent OOM
- Increase controller memory from 64Mi to 96Mi
- Increase webhook memory from 64Mi to 96Mi
- Increase cainjector memory from 64Mi to 96Mi
- Increase CPU requests from 15m to 23m (1.5x)
2026-01-10 01:17:36 +09:00
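
The "15m to 23m (1.5x)" step in a422382bc2 depends on how 22.5 is rounded. The sketch below assumes round-half-up; the helper name `scale_quantity` is hypothetical. It also shows that Python's built-in `round` uses banker's rounding and would give a different result.

```python
def scale_quantity(value: int, numerator: int, denominator: int) -> int:
    """Scale an integer quantity by numerator/denominator, rounding half up."""
    # floor(x + 1/2) with x = value * numerator / denominator, in integers only.
    return (2 * value * numerator + denominator) // (2 * denominator)


cpu_after = scale_quantity(15, 3, 2)     # 15m * 1.5 = 22.5m, rounded half up
memory_after = scale_quantity(64, 3, 2)  # 64Mi to 96Mi is also a 1.5x step
bankers = round(15 * 1.5)                # round() rounds 22.5 to the even 22

print(cpu_after, memory_after, bankers)
```

With half-up rounding the CPU request lands on 23m, as in the commit; `round(22.5)` would yield 22m instead.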
56af1a9a17 CHORE(resources): set memory limits equal to memory requests
- Align memory limits with memory requests so memory is never overcommitted (the Guaranteed QoS class would additionally require CPU limits equal to CPU requests)
- argocd: controller, server, repoServer, redis
- traefik: main container
- cert-manager: main, webhook, cainjector
- argocd-image-updater: main container
2026-01-10 01:17:35 +09:00
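
Which QoS class a pod receives depends on CPU as well as memory. The following is a simplified sketch of the Kubernetes classification rules, not the kubelet's implementation: it ignores init containers and pod-level resources, and it compares quantities as plain strings. The function name `qos_class` and the sample values are illustrative assumptions.

```python
from typing import Dict, List

Container = Dict[str, Dict[str, str]]  # {"requests": {...}, "limits": {...}}


def qos_class(containers: List[Container]) -> str:
    """Classify a pod as BestEffort, Burstable, or Guaranteed (simplified)."""
    # BestEffort: no container sets any request or limit.
    if all(not c.get("requests") and not c.get("limits") for c in containers):
        return "BestEffort"
    for c in containers:
        limits = c.get("limits", {})
        requests = c.get("requests", {})
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # An unset request defaults to the limit.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                return "Burstable"
    # Every container has cpu and memory limits equal to its requests.
    return "Guaranteed"


# Memory limit equals memory request, but no CPU limit is set.
memory_only = [{"requests": {"cpu": "23m", "memory": "96Mi"},
                "limits": {"memory": "96Mi"}}]
# Same container with a CPU limit equal to the CPU request.
with_cpu_limit = [{"requests": {"cpu": "23m", "memory": "96Mi"},
                   "limits": {"cpu": "23m", "memory": "96Mi"}}]

print(qos_class(memory_only))
print(qos_class(with_cpu_limit))
```

In this sketch the memory-only case is classified as Burstable and the second case as Guaranteed, which is relevant because 2e2f75dd6b removed CPU limits.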
561a07399a FIX(cert-manager): merge duplicate webhook and cainjector sections
- Merge webhook.affinity into webhook section
- Merge cainjector.affinity into cainjector section
- Fix YAML structure to prevent configuration override
2026-01-09 21:43:36 +09:00
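
Duplicate sections like the ones merged in 561a07399a are easy to miss by eye. Depending on the parser, a repeated top-level key either raises an error or silently replaces the earlier section. Below is a minimal sketch of a check for repeated top-level keys; the function name and the sample values text are hypothetical, and the scan only looks at unindented `key:` lines rather than parsing YAML properly.

```python
import re
from collections import Counter
from typing import List


def duplicate_top_level_keys(yaml_text: str) -> List[str]:
    """Return top-level keys that appear more than once, in first-seen order."""
    keys = []
    for line in yaml_text.splitlines():
        # Only unindented "key:" lines count as top-level keys.
        match = re.match(r"^([A-Za-z_][\w.-]*):(\s|$)", line)
        if match:
            keys.append(match.group(1))
    counts = Counter(keys)
    return [key for key, n in counts.items() if n > 1]


# Illustrative values file, not the repository's actual content.
sample = """\
webhook:
  replicaCount: 2
cainjector:
  replicaCount: 2
webhook:
  affinity: {}
"""

print(duplicate_top_level_keys(sample))
```

For the sample above the function reports `webhook` as the only duplicated key.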
da93a2e346 FEAT(platform): enable HA with replica 2 and soft anti-affinity
- Add replicaCount: 2 to cert-manager components
- Add soft pod anti-affinity for node distribution
- Remove descheduler (moved to separate location)
2026-01-09 21:43:36 +09:00
2e2f75dd6b PERF(resources): remove CPU limits - keep memory limits only
- CPU throttling blocks app startup; it does not prevent crashes
- Memory OOM kills are the actual cause of cascading failures
- CPU requests still ensure fair scheduling
2026-01-07 23:48:39 +09:00
ce2ee8d39e REFACTOR(repo): restructure infra folder structure
- Remove argocd/, helm-values/, ingress/ subdirectories
- Move files to parent directory with standardized names
- Add namespace.yaml to all apps with Goldilocks labels
- Preserve vault/ subdirectories (falco, velero)
- Update main kustomization.yaml to reference argocd.yaml files directly
- Comment out argocd.yaml in each app's kustomization.yaml to prevent
  circular reference

Applications restructured:
- cert-manager (2 ArgoCD apps)
- external-secrets
- reloader
- vault (2 ArgoCD apps)
- velero (2 ArgoCD apps)
- falco
- cnpg
- haproxy
- metallb
- vpa
- argocd
2025-12-29 02:21:00 +09:00