Commit Graph

5 Commits

Author SHA1 Message Date
100b7be198 REFACTOR(resources): remove cpu limits to prevent throttling
Removed CPU limits from all infrastructure components while keeping
memory limits for protection:

- cnpg: removed 500m CPU limit
- external-secrets: removed 200m, 100m CPU limits (operator, webhook,
  certController)
- falco: removed 500m CPU limit (falcosidekick webui)
- vault: removed 500m CPU limit
- velero: removed 500m, 1000m CPU limits (server, node-agent)

Benefits:
- Prevents CPU throttling from CFS quota enforcement
- Better performance and lower latency under bursty load
- More efficient use of idle CPU on each node
- Simpler CPU management (only requests to tune)

Memory limits are kept so that a leaking container is OOM-killed before it can exhaust node memory.
2026-01-04 23:41:39 +09:00
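The change in 100b7be198 amounts to dropping `limits.cpu` from each component's resources while leaving requests and `limits.memory` untouched. A minimal Python sketch of that transformation follows. The helper `strip_cpu_limits` and all example values except the 500m CPU limit are hypothetical and are not taken from the repository's Helm values.

```python
import copy


def strip_cpu_limits(resources: dict) -> dict:
    """Return a copy of a Kubernetes-style resources dict without limits.cpu.

    Requests and the memory limit are preserved; an empty limits map is dropped.
    """
    result = copy.deepcopy(resources)
    limits = result.get("limits", {})
    limits.pop("cpu", None)
    if not limits:
        result.pop("limits", None)
    return result


# Hypothetical values for illustration; only the 500m CPU limit
# is mentioned in the commit message.
vault_before = {
    "requests": {"cpu": "100m", "memory": "256Mi"},
    "limits": {"cpu": "500m", "memory": "512Mi"},
}

vault_after = strip_cpu_limits(vault_before)
print(vault_after)
```

With only a CPU request set, the container keeps its guaranteed scheduling share but can use idle CPU on the node instead of being capped at the former limit.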
b7d3c5bab1 PERF(falco): optimize falco cpu request for worker-node-2
Reduced Falco DaemonSet CPU request to prevent node-agent
scheduling failures:
- Falco: 50m → 40m (actual usage ~39m)

This optimization frees up 10m CPU per node. On worker-node-2,
this contributes to the total 110m CPU savings needed for
Velero node-agent (30m request) to be scheduled successfully.

Worker-node-2 CPU allocation before: 840m/1000m (84%)
Worker-node-2 CPU allocation after: 730m/1000m (73%)
2026-01-04 23:41:39 +09:00
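The allocation figures in b7d3c5bab1 can be checked with a few lines of arithmetic. This Python sketch uses only the numbers quoted in the commit message; the variable names are illustrative, and the 1000m figure is assumed to be the node's allocatable CPU as the message implies.

```python
def allocation_percent(requested_m: int, allocatable_m: int) -> int:
    """Share of allocatable CPU (in millicores) already requested, as a whole percent."""
    return round(100 * requested_m / allocatable_m)


ALLOCATABLE_M = 1000          # worker-node-2, per the commit message
before_m, after_m = 840, 730  # total CPU requests before and after

total_savings_m = before_m - after_m                  # savings across all changes
falco_savings_m = 50 - 40                             # Falco request: 50m -> 40m
other_savings_m = total_savings_m - falco_savings_m   # from changes outside this commit

print(allocation_percent(before_m, ALLOCATABLE_M))  # 84
print(allocation_percent(after_m, ALLOCATABLE_M))   # 73
print(total_savings_m, falco_savings_m, other_savings_m)  # 110 10 100
```

The Falco change accounts for 10m of the 110m total, so the remaining 100m has to come from other request reductions on the same node.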
9bdb035d93 PERF(falco): reduce Falco resource requests
- Reduce CPU/memory requests for worker-node-2 optimization
- Free up resources
2026-01-04 23:41:39 +09:00
18dac6b77f FIX(falco): change Falco driver to modern_ebpf
- Use modern_ebpf driver for kernel 6.14 compatibility
- Fix kernel module issues
2026-01-04 23:41:39 +09:00
10308d48d0 FEAT(velero): Add Velero, Falco, and CNPG infrastructure components
Add three critical infrastructure components via GitOps:

- Velero: Backup and disaster recovery solution
  - Configured with MinIO S3 backend
  - Daily full cluster backups (30-day retention)
  - Hourly backups for critical namespaces (7-day retention)
  - Credentials managed via External Secrets from Vault

- Falco: Runtime security monitoring
  - eBPF-based threat detection
  - Custom rules for container security
  - Falcosidekick for alert forwarding
  - Prometheus metrics enabled

- CNPG (CloudNativePG): PostgreSQL operator
  - Kubernetes-native PostgreSQL management
  - Automated failover and backups
  - Will replace Bitnami PostgreSQL

All components follow existing GitOps patterns:
- Helm charts deployed via ArgoCD
- Values managed in Git
- Automated sync with selfHeal enabled
2026-01-04 23:41:39 +09:00
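The retention periods listed for Velero in 10308d48d0 are given in days, while a Velero backup's `ttl` field takes a Go-style duration string. A small Python sketch of the conversion is shown below. The schedule names are hypothetical, and the commit message does not show the actual values file, so this is only an illustration of the arithmetic.

```python
def retention_to_ttl(days: int) -> str:
    """Convert a retention period in days to a Go-style duration string in hours."""
    return f"{days * 24}h0m0s"


# Hypothetical schedule names; retention periods come from the commit message.
retention_days = {
    "daily-full": 30,       # daily full cluster backups
    "hourly-critical": 7,   # hourly backups for critical namespaces
}

for name, days in retention_days.items():
    print(name, retention_to_ttl(days))
```

Under these assumptions, the 30-day retention corresponds to a TTL of 720 hours and the 7-day retention to 168 hours.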