Commit Graph

38 Commits

Author SHA1 Message Date
26e40d234a CHORE(falco): disable sidekick web ui
- to save 535 MB of Redis memory
2026-01-04 23:41:39 +09:00
368f7b5f5a PERF(falco): reduce falcosidekick replicas to 1
- Scale down to single replica
- Reduce resource usage
2026-01-04 23:41:39 +09:00
d392bbc57a REFACTOR(argocd): remove serversideapply
- from argocd applications
- Fixes OutOfSync issues caused by operator-added default values
- ServerSideApply causes stricter field management that conflicts with
  CRD defaults
2026-01-04 23:41:39 +09:00
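For reference, a minimal sketch of what the change in d392bbc57a might look like in an Argo CD Application. The application name and the other sync settings are assumptions; only the removal of the `ServerSideApply=true` sync option comes from the commit message.

```yaml
# Hypothetical fragment of an Argo CD Application (source/destination omitted).
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: example-app          # hypothetical name
  namespace: argocd
spec:
  syncPolicy:
    automated:
      prune: true            # assumed, not stated in the commit
      selfHeal: true         # assumed, not stated in the commit
    syncOptions:
      - CreateNamespace=true # assumed
      # - ServerSideApply=true   # removed: conflicted with operator/CRD defaults
```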
f38cbedcba REFACTOR(traefik): switch from HAProxy
- to Traefik ingress controller
- Update all ingress files to use ingressClassName: traefik
- Update cert-manager ClusterIssuer to use traefik class
- Remove haproxy.org annotations from ingress files
- Update vault helm-values to use traefik
2026-01-04 23:41:39 +09:00
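A sketch of an Ingress after the switch described in f38cbedcba, assuming a standard `networking.k8s.io/v1` Ingress. The resource name, namespace, host, and backend service are hypothetical; only `ingressClassName: traefik` and the removal of `haproxy.org` annotations come from the commit message.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ui           # hypothetical
  namespace: example         # hypothetical
  # haproxy.org/* annotations are no longer present
spec:
  ingressClassName: traefik  # previously the HAProxy ingress class
  rules:
    - host: example.invalid  # hypothetical host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-ui   # hypothetical service
                port:
                  number: 80
```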
64aeb36e78 CHORE(external-secrets): update ESO API version from v1beta1 to v1
- Update ExternalSecret API version
- Migrate to stable API
2026-01-04 23:41:39 +09:00
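As an illustration of 64aeb36e78, an ExternalSecret on the stable API might look roughly like this. The store name, secret names, and Vault key are hypothetical; the commit only states the `v1beta1` to `v1` change.

```yaml
apiVersion: external-secrets.io/v1   # was external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: example-credentials          # hypothetical
  namespace: example                 # hypothetical
spec:
  refreshInterval: 1h                # assumed value
  secretStoreRef:
    kind: ClusterSecretStore
    name: vault-backend              # hypothetical store name
  target:
    name: example-credentials        # Kubernetes Secret to create
  data:
    - secretKey: password
      remoteRef:
        key: example/path            # hypothetical Vault path
        property: password
```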
a2682e292b REFACTOR(goldilocks): use managedNamespaceMetadata for namespace labels
- Remove namespace.yaml files
- Add managedNamespaceMetadata with Goldilocks label
- Set CreateNamespace=true in syncOptions
- Update kustomization.yaml to remove namespace.yaml references
2026-01-04 23:41:39 +09:00
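The approach in a2682e292b can be sketched as follows. The label key shown is the one Goldilocks conventionally uses to opt a namespace in; the commit message does not spell it out, so treat it as an assumption.

```yaml
# Hypothetical fragment of an Argo CD Application spec.
spec:
  syncPolicy:
    managedNamespaceMetadata:
      labels:
        goldilocks.fairwinds.com/enabled: "true"  # assumed label key/value
    syncOptions:
      - CreateNamespace=true   # needed so Argo CD creates and labels the namespace
```

With this in place, a separate namespace.yaml per application is no longer needed, which matches the removal listed in the commit.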
7653a33ffa CHORE(repo): clean kustomization files
- Remove unused entries from kustomization
- Clean up configuration
2026-01-04 23:41:39 +09:00
34a1c9f783 REFACTOR(repo): restructure infra folder structure
- Remove argocd/, helm-values/, ingress/ subdirectories
- Move files to parent directory with standardized names
- Add namespace.yaml to all apps with Goldilocks labels
- Preserve vault/ subdirectories (falco, velero)
- Update main kustomization.yaml to reference argocd.yaml files directly
- Comment out argocd.yaml in each app's kustomization.yaml to prevent
  circular reference

Applications restructured:
- cert-manager (2 ArgoCD apps)
- external-secrets
- reloader
- vault (2 ArgoCD apps)
- velero (2 ArgoCD apps)
- falco
- cnpg
- haproxy
- metallb
- vpa
- argocd
2026-01-04 23:41:39 +09:00
cedb4ec0d4 FIX(falco): falco sync loop by updating ignoreDiff
- Remove optional operator (?) from jqPathExpressions
- Add apiVersion and kind to ignored fields for volumeClaimTemplates
- Prevents continuous sync loop caused by Kubernetes removing these
  fields from StatefulSet
2026-01-04 23:41:39 +09:00
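A rough sketch of the kind of `ignoreDifferences` entry cedb4ec0d4 describes. The exact expressions in the repository are not shown in the log, so the paths below are assumptions that follow the commit's description (no `?` operator, `apiVersion` and `kind` ignored under `volumeClaimTemplates`).

```yaml
# Hypothetical fragment of the Falco Argo CD Application spec.
spec:
  ignoreDifferences:
    - group: apps
      kind: StatefulSet
      jqPathExpressions:
        # Fields Kubernetes drops from the live object, which otherwise
        # show up as a permanent diff and trigger a sync loop.
        - .spec.volumeClaimTemplates[].apiVersion
        - .spec.volumeClaimTemplates[].kind
```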
5c918b64fc REFACTOR(falco): use falco-ui-secret
- for sidekick webui authenti...
2026-01-04 23:41:39 +09:00
4e0d10e581 FIX(falco): falco UI auth: use
- FALCOSIDEKICK_UI_USER format
2026-01-04 23:41:39 +09:00
90c7883c37 FEAT(velero): add velero and falco UI auth
- secrets from Vault
2026-01-04 23:41:39 +09:00
50ceb6d98d FIX(argocd): falco cpu requests in argocd
- application
- Falco: 100m → 30m
- Falcosidekick Web UI: 50m → 30m

The previous commit only updated helm-values/falco.yaml which wasn't
being used. The ArgoCD Application uses inline helm values.
2026-01-04 23:41:39 +09:00
4d2a0f5169 PERF(cnpg): reduce cpu requests
- to allow cnpg join pod scheduling
- Falco: 40m → 30m
- Falcosidekick Web UI: 50m → 30m
- Velero UI: 50m → 30m

This frees up ~40m CPU on worker nodes to allow CNPG join pods
(which request 500m) to be scheduled successfully.
2026-01-04 23:41:39 +09:00
27d26cdfb3 CHORE(falco): ignore volumeClaimTemplates status
- in falco StatefulSet
2026-01-04 23:41:39 +09:00
8a398a3bdc REFACTOR(falco): use cpu: null
- to delete Helm chart default CPU limit...
Following Helm best practice to override default values with null.
2026-01-04 23:41:39 +09:00
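A minimal sketch of the `cpu: null` override from 8a398a3bdc. In Helm, setting a key to `null` in user-supplied values deletes the chart's default for that key. The key path and the request and memory figures are assumptions for illustration.

```yaml
# Hypothetical Helm values override for the Falco chart.
resources:
  requests:
    cpu: 30m        # assumed value
  limits:
    cpu: null       # deletes the chart's default CPU limit
    memory: 512Mi   # assumed value; memory limit is kept
```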
6e1304f703 FIX(falco): re-enable auto-sync
- for falco and use Helm chart defaults
Let Helm chart apply default CPU limits like other apps.
2026-01-04 23:41:39 +09:00
d6b9fe6a01 CHORE(falco): disable auto-sync for falco
- to allow manual CPU limit r...
Will manually patch DaemonSet to remove CPU limits after this is
applied.
2026-01-04 23:41:39 +09:00
85ef6e8c9f CHORE(falco): set Falco CPU limit to empty string
- Override Helm default CPU limit
- Prevent throttling
2026-01-04 23:41:39 +09:00
10211f35bc REFACTOR(falco): remove invalid empty string CPU
- limit from falco
Kubernetes rejects cpu: "" as invalid quantity format. Will allow
DaemonSet to be created with default CPU limit, then manually patch
and disable auto-sync.
2026-01-04 23:41:39 +09:00
fa98684528 CHORE(falco): set Falco CPU limit to empty string
- Override Helm default CPU limit
- Prevent throttling
2026-01-04 23:41:39 +09:00
1408000e4c REFACTOR(falco): remove cpu limits entirely
- from vault and falco
- Remove cpu line from limits section (not just set to null)
- Prevents Helm charts from applying default CPU limit values
- Eliminates CPU throttling for infrastructure components
2026-01-04 23:41:39 +09:00
9d8a0554c8 FIX(falco): set Falco CPU limit to null
- Explicitly set CPU limit to null
- Prevent throttling
2026-01-04 23:41:39 +09:00
5c0e5364b9 REFACTOR(resources): remove cpu limits
- from infrastructure components
- velero-ui: Remove 200m CPU limit
- metallb controller: Remove 100m CPU limit
- metallb speaker: Remove 100m CPU limit (300m total across 3 nodes)
- falco: Remove 1000m CPU limit (3000m total across 3 nodes)

Total CPU limits removed: ~3600m

This eliminates CPU throttling and reduces CPU limits overcommit from
131% to 0%.
2026-01-04 23:41:39 +09:00
100b7be198 REFACTOR(resources): remove cpu limits
- to prevent throttling
Removed CPU limits from all infrastructure components while keeping
memory limits for protection:

- cnpg: removed 500m CPU limit
- external-secrets: removed 200m, 100m CPU limits (operator, webhook,
  certController)
- falco: removed 500m CPU limit (falcosidekick webui)
- vault: removed 500m CPU limit
- velero: removed 500m, 1000m CPU limits (server, node-agent)

Benefits:
- Prevents CPU throttling
- Better performance and lower latency
- More efficient resource utilization
- Simpler management (only requests to tune)

Memory limits are kept to contain memory leaks and protect nodes from OOM issues.
2026-01-04 23:41:39 +09:00
b7d3c5bab1 PERF(falco): optimize falco cpu request
- for worker-node-2
Reduced Falco DaemonSet CPU request to prevent node-agent
scheduling failures:
- Falco: 50m → 40m (actual usage ~39m)

This optimization frees up 10m CPU per node. On worker-node-2,
this contributes to the total 110m CPU savings needed for
Velero node-agent (30m request) to be scheduled successfully.

Worker-node-2 CPU allocation before: 840m/1000m (84%)
Worker-node-2 CPU allocation after: 730m/1000m (73%)
2026-01-04 23:41:39 +09:00
9bdb035d93 PERF(falco): reduce Falco resource requests
- Reduce CPU/memory requests for worker-node-2 optimization
- Free up resources
2026-01-04 23:41:39 +09:00
7eba5c06c4 FEAT(velero): activate HTTPS in Falco and update Velero version
- Enable HTTPS for Falco UI
- Update Velero chart version
2026-01-04 23:41:39 +09:00
25d0bc2c55 FIX(falco): disable SSL redirect for Falco UI
- Disable SSL redirect on ingress
- Fix routing configuration
2026-01-04 23:41:39 +09:00
fe6cc1f9e7 REFACTOR(falco): move Falco Ingress to falco folder
- Move ingress configuration to falco directory
- Disable SSL redirect
2026-01-04 23:41:39 +09:00
81e000260f FEAT(falco): add Falco UI Ingress via infrastructure
- Add ingress for Falco UI
- Configure routing
2026-01-04 23:41:39 +09:00
c5c5a7e469 FEAT(falco): add HAProxy Ingress for Falco UI
- Add HAProxy ingress at falco0213.kro.kr
- Configure SSL/TLS
2026-01-04 23:41:39 +09:00
ac5fde6ba4 FIX(repo): simplify ignoreDiff for all StatefulSets
- Simplify ignoreDifferences configuration
- Reduce complexity
2026-01-04 23:41:39 +09:00
70eb551871 FIX(falco): disable selfHeal for Falco
- Prevent StatefulSet drift issues
- Disable automatic healing
2026-01-04 23:41:39 +09:00
3f18a3cdf8 FEAT(repo): enhance syncPolicy and ignoreDiff for StatefulSet
- Add enhanced sync policy
- Configure ignoreDifferences for StatefulSet
2026-01-04 23:41:39 +09:00
c2b9175b8b FIX(storage): improve ignoreDiff for StatefulSet PVC retention
- Improve ignoreDifferences configuration
- Handle PVC retention policy
2026-01-04 23:41:39 +09:00
18dac6b77f FIX(falco): change Falco driver to modern_ebpf
- Use modern_ebpf driver for kernel 6.14 compatibility
- Fix kernel module issues
2026-01-04 23:41:39 +09:00
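The driver change in 18dac6b77f likely corresponds to a values override along these lines, assuming the upstream Falco Helm chart's `driver.kind` setting.

```yaml
# Hypothetical Helm values override for the Falco chart.
driver:
  kind: modern_ebpf   # avoids building a kernel module for the running kernel
```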
10308d48d0 FEAT(velero): Add Velero, Falco,
- and CNPG infrastructure components
Add three critical infrastructure components via GitOps:

- Velero: Backup and disaster recovery solution
  - Configured with Minio S3 backend
  - Daily full cluster backups (30-day retention)
  - Hourly backups for critical namespaces (7-day retention)
  - Credentials managed via External Secrets from Vault

- Falco: Runtime security monitoring
  - eBPF-based threat detection
  - Custom rules for container security
  - Falcosidekick for alert forwarding
  - Prometheus metrics enabled

- CNPG (CloudNativePG): PostgreSQL operator
  - Kubernetes-native PostgreSQL management
  - Automated failover and backups
  - Will replace Bitnami PostgreSQL

All components follow existing GitOps patterns:
- Helm charts deployed via ArgoCD
- Values managed in Git
- Automated sync with selfHeal enabled
2026-01-04 23:41:39 +09:00
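The two Velero backup schedules described in 10308d48d0 could be expressed in the Velero Helm chart's `schedules` values roughly as below. The schedule names, cron times, and namespace list are hypothetical; only the daily/hourly cadence and the 30-day and 7-day retention come from the commit message.

```yaml
# Hypothetical Helm values fragment for the Velero chart.
schedules:
  daily-full:                  # hypothetical name
    schedule: "0 2 * * *"      # once a day; exact time is assumed
    template:
      ttl: 720h0m0s            # 30-day retention
  hourly-critical:             # hypothetical name
    schedule: "0 * * * *"      # every hour
    template:
      ttl: 168h0m0s            # 7-day retention
      includedNamespaces:
        - example-critical     # hypothetical namespace
```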