Commit Graph

44 Commits

Author SHA1 Message Date
0f9f427e44 FEAT(minio): add minio storageclass
- and persistentvolumes for local d...
2026-01-05 00:39:12 +09:00
8ac235fb17 CHORE(cnpg): update CNPG chart version to 0.27.0
- Upgrade CloudNativePG chart
- Apply dependency updates
2026-01-05 00:39:12 +09:00
edb90dcb25 REFACTOR(longhorn): change Longhorn namespace
- Change from longhorn-system to longhorn
- Standardize namespace naming
2026-01-05 00:39:12 +09:00
9abcdfa98d REFACTOR(goldilocks): use managedNamespaceMetadata for namespace labels
- Remove namespace.yaml files
- Add managedNamespaceMetadata with Goldilocks label
- Set CreateNamespace=true in syncOptions
- Update kustomization.yaml to remove namespace.yaml references
2026-01-05 00:39:12 +09:00
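The change above can be sketched as an Argo CD Application fragment. This is a minimal illustration, not the repository's actual manifest: the application name, repoURL, path, and destination namespace are placeholders, and only the `managedNamespaceMetadata` label and the `CreateNamespace=true` sync option come from the commit message.

```yaml
# Hypothetical Application; name, repoURL, path and namespace are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: example-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/org/repo.git
    targetRevision: HEAD
    path: example-app
  destination:
    server: https://kubernetes.default.svc
    namespace: example-app
  syncPolicy:
    # Argo CD creates the namespace and applies these labels,
    # so no separate namespace.yaml is needed.
    managedNamespaceMetadata:
      labels:
        goldilocks.fairwinds.com/enabled: "true"
    syncOptions:
      - CreateNamespace=true
```

Note that `managedNamespaceMetadata` only takes effect when `CreateNamespace=true` is set, which is why the two changes appear together.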
66890d8f66 FEAT(velero): add kustomize source to velero for ingress deployment
2026-01-04 23:47:13 +09:00
fc8ddebf6e REFACTOR(longhorn): migrate longhorn from worker-1 to master
- Enable scheduling on mayne-vcn (master)
- Disable scheduling and request eviction on mayne-worker-1
- Longhorn will use only master and worker-2
2026-01-04 23:47:13 +09:00
6bc8359180 CHORE(repo): disable scheduling and request eviction on mayne-vcn
- Disable scheduling on master node
- Request eviction of existing workloads
2026-01-04 23:47:13 +09:00
50c3ad5e9e REFACTOR(minio): arrange folder structure for Longhorn and MinIO
- Reorganize folder structure
- Clean up configuration files
2026-01-04 23:47:13 +09:00
54f7f152da REFACTOR(cnpg): change CNPG namespace
- Update namespace configuration
- Standardize naming
2026-01-04 23:47:13 +09:00
b17eb6309a CHORE(repo): clean kustomization files
- Remove unused entries from kustomization
- Clean up configuration
2026-01-04 23:47:13 +09:00
6b4cd0dce8 REFACTOR(velero): simplify vault and velero configs
- vault: Fix CreateNamespace conflict (set to false)
- velero: Consolidate ExternalSecrets into vault/velero-secrets.yaml
- velero: Clean up kustomization.yaml
2026-01-04 23:47:13 +09:00
f7610c9a3e FEAT(cert-manager): integrate cert-manager, vault, velero
2026-01-04 23:47:13 +09:00
a39ec16b35 FIX(pgweb): pgweb namespace duplication
- Remove namespace definition from deployment.yaml
- Namespace now only defined in namespace.yaml
- Fixes ComparisonError: may not add resource with already registered id
2026-01-04 23:47:13 +09:00
ad12f641a2 FIX(argocd): helm valueFiles paths in ArgoCD Applications
- Update valueFiles paths from helm-values/<app>.yaml to helm-values.yaml
- Fixes ComparisonError after folder restructuring

Applications fixed:
- cert-manager
- cnpg
- external-secrets
- vault
- vpa
- velero
2026-01-04 23:47:13 +09:00
d9df80bca3 REFACTOR(postgresql): restructure pgweb
- and pg-dev folder str...
- Remove argocd/, helm-values/ subdirectories
- Move files to parent directory with standardized names
- Add namespace.yaml to both apps with Goldilocks labels
- Preserve vault/ subdirectories (pgweb: 3 files, postgresql-dev: 1
  file)
- Update main kustomization.yaml to reference argocd.yaml files directly
- Update postgresql-dev helm valueFiles path
- Comment out argocd.yaml in each app's kustomization.yaml to prevent
  circular reference

Applications restructured:
- pgweb
- postgresql-dev
2026-01-04 23:47:13 +09:00
55380edbd4 REFACTOR(repo): restructure infra folder structure
- Remove argocd/, helm-values/, ingress/ subdirectories
- Move files to parent directory with standardized names
- Add namespace.yaml to all apps with Goldilocks labels
- Preserve vault/ subdirectories (falco, velero)
- Update main kustomization.yaml to reference argocd.yaml files directly
- Comment out argocd.yaml in each app's kustomization.yaml to prevent
  circular reference

Applications restructured:
- cert-manager (2 ArgoCD apps)
- external-secrets
- reloader
- vault (2 ArgoCD apps)
- velero (2 ArgoCD apps)
- falco
- cnpg
- haproxy
- metallb
- vpa
- argocd
2026-01-04 23:47:13 +09:00
abbc4304fc FIX(longhorn): longhorn crd sync loop by ignoring preserveUnknownFields
- Add .spec.preserveUnknownFields to ignoreDifferences for all Longhorn
  CRDs
- Prevents OutOfSync status caused by Kubernetes auto-adding this field
- Affects: engines, engineimages, instancemanagers, nodes, replicas,
  settings, volumes
2026-01-04 23:47:13 +09:00
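A sketch of what such an `ignoreDifferences` entry might look like in the Longhorn Argo CD Application. Only two of the listed CRDs are shown; the remaining ones would follow the same pattern. The exact layout in the repository is an assumption.

```yaml
# Fragment of an Argo CD Application spec (illustrative, not the repo's file).
spec:
  ignoreDifferences:
    - group: apiextensions.k8s.io
      kind: CustomResourceDefinition
      name: engines.longhorn.io
      jsonPointers:
        - /spec/preserveUnknownFields
    - group: apiextensions.k8s.io
      kind: CustomResourceDefinition
      name: volumes.longhorn.io
      jsonPointers:
        - /spec/preserveUnknownFields
```

Argo CD then skips this field when comparing live and desired state, so a value added by the API server no longer marks the CRD as OutOfSync.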
cfb6e9db5b CHORE(velero): clean up velero configuration
Remove unused repository maintenance job configuration.
2026-01-04 23:47:13 +09:00
9276070f86 CHORE(grafana): disable cnpg grafana dashboard auto-creation
Dashboard JSON exceeds Kubernetes ConfigMap annotation size limit.
Dashboard can be manually imported from Grafana.com (ID: 20417).
2026-01-04 23:47:13 +09:00
4c408901e6 FEAT(grafana): enable grafana dashboard auto-creation for cnpg operator
Add monitoring.grafanaDashboard.create=true to automatically deploy
the official CNPG Grafana dashboard as a ConfigMap that Grafana can
discover and import.
2026-01-04 23:47:13 +09:00
a76543660b FEAT(repo): add repositoryMaintenanceJob
- auto-cleanup: keep only late...
2026-01-04 23:47:13 +09:00
3b2768c9f0 FIX(velero): velero-ui auth: use explicit env
- instead of en...
2026-01-04 23:47:13 +09:00
044cae85e3 FEAT(velero): add velero and falco UI auth secrets from Vault
2026-01-04 23:47:13 +09:00
cd9e2822f4 FIX(velero): velero-s3-credentials ExternalSecret to use databases/minio
2026-01-04 23:47:13 +09:00
de0a0f6629 REFACTOR(postgresql): remove bitnami pg after
- successful migr...
All applications (gitea, jaejadle, todo, mas, umami) have been successfully
migrated to CloudNativePG. All databases verified working on the CNPG cluster.
2026-01-04 23:47:13 +09:00
73e760b609 PERF(longhorn): reduce longhorn replica count from 3 to 2
Due to storage capacity constraints with 50GB disks per node, reducing
replica count to 2 to fit all volumes within available capacity.
2026-01-04 23:47:13 +09:00
4532c1b11b CHORE(longhorn): update ArgoCD app
- to include Longhorn nodes via kust...
- Changed source from ingress-only to full longhorn/ directory
- Use kustomize to manage ingress + nodes together
- Enables GitOps management of Longhorn Node disk configs
2026-01-04 23:47:13 +09:00
97e77078e9 REFACTOR(longhorn): migrate longhorn to dedicated 50GB disks
- Update defaultDataPath to /mnt/longhorn-storage
- Add Node CRs for all nodes with new disk configuration
- Evict data from old /var/lib/longhorn disks to new disks
- Nodes: mayne-vcn, mayne-worker-1, mayne-worker-2
2026-01-04 23:47:13 +09:00
628d168e96 PERF(cnpg): reduce cpu requests to allow cnpg join pod scheduling
- Falco: 40m → 30m
- Falcosidekick Web UI: 50m → 30m
- Velero UI: 50m → 30m

This frees up ~40m CPU on worker nodes to allow CNPG join pods
(which request 500m) to be scheduled successfully.
2026-01-04 23:47:13 +09:00
8dd636847e FEAT(longhorn): add longhorn distributed block storage
- Add Longhorn Helm chart configuration
- Configure UI ingress at longhorn0213.kro.kr
- Set CPU limits to null to prevent throttling
- Configure 3 replicas for high availability
- Set Longhorn as default StorageClass
2026-01-04 23:47:13 +09:00
a15cb1510f PERF(grafana): optimize cpu requests based on
- actual usage from grafa...
- external-secrets: 20m → 5m (actual: 1m)
- external-secrets-webhook: 10m → 2m (actual: 1m)
- external-secrets-cert: 10m → 2m (actual: 1m)
- cnpg: 100m → 5m (actual: 2m)
- haproxy-ingress: 100m → 15m (actual: 9-10m)
2026-01-04 23:47:13 +09:00
b59c5618ea REFACTOR(resources): remove cpu limits to prevent throttling
Removed CPU limits from all infrastructure components while keeping
memory limits for protection:

- cnpg: removed 500m CPU limit
- external-secrets: removed 200m, 100m CPU limits (operator, webhook,
  certController)
- falco: removed 500m CPU limit (falcosidekick webui)
- vault: removed 500m CPU limit
- velero: removed 500m, 1000m CPU limits (server, node-agent)

Benefits:
- Prevents CPU throttling
- Better performance and lower latency
- More efficient resource utilization
- Simpler management (only requests to tune)

Memory limits are kept to contain memory leaks and protect nodes from OOM conditions.
2026-01-04 23:47:13 +09:00
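The resulting pattern, a CPU request with no CPU limit plus a retained memory limit, can be sketched as a container `resources` fragment. The numbers below are placeholders, not values from the repository.

```yaml
# Illustrative resources block; all quantities are hypothetical.
resources:
  requests:
    cpu: 50m        # used by the scheduler for placement
    memory: 128Mi
  limits:
    memory: 256Mi   # memory limit kept; no cpu limit, so no CFS throttling
```

When a Helm chart ships a default CPU limit, the limit typically has to be overridden explicitly in the values file (for example by setting it to `null`) rather than just omitted.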
ecb04fc14a FEAT(velero): configure minio for selective velero backup
Added pod annotation to exclude PVC data from Velero backups while
preserving MinIO resource definitions:
- backup.velero.io/backup-volumes-excludes: export

This prevents circular backup of the velero-backups bucket while
still backing up MinIO StatefulSet, Services, and configuration.

Note: MinIO bucket data (bucket, bucket-dev, velero-backups) will
NOT be backed up. Consider separate backup strategy for critical
bucket data if needed.
2026-01-04 23:47:13 +09:00
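As a rough sketch, the annotation sits on the pod template of the MinIO workload. The value `export` is taken from the commit message and is assumed here to be the name of the data volume in the pod spec; the surrounding StatefulSet fields are omitted.

```yaml
# Fragment of a StatefulSet pod template (illustrative only).
spec:
  template:
    metadata:
      annotations:
        # Comma-separated volume names that Velero file-system backup skips.
        backup.velero.io/backup-volumes-excludes: export
```

With this in place, Velero still records the Kubernetes objects in the namespace but does not copy the contents of the excluded volume.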
656d3fa5a3 PERF(velero): optimize velero node-agent
- resources and prevent circul...
- Reduce node-agent CPU request from 100m to 50m
  - Fixes scheduling issue on mayne-worker-2 (was at 99% CPU)
  - Enables node-agent to run on all 3 nodes for complete backup coverage
- Exclude minio namespace from backups
  - Prevents circular backup (backing up the backup storage)
  - Minio config is in Git and can be recreated
  - Saves significant storage space
2026-01-04 23:47:13 +09:00
b0cd9274b1 FEAT(velero): configure velero for full k3s cluster backup
- Enable node-agent for PV file-system backups
- Add defaultVolumesToFsBackup configuration
- Optimize backup schedule (daily, 7-day retention)
- Exclude non-essential namespaces (postgresql-dev, harbor)
- Update Velero to v1.17.1
- Update velero-plugin-for-aws to v1.13.1

Full cluster disaster recovery backup now active.
2026-01-04 23:47:13 +09:00
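One way to express the schedule described above is a Velero `Schedule` resource. This is a sketch under assumptions: the resource name and cron expression are hypothetical, and the repository may configure the schedule through Helm values instead. The 7-day retention, file-system backup default, and excluded namespaces come from the commit message.

```yaml
# Hypothetical Schedule; name and cron time are placeholders.
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: daily-full
  namespace: velero
spec:
  schedule: "0 3 * * *"          # once a day
  template:
    ttl: 168h0m0s                # 7-day retention
    defaultVolumesToFsBackup: true
    excludedNamespaces:
      - postgresql-dev
      - harbor
```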
4ef5497fd5 FEAT(velero): activate https in falco, update velero version
2026-01-04 23:47:13 +09:00
f1b99f0bdf FEAT(traefik): add per-application ingress management
- Added ingress files for MinIO (API and Console) and pgweb
- Updated kustomization files to include ingress resources
- Migrated from centralized ingress management to per-app architecture
2026-01-04 23:47:13 +09:00
3767a6edea CHORE(traefik): split centralized ingress
- management to per-applicati...
- Moved ArgoCD ingress to argocd/ingress/
- Moved Velero ingress to velero/ingress/
- Removed centralized ingress/ingresses.yaml (single point of failure)
- Updated root kustomization.yaml to reference argocd and velero
  directories
- Each application now manages its own ingress independently
2026-01-04 23:47:13 +09:00
311e8a1cc1 FEAT(velero): Add Velero UI with HAProxy Ingress at velero0213.kro.kr
2026-01-04 23:47:13 +09:00
3366a6b5b8 FEAT(velero): Add Velero, Falco, and CNPG infrastructure components
Add three critical infrastructure components via GitOps:

- Velero: Backup and disaster recovery solution
  - Configured with MinIO S3 backend
  - Daily full cluster backups (30-day retention)
  - Hourly backups for critical namespaces (7-day retention)
  - Credentials managed via External Secrets from Vault

- Falco: Runtime security monitoring
  - eBPF-based threat detection
  - Custom rules for container security
  - Falcosidekick for alert forwarding
  - Prometheus metrics enabled

- CNPG (CloudNativePG): PostgreSQL operator
  - Kubernetes-native PostgreSQL management
  - Automated failover and backups
  - Will replace Bitnami PostgreSQL

All components follow existing GitOps patterns:
- Helm charts deployed via ArgoCD
- Values managed in Git
- Automated sync with selfHeal enabled
2026-01-04 23:47:13 +09:00
b6802a45e6 REFACTOR(vault): update Vault secret paths
- Update secret paths for databases/*
- Reorganize secret structure
2025-12-17 21:32:31 +09:00
26378b9143 FEAT(minio): add minio and pgweb, move from applications to databases
2025-12-17 15:17:45 +09:00
a096efe80d CHORE(argocd): update ArgoCD applications to point to databases repo
- Update repoURL to databases repo
- Change source repository reference
2025-12-17 15:13:05 +09:00
27838e5bad INIT(postgresql): databases setup with pg and pg-dev
2025-12-17 15:09:48 +09:00