e7f97888cc
REFACTOR(cert-manager): move to security repo
- Remove cert-manager folder
- Update kustomization references
2026-01-10 19:58:03 +09:00
ad591293f1
CHORE(traefik): disable dashboard
- Remove dashboard and api.dashboard settings
- Remove --api.insecure argument
- Keep core settings (DaemonSet, metrics, crossNamespace)
2026-01-10 19:52:46 +09:00
b650c0af56
REFACTOR(argocd): merge priority-classes into argocd
- Move priority-classes to argocd/manifests
- Remove separate priority-classes Application
- Simplify platform folder structure
2026-01-10 19:47:30 +09:00
81c42f67e9
REFACTOR(argocd): merge image-updater into argocd
- Move image-updater Application to argocd folder
- Move helm-values and secrets to argocd
- Remove separate argocd-image-updater folder
- Update kustomization references
2026-01-10 19:44:02 +09:00
121d5eb198
REFACTOR(gitea): move from applications repo
- Add gitea Application manifests
- Update repoURL to reference platform repo
- Include helm-values, kustomization, redirect configs
2026-01-10 19:38:35 +09:00
c31046a322
REFACTOR(traefik): remove control-plane scheduling
- Remove tolerations for control-plane taint
- Remove svclb tolerations annotation
- Allow pods to schedule on any available node
2026-01-10 18:35:15 +09:00
737873066d
feat: increase argocd application-controller CPU request to 250m
2026-01-10 18:02:48 +09:00
c38b944a96
REVERT(argocd): restore original resource values
- Keep argocd controller at 126m/1700Mi
- Keep argocd server at 15m/138Mi
- Keep argocd repo-server at 15m/1536Mi
2026-01-10 14:44:44 +09:00
d079b8582a
PERF(platform): use 20% memory increase instead of VPA
- Update argocd controller memory 1700Mi→2040Mi (+20%)
- Update argocd server memory 138Mi→166Mi (+20%)
- Update argocd repo-server memory 1536Mi→1843Mi (+20%)
- Update cert-manager memory 96Mi→115Mi (+20%)
- Update cert-manager webhook memory 96Mi→115Mi (+20%)
- Update cert-manager cainjector memory 192Mi→230Mi (+20%)
2026-01-10 14:37:21 +09:00
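The +20% figures in the commit above can be reproduced with a few lines of arithmetic. This is only a minimal sketch: the helper name `scale_mi` is hypothetical, and rounding to the nearest Mi is an assumption that happens to match the values listed in the commit message.

```python
# Hypothetical helper: scale a memory value given in Mi by a percentage.
def scale_mi(value_mi: int, percent: int = 20) -> int:
    """Return value_mi increased by `percent`, rounded to the nearest Mi."""
    # Integer arithmetic avoids floating-point surprises; +50 rounds half up.
    return (value_mi * (100 + percent) + 50) // 100


# Baseline values (Mi) copied from the commit message above.
baseline = {
    "argocd controller": 1700,
    "argocd server": 138,
    "argocd repo-server": 1536,
    "cert-manager": 96,
    "cert-manager webhook": 96,
    "cert-manager cainjector": 192,
}

scaled = {name: scale_mi(mi) for name, mi in baseline.items()}
for name, mi in scaled.items():
    print(f"{name}: {baseline[name]}Mi -> {mi}Mi")
```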
26ca07623e
PERF(platform): adjust resources based on VPA
- Update argocd controller cpu 126m→350m, memory 1700Mi→640Mi
- Update argocd server memory 138Mi→121Mi
- Update argocd repo-server cpu 15m→49m, memory 1536Mi→933Mi
- Update argocd-image-updater cpu 10m→15m, memory 64Mi→100Mi
- Update cert-manager cpu 23m→15m, memory 96Mi→100Mi
- Update cert-manager webhook cpu 23m→15m, memory 96Mi→100Mi
- Update cert-manager cainjector cpu 23m→15m, memory 192Mi→237Mi
2026-01-10 14:31:28 +09:00
57ef8ebca1
PERF(cert-manager): reduce replicas to 1
- Reduce cert-manager replicas to 1
- Reduce cainjector replicas to 1
- Reduce webhook replicas to 1
2026-01-10 13:31:46 +09:00
187d6aa668
PERF(argocd): increase repo-server memory
- Increase memory from 960Mi to 1536Mi
- Prevent OOM during manifest generation
2026-01-10 13:26:40 +09:00
f867b281ff
FIX(priority-classes): correct repoURL
- Change repoURL from Mayne0213 to K3S-HOME
2026-01-10 13:18:46 +09:00
52c66f51ae
PERF(argocd): move to workers, add high priority
- Remove nodeSelector forcing control-plane placement
- Remove tolerations from ArgoCD and image-updater
- Add high-priority PriorityClass
2026-01-10 13:14:07 +09:00
c9eb7e69f6
PERF(repo): add PriorityClasses for workloads
- Create high-priority (1000) for critical infra
- Create medium-priority (500) for observability
- Create low-priority (100) as global default
2026-01-10 13:13:01 +09:00
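For illustration, the three tiers described in the commit above could be generated as PriorityClass manifests like this. It is a sketch, not the repository's actual files: the `priority_class` helper and the description strings are assumptions, while the names, values, and the global-default choice come from the commit message. The output is JSON, which `kubectl apply -f` accepts alongside YAML.

```python
import json

# (name, value, globalDefault, description) taken from the commit message.
TIERS = [
    ("high-priority", 1000, False, "critical infra"),
    ("medium-priority", 500, False, "observability"),
    ("low-priority", 100, True, "global default"),
]


def priority_class(name, value, global_default, description):
    """Build a scheduling.k8s.io/v1 PriorityClass manifest as a dict."""
    return {
        "apiVersion": "scheduling.k8s.io/v1",
        "kind": "PriorityClass",
        "metadata": {"name": name},
        "value": value,
        "globalDefault": global_default,
        "description": description,
    }


manifests = [priority_class(*tier) for tier in TIERS]

# Kubernetes permits only one PriorityClass with globalDefault: true.
assert sum(m["globalDefault"] for m in manifests) == 1

for manifest in manifests:
    print(json.dumps(manifest, indent=2))
```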
00cdc2efb1
REVERT(telepresence): remove Telepresence installation
- Delete telepresence folder and all configurations
- Remove from kustomization.yaml
- Decided to use local database instead
2026-01-10 03:53:54 +09:00
98d340f7eb
CHORE(telepresence): upgrade to OSS chart v2.25.2
- Switch from datawire commercial to telepresence-oss chart
- Use OCI registry ghcr.io/telepresenceio
- Update helm values for OSS chart compatibility
2026-01-10 03:48:23 +09:00
53b8494b6f
FIX(telepresence): set Helm release name to traffic-manager
- Telepresence chart requires release name to be traffic-manager
- Add releaseName field to helm configuration
2026-01-10 03:40:12 +09:00
03ca19b771
feat(argocd): enable ServiceMonitor for metrics collection
- Add serviceMonitor.enabled: true to controller, server, repoServer
- Allows Prometheus to scrape ArgoCD metrics
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 03:36:29 +09:00
af1691cedd
FEAT(telepresence): add Telepresence traffic manager
- Add ArgoCD Application for Helm chart deployment
- Configure resources with memory request equal to limits
- Enable agent injector with OnDemand policy
2026-01-10 03:29:12 +09:00
249e451990
FIX(cert-manager): enable ServiceMonitor for Prometheus
- Enable ServiceMonitor to allow Prometheus direct scraping
- Fix missing metrics in Grafana dashboard after OTel migration
- Cert-manager uses exported_namespace label which requires ServiceMonitor
2026-01-10 02:56:02 +09:00
f5ea1b9fc6
CHORE(cert-manager): increase cainjector memory
- Increase cainjector memory request and limit from 96Mi to 192Mi
- Maintain CPU request at 23m
2026-01-10 02:09:27 +09:00
a422382bc2
FIX(cert-manager): increase memory to prevent OOM
- Increase controller memory from 64Mi to 96Mi
- Increase webhook memory from 64Mi to 96Mi
- Increase cainjector memory from 64Mi to 96Mi
- Increase CPU requests from 15m to 23m (1.5x)
2026-01-10 01:17:36 +09:00
5ac46a4b91
FEAT(image-updater): add github-creds ExternalSecret
- Manage GitHub PAT via Vault
- Remove manual secret creation requirement
2026-01-10 01:17:36 +09:00
9f186d6fa2
CHORE(traefik): change deployment to DaemonSet for HA
- Change from Deployment with 3 replicas to DaemonSet
- Ensure Traefik runs on every node automatically
2026-01-10 01:17:36 +09:00
97fd010eb8
FIX(argocd): increase repo-server memory to 960Mi
- Repo-server was crashing under load with 640Mi limit
- Set both requests and limits to 960Mi
2026-01-10 01:17:36 +09:00
56af1a9a17
CHORE(resources): set memory limits equal to memory requests
- Align memory limits with memory requests (note: the Guaranteed QoS class also requires CPU limits equal to CPU requests)
- argocd: controller, server, repoServer, redis
- traefik: main container
- cert-manager: main, webhook, cainjector
- argocd-image-updater: main container
2026-01-10 01:17:35 +09:00
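As a rough check on the QoS class such a configuration ends up in, the Kubernetes classification rules can be sketched as below. The function name `qos_class` and the sample values are illustrative assumptions, and quantities are compared as plain strings, so equivalent spellings such as "1Gi" and "1024Mi" would not be treated as equal.

```python
def qos_class(containers):
    """Classify a pod per the Kubernetes QoS rules (simplified sketch).

    `containers` is a list of dicts shaped like a container's `resources`
    field: {"requests": {...}, "limits": {...}}.
    """
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        if requests or limits:
            any_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # When only a limit is set, the request defaults to the limit.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


# Illustrative values: memory limit equals request, no CPU limit.
memory_only = [{"requests": {"cpu": "15m", "memory": "960Mi"},
                "limits": {"memory": "960Mi"}}]
print(qos_class(memory_only))  # Burstable

# Adding a CPU limit equal to the CPU request yields Guaranteed.
both = [{"requests": {"cpu": "15m", "memory": "960Mi"},
         "limits": {"cpu": "15m", "memory": "960Mi"}}]
print(qos_class(both))  # Guaranteed
```

Under these rules, memory-only limits leave a pod in the Burstable class; the pod still gets a fixed memory ceiling, but not the eviction ordering of a Guaranteed pod.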
2d5abed20a
CHORE(repo): disable prune for App of Apps safety
- Set prune: false to prevent cascade deletion
- Ensure child apps persist if parent is removed
2026-01-09 21:43:56 +09:00
34277fb7e8
FEAT(argocd): enable metrics service endpoints
- Add controller metrics on port 8082
- Add server metrics on port 8083
- Add repoServer metrics on port 8084
2026-01-09 21:43:56 +09:00
f80b1be770
CHORE(argocd): remove app-of-apps.yaml
- Remove manual deployment file
- Now managed via GitOps
2026-01-09 21:43:56 +09:00
424c296d05
REFACTOR(argocd): consolidate App of Apps into single file
- Merge multiple app-of-apps files
- Simplify repository structure
2026-01-09 21:43:56 +09:00
6c387a7f7e
FEAT(argocd): add web-apps Application to platform
- Register web-apps repository in App of Apps
2026-01-09 21:43:56 +09:00
4a4ccd0c44
FIX(argocd): use control-plane nodeSelector
- Change nodeSelector from master to control-plane
- K8s nodes have control-plane: "true" label
- Fix pod scheduling failure
FIX(argocd): use hostname instead of hosts for ingress
- Change from hosts array to hostname string
- Change tls from array to boolean
- Matches argo-cd Helm chart expected format
FIX(argocd): resolve SharedResourceWarning
- Change from including argocd/ folder to argocd/argocd.yaml only
- Namespace and webhook-ingress now managed by argocd app only
- Prevents duplicate resource management between platform and argocd
2026-01-09 21:43:36 +09:00
0d38963837
FEAT(argocd): enable GitOps self-management
- Add ArgoCD Application for Helm chart deployment
- Add helm-values.yaml with custom settings
- Configure GOMEMLIMIT=400MiB, GOGC=50
- Disable reconciliation (webhook only)
- Enable anonymous access (Authelia handles auth)
- Move main ingress to helm-values.yaml
- Add separate webhook-ingress.yaml
- Remove ConfigMap files (now in helm-values)
2026-01-09 21:43:36 +09:00
a2b13bb4f6
REFACTOR(repo): standardize taint to control-plane
- Remove deprecated master taint from traefik
- Update svclb annotation to control-plane
- Remove master taint from argocd-image-updater
2026-01-09 21:43:36 +09:00
561a07399a
FIX(cert-manager): merge duplicate webhook and cainjector sections
- Merge webhook.affinity into webhook section
- Merge cainjector.affinity into cainjector section
- Fix YAML structure to prevent configuration override
2026-01-09 21:43:36 +09:00
da93a2e346
FEAT(platform): enable HA with replica 2 and soft anti-affinity
- Add replicaCount: 2 to cert-manager components
- Add soft pod anti-affinity for node distribution
- Remove descheduler (moved to separate location)
2026-01-09 21:43:36 +09:00
bd1b3c9d85
FIX(argocd): disable app-resync to prevent periodic spikes
- Set controller.app.resync to 0 (default 180s)
- Rely on webhook + selfHeal only
- Fixes 3-minute periodic reconciliation causing CPU/memory spikes
2026-01-09 21:43:31 +09:00
2e2f75dd6b
PERF(resources): remove CPU limits - keep memory limits only
- CPU throttling slows or blocks app startup but does not crash pods
- Memory OOM is the real cascading failure cause
- CPU request ensures fair scheduling
2026-01-07 23:48:39 +09:00
9f46c94dff
Disable ArgoCD polling - webhook only
- Set timeout.reconciliation to 0 (disabled)
- ArgoCD now relies solely on GitHub webhooks for refresh
- Reduces unnecessary reconciliation cycles
2026-01-07 18:54:15 +09:00
7bcab45089
CHORE: Remove Tekton CI/CD platform
- Delete tekton/ directory (pipeline, triggers, dashboard, ci-cd)
- Remove tekton references from kustomization.yaml
- Switching to GitHub Actions for CI/CD
2026-01-07 17:51:10 +09:00
3ff9df0e35
FIX(tekton): use ExternalSecret API v1 instead of v1beta1
2026-01-07 16:37:32 +09:00
a31b2b1a55
FEAT(tekton): add Tekton Triggers for GitHub webhooks
- Add EventListener for GitHub push events
- Add TriggerBinding for payload parsing
- Add TriggerTemplates for Next.js and FastAPI
- Add RBAC for trigger service account
- Add ExternalSecret for webhook secret from Vault
- Add Ingress at tekton0213.kro.kr/hooks
2026-01-07 16:30:22 +09:00
892b5dc815
FEAT(argocd): add webhook ingress without Authelia
- Add separate ingress for /api/webhook path
- Exclude Authelia middleware for GitHub webhook
- Enable automatic refresh on git push events
2026-01-07 16:11:59 +09:00
e1641cd3cf
FEAT(ci): add ArgoCD Image Updater and CI/CD pipelines
- ArgoCD Image Updater for Zot registry polling
- Tekton Tasks: git-clone, buildah-build-push
- Pipelines: nextjs, fastapi, python
- ExternalSecrets for Zot and GitHub credentials
2026-01-07 14:41:53 +09:00
34de9051c6
FEAT(tekton): add Tekton CI/CD platform
- Tekton Pipeline for container builds
- Tekton Triggers for webhook events
- Tekton Dashboard at tekton0213.kro.kr
- Namespace patched to privileged for buildah
2026-01-07 14:27:44 +09:00
045967b265
REFACTOR(argocd): move config files to manifests/
- Move namespace.yaml to manifests/
- Move argocd-cm.yaml to manifests/
- Move argocd-rbac-cm.yaml to manifests/
- Move argocd-cmd-params-cm.yaml to manifests/
- Move ingress.yaml to manifests/
2026-01-06 15:42:19 +09:00
82781cb4f1
REFACTOR(cert-manager): move issuer to manifests
- Move ClusterIssuer to manifests/ folder
- Separate from Helm chart configuration
2026-01-06 01:38:31 +09:00
cc8bd860fe
REFACTOR(repo): platform repo structure
- Add application.yaml for ArgoCD app-of-apps
- Add kustomization.yaml with platform components
- Add renovate.json for automated updates
- Update cert-manager/argocd.yaml repoURL to platform repo
- Update traefik/argocd.yaml repoURL to platform repo
2026-01-04 23:28:29 +09:00
a954e68790
REFACTOR(grafana): remove Falco and Traefik UI
- Use Grafana dashboards instead
- Delete falco-ui-secret ExternalSecret
- Delete traefik dashboard IngressRoute
- Update traefik kustomization.yaml
2026-01-04 23:28:29 +09:00