15d5e58d6c
migrate: change repoURLs from GitHub to Gitea
...
Update all ArgoCD Application references to use Gitea (github0213.com)
instead of GitHub for the K3S-HOME/observability repository.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 20:43:29 +09:00
7d0c8aa5f3
FIX(opentelemetry-operator): remove cpu null values
...
- Remove cpu: null (not allowed in new chart schema)
- Keep only memory limits
2026-01-10 18:55:23 +09:00
9c00c42946
CHORE(opentelemetry-operator): upgrade chart to 0.102.0
...
- Fix ServiceMonitor duplicate creation bug (Issue #3446)
- Upgrade from 0.74.0 to 0.102.0
2026-01-10 18:53:34 +09:00
a08d989fc3
FIX(opentelemetry-operator): remove invalid serviceMonitor
...
- Remove top-level serviceMonitor (not in chart schema)
- Keep manager.serviceMonitor.enabled: false
2026-01-10 18:42:02 +09:00
203a8debac
REFACTOR(repo): remove control-plane scheduling
...
- Remove nodeSelector for control-plane node
- Remove tolerations for control-plane taint
- Allow pods to schedule on any available node
2026-01-10 18:35:15 +09:00
c128ece672
FIX(opentelemetry-operator): disable serviceMonitor
...
- Add top-level serviceMonitor.enabled: false
- Prevent duplicate ServiceMonitor creation on restart
2026-01-10 18:28:12 +09:00
b4b48c6e89
FIX(opentelemetry-operator): restore memory to 256Mi
...
- VPA-recommended 75Mi was too low, causing informer sync timeouts
- Restore original memory value for stability
2026-01-10 14:52:24 +09:00
a3003d597f
PERF(observability): adjust resources based on VPA
...
- Update blackbox-exporter cpu 15m→23m, memory 64Mi→100Mi
- Update grafana cpu 11m→23m, memory 425Mi→175Mi
- Update loki cpu 23m→63m, memory 462Mi→363Mi
- Update tempo cpu 50m→15m, memory 128Mi→100Mi
- Update thanos memory 128Mi→283Mi
- Update node-exporter memory 64Mi→100Mi
- Update kube-state-metrics memory 100Mi→105Mi
- Update opentelemetry-operator cpu 10m→11m, memory 256Mi→75Mi
- Update vpa memory 128Mi→100Mi
2026-01-10 14:33:40 +09:00
de81ca68c9
FIX(opentelemetry-operator): fix ServiceMonitor config path
...
- Move serviceMonitor config from metrics to manager section
- Fix Helm schema validation error
- Disable ServiceMonitor creation to prevent conflicts
2026-01-10 02:38:53 +09:00
dac5fc7bcf
FIX(opentelemetry-operator): disable ServiceMonitor creation
...
- Set metrics.serviceMonitor.enabled to false
- Prevent ServiceMonitor conflict causing CrashLoopBackOff
- Resolve error: servicemonitor already exists
2026-01-10 02:36:12 +09:00
8a050dd303
CHORE(opentelemetry-operator): disable CPU limits
...
- Set CPU limits to null for manager container
- Set CPU limits to null for kube-rbac-proxy container
- Disable chart default CPU limits to prevent throttling
2026-01-10 02:32:53 +09:00
466ec6210c
CHORE(observability): align memory requests with limits
...
- Update opentelemetry-operator manager from 64Mi to 256Mi
- Update opentelemetry-operator kube-rbac-proxy from 32Mi to 64Mi
- Update opentelemetry-collector memory request from 256Mi to 512Mi
2026-01-10 02:31:19 +09:00
507395aca7
CHORE(otel-operator): schedule on master node
...
- Add tolerations and nodeSelector to run the operator on the control-plane node
2026-01-10 01:18:41 +09:00
02faf93555
FEAT(otel): add OTel Collector for logs and traces
...
- Add OpenTelemetry Operator for CR management
- Deploy OTel Collector as DaemonSet via CR
- Enable filelog receiver for container log collection
- Replace Promtail with OTel filelog receiver
- Keep Prometheus for ServiceMonitor-based metrics scraping
2026-01-09 23:23:51 +09:00