Major changes:
- Kubernetes tools: Replace subprocess kubectl calls with the kubernetes-client library
  - Support in-cluster config when running inside a pod
  - Fall back to local kubeconfig for development
  - All k8s tools (nodes, pods, deployments, logs, describe) now use the Python API
- PostgreSQL tools: Replace kubectl exec psql with a direct psycopg2 connection
  - Connect via Kubernetes service DNS
  - Support environment-based configuration
  - Improve error handling by reporting pgcode/pgerror
- Prometheus tools: Replace kubectl exec wget with direct HTTP requests
  - Use the requests library to query the Prometheus API
  - Connect via Kubernetes service DNS
  - Configurable via the PROMETHEUS_URL env var
- Deployment updates: Add explicit PostgreSQL connection env vars
  - POSTGRES_HOST, POSTGRES_PORT, POSTGRES_USER
  - POSTGRES_PASSWORD was already provided from the secret
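
A rough sketch of the environment-based configuration listed above, not the actual tool code. Only the env var names come from this change; the default host names, the fallback user, and the helper names postgres_settings, run_query, and prometheus_query_url are assumptions made for illustration.

```python
import os
from urllib.parse import urlencode


def postgres_settings(env=os.environ):
    """Collect psycopg2 connection keyword arguments from the environment."""
    return {
        # Default host, port, and user are illustrative assumptions.
        "host": env.get("POSTGRES_HOST", "postgresql.default.svc.cluster.local"),
        "port": int(env.get("POSTGRES_PORT", "5432")),
        "user": env.get("POSTGRES_USER", "postgres"),
        "password": env.get("POSTGRES_PASSWORD", ""),
    }


def run_query(sql):
    """Run one statement and return rows, or a dict describing the failure."""
    import psycopg2  # third-party; imported lazily so the helpers above stay stdlib-only

    try:
        conn = psycopg2.connect(**postgres_settings())
        try:
            with conn.cursor() as cur:
                cur.execute(sql)
                return cur.fetchall()
        finally:
            conn.close()
    except psycopg2.Error as exc:
        # pgcode/pgerror are None for errors raised before the server replies.
        return {"pgcode": exc.pgcode, "pgerror": exc.pgerror or str(exc)}


def prometheus_query_url(promql, env=os.environ):
    """Build an instant-query URL for the Prometheus HTTP API."""
    # The default service DNS name is an assumption.
    base = env.get("PROMETHEUS_URL", "http://prometheus.monitoring.svc.cluster.local:9090")
    return f"{base.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"
```

The returned URL can be passed to requests.get(url, timeout=...) in the Prometheus tool; keeping URL construction separate makes it easy to check without a running cluster.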
Benefits:
- No longer requires kubectl binary in container
- Faster execution (no subprocess overhead)
- Better error handling and type safety
- Works seamlessly in Kubernetes pods with RBAC
- ServiceAccount for mas pod
- ClusterRole with read-only permissions
- ClusterRoleBinding
- kubectl installed in Docker image
- mas can now query the Kubernetes API
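
With the ServiceAccount and read-only ClusterRole in place, client setup with the in-cluster/kubeconfig fallback might look like the sketch below. The helper names running_in_cluster, core_v1_client, and list_pod_names are hypothetical, and choosing the loader by checking the service env vars is an assumption about how the fallback is decided.

```python
import os


def running_in_cluster(env=os.environ):
    """Kubernetes injects these variables into every pod."""
    return "KUBERNETES_SERVICE_HOST" in env and "KUBERNETES_SERVICE_PORT" in env


def core_v1_client(env=os.environ):
    """Return a CoreV1Api client configured for the current environment."""
    from kubernetes import client, config  # third-party; imported lazily

    if running_in_cluster(env):
        # Uses the pod's mounted ServiceAccount token and CA certificate.
        config.load_incluster_config()
    else:
        # Development fallback: reads the local kubeconfig (default ~/.kube/config).
        config.load_kube_config()
    return client.CoreV1Api()


def list_pod_names(namespace="default"):
    """List pod names in a namespace; requires list permission on pods."""
    pods = core_v1_client().list_namespaced_pod(namespace=namespace)
    return [pod.metadata.name for pod in pods.items]
```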
Orchestrator and SRE agents now:
- Don't guess cluster-specific information
- Explicitly state when real verification is needed
- Provide general best practices only
- Created mas database in PostgreSQL
- Changed database user from mas_user to bluemayne (an existing user)
- Use postgresql-password secret (root password)
- Add CHAINLIT_DATABASE_URL for Chainlit compatibility
- Improved error handling in chainlit_app.py
- Create /root/.chainlit directory in Dockerfile to prevent FileExistsError
- Reduce replicas from 2 to 1 to conserve resources
- Lower CPU request from 500m to 100m (nodes had insufficient CPU)
- Lower memory request from 512Mi to 256Mi
- Remove health check probes (Chainlit does not expose a /health endpoint)
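
For the CHAINLIT_DATABASE_URL added above, one possible way to assemble the value from the POSTGRES_* variables is sketched here. The postgresql:// URL form, the localhost default, and the helper name chainlit_database_url are assumptions; the mas database and bluemayne user are taken from this change.

```python
import os
from urllib.parse import quote


def chainlit_database_url(env=os.environ, database="mas"):
    """Build a PostgreSQL connection URL, percent-encoding the credentials."""
    # safe="" also encodes "/" and "@", which would otherwise break URL parsing.
    user = quote(env.get("POSTGRES_USER", "bluemayne"), safe="")
    password = quote(env.get("POSTGRES_PASSWORD", ""), safe="")
    host = env.get("POSTGRES_HOST", "localhost")  # assumed default
    port = env.get("POSTGRES_PORT", "5432")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"
```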