cluster-audit
Cluster Audit Report - 2026-02-14
Generated at: 2026-02-14 18:01:15
1. Node Resources
| Node | Status | Roles | CPU Usage | Memory Usage | Version |
|---|---|---|---|---|---|
| master-01 | ✅ Ready | worker | 6% (125m) | 65% (2551Mi) | v1.34.3+k3s1 |
| worker-01 | ✅ Ready | worker | 17% (683m) | 88% (6995Mi) | v1.34.3+k3s1 |
| worker-02 | ✅ Ready | worker | 17% (692m) | 23% (1850Mi) | v1.34.3+k3s1 |
| worker-03 | ✅ Ready | worker | 17% (714m) | 75% (5965Mi) | v1.34.3+k3s1 |
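The memory column above can be screened against a threshold. The following is a minimal sketch, not part of the audit tooling: the rows are copied from the table, while the 80% threshold and the function name `nodes_over_memory_threshold` are assumptions chosen for illustration.

```python
# Rows copied from the Node Resources table: (node, cpu_percent, memory_percent).
NODES = [
    ("master-01", 6, 65),
    ("worker-01", 17, 88),
    ("worker-02", 17, 23),
    ("worker-03", 17, 75),
]

# Hypothetical alert threshold; the report itself does not define one.
MEMORY_THRESHOLD_PERCENT = 80


def nodes_over_memory_threshold(nodes, threshold=MEMORY_THRESHOLD_PERCENT):
    """Return the names of nodes whose memory usage is at or above the threshold."""
    return [name for name, _cpu, memory in nodes if memory >= threshold]


if __name__ == "__main__":
    print(nodes_over_memory_threshold(NODES))
```

Under the assumed 80% threshold only worker-01 is flagged; lowering it to 70% would also flag worker-03.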
2. Workload Health
2.1 Deployments
- Status: ✅ All 57 Deployments are running normally.
2.2 StatefulSets
- Status: ✅ All 6 StatefulSets are running normally.
3. Abnormal Pods
| Namespace | Pod Name | Status | Restarts | Node | Reason |
|---|---|---|---|---|---|
| default | w03-final | Failed | 0 | worker-03 | Unknown |
| dev-2 | ingest-worker-v2-b1-7c7bb587f7-rzcpf | Running | 19 | worker-03 | Unknown |
| infra | postgres-backup-29510100-xrthd | Pending | 0 | master-01 | Unknown |
| kube-system | master-disk-check | Running | 808 | master-01 | Unknown |
| longhorn-system | engine-image-ei-ff1cedad-45wkf | Running | 50 | worker-01 | Unknown |
| longhorn-system | longhorn-csi-plugin-cbdfg | Running | 0 | worker-01 | Unknown |
| longhorn-system | longhorn-csi-plugin-cbdfg | Running | 44 | worker-01 | Unknown |
| monitoring | monitor-kube-prometheus-st-operator-688f6b9597-mgcn8 | Running | 49 | worker-01 | Unknown |
| monitoring | monitor-kube-state-metrics-75b75b8c56-sdfd8 | Running | 62 | worker-01 | Unknown |
| monitoring | monitor-prometheus-node-exporter-76q4c | Running | 31 | worker-01 | Unknown |
| monitoring | prometheus-monitor-kube-prometheus-st-prometheus-0 | Running | 0 | worker-01 | Unknown |
| monitoring | prometheus-monitor-kube-prometheus-st-prometheus-0 | Running | 39 | worker-01 | Unknown |
| project-team-chat | rocket-chat-account-cfd85cc46-bjqzr | Running | 8 | worker-03 | Unknown |
| project-team-chat | rocket-chat-ddp-streamer-664656df6b-g6hct | Running | 113 | worker-01 | Unknown |
| project-team-chat | rocket-chat-presence-86449566b6-jkkmx | Running | 8 | worker-03 | Unknown |
| project-team-chat | rocket-chat-rocketchat-7c8bcbb868-5rj2t | Running | 68 | worker-01 | Unknown |
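To see where restarts concentrate, the rows above can be totalled per node. This is a minimal sketch under stated assumptions: the tuples are copied from the table, and the helper name `restarts_by_node` is hypothetical rather than part of the audit tooling.

```python
from collections import defaultdict

# Rows copied from the Abnormal Pods table:
# (namespace, pod_name, status, restarts, node).
ABNORMAL_PODS = [
    ("default", "w03-final", "Failed", 0, "worker-03"),
    ("dev-2", "ingest-worker-v2-b1-7c7bb587f7-rzcpf", "Running", 19, "worker-03"),
    ("infra", "postgres-backup-29510100-xrthd", "Pending", 0, "master-01"),
    ("kube-system", "master-disk-check", "Running", 808, "master-01"),
    ("longhorn-system", "engine-image-ei-ff1cedad-45wkf", "Running", 50, "worker-01"),
    ("longhorn-system", "longhorn-csi-plugin-cbdfg", "Running", 0, "worker-01"),
    ("longhorn-system", "longhorn-csi-plugin-cbdfg", "Running", 44, "worker-01"),
    ("monitoring", "monitor-kube-prometheus-st-operator-688f6b9597-mgcn8", "Running", 49, "worker-01"),
    ("monitoring", "monitor-kube-state-metrics-75b75b8c56-sdfd8", "Running", 62, "worker-01"),
    ("monitoring", "monitor-prometheus-node-exporter-76q4c", "Running", 31, "worker-01"),
    ("monitoring", "prometheus-monitor-kube-prometheus-st-prometheus-0", "Running", 0, "worker-01"),
    ("monitoring", "prometheus-monitor-kube-prometheus-st-prometheus-0", "Running", 39, "worker-01"),
    ("project-team-chat", "rocket-chat-account-cfd85cc46-bjqzr", "Running", 8, "worker-03"),
    ("project-team-chat", "rocket-chat-ddp-streamer-664656df6b-g6hct", "Running", 113, "worker-01"),
    ("project-team-chat", "rocket-chat-presence-86449566b6-jkkmx", "Running", 8, "worker-03"),
    ("project-team-chat", "rocket-chat-rocketchat-7c8bcbb868-5rj2t", "Running", 68, "worker-01"),
]


def restarts_by_node(rows):
    """Sum the restart counts per node, highest total first."""
    totals = defaultdict(int)
    for _namespace, _pod, _status, restarts, node in rows:
        totals[node] += restarts
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


if __name__ == "__main__":
    for node, total in restarts_by_node(ABNORMAL_PODS).items():
        print(f"{node}: {total}")
```

With these rows, master-01 totals 808 restarts (all from master-disk-check), worker-01 totals 456 across ten rows, and worker-03 totals 35.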
4. Storage Health
4.1 PVC Usage (Top Usage)
| Namespace | PVC Name | Used | Avail | Use% | Mounted On |
|---|---|---|---|---|---|
| infra | firecrawl-redis-56b494c988-xbpqh (Vol) | 29.9G | 45.5G | 40% | /data |
| infra | nocodb-5ff5dd9484-gq2sd (Vol) | 24K | 4.9G | 1% | /usr/app/data |
| infra | postgres-f65fc7f79-mb5b5 (Vol) | 135M | 9.6G | 2% | /var/lib/postgresql/data |
| npm | nginx-proxy-manager-5dfcd6bcc8-pm5qp (Vol) | 43M | 1.9G | 3% | /data |
| apps | ghost-5b5f458c6d-cz4ns (Vol) | 3.4M | 9.8G | 1% | /var/lib/ghost/content |
| apps | ghost-mysql-5b9d4d6b68-8mq5m (Vol) | 223M | 4.7G | 5% | /var/lib/mysql |
4.2 Abnormal PV Status
| Name | Status | Claim | StorageClass | Reason |
|---|---|---|---|---|
| pvc-8ea3c692-5ac4-4b37-a984-06ab16a5ac8b | Released | apps/n8n-pvc | local-path | - |
5. Recent Warnings
| Time | Namespace | Object | Reason | Message |
|---|---|---|---|---|
| 2026-02-14T10:00:09Z | infra | Pod/postgres-backup-29510100-xrthd | Failed | Error: ImagePullBackOff |
| 2026-02-14T09:56:59Z | kube-system | Pod/master-disk-check | BackOff | Back-off restarting failed container check in pod master-disk-check_kube-system(d441b30a-fa5a-4139-bca3-5da6f77602ec) |
| 2026-02-14T09:46:15Z | default | Pod/kalai-5bc87fb6d-zqhp8 | BackOff | Back-off restarting failed container kalai in pod kalai-5bc87fb6d-zqhp8_default(6b33515e-8e38-4cf8-bfbc-d857cd161c2f) |
| 2026-02-14T09:41:30Z | default | Pod/kalai-6cd8d46866-5f2bx | BackOff | Back-off restarting failed container kalai in pod kalai-6cd8d46866-5f2bx_default(e94f71fe-6970-4f91-b918-f4467ece3365) |
| 2026-02-14T09:29:29Z | default | Pod/kalai-7b844d8c95-wv2mv | BackOff | Back-off restarting failed container kalai in pod kalai-7b844d8c95-wv2mv_default(ef85c311-1cc1-4a24-a6a0-167429671831) |