cluster-audit
Cluster Audit Report - 2026-02-14
Generated at: 2026-02-14 00:46:35
1. Node Resources
| Node | Status | Roles | CPU Usage | Memory Usage | Version |
|---|---|---|---|---|---|
| master-01 | ✅ Ready | worker | 5% (104m) | 61% (2410Mi) | v1.34.3+k3s1 |
| worker-01 | ✅ Ready | worker | 23% (927m) | 81% (6471Mi) | v1.34.3+k3s1 |
| worker-02 | ✅ Ready | worker | 15% (607m) | 24% (1976Mi) | v1.34.3+k3s1 |
| worker-03 | ✅ Ready | worker | 16% (661m) | 69% (5500Mi) | v1.34.3+k3s1 |
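The usage cells above combine a percentage with an absolute value, for example `81% (6471Mi)`. As a minimal sketch of how such cells could be parsed and checked against a limit, the snippet below flags nodes at or above a memory threshold. The function names and the 80% threshold are assumptions made for illustration; the report does not define an alerting threshold.

```python
import re

# Matches cells such as "81% (6471Mi)" or "5% (104m)".
USAGE_RE = re.compile(r"^\s*(\d+)%\s*\((\d+)(m|Mi)\)\s*$")


def parse_usage(cell: str) -> tuple[int, int, str]:
    """Split a usage cell into (percent, amount, unit)."""
    match = USAGE_RE.match(cell)
    if match is None:
        raise ValueError(f"unrecognized usage cell: {cell!r}")
    return int(match.group(1)), int(match.group(2)), match.group(3)


def nodes_over_threshold(memory_by_node: dict[str, str], threshold_pct: int = 80) -> list[str]:
    """Return node names whose memory percentage is at or above the threshold.

    threshold_pct is a hypothetical cutoff, not a value taken from the report.
    """
    return [
        node
        for node, cell in memory_by_node.items()
        if parse_usage(cell)[0] >= threshold_pct
    ]
```

Applied to the memory column of the table, only `worker-01` (81%) would be flagged under the assumed 80% limit.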
2. Workload Health
2.1 Deployments
- Status: ✅ All 55 Deployments are running normally.
2.2 StatefulSets
- Status: ✅ All 5 StatefulSets are running normally.
3. Abnormal Pods
| Namespace | Pod Name | Status | Restarts | Node | Reason |
|---|---|---|---|---|---|
| default | w03-final | Failed | 0 | worker-03 | Unknown |
| dev-2 | ingest-worker-v2-b1-7c7bb587f7-rzcpf | Running | 14 | worker-03 | Unknown |
| infra | postgres-backup-29510100-xrthd | Pending | 0 | master-01 | Unknown |
| kube-system | master-disk-check | Running | 607 | master-01 | Unknown |
| longhorn-system | engine-image-ei-ff1cedad-45wkf | Running | 50 | worker-01 | Unknown |
| longhorn-system | longhorn-csi-plugin-cbdfg | Running | 0 | worker-01 | Unknown |
| longhorn-system | longhorn-csi-plugin-cbdfg | Running | 44 | worker-01 | Unknown |
| monitoring | monitor-kube-prometheus-st-operator-688f6b9597-mgcn8 | Running | 49 | worker-01 | Unknown |
| monitoring | monitor-kube-state-metrics-75b75b8c56-sdfd8 | Running | 62 | worker-01 | Unknown |
| monitoring | monitor-prometheus-node-exporter-76q4c | Running | 31 | worker-01 | Unknown |
| monitoring | prometheus-monitor-kube-prometheus-st-prometheus-0 | Running | 0 | worker-01 | Unknown |
| monitoring | prometheus-monitor-kube-prometheus-st-prometheus-0 | Running | 39 | worker-01 | Unknown |
| project-team-chat | rocket-chat-account-cfd85cc46-bjqzr | Running | 8 | worker-03 | Unknown |
| project-team-chat | rocket-chat-ddp-streamer-664656df6b-g6hct | Running | 113 | worker-01 | Unknown |
| project-team-chat | rocket-chat-presence-86449566b6-jkkmx | Running | 8 | worker-03 | Unknown |
| project-team-chat | rocket-chat-rocketchat-7c8bcbb868-5rj2t | Running | 68 | worker-01 | Unknown |
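Every row above reports its reason as `Unknown`. A list like this could be rebuilt, with a reason where the API provides one, from the parsed output of `kubectl get pods -A -o json`. The sketch below is one possible approach and not the audit tool's actual logic: the function names and the restart threshold of 5 are assumptions, and restarts are summed per pod rather than listed per container.

```python
from typing import Any


def container_reason(container_status: dict[str, Any]) -> str:
    """Pick a reason from the current waiting state or the last termination."""
    waiting = container_status.get("state", {}).get("waiting", {})
    terminated = container_status.get("lastState", {}).get("terminated", {})
    return waiting.get("reason") or terminated.get("reason") or "Unknown"


def find_abnormal_pods(pod_list: dict[str, Any], restart_threshold: int = 5) -> list[dict[str, Any]]:
    """Return pods that are not Running/Succeeded or that restart often.

    pod_list is the parsed JSON of `kubectl get pods -A -o json`.
    restart_threshold is a hypothetical cutoff, not taken from the report.
    """
    abnormal = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        status = pod.get("status", {})
        containers = status.get("containerStatuses", [])
        phase = status.get("phase", "Unknown")
        # Sum restarts across all containers of the pod.
        restarts = sum(cs.get("restartCount", 0) for cs in containers)
        if phase in ("Running", "Succeeded") and restarts < restart_threshold:
            continue
        # Use the first container that reports a concrete reason, if any.
        reasons = [r for r in map(container_reason, containers) if r != "Unknown"]
        abnormal.append({
            "namespace": meta.get("namespace", ""),
            "name": meta.get("name", ""),
            "phase": phase,
            "restarts": restarts,
            "node": pod.get("spec", {}).get("nodeName", ""),
            "reason": reasons[0] if reasons else "Unknown",
        })
    # Highest restart counts first.
    abnormal.sort(key=lambda row: row["restarts"], reverse=True)
    return abnormal
```

To use it, the kubectl output can be loaded with `json.load` and passed in as `pod_list`. Reading `state.waiting.reason` and `lastState.terminated.reason` is what would surface values such as `ImagePullBackOff` or `OOMKilled` in place of `Unknown`.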
4. Storage Health
4.1 PVC Usage (Top Usage)
| Namespace | PVC Name | Used | Avail | Use% | Mounted On |
|---|---|---|---|---|---|
| infra | firecrawl-redis-56b494c988-xbpqh (Vol) | 29.8G | 45.7G | 39% | /data |
| infra | nocodb-5ff5dd9484-gq2sd (Vol) | 24K | 4.9G | 1% | /usr/app/data |
| infra | postgres-f65fc7f79-mb5b5 (Vol) | 135M | 9.6G | 2% | /var/lib/postgresql/data |
| npm | nginx-proxy-manager-5dfcd6bcc8-pm5qp (Vol) | 39M | 1.9G | 2% | /data |
| apps | ghost-5b5f458c6d-cz4ns (Vol) | 2.7M | 9.8G | 1% | /var/lib/ghost/content |
| apps | ghost-mysql-5b9d4d6b68-8mq5m (Vol) | 213M | 4.7G | 5% | /var/lib/mysql |
4.2 PVs in Abnormal State
| Name | Status | Claim | StorageClass | Reason |
|---|---|---|---|---|
| pvc-8ea3c692-5ac4-4b37-a984-06ab16a5ac8b | Released | apps/n8n-pvc | local-path | - |
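A `Released` PV is one whose claim has been deleted but which has not yet been reclaimed. With a `Retain` reclaim policy it keeps its data and cannot be bound again until an administrator cleans it up. As a rough sketch, such volumes could be listed from the parsed output of `kubectl get pv -o json` as shown below. The function name is an assumption, and the reclaim policy is included only because it decides what cleanup is needed; the report itself does not state the policy for this volume.

```python
from typing import Any


def released_volumes(pv_list: dict[str, Any]) -> list[dict[str, str]]:
    """Return PersistentVolumes whose phase is Released.

    pv_list is the parsed JSON of `kubectl get pv -o json`.
    """
    released = []
    for pv in pv_list.get("items", []):
        if pv.get("status", {}).get("phase") != "Released":
            continue
        spec = pv.get("spec", {})
        # claimRef still points at the deleted claim on a Released volume.
        claim = spec.get("claimRef") or {}
        released.append({
            "name": pv.get("metadata", {}).get("name", ""),
            "claim": f"{claim.get('namespace', '')}/{claim.get('name', '')}",
            "storage_class": spec.get("storageClassName", ""),
            "reclaim_policy": spec.get("persistentVolumeReclaimPolicy", ""),
        })
    return released
```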
5. Recent Warnings
| Time | Namespace | Object | Reason | Message |
|---|---|---|---|---|
| 2026-02-13T16:45:08Z | infra | Pod/postgres-backup-29510100-xrthd | Failed | Error: ImagePullBackOff |
| 2026-02-13T16:42:03Z | kube-system | Pod/master-disk-check | BackOff | Back-off restarting failed container check in pod master-disk-check_kube-system(d441b30a-fa5a-4139-bca3-5da6f77602ec) |
| 2026-02-13T16:20:14Z | infra | Pod/postgres-backup-29510100-xrthd | Failed | Failed to pull image "akria/backup-engine:latest": failed to pull and unpack image "docker.io/akria/backup-engine:latest": failed to resolve reference "docker.io/akria/backup-engine:latest": pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed |
| 2026-02-13T16:05:29Z | project-team-chat | Pod/rocket-chat-mongodb-0 | Unhealthy | Readiness probe failed: command timed out: "/bitnami/scripts/readiness-probe.sh" timed out after 5s |
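Several of the warnings above refer to the same object (for example, both `Failed` events belong to `postgres-backup-29510100-xrthd`). As a small sketch, warnings could be grouped per object and reason from the parsed output of `kubectl get events -A --field-selector type=Warning -o json`. The function name is an assumption, and this is not necessarily how the report was produced.

```python
from collections import Counter
from typing import Any


def summarize_warnings(event_list: dict[str, Any]) -> Counter:
    """Count warning events per (namespace, Kind/name, reason).

    event_list is the parsed JSON of
    `kubectl get events -A --field-selector type=Warning -o json`.
    """
    counts: Counter = Counter()
    for event in event_list.get("items", []):
        # Skip anything that is not a warning, in case the input was unfiltered.
        if event.get("type", "Warning") != "Warning":
            continue
        obj = event.get("involvedObject", {})
        key = (
            event.get("metadata", {}).get("namespace", ""),
            f"{obj.get('kind', '')}/{obj.get('name', '')}",
            event.get("reason", ""),
        )
        counts[key] += 1
    return counts
```

Calling `summarize_warnings(events).most_common()` lists the noisiest objects first, which makes repeated failures such as the image pull errors easier to spot.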