cluster-audit
Cluster Audit Report - 2026-02-14
Generated at: 2026-02-14 00:41:08
1. Node Resources Overview
| Node | Status | Roles | CPU Usage | Memory Usage | Version |
|---|---|---|---|---|---|
| master-01 | ✅ Ready | worker | 7% (155m) | 62% (2457Mi) | v1.34.3+k3s1 |
| worker-01 | ✅ Ready | worker | 10% (426m) | 82% (6531Mi) | v1.34.3+k3s1 |
| worker-02 | ✅ Ready | worker | 11% (440m) | 24% (1975Mi) | v1.34.3+k3s1 |
| worker-03 | ✅ Ready | worker | 13% (530m) | 69% (5528Mi) | v1.34.3+k3s1 |
2. Workload Health
2.1 Deployments
| Namespace | Name | Ready | Up-to-date | Available | Age |
|---|---|---|---|---|---|
2.2 StatefulSets
| Namespace | Name | Ready | Current | Age |
|---|---|---|---|---|
3. Abnormal Pods
| Namespace | Pod Name | Status | Restarts | Node | Reason |
|---|---|---|---|---|---|
| default | w03-final | Failed | 0 | worker-03 | Unknown |
| dev-2 | ingest-worker-v2-b1-7c7bb587f7-rzcpf | Running | 14 | worker-03 | Unknown |
| infra | postgres-backup-29510100-xrthd | Pending | 0 | master-01 | Unknown |
| kube-system | master-disk-check | Running | 605 | master-01 | Unknown |
| longhorn-system | engine-image-ei-ff1cedad-45wkf | Running | 50 | worker-01 | Unknown |
| longhorn-system | longhorn-csi-plugin-cbdfg | Running | 0 | worker-01 | Unknown |
| longhorn-system | longhorn-csi-plugin-cbdfg | Running | 44 | worker-01 | Unknown |
| monitoring | monitor-kube-prometheus-st-operator-688f6b9597-mgcn8 | Running | 49 | worker-01 | Unknown |
| monitoring | monitor-kube-state-metrics-75b75b8c56-sdfd8 | Running | 62 | worker-01 | Unknown |
| monitoring | monitor-prometheus-node-exporter-76q4c | Running | 31 | worker-01 | Unknown |
| monitoring | prometheus-monitor-kube-prometheus-st-prometheus-0 | Running | 0 | worker-01 | Unknown |
| monitoring | prometheus-monitor-kube-prometheus-st-prometheus-0 | Running | 39 | worker-01 | Unknown |
| project-team-chat | rocket-chat-account-cfd85cc46-bjqzr | Running | 8 | worker-03 | Unknown |
| project-team-chat | rocket-chat-ddp-streamer-664656df6b-g6hct | Running | 113 | worker-01 | Unknown |
| project-team-chat | rocket-chat-presence-86449566b6-jkkmx | Running | 8 | worker-03 | Unknown |
| project-team-chat | rocket-chat-rocketchat-7c8bcbb868-5rj2t | Running | 68 | worker-01 | Unknown |
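
Because the Reason column reads Unknown for every pod, the restart count is the most useful triage signal in the table above. The following is a minimal sketch, not part of the audit tool, that ranks rows by restart count. It assumes the rows are available as markdown table lines; the sample rows are copied from the table, while `RESTART_THRESHOLD`, `parse_rows`, and `noisy_pods` are hypothetical names introduced only for illustration.

```python
# A few rows copied from the abnormal-pods table, as markdown table lines.
ROWS = """\
| kube-system | master-disk-check | Running | 605 | master-01 | Unknown |
| project-team-chat | rocket-chat-ddp-streamer-664656df6b-g6hct | Running | 113 | worker-01 | Unknown |
| longhorn-system | longhorn-csi-plugin-cbdfg | Running | 0 | worker-01 | Unknown |
| dev-2 | ingest-worker-v2-b1-7c7bb587f7-rzcpf | Running | 14 | worker-03 | Unknown |
"""

# Hypothetical cutoff chosen for illustration; the report defines no threshold.
RESTART_THRESHOLD = 50


def parse_rows(text):
    """Split markdown table rows into dicts keyed by column name."""
    columns = ["namespace", "pod", "status", "restarts", "node", "reason"]
    pods = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != len(columns):
            continue  # skip blank or malformed lines
        row = dict(zip(columns, cells))
        row["restarts"] = int(row["restarts"])
        pods.append(row)
    return pods


def noisy_pods(pods, threshold=RESTART_THRESHOLD):
    """Return pods at or above the threshold, highest restart count first."""
    flagged = [p for p in pods if p["restarts"] >= threshold]
    return sorted(flagged, key=lambda p: p["restarts"], reverse=True)


if __name__ == "__main__":
    for pod in noisy_pods(parse_rows(ROWS)):
        print(f'{pod["restarts"]:>4}  {pod["namespace"]}/{pod["pod"]} on {pod["node"]}')
```

With the four sample rows and the assumed cutoff of 50, only `master-disk-check` and `rocket-chat-ddp-streamer-664656df6b-g6hct` are flagged, in that order.
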
4. Storage Health
4.1 PVC Usage (Top Usage)
| Namespace | PVC Name | Capacity | Used | % | Status |
|---|---|---|---|---|---|
| infra | firecrawl-redis-56b494c988-xbpqh (Vol) | 78.7G | 29.7G | 39% | Check |
| infra | nocodb-5ff5dd9484-gq2sd (Vol) | 4.9G | 24K | 1% | Check |
| infra | postgres-f65fc7f79-mb5b5 (Vol) | 9.8G | 135M | 2% | Check |
| npm | nginx-proxy-manager-5dfcd6bcc8-pm5qp (Vol) | 2.0G | 39M | 2% | Check |
| apps | ghost-5b5f458c6d-cz4ns (Vol) | 9.8G | 2.7M | 1% | Check |
| apps | ghost-mysql-5b9d4d6b68-8mq5m (Vol) | 4.9G | 213M | 5% | Check |
4.2 PVs in Abnormal State
| Name | Status | Claim | StorageClass | Reason |
|---|---|---|---|---|
| pvc-8ea3c692-5ac4-4b37-a984-06ab16a5ac8b | Released | apps/n8n-pvc | local-path | - |
5. Recent Warning Events
| Time | Namespace | Object | Reason | Message |
|---|---|---|---|---|
| 2026-02-13T16:40:12Z | infra | Pod/postgres-backup-29510100-xrthd | Failed | Error: ImagePullBackOff |
| 2026-02-13T16:37:05Z | kube-system | Pod/master-disk-check | BackOff | Back-off restarting failed container check in pod master-disk-check_kube-system(d441b30a-fa5a-4139-bca3-5da6f77602ec) |
| 2026-02-13T16:20:14Z | infra | Pod/postgres-backup-29510100-xrthd | Failed | Failed to pull image "akria/backup-engine:latest": failed to pull and unpack image "docker.io/akria/backup-engine:latest": failed to resolve reference "docker.io/akria/backup-engine:latest": pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed |
| 2026-02-13T16:05:29Z | project-team-chat | Pod/rocket-chat-mongodb-0 | Unhealthy | Readiness probe failed: command timed out: "/bitnami/scripts/readiness-probe.sh" timed out after 5s |