Development Artist

[ArgoCD, Calico, k8s] ClusterInformation: connection is unauthorized: Unauthorized

JMcunst 2025. 2. 1. 13:38

Issue

The following event log showed up in ArgoCD.

error killing pod: failed to "KillPodSandbox" for "5434cdfd-cb12-45b8-980d-a1e1bcd5fb09" with KillPodSandboxError: "rpc error: code = Unknown desc = failed to destroy network for sandbox \"63901cdb6345abd4541257c68a19b517e4cb43aee13603648757ac1af89a4720\": plugin type=\"calico\" failed (delete): error getting ClusterInformation: connection is unauthorized: Unauthorized"

Possible causes

  • Insufficient Calico RBAC permissions
  • Calico Pod or DaemonSet in an abnormal state
  • Kubernetes API server problem
  • Calico CNI configuration problem

Resolution

Check the status of Calico-related resources

# kubectl get pods -n kube-system | grep calico
calico-kube-controllers-5947598c79-2ss67          1/1     Running     1              43d
calico-node-6lgcv                                 1/1     Running     0              12h
calico-node-fvpmk                                 1/1     Running     0              43d
calico-node-slw4p                                 1/1     Running     13 (10d ago)   11d
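
The RESTARTS column in the listing above is the quickest hint about which pod to look at first. With three pods it is easy to spot by eye; on a larger cluster a small filter helps. The following is only a sketch: `high_restarts` is a hypothetical helper name, and the default threshold of 5 is an arbitrary assumption. It reads `kubectl get pods` output on stdin and prints the name and restart count of calico pods at or above the threshold.

```shell
# high_restarts: hypothetical helper (not part of kubectl or Calico).
# Reads `kubectl get pods` output on stdin and prints "<pod> <restarts>" for
# calico pods whose RESTARTS column (4th field) is at or above the threshold.
# The default threshold of 5 is an arbitrary assumption.
high_restarts() {
  threshold="${1:-5}"
  awk -v t="$threshold" '($1 ~ /^calico/) && ($4 + 0 >= t) { print $1, $4 }'
}

# Usage against a live cluster:
#   kubectl get pods -n kube-system | high_restarts 5
```

Fed the listing above, this would print only the calico-node-slw4p line.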

calico-node-slw4p has a high restart count, so check its logs to see whether there is a network problem.

# kubectl logs calico-node-***** -n kube-system --tail=50
2025-01-31 02:33:34.244 [INFO][77] felix/int_dataplane.go 1900: Received *proto.HostMetadataV4V6Update update from calculation graph msg=hostname:"[MASKED]" ipv4_addr:"[MASKED]" labels:<key:"app" value:"[MASKED]" > labels:<key:"beta.kubernetes.io/arch" value:"amd64" > labels:<key:"beta.kubernetes.io/os" value:"linux" > labels:<key:"feature.node.kubernetes.io/cpu-cpuid.ADX" value:"true" > labels:<key:"feature.node.kubernetes.io/cpu-cpuid.AESNI" value:"true" > ... labels:<key:"feature.node.kubernetes.io/system-os_release.ID" value:"ubuntu" > labels:<key:"feature.node.kubernetes.io/system-os_release.VERSION_ID" value:"[MASKED]" > labels:<key:"kubernetes.io/arch" value:"amd64" > labels:<key:"kubernetes.io/hostname" value:"[MASKED]" > labels:<key:"kubernetes.io/os" value:"linux" > labels:<key:"microk8s.io/cluster" value:"true" > labels:<key:"node.kubernetes.io/microk8s-controlplane" value:"microk8s-controlplane" > labels:<key:"nvidia.com/cuda.driver-version.full" value:"[MASKED]" > labels:<key:"nvidia.com/gpu.memory" value:"[MASKED]" > labels:<key:"nvidia.com/gpu.product" value:"[MASKED]" > labels:<key:"nvidia.com/gpu.machine" value:"[MASKED]" > 

2025-01-31 02:33:40.286 [INFO][79] monitor-addresses/reachaddr.go 47: Auto-detected address by connecting to remote Destination="[MASKED]" IP=[MASKED]
2025-01-31 02:33:40.287 [INFO][79] monitor-addresses/autodetection_methods.go 143: Using autodetected IPv4 address [MASKED], detected by connecting to [MASKED]

2025-01-31 02:34:31.348 [INFO][77] felix/int_dataplane.go 1900: Received *proto.HostMetadataV4V6Update update from calculation graph msg=hostname:"[MASKED]" ipv4_addr:"[MASKED]" labels:<key:"beta.kubernetes.io/arch" value:"amd64" > labels:<key:"beta.kubernetes.io/os" value:"linux" > ... labels:<key:"feature.node.kubernetes.io/kernel-version.full" value:"[MASKED]" > labels:<key:"feature.node.kubernetes.io/system-os_release.VERSION_ID" value:"[MASKED]" > labels:<key:"kubernetes.io/hostname" value:"[MASKED]" > labels:<key:"nvidia.com/cuda.driver-version.full" value:"[MASKED]" > labels:<key:"nvidia.com/gpu.memory" value:"[MASKED]" > labels:<key:"nvidia.com/gpu.product" value:"[MASKED]" > labels:<key:"nvidia.com/gpu.machine" value:"[MASKED]" > 

2025-01-31 02:34:31.348 [INFO][77] felix/summary.go 100: Summarising 7 dataplane reconciliation loops over 1m0.4s: avg=9ms longest=16ms (resync-ipsets-v4)

Nothing looks particularly wrong, so filter for error entries only.

# kubectl logs calico-node-6lgcv -n kube-system | grep -i "error"
2025-01-30 14:07:08.257 [WARNING][69] felix/bpf_ep_mgr.go 4202: Failed to make sure that bpfin.cali/bpfout.cali device is (not) present. error=Link not found
2025-01-30 14:07:08.266 [INFO][69] felix/int_dataplane.go 2010: attempted to modprobe nf_conntrack_proto_sctp error=exit status 1 output=""
2025-01-31 02:38:13.305 [INFO][69] felix/route_table.go 1038: Failed to access interface because it doesn't exist. error=Link not found ifaceName="cali88be84acd7e" ifaceRegex="^cali.*" ipVersion=0x4 tableIndex=254
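
Note that grep -i "error" also matches INFO lines that merely carry an error= field, as two of the three hits above show. Filtering on the bracketed log level is a bit more precise. A minimal sketch, assuming every line starts with "date time [LEVEL]" as in the output above; `by_level` is a hypothetical helper name, and ERROR and FATAL are assumed to use the same bracketed form as the WARNING and INFO entries seen here.

```shell
# by_level: hypothetical helper (not part of kubectl or Calico).
# Reads calico-node log lines on stdin and keeps only those whose bracketed
# level is WARNING, ERROR or FATAL. Assumes the "date time [LEVEL]" prefix
# seen in the logs above.
by_level() {
  grep -E '^[0-9-]+ [0-9:.]+ \[(WARNING|ERROR|FATAL)\]'
}

# Usage against a live cluster:
#   kubectl logs calico-node-6lgcv -n kube-system | by_level
```

On the three lines above, only the bpf_ep_mgr.go WARNING survives this filter.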

Check whether the Calico network interfaces exist

ip link show | grep cali

Nothing shows up...

This means the Calico network interfaces (cali*) were not created.
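
For a check that can be scripted, the same test can be written against the one-line-per-interface output of `ip -o link show`. This is a sketch under assumptions: `list_cali_ifaces` is a hypothetical helper name, and the "cali" prefix is taken from the ifaceRegex="^cali.*" value in the log above. It has to be run on the node itself, since that is where these interfaces would live.

```shell
# list_cali_ifaces: hypothetical helper (not part of iproute2 or Calico).
# Reads `ip -o link show` output on stdin and prints the names of interfaces
# starting with "cali", with any "@ifN" peer suffix stripped.
list_cali_ifaces() {
  awk -F': ' '{ sub(/@.*/, "", $2); if ($2 ~ /^cali/) print $2 }'
}

# Usage on the node:
#   ip -o link show | list_cali_ifaces
#   [ -n "$(ip -o link show | list_cali_ifaces)" ] || echo "no cali* interfaces"
```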

Calico Restart

kubectl rollout restart daemonset calico-node -n kube-system
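
The restart command returns right away, so it can help to wait until every calico-node pod has actually been replaced before touching the workloads. A minimal sketch: `restart_calico` is a hypothetical wrapper name and the 180s default timeout is an arbitrary assumption; `kubectl rollout status` and its `--timeout` flag are standard kubectl.

```shell
# restart_calico: hypothetical wrapper around two standard kubectl subcommands.
# Restarts the calico-node DaemonSet, then blocks until the rollout finishes
# or the timeout expires. Namespace defaults to kube-system; the 180s default
# timeout is an arbitrary assumption.
restart_calico() {
  ns="${1:-kube-system}"
  timeout="${2:-180s}"
  kubectl rollout restart daemonset/calico-node -n "$ns" &&
    kubectl rollout status daemonset/calico-node -n "$ns" --timeout="$timeout"
}

# Usage:
#   restart_calico kube-system 180s
```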

After the restart, I deleted the ReplicaSet that owned the Pods stuck in Pending. A new ReplicaSet was created and its Pods came up normally!
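
To find which ReplicaSet to delete, the Pending Pods can be listed together with their owner. This is a rough sketch, not the exact commands used above: `pending_owners` is a hypothetical helper name, the namespace is whatever the affected application runs in, and only the first owner reference of each Pod is printed.

```shell
# pending_owners: hypothetical helper built on standard kubectl flags.
# Lists Pending pods in the given namespace as "<pod><TAB><ownerKind>/<ownerName>",
# using only the first owner reference of each pod.
pending_owners() {
  kubectl get pods -n "$1" --field-selector=status.phase=Pending \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.ownerReferences[0].kind}{"/"}{.metadata.ownerReferences[0].name}{"\n"}{end}'
}

# Usage (namespace and ReplicaSet name are placeholders):
#   pending_owners <namespace>
#   kubectl delete replicaset <replicaset-name> -n <namespace>
```

If the ReplicaSet belongs to a Deployment, the Deployment controller recreates it after deletion, which matches the behavior described above.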
