15 Kubernetes Error

Stuck in ContainerCreating

Checking with kubectl describe pods shows the following error:

  Warning  FailedCreatePodSandBox  90s               kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "49a5d016c6aacfbea51d08b00f0edef8575396ba4843294ee176269bdc2d4132": failed to delegate add: failed to set bridge addr: "cni0" already has an IP address different from 10.244.6.1/24

Fix

Rebooting the node once resolves it.
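If a full reboot is inconvenient, deleting the stale cni0 bridge on the affected node is a commonly used alternative; the CNI plugin recreates the bridge with the correct subnet. This is a sketch, assuming flannel manages cni0 on this node:

```shell
# Run on the affected node (assumption: flannel/CNI recreates cni0 on restart).
ip link set cni0 down      # take the stale bridge down
ip link delete cni0        # remove it so the CNI plugin can recreate it
systemctl restart kubelet  # re-trigger pod sandbox creation
```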

Unable to connect to the server: x509

Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")

Fix

Run the following:

mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
unset KUBECONFIG
export KUBECONFIG=/etc/kubernetes/admin.conf
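After setting up the kubeconfig, a quick way to confirm the x509 error is gone (a sketch; assumes the commands above succeeded):

```shell
kubectl cluster-info   # should print the control plane URL without x509 errors
kubectl get nodes      # listing nodes confirms the client can authenticate
```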

rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService

root@g-master:~# kubeadm init --pod-network-cidr=10.224.0.0/16
[init] Using Kubernetes version: v1.23.5
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR CRI]: container runtime is not running: output: time="2022-04-19T02:14:43Z" level=fatal msg="getting status of runtime: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

Fix

rm /etc/containerd/config.toml
systemctl restart containerd
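Deleting config.toml works because the packaged file disables the CRI plugin; instead of leaving containerd unconfigured, you can regenerate a full default config. A sketch (the SystemdCgroup edit is only needed when kubelet uses the systemd cgroup driver):

```shell
# Regenerate containerd's default configuration, with the CRI plugin enabled.
containerd config default > /etc/containerd/config.toml
# Optional: match kubelet's systemd cgroup driver.
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
systemctl restart containerd
```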

Warning FailedScheduling that the pod didn't tolerate

  Warning  FailedScheduling  63s   default-scheduler  0/4 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 3 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.

Fix 1

A taint may be set on the node.

This is the taint that prevents scheduling on the master.
Appending a hyphen "-" after the taint removes it (untaints the node).

# kubectl describe node g-master | grep -i taint
Taints:             node-role.kubernetes.io/master:NoSchedule

# kubectl taint node g-master node-role.kubernetes.io/master:NoSchedule-

# kubectl describe node g-master | grep -i taint
Taints:             node.kubernetes.io/not-ready:NoSchedule
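To list the taints on every node at once, instead of describing nodes one by one, a jsonpath query like this works:

```shell
# Print each node name followed by its taints (empty if none).
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'
```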

Fix 2

The STATUS is NotReady; pods will not be scheduled onto nodes in this state.

# kubectl get node
NAME       STATUS     ROLES                  AGE    VERSION
g-master   NotReady   control-plane,master   147m   v1.23.5
g-work01   NotReady   <none>                 146m   v1.23.5
g-work02   NotReady   <none>                 146m   v1.23.5
g-work03   NotReady   <none>                 52m    v1.23.5

Flannel may not have been applied yet:

# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
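Once the manifest is applied, the flannel pods should reach Running and the nodes should flip to Ready. A sketch for checking (the app=flannel label matches the kube-flannel.yml above, but verify it against your manifest):

```shell
# Check that the flannel DaemonSet pods are up.
kubectl get pods -n kube-system -l app=flannel
# Nodes should move from NotReady to Ready once the CNI is in place.
kubectl get nodes
```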

Initial timeout of 40s passed.

The kubelet is not running.

[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

	Unfortunately, an error has occurred:
		timed out waiting for the condition

	This error is likely caused by:
		- The kubelet is not running
		- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

	If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
		- 'systemctl status kubelet'
		- 'journalctl -xeu kubelet'

	Additionally, a control plane component may have crashed or exited when started by the container runtime.
	To troubleshoot, list all containers using your preferred container runtimes CLI.

	Here is one example how you may list all Kubernetes containers running in cri-o/containerd using crictl:
		- 'crictl --runtime-endpoint /run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
		Once you have found the failing container, you can inspect its logs with:
		- 'crictl --runtime-endpoint /run/containerd/containerd.sock logs CONTAINERID'

error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

Fix 1

systemctl start kubelet

Fix 2

If the kubelet log shows entries like the following:

# systemctl status kubelet
Apr 19 22:58:16 g-master02 kubelet[20595]: E0419 22:58:16.331432   20595 kubelet.go:2422] "Error getting node" err="node \"g-master02\" not found"

This error occurs when --control-plane-endpoint is specified, and it likely means the VIP is not up.

Without the VIP, the master host cannot be reached at the control-plane endpoint, which is why the node is reported as not found.

Try assigning the VIP, for example with:

ip addr add [VIP] dev [Eth]
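A concrete version of the ip addr add command above, with hypothetical values (192.168.10.100/24 as the VIP and eth0 as the interface; substitute your own):

```shell
# Hypothetical VIP and interface; replace with your environment's values.
ip addr add 192.168.10.100/24 dev eth0
# Confirm the address is attached.
ip addr show dev eth0
```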
06_virtualization/05_container/15_kubernetes_error.1650426849.txt.gz · Last modified: 2022/04/20 12:54 by matsui