
15 Kubernetes Error

Pod stuck in ContainerCreating

kubectl describe pods shows the following error:

  Warning  FailedCreatePodSandBox  90s               kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "49a5d016c6aacfbea51d08b00f0edef8575396ba4843294ee176269bdc2d4132": failed to delegate add: failed to set bridge addr: "cni0" already has an IP address different from 10.244.6.1/24

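Fix

Rebooting the node once resolves it. The message indicates that the cni0 bridge still holds an address from a different subnet than the one now expected on the node (10.244.6.1/24).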
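
As a possible alternative to a full reboot, the stale bridge can be removed so that the CNI plugin recreates it. The sketch below is an assumption and not part of the original notes: the function names expected_cni0_addr and reset_cni0 are hypothetical, and reset_cni0 has to run as root on the affected node.

```shell
# Print the address the CNI plugin expects on cni0, parsed from a
# FailedCreatePodSandBox message read on stdin.
expected_cni0_addr() {
  sed -n 's|.*already has an IP address different from \([0-9./]*\).*|\1|p'
}

# Assumption: deleting cni0 lets the CNI plugin recreate it with the right address.
# Run as root on the affected node; pods there lose networking briefly.
reset_cni0() {
  ip link set cni0 down
  ip link delete cni0
  systemctl restart kubelet
}
```

Piping kubectl describe pod output into expected_cni0_addr prints the expected address, which can be compared with ip -4 addr show cni0 before deciding whether to call reset_cni0.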

Unable to connect to the server: x509

Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")

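Fix

Run the following:

# Copy the admin kubeconfig into the current user's home directory
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
# Clear any stale KUBECONFIG, then point it at the cluster's admin.conf
unset KUBECONFIG
export KUBECONFIG=/etc/kubernetes/admin.conf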
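
This x509 error typically means the kubeconfig in use carries a CA from an earlier cluster initialization. As a minimal sketch for confirming that before or after copying admin.conf, the hypothetical helpers below compare the CA embedded in a kubeconfig with a CA file. They assume the kubeconfig stores the CA inline as certificate-authority-data, as kubeadm's admin.conf does.

```shell
# Print the CA certificate (PEM) embedded in a kubeconfig file.
# $1: path to the kubeconfig (for example $HOME/.kube/config)
kubeconfig_ca_pem() {
  awk '/certificate-authority-data:/ { print $2; exit }' "$1" | base64 -d
}

# Succeed only if the kubeconfig's embedded CA matches the given CA file.
# $1: kubeconfig path, $2: CA file (on kubeadm clusters, /etc/kubernetes/pki/ca.crt)
kubeconfig_ca_matches() {
  [ "$(kubeconfig_ca_pem "$1")" = "$(cat "$2")" ]
}
```

For example, kubeconfig_ca_matches $HOME/.kube/config /etc/kubernetes/pki/ca.crt returning non-zero on the control-plane node suggests the kubeconfig is stale.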

rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService

root@g-master:~# kubeadm init --pod-network-cidr=10.224.0.0/16
[init] Using Kubernetes version: v1.23.5
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR CRI]: container runtime is not running: output: time="2022-04-19T02:14:43Z" level=fatal msg="getting status of runtime: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

Fix

rm /etc/containerd/config.toml
systemctl restart containerd
kubeadm reset
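
A common cause of this error is a packaged config.toml that lists "cri" under disabled_plugins, which would explain why removing the file helps. The check below is a sketch under that assumption; cri_plugin_disabled is a hypothetical helper name, and the path passed to it is whatever config file containerd actually reads on the node.

```shell
# Succeed if the containerd config lists "cri" under disabled_plugins.
# $1: config path (commonly /etc/containerd/config.toml)
cri_plugin_disabled() {
  grep -Eq '^[[:space:]]*disabled_plugins[[:space:]]*=.*"cri"' "$1"
}
```

If cri_plugin_disabled /etc/containerd/config.toml succeeds, the CRI service is switched off in that file, and removing the file (or taking "cri" out of the list) followed by a containerd restart should bring it back.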

Warning FailedScheduling that the pod didn't tolerate

  Warning  FailedScheduling  63s   default-scheduler  0/4 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 3 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.

Fix 1

The node may have a taint.

The node-role.kubernetes.io/master taint keeps pods from being scheduled on the master.
Appending a hyphen (-) to the taint removes it.

# kubectl describe node g-master | grep -i taint
Taints:             node-role.kubernetes.io/master:NoSchedule

# kubectl taint node g-master node-role.kubernetes.io/master:NoSchedule-

# kubectl describe node g-master | grep -i taint
Taints:             node.kubernetes.io/not-ready:NoSchedule
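
To see at a glance which taints are blocking a pod, the taint keys can be pulled out of the FailedScheduling message. This is only an illustrative sketch; blocking_taints is a hypothetical helper that reads the event text on stdin and assumes the message format shown above.

```shell
# Print each taint key named in a FailedScheduling message (stdin),
# once each, in order of first appearance.
blocking_taints() {
  grep -o 'taint {[^:}]*' | sed 's/^taint {//' | awk '!seen[$0]++'
}
```

For example, kubectl describe pod output can be piped into blocking_taints. For the event above it lists node-role.kubernetes.io/master and node.kubernetes.io/not-ready, which matches what remains after the master taint is removed: the not-ready taint stays until the node becomes Ready.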

Fix 2

STATUS is NotReady, and pods are not scheduled onto nodes in that state.

# kubectl get node
NAME       STATUS     ROLES                  AGE    VERSION
g-master   NotReady   control-plane,master   147m   v1.23.5
g-work01   NotReady   <none>                 146m   v1.23.5
g-work02   NotReady   <none>                 146m   v1.23.5
g-work03   NotReady   <none>                 52m    v1.23.5

flannel may not have been applied.

# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
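
After applying flannel, it helps to confirm that the nodes actually leave NotReady. A small sketch: not_ready_nodes is a hypothetical filter over kubectl get node output that prints the nodes still not Ready.

```shell
# Print the names of nodes whose STATUS does not start with "Ready",
# reading `kubectl get node` output (with its header line) on stdin.
not_ready_nodes() {
  awk 'NR > 1 && $2 !~ /^Ready/ { print $1 }'
}
```

Running kubectl get node | not_ready_nodes should print nothing once the network plugin is up. If nodes are still listed, kubectl get pods -A | grep flannel shows whether the flannel pods themselves are running.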

Initial timeout of 40s passed.

kubelet is not running.

[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

	Unfortunately, an error has occurred:
		timed out waiting for the condition

	This error is likely caused by:
		- The kubelet is not running
		- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

	If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
		- 'systemctl status kubelet'
		- 'journalctl -xeu kubelet'

	Additionally, a control plane component may have crashed or exited when started by the container runtime.
	To troubleshoot, list all containers using your preferred container runtimes CLI.

	Here is one example how you may list all Kubernetes containers running in cri-o/containerd using crictl:
		- 'crictl --runtime-endpoint /run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
		Once you have found the failing container, you can inspect its logs with:
		- 'crictl --runtime-endpoint /run/containerd/containerd.sock logs CONTAINERID'

error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

Fix 1

systemctl start kubelet
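
A slightly more defensive variant of the same step is sketched below. ensure_kubelet_running is a hypothetical helper; it assumes a systemd host, enables kubelet so that it also starts on boot, and prints recent kubelet logs if the start fails.

```shell
# Start kubelet if it is not active, and show recent logs when that fails.
ensure_kubelet_running() {
  if systemctl is-active --quiet kubelet; then
    echo "kubelet is active"
  else
    systemctl enable --now kubelet || {
      journalctl -u kubelet -n 20 --no-pager
      return 1
    }
    echo "kubelet started"
  fi
}
```

If the function returns non-zero, the journal output it prints is the place to look before retrying kubeadm init.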

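Fix 2

This applies when kubelet logs a "node not found" error like the one shown below.

The error occurs when --control-plane-endpoint is specified, and the likely cause is that the VIP is not up.

Without the VIP, the master host cannot be reached through the control-plane-endpoint, which appears to be why the node is reported as not found.

Try assigning the VIP first, for example with the following:

ip addr add [VIP] dev [Eth]

Note: it should be enough to add just the VIP by hand first and configure it properly later with ipvsadm or similar.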
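
Before rerunning kubeadm, it can be worth checking whether the VIP is actually present on the host. The following is a minimal sketch; vip_is_assigned is a hypothetical helper, and the address passed to it stands for whatever was given to --control-plane-endpoint.

```shell
# Succeed if the given IPv4 address is assigned to any local interface.
# $1: the VIP to look for
vip_is_assigned() {
  ip -o -4 addr show | awk -v vip="$1" '
    { split($4, a, "/"); if (a[1] == vip) found = 1 }
    END { exit !found }'
}
```

For example, vip_is_assigned [VIP] || ip addr add [VIP] dev [Eth] adds the address only when it is missing.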

# systemctl status kubelet
node \"g-master02\" not found"
Apr 19 22:58:16 g-master02 kubelet[20595]: E0419 22:58:16.331432   20595 kubelet.go:2422] "Error getting node" err="node \"g-master02\" not found"

failure loading certificate for CA: couldn't load the certificate file

W0421 22:57:17.689558    6442 utils.go:69] The recommended value for "resolvConf" in "KubeletConfiguration" is: /run/systemd/resolve/resolv.conf; the provided value is: /run/systemd/resolve/resolv.conf
error execution phase preflight: 
One or more conditions for hosting a new control plane instance is not satisfied.

[failure loading certificate for CA: couldn't load the certificate file /etc/kubernetes/pki/ca.crt: open /etc/kubernetes/pki/ca.crt: no such file or directory, failure loading key for service account: couldn't load the private key file /etc/kubernetes/pki/sa.key: open /etc/kubernetes/pki/sa.key: no such file or directory, failure loading certificate for front-proxy CA: couldn't load the certificate file /etc/kubernetes/pki/front-proxy-ca.crt: open /etc/kubernetes/pki/front-proxy-ca.crt: no such file or directory, failure loading certificate for etcd CA: couldn't load the certificate file /etc/kubernetes/pki/etcd/ca.crt: open /etc/kubernetes/pki/etcd/ca.crt: no such file or directory]

Please ensure that:
* The cluster has a stable controlPlaneEndpoint address.
* The certificates that must be shared among control plane instances are provided.


To see the stack trace of this error execute with --v=5 or higher

Fix

This error appears when joining a Kubernetes control-plane node.

Recreating the master join token allows the join to succeed.
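
The preflight message says the shared certificates were not provided, so one way to recreate the join command is to upload the certificates again and generate a fresh token together. The sketch below assumes it runs as root on an existing control-plane node and that the certificate key is the last line printed by the upload-certs phase; control_plane_join_command is a hypothetical helper name.

```shell
# Print a control-plane join command using a fresh token and certificate key.
control_plane_join_command() {
  # Re-upload the control-plane certificates; the key is the last output line.
  cert_key="$(kubeadm init phase upload-certs --upload-certs | tail -n 1)"
  # Create a new bootstrap token and print the matching worker join command.
  join_cmd="$(kubeadm token create --print-join-command)"
  printf '%s --control-plane --certificate-key %s\n' "$join_cmd" "$cert_key"
}
```

The printed command is then run on the node that should join as a control plane. Uploaded certificates and tokens expire, so the command has to be regenerated if the join is attempted much later.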

06_virtualization/05_container/15_kubernetes_error.txt · Last modified: 2022/05/24 15:29 by matsui