Running kubectl describe pods shows the following error:
Warning FailedCreatePodSandBox 90s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "49a5d016c6aacfbea51d08b00f0edef8575396ba4843294ee176269bdc2d4132": failed to delegate add: failed to set bridge addr: "cni0" already has an IP address different from 10.244.6.1/24
Rebooting the node once fixes it.
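As a concrete sequence (a sketch; the drain/uncordon steps are my addition and assume the node's pods can be evicted, using g-work01 from the cluster below as the affected node):

kubectl drain g-work01 --ignore-daemonsets   # evict pods before the reboot
# on g-work01 itself; the reboot recreates the cni0 bridge with the right address
reboot
# back on the master, once the node is Ready again
kubectl uncordon g-work01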
Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
Run the following:
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
unset KUBECONFIG
export KUBECONFIG=/etc/kubernetes/admin.conf
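Afterwards, a quick check that kubectl now trusts the cluster CA:

kubectl get node   # should list the nodes instead of the x509 error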
root@g-master:~# kubeadm init --pod-network-cidr=10.224.0.0/16
[init] Using Kubernetes version: v1.23.5
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR CRI]: container runtime is not running: output: time="2022-04-19T02:14:43Z" level=fatal msg="getting status of runtime: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
rm /etc/containerd/config.toml
systemctl restart containerd
kubeadm reset
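Deleting the file works because the packaged config.toml usually ships with the CRI plugin disabled. If you would rather keep the file, a minimal sketch assuming the stock config contains disabled_plugins = ["cri"]:

sed -i 's/^disabled_plugins = \["cri"\]/#&/' /etc/containerd/config.toml   # comment out the CRI disable line
systemctl restart containerd
crictl --runtime-endpoint unix:///run/containerd/containerd.sock info     # the runtime should answer now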
Warning FailedScheduling 63s default-scheduler 0/4 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 3 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
A taint may be attached to the nodes.
This one is the taint that says pods must not be scheduled on the master.
Appending a hyphen (-) to the end of the taint removes (untaints) it.
# kubectl describe node g-master | grep -i taint
Taints:             node-role.kubernetes.io/master:NoSchedule
# kubectl taint node g-master node-role.kubernetes.io/master:NoSchedule-
# kubectl describe node g-master | grep -i taint
Taints:             node.kubernetes.io/not-ready:NoSchedule
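To check the taints on every node at once, one option is a jsonpath query (a sketch):

kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'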
STATUS is NotReady; in this state nothing gets scheduled.
# kubectl get node
NAME       STATUS     ROLES                  AGE    VERSION
g-master   NotReady   control-plane,master   147m   v1.23.5
g-work01   NotReady   <none>                 146m   v1.23.5
g-work02   NotReady   <none>                 146m   v1.23.5
g-work03   NotReady   <none>                 52m    v1.23.5
flannel may not have been applied yet.
# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
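To confirm it took effect (assuming this manifest, which deploys flannel into kube-system with the label app=flannel):

kubectl -n kube-system get pods -l app=flannel -o wide   # one flannel pod per node, all Running
kubectl get node                                         # nodes should turn Ready shortly after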
The kubelet is not running.
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

	Unfortunately, an error has occurred:
		timed out waiting for the condition

	This error is likely caused by:
		- The kubelet is not running
		- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

	If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
		- 'systemctl status kubelet'
		- 'journalctl -xeu kubelet'

	Additionally, a control plane component may have crashed or exited when started by the container runtime.
	To troubleshoot, list all containers using your preferred container runtimes CLI.
	Here is one example how you may list all Kubernetes containers running in cri-o/containerd using crictl:
		- 'crictl --runtime-endpoint /run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
		Once you have found the failing container, you can inspect its logs with:
		- 'crictl --runtime-endpoint /run/containerd/containerd.sock logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
systemctl start kubelet
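Enabling it as well avoids hitting the same thing after a reboot, and journalctl shows why it was down (standard systemd commands):

systemctl enable --now kubelet                   # start now and on every boot
journalctl -xu kubelet --no-pager | tail -n 20   # see why it was not running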
If a log like the one below keeps appearing:
This is an error that occurs when "--control-plane-endpoint" is specified, and it most likely means the VIP is not up.
Without the VIP the master host cannot be reached at the specified control-plane-endpoint, which is apparently why it reports "not found".
Attach the VIP once, for example as follows, and try again:
ip addr add [VIP] dev [Eth]
Note: I think it is fine to bring up just the VIP first this way and provide the proper VIP later with ipvsadm or the like.
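For example, with a hypothetical VIP of 192.168.1.100/24 on eth0:

ip addr add 192.168.1.100/24 dev eth0   # hypothetical VIP and interface
ip addr show dev eth0                   # confirm the address is attached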
# systemctl status kubelet
...
Apr 19 22:58:16 g-master02 kubelet[20595]: E0419 22:58:16.331432   20595 kubelet.go:2422] "Error getting node" err="node \"g-master02\" not found"
W0421 22:57:17.689558    6442 utils.go:69] The recommended value for "resolvConf" in "KubeletConfiguration" is: /run/systemd/resolve/resolv.conf; the provided value is: /run/systemd/resolve/resolv.conf
error execution phase preflight:
One or more conditions for hosting a new control plane instance is not satisfied.

[failure loading certificate for CA: couldn't load the certificate file /etc/kubernetes/pki/ca.crt: open /etc/kubernetes/pki/ca.crt: no such file or directory, failure loading key for service account: couldn't load the private key file /etc/kubernetes/pki/sa.key: open /etc/kubernetes/pki/sa.key: no such file or directory, failure loading certificate for front-proxy CA: couldn't load the certificate file /etc/kubernetes/pki/front-proxy-ca.crt: open /etc/kubernetes/pki/front-proxy-ca.crt: no such file or directory, failure loading certificate for etcd CA: couldn't load the certificate file /etc/kubernetes/pki/etcd/ca.crt: open /etc/kubernetes/pki/etcd/ca.crt: no such file or directory]

Please ensure that:
* The cluster has a stable controlPlaneEndpoint address.
* The certificates that must be shared among control plane instances are provided.

To see the stack trace of this error execute with --v=5 or higher
An error that appears when joining a Kubernetes control-plane (controller) node.
Recreating the master join token lets the node join.
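A sketch of regenerating the join command on an existing master, using kubeadm's standard subcommands (the placeholders are whatever your cluster prints):

kubeadm init phase upload-certs --upload-certs   # re-uploads the shared certs and prints a certificate key
kubeadm token create --print-join-command        # prints a fresh worker join command
# for a control-plane join, append the control-plane flags to the printed command:
#   kubeadm join <endpoint> --token <token> --discovery-token-ca-cert-hash <hash> \
#     --control-plane --certificate-key <certificate-key>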