Initialize the Kubernetes Cluster and Join Nodes
Master Node⌗
Before initializing the cluster, kubeadm pulls the required images from Google's container registry. If you cannot reach it directly, see 《准备 Kubernetes 集群环境》 (Preparing the Kubernetes Cluster Environment).
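If you want to confirm that the images are reachable (or simply save time during init), kubeadm can list and pre-pull them ahead of time; for example:
# List the images required for this Kubernetes version
kubeadm config images list --kubernetes-version v1.25.2
# Pre-pull them so 'kubeadm init' does not have to wait on the network
sudo kubeadm config images pull --kubernetes-version v1.25.2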
sudo kubeadm init \
--apiserver-advertise-address 10.0.8.81 \
--apiserver-bind-port 6443 \
--control-plane-endpoint cluster-endpoint \
--kubernetes-version v1.25.2 \
--service-cidr 10.96.0.0/16 \
--pod-network-cidr 192.168.0.0/16
[init] Using Kubernetes version: v1.25.2
[preflight] Running pre-flight checks
[WARNING HTTPProxy]: Connection to "https://10.0.8.81" uses proxy "http://10.0.8.18:8234". If that is not intended, adjust your proxy settings
[WARNING HTTPProxyCIDR]: connection to "192.168.0.0/16" uses proxy "http://10.0.8.18:8234". This may lead to malfunctional cluster setup. Make sure that Pod and Services IP ranges specified correctly as exceptions in proxy configuration
[WARNING SystemVerification]: missing optional cgroups: blkio
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [cluster-endpoint kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local s01] and IPs [10.96.0.1 10.0.8.81]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost s01] and IPs [10.0.8.81 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost s01] and IPs [10.0.8.81 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 27.503677 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node s01 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node s01 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: aog7zw.pigdvq7fzg1e4y5w
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:
kubeadm join cluster-endpoint:6443 --token aog7zw.pigdvq7fzg1e4y5w \
--discovery-token-ca-cert-hash sha256:09149deed5c5697105c73c64168dd5d2e2e92fc565e94c04a61792f8012e514c \
--control-plane
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join cluster-endpoint:6443 --token aog7zw.pigdvq7fzg1e4y5w \
--discovery-token-ca-cert-hash sha256:09149deed5c5697105c73c64168dd5d2e2e92fc565e94c04a61792f8012e514c
Be sure to save the output of a successful cluster initialization; the token in it will be needed again when joining nodes! By default, a join token is valid for 24 hours. After it expires, you can regenerate the join command with:
kubeadm token create --print-join-command
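You can also check whether an existing token is still valid by listing the bootstrap tokens on the Master:
kubeadm token list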
Parameter explanations:
- --apiserver-advertise-address 10.0.8.81: the IP address the API server advertises as listening on. If unset, the default network interface is used.
- --apiserver-bind-port 6443: the port the API server binds to.
- --control-plane-endpoint cluster-endpoint: a stable IP address or DNS name for the control plane.
- --kubernetes-version v1.25.2: the specific Kubernetes version to use for the control plane.
- --service-cidr 10.96.0.0/16: an alternative IP range for Service virtual IPs (kubeadm's default is 10.96.0.0/12).
- --pod-network-cidr 192.168.0.0/16: the IP range the Pod network may use. When this flag is set, the control plane automatically allocates a CIDR to every node. 192.168.0.0/16 is used here because it matches the default pool in Calico's manifest, installed later.
The service-cidr and pod-network-cidr ranges must not overlap each other, and neither may overlap the network that apiserver-advertise-address belongs to; otherwise you may run into some strange problems.
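If you are ever unsure which defaults kubeadm itself would pick for these settings, you can print its default init configuration (output not shown here):
kubeadm config print init-defaults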
You can see the two proxy warnings (HTTPProxy and HTTPProxyCIDR) near the top of the output above. Because installing kubeadm, kubectl, and kubelet requires access to Google's package sources, I had set a proxy in the VM's SSH session with the following command:
export https_proxy=http://10.0.8.18:8234;export http_proxy=http://10.0.8.18:8234;export all_proxy=socks5://10.0.8.18:8235;export no_proxy=cluster-master,cluster-endpoint,10.96.0.1,localhost,127.0.0.1,::1
This produced a few warnings during the preflight checks. They do not affect cluster initialization and can be ignored, but it is still recommended to unset these temporary environment variables before initializing:
unset https_proxy http_proxy all_proxy no_proxy
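A quick way to confirm that no proxy variables are left in the current shell:
env | grep -i proxy || echo "no proxy variables set"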
Copy the Kubernetes Config File⌗
Run the commands from the "To start using your cluster" block of the successful init output above:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
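With the kubeconfig in place, kubectl should be able to reach the cluster. A quick sanity check:
kubectl cluster-info
kubectl get nodes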
At this point, if you look at the Pods Kubernetes has started, CoreDNS should be in the Pending state; it will not start properly until a CNI is installed.
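For example, the CoreDNS Pods (they carry the k8s-app=kube-dns label) can be checked directly:
kubectl get pods -n kube-system -l k8s-app=kube-dns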
Install the Calico CNI⌗
The CNI implementation I use here is Calico, currently at v3.24.1. See the official documentation for more details:
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.24.1/manifests/tigera-operator.yaml
If the pod-network-cidr you passed at cluster initialization is 192.168.0.0/16 (the range Calico's manifest assumes by default), you can apply the custom resources directly:
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.24.1/manifests/custom-resources.yaml
Otherwise, download the custom-resources.yaml manifest and replace the CIDR setting in it by hand:
curl -O https://raw.githubusercontent.com/projectcalico/calico/v3.24.1/manifests/custom-resources.yaml
cat custom-resources.yaml
# This section includes base Calico installation configuration.
# For more information, see: https://projectcalico.docs.tigera.io/master/reference/installation/api#operator.tigera.io/v1.Installation
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  # Configures Calico networking.
  calicoNetwork:
    # Note: The ipPools section cannot be modified post-install.
    ipPools:
    - blockSize: 26
      cidr: 192.168.0.0/16
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()
---
# This section configures the Calico API server.
# For more information, see: https://projectcalico.docs.tigera.io/master/reference/installation/api#operator.tigera.io/v1.APIServer
apiVersion: operator.tigera.io/v1
kind: APIServer
metadata:
  name: default
spec: {}
Change the cidr value under ipPools so that it matches the pod-network-cidr you used when initializing the cluster (a non-interactive alternative is sketched right after the next command), then apply it:
kubectl create -f custom-resources.yaml
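For reference, the same edit can be done non-interactively before applying the manifest. A minimal sketch, assuming your pod-network-cidr were 10.244.0.0/16 (that value is only an example; substitute your own):
# Rewrite the default Calico pool CIDR in place (example target CIDR)
sed -i 's|cidr: 192.168.0.0/16|cidr: 10.244.0.0/16|' custom-resources.yaml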
If you previously configured a proxy for Containerd, be sure to remove that proxy configuration after applying custom-resources.yaml; otherwise the Calico plugin will be unable to reach the Kubernetes Service API to fetch cluster information, and its Pods will fail to start!
Check the Cluster's Pod Status⌗
Once Calico is installed, use the following command to watch the cluster's Pods (refreshed every two seconds):
watch kubectl get pods -A
Every 2.0s: kubectl get pods -A s01: Mon Oct 3 03:31:39 2022
NAMESPACE NAME READY STATUS RESTARTS AGE
calico-apiserver calico-apiserver-5c5d497dbc-cxb5q 1/1 Running 1 (10m ago) 14h
calico-apiserver calico-apiserver-5c5d497dbc-gbh6m 1/1 Running 1 (10m ago) 14h
calico-system calico-kube-controllers-85666c5b94-h7gnj 1/1 Running 1 (10m ago) 14h
calico-system calico-node-6qrfk 1/1 Running 1 (10m ago) 14h
calico-system calico-typha-b84cfb796-ctzx2 1/1 Running 2 (10m ago) 14h
calico-system calico-typha-b84cfb796-w9t7k 1/1 Running 1 (10m ago) 14h
calico-system csi-node-driver-c94pg 2/2 Running 2 (10m ago) 14h
kube-system coredns-565d847f94-fm848 1/1 Running 1 (10m ago) 15h
kube-system coredns-565d847f94-tbhr2 1/1 Running 1 (10m ago) 15h
kube-system etcd-s01 1/1 Running 1 (10m ago) 15h
kube-system kube-apiserver-s01 1/1 Running 1 (10m ago) 15h
kube-system kube-controller-manager-s01 1/1 Running 1 (10m ago) 15h
kube-system kube-proxy-kmvzb 1/1 Running 1 (10m ago) 15h
kube-system kube-scheduler-s01 1/1 Running 1 (10m ago) 15h
tigera-operator tigera-operator-6675dc47f4-7w8gm 1/1 Running 1 (10m ago) 14h
When every Pod shows a STATUS of Running, the Kubernetes Master node has been initialized and is fully up.
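Since Calico was installed through the Tigera operator, its component health can also be checked via the TigeraStatus objects the operator publishes (a quick check; wait until everything reports Available):
kubectl get tigerastatus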
Join Worker Nodes⌗
Worker nodes are configured in much the same way as the Master node; see 《准备 Kubernetes 集群环境》 (Preparing the Kubernetes Cluster Environment).
After installing Containerd, kubeadm, and the other CLI tools, run the join command that the Master node printed at the end of initialization:
kubeadm join cluster-endpoint:6443 --token aog7zw.pigdvq7fzg1e4y5w \
--discovery-token-ca-cert-hash sha256:09149deed5c5697105c73c64168dd5d2e2e92fc565e94c04a61792f8012e514c
The node will then pull the required images and start its Pods. You can watch the Pods from the Master node; once they are all in the Running state, the node has joined successfully:
watch kubectl get pods -A
Every 2.0s: kubectl get pods -A s01: Mon Oct 3 03:31:39 2022
NAMESPACE NAME READY STATUS RESTARTS AGE
calico-apiserver calico-apiserver-5c5d497dbc-cxb5q 1/1 Running 1 (26m ago) 15h
calico-apiserver calico-apiserver-5c5d497dbc-gbh6m 1/1 Running 1 (26m ago) 15h
calico-system calico-kube-controllers-85666c5b94-h7gnj 1/1 Running 1 (26m ago) 15h
calico-system calico-node-6qrfk 1/1 Running 1 (26m ago) 14h
calico-system calico-node-cnwdl 1/1 Running 1 (26m ago) 15h
calico-system calico-node-w8p2h 1/1 Running 1 (26m ago) 14h
calico-system calico-typha-b84cfb796-ctzx2 1/1 Running 2 (26m ago) 14h
calico-system calico-typha-b84cfb796-w9t7k 1/1 Running 1 (26m ago) 15h
calico-system csi-node-driver-c94pg 2/2 Running 2 (26m ago) 15h
calico-system csi-node-driver-jmmg2 2/2 Running 2 (26m ago) 14h
calico-system csi-node-driver-qmvpx 2/2 Running 2 (26m ago) 14h
kube-system coredns-565d847f94-fm848 1/1 Running 1 (26m ago) 15h
kube-system coredns-565d847f94-tbhr2 1/1 Running 1 (26m ago) 15h
kube-system etcd-s01 1/1 Running 1 (26m ago) 15h
kube-system kube-apiserver-s01 1/1 Running 1 (26m ago) 15h
kube-system kube-controller-manager-s01 1/1 Running 1 (26m ago) 15h
kube-system kube-proxy-kmvzb 1/1 Running 1 (26m ago) 15h
kube-system kube-proxy-w6swd 1/1 Running 1 (26m ago) 14h
kube-system kube-proxy-x7z96 1/1 Running 1 (26m ago) 14h
kube-system kube-scheduler-s01 1/1 Running 1 (26m ago) 15h
tigera-operator tigera-operator-6675dc47f4-7w8gm 1/1 Running 1 (26m ago) 15h
As shown above, this is the full list of Pods after I joined two more nodes; each node runs its own calico-node and csi-node-driver Pods.
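To see which node each of these per-node Pods landed on, -o wide can be appended (output omitted here):
kubectl get pods -n calico-system -o wide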
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
s01 Ready control-plane 15h v1.25.2 10.0.8.81 <none> Ubuntu 22.04.1 LTS 5.15.0-48-generic containerd://1.6.8
s02 Ready <none> 14h v1.25.2 10.0.8.82 <none> Ubuntu 22.04.1 LTS 5.15.0-48-generic containerd://1.6.8
s03 Ready <none> 14h v1.25.2 10.0.8.83 <none> Ubuntu 22.04.1 LTS 5.15.0-48-generic containerd://1.6.8
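The worker nodes show <none> under ROLES. If you prefer them to be labeled as workers (purely cosmetic, but it makes the output easier to read), you can add the role label yourself:
kubectl label node s02 node-role.kubernetes.io/worker=
kubectl label node s03 node-role.kubernetes.io/worker=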
At this point the entire Kubernetes cluster is up and running. Next up is installing the Dashboard…
I hope this is helpful, Happy hacking…