前置条件⌗

CPU: 2 Core
RAM: >= 2Gb
OS: Ubuntu Server 22.04 LTS
Kubeadm v1.25.2
Kubelet v1.25.2
Kubectl v1.25.2
Containerd.io
Proxy（可选，如果所在网络环境能正常访问 Google 服务，则不需要）

建议不要 Clone 虚拟机，来部署其他节点，否则可能因为一些 Host，MAC 和 UUID 冲突的问题导致出现不可预料的错误！

环境信息⌗

我的虚拟机都是运行在 10.0.8.0/24 这个网段的，我的 macOS 上运行这 Surge Proxy，具体 IP 信息如下：

S01: 10.0.8.81 (cluster-master)
S02: 10.0.8.82 (cluster-node)
S03: 10.0.8.83 (cluster-node)
macOS: 10.0.8.18

接下来所有用到 Google 源的都将使用 macOS 作为代理。

安装 ZSH⌗

此项是可选项，之所以使用 ZSH 主要还是因为想使用 zsh-autosuggestions 插件，这在管理 Kubernetes 的时候可以很大程度的提高效率。

sudo apt install zsh
sh -c "$(curl -fsSL https://raw.github.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"
git clone https://github.com/zsh-users/zsh-autosuggestions ${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/zsh-autosuggestions

完成上述命令后，编辑 ~/.zshrc 文件中添加 zsh-autosuggestions 插件：

plugins=( 
    # other plugins...
    zsh-autosuggestions
)

应用 ZSH 设置：

source ~/.zshrc

关闭 Swap⌗

sudo swapoff -a
sudo rm /swap.img

编辑 /etc/fstab 文件，删除包含 swap.img 的行，如果不存在则忽略:

# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
# / was on /dev/sda2 during curtin installation
/dev/disk/by-uuid/df8f75a8-980b-4055-bafa-5bdef04872b9 / ext4 defaults 0 1
/swap.img	none	swap	sw	0	0

配置网络转发⌗

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

# 设置所需的 sysctl 参数，参数在重新启动后保持不变
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF

# 应用 sysctl 参数而不重新启动
sudo sysctl --system

安装 Containerd⌗

sudo apt-get remove docker docker-engine docker.io containerd runc

# 更新 APT 包索引并安装包
sudo apt-get update
sudo apt-get install \
    ca-certificates \
    curl \
    gnupg \
    lsb-release

# 添加 Docker 的官方 GPG 密钥
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg

# 设置 APT 源仓库地址
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

sudo apt-get update
sudo apt-get install containerd.io

# 安装 docker 引擎和 compose 插件（可选）
sudo apt-get install docker-ce docker-ce-cli docker-compose-plugin

配置 Containerd⌗

sudo rm -rf /etc/containerd/conf.toml
containerd config default | sudo tee /etc/containerd/config.toml

编辑 /etc/containerd/config.toml 配置文件，设置 runc 使用 systemd cgroup 驱动：

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  ...
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true

将 SystemdCgroup 的值由默认的 false 修改为 true，然后重启 Containerd。

sudo systemctl daemon-reload
sudo systemctl restart containerd

Containerd 代理⌗

这部分是可选项，如果所在网络环境可以正常访问 Google 服务，则不需要，另外需要注意的是，如果配置了代理，在拉取完镜像以后需要将代理配置取消，否则导致无法正常访问 Kubernetes Service API！

sudo mkdir -p /etc/systemd/system/containerd.service.d

cat <<EOF | sudo tee /etc/systemd/system/containerd.service.d/proxy.conf
[Service]
Environment="HTTP_PROXY=http://PROXY_IP:8234"
Environment="HTTPS_PROXY=http://PROXY_IP:8234"
Environment="NO_PROXY="10.96.0.1,localhost,127.0.0.1,::1"
EOF

为 Containerd 设置代理，主要是在拉取 Google 的 image 时会用到，但是在启动 Calico 相关 Pod 的时候，会因为设置了代理，而无法启动。获取 Pod 的信息回看到如下内容：

plugin type="calico" failed (add): error getting ClusterInformation: Get "https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": Service Unavailable

因为给 Containerd 配置了代理，导致启动的容器也无法正常访问 Kubernetes 的 Service IP，这个问题困扰了我很久，我在 Google 上翻阅了各种资料和在线问题，依然无法解决，最后我在启动 Kubernetes 集群后禁用 Containerd 代理，再重启 Containerd 就正常了。

sudo mv /etc/systemd/system/containerd.service.d/proxy.conf /etc/systemd/system/containerd.service.d/proxy.conf.back
sudo systemctl daemon-reload
sudo systemctl restart containerd

配置 crictl⌗

cat <<EOF | sudo tee /etc/crictl.yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 5
debug: false
EOF

配置好以后，就可以使用 crictl 管理 Pod 以及容器了！

设置 sudo 允许获取环境变量⌗

编辑 /etc/sudoers 并取消注释 Defaults:%sudo env_keep += "http_proxy https_proxy ftp_proxy all_proxy no_proxy" 行。

Install Kube CLIs⌗

sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl

# 下载 Google Cloud 公开签名密钥
sudo curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg

# 添加 Kubernetes APT 源
echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list

# 安装 kubelet、kubeadm 和 kubectl，并固定它们的版本:
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

设置域名解析⌗

将 cluster-master 和 cluster-endpoint 名称添加到 /etc/hosts 中，对应的地址是当前服务器的 IP。

总结⌗

到此，Kubernetes 的预备环境就准备好了，接下来就可以通过 kubeadm 进行初始化集群了。虽然因为代理的事情折腾了两天，但是好在最终还是解决了。

哎，网络问题真的是让很多人走了不少弯路！

I hope this is helpful, Happy hacking…

准备 Kubernetes 集群环境