Kubernetes Cluster Setup

This lab follows: https://github.com/gjmzj/kubeasz
 
Hardware/software requirements:
① CPU and memory: master at least 1C2G (2C4G recommended); node at least 1C2G
② Linux: kernel 3.10 or later; CentOS 7/RHEL 7 recommended
③ Docker: version 1.9 or later; 1.12+ recommended
④ etcd: version 2.0 or later; 3.0+ recommended
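A quick pre-flight sketch of these minimums (kernel 3.10+, and 1C2G for a master); the `kernel_ok` helper and the exact checks are illustrative, not part of kubeasz:

```shell
# kernel_ok MAJOR MINOR -> prints yes if the kernel meets the 3.10 minimum
kernel_ok() {
  if [ "$1" -gt 3 ] || { [ "$1" -eq 3 ] && [ "$2" -ge 10 ]; }; then
    echo yes
  else
    echo no
  fi
}

kver=$(uname -r)                 # e.g. 3.10.0-1062.el7.x86_64
kmaj=${kver%%.*}
krest=${kver#*.}
kmin=${krest%%.*}
echo "kernel $kmaj.$kmin meets 3.10+: $(kernel_ok "$kmaj" "$kmin")"
echo "cpus: $(nproc)"
echo "mem:  $(awk '/MemTotal/ {print int($2/1024)}' /proc/meminfo) MiB"
```

Run this on each prospective master and node before starting; Docker and etcd versions are installed by the playbooks later, so they need no manual check here.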
 
 
Node plan for a highly available cluster:
① deploy node ---x1 : the node that runs this ansible playbook
② etcd nodes ----x3 : the etcd cluster must have an odd number of members (1, 3, 5, 7, ...)
③ master nodes --x2 : add more to match the actual cluster size; an extra master VIP (virtual IP) must also be planned
④ lb nodes ------x2 : two load-balancer nodes running haproxy + keepalived
⑤ node nodes ----x3 : the nodes that actually carry workloads; raise machine specs and node count as needed
 
Plan for the four hosts (plus the VIP):
Host IP          Hostname   Roles
192.168.1.200    master     deploy, etcd1, lb1, master1
192.168.1.201    master2    lb2, master2
192.168.1.202    node       etcd2, node1
192.168.1.203    node2      etcd3, node2
192.168.1.250               vip
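The playbooks below address every machine by IP, so hostname resolution is not strictly required; if you want the names anyway, the plan can be mirrored in /etc/hosts on each machine (hostnames taken from the table above):

```
192.168.1.200  master
192.168.1.201  master2
192.168.1.202  node
192.168.1.203  node2
```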
I. Preparation
1: On all four machines, install the EPEL repo, run an update, and install the Python package. (Note: for this lab the firewall is turned off to avoid unnecessary errors; do not do this in a production environment.)
yum install -y  epel-release
yum install -y  python
iptables -F
setenforce 0
[Run on the deploy node]
2: Install ansible
[ ~]# yum -y install ansible
3: Generate an SSH key pair
[ ~]# ssh-keygen 
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:cfoSPSgeEkAkgY08UIVWK2t2eNJIrKph5wkRkZX7AKs 
The key's randomart image is:
+---[RSA 2048]----+
|BOB=+            |
|oB=o .           |
| oB +   . .      |
| +.O .   *       |
|o.B B o S o      |
|Eo.+ + o o .     |
|oo .  . . .      |
|o.+ .    .       |
|.  o             |
+----[SHA256]-----+
4: Copy the public key to all four machines
[ ~]# for ip in 200 201 202 203; do ssh-copy-id 192.168.1.$ip; done

5: Test passwordless login

[ ~]# ssh 192.168.1.200
Last login: Wed Dec 11 10:47:55 2019 from 192.168.1.2
[ ~]# exit
logout
Connection to 192.168.1.200 closed.
[ ~]# ssh 192.168.1.201
Last login: Wed Dec 11 10:48:00 2019 from 192.168.1.2
[ ~]# exit
logout
Connection to 192.168.1.201 closed.
[ ~]# ssh 192.168.1.202
Last login: Wed Dec 11 11:13:53 2019 from 192.168.1.200
[ ~]# exit
logout
Connection to 192.168.1.202 closed.
[ ~]# ssh 192.168.1.203
Last login: Wed Dec 11 10:48:20 2019 from 192.168.1.2
[ ~]#  exit
logout
Connection to 192.168.1.203 closed.
6: Download the easzup script and install the kubeasz code, binaries, and offline images
Script download link: https://pan.baidu.com/s/1GLoU9ntjUL2SP4R_Do7mlQ
Extraction code: 96eg
[ ~]# chmod +x easzup 
[ ~]# ./easzup -D
[ ~]# ls /etc/ansible/
01.prepare.yml     03.docker.yml       06.network.yml        22.upgrade.yml  90.setup.yml  bin          down       pics       tools
02.etcd.yml        04.kube-master.yml  07.cluster-addon.yml  23.backup.yml   99.clean.yml  dockerfiles  example    README.md
03.containerd.yml  05.kube-node.yml    11.harbor.yml         24.restore.yml  ansible.cfg   docs         manifests  roles

7: Configure cluster parameters in the hosts file

[ ~ ]# cd /etc/ansible
[ ansible]# cp example/hosts.multi-node hosts
[ ansible]# vim hosts
[etcd]   ## etcd node IPs
192.168.1.200 NODE_NAME=etcd1
192.168.1.202 NODE_NAME=etcd2
192.168.1.203 NODE_NAME=etcd3

[kube-master]   ## master node IPs
192.168.1.200
192.168.1.201

[kube-node]  ## worker node IPs
192.168.1.202
192.168.1.203

[ex-lb]  ## LB node IPs and the VIP
192.168.1.200 LB_ROLE=backup EX_APISERVER_VIP=192.168.1.250 EX_APISERVER_PORT=8443
192.168.1.201 LB_ROLE=master EX_APISERVER_VIP=192.168.1.250 EX_APISERVER_PORT=8443
8: After editing hosts, test connectivity
[ ansible]# ansible all -m ping
[DEPRECATION WARNING]: The TRANSFORM_INVALID_GROUP_CHARS settings is set to allow bad characters in group names by default, this will change, but still be user 
configurable on deprecation. This feature will be removed in version 2.10. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
[WARNING]: Invalid characters were found in group names but not replaced, use -vvvv to see details

192.168.1.201 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    }, 
    "changed": false, 
    "ping": "pong"
}
192.168.1.202 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    }, 
    "changed": false, 
    "ping": "pong"
}
192.168.1.203 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    }, 
    "changed": false, 
    "ping": "pong"
}
192.168.1.200 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    }, 
    "changed": false, 
    "ping": "pong"
}

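The deprecation and group-name warnings above are triggered by the "-" in kubeasz's group names (kube-master, kube-node, ex-lb); they are harmless here. As the warning text itself suggests, they can be silenced in /etc/ansible/ansible.cfg (a sketch, assuming Ansible 2.8+; both keys go in the [defaults] section):

```ini
[defaults]
# silence the TRANSFORM_INVALID_GROUP_CHARS deprecation warning
deprecation_warnings = False
# accept "-" in group names without warning
force_valid_group_names = ignore
```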
II. Deploying the Cluster

[Run on the deploy node] Manual, step-by-step installation

1: Install the CA certificates

[ ansible]# ansible-playbook 01.prepare.yml

2: Install etcd
[ ansible]# ansible-playbook 02.etcd.yml
Check etcd health; "healthy: successfully" in the output means the member is fine
[ ansible]# for ip in 200 202 203 ; do ETCDCTL_API=3 etcdctl --endpoints=https://192.168.1.$ip:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem endpoint health; done
https://192.168.1.200:2379 is healthy: successfully committed proposal: took = 5.658163ms
https://192.168.1.202:2379 is healthy: successfully committed proposal: took = 6.384588ms
https://192.168.1.203:2379 is healthy: successfully committed proposal: took = 7.386942ms
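When scripting this check, the per-endpoint lines above can be reduced to a healthy-member count; `count_healthy` is a hypothetical helper, and the etcdctl loop is stubbed here with the expected output shape:

```shell
# Count stdin lines that report a healthy endpoint.
count_healthy() { grep -c 'is healthy'; }

# Stub of the loop above: each healthy member prints one "is healthy" line.
n=$(for ip in 200 202 203; do
      echo "https://192.168.1.$ip:2379 is healthy: successfully committed proposal"
    done | count_healthy)
echo "$n of 3 etcd members healthy"
```

In a real run, pipe the etcdctl loop's output into `count_healthy` and alert when the count drops below the member total.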

3: Install Docker

[ ansible]# ansible-playbook 03.docker.yml

4: Install the master nodes
[ ansible]# ansible-playbook 04.kube-master.yml
Check cluster status
[ ansible]# kubectl get componentstatus
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok                  
controller-manager   Healthy   ok                  
etcd-0               Healthy   {"health":"true"}   
etcd-2               Healthy   {"health":"true"}   
etcd-1               Healthy   {"health":"true"}

5: Install the worker nodes

[ ansible]# ansible-playbook 05.kube-node.yml

List the nodes
[ ansible]# kubectl get nodes
NAME            STATUS                     ROLES    AGE     VERSION
192.168.1.200   Ready,SchedulingDisabled   master   4m45s   v1.15.0
192.168.1.201   Ready,SchedulingDisabled   master   4m45s   v1.15.0
192.168.1.202   Ready                      node     12s     v1.15.0
192.168.1.203   Ready                      node     12s     v1.15.0
6: Deploy the cluster network
[ ansible]# ansible-playbook 06.network.yml
List the pods in the kube-system namespace; the flannel pods should show up
[ ansible]# kubectl get pod -n kube-system
NAME                          READY   STATUS    RESTARTS   AGE
kube-flannel-ds-amd64-7bk5w   1/1     Running   0          61s
kube-flannel-ds-amd64-blcxx   1/1     Running   0          61s
kube-flannel-ds-amd64-c4sfx   1/1     Running   0          61s
kube-flannel-ds-amd64-f8pnz   1/1     Running   0          61s
7: Install the cluster add-ons
[ ansible]# ansible-playbook 07.cluster-addon.yml
List the services in the kube-system namespace
[ ansible]# kubectl get svc -n kube-system
NAME                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                       AGE
heapster                  ClusterIP   10.68.191.0     <none>        80/TCP                        13m
kube-dns                  ClusterIP   10.68.0.2       <none>        53/UDP,53/TCP,9153/TCP        15m
kubernetes-dashboard      NodePort    10.68.115.45    <none>        443:35294/TCP                 13m
metrics-server            ClusterIP   10.68.116.163   <none>        443/TCP                       15m
traefik-ingress-service   NodePort    10.68.106.241   <none>        80:23456/TCP,8080:26004/TCP   12m
[All-in-one installation]
Run all of the manual steps above in a single playbook
[ ansible]# ansible-playbook 90.setup.yml
Check node/pod resource usage
[ ansible]# kubectl top node
NAME            CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
192.168.1.200   58m          7%     960Mi           85%       
192.168.1.201   34m          4%     1018Mi          91%       
192.168.1.202   76m          9%     549Mi           49%       
192.168.1.203   89m          11%    568Mi           50% 
[ ansible]# kubectl top pod --all-namespaces
NAMESPACE     NAME                                          CPU(cores)   MEMORY(bytes)   
kube-system   coredns-797455887b-9nscp                      5m           22Mi            
kube-system   coredns-797455887b-k92wv                      5m           19Mi            
kube-system   heapster-5f848f54bc-vvwzx                     1m           11Mi            
kube-system   kube-flannel-ds-amd64-7bk5w                   3m           20Mi            
kube-system   kube-flannel-ds-amd64-blcxx                   2m           19Mi            
kube-system   kube-flannel-ds-amd64-c4sfx                   2m           18Mi            
kube-system   kube-flannel-ds-amd64-f8pnz                   2m           10Mi            
kube-system   kubernetes-dashboard-5c7687cf8-hnbdp          1m           22Mi            
kube-system   metrics-server-85c7b8c8c4-6q4vj               1m           16Mi            
kube-system   traefik-ingress-controller-766dbfdddd-98trv   4m           17Mi
Check the cluster info
[ ansible]# kubectl cluster-info
Kubernetes master is running at https://192.168.1.200:6443
CoreDNS is running at https://192.168.1.200:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
kubernetes-dashboard is running at https://192.168.1.200:6443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
Metrics-server is running at https://192.168.1.200:6443/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
8: Test DNS
① Create an nginx service
[ ansible]# kubectl run nginx --image=nginx --expose --port=80
kubectl run --generator=deployment/apps.v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create instead.
service/nginx created
deployment.apps/nginx created
② Create a busybox test pod; the nginx service's cluster (virtual) IP 10.68.243.55 resolves
[ ansible]# kubectl run busybox --rm -it --image=busybox /bin/sh
kubectl run --generator=deployment/apps.v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create instead.
If you don't see a command prompt, try pressing enter.
/ # nslookup nginx.default.svc.cluster.local
Server:    10.68.0.2
Address:    10.68.0.2:53

Name:    nginx.default.svc.cluster.local
Address: 10.68.243.55
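The name queried above follows the cluster DNS convention <service>.<namespace>.svc.<cluster-domain>; a tiny illustrative helper (the cluster domain is assumed to be the default, cluster.local):

```shell
# Build a service's in-cluster DNS name.
svc_fqdn() { echo "$1.$2.svc.cluster.local"; }

svc_fqdn nginx default         # prints nginx.default.svc.cluster.local
svc_fqdn kube-dns kube-system  # prints kube-dns.kube-system.svc.cluster.local
```

Within the same namespace the short name (just `nginx`) also resolves, which is why both forms work from the busybox pod.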
III. Adding a Node (IP: 192.168.1.204)
[Run on the deploy node]
1: Copy the public key to the new node
[ ansible]# ssh-copy-id 192.168.1.204
2: Edit the hosts file and add the new node's IP
[ ansible]# vim hosts 
[kube-node]
192.168.1.202
192.168.1.203
192.168.1.204
3: Run the add-node playbook, specifying the new node's IP

[ ansible]# ansible-playbook tools/02.addnode.yml -e NODE_TO_ADD=192.168.1.204
4: Verify the node was added successfully
[ ansible]# kubectl get node
NAME            STATUS                     ROLES    AGE     VERSION
192.168.1.200   Ready,SchedulingDisabled   master   9h      v1.15.0
192.168.1.201   Ready,SchedulingDisabled   master   9h      v1.15.0
192.168.1.202   Ready                      node     9h      v1.15.0
192.168.1.203   Ready                      node     9h      v1.15.0
192.168.1.204   Ready                      node     2m11s   v1.15.0 
[ ansible]# kubectl get pod -n kube-system -o wide
NAME                                          READY   STATUS    RESTARTS   AGE   IP              NODE            NOMINATED NODE   READINESS GATES
coredns-797455887b-9nscp                      1/1     Running   0          31h   172.20.3.2      192.168.1.203   <none>           <none>
coredns-797455887b-k92wv                      1/1     Running   0          31h   172.20.2.2      192.168.1.202   <none>           <none>
heapster-5f848f54bc-vvwzx                     1/1     Running   1          31h   172.20.2.4      192.168.1.202   <none>           <none>
kube-flannel-ds-amd64-7bk5w                   1/1     Running   0          31h   192.168.1.202   192.168.1.202   <none>           <none>
kube-flannel-ds-amd64-blcxx                   1/1     Running   0          31h   192.168.1.200   192.168.1.200   <none>           <none>
kube-flannel-ds-amd64-c4sfx                   1/1     Running   0          31h   192.168.1.203   192.168.1.203   <none>           <none>
kube-flannel-ds-amd64-f8pnz                   1/1     Running   0          31h   192.168.1.201   192.168.1.201   <none>           <none>
kube-flannel-ds-amd64-vdd7n                   1/1     Running   0          21h   192.168.1.204   192.168.1.204   <none>           <none>
kubernetes-dashboard-5c7687cf8-hnbdp          1/1     Running   0          31h   172.20.3.3      192.168.1.203   <none>           <none>
metrics-server-85c7b8c8c4-6q4vj               1/1     Running   0          31h   172.20.2.3      192.168.1.202   <none>           <none>
traefik-ingress-controller-766dbfdddd-98trv   1/1     Running   0          31h   172.20.3.4      192.168.1.203   <none>           <none>
