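1. etcd concepts
1.1. Introduction to etcd
The name etcd combines /etc, the Unix configuration directory, with "distributed". etcd is a distributed, reliable key-value store that also supports shared configuration and service discovery, and it is widely used in Go projects. Its main characteristics are: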
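- Simple to deploy: a single binary that works out of the box
- Simple to use: etcd has a rich set of client SDKs
- Secure: supports SSL/TLS certificate authentication, encrypted traffic, and mutual authentication between nodes
- Strongly consistent: the Raft algorithm keeps data strongly consistent, and the cluster keeps serving requests as long as fewer than half of the nodes are down; clusters usually have 3 or 5 nodes
- Durable: etcd persists data to disk through a write-ahead log (WAL) and supports snapshots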
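The fault-tolerance rule in the list above comes down to majority arithmetic. A minimal sketch in shell follows; the function names `quorum_size` and `fault_tolerance` are made up for illustration and are not etcd commands.

```shell
#!/bin/sh
# Majority (quorum) arithmetic for a Raft cluster of N voting members.
# quorum_size and fault_tolerance are illustrative helper names, not etcd commands.

# Smallest number of members that forms a majority.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while a majority is still available.
fault_tolerance() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 4 5; do
    echo "members=$n quorum=$(quorum_size "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

A 4-node cluster tolerates the same single failure as a 3-node cluster, which is one reason odd sizes such as 3 or 5 are the usual choice.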
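The etcd architecture has four main parts:
- HTTP Server: the entry point for requests; it handles the various API requests
- Store: handles the transactions behind etcd's features, including data indexing, node state changes, watching and notification, and event handling and execution; it is the concrete implementation of most of the API that etcd exposes to users
- Raft state machine: provides data consistency and leader election in a multi-node etcd cluster
- WAL: the write-ahead log is how etcd stores data. Besides keeping the state of all data and the node index in memory, etcd persists data through the WAL, in which every change is logged before it is committed. A Snapshot is a state snapshot taken to keep the log from growing too large; an Entry is an individual log record.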
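To make the write-ahead idea concrete, here is a toy sketch in shell. It is not etcd's WAL format or code: the tab-separated record layout, the temporary file, and the helper names `wal_put` and `wal_get` are assumptions for illustration. Each write is appended to the log before it counts as stored, and the current value of a key is recovered by replaying the log.

```shell
#!/bin/sh
# Toy write-ahead log: every change is appended to a log file first,
# and state is rebuilt by replaying that file from the start.
# Illustration only; this is not etcd's on-disk format.

WAL_FILE="$(mktemp)"

# Append a "key<TAB>value" record to the log. The write counts as
# accepted only once the record is in the file.
wal_put() {
    printf '%s\t%s\n' "$1" "$2" >> "$WAL_FILE"
}

# Replay the whole log and print the latest value recorded for a key.
wal_get() {
    awk -F '\t' -v key="$1" '$1 == key { value = $2; found = 1 }
        END { if (found) print value }' "$WAL_FILE"
}

wal_put k1 v1
wal_put k2 v2
wal_put k1 v1-updated

echo "k1 after replay: $(wal_get k1)"
echo "k2 after replay: $(wal_get k2)"
```

A real WAL also syncs each record to disk and periodically takes a snapshot so that older records can be discarded, which is the role of the Snapshot described above.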
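1.2. Raft
Raft is what keeps data consistent across etcd's distributed nodes, and the algorithm itself is fairly complex. Many blog posts cover etcd's Raft implementation in detail, including how nodes elect a leader, how data is replicated, and how lagging nodes catch up.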
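2. Deploying etcd
2.1. Preparation
2.1.1. Hardware requirements
This is a brief overview of etcd's hardware requirements; see the official documentation for details.
- CPU
etcd is not CPU-hungry, and 2 to 4 cores are usually enough. Heavily loaded clusters, for example those serving thousands of clients or tens of thousands of requests per second, need 8 to 16 cores. For a typical deployment, start with 2 to 4 cores and upgrade the nodes one at a time if CPU usage turns out to be high.
- Memory
etcd's memory needs are modest, but it caches key-value data in memory aggressively and uses the remaining memory to track watchers. 8 GB is generally enough.
- Disk
Disk write speed is the key factor in etcd performance. The Raft consensus protocol depends on persisting metadata to a log, so every etcd member must write every request to disk. etcd also writes incremental checkpoints to disk so that it can truncate the log. If these writes take too long, heartbeats can time out and trigger a new leader election, which undermines cluster stability. Use fio to check disk performance; the IBM article on this topic describes the procedure.
Disk performance has two aspects. The first is IOPS: etcd generally needs about 50 sequential IOPS (for example, a 7200 RPM hard disk), and heavily loaded clusters need about 500 sequential IOPS (usually an SSD). The second is disk bandwidth, which determines how long a new or failed member takes to synchronize data: a typical 10 MB/s disk can recover 100 MB of data within about 15 seconds, and a 100 MB/s disk can recover 1 GB within about 15 seconds.
etcd replicates data across members, so a failed disk on one node is not a concern and there is no need for RAID 5 or RAID 10. If you want RAID to improve disk performance, consider RAID 0.
- Network
etcd needs a fast and stable network. A 1GbE internal network is enough for most deployments, and it is generally recommended to place all etcd nodes in the same network domain to reduce cross-network latency.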
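The disk bandwidth figures in the Disk item can be checked with simple arithmetic. The sketch below assumes that recovery time is just data size divided by sustained bandwidth; `recovery_seconds` is a hypothetical helper name, and real recovery also includes network and processing overhead, which is why the text allows about 15 seconds.

```shell
#!/bin/sh
# Rough lower bound on the time to copy a data set to a new or recovering member.
# Assumption: time = size / bandwidth, rounded up; all overheads are ignored.

# recovery_seconds <data_size_mb> <bandwidth_mb_per_s>
recovery_seconds() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

echo "100 MB at 10 MB/s:   $(recovery_seconds 100 10) s"
echo "1000 MB at 100 MB/s: $(recovery_seconds 1000 100) s"
echo "1000 MB at 10 MB/s:  $(recovery_seconds 1000 10) s"
```

Both examples from the text come out at roughly 10 seconds of raw transfer, which is consistent with the 15-second figure once overhead is added.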
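The official etcd documentation also gives example etcd configurations for Kubernetes clusters of different sizes. Note that in these examples etcd is deployed on dedicated machines rather than sharing the master nodes.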
- Small cluster
Fewer than 100 clients, fewer than 200 requests per second, and less than 100 MB of stored data. Example: a 50-node Kubernetes cluster.
| Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
|---|---|---|---|---|---|
| AWS | m4.large | 2 | 8 | 3600 | 56.25 |
| GCE | n1-standard-2 + 50GB PD SSD | 2 | 7.5 | 1500 | 25 |
- Medium cluster
Fewer than 500 clients, fewer than 1,000 requests per second, and less than 500 MB of stored data. Example: a 250-node Kubernetes cluster.
| Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
|---|---|---|---|---|---|
| AWS | m4.xlarge | 4 | 16 | 6000 | 93.75 |
| GCE | n1-standard-4 + 150GB PD SSD | 4 | 15 | 4500 | 75 |
- Large cluster
Fewer than 1,500 clients, fewer than 10,000 requests per second, and less than 1,000 MB of stored data. Example: a 1,000-node Kubernetes cluster.
| Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
|---|---|---|---|---|---|
| AWS | m4.2xlarge | 8 | 32 | 8000 | 125 |
| GCE | n1-standard-8 + 250GB PD SSD | 8 | 30 | 7500 | 125 |
- Extra-large cluster
More than 1,500 clients, more than 10,000 requests per second, and more than 1,000 MB of stored data. Example: a 3,000-node Kubernetes cluster.
| Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
|---|---|---|---|---|---|
| AWS | m4.4xlarge | 16 | 64 | 16,000 | 250 |
| GCE | n1-standard-16 + 500GB PD SSD | 16 | 60 | 15,000 | 250 |
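The four tiers above can be read as a simple lookup on client count, request rate, and data size. A minimal sketch, using only the thresholds stated in the text; `etcd_tier` is a hypothetical helper, and a workload that exceeds any one limit of a tier is pushed to the next tier, which is an assumption about how to treat mixed cases.

```shell
#!/bin/sh
# Map a workload onto the four sizing tiers listed above.
# etcd_tier is a hypothetical helper; the thresholds are the ones in the text.

# etcd_tier <clients> <requests_per_second> <data_size_mb>
etcd_tier() {
    clients=$1
    rps=$2
    data_mb=$3
    if [ "$clients" -lt 100 ] && [ "$rps" -lt 200 ] && [ "$data_mb" -lt 100 ]; then
        echo small
    elif [ "$clients" -lt 500 ] && [ "$rps" -lt 1000 ] && [ "$data_mb" -lt 500 ]; then
        echo medium
    elif [ "$clients" -lt 1500 ] && [ "$rps" -lt 10000 ] && [ "$data_mb" -lt 1000 ]; then
        echo large
    else
        echo xlarge
    fi
}

echo "50 clients, 100 req/s, 50 MB       -> $(etcd_tier 50 100 50)"
echo "300 clients, 800 req/s, 200 MB     -> $(etcd_tier 300 800 200)"
echo "2000 clients, 20000 req/s, 4096 MB -> $(etcd_tier 2000 20000 4096)"
```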
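2.1.2. Machine planning
This plan is for a small test cluster. Swap is disabled on every node, and the chronyd time synchronization service is installed.
Disk write performance was measured with fio; see the IBM article for the method.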
| Node | IP | OS | CPU/Memory | Disk | Write IOPS | Write bandwidth (KiB/s, as printed by fio) |
|---|---|---|---|---|---|---|
| etcd-1 | 10.4.7.121 | ubuntu 18.04.5 | 2C 4G | 20G SSD | min=2706, max=2758, avg=2727.14 | min=6077, max=6194, avg=6124.86 |
| etcd-2 | 10.4.7.122 | ubuntu 18.04.5 | 2C 4G | 20G SSD | | |
| etcd-3 | 10.4.7.123 | ubuntu 18.04.5 | 2C 4G | 20G SSD | | |
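For reference, the IBM article measures small sequential writes that are each followed by fdatasync, which is the pattern etcd's WAL produces. The sketch below wraps an fio command of that shape in a function; the helper name `etcd_fio_cmd`, the test directory, and the size and block-size values are assumptions to adjust for your environment. The script only prints the command, so nothing is written to disk until you run the printed line yourself.

```shell
#!/bin/sh
# Print an fio command that imitates etcd's WAL write pattern:
# small sequential writes, each followed by fdatasync.
# The directory, size, and block size are example values, not requirements.

# etcd_fio_cmd [test_directory]
etcd_fio_cmd() {
    dir="${1:-/data/etcd-fio-test}"
    echo "fio --name=etcd-wal-test --directory=$dir --rw=write --ioengine=sync --fdatasync=1 --size=22m --bs=2300"
}

etcd_fio_cmd /data/etcd-fio-test
```

Run the printed command against a directory on the disk that will hold the etcd data, and look at the fdatasync latency percentiles in the output as well as the write IOPS and bandwidth. A commonly cited guideline is that the 99th percentile of fdatasync latency should stay below 10 ms.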
2.2. Deploying an etcd cluster
etcd releases can be downloaded from https://github.com/etcd-io/etcd/releases. This guide uses version 3.5.1 for installation and study. The scan_host.sh script used below appears to be the author's own helper for pushing files to, and running commands on, several hosts; it is not part of etcd.

    [root@duduniao etcd]# wget https://github.com/etcd-io/etcd/releases/download/v3.5.1/etcd-v3.5.1-linux-amd64.tar.gz
    [root@duduniao etcd]# tar -xf etcd-v3.5.1-linux-amd64.tar.gz
    [root@duduniao etcd]# ls etcd-v3.5.1-linux-amd64/etcd* -l
    -rwxr-xr-x 1 114762 114762 23568384 Oct 15 22:22 etcd-v3.5.1-linux-amd64/etcd
    -rwxr-xr-x 1 114762 114762 17981440 Oct 15 22:22 etcd-v3.5.1-linux-amd64/etcdctl
    -rwxr-xr-x 1 114762 114762 16056320 Oct 15 22:22 etcd-v3.5.1-linux-amd64/etcdutl

    # Push the etcd binaries to every server
    [root@duduniao etcd]# scan_host.sh push -h 10.4.7.121 10.4.7.122 10.4.7.123 etcd-v3.5.1-linux-amd64/etcd* /usr/local/bin/
    10.4.7.123  etcd-v3.5.1-linux-amd64/etcd etcd-v3.5.1-linux-amd64/etcdctl etcd-v3.5.1-linux-amd64/etcdutl --> /usr/local/bin/ Y
    10.4.7.121  etcd-v3.5.1-linux-amd64/etcd etcd-v3.5.1-linux-amd64/etcdctl etcd-v3.5.1-linux-amd64/etcdutl --> /usr/local/bin/ Y
    10.4.7.122  etcd-v3.5.1-linux-amd64/etcd etcd-v3.5.1-linux-amd64/etcdctl etcd-v3.5.1-linux-amd64/etcdutl --> /usr/local/bin/ Y

    # Install the etcd.service unit file on each node
    [root@duduniao etcd]# scp etcd-1.service 10.4.7.121:/lib/systemd/system/etcd.service
    [root@duduniao etcd]# scp etcd-2.service 10.4.7.122:/lib/systemd/system/etcd.service
    [root@duduniao etcd]# scp etcd-3.service 10.4.7.123:/lib/systemd/system/etcd.service

    # Start etcd
    [root@duduniao etcd]# scan_host.sh cmd -h 10.4.7.121 10.4.7.122 10.4.7.123 "mkdir -p /data/etcd ; systemctl daemon-reload"
    [root@duduniao etcd]# scan_host.sh cmd -h 10.4.7.121 10.4.7.122 10.4.7.123 "systemctl start etcd && systemctl enable etcd"
    [root@duduniao etcd]# scan_host.sh cmd -h 10.4.7.121 10.4.7.122 10.4.7.123 "systemctl is-enabled etcd && systemctl is-active etcd"

    # Check the cluster status
    [root@duduniao etcd]# ./etcd-v3.5.1-linux-amd64/etcdctl --endpoints=10.4.7.121:2379 member list --write-out=table
    +------------------+---------+--------+------------------------+------------------------+------------+
    |        ID        | STATUS  |  NAME  |       PEER ADDRS       |      CLIENT ADDRS      | IS LEARNER |
    +------------------+---------+--------+------------------------+------------------------+------------+
    | 4c45db44e1021917 | started | etcd-1 | http://10.4.7.121:2380 | http://10.4.7.121:2379 |      false |
    | 721eef2714f1477a | started | etcd-2 | http://10.4.7.122:2380 | http://10.4.7.122:2379 |      false |
    | f6d5f5c8eef4f092 | started | etcd-3 | http://10.4.7.123:2380 | http://10.4.7.123:2379 |      false |
    +------------------+---------+--------+------------------------+------------------------+------------+

    # Test a write and a read
    [root@duduniao etcd]# ./etcd-v3.5.1-linux-amd64/etcdctl --endpoints=10.4.7.121:2379 put k1 test-value-1
    [root@duduniao etcd]# ./etcd-v3.5.1-linux-amd64/etcdctl --endpoints=10.4.7.123:2379 get k1
    k1
    test-value-1

    # etcd-1.service
    [Unit]
    Description=Etcd Server
    After=network.target
    After=network-online.target
    Wants=network-online.target
    Documentation=https://github.com/coreos

    [Service]
    Type=notify
    WorkingDirectory=/data/etcd
    ExecStart=/usr/local/bin/etcd \
      --name etcd-1 \
      --initial-advertise-peer-urls http://10.4.7.121:2380 \
      --listen-peer-urls http://10.4.7.121:2380 \
      --listen-client-urls http://10.4.7.121:2379,http://127.0.0.1:2379 \
      --advertise-client-urls http://10.4.7.121:2379 \
      --initial-cluster-token etcd-cluster-1 \
      --initial-cluster etcd-1=http://10.4.7.121:2380,etcd-2=http://10.4.7.122:2380,etcd-3=http://10.4.7.123:2380 \
      --initial-cluster-state new \
      --data-dir /data/etcd \
      --snapshot-count 50000 \
      --auto-compaction-retention 1 \
      --auto-compaction-mode periodic \
      --max-request-bytes 10485760 \
      --quota-backend-bytes 8589934592
    Restart=always
    RestartSec=15
    LimitNOFILE=65536
    OOMScoreAdjust=-999

    [Install]
    WantedBy=multi-user.target

    # etcd-2.service
    [Unit]
    Description=Etcd Server
    After=network.target
    After=network-online.target
    Wants=network-online.target
    Documentation=https://github.com/coreos

    [Service]
    Type=notify
    WorkingDirectory=/data/etcd
    ExecStart=/usr/local/bin/etcd \
      --name etcd-2 \
      --initial-advertise-peer-urls http://10.4.7.122:2380 \
      --listen-peer-urls http://10.4.7.122:2380 \
      --listen-client-urls http://10.4.7.122:2379,http://127.0.0.1:2379 \
      --advertise-client-urls http://10.4.7.122:2379 \
      --initial-cluster-token etcd-cluster-1 \
      --initial-cluster etcd-1=http://10.4.7.121:2380,etcd-2=http://10.4.7.122:2380,etcd-3=http://10.4.7.123:2380 \
      --initial-cluster-state new \
      --data-dir /data/etcd \
      --snapshot-count 50000 \
      --auto-compaction-retention 1 \
      --auto-compaction-mode periodic \
      --max-request-bytes 10485760 \
      --quota-backend-bytes 8589934592
    Restart=always
    RestartSec=15
    LimitNOFILE=65536
    OOMScoreAdjust=-999

    [Install]
    WantedBy=multi-user.target

    # etcd-3.service
    [Unit]
    Description=Etcd Server
    After=network.target
    After=network-online.target
    Wants=network-online.target
    Documentation=https://github.com/coreos

    [Service]
    Type=notify
    WorkingDirectory=/data/etcd
    ExecStart=/usr/local/bin/etcd \
      --name etcd-3 \
      --initial-advertise-peer-urls http://10.4.7.123:2380 \
      --listen-peer-urls http://10.4.7.123:2380 \
      --listen-client-urls http://10.4.7.123:2379,http://127.0.0.1:2379 \
      --advertise-client-urls http://10.4.7.123:2379 \
      --initial-cluster-token etcd-cluster-1 \
      --initial-cluster etcd-1=http://10.4.7.121:2380,etcd-2=http://10.4.7.122:2380,etcd-3=http://10.4.7.123:2380 \
      --initial-cluster-state new \
      --data-dir /data/etcd \
      --snapshot-count 50000 \
      --auto-compaction-retention 1 \
      --auto-compaction-mode periodic \
      --max-request-bytes 10485760 \
      --quota-backend-bytes 8589934592
    Restart=always
    RestartSec=15
    LimitNOFILE=65536
    OOMScoreAdjust=-999

    [Install]
    WantedBy=multi-user.target
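The three unit files differ only in the node name and IP address, so the shared `--initial-cluster` value can be derived from one node list. A minimal sketch, assuming the node names and addresses used in this section; `initial_cluster` and the `NODES` variable are illustrative names, not part of etcd.

```shell
#!/bin/sh
# Build the --initial-cluster value from a single "name=ip" list so that
# the unit files for all nodes stay consistent.

NODES="etcd-1=10.4.7.121 etcd-2=10.4.7.122 etcd-3=10.4.7.123"

# initial_cluster <scheme>   (scheme is http or https)
initial_cluster() {
    scheme=$1
    result=""
    for node in $NODES; do
        name=${node%%=*}
        ip=${node#*=}
        result="${result:+$result,}$name=$scheme://$ip:2380"
    done
    echo "$result"
}

echo "--initial-cluster $(initial_cluster http)"

# Flags that differ from node to node.
for node in $NODES; do
    name=${node%%=*}
    ip=${node#*=}
    echo "$name: --name $name --listen-peer-urls http://$ip:2380 --advertise-client-urls http://$ip:2379"
done
```

Calling `initial_cluster https` produces the value needed once the cluster is switched to TLS.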
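2.3. Deploying an etcd cluster with TLS-encrypted communication
etcd exposes two ports: 2379 and 2380. Port 2379 receives client requests, and port 2380 carries traffic between cluster members, including data replication. Both kinds of traffic can be encrypted with TLS, and a CA certificate can be used to verify that the other side is legitimate. etcd does not enable RBAC authentication by default, so any client that can connect can operate on every key. In a Kubernetes cluster, the etcd server validates clients by their client certificates: only a client presenting a certificate issued by a CA that etcd trusts passes authentication. The core flags are:

    # Communication between client and server
    --trusted-ca-file=<path>        Trusted CA certificate
    --cert-file=<path>              etcd server certificate
    --key-file=<path>               Private key of the etcd server certificate
    --client-cert-auth              When set, the server requires client certificates signed by the CA in trusted-ca-file

    # Communication between etcd members (peer)
    --peer-trusted-ca-file=<path>   Trusted CA certificate
    --peer-cert-file=<path>         Peer certificate of this member
    --peer-key-file=<path>          Private key of the peer certificate
    --peer-client-cert-auth         When set, the remote peer's certificate must be signed by the CA in peer-trusted-ca-file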
2.3.1. Issuing certificates
Client-server and peer traffic are usually both encrypted. To keep things simple, the same CA is often used for both, and the peer certificate and the server certificate can even be the same certificate.

    # Download the certificate tools
    [root@duduniao etcd]# wget -O /usr/local/bin/cfssl https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
    [root@duduniao etcd]# wget -O /usr/local/bin/cfssljson https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
    [root@duduniao etcd]# chmod +x /usr/local/bin/cfssljson /usr/local/bin/cfssl

    # Issue the CA certificate. There is a major pitfall here:
    # an etcd node keeps connecting to its own port 2379 as a client, using the server certificate,
    # so the "server" profile in ca-config.json must include "client auth". Otherwise etcd reports:
    # WARNING: 2021/10/17 09:55:56 [core] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 127.0.0.1:2379 <nil> 0 <nil>}.
    # Err: connection error: desc = "transport: authentication handshake failed: remote error: tls: bad certificate". Reconnecting...
    [root@duduniao etcd]# mkdir ssl/ && cd ssl
    [root@duduniao ssl]# cat ca-config.json
    {
      "signing": {
        "default": {
          "expiry": "43800h"
        },
        "profiles": {
          "server": {
            "expiry": "43800h",
            "usages": [
              "signing",
              "key encipherment",
              "server auth",
              "client auth"
            ]
          },
          "client": {
            "expiry": "43800h",
            "usages": [
              "signing",
              "key encipherment",
              "client auth"
            ]
          },
          "peer": {
            "expiry": "43800h",
            "usages": [
              "signing",
              "key encipherment",
              "server auth",
              "client auth"
            ]
          }
        }
      }
    }
    [root@duduniao ssl]# cat ca-csr.json
    {
      "CN": "local-etcd-ca",
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
        {
          "C": "CN",
          "L": "Shanghai",
          "O": "duduniao",
          "ST": "Shanghai",
          "OU": "devops"
        }
      ]
    }
    [root@duduniao ssl]# cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
    [root@duduniao ssl]# ll
    total 20
    -rw-r--r-- 1 root root  832 2021-10-17 15:23:33 ca-config.json
    -rw-r--r-- 1 root root  274 2021-10-17 15:26:46 ca-csr.json
    -rw------- 1 root root 1679 2021-10-17 15:27:08 ca-key.pem
    -rw-r--r-- 1 root root 1013 2021-10-17 15:27:08 ca.csr
    -rw-r--r-- 1 root root 1387 2021-10-17 15:27:08 ca.pem

    # Issue the server certificate. Clients use it to verify the server, and the server also needs it for its own self-check
    [root@duduniao ssl]# cat server.json
    {
      "CN": "local-etcd.duduniao.com",
      "hosts": [
        "10.4.7.121",
        "10.4.7.122",
        "10.4.7.123",
        "127.0.0.1",
        "etcd-1",
        "etcd-2",
        "etcd-3",
        "localhost"
      ],
      "key": {
        "algo": "ecdsa",
        "size": 256
      },
      "names": [
        {
          "C": "CN",
          "L": "Shanghai",
          "ST": "Shanghai"
        }
      ]
    }
    [root@duduniao ssl]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server server.json | cfssljson -bare server
    [root@duduniao ssl]# ll server*
    -rw------- 1 root root  227 2021-10-17 15:32:56 server-key.pem
    -rw-r--r-- 1 root root  558 2021-10-17 15:32:56 server.csr
    -rw-r--r-- 1 root root  391 2021-10-17 15:32:24 server.json
    -rw-r--r-- 1 root root 1184 2021-10-17 15:32:56 server.pem

    # Issue the peer certificates, preferably one per node. etcd-1 is shown; for the other nodes change the IP, domain name, and host name
    [root@duduniao ssl]# cat etcd-1.json
    {
      "CN": "local-etcd-1.duduniao.com",
      "hosts": [
        "10.4.7.121",
        "etcd-1"
      ],
      "key": {
        "algo": "ecdsa",
        "size": 256
      },
      "names": [
        {
          "C": "CN",
          "L": "Shanghai",
          "ST": "Shanghai"
        }
      ]
    }
    [root@duduniao ssl]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer etcd-1.json | cfssljson -bare etcd-1
    [root@duduniao ssl]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer etcd-2.json | cfssljson -bare etcd-2
    [root@duduniao ssl]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer etcd-3.json | cfssljson -bare etcd-3
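The per-node CSR files differ only in the CN and the hosts list, so etcd-2.json and etcd-3.json can be derived from the etcd-1.json layout. A minimal sketch, assuming the other nodes follow the same naming pattern as etcd-1; `render_peer_csr` is a hypothetical helper, not a cfssl command.

```shell
#!/bin/sh
# Render a cfssl CSR file for one peer, following the etcd-1.json layout.

# render_peer_csr <node_name> <node_ip>
render_peer_csr() {
    name=$1
    ip=$2
    cat <<EOF
{
  "CN": "local-$name.duduniao.com",
  "hosts": ["$ip", "$name"],
  "key": {"algo": "ecdsa", "size": 256},
  "names": [{"C": "CN", "L": "Shanghai", "ST": "Shanghai"}]
}
EOF
}

# Print the CSR for etcd-2; redirect the output to etcd-2.json to use it
# with the cfssl gencert command for that node.
render_peer_csr etcd-2 10.4.7.122
```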
2.3.2. Deploying the etcd cluster

    # Stop the unencrypted etcd cluster and remove its data
    [root@duduniao etcd]# scan_host.sh cmd -h 10.4.7.121 10.4.7.122 10.4.7.123 "systemctl stop etcd"
    [root@duduniao etcd]# scan_host.sh cmd -h 10.4.7.121 10.4.7.122 10.4.7.123 "rm -fr /data/etcd"

    # Distribute the certificates
    [root@duduniao etcd]# scan_host.sh cmd -h 10.4.7.121 10.4.7.122 10.4.7.123 "mkdir -p /data/etcd/data /data/etcd/ssl"
    [root@duduniao etcd]# scan_host.sh push -h 10.4.7.121 10.4.7.122 10.4.7.123 ssl/ca.pem ssl/server.pem ssl/server-key.pem /data/etcd/ssl/
    [root@duduniao etcd]# scp ssl/etcd-1-key.pem ssl/etcd-1.pem 10.4.7.121:/data/etcd/ssl/
    [root@duduniao etcd]# scp ssl/etcd-2-key.pem ssl/etcd-2.pem 10.4.7.122:/data/etcd/ssl/
    [root@duduniao etcd]# scp ssl/etcd-3-key.pem ssl/etcd-3.pem 10.4.7.123:/data/etcd/ssl/
    [root@duduniao etcd]# scan_host.sh cmd -h 10.4.7.121 10.4.7.122 10.4.7.123 "ls -l /data/etcd/ssl"
    10.4.7.122
    total 20
    -rw-r--r-- 1 root root 1387 Oct 17 07:53 ca.pem
    -rw------- 1 root root  227 Oct 17 07:54 etcd-2-key.pem
    -rw-r--r-- 1 root root 1147 Oct 17 07:54 etcd-2.pem
    -rw------- 1 root root  227 Oct 17 07:53 server-key.pem
    -rw-r--r-- 1 root root 1184 Oct 17 07:53 server.pem
    10.4.7.121
    total 20
    -rw-r--r-- 1 root root 1387 Oct 17 07:53 ca.pem
    -rw------- 1 root root  227 Oct 17 07:54 etcd-1-key.pem
    -rw-r--r-- 1 root root 1147 Oct 17 07:54 etcd-1.pem
    -rw------- 1 root root  227 Oct 17 07:53 server-key.pem
    -rw-r--r-- 1 root root 1184 Oct 17 07:53 server.pem
    10.4.7.123
    total 20
    -rw-r--r-- 1 root root 1387 Oct 17 07:53 ca.pem
    -rw------- 1 root root  227 Oct 17 07:55 etcd-3-key.pem
    -rw-r--r-- 1 root root 1147 Oct 17 07:55 etcd-3.pem
    -rw------- 1 root root  227 Oct 17 07:53 server-key.pem
    -rw-r--r-- 1 root root 1184 Oct 17 07:53 server.pem

    # Update the service files
    # etcd-1.service
    [Unit]
    Description=Etcd Server
    After=network.target
    After=network-online.target
    Wants=network-online.target
    Documentation=https://github.com/coreos

    [Service]
    Type=notify
    WorkingDirectory=/data/etcd
    ExecStart=/usr/local/bin/etcd \
      --name etcd-1 \
      --initial-advertise-peer-urls https://10.4.7.121:2380 \
      --listen-peer-urls https://10.4.7.121:2380 \
      --listen-client-urls https://10.4.7.121:2379,https://127.0.0.1:2379 \
      --advertise-client-urls https://10.4.7.121:2379 \
      --initial-cluster-token etcd-cluster-1 \
      --initial-cluster etcd-1=https://10.4.7.121:2380,etcd-2=https://10.4.7.122:2380,etcd-3=https://10.4.7.123:2380 \
      --initial-cluster-state new \
      --client-cert-auth \
      --cert-file ssl/server.pem \
      --key-file ssl/server-key.pem \
      --trusted-ca-file ssl/ca.pem \
      --peer-client-cert-auth \
      --peer-trusted-ca-file ssl/ca.pem \
      --peer-cert-file ssl/etcd-1.pem \
      --peer-key-file ssl/etcd-1-key.pem \
      --data-dir data \
      --snapshot-count 50000 \
      --auto-compaction-retention 1 \
      --auto-compaction-mode periodic \
      --max-request-bytes 10485760 \
      --quota-backend-bytes 8589934592
    Restart=always
    RestartSec=15
    LimitNOFILE=65536
    OOMScoreAdjust=-999

    [Install]
    WantedBy=multi-user.target

    # etcd-2.service
    [Unit]
    Description=Etcd Server
    After=network.target
    After=network-online.target
    Wants=network-online.target
    Documentation=https://github.com/coreos

    [Service]
    Type=notify
    WorkingDirectory=/data/etcd
    ExecStart=/usr/local/bin/etcd \
      --name etcd-2 \
      --initial-advertise-peer-urls https://10.4.7.122:2380 \
      --listen-peer-urls https://10.4.7.122:2380 \
      --listen-client-urls https://10.4.7.122:2379,https://127.0.0.1:2379 \
      --advertise-client-urls https://10.4.7.122:2379 \
      --initial-cluster-token etcd-cluster-1 \
      --initial-cluster etcd-1=https://10.4.7.121:2380,etcd-2=https://10.4.7.122:2380,etcd-3=https://10.4.7.123:2380 \
      --initial-cluster-state new \
      --client-cert-auth \
      --cert-file ssl/server.pem \
      --key-file ssl/server-key.pem \
      --trusted-ca-file ssl/ca.pem \
      --peer-client-cert-auth \
      --peer-trusted-ca-file ssl/ca.pem \
      --peer-cert-file ssl/etcd-2.pem \
      --peer-key-file ssl/etcd-2-key.pem \
      --data-dir data \
      --snapshot-count 50000 \
      --auto-compaction-retention 1 \
      --auto-compaction-mode periodic \
      --max-request-bytes 10485760 \
      --quota-backend-bytes 8589934592
    Restart=always
    RestartSec=15
    LimitNOFILE=65536
    OOMScoreAdjust=-999

    [Install]
    WantedBy=multi-user.target

    # etcd-3.service
    [Unit]
    Description=Etcd Server
    After=network.target
    After=network-online.target
    Wants=network-online.target
    Documentation=https://github.com/coreos

    [Service]
    Type=notify
    WorkingDirectory=/data/etcd
    ExecStart=/usr/local/bin/etcd \
      --name etcd-3 \
      --initial-advertise-peer-urls https://10.4.7.123:2380 \
      --listen-peer-urls https://10.4.7.123:2380 \
      --listen-client-urls https://10.4.7.123:2379,https://127.0.0.1:2379 \
      --advertise-client-urls https://10.4.7.123:2379 \
      --initial-cluster-token etcd-cluster-1 \
      --initial-cluster etcd-1=https://10.4.7.121:2380,etcd-2=https://10.4.7.122:2380,etcd-3=https://10.4.7.123:2380 \
      --initial-cluster-state new \
      --client-cert-auth \
      --cert-file ssl/server.pem \
      --key-file ssl/server-key.pem \
      --trusted-ca-file ssl/ca.pem \
      --peer-client-cert-auth \
      --peer-trusted-ca-file ssl/ca.pem \
      --peer-cert-file ssl/etcd-3.pem \
      --peer-key-file ssl/etcd-3-key.pem \
      --data-dir data \
      --snapshot-count 50000 \
      --auto-compaction-retention 1 \
      --auto-compaction-mode periodic \
      --max-request-bytes 10485760 \
      --quota-backend-bytes 8589934592
    Restart=always
    RestartSec=15
    LimitNOFILE=65536
    OOMScoreAdjust=-999

    [Install]
    WantedBy=multi-user.target

    # Start the cluster
    [root@duduniao etcd]# scp etcd-1.service 10.4.7.121:/lib/systemd/system/etcd.service
    [root@duduniao etcd]# scp etcd-2.service 10.4.7.122:/lib/systemd/system/etcd.service
    [root@duduniao etcd]# scp etcd-3.service 10.4.7.123:/lib/systemd/system/etcd.service
    [root@duduniao etcd]# scan_host.sh cmd -h 10.4.7.121 10.4.7.122 10.4.7.123 "systemctl daemon-reload"
    10.4.7.122
    10.4.7.123
    10.4.7.121
    [root@duduniao etcd]# scan_host.sh cmd -h 10.4.7.121 10.4.7.122 10.4.7.123 "systemctl restart etcd && systemctl enable etcd"
2.3.3. Client verification

    # Without certificates
    root@ubuntu-1804-122:~# etcdctl --endpoints=10.4.7.121:2379 member list --write-out=table
    {"level":"warn","ts":"2021-10-17T10:26:16.722Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000318380/10.4.7.121:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection closed"}
    Error: context deadline exceeded

    # With certificates
    root@ubuntu-1804-122:~# cd /data/etcd/ssl/ && etcdctl --cacert ca.pem --cert server.pem --key server-key.pem --endpoints https://10.4.7.121:2379 member list
    4fe2b98ed7b794f7, started, etcd-3, https://10.4.7.123:2380, https://10.4.7.123:2379, false
    bbd6739258f69625, started, etcd-1, https://10.4.7.121:2380, https://10.4.7.121:2379, false
    c5542f3740ec56cd, started, etcd-2, https://10.4.7.122:2380, https://10.4.7.122:2379, false
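2.4. etcd command line
2.4.1. etcd
etcd has many startup flags. The core ones are listed below; see the official documentation for the rest.

    # Member flags
    --name                          Node name, used when forming a cluster. Default: default
    --data-dir                      Data directory. Default: ${name}.etcd under the working directory
    --wal-dir                       WAL directory. Default: the data-dir
    --snapshot-count                Number of committed transactions that triggers a snapshot. Default: 100000
    --listen-peer-urls              URLs to listen on for peer traffic; http or https, and 0.0.0.0 means all addresses. Default: http://localhost:2380
    --listen-client-urls            URLs to listen on for client traffic; http or https, and 0.0.0.0 means all addresses. Default: http://localhost:2379
    --max-request-bytes             Maximum size of a client request in bytes. Default: 1572864
    --quota-backend-bytes           Backend storage quota. Default: 2 GB; the suggested maximum is 8 GB. Once the quota is exceeded, writes are rejected

    # Clustering flags
    --initial-cluster               Bootstraps and initializes a new cluster. Only takes effect when a new member starts; ignored on later restarts
    --initial-advertise-peer-urls   Peer URLs advertised to the rest of the cluster. Only takes effect when a new member starts; ignored on later restarts
    --initial-cluster-state         Cluster state. new: all nodes are starting for the first time and forming a cluster; existing: join a cluster that already exists. Only takes effect when a new member starts; ignored on later restarts
    --initial-cluster-token         Cluster token, specified on every node at startup. Default: etcd-cluster. Only takes effect when a new member starts; ignored on later restarts
    --advertise-client-urls         Client URLs of this node advertised to the other members. Default: http://localhost:2379

    # Security flags
    --trusted-ca-file               CA certificate trusted by the server, used to verify clients
    --cert-file                     Server certificate issued by the trusted CA; encrypts traffic with clients and is verified by them
    --key-file                      Private key of the server certificate
    --client-cert-auth              When set, the server checks that client certificates are signed by the trusted CA
    --peer-trusted-ca-file          CA certificate trusted for peer traffic, used to verify other nodes
    --peer-cert-file                Peer certificate issued by the trusted CA; encrypts traffic with other nodes and is verified by them
    --peer-key-file                 Private key of the peer certificate
    --peer-client-cert-auth         When set, the remote peer's certificate must be signed by the trusted CA

    # Logging flags
    --log-level                     Log level. Default: info. Options: debug, info, warn, error, panic, fatal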
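The size flags above take plain byte counts, which are easy to mistype. A minimal sketch for deriving them; `mib_to_bytes` and `gib_to_bytes` are illustrative helpers, and the values shown are the ones used in this document's unit files plus the default quota.

```shell
#!/bin/sh
# Convert human-friendly sizes to the byte counts that etcd flags expect.
# mib_to_bytes and gib_to_bytes are illustrative helper names.

mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

echo "--max-request-bytes $(mib_to_bytes 10)"     # 10 MiB, as in the unit files
echo "--quota-backend-bytes $(gib_to_bytes 8)"    # 8 GiB, as in the unit files
echo "--quota-backend-bytes $(gib_to_bytes 2)"    # 2 GiB, the default quota
```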
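2.4.2. etcdctl
2.4.2.1. Common flags

    --endpoints   gRPC endpoints of the server. Default: 127.0.0.1:2379
    --cacert      CA certificate used to verify the server
    --cert        Client certificate
    --key         Private key of the client certificate

When access requires certificates, you can define a shell alias for the command. The examples below all use this alias:

    alias etc='etcdctl --cacert ssl/ca.pem --cert ssl/client.pem --key ssl/client-key.pem --endpoints https://10.4.7.121:2379'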
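An alias works well in an interactive shell, but shells typically do not expand aliases inside scripts. As an alternative sketch, etcdctl also reads its flags from `ETCDCTL_`-prefixed environment variables, so the same settings can be exported once. The certificate paths mirror the alias and assume that a client certificate has been issued as `ssl/client.pem`; the wrapper function is named `etc` only for consistency with the examples.

```shell
#!/bin/sh
# Configure etcdctl through environment variables instead of an alias.
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS="https://10.4.7.121:2379"
export ETCDCTL_CACERT="ssl/ca.pem"
export ETCDCTL_CERT="ssl/client.pem"
export ETCDCTL_KEY="ssl/client-key.pem"

# A function is available in scripts as well as in interactive shells.
etc() {
    etcdctl "$@"
}

echo "etcdctl will use endpoints: $ETCDCTL_ENDPOINTS"
```

After sourcing this file, `etc member list` behaves like the alias version as long as the relative certificate paths resolve from the current directory.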
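2.4.2.2. Common commands
etcdctl is etcd's client tool. It is mostly used to check cluster and node status, and occasionally to look up the value of a specific key.
- Specifying the API version
etcd has a v2 API and a v3 API. etcdctl defaulted to v2 before version 3.4 and defaults to v3 from 3.4 onward. Data written through the v2 and v3 APIs is not compatible, so select the version with an environment variable when querying:

    export ETCDCTL_API=3    # use the v3 API
    export ETCDCTL_API=2    # use the v2 API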
- Writing keys

    [root@duduniao etcd]# etc put key-1 value-1
    [root@duduniao etcd]# etc put key-2 value-2
    [root@duduniao etcd]# etc put key-3 value-3

- Querying keys

    # Exact match
    [root@duduniao etcd]# etc get key-1
    # Match by prefix
    [root@duduniao etcd]# etc get --prefix key
    # Print values only
    [root@duduniao etcd]# etc get --prefix key --print-value-only
    # Limit the number of results
    [root@duduniao etcd]# etc get --prefix key --limit 2

- Watching a key

    [root@duduniao etcd]# etc watch key-1

- Deleting a key

    [root@duduniao etcd]# etc del key-watch
- Viewing cluster status

    [root@duduniao etcd]# etc member list --write-out=table
    +------------------+---------+--------+-------------------------+-------------------------+------------+
    |        ID        | STATUS  |  NAME  |       PEER ADDRS        |      CLIENT ADDRS       | IS LEARNER |
    +------------------+---------+--------+-------------------------+-------------------------+------------+
    | 4fe2b98ed7b794f7 | started | etcd-3 | https://10.4.7.123:2380 | https://10.4.7.123:2379 |      false |
    | bbd6739258f69625 | started | etcd-1 | https://10.4.7.121:2380 | https://10.4.7.121:2379 |      false |
    | c5542f3740ec56cd | started | etcd-2 | https://10.4.7.122:2380 | https://10.4.7.122:2379 |      false |
    +------------------+---------+--------+-------------------------+-------------------------+------------+

- Viewing node status

    [root@duduniao etcd]# etc endpoint status --write-out=table
    +-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
    |        ENDPOINT         |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
    +-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
    | https://10.4.7.121:2379 | bbd6739258f69625 |   3.5.1 |   20 kB |      true |      false |         2 |         23 |                 23 |        |
    +-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
    [root@duduniao etcd]# etc endpoint health --write-out=table
    +-------------------------+--------+----------+-------+
    |        ENDPOINT         | HEALTH |   TOOK   | ERROR |
    +-------------------------+--------+----------+-------+
    | https://10.4.7.121:2379 |   true | 6.6739ms |       |
    +-------------------------+--------+----------+-------+
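The status commands above query only the single endpoint configured in the alias. To cover every member, the client URLs can be taken from the default comma-separated `member list` output, as printed in section 2.3.3. A minimal sketch, assuming that output format; `client_endpoints` is a hypothetical helper, and the sample text stands in for a live `etc member list` call.

```shell
#!/bin/sh
# Turn `etcdctl member list` output (default, comma-separated format) into
# a value suitable for --endpoints. Field 5 of each line is the client URL.

client_endpoints() {
    awk -F ', ' '{ print $5 }' | paste -sd, -
}

# Sample output copied from the member list shown in section 2.3.3.
sample='4fe2b98ed7b794f7, started, etcd-3, https://10.4.7.123:2380, https://10.4.7.123:2379, false
bbd6739258f69625, started, etcd-1, https://10.4.7.121:2380, https://10.4.7.121:2379, false
c5542f3740ec56cd, started, etcd-2, https://10.4.7.122:2380, https://10.4.7.122:2379, false'

printf '%s\n' "$sample" | client_endpoints
```

With a live cluster, pipe `etc member list` into `client_endpoints` instead of the sample. etcdctl's `endpoint status` and `endpoint health` also accept a `--cluster` flag, which queries every member in the member list directly.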
Original source: 运维渡渡鸟