
Building a Redis Cluster

Linux | ipcpu


1. Introduction to Redis Cluster

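Redis is an open-source key-value store. Thanks to its outstanding performance, most internet companies use it as a server-side cache. Before version 3.0, Redis supported only single-instance deployments. Master-replica replication and Sentinel address the single point of failure, but neither of them shards data, so they cannot keep up once a dataset grows to several hundred gigabytes, which is now common. Redis 3.0 therefore introduced cluster mode.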

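Redis Cluster uses a fully decentralized peer-to-peer design. The key space is divided into 16384 hash slots, and each master instance is responsible for a subset of them. All cluster state (nodes, ports, slot ownership, and so on) is kept up to date through periodic message exchange between the nodes.
A client can send a request to any instance in the cluster. If the requested key is not served by that instance, the instance replies with a redirection that points the client to the instance that does serve it.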
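
How a key maps to a slot can be sketched in a few lines. The Redis Cluster specification defines the slot as CRC16(key) mod 16384, using the XMODEM variant of CRC16, and hashes only the part between the first `{` and the next `}` when that part is non-empty (a "hash tag"). The function names below are illustrative and not part of any Redis client library.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (the XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_hash_slot(key: str) -> int:
    """Map a key to one of the 16384 hash slots, honoring {hash tags}."""
    raw = key.encode()
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty tag replaces the full key as the hash input.
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


print(key_hash_slot("foo"))
```

For the key foo this gives slot 12182, the same slot that appears in the redirect message later in this post.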

2. Preparing the Redis nodes

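Cluster plan: with 3 masters and one replica for each of them (the 3-master, 3-replica layout), the cluster needs at least 6 nodes.
The server list is as follows: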

10.140.12.101
10.140.12.102
10.140.12.103
10.140.12.104
10.140.12.105
10.140.12.106

Install Redis 3.2.12 on each machine, then edit the configuration file /etc/redis.conf with the following changes:

bind 0.0.0.0
cluster-enabled yes
appendonly yes

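Then start the Redis service on each of the 6 machines.
Once started, the nodes cannot be used as a cluster yet, because the cluster has not been created: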

10.140.12.101:6379> keys *
(empty list or set)
10.140.12.101:6379> set foo bar
(error) CLUSTERDOWN Hash slot not served 
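
The CLUSTERDOWN error appears because no node has been assigned any hash slot yet. As a rough sketch, assuming the default setting `cluster-require-full-coverage yes`, the cluster only serves requests once every one of the 16384 slots is owned by some master. The helper below is hypothetical and only illustrates that coverage check on a list of slot ranges.

```python
TOTAL_SLOTS = 16384


def uncovered_slots(ranges):
    """Return how many hash slots are not covered by any (first, last) range."""
    covered = set()
    for first, last in ranges:
        covered.update(range(first, last + 1))
    # Ignore anything outside 0..16383 before counting.
    return TOTAL_SLOTS - len(covered & set(range(TOTAL_SLOTS)))


# Freshly started nodes: no slots assigned at all.
print(uncovered_slots([]))
# Three masters that split the slot space between them.
print(uncovered_slots([(0, 5460), (5461, 10922), (10923, 16383)]))
```

The first call reports all 16384 slots as uncovered, the second reports none.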

3. Creating the cluster

Redis Cluster is managed with the official redis-trib.rb script. The script is written in Ruby, so a Ruby runtime is required.

yum install ruby rubygems -y
# Switch the gem source to a mirror
gem sources --add https://mirrors.tuna.tsinghua.edu.cn/rubygems/ --remove https://rubygems.org/
gem sources -l
# Install a specific version of the redis gem
gem install redis -v 3.3.0

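redis-trib.rb can be found in the Redis source tree (under src/); the RPM-installed Redis does not appear to ship it.
Note that the redis gem version is pinned here; otherwise gem will tell you to upgrade Ruby.
Next, initialize the cluster with the following command: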

# redis-trib.rb create --replicas 1 10.140.12.101:6379 10.140.12.102:6379 10.140.12.103:6379 10.140.12.104:6379 10.140.12.105:6379 10.140.12.106:6379
>>> Creating cluster
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
10.140.12.101:6379
10.140.12.102:6379
10.140.12.103:6379
Adding replica 10.140.12.104:6379 to 10.140.12.101:6379
Adding replica 10.140.12.105:6379 to 10.140.12.102:6379
Adding replica 10.140.12.106:6379 to 10.140.12.103:6379
M: e1f4165568a7eab01706754631be75822f13f791 10.140.12.101:6379
   slots:0-5460 (5461 slots) master
M: 52f7a4319c9e745e997b1bd7269e9578bd215d90 10.140.12.102:6379
   slots:5461-10922 (5462 slots) master
M: 7298d2cd3b7707b953ca94ac24bd70e18ee4c062 10.140.12.103:6379
   slots:10923-16383 (5461 slots) master
S: a328972e855c9d05526489b85401d4803b12e11b 10.140.12.104:6379
   replicates e1f4165568a7eab01706754631be75822f13f791
S: 77cfabdec1e964eba3de529d5f80b8d9542e7735 10.140.12.105:6379
   replicates 52f7a4319c9e745e997b1bd7269e9578bd215d90
S: f5ea720adb75a6cdc2e373665342e84365853477 10.140.12.106:6379
   replicates 7298d2cd3b7707b953ca94ac24bd70e18ee4c062
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join...
>>> Performing Cluster Check (using node 10.140.12.101:6379)
M: e1f4165568a7eab01706754631be75822f13f791 10.140.12.101:6379
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: f5ea720adb75a6cdc2e373665342e84365853477 10.140.12.106:6379
   slots: (0 slots) slave
   replicates 7298d2cd3b7707b953ca94ac24bd70e18ee4c062
S: a328972e855c9d05526489b85401d4803b12e11b 10.140.12.104:6379
   slots: (0 slots) slave
   replicates e1f4165568a7eab01706754631be75822f13f791
M: 7298d2cd3b7707b953ca94ac24bd70e18ee4c062 10.140.12.103:6379
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: 77cfabdec1e964eba3de529d5f80b8d9542e7735 10.140.12.105:6379
   slots: (0 slots) slave
   replicates 52f7a4319c9e745e997b1bd7269e9578bd215d90
M: 52f7a4319c9e745e997b1bd7269e9578bd215d90 10.140.12.102:6379
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

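A brief explanation of this command: it runs the Ruby script to create the cluster. --replicas 1 sets the master-to-replica ratio to 1:1, that is, each master gets one replica. The script then proposes which nodes become masters, which replica attaches to each master, and which slot range each master owns. A Redis Cluster always has exactly 16384 slots (numbered 0 to 16383). By default they are split evenly across the masters, but you can assign them yourself, and you can redistribute them later when adding or removing nodes.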
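
The even split can be reproduced with a small sketch. This is not the code of redis-trib.rb; it is an approximation of its rounding approach, written so that it yields the same ranges as the output above for 3 masters. The function name is made up for illustration.

```python
import math

TOTAL_SLOTS = 16384


def split_slots(master_count):
    """Split slots 0..16383 into contiguous, near-equal ranges, one per master."""
    per_master = TOTAL_SLOTS / master_count
    ranges = []
    first = 0
    cursor = 0.0
    for i in range(master_count):
        # Round to the nearest slot number (half rounds up).
        last = math.floor(cursor + per_master - 1 + 0.5)
        # The last master always takes everything up to slot 16383.
        if last > TOTAL_SLOTS - 1 or i == master_count - 1:
            last = TOTAL_SLOTS - 1
        last = max(last, first)
        ranges.append((first, last))
        first = last + 1
        cursor += per_master
    return ranges


for first, last in split_slots(3):
    print(f"{first}-{last} ({last - first + 1} slots)")
```

With 3 masters the middle one ends up with 5462 slots and the other two with 5461 each, because 16384 is not divisible by 3.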

4. Verifying and testing the cluster

Log in to one of the machines and run the verification commands:

# redis-cli -c -h 10.140.12.106 
10.140.12.106:6379> 
10.140.12.106:6379> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:3
cluster_stats_messages_sent:2721
cluster_stats_messages_received:1575
10.140.12.106:6379> 
10.140.12.106:6379> 
10.140.12.106:6379> cluster nodes
e1f4165568a7eab01706754631be75822f13f791 10.140.12.101:6379 master - 0 1544931962412 1 connected 0-5460
52f7a4319c9e745e997b1bd7269e9578bd215d90 10.140.12.102:6379 master - 0 1544931964417 2 connected 5461-10922
a328972e855c9d05526489b85401d4803b12e11b 10.140.12.104:6379 slave e1f4165568a7eab01706754631be75822f13f791 0 1544931963415 4 connected
f5ea720adb75a6cdc2e373665342e84365853477 10.140.12.106:6379 slave 7298d2cd3b7707b953ca94ac24bd70e18ee4c062 0 0 6 connected
7298d2cd3b7707b953ca94ac24bd70e18ee4c062 10.140.12.103:6379 master - 0 1544931965421 3 connected 10923-16383
77cfabdec1e964eba3de529d5f80b8d9542e7735 10.140.12.105:6379 slave 52f7a4319c9e745e997b1bd7269e9578bd215d90 0 1544931961410 5 connected
10.140.12.106:6379> 
10.140.12.106:6379> 
10.140.12.106:6379> 
10.140.12.106:6379> set foo bar
-> Redirected to slot [12182] located at 10.140.12.103:6379
OK
10.140.12.103:6379> SELECT 2
(error) ERR SELECT is not allowed in cluster mode
10.140.12.103:6379>

Note that redis-cli must be started with the -c option (cluster mode) here; otherwise you get an error like the following:

10.140.12.106:6379> set ipcpu 000
(error) MOVED 6021 10.140.12.102:6379
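
With -c, redis-cli follows such a redirect automatically. A cluster-aware client has to do something similar: parse the error, connect to the node it names, and retry the command. A minimal sketch of the parsing step follows; the function name is hypothetical, and ASK is the second redirect type defined by Redis Cluster (used while a slot is being migrated).

```python
def parse_redirect(error_message):
    """Parse 'MOVED <slot> <host>:<port>' or 'ASK <slot> <host>:<port>'.

    Returns (kind, slot, host, port), or None if the message is not a redirect.
    """
    parts = error_message.split()
    if len(parts) != 3 or parts[0] not in ("MOVED", "ASK"):
        return None
    # Split on the last colon so the host part is kept intact.
    host, _, port = parts[2].rpartition(":")
    return parts[0], int(parts[1]), host, int(port)


print(parse_redirect("MOVED 6021 10.140.12.102:6379"))
```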

Also, cluster mode has only DB 0; using SELECT to switch to any other database returns an error.
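
The `cluster nodes` output shown earlier is easy to process in a script, for example to see which replica belongs to which master. The sketch below assumes the field order used by this Redis version (node ID, address, flags, master ID, ping-sent, pong-received, config epoch, link state, then slot ranges). The function names are illustrative, and the sample text is copied from the session above.

```python
from collections import defaultdict

SAMPLE = """\
e1f4165568a7eab01706754631be75822f13f791 10.140.12.101:6379 master - 0 1544931962412 1 connected 0-5460
52f7a4319c9e745e997b1bd7269e9578bd215d90 10.140.12.102:6379 master - 0 1544931964417 2 connected 5461-10922
a328972e855c9d05526489b85401d4803b12e11b 10.140.12.104:6379 slave e1f4165568a7eab01706754631be75822f13f791 0 1544931963415 4 connected
f5ea720adb75a6cdc2e373665342e84365853477 10.140.12.106:6379 slave 7298d2cd3b7707b953ca94ac24bd70e18ee4c062 0 0 6 connected
7298d2cd3b7707b953ca94ac24bd70e18ee4c062 10.140.12.103:6379 master - 0 1544931965421 3 connected 10923-16383
77cfabdec1e964eba3de529d5f80b8d9542e7735 10.140.12.105:6379 slave 52f7a4319c9e745e997b1bd7269e9578bd215d90 0 1544931961410 5 connected
"""


def parse_cluster_nodes(text):
    """Turn CLUSTER NODES output into a list of dicts, one per node."""
    nodes = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip anything that is not a node line
        nodes.append({
            "id": fields[0],
            "addr": fields[1],
            "flags": fields[2].split(","),
            "master_id": None if fields[3] == "-" else fields[3],
            "slots": fields[8:],
        })
    return nodes


def replicas_by_master(nodes):
    """Map each master's address to the addresses of its replicas."""
    addr_of = {node["id"]: node["addr"] for node in nodes}
    topology = defaultdict(list)
    for node in nodes:
        if node["master_id"]:
            master = addr_of.get(node["master_id"], node["master_id"])
            topology[master].append(node["addr"])
    return dict(topology)


for master, replicas in replicas_by_master(parse_cluster_nodes(SAMPLE)).items():
    print(master, "<-", ", ".join(replicas))
```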

5. High-availability test (failover)

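Redis Cluster has failover built in, covering the job that Sentinel does for non-clustered deployments (it does not run Sentinel itself). We kill the redis-server process on 10.140.12.101; the log on another node then looks like this: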

11381:S 16 Dec 12:38:04.740 # Connection with master lost.
11381:S 16 Dec 12:38:04.740 * Caching the disconnected master state.
11381:S 16 Dec 12:38:05.093 * Connecting to MASTER 10.140.12.104:6379
11381:S 16 Dec 12:38:05.093 * MASTER <-> SLAVE sync started
11381:S 16 Dec 12:38:05.094 # Error condition on socket for SYNC: Connection refused
11381:S 16 Dec 12:38:06.096 * Connecting to MASTER 10.140.12.104:6379
11381:S 16 Dec 12:38:06.096 * MASTER <-> SLAVE sync started
11381:S 16 Dec 12:38:06.096 # Error condition on socket for SYNC: Connection refused
<output omitted>
11381:S 16 Dec 12:38:21.754 * FAIL message received from 7298d2cd3b7707b953ca94ac24bd70e18ee4c062 about a328972e855c9d05526489b85401d4803b12e11b
11381:S 16 Dec 12:38:21.754 # Cluster state changed: fail
11381:S 16 Dec 12:38:21.854 # Start of election delayed for 765 milliseconds (rank #0, offset 687).
11381:S 16 Dec 12:38:22.154 * Connecting to MASTER 10.140.12.104:6379
11381:S 16 Dec 12:38:22.154 * MASTER <-> SLAVE sync started
11381:S 16 Dec 12:38:22.155 # Error condition on socket for SYNC: Connection refused
11381:S 16 Dec 12:38:22.656 # Starting a failover election for epoch 8.
11381:S 16 Dec 12:38:22.702 # Failover election won: I'm the new master.
11381:S 16 Dec 12:38:22.702 # configEpoch set to 8 after successful failover
11381:M 16 Dec 12:38:22.702 * Discarding previously cached master state.
11381:M 16 Dec 12:38:22.702 # Cluster state changed: ok

As the log shows, the cluster performs the master-replica switchover on its own.
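
Two lines in the log can be read against the rules in the Redis Cluster specification: a replica waits 500 ms plus a random 0 to 500 ms plus 1000 ms per rank before asking for votes, and it wins once a majority of the masters has voted for it. The sketch below only restates those two rules; the helper names are illustrative.

```python
def votes_needed(master_count):
    """Majority of masters a replica needs to win the failover election."""
    return master_count // 2 + 1


def election_delay_bounds_ms(rank):
    """Lower and upper bound of the pre-election delay for a replica rank.

    delay = 500 ms + random(0..500) ms + rank * 1000 ms
    """
    base = 500 + rank * 1000
    return base, base + 500


low, high = election_delay_bounds_ms(0)
print(votes_needed(3), low, high)
```

For this cluster of 3 masters, 2 votes are enough, and the 765 ms delay reported for rank #0 lies within the 500 to 1000 ms window.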

6. Gripes

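Redis Cluster solves the high-availability problem, but it still has plenty of drawbacks:

Cluster management is done through an external Ruby script rather than being built into the cluster itself.
Unlike other distributed systems, Redis Cluster was not designed so that any node can forward a request on the client's behalf. It relies on redirection instead, which is very unfriendly to clients.

Every new Redis feature seems to leave a loose end that the client is expected to pick up. Sentinel does this, and so does Cluster. I do not know what the maintainers' reasoning is.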


Please credit the source when reposting: IPCPU-网络之路 » Building a Redis Cluster