目录
一.半同步和无损复制
主从复制
基本上都是 异步复制
, Master并不关Slave节点有没有获取到数据 ,所以复制效率很高,但是数据有可能会丢失。
从 MySQL5.5 开始,MySQL推出了
semi-sync replication (半同步复制)
- 至少有一个Slave节点收到binlog后再返回( IO线程接收到即可 )
- 减少数据丢失风险
- 不能完全避免数据丢失
- 超时后,切换回异步复制
从 MySQL5.7.2 开始,MySQL推出了
lossless semi-sync replication (无损复制)
- 二进制日志(binlog)先写远程( IO线程接收到即可 )
- 可保证数据完全不丢失
1.1. loss less / semi-sync replication插件安装
- 1、手工安装
mysql> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';mysql> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
- 2、写入配置文件
[mysqld]plugin_dir=/usr/local/mysql/lib/pluginplugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
上述操作 仅仅是加载了插件 ,还 未启动
对应的功能,需要配置额外的参数:
[mysqld]# 等同于 rpl_semi_sync_master_enabled = 1loose_rpl_semi_sync_master_enabled = 1# 等同于 rpl_semi_sync_slave_enabled = 1loose_rpl_semi_sync_slave_enabled = 1# 超时5秒后,则切换回异步方式loose_rpl_semi_sync_master_timeout = 5000
使用 loose_
前缀表示如果没有加载 semi_sync
的插件,则 忽略该参数
Timeout
后,又追上了Master了( IO线程 ),则会 自动切换回半同步复制
注意:
半同步复
制 /无损复制
在主从
上都要安装插件和开启功能
。
注意:要保证
任意时刻发生一台机器宕机都不丢失数据
的前提是 mastersync_binlog
设置为1,slavesync_relay_log
设置为1。
1.2. semi-sync replication
semi-sync replication
称为 半同步复制
,在一个事务 提交(commit)
的过程时,在 InnoDB 层的 commit log
步骤后,Master节点需要收到 至少一个
Slave节点回复的 ACK
(表示 收到了binlog )后,才能继续下一个事务;
没有收到ACK
,则 切换为异步模式
,具体流程如下: 对应的配置参数如下:
[mysqld]# 开启主的半同步复制rpl_semi_sync_master_enabled=1# 开启从的半同步复制rpl_semi_sync_slave_enabled=1# 超时1秒,切回异步rpl_semi_sync_master_timeout=1000# 至少收到 1 个 slave发回的ackrpl_semi_sync_master_wait_for_slave_count=1
1.3. loss less semi-sync replication
loss less semi-sync replication
称为 无损复制
,在一个事务提交(commit
) 的过程时,在 MySQL
层的write binlog
步骤后,Master节点需要收到 至少一个
Slave节点回复的 ACK
(表示 收到了binlog )后,才能继续下一个事务;
没有收到ACK
,则 切换为异步模式
,具体流程如下: 对应的配置参数如下:
[mysqld]# 开启主的半同步复制rpl_semi_sync_master_enabled=1# 开启从的半同步复制rpl_semi_sync_slave_enabled=1# 超时1秒,切回异步rpl_semi_sync_master_timeout=1000[mysqld57]# 控制 半同步复制 还是 无损复制 的参数# - AFTER_SYNC 表示的是无损复制;(5.7 默认)# - AFTER_COMMIT 表示的是半同步复制;rpl_semi_sync_master_wait_point=AFTER_SYNC# 至少收到 1 个 slave发回的ackrpl_semi_sync_master_wait_for_slave_count=1
1.4. 半同步复制与无损复制的对比
ACK的时间点不同
半同步复制
在InnoDB层
的Commit Log后
等待ACK,主从切换会有数据丢失
风险。无损复制
在MySQL Server层
的Write binlog后
等待ACK,主从切换会有数据变多
风险。
主从数据一致性
半同步复制
意味着在Master节点上,这个刚刚提交的事务对数据库的修改,对其他事务是可见的。因此,如果在等待Slave ACK的时候crash了,那么会对其他事务出现幻读
,数据丢失
。无损复制
在write binlog完成后,就传输binlog,但还没有去写commit log,意味着当前这个事务对数据库的修改,其他事务也是不可见的
。因此,不会出现幻读,数据丢失风险。- 因此5.7.2引入了
无损复制(after_sync)模式
,带来的主要收益是解决after_commit
导致的master crash
后数据丢失问题
,因此在引入after_sync模式
后,所有提交的数据已被复制
,故障切换时数据一致性将得到提升。
1.5. 演示无损/半同步复制
- master server
mysql root@localhost:employees> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';Query OK, 0 rows affectedTime: 0.063smysql root@localhost:employees> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';Query OK, 0 rows affectedTime: 0.005smysql root@localhost:employees> set global rpl_semi_sync_master_enabled = 1;Query OK, 0 rows affectedTime: 0.020smysql root@localhost:employees> set global rpl_semi_sync_master_timeout = 5000;Query OK, 0 rows affectedTime: 0.001smysql root@localhost:employees> show global status like "%rpl%";+--------------------------------------------+-------+| Variable_name | Value |+--------------------------------------------+-------+| Rpl_semi_sync_master_clients | 0 || Rpl_semi_sync_master_net_avg_wait_time | 0 || Rpl_semi_sync_master_net_wait_time | 0 || Rpl_semi_sync_master_net_waits | 0 || Rpl_semi_sync_master_no_times | 0 || Rpl_semi_sync_master_no_tx | 0 || Rpl_semi_sync_master_status | ON | -- status ok| Rpl_semi_sync_master_timefunc_failures | 0 || Rpl_semi_sync_master_tx_avg_wait_time | 0 || Rpl_semi_sync_master_tx_wait_time | 0 || Rpl_semi_sync_master_tx_waits | 0 || Rpl_semi_sync_master_wait_pos_backtraverse | 0 || Rpl_semi_sync_master_wait_sessions | 0 || Rpl_semi_sync_master_yes_tx | 0 || Rpl_semi_sync_slave_status | OFF |+--------------------------------------------+-------+15 rows in setTime: 0.013smysql root@localhost:employees>
- slave server
(root@localhost) 09:59:14 [employees]> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';Query OK, 0 rows affected (0.05 sec)(root@localhost) 18:07:15 [employees]> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';Query OK, 0 rows affected (0.00 sec)(root@localhost) 18:07:29 [employees]> set global rpl_semi_sync_slave_enabled = 1;Query OK, 0 rows affected (0.00 sec)(root@localhost) 18:14:59 [employees]> show global variables like '%semi%';+-------------------------------------------+------------+| Variable_name | Value |+-------------------------------------------+------------+| rpl_semi_sync_master_enabled | OFF || rpl_semi_sync_master_timeout | 10000 || rpl_semi_sync_master_trace_level | 32 || rpl_semi_sync_master_wait_for_slave_count | 1 || rpl_semi_sync_master_wait_no_slave | ON || rpl_semi_sync_master_wait_point | AFTER_SYNC || rpl_semi_sync_slave_enabled | ON || rpl_semi_sync_slave_trace_level | 32 |+-------------------------------------------+------------+8 rows in set (0.02 sec)
半复制切换异步同步过程的状态
- master server
(root@localhost) 10:11:10 [tablespace]> show global status like "%rpl%";+--------------------------------------------+-------+| Variable_name | Value |+--------------------------------------------+-------+| Rpl_semi_sync_master_clients | 1 | -- 半同步复制的client数量| Rpl_semi_sync_master_net_avg_wait_time | 0 || Rpl_semi_sync_master_net_wait_time | 0 | | Rpl_semi_sync_master_net_waits | 1 | -- master总的等待slave的次数| Rpl_semi_sync_master_no_times | 0 | -- 切成异步的次数(no = number of)| Rpl_semi_sync_master_no_tx | 0 | -- 切成异步后提交的事物数| Rpl_semi_sync_master_status | ON | -- 半同步复制的状态| Rpl_semi_sync_master_timefunc_failures | 0 || Rpl_semi_sync_master_tx_avg_wait_time | 2209 | -- master等待事物的平均时间| Rpl_semi_sync_master_tx_wait_time | 2209 | -- master等待事物的总的时间| Rpl_semi_sync_master_tx_waits | 1 | -- master等待事物的次数| Rpl_semi_sync_master_wait_pos_backtraverse | 0 || Rpl_semi_sync_master_wait_sessions | 0 || Rpl_semi_sync_master_yes_tx | 1 || Rpl_semi_sync_slave_status | OFF |+--------------------------------------------+-------+15 rows in set (0.03 sec)
- slave server
(root@localhost) 13:47:48 [tablespace]> stop slave io_thread; --停掉IO线程Query OK, 0 rows affected (0.05 sec)
- master server
(root@localhost) 14:48:11 [tablespace]> show global status like "%rpl%";+--------------------------------------------+-------+| Variable_name | Value |+--------------------------------------------+-------+| Rpl_semi_sync_master_clients | 1 || Rpl_semi_sync_master_net_avg_wait_time | 0 || Rpl_semi_sync_master_net_wait_time | 0 || Rpl_semi_sync_master_net_waits | 1 || Rpl_semi_sync_master_no_times | 0 || Rpl_semi_sync_master_no_tx | 0 || Rpl_semi_sync_master_status | ON || Rpl_semi_sync_master_timefunc_failures | 0 || Rpl_semi_sync_master_tx_avg_wait_time | 2209 || Rpl_semi_sync_master_tx_wait_time | 2209 || Rpl_semi_sync_master_tx_waits | 1 || Rpl_semi_sync_master_wait_pos_backtraverse | 0 || Rpl_semi_sync_master_wait_sessions | 0 || Rpl_semi_sync_master_yes_tx | 1 || Rpl_semi_sync_slave_status | OFF |+--------------------------------------------+-------+15 rows in set (0.00 sec)(root@localhost) 14:48:14 [tablespace]> insert into qqq values(99); --插入sqlQuery OK, 1 row affected (5.04 sec) -- 等待5秒后,切成异步(root@localhost) 14:49:28 [tablespace]> show global status like "%rpl%";+--------------------------------------------+-------+| Variable_name | Value |+--------------------------------------------+-------+| Rpl_semi_sync_master_clients | 0 || Rpl_semi_sync_master_net_avg_wait_time | 0 || Rpl_semi_sync_master_net_wait_time | 0 || Rpl_semi_sync_master_net_waits | 1 | | Rpl_semi_sync_master_no_times | 1 | --切成异步的次数| Rpl_semi_sync_master_no_tx | 1 | --切成异步后的事物数| Rpl_semi_sync_master_status | OFF | --status 为off| Rpl_semi_sync_master_timefunc_failures | 0 || Rpl_semi_sync_master_tx_avg_wait_time | 2209 || Rpl_semi_sync_master_tx_wait_time | 2209 || Rpl_semi_sync_master_tx_waits | 1 || Rpl_semi_sync_master_wait_pos_backtraverse | 0 || Rpl_semi_sync_master_wait_sessions | 0 || Rpl_semi_sync_master_yes_tx | 1 || Rpl_semi_sync_slave_status | OFF |+--------------------------------------------+-------+15 rows in set (0.00 sec)(root@localhost) 14:49:57 [tablespace]>
1.6. 两种复制方式的性能
备注:上图是Facebook的测试性能图;其中Y轴是QPS,X轴是并发数
- 蓝色的
Normal Slave
是异步复制
- 性能很好,但是随着并发数的增长,性能有所下降
- 绿色的
Enhanced mysqlbinlog
是无损复制
- 随着并发数的增长,性能几乎是线性增长的,在高并发下,性能会优于异步复制
- 紫色的
Normal Semi Slave
是 半同步复制- 性能较低
无损复制性能优于半同步复制的原因
等待ACK回包
问题上,其实两种复制的开销是一样的,没有区别,都是网络的等待开销。无损复制
由于在write binlog
(commit 的第二步)后,需要等待ACK,后续的事务无法提交
,这样就堆积
了N多个需要落盘的事务
(半同步复制由于已经提交了事务,没有堆积事务的效果),通过组提交
机制一次 fsync的多个事务
(半同步复制也有组提交,只是一次fsync
的事务数没那么多), 相当于提高了I/O性能
;所以线程(事务)越多,效果越明显
,以至于有上图中超过异步复制的效果。(无损复制的组提交比例比原版的高3~4倍)
产生上述测试效果的前提:测试用例是
IO Bound
的(比如数据量有 100G,而 buffer pool 只有 10G),且并发数足够多。
下面这两个参数不要去设置,设置了反而性能差
(root@localhost) 10:10:47 [tablespace]> show variables like "%binlog_group%";+-----------------------------------------+-------+| Variable_name | Value |+-----------------------------------------+-------+| binlog_group_commit_sync_delay | 0 || binlog_group_commit_sync_no_delay_count | 0 | -- 等待一组里面有多少事务我才提交+-----------------------------------------+-------+2 rows in set (0.01 sec)(root@localhost) 10:10:50 [tablespace]> show variables like "%binlog_max%";+-----------------------------+-------+| Variable_name | Value |+-----------------------------+-------+| binlog_max_flush_queue_time | 0 | -- 等待多少时间后才进行组提交+-----------------------------+-------+1 row in set (0.00 sec)(root@localhost) 10:11:10 [tablespace]>
1.7. rpl_semi_sync_master_wait_for_slave_count
该参数控制Master在收到 多少个 Slave的ACK
后,才可以继续commit。配置多个ACK和配置一个ACK的效果是类似的,因为他们是 并行执行
的(理论上来说不会有两倍的等待时间), 取决于最慢的那个 。
二. 并行复制(Multi-Threaded Slave)
2.1. MTS介绍
在官方文档中,并行复制的叫法为 Multi-Threaded Slave (MTS)
- MySQL的并行复制基于组提交:
一个组提交中的事务都是可以并行执行的 ,因为既然处于组提交中,这意味着事务之间没有冲突(不会去更新同一行数据),否则不可能在同一个组里面。
Slave上开启并行复制,需要在配置文件中增加以下参数:[mysqld]slave-parallel-type=LOGICAL_CLOCK slave_preserve_commit_order=1slave-parallel-workers=4
slave-parallel-type 参数
DATABASE
基于库级别的并行复制
,如果只有一个库,就还是串行(为了兼容5.6)。LOGICAL_CLOCK
基于逻辑时钟
,主上怎么并行执行,从上也是怎么并行回放的。
slave-parallel-workers
并行复制的线程数
,一般设置为一个组内提交的事务数,线上设置为32足够了slave_preserve_commit_order
Slave上commit
的顺序保持一致
,必须为1,否则可能会有GAP锁产生
2.2. 动态调整复制线程数
配置并行复制后,Slave节点可以看到4个 Coordinator
线程
mysql root@localhost:(none)> show processlist;+----+-------------+-----------+--------+---------+--------+--------------------------------------------------------+------------------+| Id | User | Host | db | Command | Time | State | Info |+----+-------------+-----------+--------+---------+--------+--------------------------------------------------------+------------------+| 1 | system user | || Connect | 148155 | Waiting for master to send event | || 2 | system user | | | Connect | 57792 | Slave has read all relay log; waiting for more updates | || 3 | system user | | | Connect | 57791 | Waiting for an event from Coordinator | || 4 | system user | | | Connect | 148155 | Waiting for an event from Coordinator | || 7 | system user | | | Connect | 148155 | Waiting for an event from Coordinator | || 8 | system user | | | Connect | 148155 | Waiting for an event from Coordinator | || 12 | root | localhost | | Query | 0 | starting | show processlist |+----+-------------+-----------+--------+---------+--------+--------------------------------------------------------+------------------+7 rows in setTime: 0.019smysql root@localhost:(none)>-- 动态调整方式如下:mysql root@localhost:(none)> set global slave_parallel_workers=8;Query OK, 0 rows affectedTime: 0.003smysql root@localhost:(none)> stop slave; 一定要重启一下slave才能有效Query OK, 0 rows affectedTime: 0.038smysql root@localhost:(none)> start slave;Query OK, 0 rows affectedTime: 0.080smysql root@localhost:(none)> show processlist;+----+-------------+-----------+--------+---------+------+--------------------------------------------------------+------------------+| Id | User | Host | db | Command | Time | State | Info |+----+-------------+-----------+--------+---------+------+--------------------------------------------------------+------------------+| 12 | root | localhost | | Query | 0 | starting | show processlist || 13 | system user | | | Connect | 3 | Waiting for master to send event | || 14 | system user | | | Connect | 3 | Slave has read all relay log; waiting for more updates | || 15 | system user | | | Connect | 3 | Waiting for an event from Coordinator | || 16 | system user | | | Connect | 3 | Waiting for an event from Coordinator | || 17 | system user | | | Connect | 3 | Waiting for an event from Coordinator | || 18 | system user | | | Connect | 3 | Waiting for an event from Coordinator | || 19 | system user | | | Connect | 3 | Waiting for an event from Coordinator | || 20 | system user | | | Connect | 3 | Waiting for an event from Coordinator | || 21 | system user | | | Connect | 3 | Waiting for an event from Coordinator | || 22 | system user | | | Connect | 3 | Waiting for an event from Coordinator | |+----+-------------+-----------+--------+---------+------+--------------------------------------------------------+------------------+11 rows in setTime: 0.019smysql root@localhost:(none)>
特别注意:
这里的
并行复制
指的是SQL Thread (回放线程)
,而非IO Thread (IO线程)
Waiting for master to send event
这个State
在show processlist
中只有一个,即只有一个IO Thread
线上环境可以配置成两台Slave做无损复制(保证数据不丢),其他的Slave做异步复制(配置为只读,用于负载均衡),都指向同一台Master。
三. GTID
3.1. GTID的介绍
1.Global Transaction Id entifier -- 全局事物ID
2.GTID = Server_UUID + Transaction_ID
- Server_UUID 是全局唯一的
- Transaction_ID 是自增的
3.GTID 的作用是替代 Filename + Position
(root@localhost) 14:54:25 [tablespace]> show variables like "server_uuid";+---------------+--------------------------------------+| Variable_name | Value |+---------------+--------------------------------------+| server_uuid | 9dc847d8-bf72-11e7-9ec4-000c2998e4f1 |+---------------+--------------------------------------+1 row in set (0.01 sec)
在MySQL中看到的 UUID
,实际是保存在 $DATADIR/auto.cnf
中的,且该文件是服务器初始化
的时候自动生成
的。
[root@node1 mysqldata]# cat /r2/mysqldata/auto.cnf[auto]server-uuid=9dc847d8-bf72-11e7-9ec4-000c2998e4f1[root@node1 mysqldata]#
通过
冷备
做备份,拷贝$DATADIR
时,记得要把备份中的auto.cnf
给删除、
3.2. GTID的意义
未使用GTID
- 当
Master宕机
后,一个Slave
被选举提升
为New Master
,如果需要重建复制关系
,就需要把另外两个Slave
的CHANGE MASTER
指向New Master
; - 那问题来了,原来Slave是指向
Master
的Filename_M + Position_M
的位置,现在要指向New Master
上新的Filename_N + Position_N
的位置,由于基于二进制日志复制
,定位两个位置步骤繁琐; 此时两个Slave要继续重建复制关系(CHANGE MASTER)
会比较麻烦。
- 当
使用GTID
- 和上面一样的场景,
选举机制
提升为New Master的GTID
执行到最新事务,两个Slave
需要重新指向New Master
,由于使用了GTID
,目前Slave-A
获取到的事务对应GTID为GTID_A
,Slave-B
获取到的事务对应GTID为GTID_B
; - 此时
New Master
上GTID是存在GTID_A 和 GTID_B
,那两个Slave就可以直接使用GTID_A 和 GTID_B
这两个GTID,通过指向New Master
接着重建复制;
- 和上面一样的场景,
3.3. GTID的配置
[mysqld]log_bin = binloglog_slave_updates = 1gtid_mode = ON enforce-gtid-consistency = true
注意:
1.MySQL5.6 必须开启参数
log_slave_updates
(5.6版本的限制)2.MySQL5.6 升级到gtid模式需要
停机重启
3.MySQL5.7 版本开始可以不开启
log_slave_updates
4.MySQL5.7.6 版本开始可以
在线升级
成gtid模式
3.4. 基于GTID的复制
3.4.1. 配置基于GTID的复制
配置信息 | 主库(master) | 从库(slave) |
---|---|---|
主机 | node1.gczheng.com | node3.gczheng.com |
IP | 192.168.88.88 | 192.168.88.100 |
Port | 3306 | 3306 |
MySQL版本 | MySQL5.7.18 | MySQL5.7.18 |
Server_ID | 8888 | 88100 |
数据库的备份和还原步骤略过,这里使用 mydumper
备份,master上创建账号repl@'192.168.88.100'
,密码和权限同repl@'192.168.88.99'
。
- 1.master节点的配置文件加入
[mysqld]gtid_mode = ON enforce-gtid-consistency = true
- 2.slave 节点加入
[mysqld]gtid_mode = ON enforce-gtid-consistency = true
- 3.查看metadata信息
[root@node3 alldb]# cat metadataStarted dump at: 2018-02-15 18:53:58 --dump时间SHOW MASTER STATUS: Log: binlog.000014 --Filename Pos: 1361 --Pos GTID:9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-52 --GTID时间
注意:metadata中的 GTID:9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-52 表示为:这个GTID对应的
事物
已在备份中了,Slave还原备份后,进行复制时要跳过
这个GTID即对应的事物。
3.4.2. 跳过GTIDs
- Slave节点
如果备份使用 mysqldump
进行备份, 请跳过该步骤 ,因为在备份中已经存在下面这个语句:
---- GTID state at the beginning of the backup --SET @@GLOBAL.GTID_PURGED='9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-52';
因为我们使用 mydumper
进行备份, myloader
还原并不会
帮我们执行上述语句,需要手工执行该语句,让 Slave
知道这个GTID需要跳过:
mysql> SET @@GLOBAL.GTID_PURGED='9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-52';ERROR 1840 (HY000): @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.-- 这里报出一个错误,意思为如果要设置 @@GLOBAL.GTID_PURGED ,则 @@GLOBAL.GTID_EXECUTED 必须为空。
需要使用 reset master
命令,将 @@GLOBAL.GTID_EXECUTED
清空。
(root@localhost) 09:22:38 [tablespace]> reset master;Query OK, 0 rows affected (0.07 sec)(root@localhost) 09:23:18 [tablespace]> SET @@GLOBAL.GTID_PURGED='9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-52';Query OK, 0 rows affected (0.02 sec)
至此,我们已经跳过了备份中存在的GTIDs所对应的事物
3.4.3. CHANGE MASTER
- Slave节点
(root@localhost) 10:33:35 [tablespace]> change master to master_host="192.168.88.88", master_port=3306, master_user='repl',master_password='123456', master_auto_position=1;Query OK, 0 rows affected, 2 warnings (0.04 sec)(root@localhost) 10:33:46 [tablespace]> start slave;Query OK, 0 rows affected (0.03 sec)(root@localhost) 10:58:50 [tablespace]> show slave status \G;*************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.88.88 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: binlog.000014 Read_Master_Log_Pos: 1893 Relay_Log_File: node3-relay-bin.000007 Relay_Log_Pos: 445 Relay_Master_Log_File: binlog.000014 Slave_IO_Running: Yes --IO线程ok Slave_SQL_Running: Yes --SQL线程ok Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 1893 Relay_Log_Space: 2598 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 8888 Master_UUID: 9dc847d8-bf72-11e7-9ec4-000c2998e4f1 Master_Info_File: /r2/mysqldata/master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: 9dc847d8-bf72-11e7-9ec4-000c2998e4f1:29-54 --获取到GTID Executed_Gtid_Set: 9dc847d8-bf72-11e7-9ec4-000c2998e4f1:29-54 --执行到GTID Auto_Position: 1 Replicate_Rewrite_DB: Channel_Name: Master_TLS_Version:1 row in set (0.00 sec)ERROR:No query specified
3.4.4. 复制完成
- master节点
(root@localhost) 10:57:09 [tablespace]> show slave hosts;+-----------+------+------+-----------+--------------------------------------+| Server_id | Host | Port | Master_id | Slave_UUID |+-----------+------+------+-----------+--------------------------------------+| 88100 | | 3306 | 8888 | 5d4e3453-0fc5-11e8-b387-000c29c9f498 | --新建slave| 8899 | | 3306 | 8888 | 1951c7ee-c1fb-11e7-ac44-005056305232 |+-----------+------+------+-----------+--------------------------------------+2 rows in set (0.00 sec)(root@localhost) 10:58:34 [tablespace]> insert into qqq values(101); --插入a的值101Query OK, 1 row affected (0.02 sec)(root@localhost) 10:59:07 [tablespace]> select * from qqq;+------+| a |+------+| 1 || 2 || 3 || 4 || 10 || 99 || 101 |+------+7 rows in set (0.00 sec)
- slave节点
(root@localhost) 11:02:32 [tablespace]> select * from qqq;+------+| a |+------+| 1 || 2 || 3 || 4 || 10 || 99 || 101 | --slave同步到值101+------+7 rows in set (0.00 sec)(root@localhost) 11:03:08 [tablespace]>
3.5. GTID 与 Filename-Pos的对应
在 binlog 中,多了一个 GTID 的 event ,如下所示
(root@localhost) 11:43:24 [tablespace]> show binlog events;+---------------+-----+----------------+-----------+-------------+--------------------------------------------------------------------+| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |+---------------+-----+----------------+-----------+-------------+--------------------------------------------------------------------+| binlog.000001 | 4 | Format_desc | 88100 | 123 | Server ver: 5.7.18-log, Binlog ver: 4 || binlog.000001 | 123 | Previous_gtids | 88100 | 154 | || binlog.000001 | 154 | Gtid | 8888 | 219 | SET @@SESSION.GTID_NEXT= '9dc847d8-bf72-11e7-9ec4-000c2998e4f1:55' | -- 一个GTID 的 event| binlog.000001 | 219 | Query | 8888 | 282 | BEGIN || binlog.000001 | 282 | Table_map | 8888 | 334 | table_id: 227 (tablespace.qqq) || binlog.000001 | 334 | Write_rows | 8888 | 374 | table_id: 227 flags: STMT_END_F || binlog.000001 | 374 | Xid | 8888 | 405 | COMMIT /* xid=1407 */ |+---------------+-----+----------------+-----------+-------------+--------------------------------------------------------------------+7 rows in set (0.00 sec)(root@localhost) 11:43:39 [tablespace]>
通过 扫描 binlog中的GTID值,就可以知道 GTID
与Filename-Pos
对应的关系 ,但是如果binlog非常大,扫描的量也是会很大的,所以在binlog开头部分有一个 Previous_gtids 的event,如下所示:
(root@localhost) 11:44:57 [tablespace]> show binlog events in 'binlog.000001' \G;*************************** 1. row *************************** Log_name: binlog.000001 Pos: 4 Event_type: Format_desc Server_id: 88100End_log_pos: 123 Info: Server ver: 5.7.18-log, Binlog ver: 4*************************** 2. row *************************** Log_name: binlog.000001 Pos: 123 Event_type: Previous_gtids -- 表示在次之前,GTID运行到的范围是哪里 Server_id: 88100End_log_pos: 154 Info:*************************** 3. row *************************** Log_name: binlog.000001 Pos: 154 Event_type: Gtid Server_id: 8888End_log_pos: 219 Info: SET @@SESSION.GTID_NEXT= '9dc847d8-bf72-11e7-9ec4-000c2998e4f1:55'*************************** 4. row *************************** Log_name: binlog.000001 Pos: 219 Event_type: Query Server_id: 8888End_log_pos: 282 Info: BEGIN*************************** 5. row *************************** Log_name: binlog.000001 Pos: 282 Event_type: Table_map Server_id: 8888End_log_pos: 334 Info: table_id: 227 (tablespace.qqq)*************************** 6. row *************************** Log_name: binlog.000001 Pos: 334 Event_type: Write_rows Server_id: 8888End_log_pos: 374 Info: table_id: 227 flags: STMT_END_F*************************** 7. row *************************** Log_name: binlog.000001 Pos: 374 Event_type: Xid Server_id: 8888End_log_pos: 405 Info: COMMIT /* xid=1407 */7 rows in set (0.01 sec)ERROR:No query specified(root@localhost) 11:45:19 [tablespace]>
如果我要的
GTID
比Previous_gtids
的大,就扫描当前文件,反之则扫描之前的文件,依次类推。因为binlog在rotate(rotate events)
的时候,是知道当前最大的GTID的,可以将该值写入到下一个新的binlog的开头,即Previous_gtid
3.6. GTID复制出错的处理
演示人为的在从机上误操作导致的复制失败,如何恢复?
3.6.1. 演示
1.在从机上插入一条记录(模拟误操作)
- slave节点
现在 Slave
上插入一条记录(现实中如果配置了readonly,在app中是无法插入的,app不会给root权限)
mysql root@localhost:(none)> select * from ttt.t1;+---+| a | --a是主键+---+| 1 || 2 || 3 |+---+3 rows in setTime: 0.012smysql root@localhost:(none)> insert into ttt.t1 values(null);Query OK, 1 row affectedTime: 0.003smysql root@localhost:(none)> select * from ttt.t1;+---+| a |+---+| 1 || 2 || 3 || 4 |+---+4 rows in setTime: 0.014smysql root@localhost:(none)>
2.在主机上插入同样的记录
- master节点
mysql root@localhost:ttt> select * from t1;+---+| a |+---+| 1 || 2 || 3 |+---+3 rows in setTime: 0.009smysql root@localhost:ttt> insert into t1 values(null);Query OK, 1 row affectedTime: 0.007smysql root@localhost:ttt>
3.查看 Slave上的状态
- slave节点
mysql root@localhost:(none)> show slave status \G;***************************[ 1. row ]***************************Slave_IO_State | Waiting for master to send eventMaster_Host | 192.168.88.88Master_User | replMaster_Port | 3306Connect_Retry | 60Master_Log_File | binlog.000014Read_Master_Log_Pos | 6195Relay_Log_File | node2-relay-bin.000002Relay_Log_Pos | 568Relay_Master_Log_File | binlog.000014Slave_IO_Running | YesSlave_SQL_Running | NoReplicate_Do_DB |Replicate_Ignore_DB |Replicate_Do_Table |Replicate_Ignore_Table |Replicate_Wild_Do_Table |Replicate_Wild_Ignore_Table |Last_Errno | 1062Last_Error | Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '9dc847d8-bf72-11e7-9ec4-000c2998e4f1:74' at master log binlog.000014, end_log_pos 6164. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.Skip_Counter | 0Exec_Master_Log_Pos | 5944Relay_Log_Space | 1026Until_Condition | NoneUntil_Log_File |Until_Log_Pos | 0Master_SSL_Allowed | NoMaster_SSL_CA_File |Master_SSL_CA_Path |Master_SSL_Cert |Master_SSL_Cipher |Master_SSL_Key |Seconds_Behind_Master |Master_SSL_Verify_Server_Cert | NoLast_IO_Errno | 0Last_IO_Error |Last_SQL_Errno | 1062 --错误代码1062,主键重复Last_SQL_Error | Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '9dc847d8-bf72-11e7-9ec4-000c2998e4f1:74' at master log binlog.000014, end_log_pos 6164. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.Replicate_Ignore_Server_Ids |Master_Server_Id | 8888Master_UUID | 9dc847d8-bf72-11e7-9ec4-000c2998e4f1Master_Info_File | mysql.slave_master_infoSQL_Delay | 0SQL_Remaining_Delay | Slave_SQL_Running_State |Master_Retry_Count | 86400Master_Bind |Last_IO_Error_Timestamp |Last_SQL_Error_Timestamp | 180219 16:07:45Master_SSL_Crl |Master_SSL_Crlpath |Retrieved_Gtid_Set | 9dc847d8-bf72-11e7-9ec4-000c2998e4f1:73-74Executed_Gtid_Set | 1951c7ee-c1fb-11e7-ac44-005056305232:1,9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-73 --回放到这个事务时报错Auto_Position | 0Replicate_Rewrite_DB |Channel_Name |Master_TLS_Version |
其实从数据一致性看,目前主从数据是一致的,只是复制过来的日志在回放时,发现已有了该部分数据(Error:1062),从而引发了
复制异常
(SQL回放线程停止)。 我们只需要告诉MySQL,跳过
这部分一样的GTID,继续复制,即可。
3.6.2. 处理复制错误
这里的 跳过
的方法很巧妙,步骤如下
1.将Slave上的
gtid_next
指向执行失败
的那个gtid
- 这里执行失败的 gtid 报错信息中已经给出: '9dc847d8-bf72-11e7-9ec4-000c2998e4f1:73-74'
- 如果不看报错信息,可以看 Retrieved_Gtid_Set 和 Executed_Gtid_Set 的对比结果
2.执行一个空的事物,即
begin;commit
;- 这样就把
失败的gtid
对应到了一个空的事务
上,这个步骤即为 “跳过” 的意思
- 这样就把
3.将
gtid_next
设置(还原)为automatic
操作:
- slave节点mysql root@localhost:(none)> select @@gtid_next; -- 当前为默认值,AUTOMATIC+-------------+| @@gtid_next |+-------------+| AUTOMATIC |+-------------+1 row in setTime: 0.010s-- 步骤1 : 设置 gtid_next 为回放失败的gtidmysql root@localhost:(none)> set gtid_next="9dc847d8-bf72-11e7-9ec4-000c2998e4f1:74";Query OK, 0 rows affectedTime: 0.001s-- 步骤2 : 执行一个空的事物,让回放失败的gtid对应到这个空的事物mysql root@localhost:(none)> begin;Query OK, 0 rows affectedTime: 0.000smysql root@localhost:(none)> commit;Query OK, 0 rows affectedTime: 0.000s-- -- 步骤3 : 还原gtid_next为automaticmysql root@localhost:(none)> set gtid_next="automatic";Query OK, 0 rows affectedTime: 0.000s-- -- 步骤4 : 重新启动slavemysql root@localhost:(none)> stop slave;Query OK, 0 rows affectedTime: 0.001smysql root@localhost:(none)> start slave;Query OK, 0 rows affectedTime: 0.046smysql root@localhost:(none)> show slave status \G;***************************[ 1. row ]***************************Slave_IO_State | Waiting for master to send eventMaster_Host | 192.168.88.88Master_User | replMaster_Port | 3306Connect_Retry | 60Master_Log_File | binlog.000014Read_Master_Log_Pos | 6195Relay_Log_File | node2-relay-bin.000005Relay_Log_Pos | 357Relay_Master_Log_File | binlog.000014Slave_IO_Running | YesSlave_SQL_Running | YesReplicate_Do_DB |Replicate_Ignore_DB |Replicate_Do_Table |Replicate_Ignore_Table |Replicate_Wild_Do_Table |Replicate_Wild_Ignore_Table |Last_Errno | 0Last_Error |Skip_Counter | 01 row in setTime: 0.010smysql root@localhost:(none)>
skip_gtid_errors.sh 脚本
#!/bin/bash# line V1.0# mail:gczheng@139.com# data:2018-08-30# script_name:skip_gtid_errors.sh user=rootpass='iforgot'host='localhost'for((i=1;i<1000;i++))do#sql21=`mysql -u${user} -p${pass} -h${host} -e "show slave status\G;" 2>/dev/null|grep -v Last_Error | grep 'executing transaction'| awk '{print $21}'`Errno=`mysql -u${user} -p${pass} -h${host} -e "show slave status\G;" 2>/dev/null|grep 'Last_SQL_Errno' | awk -F ":" '{print $2}'`Seconds=`mysql -u${user} -p${pass} -h${host} -e "show slave status\G;" 2>/dev/null|grep 'Seconds_Behind_Master' | awk -F ":" '{print $2}'`Last_seen_transaction=`mysql -u${user} -p${pass} -h${host} -Ne "select LAST_SEEN_TRANSACTION from performance_schema.replication_applier_status_by_worker" 2>/dev/null`if [ $Errno -eq 1061 ] || [$Errno -eq 1062 ] || [$Errno -eq 1217 ] || [$Errno -eq 1050 ] then echo "----------------------" echo $(date +'%Y-%m-%d %H:%M:%S') && echo "Last_seen_transaction:${Last_seen_transaction}" && echo -e "Last_Errno:\033[31m ${Errno}\033[0m" mysql -u${user} -p${pass} -h${host} -e "STOP SLAVE;SET @@SESSION.GTID_NEXT='${Last_seen_transaction}';BEGIN; COMMIT;SET @@SESSION.GTID_NEXT = AUTOMATIC;START SLAVE;" 2>/dev/null else sleep 3 echo "----------------------" echo $(date +'%Y-%m-%d %H:%M:%S') && echo "Seconds_Behind_Master: ${Seconds} " && echo -e "Last_Errno:\033[32m ${Errno}\033[0m"fidone
3.6.3. 测试复制
1.Master端插入一个测试数据
- **master节点*
mysql root@localhost:ttt> insert into t1 values(null);Query OK, 1 row affectedTime: 0.007smysql root@localhost:ttt>
2.查看Slave2上的数据是否同步
- **slave节点*
mysql root@localhost:(none)> select * from ttt.t1;+---+| a |+---+| 1 || 2 || 3 || 4 || 5 | -- 立即同步过来的数据+---+5 rows in setTime: 0.011smysql root@localhost:(none)>
至此,GTID复制出错的处理就完成了。
注意:这里仅仅是
跳过错误
,和原来的sql_slave_skip_counter
(该功能在GTID下失效)功能类似,无法保证主从数据是一致的
(需要人工介入进行确认,比如仅仅主键一样,其他列不一样)
如果出现了很多的GTID的错误,可能是从机上有大量的操作,建议
但还是要在源头上避免此类情况的发生,确保在从机上开启重新搭建主从复制
read_only=1
,并且避免人工的误操作。 最后强调:GTID
是基于事务
的复制,一致性要求很高, 强烈建议在 Slave 上开启 read_only=1
3.7. 其他注意事项
- reset slave all 使用该命令时
不会清空数据
,仅仅是清空show slave status\G
里面的信息,所以在使用该命令之前,请先记录show slave status\G
的信息。 - GTID 在开启GTID后,不能在一个事物中使用
创建临时表的语句
,需要使得autocommit=1;
才可以。 在开启GTID后,不能使用create table select ...
的语法来创建表了,因为这其实是多个事物
了,GTID没法对应
四. 级联复制
4.1. 级联复制的介绍
slave B
从 master A
上复制, slave C
从 salve B
上复制,此时 slave B
上就要开启 log_slave_updates
产生binlog ,没有 binlog
则无法传递给
salve c
。 master
配置log_bin
参数是当变更时产生binlog, 复制关系中slave获取master的binlog写入relay-log中,slave自身不产生binlog
4.2. 级联复制的场景
4.2.1. 跨机房的复制
在跨机房搭建复制时,如果
缺点是复制的master
挂了,深圳slave
提升为New Master
,此时北京slave
是不需要去做CHANGE MASTER
操作的。延迟会更大
(跨机房的延迟本来就很难避免)。
4.2.2. 库的拆分
当 生产库
压力很大时或者需要做统计分析时,需要把 逻辑库
拆分出去,就可以使用 级联复制
,让 逻辑库
形成单独的库。
4.3. 级联复制测试
配置信息 | 主库(master) | 从库(slave A) | 从库(slave B) | 从库(slave C) |
---|---|---|---|---|
主机 | node1.gczheng.com | node2.gczheng.com | node3.gczheng.com | proxy.gczheng.com |
IP | 192.168.88.88 | 192.168.88.99 | 192.168.88.100 | 192.168.88.77 |
Port | 3306 | 3306 | 3306 | 3306 |
MySQL版本 | MySQL5.7.18 | MySQL5.7.18 | MySQL5.7.18 | MySQL5.7.18 |
Server_ID | 8888 | 8899 | 88100 | 8877 |
Master --> Slave A --> Slave C
是级联复制,Master到Slave A 和 Master到Slave B 之前已部署完,直接做 Slave A到Slave C 的级联复制
1、在Master端新建一个DB名为
mts库
,用于在 Slave3 上测试参数replicate_do_db
。- master节点
(root@localhost) 14:54:28 [tablespace]> create database mts;Query OK, 1 row affected (0.03 sec)(root@localhost) 14:54:33 [tablespace]> use mts;Database changed(root@localhost) 14:55:34 [mts]> create table t(a int);Query OK, 0 rows affected (0.07 sec)(root@localhost) 14:55:38 [mts]> insert into t values(1),(2),(3);Query OK, 3 rows affected (0.01 sec)Records: 3 Duplicates: 0 Warnings: 0(root@localhost) 14:56:10 [mts]> select * from mts.t;+------+| a |+------+| 1 || 2 || 3 |+------+3 rows in set (0.00 sec)(root@localhost) 14:56:22 [mts]>
- 2、在Slave C上增加如下参数
[mysqld]report-host=192.168.88.77 # 只复制mts的库 replicate_do_db=mts
- 3、确认在slave A 已添加log_slave_updates,并建立通用账号'repl'@'192.168.88.%'
[mysqld]log_slave_updates=1
4、change master
- Slave C 节点
(root@localhost) 16:19:58 [mts]> change master to master_host="192.168.88.99", master_port=3306, master_user='repl', master_password='123456', master_auto_position=1;Query OK, 0 rows affected, 2 warnings (0.02 sec)(root@localhost) 16:20:31 [mts]> start slave;Query OK, 0 rows affected (0.03 sec)(root@localhost) 16:20:38 [mts]> show slave status \G;*************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.88.99 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: binlog.000035 Read_Master_Log_Pos: 2707 Relay_Log_File: proxy-relay-bin.000002 Relay_Log_Pos: 405 Relay_Master_Log_File: binlog.000035 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: mts Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 2707 Relay_Log_Space: 612 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 8899 Master_UUID: 1951c7ee-c1fb-11e7-ac44-005056305232 Master_Info_File: mysql.slave_master_info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: Executed_Gtid_Set: 1951c7ee-c1fb-11e7-ac44-005056305232:1-17,9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-29:47-58 Auto_Position: 1 Replicate_Rewrite_DB: Channel_Name: Master_TLS_Version:1 row in set (0.00 sec)ERROR:No query specified(root@localhost) 16:20:44 [mts]>
复制关系:Master-->Slave A-->Slave C
,且使用了基于GTID的复制
;
五.多源复制
5.1. 多源复制的介绍
多源复制
是指多主一从的复制,从 MySQL5.7.6 才有的功能,如下图所示:
在语法层面上,只是在原来的change master
的基础上,增加了 for channel 'channel_name'
5.2. 多源复制的演示
主机信息:
配置信息 | 主库(master A) | 主库(master B) | 从库(slave ) |
---|---|---|---|
主机 | node1.gczheng.com | node2.gczheng.com | node3.gczheng.com |
IP | 192.168.88.88 | 192.168.88.99 | 192.168.88.100 |
Port | 3306 | 3306 | 3306 |
MySQL版本 | MySQL5.7.18 | MySQL5.7.18 | MySQL5.7.18 |
Server_ID | 8888 | 8899 | 88100 |
master A 和 master B 统一创建同步user `repl@'192.168.88.%'' password为123456
-
- master准备数据
- master A 节点
mysql root@localhost:(none)> create database dbn1;Query OK, 1 row affectedTime: 0.011s】mysql root@localhost:(none)> use dbn1You are now connected to database "dbn1" as user "root"Time: 0.007smysql root@localhost:dbn1> create table tn1(a int auto_increment primary key);Query OK, 0 rows affectedTime: 0.062smysql root@localhost:dbn1> insert into dbn1.tn1 values(null);Query OK, 1 row affectedTime: 0.068smysql root@localhost:dbn1> insert into dbn1.tn1 values(null);Query OK, 1 row affectedTime: 0.014smysql root@localhost:dbn1> select * from dbn1.tn1;+---+| a |+---+| 1 || 2 |+---+2 rows in setTime: 0.010smysql root@localhost:dbn1> show master status \G;***************************[ 1. row ]***************************File | binlog.000014Position | 3953Binlog_Do_DB |Binlog_Ignore_DB |Executed_Gtid_Set | 9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-64 --执行到641 row in setTime: 0.007smysql root@localhost:dbn1>
- **master B 节点**
mysql gcdb@localhost:(none)> create database dbn2;Query OK, 1 row affectedTime: 0.010smysql gcdb@localhost:(none)> use dbn2;You are now connected to database "dbn2" as user "gcdb"Time: 0.015smysql gcdb@localhost:dbn2> create table tn2(a int auto_increment primary key);Query OK, 0 rows affectedTime: 0.061smysql gcdb@localhost:dbn2> insert into dbn2.tn2 values(null);Query OK, 1 row affectedTime: 0.015smysql gcdb@localhost:dbn2> insert into dbn2.tn2 values(null);Query OK, 1 row affectedTime: 0.003smysql gcdb@localhost:dbn2> insert into dbn2.tn2 values(null);Query OK, 1 row affectedTime: 0.003smysql gcdb@localhost:dbn2> insert into dbn2.tn2 values(null);Query OK, 1 row affectedTime: 0.002smysql gcdb@localhost:dbn2> select * from dbn2.tn2;+---+| a |+---+| 1 || 2 || 3 || 4 |+---+4 rows in setTime: 0.012smysql gcdb@localhost:dbn2> show master status \G;***************************[ 1. row ]***************************File | binlog.000001Position | 1687Binlog_Do_DB |Binlog_Ignore_DB |Executed_Gtid_Set | 1951c7ee-c1fb-11e7-ac44-005056305232:1-7 --执行到71 row in setTime: 0.012smysql gcdb@localhost:dbn2>
2.mysqldump 备份导出导入
- Slave 节点
[root@node3 ~]# mysqldump -ugcdb -piforgot -h192.168.88.99 -P 3306 -B dbn2 >/tmp/dbn2.sql[root@node3 ~]# mysqldump -ugcdb -piforgot -h192.168.88.88 -P 3306 -B dbn1 >/tmp/dbn1.sql[root@node3 tmp]# less dbn2.sql-----------省略其他输出--------------在恢复备份的时候,会自动执行该语句,就不需要我们手工跳过了,如果是mydumper之类的,需要手工跳过SET @@GLOBAL.GTID_PURGED='1951c7ee-c1fb-11e7-ac44-005056305232:1-7';------------省略其他输出------------[root@node3 tmp]# mysql -uroot -piforgot < /tmp/dbn2.sqlmysql: [Warning] Using a password on the command line interface can be insecure.ERROR 1840 (HY000) at line 24: @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.--还是之前的问题,需要在Slave上,先操作一下 reset master ,以清空 @@GLOBAL.GTID_EXECUTED [root@node3 tmp]# mysql -uroot -piforgot < /tmp/dbn2.sqlmysql: [Warning] Using a password on the command line interface can be insecure.[root@node3 tmp]# mysql -uroot -piforgot < /tmp/dbn1.sqlmysql: [Warning] Using a password on the command line interface can be insecure.ERROR 1840 (HY000) at line 24: @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.[root@node3 tmp]# mysql -uroot -piforgot < /tmp/dbn1.sqlmysql: [Warning] Using a password on the command line interface can be insecure.[root@node3 tmp]#
- 3 change master
先在Slave上设置需要复制的库
,如果不设置的话,默认会同步系统的库(mysql),这样可能会复制出错(因为master a
和master b
上的mysql
库中可能有相同的记录)。
- Slave 节点
[mysqld]# 只复制dbn1 和 dbn2,这个步骤很重要,且有多个时,必须分多行写replicate_do_db=dbn1 replicate_do_db=dbn2#mysql5.7多源复制必须添加的参数(不加报错),5.7版本之前不用加master_info_repository=TABLErelay_log_info_repository=TABLE
(root@localhost) 13:57:46 [(none)]> change master to master_host='192.168.88.88', master_port=3306, master_user='repl', master_password='123456', master_auto_position=1 for channel 'ch1';Query OK, 0 rows affected, 2 warnings (0.01 sec)(root@localhost) 13:58:12 [(none)]> change master to master_host='192.168.88.99', master_port=3306, master_user='repl', master_password='123456', master_auto_position=1 for channel 'ch2';Query OK, 0 rows affected, 2 warnings (0.01 sec)(root@localhost) 13:59:19 [(none)]> start slave for channel 'ch1';Query OK, 0 rows affected (0.00 sec)(root@localhost) 13:59:59 [(none)]> start slave for channel 'ch2';Query OK, 0 rows affected (0.00 sec)(root@localhost) 14:39:13 [(none)]> show slave status for channel 'ch1' \G;*************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.88.88 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: binlog.000014 Read_Master_Log_Pos: 4349 Relay_Log_File: node3-relay-bin-ch1.000002 Relay_Log_Pos: 405 Relay_Master_Log_File: binlog.000014 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: dbn1,dbn2 Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 4349 Relay_Log_Space: 616 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 8888 Master_UUID: 9dc847d8-bf72-11e7-9ec4-000c2998e4f1 Master_Info_File: mysql.slave_master_info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: Executed_Gtid_Set: 9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-66 Auto_Position: 1 Replicate_Rewrite_DB: Channel_Name: ch1 Master_TLS_Version:1 row in set (0.00 sec)ERROR:No query specified(root@localhost) 14:41:49 [(none)]> show slave status for channel 'ch2' \G;*************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.88.99 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: binlog.000001 Read_Master_Log_Pos: 2083 Relay_Log_File: node3-relay-bin-ch2.000002 Relay_Log_Pos: 801 Relay_Master_Log_File: binlog.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: dbn1,dbn2 Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 2083 Relay_Log_Space: 1012 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 8899 Master_UUID: 1951c7ee-c1fb-11e7-ac44-005056305232 Master_Info_File: mysql.slave_master_info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: 1951c7ee-c1fb-11e7-ac44-005056305232:8-9 Executed_Gtid_Set: 9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-66 Auto_Position: 1 Replicate_Rewrite_DB: Channel_Name: ch2 Master_TLS_Version:1 row in set (0.00 sec)ERROR:No query specified(root@localhost) 14:41:54 [(none)]>
验证
- master A 节点
mysql root@localhost:(none)> insert into dbn1.tn1 values(null);Query OK, 1 row affectedTime: 0.013smysql root@localhost:(none)> insert into dbn1.tn1 values(null);Query OK, 1 row affectedTime: 0.012smysql root@localhost:(none)>
- master B 节点
mysql root@localhost:(none)> insert into dbn2.tn2 values(null);Query OK, 1 row affectedTime: 0.003smysql root@localhost:(none)> insert into dbn2.tn2 values(null);Query OK, 1 row affectedTime: 0.003smysql root@localhost:(none)>
- slave A 节点
(root@localhost) 14:41:54 [(none)]> select * from dbn1.tn1;+---+| a |+---+| 1 || 2 || 3 || 4 |+---+4 rows in set (0.00 sec)(root@localhost) 14:45:14 [(none)]> select * from dbn2.tn2;+---+| a |+---+| 1 || 2 || 3 || 4 || 5 || 6 |+---+6 rows in set (0.00 sec)(root@localhost) 14:45:17 [(none)]>
至此,多源复制的主从搭建就完成了。
5.3.使用场景
如果 Master B
上也有一个 dbn1
的库,会有问题么?
如果不做额外的配置,是
会有错误
的;如果配置了
slave_skip_errors = ddl_exist_errors
,且没有重复数据
话,复制关系还是正常的。这种操作可以起到
但是最合适的场景还是将数据聚合
的效果。将分库分表后的数据聚合在一起,以供其他应用进行分析(前提是数据不能有重复)。不同的库
进行复制。中间件的unique key
中间件可以保证分区键是唯一的(比如order_id),但是对于其他唯一索引来说,需要业务层去保证
。