MyISAM表删除后,MariaDB Galera Closter崩溃

时间:2018-08-22 14:16:37

标签: mariadb myisam galera

我们有5个节点的MariaDB Galera集群。在我们的案例中,删除了一个MyISAM表,之后整个集群崩溃了。我读过其他博客/问题,但需要一些有关内部工作的知识。

**我需要了解- 1.启动了一个drop table命令,并且从日志中可以清楚地看到,在其他节点(因为它是MyISAM)中找不到正在删除的记录,因此一致性受到损害,因此群集崩溃了。 MyISAM复制DDL语句,然后为什么要检查所有节点上表的内部记录-它不应该直接删除表而不比较其他节点上的记录,因为它是drop而不是delete语句。 (仅供参考-该表存在于所有节点上,但记录不同)

  1. 建议避免应用此补丁或任何修补程序/错误修复程序?

  2. 我想念的其他任何地方 **

MariaDB版本-10.1.33

日志-

2018-08-16 15:55:26 140366938635008 [Note] InnoDB: Online DDL : End of reading clustered index of the table and create temporary files
**2018-08-16 16:24:27 140366938635008 [ERROR] Slave SQL: Error 'Unknown table 'DB_TOTAL4DEV_P.users_last_login_2'' on query. Default database: ''. Query: 'drop table `DB_TOTAL4DEV_P`.`users_last_login_2`', Internal MariaDB error code: 1051**
**2018-08-16 16:24:27 140366938635008 [Warning] WSREP: RBR event 1 Query apply warning: 1, 309050572
2018-08-16 16:24:27 140366938635008 [Warning] WSREP: Ignoring error for TO isolated action: source: 4533077b-9f57-11e8-94aa-6b070f22a4b5 version: 3 local: 0 state: APPLYING flags: 65 conn_id: 492382 trx_id: -1 seqnos (l: 805550, g: 309050572, s: 309050571, d: 309050571, ts: 223731321595519)**
2018-08-16 16:24:28 140366938635008 [ERROR] Slave SQL: Error 'Unknown table 'DB_TOTAL4DEV_P.users_last_login_2'' on query. Default database: ''. Query: 'drop table `DB_TOTAL4DEV_P`.`users_last_login_2`', Internal MariaDB error code: 1051
2018-08-16 16:24:28 140366938635008 [Warning] WSREP: RBR event 1 Query apply warning: 1, 309050575
2018-08-16 16:24:28 140366938635008 [Warning] WSREP: Ignoring error for TO isolated action: source: 56daeaa4-9e8e-11e8-b891-6f203bac329a version: 3 local: 0 state: APPLYING flags: 65 conn_id: 708893 trx_id: -1 seqnos (l: 805553, g: 309050575, s: 309050574, d: 309050574, ts: 310032366214700)
2018-08-16 16:28:45 140366938635008 [ERROR] mysqld: Can't find record in 'users_last_login'
2018-08-16 16:28:45 140366938635008 [ERROR] Slave SQL: Could not execute Delete_rows_v1 event on table DB_TOTAL4DEV_P.users_last_login; Can't find record in 'users_last_login', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 1862, Internal MariaDB error code: 1032
2018-08-16 16:28:45 140366938635008 [Warning] WSREP: RBR event 3 Delete_rows_v1 apply warning: 120, 309051933
2018-08-16 16:28:45 140366938635008 [Warning] WSREP: Failed to apply app buffer: seqno: 309051933, status: 1
2018-08-16 16:28:45 140366938635008 [ERROR] mysqld: Can't find record in 'users_last_login'
2018-08-16 16:28:45 140366938635008 [ERROR] Slave SQL: Could not execute Delete_rows_v1 event on table DB_TOTAL4DEV_P.users_last_login; Can't find record in 'users_last_login', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 1862, Internal MariaDB error code: 1032
2018-08-16 16:28:45 140366938635008 [Warning] WSREP: RBR event 3 Delete_rows_v1 apply warning: 120, 309051933
2018-08-16 16:28:45 140366938635008 [Warning] WSREP: Failed to apply app buffer: seqno: 309051933, status: 1
2018-08-16 16:28:45 140366938635008 [ERROR] mysqld: Can't find record in 'users_last_login'
2018-08-16 16:28:45 140366938635008 [ERROR] Slave SQL: Could not execute Delete_rows_v1 event on table DB_TOTAL4DEV_P.users_last_login; Can't find record in 'users_last_login', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 1862, Internal MariaDB error code: 1032
2018-08-16 16:28:45 140366938635008 [Warning] WSREP: RBR event 3 Delete_rows_v1 apply warning: 120, 309051933
2018-08-16 16:28:45 140366938635008 [Warning] WSREP: Failed to apply app buffer: seqno: 309051933, status: 1
2018-08-16 16:28:45 140366938635008 [ERROR] mysqld: Can't find record in 'users_last_login'
2018-08-16 16:28:45 140366938635008 [ERROR] Slave SQL: Could not execute Delete_rows_v1 event on table DB_TOTAL4DEV_P.users_last_login; Can't find record in 'users_last_login', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 1862, Internal MariaDB error code: 1032
2018-08-16 16:28:45 140366938635008 [Warning] WSREP: RBR event 3 Delete_rows_v1 apply warning: 120, 309051933
2018-08-16 16:28:45 140366938635008 [ERROR] WSREP: Failed to apply trx: source: 808e3422-9dcd-11e8-bbc0-3ab028afd1b7 version: 3 local: 0 state: APPLYING flags: 1 conn_id: 893106 trx_id: 15068161908 seqnos (l: 806931, g: 309051933, s: 309051932, d: 309051868, ts: 393116532516870)
**2018-08-16 16:28:45 140366938635008 [ERROR] WSREP: Failed to apply trx 309051933 4 times
2018-08-16 16:28:45 140366938635008 [ERROR] WSREP: Node consistency compromised, aborting...**
2018-08-16 16:28:45 140366938635008 [Note] WSREP: Closing send monitor...
2018-08-16 16:28:45 140366938635008 [Note] WSREP: Closed send monitor.
2018-08-16 16:28:45 140366938635008 [Note] WSREP: gcomm: terminating thread
2018-08-16 16:28:45 140366938635008 [Note] WSREP: gcomm: joining thread
2018-08-16 16:28:45 140366938635008 [Note] WSREP: gcomm: closing backend
2018-08-16 16:28:45 140366938635008 [Note] WSREP: (590a3f66, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.134.18.25:4567 tcp://10.134.18.5:4567
2018-08-16 16:28:45 140366938635008 [Note] WSREP: declaring 7d266759 at tcp://10.134.19.4:4567 stable
2018-08-16 16:28:45 140366938635008 [Note] WSREP: declaring 808e3422 at tcp://10.134.18.4:4567 stable
2018-08-16 16:28:45 140366938635008 [Note] WSREP: forgetting 4533077b (tcp://10.134.18.5:4567)
2018-08-16 16:28:45 140366938635008 [Note] WSREP: forgetting 56daeaa4 (tcp://10.134.18.25:4567)
2018-08-16 16:28:45 140366938635008 [Note] WSREP: (590a3f66, 'tcp://0.0.0.0:4567') turning message relay requesting off
2018-08-16 16:28:45 140366938635008 [Note] WSREP: Node 590a3f66 state prim
2018-08-16 16:28:45 140366938635008 [Warning] WSREP: user message in state LEAVING
2018-08-16 16:28:45 140366938635008 [Warning] WSREP: 590a3f66 sending install message failed: Transport endpoint is not connected
2018-08-16 16:28:48 140366938635008 [Note] WSREP: (590a3f66, 'tcp://0.0.0.0:4567') connection to peer 808e3422 with addr tcp://10.134.18.4:4567 timed out, no messages seen in PT3S
2018-08-16 16:28:48 140366938635008 [Note] WSREP: (590a3f66, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.134.18.4:4567
2018-08-16 16:28:49 140366938635008 [Note] WSREP: (590a3f66, 'tcp://0.0.0.0:4567') reconnecting to 808e3422 (tcp://10.134.18.4:4567), attempt 0
2018-08-16 16:28:50 140366938635008 [Note] WSREP:  cleaning up 56daeaa4 (tcp://10.134.18.25:4567)
2018-08-16 16:28:50 140366938635008 [Note] WSREP:  cleaning up 4533077b (tcp://10.134.18.5:4567)
2018-08-16 16:28:50 140366938635008 [Note] WSREP: evs::proto(590a3f66, LEAVING, view_id(REG,590a3f66,613)) suspecting node: 7d266759
2018-08-16 16:28:50 140366938635008 [Note] WSREP: evs::proto(590a3f66, LEAVING, view_id(REG,590a3f66,613)) suspecting node: 808e3422
2018-08-16 16:28:50 140366938635008 [Note] WSREP: evs::proto(590a3f66, LEAVING, view_id(REG,590a3f66,613)) suspected node without join message, declaring inactive
2018-08-16 16:28:50 140366938635008 [Note] WSREP: view(view_id(NON_PRIM,590a3f66,613) memb {
2018-08-16 16:28:50 140366938635008 [Note] WSREP: view((empty))
2018-08-16 16:28:50 140366938635008 [Note] WSREP: gcomm: closed
2018-08-16 16:28:50 140366636377856 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2018-08-16 16:28:50 140366636377856 [Note] WSREP: Flow-control interval: [16, 16]
2018-08-16 16:28:50 140366636377856 [Note] WSREP: Trying to continue unpaused monitor
2018-08-16 16:28:50 140366636377856 [Note] WSREP: Received NON-PRIMARY.
2018-08-16 16:28:50 140366636377856 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 309051933)
2018-08-16 16:28:50 140366636377856 [Note] WSREP: Received self-leave message.
2018-08-16 16:28:50 140366636377856 [Note] WSREP: Flow-control interval: [0, 0]
2018-08-16 16:28:50 140366636377856 [Note] WSREP: Trying to continue unpaused monitor
2018-08-16 16:28:50 140366636377856 [Note] WSREP: Received SELF-LEAVE. Closing connection.
2018-08-16 16:28:50 140366636377856 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 309051933)
2018-08-16 16:28:50 140366636377856 [Note] WSREP: RECV thread exiting 0: Success
2018-08-16 16:28:50 140366938635008 [Note] WSREP: recv_thread() joined.
2018-08-16 16:28:50 140366938635008 [Note] WSREP: Closing replication queue.
2018-08-16 16:28:50 140366938635008 [Note] WSREP: Closing slave action queue.
2018-08-16 16:28:50 140366938635008 [Note] WSREP: /usr/sbin/mysqld: Terminated.

感谢您的帮助。 :) 请原谅我的mariaDB / Galera基本知识。

1 个答案:

答案 0 :(得分:0)

我们在测试中成功复制了该问题。问题是MyISAM表已更改为INNODB,然后该表被删除。由于表中的数据不同,MySQL在所有其他节点上都停止了运行,而Galera崩溃了。

更改为INNODB后,将复制任何新数据,但各个节点之间不同的现有数据也将保留在表中。现在,如果我们从表中删除该表或未复制的记录,则该记录在其他节点上不可用,因此群集崩溃。因此需要在更改表存储引擎之前确保这一点。