Galera high priority abort and certification failure for threads

Time: 2017-04-28 14:59:52

Tags: mysql galera

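We are running Percona MySQL 5.5 XtraDB Cluster (2 nodes and an arbitrator) with galera 2-2.12. I run haproxy as a transparent proxy (via iptables TPROXY) that sends all traffic to a single node unless that node is unavailable. Every 10-20 days we hit a problem that looks like this. The first sign of trouble in the log is: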

[Warning] Too many connections

This goes on for a few minutes, and then we get:

TRANSACTION 2B37091B, ACTIVE 1506 sec, thread declared inside InnoDB 499
mysql tables in use 1, locked 1
3 lock struct(s), heap size 1248, 2 row lock(s), undo log entries 1
MySQL thread id 1498250, OS thread handle 0x7efccc658700, query id 14839064 <db02> <db02 ip> <db> wsrep in pre-commit stage
<update query>
*** WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 773 page no 1707 n bits 304 index `date` of table <table> trx id 2B37091B lock_mode X locks rec but not gap
170427 14:03:47 [Note] WSREP: cluster conflict due to high priority abort for threads:
170427 14:03:47 [Note] WSREP: Winning thread: 
   THD: 4, mode: applier, state: executing, conflict: no conflict, seqno: 38463147
   SQL: (null)
170427 14:03:47 [Note] WSREP: Victim thread: 
   THD: 1498250, mode: local, state: committing, conflict: no conflict, seqno: 38463644
   SQL: <update query>

Then we get a bunch of these:

170427 14:03:49 [Note] WSREP: cluster conflict due to certification failure for threads:
170427 14:03:49 [Note] WSREP: Victim thread: 
   THD: 1498309, mode: local, state: executing, conflict: cert failure, seqno: 38463678
   SQL: <insert query> 
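
To get a sense of how many of each kind of conflict occur during an incident, the log can be tallied with a short script. This is only a minimal sketch: the regular expression is modeled solely on the log lines quoted above, and the function name `count_conflicts` and the `sample` list are hypothetical.

```python
import re
from collections import Counter

# Pattern based only on the conflict header lines quoted in this post, e.g.
# "170427 14:03:49 [Note] WSREP: cluster conflict due to certification failure for threads:"
CONFLICT_RE = re.compile(r"WSREP: cluster conflict due to (.+?) for threads:")


def count_conflicts(lines):
    """Count conflict header lines, grouped by the reason given in the log."""
    counts = Counter()
    for line in lines:
        match = CONFLICT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines copied from the log excerpts above.
sample = [
    "170427 14:03:47 [Note] WSREP: cluster conflict due to high priority abort for threads:",
    "170427 14:03:47 [Note] WSREP: Winning thread: ",
    "170427 14:03:49 [Note] WSREP: cluster conflict due to certification failure for threads:",
    "170427 14:03:49 [Note] WSREP: Victim thread: ",
]
counts = count_conflicts(sample)
print(dict(counts))
```

Against the real log, the same function could be fed an open file handle, for example `count_conflicts(open("/var/log/mysqld.log"))`, using the `log-error` path from the configuration in this post.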

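Once these finish, the cluster returns to normal. While this is going on, the cluster is unusable and end users report a database outage. It used to not resolve on its own, but since I added the following to my configuration, it recovers from the event within 1-5 minutes: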

wsrep_provider_options="gcs.fc_limit=500; gcs.fc_master_slave=YES; gcs.fc_factor=1.0"
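
For readers unfamiliar with the format, the value is a semicolon-separated list of `key=value` pairs, and all three keys here carry the `gcs.fc_` prefix. The sketch below only illustrates that structure; `parse_provider_options` is a hypothetical helper, not part of MySQL or Galera.

```python
def parse_provider_options(value):
    """Split a 'key=value; key=value' string into a dict of strings."""
    options = {}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        key, _, val = part.partition("=")
        options[key.strip()] = val.strip()
    return options


# The exact option string from the configuration line above.
opts = parse_provider_options(
    "gcs.fc_limit=500; gcs.fc_master_slave=YES; gcs.fc_factor=1.0"
)
for key, val in opts.items():
    print(f"{key} = {val}")
```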

My database configuration:

[client]
socket=/var/lib/mysql/mysql.sock

[mysqld]
server-id=<id>
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
log-bin
log_slave_updates
expire_logs_days=7
symbolic-links=0
wsrep_provider=/usr/lib64/galera2/libgalera_smm.so
wsrep_cluster_address=gcomm://<gcom string>
binlog_format=ROW
default_storage_engine=InnoDB
wsrep_slave_threads= 8
wsrep_log_conflicts
wsrep_cluster_name=<cluster name>
wsrep_node_name=<node name>
wsrep_node_address=<node ip>
wsrep_provider_options="gcs.fc_limit=500; gcs.fc_master_slave=YES; gcs.fc_factor=1.0"
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=<redact>
max_connections=300
innodb_buffer_pool_size=20G
innodb_additional_mem_pool_size = 20M
innodb_autoinc_lock_mode = 2
innodb_buffer_pool_instances = 20
innodb_lock_wait_timeout = 120
innodb_log_buffer_size = 8M
innodb_log_file_size = 48M
innodb_log_files_in_group = 3
innodb_max_dirty_pages_pct = 90
innodb_read_io_threads = 8
innodb_thread_concurrency = 16
innodb_write_io_threads = 8
innodb_file_per_table = 1

The tables that the queries always seem to reference look like this:

+-----------+--------------+------+-----+---------+-------+
| Field     | Type         | Null | Key | Default | Extra |
+-----------+--------------+------+-----+---------+-------+
| date      | date         | NO   | MUL | NULL    |       |
| page_name | varchar(100) | YES  | MUL | NULL    |       |
| page_hits | float        | NO   |     | 1       |       |
+-----------+--------------+------+-----+---------+-------+

+-----------+-------------+------+-----+---------+-------+
| Field     | Type        | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+-------+
| ip_hash   | varchar(32) | NO   | MUL | NULL    |       |
| timestamp | timestamp   | YES  | MUL | NULL    |       |
+-----------+-------------+------+-----+---------+-------+

I am about at the end of my rope with galera and ready to go back to standalone MySQL, but any suggestions would be greatly appreciated.

0 answers:

No answers