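Our MySQL 5.7.17 AWS RDS instance keeps crashing with OOM. We did some research and concluded that no particular SELECT query is causing it.
We have a use case that requires us to log changes to all tables (about 200) into a single AuditLog table. To do this, we run a script that scans the information schema and generates a trigger for each table, 200 in total. A sample trigger looks like this: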
DROP TRIGGER IF EXISTS data.sor_insert_AlignedObjective;
delimiter //
CREATE TRIGGER data.sor_insert_AlignedObjective AFTER INSERT ON data.AlignedObjective FOR EACH ROW
BEGIN
    DECLARE primaryKeyVal, userVal VARCHAR(50);
    DECLARE fieldVal, columnValue TEXT;
    DECLARE timeNow DATETIME;
    DECLARE clientID, contextCountryCode VARCHAR(36);
    DECLARE valueIndicator Boolean;
    SET primaryKeyVal = new.AlignedObjectiveID;
    SET userVal = user();
    SET timeNow = NOW();
    SET clientID = new.ClientID;
    SET contextCountryCode = new.Context_Country_Code;
    SET fieldVal = CONCAT('"',new.PerformanceObjectiveID,'"');
    IF fieldVal IS NOT NULL THEN
        SET columnValue = JSON_OBJECT("newValue",new.PerformanceObjectiveID);
        INSERT INTO log.EventLog (EventID, SchemaName, TableName, ColumnName, ActionType, PrimaryKey, ColumnValue, CreationDateTime, Context_Country_Code, ClientID, UserName)
        VALUES (@eventID, 'data', 'AlignedObjective', 'PerformanceObjectiveID', 'INSERT', primaryKeyVal, columnValue, timeNow, contextCountryCode, clientID, userVal);
    END IF;
    SET fieldVal = CONCAT('"',new.RelatedPerformanceObjectiveID,'"');
    IF fieldVal IS NOT NULL THEN
        SET columnValue = JSON_OBJECT("newValue",new.RelatedPerformanceObjectiveID);
        INSERT INTO log.EventLog (EventID, SchemaName, TableName, ColumnName, ActionType, PrimaryKey, ColumnValue, CreationDateTime, Context_Country_Code, ClientID, UserName)
        VALUES (@eventID, 'data', 'AlignedObjective', 'RelatedPerformanceObjectiveID', 'INSERT', primaryKeyVal, columnValue, timeNow, contextCountryCode, clientID, userVal);
    END IF;
    SET fieldVal = CONCAT('"',new._DeletedDateTime,'"');
    IF fieldVal IS NOT NULL THEN
        SET columnValue = JSON_OBJECT("newValue",new._DeletedDateTime);
        INSERT INTO log.EventLog (EventID, SchemaName, TableName, ColumnName, ActionType, PrimaryKey, ColumnValue, CreationDateTime, Context_Country_Code, ClientID, UserName)
        VALUES (@eventID, 'data', 'AlignedObjective', '_DeletedDateTime', 'INSERT', primaryKeyVal, columnValue, timeNow, contextCountryCode, clientID, userVal);
    END IF;
END; //
delimiter ;
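For context, the generation step (scan the information schema, emit one logging block per column) could look roughly like the query below. This is only a sketch, not our actual script. It assumes that primary key columns and the two context columns (ClientID, Context_Country_Code) are skipped because the trigger reads them directly, which matches the sample trigger above but is not necessarily true for every table.

```sql
-- Sketch: list the first statement of each per-column logging block
-- that a generator would emit for base tables in the `data` schema.
-- Assumption: primary key and context columns are not logged individually.
SELECT c.TABLE_NAME,
       c.COLUMN_NAME,
       CONCAT('SET fieldVal = CONCAT(''"'',new.', c.COLUMN_NAME, ',''"'');') AS set_stmt
FROM information_schema.COLUMNS AS c
JOIN information_schema.TABLES AS t
  ON t.TABLE_SCHEMA = c.TABLE_SCHEMA
 AND t.TABLE_NAME = c.TABLE_NAME
WHERE c.TABLE_SCHEMA = 'data'
  AND t.TABLE_TYPE = 'BASE TABLE'          -- ignore views
  AND c.COLUMN_KEY <> 'PRI'                -- primary key is read into primaryKeyVal
  AND c.COLUMN_NAME NOT IN ('ClientID', 'Context_Country_Code')
ORDER BY c.TABLE_NAME, c.ORDINAL_POSITION;
```

Each returned row corresponds to one IF ... END IF block in the generated trigger, so a table with many columns produces a correspondingly large trigger body.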
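Watching the CloudWatch metric for freeable memory, we see that the amount of freeable memory drops sharply when we create the triggers. After that, some query pushes it over the edge (or it gets there on its own) and the daemon crashes with OOM.
Even more puzzling, after a restart the freeable memory quickly drops again, as if MySQL were trying to load something into memory.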
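A few checks that might help narrow down where the memory goes are sketched below. They assume performance_schema is enabled on the instance and that the memory instruments are switched on (many are off by default in 5.7), so the last query may return little until that is done. One hypothesis these checks could support or rule out is that parsed trigger bodies are held in memory per open table instance, but that is an assumption, not something we have confirmed.

```sql
-- How many triggers exist per schema.
SELECT TRIGGER_SCHEMA, COUNT(*) AS trigger_count
FROM information_schema.TRIGGERS
GROUP BY TRIGGER_SCHEMA;

-- Cache limits and the current number of open table instances.
SHOW GLOBAL VARIABLES LIKE 'table_open_cache';
SHOW GLOBAL VARIABLES LIKE 'table_definition_cache';
SHOW GLOBAL STATUS LIKE 'Open_tables';

-- Largest current allocations by instrument.
-- Only allocations made while an instrument is enabled are counted.
SELECT EVENT_NAME, CURRENT_NUMBER_OF_BYTES_USED
FROM performance_schema.memory_summary_global_by_event_name
ORDER BY CURRENT_NUMBER_OF_BYTES_USED DESC
LIMIT 10;
```

Comparing the output before and after the triggers are created, and again shortly after a restart, would show whether the drop in freeable memory tracks the number of open tables.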
The only configuration parameters we changed from the AWS RDS for MySQL defaults are the following:
So we are somewhat at a loss as to the possible cause. Any ideas?