我在我的应用程序中对插入/更新/删除例程的性能进行了基准测试,我将其从SQL Server移植到MariaDB。
基准测试会触发50,000次插入,相同的更新和删除。
SQL Server通过net.sourceforge.jtds JDBC驱动程序在1秒内处理它们。
使用MariaDB-java-client驱动程序的MariaDB可以更快地执行插入操作,但更新(和删除)在3.5秒时会慢得多。两个数据库中的模式是相同的,我假设因为MariaDB中的插入很快,这可能会排除索引问题或服务器配置错误。
我已经为JDBC连接字符串尝试了多种变体,以最快的速度结束:
?verifyServerCertificate=true\
&useSSL=true\
&requireSSL=true\
&allowMultiQueries=true\
&cachePrepStmts=true\
&cacheResultSetMetadata=true\
&cacheServerConfiguration=true\
&elideSetAutoCommits=true\
&maintainTimeStats=false\
&prepStmtCacheSize=50000\
&prepStmtCacheSqlLimit=204800\
&rewriteBatchedStatements=false\
&useBatchMultiSend=true\
&useBatchMultiSendNumber=50000\
&useBulkStmts=true\
&useLocalSessionState=true\
&useLocalTransactionState=true\
&useServerPrepStmts=true
在所有情况下,mysql和mysql-connectorj的性能都比mariadb差。
我现在已经看了一个星期了,并且正在考虑使用我之前的问题How do I increase the speed of a large series of UPDATEs in mySQL vs SQL Server?
中建议的解决方法以防万一可能是服务器配置错误,以下是我为关键变量所做的事情:
key_buffer_size 16MB
innodb_buffer_pool_size 24GB (mem 30GB)
innodb_log_file_size 134MB
innodb_log_buffer_size 8MB
innodb_flush_log_at_trx_commit 0
max_allowed_packet 16MB
我的50,000次写入只是少量数据 - 大约2MB。但是使用SQL语法,当它超过JDBC连接时,这大概是10倍 - 这是正确的吗?
这里是SQL和解释计划:
Describe `data`
+---------------+------------------+------+-----+---------------------+-------------------------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+------------------+------+-----+---------------------+-------------------------------+
| parentId | int(10) unsigned | NO | PRI | NULL | |
| modifiedDate | date | NO | PRI | NULL | |
| valueDate | date | NO | PRI | NULL | |
| value | float | NO | | NULL | |
| versionstamp | int(10) unsigned | NO | | 1 | |
| createdDate | datetime | YES | | current_timestamp() | |
| last_modified | datetime | YES | | NULL | on update current_timestamp() |
+---------------+------------------+------+-----+---------------------+-------------------------------+
INSERT INTO `data` (`value`, `parentId`, `modifiedDate`, `valueDate`) VALUES (4853.16314229298,52054,'20-Apr-18','28-Dec-18')
+------+-------------+-------+------+---------------+------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+------+---------------+------+---------+------+------+-------+
| 1 | INSERT | data | ALL | NULL | NULL | NULL | NULL | NULL | NULL |
+------+-------------+-------+------+---------------+------+---------+------+------+-------+
UPDATE `data` SET `value` = 4853.16314229298 WHERE `parentId` = 52054 AND `modifiedDate` = '20-Apr-18' AND `valueDate` = '28-Dec-18'
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | data | range | PRIMARY | PRIMARY | 10 | NULL | 1 | Using where |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
DELETE FROM `data` WHERE `parentId` = 52054 AND `modifiedDate` = '20-Apr-18' AND `valueDate` = '29-Jan-16'
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | data | range | PRIMARY | PRIMARY | 10 | NULL | 1 | Using where |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
[UPDATE]
JDBC用法 - 这是一个简化的版本,所以请原谅任何严重的错误:
final Connection connection = dataSource.getConnection();
connection.setAutoCommit(false);
try (PreparedStatement statement = connection.prepareStatement(
"UPDATE data SET value = ? " +
"WHERE parentId = ? " +
"AND modifiedDate = ? " +
"AND valueDate = ? ")) {
// timeSeries is a list of 50,000 data points
Arrays.stream(timeSeries)
.forEach(ts -> {
try {
statement.setDouble(1, value);
statement.setLong(2, parentId);
statement.setDate(3, new java.sql.Date(
modifiedDate.getTime()));
statement.setDate(4, new java.sql.Date(
valueDate.getTime()));
statement.addBatch();
} catch (SQLException e) {
throw new RuntimeException(
"Bad batch statement handling", e);
}
});
int[] results = statement.executeBatch();
connection.commit();
} catch (SQLException e) {
connection.rollback();
throw e;
} finally {
connection.close();
}
我也有一些来自general_log的数据显示了传入的JDBC调用,它看起来很基本 - 一个'准备'调用设置语句,然后单独更新。
这让我感到惊讶 - 似乎没有批处理:
13/06/2018 15:09 service_user_t[service_user_t] @ [9.177.2.31] 75954 298206495 Query set autocommit=0
13/06/2018 15:09 service_user_t[service_user_t] @ [9.177.2.31] 75954 298206495 Prepare UPDATE `data` SET `value` = ? WHERE `parentId` = ? AND `modifiedDate` = ? AND `valueDate` = ?
13/06/2018 15:09 service_user_t[service_user_t] @ [9.177.2.31] 75954 298206495 Execute UPDATE `data` SET `value` = ? WHERE `parentId` = ? AND `modifiedDate` = ? AND `valueDate` = ?
13/06/2018 15:09 service_user_t[service_user_t] @ [9.177.2.31] 75954 298206495 Execute UPDATE `data` SET `value` = ? WHERE `parentId` = ? AND `modifiedDate` = ? AND `valueDate` = ?
13/06/2018 15:09 service_user_t[service_user_t] @ [9.177.2.31] 75954 298206495 Execute UPDATE `data` SET `value` = ? WHERE `parentId` = ? AND `modifiedDate` = ? AND `valueDate` = ?
13/06/2018 15:09 service_user_t[service_user_t] @ [9.177.2.31] 75954 298206495 Execute UPDATE `data` SET `value` = ? WHERE `parentId` = ? AND `modifiedDate` = ? AND `valueDate` = ?
13/06/2018 15:09 service_user_t[service_user_t] @ [9.177.2.31] 75954 298206495 Execute UPDATE `data` SET `value` = ? WHERE `parentId` = ? AND `modifiedDate` = ? AND `valueDate` = ?
etc
etc
答案 0 :(得分:0)
在批处理中的某些行之间添加“begin”和“commit”语句。 或者在批处理之前启动事务,然后提交。 这将比成千上万的个人陈述快得多。
如果你只做插入,rewriteBatchStatements = true应该大大加快它,没有事务。此外,您还可以将max_packet_size增加到1GB,这样可以进行更多批处理,也许您的整个批处理将转换为1个非常大的多插入。