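Efficient way to do batch INSERTS with JDBC

Asked: 2010-09-24 04:29:16

Tags: java sql performance jdbc

In my application I need to do a lot of INSERTs. It's a Java app and I am using plain JDBC to execute the queries. The DB is Oracle. I have enabled batching, so it saves me the network latency of executing the queries one round trip at a time. But the queries still execute serially as separate INSERTs: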

insert into some_table (col1, col2) values (val1, val2)
insert into some_table (col1, col2) values (val3, val4)
insert into some_table (col1, col2) values (val5, val6)

I wonder whether the following form of INSERT might be more efficient:

insert into some_table (col1, col2) values (val1, val2), (val3, val4), (val5, val6)

i.e. collapsing multiple INSERTs into one.

Any other tips for making batch INSERTs faster?

12 answers:

Answer 0 (score: 110)

This is a mix of the two previous answers:

  PreparedStatement ps = c.prepareStatement("INSERT INTO employees VALUES (?, ?)");

  ps.setString(1, "John");
  ps.setString(2,"Doe");
  ps.addBatch();

  ps.clearParameters();
  ps.setString(1, "Dave");
  ps.setString(2,"Smith");
  ps.addBatch();

  ps.clearParameters();
  int[] results = ps.executeBatch();
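
For larger inputs, the same pattern is often combined with a single transaction and a periodic flush so the pending batch does not grow without bound. The following is only a sketch under assumptions: the class name `ChunkedBatchInsert`, the method `insertAll`, and the choice of batch size are illustrative and not from the answer above; the `employees` table is the two-column table used in that answer.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper: class and method names are assumptions for illustration.
final class ChunkedBatchInsert {

    /**
     * Inserts all rows through one PreparedStatement, sending the pending
     * batch every batchSize rows. Returns how many times executeBatch() ran.
     */
    static int insertAll(Connection c, List<String[]> rows, int batchSize)
            throws SQLException {
        boolean oldAutoCommit = c.getAutoCommit();
        c.setAutoCommit(false); // commit once at the end instead of per statement
        int flushes = 0;
        try (PreparedStatement ps =
                 c.prepareStatement("INSERT INTO employees VALUES (?, ?)")) {
            int pending = 0;
            for (String[] row : rows) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch(); // send a full chunk to the database
                    flushes++;
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch(); // send the final, partially filled chunk
                flushes++;
            }
            c.commit();
        } catch (SQLException e) {
            c.rollback(); // discard everything inserted in this call
            throw e;
        } finally {
            c.setAutoCommit(oldAutoCommit);
        }
        return flushes;
    }
}
```

A call such as `ChunkedBatchInsert.insertAll(c, rows, 1000)` would flush every 1000 rows; the best chunk size depends on the driver and row width, so it is worth benchmarking rather than taking 1000 as a recommendation.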

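Answer 1 (score: 19)

Though the question asks about inserting efficiently into Oracle using JDBC, I'm currently playing with DB2 (on an IBM mainframe). Conceptually the inserts are similar, so I thought it might be helpful to see my metrics for

  • inserting one record at a time

  • inserting a batch of records (very efficient)

Here are the metrics:

1) Inserting one record at a time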

public void writeWithCompileQuery(int records) {
    PreparedStatement statement;

    try {
        Connection connection = getDatabaseConnection();
        connection.setAutoCommit(true);

        String compiledQuery = "INSERT INTO TESTDB.EMPLOYEE(EMPNO, EMPNM, DEPT, RANK, USERNAME)" +
                " VALUES" + "(?, ?, ?, ?, ?)";
        statement = connection.prepareStatement(compiledQuery);

        long start = System.currentTimeMillis();

        for(int index = 1; index <= records; index++) { // <= so that exactly `records` rows are inserted
            statement.setInt(1, index);
            statement.setString(2, "emp number-"+index);
            statement.setInt(3, index);
            statement.setInt(4, index);
            statement.setString(5, "username");

            long startInternal = System.currentTimeMillis();
            statement.executeUpdate();
            System.out.println("each transaction time taken = " + (System.currentTimeMillis() - startInternal) + " ms");
        }

        long end = System.currentTimeMillis();
        System.out.println("total time taken = " + (end - start) + " ms");
        System.out.println("avg total time taken = " + (end - start)/ records + " ms");

        statement.close();
        connection.close();

    } catch (SQLException ex) {
        System.err.println("SQLException information");
        while (ex != null) {
            System.err.println("Error msg: " + ex.getMessage());
            ex = ex.getNextException();
        }
    }
}

The metrics for 100 transactions:

each transaction time taken = 123 ms
each transaction time taken = 53 ms
each transaction time taken = 48 ms
each transaction time taken = 48 ms
each transaction time taken = 49 ms
each transaction time taken = 49 ms
...
..
.
each transaction time taken = 49 ms
each transaction time taken = 49 ms
total time taken = 4935 ms
avg total time taken = 49 ms

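The first transaction takes around 120-150 ms, which covers the query parse followed by the execution; subsequent transactions take only around 50 ms each. (That is still high, but my database is on a different server; I need to troubleshoot the network.)

2) Inserting in a batch (efficient), achieved with preparedStatement.executeBatch()
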
public int[] writeInABatchWithCompiledQuery(int records) {
    PreparedStatement preparedStatement;

    try {
        Connection connection = getDatabaseConnection();
        connection.setAutoCommit(true);

        String compiledQuery = "INSERT INTO TESTDB.EMPLOYEE(EMPNO, EMPNM, DEPT, RANK, USERNAME)" +
                " VALUES" + "(?, ?, ?, ?, ?)";
        preparedStatement = connection.prepareStatement(compiledQuery);

        for(int index = 1; index <= records; index++) {
            preparedStatement.setInt(1, index);
            preparedStatement.setString(2, "empo number-"+index);
            preparedStatement.setInt(3, index+100);
            preparedStatement.setInt(4, index+200);
            preparedStatement.setString(5, "usernames");
            preparedStatement.addBatch();
        }

        long start = System.currentTimeMillis();
        int[] inserted = preparedStatement.executeBatch();
        long end = System.currentTimeMillis();

        System.out.println("total time taken to insert the batch = " + (end - start) + " ms");
        System.out.println("avg time taken per record = " + (end - start)/records + " ms");

        preparedStatement.close();
        connection.close();

        return inserted;

    } catch (SQLException ex) {
        System.err.println("SQLException information");
        while (ex != null) {
            System.err.println("Error msg: " + ex.getMessage());
            ex = ex.getNextException();
        }
        throw new RuntimeException("Error");
    }
}

The metrics for a batch of 100 transactions:

total time taken to insert the batch = 127 ms

and for 1000 transactions:

total time taken to insert the batch = 341 ms

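So 100 transactions that took ~5000 ms (one trxn at a time) drop to ~150 ms (one batch of 100 records).

Note: ignore my network, which is super slow; the metric values are meaningful relative to each other.

Answer 2 (score: 6)

The Statement gives you the following option: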

Statement stmt = con.createStatement();

stmt.addBatch("INSERT INTO employees VALUES (1000, 'Joe Jones')");
stmt.addBatch("INSERT INTO departments VALUES (260, 'Shoe')");
stmt.addBatch("INSERT INTO emp_dept VALUES (1000, 260)");

// submit a batch of update commands for execution
int[] updateCounts = stmt.executeBatch();

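Answer 3 (score: 4)

You'll have to benchmark, obviously, but issuing multiple inserts over JDBC will be much faster if you use a PreparedStatement rather than a Statement.

Answer 4 (score: 1)

You can use the rewriteBatchedStatements parameter (a MySQL Connector/J connection property) to make batch inserts even faster.

You can read about the parameter here: MySQL and JDBC with rewriteBatchedStatements=true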
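
As a rough illustration of where that parameter goes, the sketch below appends it to a JDBC URL before opening the connection. It applies to MySQL's Connector/J driver only, not to the Oracle driver from the question. The class name `MySqlBatchUrl`, the helper names, and the host and database in the usage note are assumptions made up for the example.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper: names are assumptions for illustration.
final class MySqlBatchUrl {

    /** Appends rewriteBatchedStatements=true to a MySQL JDBC URL. */
    static String withRewrite(String baseUrl) {
        // Use '&' if the URL already carries a query string, '?' otherwise.
        String separator = baseUrl.contains("?") ? "&" : "?";
        return baseUrl + separator + "rewriteBatchedStatements=true";
    }

    /** Opens a connection with the property enabled (needs the MySQL driver on the classpath). */
    static Connection open(String baseUrl, String user, String password)
            throws SQLException {
        return DriverManager.getConnection(withRewrite(baseUrl), user, password);
    }
}
```

For example, `MySqlBatchUrl.withRewrite("jdbc:mysql://localhost:3306/test")` yields a URL ending in `?rewriteBatchedStatements=true`. The application code still uses addBatch and executeBatch as usual; the rewriting happens inside the driver.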

Answer 5 (score: 0)

How about using the INSERT ALL statement?

INSERT ALL

INTO table_name VALUES ()

INTO table_name VALUES ()

...

SELECT Statement;

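I remember that the final SELECT statement is mandatory for this request to succeed. I don't remember why, though. You might also consider using a PreparedStatement instead. Lots of advantages!

Farid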
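
On the mandatory SELECT: as far as I understand Oracle's syntax, INSERT ALL is a multitable insert that takes its rows from a subquery, so a query that returns exactly one row, such as SELECT 1 FROM DUAL, makes each INTO clause run once. The sketch below builds such a statement with bind placeholders for use with a PreparedStatement. The class name `InsertAllBuilder` and the method `build` are made up for illustration, and the table and column names must come from trusted code, never from user input, since they are concatenated into the SQL.

```java
import java.util.Collections;

// Hypothetical helper: names are assumptions for illustration.
final class InsertAllBuilder {

    /** Builds an Oracle INSERT ALL statement with rowCount placeholder rows. */
    static String build(String table, String[] columns, int rowCount) {
        if (rowCount < 1) {
            throw new IllegalArgumentException("rowCount must be at least 1");
        }
        String columnList = String.join(", ", columns);
        // One "?" per column, e.g. "?, ?" for two columns.
        String placeholders =
            String.join(", ", Collections.nCopies(columns.length, "?"));
        StringBuilder sql = new StringBuilder("INSERT ALL");
        for (int i = 0; i < rowCount; i++) {
            sql.append(" INTO ").append(table)
               .append(" (").append(columnList).append(")")
               .append(" VALUES (").append(placeholders).append(")");
        }
        // Single-row subquery so that every INTO clause executes once.
        sql.append(" SELECT 1 FROM DUAL");
        return sql.toString();
    }
}
```

The resulting string can be passed to `prepareStatement`, with the parameters then bound row by row in the same order as the INTO clauses. Whether this beats plain JDBC batching on a given setup is something to measure rather than assume.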

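Answer 6 (score: 0)

You can use addBatch and executeBatch for batch inserts in Java. See the example: Batch Insert In Java

Answer 7 (score: 0)

In my code I have no direct access to the 'preparedStatement', so I cannot use batching; I just pass it the query and a list of parameters. The trick, however, is to create a variable-length insert statement and a LinkedList of parameters. The effect is the same as the top example, with a variable number of input parameters. See below (error checking omitted). Assume 'myTable' has 3 updatable fields: f1, f2 and f3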

String[] args = {"A","B","C", "X","Y","Z"}; // etc, input list of triplets
final String QUERY = "INSERT INTO [myTable] (f1,f2,f3) values ";
LinkedList<String> params = new LinkedList<>();
String comma = "";
StringBuilder q = new StringBuilder(QUERY); // a String cannot be assigned to a StringBuilder directly
for (int nl = 0; nl < args.length; nl += 3) { // args is a list of triplet values
    params.add(args[nl]);
    params.add(args[nl+1]);
    params.add(args[nl+2]);
    q.append(comma).append("(?,?,?)"); // one placeholder group per row
    comma = ",";
}
int nr = insertIntoDB(q.toString(), params);

In my DBInterface class I have:

int insertIntoDB(String query, LinkedList<String> params) throws SQLException {
    preparedUPDStmt = connectionSQL.prepareStatement(query);
    int n=1;
    for(String x:params) {
        preparedUPDStmt.setString(n++, x);
    }
    int updates=preparedUPDStmt.executeUpdate();
    return updates;
}

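Answer 8 (score: 0)

SQLite: The above answers are all correct. For SQLite it's a little different. Nothing really helps; even putting the inserts in a batch (sometimes) doesn't improve performance. In that case, try disabling auto-commit and committing manually once you are done (Warning! When multiple connections write at the same time, these operations can conflict with each other).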

// connect(), yourList and compiledQuery you have to implement/define beforehand
try (Connection conn = connect()) {
     conn.setAutoCommit(false);
     PreparedStatement pstmt = conn.prepareStatement(compiledQuery);
     for(Object o : yourList){
        pstmt.setString(1, o.toString()); // parameter index 1 assumes a single placeholder in compiledQuery
        pstmt.executeUpdate();
        pstmt.getGeneratedKeys(); //if you need the generated keys
     }
     pstmt.close();
     conn.commit();

}

Answer 9 (score: 0)

If you use jdbcTemplate, then:

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.core.BatchPreparedStatementSetter;

    public int[] batchInsert(List<Book> books) {

        return this.jdbcTemplate.batchUpdate(
            "insert into books (name, price) values(?,?)",
            new BatchPreparedStatementSetter() {

                public void setValues(PreparedStatement ps, int i) throws SQLException {
                    ps.setString(1, books.get(i).getName());
                    ps.setBigDecimal(2, books.get(i).getPrice());
                }

                public int getBatchSize() {
                    return books.size();
                }

            });
    }

Or with a more advanced configuration:

  import org.springframework.jdbc.core.JdbcTemplate;
  import org.springframework.jdbc.core.ParameterizedPreparedStatementSetter;

    public int[][] batchInsert(List<Book> books, int batchSize) {

        int[][] updateCounts = jdbcTemplate.batchUpdate(
                "insert into books (name, price) values(?,?)",
                books,
                batchSize,
                new ParameterizedPreparedStatementSetter<Book>() {
                    public void setValues(PreparedStatement ps, Book argument) 
                        throws SQLException {
                        ps.setString(1, argument.getName());
                        ps.setBigDecimal(2, argument.getPrice());
                    }
                });
        return updateCounts;

    }

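Link to source

Answer 10 (score: -3)

If the number of iterations is low, using PreparedStatements will be much slower than using Statements. To gain a performance benefit from a PreparedStatement over a Statement, you need to use it in a loop with at least 50 iterations or so.

Answer 11 (score: -9)

Using Statement

Batch insert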
int a= 100;
            try {
                        for (int i = 0; i < 10; i++) {
                            String insert = "insert into usermaster"
                                    + "("
                                    + "userid"
                                    + ")"
                                    + "values("
                                    + "'" + a + "'"
                                    + ");";
                            statement.addBatch(insert);
                            System.out.println(insert);
                            a++;
                        }
                      statement.executeBatch(); // without this call the queued inserts are never sent
                      dbConnection.commit();
                    } catch (SQLException e) {
                        System.out.println(" Insert Failed");
                        System.out.println(e.getMessage());
                    } finally {

                        if (statement != null) {
                            statement.close();
                        }
                        if (dbConnection != null) {
                            dbConnection.close();
                        }
                    }