Question

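I am currently developing a Java benchmark to evaluate some use cases (insert, update, delete, etc.) against an Apache Derby database.

My implementation is as follows:

After warming up the JVM, I execute a series (a for loop of 100k to 1M iterations) of, say, INSERT statements into a single table in the database. Since this is Apache Derby, for those who know it, I test every mode (in-memory/embedded, in-memory/network, persistent/embedded, persistent/network).

The process can run single-threaded or multi-threaded (using Executors.newFixedThreadPool(poolSize)).

So, here is my problem:

When I run the benchmark with only 1 thread, I get fairly realistic results: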

In memory/embedded[Simple Integer Insert] : 35K inserts/second (1 thread)

Then I decided to run the benchmark sequentially with 1 and then 2 (concurrent) threads.

Now I get the following results:

In memory/embedded[Simple Integer Insert] : 21K inserts/second (1 thread)
In memory/embedded[Simple Integer Insert] : 20K inserts/second (2 thread)

Why does the result for 1 thread change so much?

Basically, I start and stop the timer before and after the loop:

// Processing
long start = System.nanoTime();

for (int i = 0; i < loopSize; i++) {
    process();
}
// end timer
long absTime = System.nanoTime() - start;
double absTimeMilli = absTime * 1e-6;
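
For reference, the elapsed time measured this way can be converted into the inserts/second figures quoted above. This is only a minimal sketch: `opsPerSecond` and `timeLoop` are hypothetical helper names that are not part of the original benchmark, and the timed task is a trivial counter increment standing in for `process()`.

```java
class ThroughputSketch {

    /** Converts an operation count and an elapsed time in nanoseconds into operations per second. */
    public static double opsPerSecond(long operations, long elapsedNanos) {
        if (elapsedNanos <= 0) {
            throw new IllegalArgumentException("elapsedNanos must be positive");
        }
        return operations / (elapsedNanos * 1e-9);
    }

    /** Runs the task loopSize times and returns the elapsed time in nanoseconds. */
    public static long timeLoop(int loopSize, Runnable task) {
        long start = System.nanoTime();
        for (int i = 0; i < loopSize; i++) {
            task.run();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int loopSize = 100_000;
        long[] sink = new long[1]; // placeholder work instead of a real INSERT
        long elapsed = Math.max(1, timeLoop(loopSize, () -> sink[0]++));
        System.out.printf("%.0f ops/second (%.3f ms total)%n",
                opsPerSecond(loopSize, elapsed), elapsed * 1e-6);
    }
}
```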

And the process() method:

private void process() throws SQLException {
        PreparedStatement ps = clientConn.prepareStatement(query);
        ps.setObject(1, val);
        ps.execute();
        clientConn.commit();
        ps.close();
}

Since the processing is so basic, running the same code (the data processing) again should not change the benchmark result, should it?

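As the number of threads in the sequence grows (for example 1, 2, 4, 8), the results get worse.

I'm sorry if this is confusing. If needed, I will provide more information or explain it again!

Thanks for your help :)

Edit:

Here is the method (from the Usecase class) that calls the execution shown above: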

@Override
public ArrayList<ContextBean> bench(int loopSize, int poolSize) throws InterruptedException, ExecutionException {
    Future<ContextBean> t = null;
    ArrayList<ContextBean> cbl = new ArrayList<ContextBean>();

    try {

        ExecutorService es = Executors.newFixedThreadPool(poolSize);


        for (int i = 0; i < poolSize; i++) {
            BenchExecutor be = new BenchExecutor(eds, insertStatement, loopSize, poolSize, "test-varchar");
            t = es.submit(be); 
            cbl.add(t.get());
        }

        es.shutdown();
        es.awaitTermination(Long.MAX_VALUE,TimeUnit.MILLISECONDS);

    } catch (InterruptedException e) {
        e.printStackTrace();
    } catch (SQLException e) {
        e.printStackTrace();
    }
    return cbl;
}
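
One detail in the method above may be worth ruling out: `t.get()` is called inside the same loop as `es.submit(be)`, and `Future.get()` blocks until its task has finished, so the next task is only submitted after the previous one completes. The sketch below shows the usual alternative of submitting every task first and collecting the results afterwards. It is a standalone illustration, not the original benchmark: `peakConcurrency` is a hypothetical name, and a `CountDownLatch` stands in for the insert loop so the overlap can be observed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SubmitThenCollectSketch {

    /** Runs poolSize tasks and returns the highest number of tasks seen running at once. */
    public static int peakConcurrency(int poolSize) {
        ExecutorService es = Executors.newFixedThreadPool(poolSize);
        AtomicInteger running = new AtomicInteger();
        CountDownLatch allStarted = new CountDownLatch(poolSize);

        Callable<Integer> task = () -> {
            int now = running.incrementAndGet();
            allStarted.countDown();
            // Stand-in for the benchmark loop: wait until every task has started.
            allStarted.await(5, TimeUnit.SECONDS);
            running.decrementAndGet();
            return now;
        };

        try {
            // Step 1: submit every task before waiting on any of them.
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < poolSize; i++) {
                futures.add(es.submit(task));
            }
            // Step 2: only now block on the results.
            int peak = 0;
            for (Future<Integer> f : futures) {
                peak = Math.max(peak, f.get());
            }
            return peak;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            es.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("peak concurrent tasks: " + peakConcurrency(4));
    }
}
```

With the submit-then-get-in-one-loop pattern, the same measurement would never see more than one task running at a time, whatever the pool size.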

Answer 1

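For simple operations, every database behaves the way you describe.

The reason is that all the threads you spawn try to operate on the same table (or set of tables), so the database has to serialize access.

In this case each thread works a bit slower, but the overall result is a (small) gain: 21K + 20K = 41K, versus 35K for the single-threaded version.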

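As the number of threads increases, the gain shrinks (usually exponentially) and eventually turns into a loss because of lock escalation (see https://dba.stackexchange.com/questions/12864/what-is-lock-escalation).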

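In general, a multi-threaded solution gains the most when its performance is not bound by a single resource but by several factors (e.g. computation, selects on multiple tables, inserts into different tables).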
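
The single-resource point can be illustrated with a rough analogy in plain Java. In the sketch below, a `synchronized` lock stands in for the serialized access to one table; this is an assumption made for illustration and says nothing about how Derby implements its locking. The class and method names are hypothetical. With `sharedLock` set to true, every worker contends for one lock and one counter; with false, each worker has its own.

```java
class SharedResourceSketch {

    /**
     * Starts `threads` workers that each perform `opsPerThread` increments while
     * holding a lock, and returns the total number of increments performed.
     */
    public static long run(int threads, int opsPerThread, boolean sharedLock) {
        Object commonLock = new Object();
        long[] counters = new long[threads];
        Thread[] workers = new Thread[threads];

        for (int i = 0; i < threads; i++) {
            // Shared mode: one lock and one counter for everybody (the "single table").
            // Independent mode: a private lock and counter per worker.
            final Object lock = sharedLock ? commonLock : new Object();
            final int slot = sharedLock ? 0 : i;
            workers[i] = new Thread(() -> {
                for (int n = 0; n < opsPerThread; n++) {
                    synchronized (lock) {
                        counters[slot]++;
                    }
                }
            });
        }
        for (Thread w : workers) {
            w.start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        long total = 0;
        for (long c : counters) {
            total += c;
        }
        return total;
    }

    public static void main(String[] args) {
        for (boolean shared : new boolean[] {true, false}) {
            long start = System.nanoTime();
            long total = run(4, 1_000_000, shared);
            double ms = (System.nanoTime() - start) * 1e-6;
            System.out.printf("sharedLock=%b: %d ops in %.1f ms%n", shared, total, ms);
        }
    }
}
```

Both modes do the same amount of work; any difference in elapsed time comes from contention on the shared lock, and the actual numbers will vary by machine and JVM.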

Database benchmark: strange results when testing concurrency (ExecutorService)
