Question

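I am running into performance problems when querying CLOB and LONG columns in large Oracle database tables.

So far I have written the following unit tests with cx_Oracle (Python) and JDBC (Java):

Python code using cx_Oracle: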

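import time
from unittest import TestCase

import cx_Oracle

# CONNECT_STRING is assumed to be defined elsewhere in the test module

class CXOraclePerformanceTest(TestCase):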
    def test_cx_oracle_performance_with_clob(self):
        self.execute_cx_oracle_performance("CREATE TABLE my_table (my_text CLOB)")

    def test_cx_oracle_performance_with_long(self):
        self.execute_cx_oracle_performance("CREATE TABLE my_table (my_text LONG)")

    def execute_cx_oracle_performance(self, create_table_statement):
        # prepare test data
        current_milli_time = lambda: int(round(time.time() * 1000))
        db = cx_Oracle.connect(CONNECT_STRING)

        db.cursor().execute(create_table_statement)
        db.cursor().execute("INSERT INTO my_table (my_text) VALUES ('abc')")

        for i in range(13):
            db.cursor().execute("INSERT INTO my_table (my_text) SELECT 'abc' FROM my_table")

        row_count = db.cursor().execute("SELECT count(*) FROM my_table").fetchall()[0][0]
        self.assertEqual(8192, row_count)

        # execute query with big result set
        timer = current_milli_time()

        rows = db.cursor().execute("SELECT * FROM my_table")
        for row in rows:
            self.assertEqual("abc", str(row[0]))

        timer = current_milli_time() - timer
        print("{} -> duration: {} ms".format(create_table_statement, timer))

        # clean-up
        db.cursor().execute("DROP TABLE my_table")
        db.close()

Java code using ojdbc7.jar:

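import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Date;

import oracle.jdbc.OracleConnection;
import org.junit.Assert;
import org.junit.Test;

// connectionString is assumed to be defined elsewhere in the test class
public class OJDBCPerformanceTest {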

    @Test public void testOJDBCPerformanceWithCLob() throws Exception {
        testOJDBCPerformance("CREATE TABLE my_table (my_text CLOB)");
    }

    @Test public void testOJDBCPerformanceWithLong() throws Exception {
        testOJDBCPerformance("CREATE TABLE my_table (my_text LONG)");
    }

    private void testOJDBCPerformance(String createTableStmt) throws Exception {
        // prepare connection
        OracleConnection connection = (OracleConnection) DriverManager.getConnection(connectionString);
        connection.setAutoCommit(false);
        connection.setDefaultRowPrefetch(512);

        // prepare test data
        Statement stmt = connection.createStatement();
        stmt.execute(createTableStmt);
        stmt.execute("INSERT INTO my_table (my_text) VALUES ('abc')");

        for (int i = 0; i < 13; i++)
            stmt.execute("INSERT INTO my_table (my_text) SELECT 'abc' FROM my_table");

        ResultSet resultSet = stmt.executeQuery("SELECT count(*) FROM my_table");
        resultSet.next();
        Assert.assertEquals(8192, resultSet.getInt(1));

        // execute query with big result set
        long timer = new Date().getTime();

        stmt = connection.createStatement();
        resultSet = stmt.executeQuery("SELECT * FROM my_table");
        while (resultSet.next())
            Assert.assertEquals("abc", resultSet.getString(1));

        timer = new Date().getTime() - timer;
        System.out.println(String.format("%s -> duration: %d ms", createTableStmt, timer));

        // clean-up
        stmt = connection.createStatement();
        stmt.execute("DROP TABLE my_table");
    }

}

Python test output:

CREATE TABLE my_table (my_text CLOB) -> duration: 31186 ms
CREATE TABLE my_table (my_text LONG) -> duration: 218 ms

Java test output:

CREATE TABLE my_table (my_text CLOB) -> duration: 359 ms
CREATE TABLE my_table (my_text LONG) -> duration: 14174 ms

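Why is the difference between the two durations so large?
What can I do to improve the performance of one or both programs?
Are there any Oracle-specific options or parameters that can be used to improve query performance?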

Answer 1

To get the same performance as with LONG, you need to tell cx_Oracle to fetch the CLOBs that way. You can take a look at this sample: https://github.com/oracle/python-cx_Oracle/blob/master/samples/ReturnLongs.py

In your code, I added this method:

def output_type_handler(self, cursor, name, defaultType, size, precision, scale):
    if defaultType == cx_Oracle.CLOB:
        return cursor.var(cx_Oracle.LONG_STRING, arraysize = cursor.arraysize)

Then, after creating the connection to the database, I added this line:

db.outputtypehandler = self.output_type_handler

With these changes, the performance of the two tests is virtually identical.

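Note that behind the scenes cx_Oracle is using dynamic fetch and allocate. This method works well for small CLOBs (where small generally means a few megabytes or less). In this case the database can send the data directly, whereas when LOBs are used, only a locator is returned to the client, and another round trip to the database is then required to fetch the data. As you can imagine, this slows the operation down considerably, particularly if the database and the client are separated by a network!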
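To make the round-trip argument concrete, here is a rough back-of-envelope model in Python. It is only a sketch: the batch size of 100 and the figure of exactly one extra round trip per LOB are assumptions for illustration, not measured cx_Oracle behavior, and `fetch_round_trips` is a hypothetical helper. The row count and timings are the ones reported in the question.

```python
import math


def fetch_round_trips(row_count, arraysize, per_row_lob_reads=0):
    """Rough number of network round trips needed to fetch row_count rows.

    Rows arrive in batches of `arraysize`. `per_row_lob_reads` models the
    extra round trips spent reading each LOB through its locator
    (an assumed value, not a measured one).
    """
    batches = math.ceil(row_count / arraysize)
    return batches + row_count * per_row_lob_reads


ROWS = 8192       # row count of the question's test table
ARRAYSIZE = 100   # assumed batch size; check cursor.arraysize for the real value

as_locator = fetch_round_trips(ROWS, ARRAYSIZE, per_row_lob_reads=1)
as_long_string = fetch_round_trips(ROWS, ARRAYSIZE)
print("round trips with LOB locators:", as_locator)
print("round trips with LONG_STRING: ", as_long_string)

# Per-row cost implied by the Python timings reported in the question
clob_ms_per_row = 31186 / ROWS
long_ms_per_row = 218 / ROWS
print("CLOB: {:.2f} ms per row".format(clob_ms_per_row))
print("LONG: {:.3f} ms per row".format(long_ms_per_row))
```

Under these assumptions the locator path needs roughly a hundred times more round trips. The question's CLOB timing works out to a few milliseconds per row, which is the order of magnitude one would expect if each row costs at least one network round trip.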

Answer 2

After some research, I can partially answer my own question.

I managed to improve the OJDBC performance. The OJDBC API provides the property useFetchSizeWithLongColumn, which makes querying LONG columns much faster.

New query duration: CREATE TABLE my_table (my_text LONG) -> duration: 134 ms

Oracle documentation:

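This is a thin-only property. It should not be used with any other drivers. If set to "true", the performance when retrieving data in a 'SELECT' will be improved, but the default behavior for handling LONG columns will be changed to fetch multiple rows (prefetch size). It means that enough memory will be allocated to read this data. So if you want to use this property, make sure that the LONG columns you are retrieving are not too big, or you may run out of memory. This property can also be set as a Java property: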

java -Doracle.jdbc.useFetchSizeWithLongColumn=true myApplication

Or via the API:

Properties props = new Properties();
props.setProperty("useFetchSizeWithLongColumn", "true");
OracleConnection connection = (OracleConnection) DriverManager.getConnection(connectionString, props);
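The memory warning in the Oracle documentation quoted above can be made concrete with a rough estimate. The sketch below assumes that each prefetched row may reserve space for the largest LONG value expected. The quoted text does not say how the driver actually sizes its buffers, so treat this as a back-of-envelope upper bound. `prefetch_buffer_bytes` and the example sizes are hypothetical; the prefetch value of 512 comes from `setDefaultRowPrefetch(512)` in the question.

```python
def prefetch_buffer_bytes(prefetch_rows, max_value_bytes):
    """Upper bound on buffer space if every prefetched row reserves
    max_value_bytes (a simplifying assumption, not documented driver behavior)."""
    return prefetch_rows * max_value_bytes


PREFETCH_ROWS = 512  # matches setDefaultRowPrefetch(512) in the question

examples = [
    ("3 bytes ('abc')", 3),
    ("64 KiB", 64 * 1024),
    ("4 MiB", 4 * 1024 * 1024),
]

for label, size in examples:
    mib = prefetch_buffer_bytes(PREFETCH_ROWS, size) / (1024 * 1024)
    print("{:>16}: {:,.3f} MiB per fetch".format(label, mib))
```

With three-byte values such as the test data, the estimate is negligible, but with multi-megabyte LONG values and the same prefetch size it grows into the gigabyte range, which is the situation the documentation warns about.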

I still have no solution for cx_Oracle. That is why I opened an issue on GitHub:

https://github.com/oracle/python-cx_Oracle/issues/63

How to improve query performance for CLOB and LONG values in Oracle-DB (cx_Oracle vs OJDBC)?
