Even with setFetchSize

Time: 2018-08-12 23:30:54

Tags: java jdbc memory-management ojdbc

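I am trying to load about one million rows of data into a result set, write the result set to a CSV string, and then turn that string into a byte array input stream. I use a COPY statement to bulk-load the byte stream into a Greenplum (Postgres) database. While doing this, the code runs into an out-of-memory error.

Here is the code: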

import java.sql.*; 
import au.com.bytecode.opencsv.CSVWriter;
import java.io.*;
import org.postgresql.copy.CopyManager;
import org.postgresql.core.BaseConnection;
import java.util.Date;
import java.util.Calendar;

public class WIDEtoGP { 
        public static void main(String[] args) throws SQLException {
            try {
                /*String dbURL = "jdbc:oracle:thin:@(DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = ortp14-scan)(PORT = 1521))) (CONNECT_DATA = (SERVICE_NAME = WIORDP) (SRVR = DEDICATED)))";*/
                String dbURL = "jdbc:oracle:thin:@(DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = XXXXX)(PORT = 1521))) (CONNECT_DATA = (SERVICE_NAME = XXXXX) (SRVR = DEDICATED)))";
                String strUserID = "XXXX";
                String strPassword = "XXXXX";             
                Connection myConnection=DriverManager.getConnection(dbURL,strUserID,strPassword);

                String readRecordSQL = "select id,name from account";    
                myConnection.setAutoCommit(false);
                PreparedStatement sqlStatement=myConnection.prepareStatement(readRecordSQL,ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
                System.out.print("Loading in to result set from Oracle: "+ Calendar.getInstance().getTime() + "\r\n");
                sqlStatement.setFetchSize(50000);
                ResultSet rs = sqlStatement.executeQuery(); 


                String dbURL1 = "jdbc:postgresql://xxxxx:5432/xxxx";
                String user = "xxxxxx";
                String pass = "xxxxxx";
                Connection conn2 = DriverManager.getConnection(dbURL1, user, pass);
                Statement GPsqlStatement = conn2.createStatement();
                String readGPRecordSQL = "truncate java_test";
                GPsqlStatement.execute(readGPRecordSQL);

                rs.beforeFirst(); 

                while (rs.next()) {
                    StringWriter stringWriter = new StringWriter();
                    CSVWriter csvWriter = new CSVWriter(stringWriter);
                    System.out.print("Start writing to CSV: "+ Calendar.getInstance().getTime() + "\r\n");
                    csvWriter.writeAll(rs, true);
                    System.out.print("Start writing to byte array input stream "+ Calendar.getInstance().getTime() + "\r\n");

                    String orresult = stringWriter.toString();
                    byte[] bytes = orresult.getBytes("UTF8");
                    ByteArrayInputStream orinput = new ByteArrayInputStream(bytes);

                    System.out.print("End time writing to byte array "+ Calendar.getInstance().getTime() + "\r\n");

                    System.out.print("Load to Greenplum starts "+ Calendar.getInstance().getTime() + "\r\n");
                    CopyManager copyManager = new CopyManager((BaseConnection) conn2);
                    copyManager.copyIn("copy java_test from stdin csv header",orinput);
                    System.out.print("Load to Greenplum ends "+ Calendar.getInstance().getTime() + "\r\n");

                    csvWriter.close();
                }

                int size = 0;
                rs.last();
                size = rs.getRow();
                System.out.println("Row count of the load: " + size + "\r\n");

                rs.close();
                myConnection.close();
                conn2.close();
                System.out.print("Run time memory end: "+ Runtime.getRuntime().freeMemory()+ "\r\n");


            } catch (Exception e) {
                System.out.print("Error out time: "+ Runtime.getRuntime().freeMemory()+ "\r\n");
                System.out.println(e);
            }       
        }
    }

This is the exact error: java.lang.OutOfMemoryError: Java heap space

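Any idea why this happens? If the code iterates over 50,000 rows at a time and closes every object it uses, I am fairly sure it should not run out of memory, because the query can load up to 500,000 rows in a single pass.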
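For reference, the "50,000 rows at a time" idea can be sketched as below. Note that `csvWriter.writeAll(rs, true)` drains the whole result set in one call, so the `while` loop in the code above does not actually produce chunks. `ChunkedCsv`, `toCsvLine` and `writeChunks` are hypothetical names, not part of opencsv or the PostgreSQL driver, and the row source is a plain `Iterator` so the sketch runs without a database.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: emits CSV in fixed-size chunks instead of one big string.
final class ChunkedCsv {

    // Quote each field and double embedded quotes. A null field is written as an
    // unquoted empty field, which COPY ... CSV reads as NULL by default.
    static String toCsvLine(String[] fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) sb.append(',');
            String f = fields[i];
            if (f != null) {
                sb.append('"').append(f.replace("\"", "\"\"")).append('"');
            }
        }
        return sb.append('\n').toString();
    }

    // Drain the rows, handing the sink one UTF-8 CSV chunk per chunkSize rows.
    // Returns the total row count, so no rs.last()/getRow() call is needed.
    static long writeChunks(Iterator<String[]> rows, int chunkSize, Consumer<byte[]> sink) {
        if (chunkSize <= 0) throw new IllegalArgumentException("chunkSize must be positive");
        StringBuilder buf = new StringBuilder();
        int inChunk = 0;
        long total = 0;
        while (rows.hasNext()) {
            buf.append(toCsvLine(rows.next()));
            inChunk++;
            total++;
            if (inChunk == chunkSize) {
                sink.accept(buf.toString().getBytes(StandardCharsets.UTF_8));
                buf.setLength(0); // reuse the buffer for the next chunk
                inChunk = 0;
            }
        }
        if (inChunk > 0) { // flush the final partial chunk
            sink.accept(buf.toString().getBytes(StandardCharsets.UTF_8));
        }
        return total;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"1", "alpha"},
                new String[] {"2", "beta"},
                new String[] {"3", "gamma"});
        List<byte[]> chunks = new ArrayList<>();
        long total = writeChunks(rows.iterator(), 2, chunks::add);
        System.out.println(total + " rows in " + chunks.size() + " chunks");
    }
}
```

In the code above, the iterator would presumably wrap `rs.next()` and `rs.getString(...)`, and the sink would wrap each chunk in a `ByteArrayInputStream` and pass it to `copyManager.copyIn("copy java_test from stdin csv", ...)` (no `header`, since the chunks carry none), with the checked exceptions handled inside the sink. This is likely to bound memory only if the statement is created with `ResultSet.TYPE_FORWARD_ONLY`. As far as I know, the Oracle driver keeps scrollable result sets in a client-side cache, which would defeat the fetch size.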

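I do not want to go the route of chunking through the SQL query (instead of the fetch size), because the table is updated every hour, and if for some reason this process runs longer than an hour, it would end up corrupting the data.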

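I should also mention that it handles up to 500,000 rows without any while block, moving the data directly. I had to introduce the while block to make sure it loops while loading the data. It fails right after the "Start writing to CSV" message, raising the memory error even before anything is loaded into the CSV object.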
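A back-of-the-envelope estimate may help show why the single-pass approach stops scaling somewhere past 500,000 rows. The whole CSV exists three times at once: in the `StringWriter` buffer, in the `String` returned by `toString()`, and in the `byte[]` from `getBytes("UTF8")`. The sketch below is only a rough lower bound. `CsvHeapEstimate` is a hypothetical name, the 100 characters per row is an invented figure, and 2 bytes per character assumes a JVM without compact strings (for example Java 8). It ignores buffer growth slack and whatever rows the driver itself caches.

```java
// Hypothetical estimator, not a measurement of the code above.
final class CsvHeapEstimate {

    // Approximate bytes held simultaneously when the full CSV is built in memory.
    static long peakBytes(long rows, int avgCharsPerRow, int bytesPerChar) {
        long chars = rows * avgCharsPerRow;
        long writerBuffer = chars * bytesPerChar; // StringWriter's internal buffer
        long stringCopy = chars * bytesPerChar;   // copy made by stringWriter.toString()
        long utf8Bytes = chars;                   // getBytes("UTF8"), assuming ASCII-only data
        return writerBuffer + stringCopy + utf8Bytes;
    }

    public static void main(String[] args) {
        // Assumed inputs: 1,000,000 rows, 100 characters per row, 2 bytes per char.
        long bytes = peakBytes(1_000_000L, 100, 2);
        System.out.println(bytes / (1024 * 1024) + " MB");
    }
}
```

Under these assumed numbers, doubling the row count doubles the estimate, which is consistent with 500,000 rows fitting in the heap while one million rows do not.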

Any suggestions or workarounds? Thanks in advance!

0 answers:

No answers yet