I am trying to load about a million rows of data into a ResultSet, write the ResultSet out as a CSV string, and then into a byte input stream. I then use a COPY statement to bulk-load the byte stream into a Greenplum Postgres database. While doing this, the code runs into an out-of-memory error.

Here is the code:
    import java.sql.*;
    import java.io.*;
    import java.util.Calendar;

    import au.com.bytecode.opencsv.CSVWriter;
    import org.postgresql.copy.CopyManager;
    import org.postgresql.core.BaseConnection;

    public class WIDEtoGP {
        public static void main(String[] args) throws SQLException {
            try {
                // Source: Oracle
                String dbURL = "jdbc:oracle:thin:@(DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = XXXXX)(PORT = 1521))) (CONNECT_DATA = (SERVICE_NAME = XXXXX) (SRVR = DEDICATED)))";
                String strUserID = "XXXX";
                String strPassword = "XXXXX";
                Connection myConnection = DriverManager.getConnection(dbURL, strUserID, strPassword);

                String readRecordSQL = "select id,name from account";
                myConnection.setAutoCommit(false);
                PreparedStatement sqlStatement = myConnection.prepareStatement(readRecordSQL,
                        ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
                System.out.print("Loading into result set from Oracle: " + Calendar.getInstance().getTime() + "\r\n");
                sqlStatement.setFetchSize(50000);
                ResultSet rs = sqlStatement.executeQuery();

                // Target: Greenplum
                String dbURL1 = "jdbc:postgresql://xxxxx:5432/xxxx";
                String user = "xxxxxx";
                String pass = "xxxxxx";
                Connection conn2 = DriverManager.getConnection(dbURL1, user, pass);
                Statement GPsqlStatement = conn2.createStatement();
                String readGPRecordSQL = "truncate java_test";
                GPsqlStatement.execute(readGPRecordSQL);

                rs.beforeFirst();
                while (rs.next()) {
                    StringWriter stringWriter = new StringWriter();
                    CSVWriter csvWriter = new CSVWriter(stringWriter);
                    System.out.print("Start writing to CSV: " + Calendar.getInstance().getTime() + "\r\n");
                    csvWriter.writeAll(rs, true);

                    System.out.print("Start writing to byte array input stream " + Calendar.getInstance().getTime() + "\r\n");
                    String orresult = stringWriter.toString();
                    byte[] bytes = orresult.getBytes("UTF-8");
                    ByteArrayInputStream orinput = new ByteArrayInputStream(bytes);
                    System.out.print("End time writing to byte array " + Calendar.getInstance().getTime() + "\r\n");

                    System.out.print("Load to Greenplum starts " + Calendar.getInstance().getTime() + "\r\n");
                    CopyManager copyManager = new CopyManager((BaseConnection) conn2);
                    copyManager.copyIn("copy java_test from stdin csv header", orinput);
                    System.out.print("Load to Greenplum ends " + Calendar.getInstance().getTime() + "\r\n");
                    csvWriter.close();
                }

                rs.last();
                int size = rs.getRow();
                System.out.println("Row count of the load: " + size + "\r\n");

                rs.close();
                myConnection.close();
                conn2.close();
                System.out.print("Run time memory end: " + Runtime.getRuntime().freeMemory() + "\r\n");
            } catch (Exception e) {
                System.out.print("Error out time: " + Runtime.getRuntime().freeMemory() + "\r\n");
                System.out.println(e);
            }
        }
    }
Here is the exact error: java.lang.OutOfMemoryError: Java heap space

Any idea why this happens? I was fairly confident that iterating 50,000 rows at a time and closing every object I use would not run out of memory, since a single query can load as many as 500,000 rows at once without trouble.
I don't want to go the route of chunking via the SQL query itself (rather than the fetch size), because the table is updated every hour, and for some reason if this process runs longer than an hour it ends up corrupting the data.

I should also mention that without any while loop at all, moving the data directly, it handles up to 500,000 rows. I introduced the while loop to make sure it runs in batches while loading the data. It fails right after the "Start writing to CSV" message, i.e. it throws the memory error before anything is even written to the CSV object.
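For reference, the per-batch flushing I was trying to achieve looks roughly like this. This is only a standalone sketch: an in-memory row list stands in for the ResultSet, the hypothetical `toCsvBatches` helper stands in for CSVWriter, and collecting the batch strings stands in for the per-batch copyIn call:

    import java.util.ArrayList;
    import java.util.List;

    public class BatchSketch {
        // Build CSV text in batches of at most batchSize rows, so no single
        // String ever holds more than one batch. In the real job, each batch
        // string would be wrapped in a ByteArrayInputStream and handed to
        // CopyManager.copyIn before the buffer is reset.
        public static List<String> toCsvBatches(List<String[]> rows, int batchSize) {
            List<String> batches = new ArrayList<>();
            StringBuilder sb = new StringBuilder();
            int n = 0;
            for (String[] row : rows) {
                sb.append(String.join(",", row)).append('\n');
                if (++n == batchSize) {            // flush a full batch
                    batches.add(sb.toString());
                    sb.setLength(0);               // reuse the buffer
                    n = 0;
                }
            }
            if (n > 0) batches.add(sb.toString()); // flush the remainder
            return batches;
        }

        public static void main(String[] args) {
            List<String[]> rows = new ArrayList<>();
            for (int i = 1; i <= 5; i++) {
                rows.add(new String[]{String.valueOf(i), "name" + i});
            }
            // 5 rows with batch size 2 -> batches of 2, 2 and 1 rows
            System.out.println(toCsvBatches(rows, 2).size() + " batches"); // prints "3 batches"
        }
    }

This is the behavior I assumed the while loop plus setFetchSize(50000) would give me, but apparently it does not.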
Any suggestions or workarounds? Thanks in advance!