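I am trying to read a huge Hive table with 13 million records using Groovy. The data is stored in Parquet format. I wrote the code below, but it fails with an OOM "Java heap space" error.
I have given the JVM up to 32 GB of memory and called setFetchSize(5000),
but I still get the error.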
JAVA_OPTS="-Xms1024M"
JAVA_OPTS="-Xmx32556M"
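If the two JAVA_OPTS assignments above are made one after the other in the same script, the second replaces the first rather than adding to it, and how JAVA_OPTS reaches the JVM depends on the launcher. One way to confirm which heap limit is actually in effect is a check along these lines. This is only a minimal sketch: the class name HeapCheck and the method maxHeapMegabytes are hypothetical and not part of the original script.

```java
// Hypothetical diagnostic class: not part of the original script.
class HeapCheck {
    // Returns the maximum heap the JVM will attempt to use, in megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Prints the limit of the JVM this class runs in.
        System.out.println("Effective max heap (MB): " + maxHeapMegabytes());
    }
}
```

Calling Runtime.getRuntime().maxMemory() from inside the Groovy script itself would show the limit of the process that actually runs the query.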
Any help would be greatly appreciated.
Code:
String contSql = "select * from staging.cont_staging";
ResultSet resRateRecords = stmt.executeQuery(contSql);
Map<String, Map<String, String>> masterRecords = new HashMap<String, Map<String, String>>();
Map<String, String> existingRecords = null;
int count = 0;
resRateRecords.setFetchSize(5000);
while (resRateRecords.next()) {
    try {
        existingRecords = new HashMap<String, String>();
        // Key each row by contract_id plus a running counter, so every row gets its own entry
        masterRecords.put(resRateRecords.getString("contract_id") + "#" + count++, existingRecords);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
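The loop above adds one entry to masterRecords for every row, so all 13 million entries stay on the heap at the same time. If the rows do not all need to be in memory together, one possible alternative is to hand them off in bounded batches. The following is a sketch under that assumption, not a drop-in fix: BatchedReader and forEachBatch are hypothetical names, and a plain Iterator stands in for the ResultSet so the example runs without Hive.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: not part of the original code.
class BatchedReader {
    // Drains `rows` in batches of at most `batchSize` and hands each batch to `sink`.
    // A fresh list is started after every hand-off, so this method itself
    // holds at most one batch at a time. Returns the total number of rows seen.
    static <T> long forEachBatch(Iterator<T> rows, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        long total = 0;
        List<T> batch = new ArrayList<>(batchSize);
        while (rows.hasNext()) {
            batch.add(rows.next());
            total++;
            if (batch.size() == batchSize) {
                sink.accept(batch);
                batch = new ArrayList<>(batchSize);
            }
        }
        // Flush the final, possibly smaller, batch.
        if (!batch.isEmpty()) {
            sink.accept(batch);
        }
        return total;
    }
}
```

The memory bound only holds if the sink does not keep references to the batches it receives, for example by writing each batch out or aggregating it before returning.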
Error:
java.lang.OutOfMemoryError: Java heap space
at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:355)
at org.apache.thrift.protocol.TBinaryProtocol.readString(TBinaryProtocol.java:347)
at org.apache.hive.service.cli.thrift.TStringColumn$TStringColumnStandardScheme.read(TStringColumn.java:453)
at org.apache.hive.service.cli.thrift.TStringColumn$TStringColumnStandardScheme.read(TStringColumn.java:433)
at org.apache.hive.service.cli.thrift.TStringColumn.read(TStringColumn.java:367)
at org.apache.hive.service.cli.thrift.TColumn.standardSchemeReadValue(TColumn.java:328)
at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:224)
at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:213)
at org.apache.thrift.TUnion.read(TUnion.java:138)
at org.apache.hive.service.cli.thrift.TRowSet$TRowSetStandardScheme.read(TRowSet.java:573)
at org.apache.hive.service.cli.thrift.TRowSet$TRowSetStandardScheme.read(TRowSet.java:525)
at org.apache.hive.service.cli.thrift.TRowSet.read(TRowSet.java:451)
at org.apache.hive.service.cli.thrift.TFetchResultsResp$TFetchResultsRespStandardScheme.read(TFetchResultsResp.java:518)
at org.apache.hive.service.cli.thrift.TFetchResultsResp$TFetchResultsRespStandardScheme.read(TFetchResultsResp.java:486)
at org.apache.hive.service.cli.thrift.TFetchResultsResp.read(TFetchResultsResp.java:408)
at org.apache.hive.service.cli.thrift.TCLIService$FetchResults_result$FetchResults_resultStandardScheme.read(TCLIService.java:13251)
at org.apache.hive.service.cli.thrift.TCLIService$FetchResults_result$FetchResults_resultStandardScheme.read(TCLIService.java:13236)
at org.apache.hive.service.cli.thrift.TCLIService$FetchResults_result.read(TCLIService.java:13183)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_FetchResults(TCLIService.java:505)
at org.apache.hive.service.cli.thrift.TCLIService$Client.FetchResults(TCLIService.java:492)
at org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:335)
at java_sql_ResultSet$next.call(Unknown Source)
at BEContractRateLoad.fetchContractRateRecords(DestRateLoad.groovy:300)
at BEContractRateLoad.processContractRecords(DestRateLoad.groovy:397)
at BEContractRateLoad$processContractRecords$1.call(Unknown Source)
Groovy has reported an error, terminating