I'm using the starschema JDBC driver for Google BigQuery in Pentaho PDI:
http://code.google.com/p/starschema-bigquery-jdbc/
A query run through the BigQuery web console returns 129,993 rows, but when I execute the same query through the JDBC driver it returns only 100,000 rows. Is there some option or limit I'm not aware of?
Answer 0 (score: 1)
The StarSchema code appears to return only the first page of results. The code here should be updated to fetch the remaining pages. It should look something like this:
public static GetQueryResultsResponse getQueryResults(Bigquery bigquery,
        String projectId, Job completedJob) throws IOException {
    GetQueryResultsResponse queryResult = bigquery.jobs()
            .getQueryResults(projectId,
                    completedJob.getJobReference().getJobId()).execute();
    // Keep fetching pages until all rows have been accumulated.
    while (queryResult.getTotalRows() > queryResult.getRows().size()) {
        queryResult.getRows().addAll(
                bigquery.jobs()
                        .getQueryResults(projectId,
                                completedJob.getJobReference().getJobId())
                        .setStartIndex(queryResult.getRows().size())
                        .execute()
                        .getRows());
    }
    return queryResult;
}
Answer 1 (score: 1)
Modifying the code from Jordan's answer, the solution is as follows:
public static GetQueryResultsResponse getQueryResults(Bigquery bigquery,
        String projectId, Job completedJob) throws IOException {
    GetQueryResultsResponse queryResult = bigquery.jobs()
            .getQueryResults(projectId,
                    completedJob.getJobReference().getJobId()).execute();
    long totalRows = queryResult.getTotalRows().longValue();
    if (totalRows == 0) {
        // If we don't have results, we'd get a NullPointerException
        // on queryResult.getRows().size(), so return early.
        return queryResult;
    }
    while (totalRows > (long) queryResult.getRows().size()) {
        queryResult.getRows().addAll(
                bigquery.jobs()
                        .getQueryResults(projectId,
                                completedJob.getJobReference().getJobId())
                        .setStartIndex(BigInteger.valueOf((long) queryResult.getRows().size()))
                        .execute()
                        .getRows());
    }
    return queryResult;
}
That should do the trick. I've also uploaded a new build, named bqjdbc-1.3.1.jar, to Google Code.
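The accumulation loop in the fix above can be illustrated in isolation. This is a minimal sketch, not the driver's actual code: the `fetchPage` helper is hypothetical, standing in for `bigquery.jobs().getQueryResults(...).setStartIndex(...).execute().getRows()`, and it simulates a source that caps each call at 100,000 rows, matching the symptom in the question.

```java
import java.util.ArrayList;
import java.util.List;

public class PaginationSketch {
    static final long TOTAL_ROWS = 129_993; // total rows the query produces
    static final int PAGE_SIZE = 100_000;   // per-call cap, as observed with the driver

    // Hypothetical stand-in for getQueryResults(...).setStartIndex(startIndex):
    // returns up to PAGE_SIZE row values starting at startIndex.
    static List<Long> fetchPage(long startIndex) {
        List<Long> page = new ArrayList<>();
        long end = Math.min(startIndex + PAGE_SIZE, TOTAL_ROWS);
        for (long i = startIndex; i < end; i++) {
            page.add(i);
        }
        return page;
    }

    public static void main(String[] args) {
        // Same loop shape as the fixed getQueryResults(): fetch the first page,
        // then keep appending pages (offset by rows collected so far) until the
        // accumulated list reaches the total row count.
        List<Long> rows = fetchPage(0);
        while (TOTAL_ROWS > rows.size()) {
            rows.addAll(fetchPage(rows.size()));
        }
        System.out.println(rows.size()); // prints 129993
    }
}
```

Without the while loop, only the first page (100,000 rows) would ever be returned, which is exactly the mismatch described in the question.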