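When I try to extract data from a BigQuery table that requires a partition filter, the extract job fails.
Here is a simple example that creates such a table and runs an extract job.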
package com.example;
import com.google.cloud.bigquery.*;
public class BigQueryExtractTest {
    private static final String PROJECT_ID = "my-project-id";
    private static final String DATASET_ID = "test_dataset";
    private static final String GCS_LOCATION = "gs://my-bucket/path/to/files/part-*";
    public static void main(String[] args) throws Exception {
        // create BigQuery client
        BigQuery bigQuery = BigQueryOptions.newBuilder().setProjectId(PROJECT_ID).build().getService();
        // create dataset and table that requires partition filter
        bigQuery.create(DatasetInfo.of(DATASET_ID));
        bigQuery.query(QueryJobConfiguration.of(
                String.format("CREATE TABLE %s.table1 (\n", DATASET_ID) +
                "stringColumn STRING,\n" +
                "timeColumn TIMESTAMP\n" +
                ") PARTITION BY DATE(timeColumn)\n" +
                "OPTIONS(\n" +
                "require_partition_filter=true\n" +
                ")"));
        // extract table
        Job job = bigQuery.getTable(TableId.of(DATASET_ID, "table1"))
                .extract("NEWLINE_DELIMITED_JSON", GCS_LOCATION)
                .waitFor();
        // throw exception on error
        if (job != null && job.getStatus().getError() != null) {
            throw new Exception(job.getStatus().getError().toString());
        }
    }
}
The snippet above produces the following error:
Exception in thread "main" java.lang.Exception: BigQueryError{reason=invalidQuery, location=query, message=Cannot query over table 'my-project-id.test_dataset.table1' without a filter that can be used for partition elimination}
at com.example.BigQueryExtractTest.main(BigQueryExtractTest.java:34)
The google-cloud-bigquery Maven dependency used for this example is shown below.
<dependency>
<groupId>com.google.cloud</groupId>
<artifactId>google-cloud-bigquery</artifactId>
<version>1.23.0</version>
</dependency>
The example also uses dependency version 0.34.0-beta.
How do I specify a partition filter when running an extract job?
Answer 0 (score: 3)
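This is a bug, and a bug report is now tracking the issue. To work around the limitation, you can use the bq
command-line tool to update the table so that it allows queries without a partition filter, perform the export, and then update the table to require the filter again. For example, with an ingestion-time partitioned table: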
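Since the question is in Java, here is a minimal sketch of that three-step workaround as a small wrapper around the bq command-line tool. It assumes bq is installed and authenticated, and that the --[no]require_partition_filter flag of bq update and the --destination_format flag of bq extract behave as I understand them. The class and method names are hypothetical, and the table and bucket names are just the ones from the question.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: disable the partition filter, extract, then re-enable it. */
final class PartitionFilterExportWorkaround {

    /** Builds the three bq invocations in order; nothing is executed here. */
    static List<List<String>> buildCommands(String table, String gcsUri) {
        return Arrays.asList(
                // 1. allow access without a partition filter
                Arrays.asList("bq", "update", "--norequire_partition_filter", table),
                // 2. run the export
                Arrays.asList("bq", "extract",
                        "--destination_format=NEWLINE_DELIMITED_JSON", table, gcsUri),
                // 3. require the partition filter again
                Arrays.asList("bq", "update", "--require_partition_filter", table));
    }

    /** Runs one command and fails if it exits with a non-zero status. */
    static void run(List<String> command) throws IOException, InterruptedException {
        int exit = new ProcessBuilder(command).inheritIO().start().waitFor();
        if (exit != 0) {
            throw new IOException("Command failed (" + exit + "): " + String.join(" ", command));
        }
    }

    /** Executes the workaround; the filter is restored even if the extract fails. */
    static void export(String table, String gcsUri) throws IOException, InterruptedException {
        List<List<String>> commands = buildCommands(table, gcsUri);
        run(commands.get(0));
        try {
            run(commands.get(1));
        } finally {
            run(commands.get(2));
        }
    }

    public static void main(String[] args) {
        // Dry run: only print the commands that export(...) would execute.
        for (List<String> command : buildCommands(
                "test_dataset.table1", "gs://my-bucket/path/to/files/part-*")) {
            System.out.println(String.join(" ", command));
        }
    }
}
```

Calling export("test_dataset.table1", GCS_LOCATION) would actually run the commands. The try/finally is there so the table does not stay without its partition-filter requirement if the extract step fails.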