Unable to extract from a BigQuery table that requires a partition filter

Time: 2018-03-27 21:21:29

Tags: google-bigquery

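When I try to extract data from a BigQuery table that requires a partition filter, the extract job fails.

Here is a simple example that creates such a table and runs an extract job.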

package com.example;

import com.google.cloud.bigquery.*;

public class BigQueryExtractTest {

    private static final String PROJECT_ID = "my-project-id";
    private static final String DATASET_ID = "test_dataset";
    private static final String GCS_LOCATION = "gs://my-bucket/path/to/files/part-*";

    public static void main(String[] args) throws Exception {
        // create BigQuery client
        BigQuery bigQuery = BigQueryOptions.newBuilder().setProjectId(PROJECT_ID).build().getService();

        // create dataset and table that requires partition filter
        bigQuery.create(DatasetInfo.of(DATASET_ID));
        bigQuery.query(QueryJobConfiguration.of(
                String.format("CREATE TABLE %s.table1 (\n", DATASET_ID) +
                        "stringColumn STRING,\n" +
                        "timeColumn TIMESTAMP\n" +
                        ") PARTITION BY DATE(timeColumn)\n" +
                        "OPTIONS(\n" +
                        "require_partition_filter=true\n" +
                        ")"));

        // extract table
        Job job = bigQuery.getTable(TableId.of(DATASET_ID, "table1"))
                .extract("NEWLINE_DELIMITED_JSON", GCS_LOCATION)
                .waitFor();

        // throw exception on error
        if (job != null && job.getStatus().getError() != null) {
            throw new Exception(job.getStatus().getError().toString());
        }
    }

}

The snippet above produces the following error:

Exception in thread "main" java.lang.Exception: BigQueryError{reason=invalidQuery, location=query, message=Cannot query over table 'my-project-id.test_dataset.table1' without a filter that can be used for partition elimination}
    at com.example.BigQueryExtractTest.main(BigQueryExtractTest.java:34)

The google-cloud-bigquery Maven dependency used for this example is shown below.

<dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-bigquery</artifactId>
    <version>1.23.0</version>
</dependency>

The example also threw an exception with dependency version 0.34.0-beta.

How can I specify a partition filter when running an extract job?

1 answer:

Answer 0 (score: 3)

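This is a bug, and a bug report is now tracking the issue. To work around this limitation, you can use the bq command-line tool to update the table so that it allows queries without a partition filter, perform the export, and then update the table to require the filter again. This applies, for example, to an ingestion-time partitioned table.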
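The workaround described above could look roughly like the following sketch. It is not the answer's original command listing. It assumes the bq CLI is installed and authenticated, and that its `--[no]require_partition_filter` flag on `bq update` and `--destination_format` flag on `bq extract` are available in your version; depending on the bq version, the update commands may need additional partitioning flags. The `run` helper, the `DRY_RUN` variable, and the function name are hypothetical names introduced only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: temporarily drop the partition-filter requirement, export, then restore it.
# Assumption: the bq CLI is installed and authenticated for the target project.

# Hypothetical helper: with DRY_RUN=1 it prints the command instead of executing it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: extract_with_filter_relaxed <dataset.table> <gs://destination-uri>
extract_with_filter_relaxed() {
  local table="$1"
  local gcs_uri="$2"
  local status=0

  # 1. Allow queries (and the extract job) without a partition filter.
  run bq update --norequire_partition_filter "$table" || return 1

  # 2. Export the table; remember the exit status so the restore step still runs.
  run bq extract --destination_format=NEWLINE_DELIMITED_JSON "$table" "$gcs_uri" || status=$?

  # 3. Require the partition filter again, even if the export failed.
  run bq update --require_partition_filter "$table" || return 1

  return "$status"
}
```

With the names from the question, a call might be `extract_with_filter_relaxed test_dataset.table1 'gs://my-bucket/path/to/files/part-*'`; running it first with `DRY_RUN=1` prints the three commands so they can be reviewed. Note that other queries against the table are not forced to supply a partition filter while the requirement is switched off.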