How do I use BigQuery Standard SQL in Dataflow?

Date: 2016-07-20 14:16:39

Tags: google-bigquery google-cloud-dataflow

I want to run a simple query in Dataflow using BigQuery Standard SQL, but I cannot find where to enable this option. How can I do that?

pipeline.apply(Read.named(metricName + " Read").fromQuery("select * from table1 UNION DISTINCT select * from table2"));

When I try to run it, I get this error:

2016-07-20T13:35:22.543Z: Error:   (6e0ad847af078af9): Workflow failed. Causes: (fe6c7bcb1a35a057): S01:warehouse_handled_returns Read/DataflowPipelineRunner.BatchBigQueryIONativeRead+ParMultiDo(FormatData)+warehouse_handled_returns Write/DataflowPipelineRunner.BatchBigQueryIOWrite/DataflowPipelineRunner.BatchBigQueryIONativeWrite failed., (7f29f1d9435d27bc): BigQuery execution failed., (7f29f1d9435d2823): Error:
Message: Encountered "" at line 23, column 27.

HTTP Code: 400

3 answers:

Answer 0 (score: 4)

You can now use Standard SQL with Dataflow.

https://cloud.google.com/dataflow/model/bigquery-io

// Standard SQL quotes table names with backticks; usingStandardSql() selects that dialect.
PCollection<TableRow> weatherData = p.apply(
    BigQueryIO.Read
        .named("ReadYearAndTemp")
        .fromQuery("SELECT year, mean_temp FROM `samples.weather_stations`")
        .usingStandardSql());

Answer 1 (score: 1)

Until Dataflow officially supports BigQuery Standard SQL, one workaround is to start the query with the following comment:

#StandardSQL

This tells BigQuery to use Standard SQL instead of Legacy SQL.
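As a rough sketch of this workaround, the prefix can be added in plain Java before the string is handed to fromQuery. The class and method names below (StandardSqlPrefix, withStandardSqlPrefix) are made up for illustration and are not part of any SDK. The sketch also assumes that Dataflow passes the query text to BigQuery unchanged, which is worth verifying for your SDK version.

```java
// Hypothetical helper, not part of the Dataflow SDK: puts the "#standardSQL"
// prefix on the first line of a query so BigQuery parses it as Standard SQL.
final class StandardSqlPrefix {
    // BigQuery matches this prefix case-insensitively.
    private static final String PREFIX = "#standardSQL";

    private StandardSqlPrefix() {}

    /** Returns the trimmed query with the prefix on its own first line, unless it is already there. */
    static String withStandardSqlPrefix(String query) {
        String trimmed = query.trim();
        // Case-insensitive check so an existing "#StandardSQL" is not duplicated.
        if (trimmed.regionMatches(true, 0, PREFIX, 0, PREFIX.length())) {
            return trimmed;
        }
        return PREFIX + "\n" + trimmed;
    }

    public static void main(String[] args) {
        // The query from the question, used here only as sample input.
        String query = "select * from table1 UNION DISTINCT select * from table2";
        System.out.println(withStandardSqlPrefix(query));
    }
}
```

The returned string would then be passed to fromQuery(...) in place of the raw query. The prefix has to come before the query text itself, which is why the helper trims leading whitespace first.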

Answer 2 (score: 0)

The Dataflow SDK for Java supports the BigQuery Standard SQL dialect starting with version 1.8.0.