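I have a BigQuery dataset located in the new "asia-northeast1" region. I'm trying to run a Dataflow templated pipeline (running in an Australian region) to read a table from it. Even though the dataset and table do exist, it throws the following error: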
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
{
"code" : 404,
"errors" : [ {
"domain" : "global",
"message" : "Not found: Dataset grey-sort-challenge:Konnichiwa_Tokyo",
"reason" : "notFound"
} ],
"message" : "Not found: Dataset grey-sort-challenge:Konnichiwa_Tokyo"
}
Am I doing something wrong here?
/**
 * BigQuery -> ParDo -> GCS (one file)
 */
public class BigQueryTableToOneFile {

    public static void main(String[] args) throws Exception {
        PipelineOptionsFactory.register(TemplateOptions.class);
        TemplateOptions options = PipelineOptionsFactory
                .fromArgs(args)
                .withValidation()
                .as(TemplateOptions.class);
        options.setAutoscalingAlgorithm(THROUGHPUT_BASED);
        Pipeline pipeline = Pipeline.create(options);
        pipeline.apply(BigQueryIO.read().from(options.getBigQueryTableName()).withoutValidation())
                .apply(ParDo.of(new DoFn<TableRow, String>() {
                    @ProcessElement
                    public void processElement(ProcessContext c) throws Exception {
                        String commaSep = c.element().values()
                                .stream()
                                .map(cell -> cell.toString().trim())
                                .collect(Collectors.joining("\",\""));
                        c.output(commaSep);
                    }
                }))
                .apply(TextIO.write().to(options.getOutputFile())
                        .withoutSharding()
                        .withWritableByteChannelFactory(GZIP)
                );
        pipeline.run();
    }

    public interface TemplateOptions extends DataflowPipelineOptions {
        @Description("The BigQuery table to read from in the format project:dataset.table")
        @Default.String("bigquery-samples:wikipedia_benchmark.Wiki1k")
        ValueProvider<String> getBigQueryTableName();

        void setBigQueryTableName(ValueProvider<String> value);

        @Description("The name of the output file to produce in the format gs://bucket_name/filname.csv")
        @Default.String("gs://bigquery-table-to-one-file/output/bar.csv.gz")
        ValueProvider<String> getOutputFile();

        void setOutputFile(ValueProvider<String> value);
    }
}
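As a side note on what the ParDo step in the pipeline emits, the following is a minimal plain-Java sketch (no Beam dependency) of the same joining logic. It is not part of the original pipeline: the class name RowToCsv, both method names, and the sample row are hypothetical, and a LinkedHashMap stands in for TableRow, which is itself a Map. The second method is only a suggested variant, assuming fully quoted CSV fields are what is wanted.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical stand-alone class; it only mirrors the DoFn body from the question.
final class RowToCsv {

    // Same logic as the DoFn: trim each cell and join with the three characters ",".
    // Note that nothing is added before the first cell or after the last one.
    static String joinAsInQuestion(Map<String, Object> row) {
        return row.values().stream()
                .map(cell -> cell.toString().trim())
                .collect(Collectors.joining("\",\""));
    }

    // Suggested variant (an assumption about the intent): the three-argument
    // Collectors.joining also adds an opening and a closing quote.
    static String joinFullyQuoted(Map<String, Object> row) {
        return row.values().stream()
                .map(cell -> cell.toString().trim())
                .collect(Collectors.joining("\",\"", "\"", "\""));
    }

    public static void main(String[] args) {
        // Made-up sample row, for illustration only.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("city", " Tokyo ");
        row.put("count", 42);

        System.out.println(joinAsInQuestion(row)); // Tokyo","42
        System.out.println(joinFullyQuoted(row));  // "Tokyo","42"
    }
}
```

This is unrelated to the 404 itself; it only shows that, as written, each output line has quotes between cells but not around the first and last cell.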
Arguments:
--project=grey-sort-challenge
--runner=DataflowRunner
--jobName=bigquery-table-to-one-file
--maxNumWorkers=1
--zone=australia-southeast1-a
--stagingLocation=gs://bigquery-table-to-one-file/jars
--tempLocation=gs://bigquery-table-to-one-file/tmp
--templateLocation=gs://bigquery-table-to-one-file/template
Job ID: 2018-05-05_05_37_08-8260293482986343692
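The table name is a ValueProvider option, so it is supplied when the template is executed and does not appear in the arguments above. One quick sanity check is to confirm that the value passed at run time really splits into the project and dataset named in the 404 message. The sketch below assumes the project:dataset.table format from the option description; TableSpec is a hypothetical helper, not Beam's own parser, its pattern is deliberately simplified, and "some_table" is a placeholder because the table name is not given here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Beam: splits "project:dataset.table".
final class TableSpec {

    // Simplified pattern (an assumption): it does not cover domain-scoped
    // project IDs or partition decorators.
    private static final Pattern SPEC =
            Pattern.compile("([a-z][a-z0-9-]*):([A-Za-z0-9_]+)\\.([A-Za-z0-9_]+)");

    final String project;
    final String dataset;
    final String table;

    private TableSpec(String project, String dataset, String table) {
        this.project = project;
        this.dataset = dataset;
        this.table = table;
    }

    static TableSpec parse(String spec) {
        Matcher m = SPEC.matcher(spec);
        if (!m.matches()) {
            throw new IllegalArgumentException(
                    "Expected project:dataset.table, got: " + spec);
        }
        return new TableSpec(m.group(1), m.group(2), m.group(3));
    }

    // Formats the dataset the same way the error message prints it.
    String datasetRef() {
        return project + ":" + dataset;
    }

    public static void main(String[] args) {
        // "some_table" is a placeholder table name.
        TableSpec spec = parse("grey-sort-challenge:Konnichiwa_Tokyo.some_table");
        System.out.println(spec.datasetRef()); // grey-sort-challenge:Konnichiwa_Tokyo
    }
}
```

If the parsed project and dataset match the ones in the error and the dataset is visible in the BigQuery console, a typo in the parameter is unlikely to be the cause.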
Answer 0 (score: 0)
Sorry about this issue. It will be fixed in the upcoming Beam SDK 2.5.0 release. In the meantime, you can try the current HEAD snapshot from the Beam repo.