I'm using Google Cloud Dataflow, and when I run this code:
public static void main(String[] args) {
    String query = "SELECT * FROM [*****.*****]";
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).withValidation().create());
    PCollection<TableRow> lines = p.apply(BigQueryIO.read().fromQuery(query));
    p.run();
}
I get this:
(332b4f3b83bd3397): java.io.IOException: Query job beam_job_d1772eb4136d4982be55be20d173f63d_testradiateurmodegfcvsoasc07281159145481871-query failed, status: {
"errorResult" : {
"message" : "Cannot read and write in different locations: source: EU, destination: US",
"reason" : "invalid"
},
"errors" : [ {
"message" : "Cannot read and write in different locations: source: EU, destination: US",
"reason" : "invalid"
}],
"state" : "DONE"
}.
at org.apache.beam.sdk.io.gcp.bigquery.BigQueryQuerySource.executeQuery(BigQueryQuerySource.java:173)
at org.apache.beam.sdk.io.gcp.bigquery.BigQueryQuerySource.getTableToExtract(BigQueryQuerySource.java:120)
at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySourceBase.split(BigQuerySourceBase.java:87)
at com.google.cloud.dataflow.worker.runners.worker.WorkerCustomSources.splitAndValidate(WorkerCustomSources.java:261)
at com.google.cloud.dataflow.worker.runners.worker.WorkerCustomSources.performSplitTyped(WorkerCustomSources.java:209)
at com.google.cloud.dataflow.worker.runners.worker.WorkerCustomSources.performSplitWithApiLimit(WorkerCustomSources.java:184)
at com.google.cloud.dataflow.worker.runners.worker.WorkerCustomSources.performSplit(WorkerCustomSources.java:161)
at com.google.cloud.dataflow.worker.runners.worker.WorkerCustomSourceOperationExecutor.execute(WorkerCustomSourceOperationExecutor.java:47)
at com.google.cloud.dataflow.worker.runners.worker.DataflowWorker.executeWork(DataflowWorker.java:341)
at com.google.cloud.dataflow.worker.runners.worker.DataflowWorker.doWork(DataflowWorker.java:297)
at com.google.cloud.dataflow.worker.runners.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:244)
at com.google.cloud.dataflow.worker.runners.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:125)
at com.google.cloud.dataflow.worker.runners.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:105)
at com.google.cloud.dataflow.worker.runners.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:92)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I have already read posts 37298504 and 42135002, as well as https://github.com/GoogleCloudPlatform/DataflowJavaSDK/issues/405, but none of the solutions works for me.
Some more information:
I tried using --zone=europe-west1-b
and region=europe-west1-b
I use the DataflowRunner
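For context, these flags can also be set programmatically on the pipeline options instead of on the command line. This is only a sketch, assuming the Beam DataflowPipelineOptions API; the bucket path is a placeholder:

```java
import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

// Sketch: set the worker zone and temp location in code rather than
// via --zone on the command line. "gs://my-bucket/tmp" is a placeholder.
DataflowPipelineOptions options = PipelineOptionsFactory.fromArgs(args)
    .withValidation()
    .as(DataflowPipelineOptions.class);
options.setZone("europe-west1-b");             // where workers run
options.setTempLocation("gs://my-bucket/tmp"); // staging/temp files
Pipeline p = Pipeline.create(options);
```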
When I go to the BigQuery web UI, I see these temporary datasets:
EDIT: I solved my problem by using version 1.9.0 of the Dataflow SDK.