Google Cloud Dataflow "Cannot read and write in different locations" error (Java SDK v2.0.0)

Date: 2017-07-28 13:17:18

Tags: java google-cloud-dataflow

I am using Google Cloud Dataflow, and when I run this code:

import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.PCollection;

public class Main {
    public static void main(String[] args) {

        String query = "SELECT * FROM [*****.*****]";

        Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).withValidation().create());

        // Read rows from BigQuery by issuing a query (legacy SQL table syntax).
        PCollection<TableRow> lines = p.apply(BigQueryIO.read().fromQuery(query));

        p.run();
    }
}

I get this:

(332b4f3b83bd3397): java.io.IOException: Query job beam_job_d1772eb4136d4982be55be20d173f63d_testradiateurmodegfcvsoasc07281159145481871-query failed, status: {
    "errorResult" : {
        "message" : "Cannot read and write in different locations: source: EU, destination: US",
        "reason" : "invalid"
    },
    "errors" : [ {
        "message" : "Cannot read and write in different locations: source: EU, destination: US",
        "reason" : "invalid"
    }],
    "state" : "DONE"
}.
    at org.apache.beam.sdk.io.gcp.bigquery.BigQueryQuerySource.executeQuery(BigQueryQuerySource.java:173)
    at org.apache.beam.sdk.io.gcp.bigquery.BigQueryQuerySource.getTableToExtract(BigQueryQuerySource.java:120)
    at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySourceBase.split(BigQuerySourceBase.java:87)
    at com.google.cloud.dataflow.worker.runners.worker.WorkerCustomSources.splitAndValidate(WorkerCustomSources.java:261)
    at com.google.cloud.dataflow.worker.runners.worker.WorkerCustomSources.performSplitTyped(WorkerCustomSources.java:209)
    at com.google.cloud.dataflow.worker.runners.worker.WorkerCustomSources.performSplitWithApiLimit(WorkerCustomSources.java:184)
    at com.google.cloud.dataflow.worker.runners.worker.WorkerCustomSources.performSplit(WorkerCustomSources.java:161)
    at com.google.cloud.dataflow.worker.runners.worker.WorkerCustomSourceOperationExecutor.execute(WorkerCustomSourceOperationExecutor.java:47)
    at com.google.cloud.dataflow.worker.runners.worker.DataflowWorker.executeWork(DataflowWorker.java:341)
    at com.google.cloud.dataflow.worker.runners.worker.DataflowWorker.doWork(DataflowWorker.java:297)
    at com.google.cloud.dataflow.worker.runners.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:244)
    at com.google.cloud.dataflow.worker.runners.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:125)
    at com.google.cloud.dataflow.worker.runners.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:105)
    at com.google.cloud.dataflow.worker.runners.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:92)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

I have already read post 3729850442135002 and https://github.com/GoogleCloudPlatform/DataflowJavaSDK/issues/405, but none of the solutions work for me.

For more information:

  • The BigQuery table is located in the EU
  • I tried launching the job with --zone=europe-west1-b (region europe-west1-b)
  • I use the DataflowRunner

When I go to the BigQuery web UI, I see these temporary datasets.
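Those temporary datasets are created by BigQueryIO to hold the query's intermediate results, and it is their US location that conflicts with the EU source table. One possible workaround (not from the original post): since the query is a plain `SELECT *`, the table can be read directly with `BigQueryIO.read().from(...)`, which skips the intermediate query job and its temporary dataset entirely. A minimal sketch, assuming Beam 2.0.0 and a placeholder table reference:

```java
import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.PCollection;

public class ReadTableDirectly {
    public static void main(String[] args) {
        Pipeline p = Pipeline.create(
                PipelineOptionsFactory.fromArgs(args).withValidation().create());

        // Reading the table directly exports it to GCS without running a
        // query job, so no US-located temporary dataset is involved.
        // "project:dataset.table" is a placeholder reference.
        PCollection<TableRow> rows =
                p.apply(BigQueryIO.read().from("project:dataset.table"));

        p.run();
    }
}
```

Note that the GCS temp location (`--tempLocation`) would still need to be a bucket in the EU for the export step to succeed.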

EDIT: I solved my problem by using version 1.9.0 of the Dataflow SDK.

0 answers:

No answers