I have a pipeline built with Apache Beam that reads a CSV file and inserts its contents into BigQuery. The pipeline has three steps (applies): 1. read the CSV, 2. convert each text line to a TableRow, and 3. write the rows to BigQuery (BigQueryIO.writeTableRows()).
I created a template, but when I execute it (with the Dataflow runner) it only runs the BigQuery step. The steps that read the CSV and convert the text to TableRow never start.
What is going on?
-- This is the failed execution --
I tried commenting out the BigQuery block (apply), and then the preceding steps do run. I also tried building the pipeline in parallel and running it. The problem only appears when I chain the BigQuery step (apply) onto the pipeline.
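For reference, the variant with the BigQuery apply commented out, which does run the read and conversion steps, is just the pipeline below minus the last apply:

p.apply("read csv", TextIO.read().from(sourceFilePath))
        .apply("string to tablerow", ParDo.of(new FormatForBigquery()));
// .apply("write to bigquery", ...) is commented out in this variant
p.run();

The full pipeline, with the BigQuery step included, is: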
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.ParDo;

public static void main(String[] args) throws Throwable {
    String sourceFilePath = "gs://dgomez_test/input.csv";
    String tempLocationPath = "gs://dgomez_test/tmp";

    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).withValidation().create();
    options.setTempLocation(tempLocationPath);
    options.setJobName("csvtobq");

    Pipeline p = Pipeline.create(options);

    // 1. read the CSV, 2. convert each line to a TableRow, 3. write the rows to BigQuery
    p.apply("read csv", TextIO.read().from(sourceFilePath))
            .apply("string to tablerow", ParDo.of(new FormatForBigquery()))
            .apply("write to bigquery",
                    BigQueryIO.writeTableRows().to(TABLE) // TABLE is a constant defined elsewhere
                            .withSchema(FormatForBigquery.getSchema())
                            .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
                            .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE));

    p.run();
}
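The pipeline also refers to a FormatForBigquery DoFn that is not shown above. For context, a minimal sketch of such a class is below; the column names (name, value), their types, and the simple comma split are placeholders, not the actual schema or parsing from the job:

import java.util.ArrayList;
import java.util.List;
import com.google.api.services.bigquery.model.TableFieldSchema;
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TableSchema;
import org.apache.beam.sdk.transforms.DoFn;

public class FormatForBigquery extends DoFn<String, TableRow> {

    @ProcessElement
    public void processElement(ProcessContext c) {
        // Placeholder parsing: split one CSV line into two columns (no quoting or escaping handled).
        String[] parts = c.element().split(",");
        c.output(new TableRow()
                .set("name", parts[0])
                .set("value", parts[1]));
    }

    // Schema handed to BigQueryIO.writeTableRows().withSchema(...) in main().
    public static TableSchema getSchema() {
        List<TableFieldSchema> fields = new ArrayList<>();
        fields.add(new TableFieldSchema().setName("name").setType("STRING"));
        fields.add(new TableFieldSchema().setName("value").setType("STRING"));
        return new TableSchema().setFields(fields);
    }
}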