Apache Beam 2.2: pipeline.apply throws NoSuchMethodError

Time: 2017-12-08 11:44:40

Tags: google-cloud-dataflow apache-beam

    public static void main(String[] args) {
        // Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).withValidation().create());
        DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
        options.setRunner(DataflowRunner.class);
        options.setStagingLocation("gs://bucketname/stageapache");
        options.setTempLocation("gs://bucketname/stageapachetemp");
        options.setProject("projectid");
        Pipeline p = Pipeline.create(options);
        p.apply(TextIO.read().from("gs://bucketname/filename.csv"));
        // p.apply(FileIO.match().filepattern("gs://bucketname/f.csv"));
        p.run();
    }

pom.xml

<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-sdks-java-core</artifactId>
  <version>2.2.0</version>
</dependency>
<dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-io-google-cloud-platform</artifactId>
    <version>2.0.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.beam/beam-runners-google-cloud-dataflow-java -->
<dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-runners-google-cloud-dataflow-java</artifactId>
    <version>2.0.0</version>
</dependency>

Error

Dec 08, 2017 5:09:35 PM org.apache.beam.runners.dataflow.DataflowRunner fromOptions
INFO: PipelineOptions.filesToStage was not specified. Defaulting to files from the classpath: will stage 85 files. Enable logging at DEBUG level to see which files will be staged.
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.beam.sdk.values.PCollection.createPrimitiveOutputInternal(Lorg/apache/beam/sdk/Pipeline;Lorg/apache/beam/sdk/values/WindowingStrategy;Lorg/apache/beam/sdk/values/PCollection$IsBounded;)Lorg/apache/beam/sdk/values/PCollection;
    at org.apache.beam.runners.dataflow.PrimitiveParDoSingleFactory$ParDoSingle.expand(PrimitiveParDoSingleFactory.java:68)
    at org.apache.beam.runners.dataflow.PrimitiveParDoSingleFactory$ParDoSingle.expand(PrimitiveParDoSingleFactory.java:58)
    at org.apache.beam.sdk.Pipeline.applyReplacement(Pipeline.java:550)
    at org.apache.beam.sdk.Pipeline.replace(Pipeline.java:280)
    at org.apache.beam.sdk.Pipeline.replaceAll(Pipeline.java:201)
    at org.apache.beam.runners.dataflow.DataflowRunner.replaceTransforms(DataflowRunner.java:688)
    at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:498)
    at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:153)
    at org.apache.beam.sdk.Pipeline.run(Pipeline.java:303)
    at org.apache.beam.sdk.Pipeline.run(Pipeline.java:289)
    at com.pearson.apachebeam.StarterPipeline.main(StarterPipeline.java:60)

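In the code above, if I add the FileIO / TextIO line, I get the error shown above. If I run it without that line, a job is created, but it fails because it has no operations. I am stuck on this in my development. I migrated to Apache Beam 2.2 to control which files we read from storage.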

Any help would be appreciated.

Thanks

1 answer:

Answer 0: (score: 2)

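The problem is that your pom.xml depends on different components of the Beam SDK at different versions: beam-sdks-java-core is at 2.2.0, while beam-sdks-java-io-google-cloud-platform and beam-runners-google-cloud-dataflow-java are at 2.0.0. They need to be at the same version.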
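
One way to keep the three artifacts in step is to declare the version once as a Maven property and reference it from each dependency. The fragment below is a minimal sketch of that idea, not the asker's actual pom.xml: the property name beam.version is an arbitrary choice made here for illustration, and 2.2.0 is assumed as the target because that is the version the question migrates to.

```xml
<!-- Sketch only: "beam.version" is a hypothetical property name chosen for this example. -->
<properties>
  <beam.version>2.2.0</beam.version>
</properties>

<dependencies>
  <!-- All three Beam artifacts resolve to the same version through the shared property. -->
  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-core</artifactId>
    <version>${beam.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-io-google-cloud-platform</artifactId>
    <version>${beam.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-runners-google-cloud-dataflow-java</artifactId>
    <version>${beam.version}</version>
  </dependency>
</dependencies>
```

After changing the versions, running mvn dependency:tree -Dincludes=org.apache.beam should list every Beam artifact on the classpath with its resolved version, which makes any remaining mismatch easy to spot.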