Integrating Cloud Firestore into Cloud Dataflow on Google Cloud

Asked: 2019-05-09 01:52:03

Tags: maven google-cloud-platform google-cloud-firestore google-cloud-dataflow

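I am integrating an "application" into Cloud Dataflow that writes JSON messages to Cloud Firestore. The problem is that the Apache Beam library (and its dependencies) and the Firestore library are incompatible. Below are excerpts from my pom, my Dataflow code, and the Maven build error: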

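The pipeline works very well when it reads from Pub/Sub or Cloud Storage and writes to Pub/Sub. However, as soon as I add the Firestore dependency, I get a dependency error at build time. I think the problem is with gRPC. I also tried the latest Apache Beam version, but I get the same error.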

pom

    <properties>
        <beam.version>2.8.0</beam.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.apache.beam</groupId>
            <artifactId>beam-sdks-java-core</artifactId>
            <version>${beam.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.beam</groupId>
            <artifactId>beam-sdks-java-io-google-cloud-platform</artifactId>
            <version>${beam.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.beam</groupId>
            <artifactId>beam-runners-google-cloud-dataflow-java</artifactId>
            <version>${beam.version}</version>
            <!--scope>runtime</scope-->
        </dependency>
        <dependency>
            <groupId>org.apache.beam</groupId>
            <artifactId>beam-runners-direct-java</artifactId>
            <version>${beam.version}</version>
            <scope>runtime</scope>
        </dependency>
        <dependency>
            <groupId>com.google.cloud</groupId>
            <artifactId>google-cloud-firestore</artifactId>
            <version>1.0.0</version>
        </dependency>
    </dependencies>

Dataflow (Java 8). FirestoreFn is the function that inserts into Firestore.

    PCollection<String> pcollStorage = pColl
            .apply("Read from GCS", TextIO.read().from(options.getInputFile()));
    PCollection<EventAvailabilityAlert> pcollEvent = pcollStorage
            .apply("CsvLineToJson", ParDo.of(new CsvLineToJsonMsgFn()))
            .apply("Firestore Insert", ParDo.of(new FirestoreFn()));
    PCollection<PubsubMessage> pcollPubsubMsg = pcollEvent
            .apply("JsonMsgToPubsubMsg", ParDo.of(new JsonMsgToPubsubMsgFn()));
    pcollPubsubMsg
            .apply("Sending To Pub/Sub", PubsubIO.writeMessages().to(pubSubProjectsFolder + projectId + pubSubTopicFolder + pubSubTopicName));
    pColl.run();

Maven error

Could not resolve version conflict among [
org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.8.0 -> io.grpc:grpc-core:jar:1.13.1, org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.8.0 
-> com.google.api:gax-grpc:jar:1.29.0 
-> io.grpc:grpc-protobuf:jar:1.10.1 
-> io.grpc:grpc-core:jar:1.10.1, org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.8.0 
-> com.google.api:gax-grpc:jar:1.29.0 
-> io.grpc:grpc-protobuf:jar:1.10.1 
-> io.grpc:grpc-protobuf-lite:jar:1.10.1 
-> io.grpc:grpc-core:jar:1.10.1, org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.8.0 
-> io.grpc:grpc-auth:jar:1.13.1 
-> io.grpc:grpc-core:jar:[1.13.1,1.13.1], org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.8.0 
-> io.grpc:grpc-netty:jar:1.13.1 
-> io.grpc:grpc-core:jar:[1.13.1,1.13.1], org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.8.0 
-> io.grpc:grpc-stub:jar:1.13.1 
-> io.grpc:grpc-core:jar:1.13.1, org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.8.0 
-> com.google.cloud.bigtable:bigtable-client-core:jar:1.4.0 
-> io.grpc:grpc-core:jar:1.10.1, org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.8.0 
-> io.grpc:grpc-all:jar:1.13.1 
-> io.grpc:grpc-core:jar:[1.13.1,1.13.1], org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.8.0 
-> io.grpc:grpc-all:jar:1.13.1 
-> io.grpc:grpc-okhttp:jar:1.13.1 
-> io.grpc:grpc-core:jar:[1.13.1,1.13.1], org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.8.0 
-> io.grpc:grpc-all:jar:1.13.1 
-> io.grpc:grpc-protobuf-nano:jar:1.13.1 
-> io.grpc:grpc-core:jar:1.13.1, org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.8.0 
-> io.grpc:grpc-all:jar:1.13.1 
-> io.grpc:grpc-testing:jar:1.13.1 
-> io.grpc:grpc-core:jar:[1.13.1,1.13.1], com.google.cloud:google-cloud-firestore:jar:1.0.0 
-> io.grpc:grpc-netty-shaded:jar:1.19.0 
-> io.grpc:grpc-core:jar:[1.19.0,1.19.0], com.google.cloud:google-cloud-firestore:jar:1.0.0 
-> io.opencensus:opencensus-contrib-grpc-util:jar:0.19.2 
-> io.grpc:grpc-core:jar:1.18.0]
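To make the message above easier to read: entries such as `[1.13.1,1.13.1]` are hard version requirements in Maven, while a bare `1.10.1` is only a preference. The following is a minimal Java sketch, not Maven's actual resolver, that models only the single-version ranges seen in this error. The class and method names are hypothetical; the version strings are copied from the error message.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Simplified model of the grpc-core requests listed in the Maven error. */
final class GrpcCoreConflict {

    /** A spec like "[1.13.1,1.13.1]" is a hard pin; a bare "1.10.1" is a soft preference. */
    static boolean isHardPin(String spec) {
        return spec.startsWith("[") && spec.endsWith("]");
    }

    /** Extracts the version from a single-version range "[x,x]". */
    static String pinnedVersion(String spec) {
        String[] bounds = spec.substring(1, spec.length() - 1).split(",");
        if (bounds.length != 2 || !bounds[0].equals(bounds[1])) {
            throw new IllegalArgumentException("Not a single-version range: " + spec);
        }
        return bounds[0];
    }

    /** Collects the distinct versions demanded by hard pins, in first-seen order. */
    static Set<String> distinctHardPins(List<String> specs) {
        Set<String> pins = new LinkedHashSet<>();
        for (String spec : specs) {
            if (isHardPin(spec)) {
                pins.add(pinnedVersion(spec));
            }
        }
        return pins;
    }

    /** In this simplified model, two different hard pins can never be satisfied together. */
    static boolean resolvable(List<String> specs) {
        return distinctHardPins(specs).size() <= 1;
    }

    public static void main(String[] args) {
        // grpc-core versions requested via Beam 2.8.0 and google-cloud-firestore 1.0.0.
        List<String> requested = Arrays.asList(
                "1.13.1", "1.10.1", "[1.13.1,1.13.1]", "[1.19.0,1.19.0]", "1.18.0");
        System.out.println("Hard pins: " + distinctHardPins(requested));
        System.out.println("Resolvable: " + resolvable(requested));
    }
}
```

In this model, the requests coming through Beam alone resolve, and the conflict only appears once the Firestore client adds its own hard pin on a different grpc-core version, which matches the behavior described in the question.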

Full Maven build error

org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal on project yyy: Could not resolve dependencies for project xx.xxx:yyy:jar:0.1: Failed to collect dependencies for xx.xxx:yyy:jar:0.1
    at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies (LifecycleDependencyResolver.java:269)
    at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies (LifecycleDependencyResolver.java:147)
    at org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved (MojoExecutor.java:248)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:202)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
    at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
    at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
    at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:288)
    at org.apache.maven.cli.MavenCli.main (MavenCli.java:192)
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:498)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)


I expect to insert the messages into Firestore in real time, and I think that integrating this functionality into Dataflow (as an "apply" step) is the best approach.
Is there a **workaround** for this issue, or is it definitely a compatibility bug?
Could you also explain whether there is another efficient design to achieve this goal (for example, calling the Firestore REST/gRPC API from Dataflow, with the latter implemented in another project)?
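As a rough illustration of the REST option mentioned above, the sketch below builds a Firestore REST `createDocument` request using only the Java 8 standard library, so the Firestore client library (and its gRPC dependencies) would not need to be on the classpath. It is a sketch under assumptions, not a tested integration: the class name, the project ID `my-project`, the collection `events`, and the field names are hypothetical, only string fields are handled, and obtaining an OAuth 2.0 access token is assumed to happen elsewhere.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Builds (and optionally sends) a Firestore REST createDocument request. */
final class FirestoreRestSketch {

    /** URL for creating a document with an auto-generated ID in the default database. */
    static String createDocumentUrl(String projectId, String collectionId) {
        return "https://firestore.googleapis.com/v1/projects/" + projectId
                + "/databases/(default)/documents/" + collectionId;
    }

    /** Escapes quotes, backslashes, and control characters for a JSON string literal. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Wraps string fields in Firestore's typed-value format: {"fields":{"k":{"stringValue":"v"}}}. */
    static String toDocumentJson(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{\"fields\":{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":{\"stringValue\":\"")
              .append(escape(e.getValue())).append("\"}");
        }
        return sb.append("}}").toString();
    }

    /** Sends the request and returns the HTTP status code; accessToken is assumed to be a valid OAuth 2.0 token. */
    static int post(String url, String accessToken, String body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Authorization", "Bearer " + accessToken);
        conn.setRequestProperty("Content-Type", "application/json; charset=UTF-8");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        return conn.getResponseCode();
    }

    public static void main(String[] args) {
        // Hypothetical record; no network call is made here.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("eventId", "42");
        fields.put("status", "available");
        System.out.println(createDocumentUrl("my-project", "events"));
        System.out.println(toDocumentJson(fields));
    }
}
```

Inside a pipeline, a `DoFn` could call `post` for each element, although one HTTP request per element may be slow at high volume, so batching or rate limiting would probably be needed.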

1 answer:

Answer 0 (score: 0)

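One way to achieve this is to use Cloud Functions for Firebase and Cloud Pub/Sub together with Cloud Dataflow. Instead of writing each JSON record to Firestore directly from the pipeline, write it to a Cloud Pub/Sub topic. Each published message then triggers a Cloud Function that reads it from Pub/Sub and writes it to Firestore, as follows: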

    import * as functions from 'firebase-functions';
    import * as admin from 'firebase-admin';

    admin.initializeApp();

    // Runs once for every message published to the topic.
    export const myFunction = functions.pubsub.topic('sushtopictarget').onPublish(message => {
       // message.json is the message payload parsed as JSON.
       functions.logger.info('Message received.', message.json);
       const userObject = {
          fname : message.json.fname,
          lname : message.json.lname,
       };
       // add() creates a document with an auto-generated ID. Return the promise
       // so the function is not terminated before the write completes.
       return admin.firestore().collection('sushtest2').add(userObject);
    });