Google Cloud Dataflow DatastoreIO dependency issues for reading and writing

Date: 2017-05-22 17:24:48

Tags: java maven dependencies google-cloud-datastore google-cloud-dataflow

What I need: the correct combination of dependency versions to read from and write to Datastore via DatastoreIO.v1() in Dataflow (v1.9.0), and which dependencies need to be referenced in the pom?

The Dataflow-specific dependencies referenced in the pom, taken from the Maven repository for Dataflow 1.9.0:

com.google.cloud.dataflow/google-cloud-dataflow-java-sdk-all/1.9.0
com.google.cloud.datastore/datastore-v1-protos/1.0.1
com.google.cloud.datastore/datastore-v1-proto-client/1.1.0
com.google.protobuf/protobuf-java/3.0.0-beta-1
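
Expressed as pom.xml entries, the coordinates above would look like the following sketch (groupId/artifactId/version taken directly from the list; nothing else assumed):

```xml
<!-- Dataflow SDK, Datastore v1 protos/client, and protobuf-java, as listed above -->
<dependency>
    <groupId>com.google.cloud.dataflow</groupId>
    <artifactId>google-cloud-dataflow-java-sdk-all</artifactId>
    <version>1.9.0</version>
</dependency>
<dependency>
    <groupId>com.google.cloud.datastore</groupId>
    <artifactId>datastore-v1-protos</artifactId>
    <version>1.0.1</version>
</dependency>
<dependency>
    <groupId>com.google.cloud.datastore</groupId>
    <artifactId>datastore-v1-proto-client</artifactId>
    <version>1.1.0</version>
</dependency>
<dependency>
    <groupId>com.google.protobuf</groupId>
    <artifactId>protobuf-java</artifactId>
    <version>3.0.0-beta-1</version>
</dependency>
```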

When writing to Datastore (actually, already when building the Entity), I get the following exception:

// CamelExecutionException (the setup runs with Camel routes, but for development purposes not in Fuse, rather as a local Camel route in Eclipse)
Caused by: java.lang.NoClassDefFoundError: com/google/protobuf/GeneratedMessageV3
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at com.google.datastore.v1.Value.toBuilder(Value.java:749)
    at com.google.datastore.v1.Value.newBuilder(Value.java:743)
    at xmlsource.dataflow.test.EntityUtil.getStringValue(EntityUtil.java:404)
    at xmlsource.dataflow.test.EntityUtil.getArticleEntity(EntityUtil.java:152)
    at xmlsource.dataflow.test.parser.ArticleToEntity.processElement(ArticleToEntity.java:21)
    at com.google.cloud.dataflow.sdk.util.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:49)
    at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.processElement(DoFnRunnerBase.java:139)
    at com.google.cloud.dataflow.sdk.transforms.ParDo.evaluateHelper(ParDo.java:1229)
    at com.google.cloud.dataflow.sdk.transforms.ParDo.evaluateSingleHelper(ParDo.java:1098)
    at com.google.cloud.dataflow.sdk.transforms.ParDo.access$300(ParDo.java:457)
    at com.google.cloud.dataflow.sdk.transforms.ParDo$1.evaluate(ParDo.java:1084)
    at com.google.cloud.dataflow.sdk.transforms.ParDo$1.evaluate(ParDo.java:1079)
    at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.visitTransform(DirectPipelineRunner.java:858)
    at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:221)
    at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:217)
    at com.google.cloud.dataflow.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:103)
    at com.google.cloud.dataflow.sdk.Pipeline.traverseTopologically(Pipeline.java:260)
    at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.run(DirectPipelineRunner.java:814)
    at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:526)
    at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:96)
    at com.google.cloud.dataflow.sdk.Pipeline.run(Pipeline.java:181)
    at xmlsource.dataflow.test.PipelineParseTest.createAndRun(PipelineParseTest.java:208)
    at xmlsource.dataflow.test.PipelineTester.process(PipelineTester.java:11)
    at org.apache.camel.processor.DelegateSyncProcessor.process(DelegateSyncProcessor.java:63)
    ... 8 more

The referenced line in xmlsource.dataflow.test.EntityUtil.getStringValue(EntityUtil.java:404):

Value.newBuilder().setStringValue(value).build();

When reading, the exception is roughly the same:

java.lang.NoClassDefFoundError: com/google/protobuf/GeneratedMessageV3
…

After changing the dependencies (only protobuf-java, moving off the beta version) to:

com.google.cloud.datastore/datastore-v1-protos/1.0.1
com.google.cloud.datastore/datastore-v1-proto-client/1.1.0
com.google.protobuf/protobuf-java/3.0.0

and trying to write, the following exception occurs:

// CamelExecutionException...
Caused by: java.lang.VerifyError: Bad type on operand stack
Exception Details:
  Location:
    com/google/datastore/v1/Value$Builder.mergeGeoPointValue(Lcom/google/type/LatLng;)Lcom/google/datastore/v1/Value$Builder; @76: invokevirtual
  Reason:
    Type 'com/google/type/LatLng' (current frame, stack[1]) is not assignable to 'com/google/protobuf/GeneratedMessage'
  Current Frame:
    bci: @76
    flags: { }
    locals: { 'com/google/datastore/v1/Value$Builder', 'com/google/type/LatLng' }
    stack: { 'com/google/protobuf/SingleFieldBuilder', 'com/google/type/LatLng' }
  Bytecode:
    someBytecode                                    
  Stackmap Table:
    same_frame(@50)
    same_frame(@55)
    same_frame(@62)
    same_frame(@80)
    same_frame(@89)

    at com.google.datastore.v1.Value.toBuilder(Value.java:749)
    at com.google.datastore.v1.Value.newBuilder(Value.java:743)
    at xmlsource.dataflow.test.EntityUtil.getStringValue(EntityUtil.java:404)
    at xmlsource.dataflow.test.EntityUtil.getArticleEntity(EntityUtil.java:152)
    at xmlsource.dataflow.test.parser.ArticleToEntity.processElement(ArticleToEntity.java:21)
    at com.google.cloud.dataflow.sdk.util.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:49)
    at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.processElement(DoFnRunnerBase.java:139)
    at com.google.cloud.dataflow.sdk.transforms.ParDo.evaluateHelper(ParDo.java:1229)
    at com.google.cloud.dataflow.sdk.transforms.ParDo.evaluateSingleHelper(ParDo.java:1098)
    at com.google.cloud.dataflow.sdk.transforms.ParDo.access$300(ParDo.java:457)
    at com.google.cloud.dataflow.sdk.transforms.ParDo$1.evaluate(ParDo.java:1084)
    at com.google.cloud.dataflow.sdk.transforms.ParDo$1.evaluate(ParDo.java:1079)
    at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.visitTransform(DirectPipelineRunner.java:858)
    at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:221)
    at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:217)
    at com.google.cloud.dataflow.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:103)
    at com.google.cloud.dataflow.sdk.Pipeline.traverseTopologically(Pipeline.java:260)
    at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.run(DirectPipelineRunner.java:814)
    at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:526)
    at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:96)
    at com.google.cloud.dataflow.sdk.Pipeline.run(Pipeline.java:181)
    at xmlsource.dataflow.test.PipelineParseTest.createAndRun(PipelineParseTest.java:208)
    at xmlsource.dataflow.test.PipelineTester.process(PipelineTester.java:11)
    at org.apache.camel.processor.DelegateSyncProcessor.process(DelegateSyncProcessor.java:63)

Here the exception references a method mergeGeoPointValue, although my code never calls any method that sets a LatLng or GeoPoint value. The referenced line in my code again just sets a String value.

When reading, I get the same exception, again while converting a POJO into a Datastore Entity:

Value.newBuilder().setStringValue("someString").build()

The whole query:

Query query = Query.newBuilder()
  .addKind(KindExpression.newBuilder()
    .setName("test_article").build())
  .setFilter(Filter.newBuilder()
    .setPropertyFilter(PropertyFilter.newBuilder()
      .setProperty(PropertyReference.newBuilder()
        .setName("somePropertyName"))
      .setOp(PropertyFilter.Operator.EQUAL)
      .setValue(Value.newBuilder()
        .setStringValue("someString").build())
      .build())
    .build())
  .build();

After changing the dependencies to (datastore-v1-protos/1.3.0):

com.google.cloud.datastore/datastore-v1-protos/1.3.0
com.google.cloud.datastore/datastore-v1-proto-client/1.1.0
com.google.protobuf/protobuf-java/3.0.0 (or 3.2.0)

With this setup, I can successfully write to Datastore via .apply(DatastoreIO.v1().write().withProjectId("someProjectId"));

When trying to read, the Query object is built successfully, but...:

// CamelExecutionException
Caused by: java.lang.NoSuchMethodError: com.google.datastore.v1.Query$Builder.clone()Lcom/google/protobuf/GeneratedMessage$Builder;
    at com.google.cloud.dataflow.sdk.io.datastore.DatastoreV1$Read$ReadFn.processElement(DatastoreV1.java:648)
    at com.google.cloud.dataflow.sdk.util.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:49)
    at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.processElement(DoFnRunnerBase.java:139)
    at com.google.cloud.dataflow.sdk.transforms.ParDo.evaluateHelper(ParDo.java:1229)
    at com.google.cloud.dataflow.sdk.transforms.ParDo.evaluateSingleHelper(ParDo.java:1098)
    at com.google.cloud.dataflow.sdk.transforms.ParDo.access$300(ParDo.java:457)
    at com.google.cloud.dataflow.sdk.transforms.ParDo$1.evaluate(ParDo.java:1084)
    at com.google.cloud.dataflow.sdk.transforms.ParDo$1.evaluate(ParDo.java:1079)
    at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.visitTransform(DirectPipelineRunner.java:858)
    at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:221)
    at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:217)
    at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:217)
    at com.google.cloud.dataflow.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:103)
    at com.google.cloud.dataflow.sdk.Pipeline.traverseTopologically(Pipeline.java:260)
    at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.run(DirectPipelineRunner.java:814)
    at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:526)
    at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:96)
    at com.google.cloud.dataflow.sdk.Pipeline.run(Pipeline.java:181)
    at xmlsource.dataflow.test.PipelineParseTest.createAndRun(PipelineParseTest.java:208)
    at xmlsource.dataflow.test.PipelineTester.process(PipelineTester.java:11)
    at org.apache.camel.processor.DelegateSyncProcessor.process(DelegateSyncProcessor.java:63)
    ... 8 more

The line where I try to read from Datastore:

PCollection<Entity> entityCollection = p.apply(
  DatastoreIO.v1().read().withNamespace("test_ns_df")
    .withProjectId("someProjectId")
    .withQuery(query));

Edit: When using the dependencies (and the parent pom) from the Dataflow examples on GitHub, I again get java.lang.NoClassDefFoundError: com/google/protobuf/GeneratedMessageV3 while building the Value for the query...

So I never got reading to work... Has anyone run into a similar problem and found out how to solve it? Or do I need to build the Value differently? The same exception occurs when using DatastoreHelper.makeValue... The dependencies referenced in a working project would also be a big help!

I assume this is a dependency/version problem, but maybe one of you knows better. It seems unlikely that I am the first to run into this kind of java.lang.NoSuchMethodError: com.google.datastore.v1.Query$Builder.clone() problem, unless a broken version was simply pushed, which does not look like the case to me.

Thanks in advance

1 Answer:

Answer 0 (score: 1)

Found the problem:

Due to preprocessing with Camel/Fuse in the same project, which stores files in Google Cloud Storage, I had a dependency on google-cloud-storage:

<dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-storage</artifactId>
    <version>0.6.0</version>
</dependency>

This dependency was listed in the pom.xml before the Dataflow dependencies. After switching the order of the dependencies (Dataflow before the storage dependency) and removing all other unneeded dependencies, DatastoreIO works perfectly! Then, depending on what you do (e.g. XmlSource), some runtime dependencies need to be added.
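
Since the root cause was an incompatible protobuf runtime (one predating GeneratedMessageV3) being pulled in transitively via google-cloud-storage 0.6.0, an alternative to relying on declaration order is to pin protobuf-java explicitly in dependencyManagement, so Maven resolves the same version regardless of which dependency appears first. This is a sketch, not from the original answer; the version shown is the one that worked for writes above, and you should verify the actual resolved versions with `mvn dependency:tree`:

```xml
<dependencyManagement>
    <dependencies>
        <!-- Pin protobuf-java so the same version wins no matter whether
             google-cloud-storage or the Dataflow SDK is declared first -->
        <dependency>
            <groupId>com.google.protobuf</groupId>
            <artifactId>protobuf-java</artifactId>
            <version>3.0.0</version>
        </dependency>
    </dependencies>
</dependencyManagement>
```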