NoSuchMethodError when trying to connect to Cloud Bigtable from a Cloud Dataflow worker

Date: 2015-12-28 14:19:43

Tags: google-cloud-dataflow

I set up a Cloud Dataflow pipeline following this article: https://cloud.google.com/bigtable/docs/dataflow-hbase

When I submit it to the Cloud Dataflow managed service, I get the following error on the Cloud Dataflow workers:

Uncaught exception in main thread. Exiting with status code 1.
java.lang.NoSuchMethodError: io.grpc.netty.GrpcSslContexts.forClient()Lcom/google/bigtable/repackaged/io/netty/handler/ssl/SslContextBuilder;
at com.google.cloud.bigtable.grpc.BigtableSession.createSslContext(BigtableSession.java:98)
at com.google.cloud.bigtable.grpc.BigtableSession.access$000(BigtableSession.java:82)
at com.google.cloud.bigtable.grpc.BigtableSession$1.run(BigtableSession.java:151)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

How should I deal with this problem?

The source code of my Cloud Dataflow pipeline is below:

package mypackage;

import com.google.cloud.bigtable.dataflow.CloudBigtableIO;
import com.google.cloud.bigtable.dataflow.CloudBigtableOptions;
import com.google.cloud.bigtable.dataflow.CloudBigtableTableConfiguration;
import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
import com.google.cloud.dataflow.sdk.transforms.Create;
import com.google.cloud.dataflow.sdk.transforms.DoFn;
import com.google.cloud.dataflow.sdk.transforms.ParDo;
import org.apache.hadoop.hbase.client.Mutation;
import org.apache.hadoop.hbase.client.Put;

public class Main {
    // Create a DoFn that creates a Put or Delete.  MUTATION_TRANSFORM is a simplistic example.
    static final DoFn<String, Mutation> MUTATION_TRANSFORM = new DoFn<String, Mutation>() {
        @Override
        public void processElement(DoFn<String, Mutation>.ProcessContext c) throws Exception {
            c.output(new Put(c.element().getBytes()).addColumn("v".getBytes(), "v".getBytes(), "value".getBytes()));
        }
    };

    public static void main(String[] args) {
        // CloudBigtableOptions is one way to retrieve the options.  It's not required to use this
        // specific PipelineOptions extension; CloudBigtableOptions is there as a convenience.
        CloudBigtableOptions options =
                PipelineOptionsFactory.fromArgs(args).withValidation().as(CloudBigtableOptions.class);

        // CloudBigtableTableConfiguration contains the project, zone, cluster and table to connect to
        CloudBigtableTableConfiguration config = CloudBigtableTableConfiguration.fromCBTOptions(options);

        Pipeline p = Pipeline.create(options);
        // This sets up serialization for Puts and Deletes so that Dataflow can potentially move them through
        // the network
        CloudBigtableIO.initializeForWrite(p);

        p
                .apply(Create.of("Hello", "World"))
                .apply(ParDo.of(MUTATION_TRANSFORM))
                .apply(CloudBigtableIO.writeToTable(config));

        p.run();
    }
}

2 answers:

Answer 0 (score: 1)

This is the same problem as: https://github.com/GoogleCloudPlatform/cloud-bigtable-client/issues/613

I think the problem here is that both Dataflow and Bigtable include io.grpc. Bigtable uses the shade plugin to rename (relocate) its packages, but it does not change the io.grpc package name, as described in: https://github.com/GoogleCloudPlatform/cloud-bigtable-client/issues/582
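
For reference, a Maven Shade relocation rule of the kind being discussed looks roughly like the sketch below. This is only an illustration of the relocation mechanism, not the actual cloud-bigtable-client build configuration; the pattern and shaded prefix here are assumptions.

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <!-- Rewrites io.grpc classes into a repackaged namespace so they
               cannot collide with another copy of io.grpc on the classpath. -->
          <relocation>
            <pattern>io.grpc</pattern>
            <shadedPattern>com.google.bigtable.repackaged.io.grpc</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>

Without a rule like this for io.grpc, two different copies of io.grpc can end up in play, and the copy loaded on the worker does not have the repackaged-netty return type that the Bigtable client expects, which is what the NoSuchMethodError above shows.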

The best way to resolve the problem is to use the 0.2.3-SNAPSHOT version of bigtable-hbase. You have to add the following to your pom.xml in order to use SNAPSHOTs:

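A minimal sketch of what that pom.xml addition typically looks like, assuming the snapshot is published to the Sonatype OSS snapshots repository and that bigtable-hbase-dataflow is the artifact in use (adjust the groupId/artifactId to whichever bigtable-hbase module your build actually depends on):

<!-- Enable resolution of -SNAPSHOT versions (assumed repository URL). -->
<repositories>
  <repository>
    <id>sonatype-snapshots</id>
    <url>https://oss.sonatype.org/content/repositories/snapshots</url>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </repository>
</repositories>

<dependencies>
  <!-- Pull in the snapshot build of the Bigtable/Dataflow connector. -->
  <dependency>
    <groupId>com.google.cloud.bigtable</groupId>
    <artifactId>bigtable-hbase-dataflow</artifactId>
    <version>0.2.3-SNAPSHOT</version>
  </dependency>
</dependencies>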

We will publish an official (non-SNAPSHOT) release as soon as possible in the new year.

Answer 1 (score: -1)

It sounds like the workers do not have the appropriate jar file. Is it on your local classpath?

By default, before launching the job on the Dataflow service, Dataflow copies the contents of the main program's classpath to Google Cloud Storage. The workers then fetch those jars from Google Cloud Storage. You can see this happening in the main program's logs:

INFO: PipelineOptions.filesToStage was not specified. Defaulting to files from the classpath: will stage XX files. Enable logging at DEBUG level to see which files will be staged.
...
INFO: Uploading XX files from PipelineOptions.filesToStage to staging location to prepare for execution.
INFO: Uploading PipelineOptions.filesToStage complete: YY files newly uploaded, ZZ files cached
...
Submitted job: 2015-12-28_07_22_37-8675309
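
If the Bigtable client jar is not on the classpath of the program that submits the job, the workers will never receive it. As a minimal sketch (assuming the Dataflow SDK 1.x DataflowPipelineOptions API; the jar path is hypothetical), you can also list the files to stage explicitly instead of relying on classpath detection:

import java.util.Arrays;

import com.google.cloud.dataflow.sdk.options.DataflowPipelineOptions;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;

public class StagingExample {
    public static void main(String[] args) {
        DataflowPipelineOptions options =
                PipelineOptionsFactory.fromArgs(args).withValidation()
                        .as(DataflowPipelineOptions.class);

        // Explicitly list the jars to upload to the staging location so the
        // workers are guaranteed to see the Bigtable dependency.
        // The path is a placeholder; point it at your own bundled/fat jar.
        options.setFilesToStage(Arrays.asList("target/my-pipeline-bundled.jar"));
    }
}

Note that explicitly setting filesToStage replaces the default classpath-derived list, so the jar(s) you list must include your own pipeline classes as well as the Bigtable dependency.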