I set up a Cloud Dataflow pipeline based on this article: https://cloud.google.com/bigtable/docs/dataflow-hbase
When I submit it to the Cloud Dataflow managed service, I get the following error on the Cloud Dataflow workers:
Uncaught exception in main thread. Exiting with status code 1.
java.lang.NoSuchMethodError: io.grpc.netty.GrpcSslContexts.forClient()Lcom/google/bigtable/repackaged/io/netty/handler/ssl/SslContextBuilder;
at com.google.cloud.bigtable.grpc.BigtableSession.createSslContext(BigtableSession.java:98)
at com.google.cloud.bigtable.grpc.BigtableSession.access$000(BigtableSession.java:82)
at com.google.cloud.bigtable.grpc.BigtableSession$1.run(BigtableSession.java:151)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
How do I deal with this problem?
The source code of my Cloud Dataflow pipeline is below:
package mypackage;

import com.google.cloud.bigtable.dataflow.CloudBigtableIO;
import com.google.cloud.bigtable.dataflow.CloudBigtableOptions;
import com.google.cloud.bigtable.dataflow.CloudBigtableTableConfiguration;
import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
import com.google.cloud.dataflow.sdk.transforms.Create;
import com.google.cloud.dataflow.sdk.transforms.DoFn;
import com.google.cloud.dataflow.sdk.transforms.ParDo;
import org.apache.hadoop.hbase.client.Mutation;
import org.apache.hadoop.hbase.client.Put;

public class Main {
  // Create a DoFn that creates a Put or Delete. MUTATION_TRANSFORM is a simplistic example.
  static final DoFn<String, Mutation> MUTATION_TRANSFORM = new DoFn<String, Mutation>() {
    @Override
    public void processElement(DoFn<String, Mutation>.ProcessContext c) throws Exception {
      c.output(new Put(c.element().getBytes())
          .addColumn("v".getBytes(), "v".getBytes(), "value".getBytes()));
    }
  };

  public static void main(String[] args) {
    // CloudBigtableOptions is one way to retrieve the options. It's not required to use this
    // specific PipelineOptions extension; CloudBigtableOptions is there as a convenience.
    CloudBigtableOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().as(CloudBigtableOptions.class);

    // CloudBigtableTableConfiguration contains the project, zone, cluster and table to connect to.
    CloudBigtableTableConfiguration config = CloudBigtableTableConfiguration.fromCBTOptions(options);

    Pipeline p = Pipeline.create(options);

    // This sets up serialization for Puts and Deletes so that Dataflow can potentially move them
    // through the network.
    CloudBigtableIO.initializeForWrite(p);

    p
        .apply(Create.of("Hello", "World"))
        .apply(ParDo.of(MUTATION_TRANSFORM))
        .apply(CloudBigtableIO.writeToTable(config));

    p.run();
  }
}
Answer 0 (score: 1)
This is the same issue as https://github.com/GoogleCloudPlatform/cloud-bigtable-client/issues/613.
I think the problem here is that both Dataflow and Bigtable include io.grpc. Bigtable uses the shade plugin to rename packages, but the io.grpc package name was not changed, as described here: https://github.com/GoogleCloudPlatform/cloud-bigtable-client/issues/582
The best way to resolve the problem is to use the 0.2.3-SNAPSHOT version of bigtable-hbase. You'll have to add the following to your pom.xml in order to use SNAPSHOTs:
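A minimal sketch of the pom.xml addition, assuming the SNAPSHOT builds are published to the Sonatype OSS snapshots repository and that you depend on the bigtable-hbase-dataflow artifact (check the issue linked above for the exact coordinates):

<!-- Enable resolution of SNAPSHOT versions (repository id/URL assumed, verify against the linked issue). -->
<repositories>
  <repository>
    <id>sonatype-snapshots</id>
    <url>https://oss.sonatype.org/content/repositories/snapshots</url>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </repository>
</repositories>

<!-- Use the snapshot build of the Bigtable/Dataflow connector (artifactId assumed). -->
<dependency>
  <groupId>com.google.cloud.bigtable</groupId>
  <artifactId>bigtable-hbase-dataflow</artifactId>
  <version>0.2.3-SNAPSHOT</version>
</dependency>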
We'll do an official release as soon as we can in the new year.
Answer 1 (score: -1)
It sounds like the workers don't have the appropriate jar file. Is it on your local classpath?
By default, Dataflow copies the contents of your main program's classpath to Google Cloud Storage before starting the job on the Dataflow service. The workers then download those jars from Google Cloud Storage. You can see this happening in the logs of your main program:
INFO: PipelineOptions.filesToStage was not specified. Defaulting to files from the classpath: will stage XX files. Enable logging at DEBUG level to see which files will be staged.
...
INFO: Uploading XX files from PipelineOptions.filesToStage to staging location to prepare for execution.
INFO: Uploading PipelineOptions.filesToStage complete: YY files newly uploaded, ZZ files cached
...
Submitted job: 2015-12-28_07_22_37-8675309
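If the jar that contains the Bigtable classes isn't on the launcher's classpath, it never gets staged and the workers fail as shown above. A minimal sketch of listing the staged files explicitly — assuming the Dataflow SDK 1.x options, where filesToStage is reachable through DataflowPipelineOptions; the bundled-jar path is hypothetical — looks like this:

import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.options.DataflowPipelineOptions;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
import java.util.Arrays;

public class StageFilesExample {
  public static void main(String[] args) {
    DataflowPipelineOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().as(DataflowPipelineOptions.class);
    // filesToStage defaults to the launcher's classpath; listing the jars explicitly makes sure
    // the Bigtable client (and its dependencies) actually reach the workers.
    options.setFilesToStage(
        Arrays.asList("target/my-pipeline-bundled-0.1.jar"));  // hypothetical bundled/shaded jar
    Pipeline p = Pipeline.create(options);
    // ... apply the transforms from the question, then run the pipeline ...
    p.run();
  }
}

The same list can also be passed on the command line (options map to flags of the same name), e.g. --filesToStage=target/my-pipeline-bundled-0.1.jar.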