I am trying to use cloud-bigtable-client (https://github.com/GoogleCloudPlatform/cloud-bigtable-client) to apply mutations (increments) to Bigtable through Dataflow.
Here is a high-level summary of what my job does:

    PCollection<SomeData> somedata = ...;
    somedata.apply(ParDo.of(new CreateMutations()))
        .setCoder(new HBaseMutationCoder()).apply(CloudBigtableIO.writeToTable(config));
    // I don't think it is necessary to explicitly set Coder here; I tried both ways.

CreateMutations is a DoFn that looks like this:

    // c.element() is KV<String, Iterable<SomeData>>
    public void processElement(ProcessContext c) {
      Increment mutation = new Increment(c.element().getKey().getBytes());
      for (SomeData data : c.element().getValue()) {
        // Obtain cf (String), qual (String), value (long) from data.
        // None of them is null.
        mutation.addColumn(cf.getBytes(), qual.getBytes(), value);
      }
      c.output(mutation);
    }

Surprisingly, the job fails while executing this DoFn because HBaseMutationCoder cannot encode the element. Here is a small part of the stack trace:

    (e8a8d266ed05e19f): java.lang.RuntimeException: com.google.cloud.dataflow.sdk.util.UserCodeException: java.lang.RuntimeException: java.lang.IllegalArgumentException: Unable to encode element 'row=some_string, families={(family=a, columns={some_string/a:2:text/LATEST_TIMESTAMP/Put/vlen=8/seqid=0+=1, some_string/a:8:text/LATEST_TIMESTAMP/Put/vlen=8/seqid=0+=9620}), (family=m, columns={some_string/m:2:text/LATEST_TIMESTAMP/Put/vlen=8/seqid=0+=1, some_string/m:8:text/LATEST_TIMESTAMP/Put/vlen=8/seqid=0+=9620}}' with coder 'HBaseMutationCoder'.
      at com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:160)
      at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnContext.outputWindowedValue(DoFnRunnerBase.java:288)
      at ......

Note that the error message clearly shows that row, family, column qualifier, and value are populated correctly. This particular message shows that the mutation contains four cells to increment.

I have not had problems with Put and Delete, but this is my first time using Increment. Is there anything else I need to populate besides row, family, qualifier, and value?

Any help would be greatly appreciated.

I also tried using Put instead of Increment, and it works (the code is the same as above except for the two lines marked with (*)):

    // c.element() is KV<String, Iterable<SomeData>>
    public void processElement(ProcessContext c) {
      Put mutation = new Put(c.element().getKey().getBytes()); //(*)
      for (SomeData data : c.element().getValue()) {
        // Obtain cf (String), qual (String), value (long) from data.
        // None of them is null.
        mutation.addImmutable(cf.getBytes(), qual.getBytes(), Bytes.toBytes(value)); //(*)
      }
      c.output(mutation);
    }

(I found a related question here: How to load data into Google Cloud Bigtable from Google BigQuery, but my problem does not seem to be caused by null values, because every row, column family, qualifier, and value is populated correctly.)

Update: here is the full stack trace I get.

Answer 0 (score: 0)
OK, I see that HBaseMutationCoder.java is what throws the IllegalArgumentException. Looking at https://github.com/apache/beam/blob/master/sdks/java/io/hbase/src/main/java/org/apache/beam/sdk/io/hbase/HBaseMutationCoder.java#L68, INCREMENT does not work because it is not idempotent (but PUT is), which explains what you are seeing.
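
To illustrate why idempotence matters here, the following is a minimal sketch in plain Java. It does not use the HBase or Bigtable APIs; FakeTable, putWithRetries, and incrementWithRetries are hypothetical names invented for this example. It assumes that the same mutation may be delivered more than once, for instance when a runner retries a failed bundle, and compares the final cell value for a put and for an increment.

```java
import java.util.HashMap;
import java.util.Map;

public class IdempotenceSketch {

    // Hypothetical in-memory stand-in for one column of a table: row key -> long value.
    static final class FakeTable {
        private final Map<String, Long> cells = new HashMap<>();

        // Put overwrites the cell, so repeating it leaves the same value.
        void put(String row, long value) {
            cells.put(row, value);
        }

        // Increment adds to the cell, so repeating it changes the result.
        void increment(String row, long delta) {
            cells.merge(row, delta, Long::sum);
        }

        long get(String row) {
            return cells.getOrDefault(row, 0L);
        }
    }

    // Applies the same put `attempts` times, as a retried write would.
    static long putWithRetries(long value, int attempts) {
        FakeTable table = new FakeTable();
        for (int i = 0; i < attempts; i++) {
            table.put("some_string", value);
        }
        return table.get("some_string");
    }

    // Applies the same increment `attempts` times, as a retried write would.
    static long incrementWithRetries(long delta, int attempts) {
        FakeTable table = new FakeTable();
        for (int i = 0; i < attempts; i++) {
            table.increment("some_string", delta);
        }
        return table.get("some_string");
    }

    public static void main(String[] args) {
        System.out.println(putWithRetries(9620L, 2));       // prints 9620
        System.out.println(incrementWithRetries(9620L, 2)); // prints 19240
    }
}
```

In this sketch the put yields the same value however many times it is applied, while the increment is counted once per delivery, so a retried increment would inflate the counter.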