Question

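In Beam (Dataflow 2.0.0), I am reading from a PubSub topic and then trying to fetch a few rows from Bigtable based on the messages from that topic. I could not find a way in the Beam documentation to scan Bigtable based on PubSub messages. I tried writing a ParDo function and applying it in the Beam pipeline, but to no avail.

BigtableIO offers a read option, but it sits outside the pipeline, and I am not sure it would work in a streaming fashion as my use case requires.

Can anyone tell me whether this is doable, i.e. streaming from PubSub and reading Bigtable based on the message content?

P.S.: I am using the Java API in Beam 2.0.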

    PCollection<String> keyLines = 
                pipeline.apply(PubsubIO.readMessagesWithAttributes()
                .fromSubscription("*************"))
                .apply("PubSub Message to Payload as String", 
                     ParDo.of(new PubSubMessageToStringConverter()));

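Now I want keyLines to act as the row keys for scanning Bigtable. I am using the code snippet below for Bigtable. I can see 'RowFilter.newBuilder()' and 'ByteKeyRange', but both seem to work in batch mode rather than in a streaming fashion.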

    pipeline.apply("read",
                BigtableIO.read()
                     .withBigtableOptions(optionsBuilder)
                     .withTableId("**********"));

    pipeline.run();

Please advise.

Answer 1

You should be able to read from Bigtable in a ParDo. You would have to use the Cloud Bigtable or HBase API directly. It is best to initialize the client in the setup method of your DoFn (in Beam, a method annotated with @Setup). If that does not work, please post more details.
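
The pattern this answer describes, creating the client once per DoFn instance and then doing one lookup per incoming key, can be sketched without any Beam or Bigtable dependency. Everything named below (RowClient, InMemoryRowClient, RowLookupFn) is hypothetical and only stands in for the real types. In an actual pipeline the class would extend Beam's DoFn, the three lifecycle methods would carry the @Setup, @ProcessElement and @Teardown annotations, and the client would be a Cloud Bigtable or HBase connection.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical stand-in for a Bigtable/HBase client; not a real library type.
interface RowClient extends AutoCloseable {
    String getRow(String rowKey); // returns null when the row is absent

    @Override
    void close();
}

// In-memory fake so the sketch runs without any Google Cloud dependency.
class InMemoryRowClient implements RowClient {
    private final Map<String, String> rows;
    boolean closed = false;

    InMemoryRowClient(Map<String, String> rows) {
        this.rows = rows;
    }

    @Override
    public String getRow(String rowKey) {
        return rows.get(rowKey);
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Mirrors the lifecycle of a Beam DoFn<String, String>:
// setup once, process each element, tear down once.
class RowLookupFn {
    private final Supplier<RowClient> clientFactory;
    private RowClient client; // created in setup(), not in the constructor

    RowLookupFn(Supplier<RowClient> clientFactory) {
        this.clientFactory = clientFactory;
    }

    // In Beam this method would be annotated with @Setup.
    void setup() {
        client = clientFactory.get();
    }

    // In Beam this would be the @ProcessElement method; rowKey is the
    // PubSub payload and output plays the role of the context's output.
    void processElement(String rowKey, Consumer<String> output) {
        String value = client.getRow(rowKey);
        if (value != null) {
            output.accept(rowKey + "=" + value);
        }
    }

    // In Beam this method would be annotated with @Teardown.
    void teardown() {
        if (client != null) {
            client.close();
            client = null;
        }
    }
}

class RowLookupDemo {
    public static void main(String[] args) {
        Map<String, String> rows = new HashMap<>();
        rows.put("user#1", "alice");

        RowLookupFn fn = new RowLookupFn(() -> new InMemoryRowClient(rows));
        List<String> results = new ArrayList<>();

        fn.setup();
        for (String key : new String[] {"user#1", "user#404"}) {
            fn.processElement(key, results::add);
        }
        fn.teardown();

        System.out.println(results); // prints [user#1=alice]
    }
}
```

Creating the client in the setup step rather than in the constructor matters because the function object is serialized and shipped to workers, while a live connection usually cannot be serialized. Each message from PubSub then costs a single point lookup, which suits a streaming pipeline better than a bounded table scan.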

Read from Google PubSub and then read from Bigtable based on the PubSub message topic