Unable to read Pub/Sub messages using Apache Beam (Python SDK)

Time: 2021-01-21 15:20:51

Tags: python streaming apache-beam google-cloud-pubsub apache-beam-io

I am trying to stream messages from a Pub/Sub topic with the Beam programming framework (Python SDK) and write them out to the console.

Here is my code (with apache-beam==2.27.0):


import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

TOPIC_PATH = "projects/<project-id>/topics/<topic-id>"

def run(pubsub_topic):
    options = PipelineOptions(
        streaming=True
    )
    runner = 'DirectRunner'

    print("I reached before pipeline")

    with beam.Pipeline(runner, options=options) as pipeline:
        (
            pipeline
            | "Read from Pub/Sub topic" >> beam.io.ReadFromPubSub(topic=pubsub_topic)
            | "Writing to console" >> beam.Map(print)
        )

    # Leaving the `with` block already runs the pipeline and waits for it to
    # finish, so a separate pipeline.run() call is not needed here.
    print("I reached after pipeline")


run(TOPIC_PATH)

However, when I execute this pipeline, I get this TypeError:

ERROR:apache_beam.runners.direct.executor:Exception at bundle <apache_beam.runners.direct.bundle_factory._Bundle object at 0x1349763c0>, due to an exception.

TypeError: create_subscription() takes from 1 to 2 positional arguments but 3 were given

and finally it says:

ERROR:apache_beam.runners.direct.executor:Giving up after 4 attempts.

I am not sure what I am doing wrong. Thanks in advance for your help.

2 Answers:

Answer 0 (score: 0)

Answer 1 (score: 0)

I ran into the same problem when installing with "pip install apache-beam". When I switched to "pip install apache-beam[gcp]", it worked for me, even with the DirectRunner.
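
To check which packages are actually present before and after switching the install command, a small sketch like the one below may help. It only uses the standard-library importlib.metadata module (Python 3.8+). The helper names installed_version and report are made up for this illustration, and the assumption that the TypeError comes from a google-cloud-pubsub version that the plain "apache-beam" install does not constrain is a plausible guess, not something confirmed in this thread.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for this example; not part of Apache Beam.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report(dist_names=("apache-beam", "google-cloud-pubsub")):
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in dist_names}


if __name__ == "__main__":
    # Print what is installed in the current environment so the result can be
    # compared before and after running: pip install apache-beam[gcp]
    for name, version in report().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this in the same virtual environment as the pipeline shows whether google-cloud-pubsub is installed at all and, if so, at which version.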
