I am trying to use the Beam programming framework (Python SDK) to stream messages from a Pub/Sub topic and write them to the console.
Here is my code (with apache-beam==2.27.0):
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

TOPIC_PATH = "projects/<project-id>/topics/<topic-id>"


def run(pubsub_topic):
    options = PipelineOptions(
        streaming=True
    )
    runner = 'DirectRunner'

    print("I reached before pipeline")

    with beam.Pipeline(runner, options=options) as pipeline:
        (
            pipeline
            | "Read from Pub/Sub topic" >> beam.io.ReadFromPubSub(topic=pubsub_topic)
            | "Writing to console" >> beam.Map(print)
        )
        print("I reached after pipeline")

        result = pipeline.run()
        result.wait_until_finish()


run(TOPIC_PATH)
However, when I run this pipeline, I get the following TypeError:
ERROR:apache_beam.runners.direct.executor:Exception at bundle <apache_beam.runners.direct.bundle_factory._Bundle object at 0x1349763c0>, due to an exception.
TypeError: create_subscription() takes from 1 to 2 positional arguments but 3 were given
and at the end it says:
ERROR:apache_beam.runners.direct.executor:Giving up after 4 attempts.
I am not sure what I am doing wrong. Thanks in advance for your help.
Answer 0 (score: 0)
I don't know exactly where the error is, but could you consider using the following Beam example as a model and working from there?
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/pubsub_it_pipeline.py
Answer 1 (score: 0)
I ran into the same problem when I installed with "pip install apache-beam". When I switched to "pip install apache-beam[gcp]", it worked for me, even with DirectRunner.
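To see which of the two installs you actually have, a quick check like the one below may help. It is only a sketch: it reports the installed versions of apache-beam and google-cloud-pubsub and whether the google.cloud.pubsub module can be found. The helper names (installed_version, module_available, check_pubsub_environment) are made up for illustration, and the idea that the TypeError comes from a Pub/Sub client version that Beam 2.27.0 was not written against is my assumption, not something confirmed in this thread.

```python
import importlib.util
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def module_available(module_name):
    """Return True if the module can be located without fully importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. "google.cloud") is missing or broken.
        return False


def check_pubsub_environment():
    """Collect the facts relevant to the apache-beam vs. apache-beam[gcp] question."""
    return {
        "apache-beam": installed_version("apache-beam"),
        # Assumption: this is the client package that the gcp extra pulls in.
        "google-cloud-pubsub": installed_version("google-cloud-pubsub"),
        "google.cloud.pubsub found": module_available("google.cloud.pubsub"),
    }


if __name__ == "__main__":
    for name, value in check_pubsub_environment().items():
        print(f"{name}: {value}")
```

If google-cloud-pubsub shows up as None, or at a version you did not expect, reinstalling with the gcp extra as described above is probably the first thing to try.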