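This is my first question here, and I had a hard time finding an answer before asking. I want to create a very simple pipeline, and I am already stuck at the very beginning. Here is my code: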
import apache_beam as beam
options = PipelineOptions()
google_cloud_options = options.view_as(GoogleCloudOptions)
google_cloud_options.project = 'myproject'
google_cloud_options.job_name = 'mypipe'
google_cloud_options.staging_location = 'gs://mybucket/staging'
google_cloud_options.temp_location = 'gs://mybucket/temp'
options.view_as(StandardOptions).runner = 'DataflowRunner'
This produces the error: NameError: name 'PipelineOptions' is not defined
Thanks for your help.
Answer 0 (score: 1)
The module has moved from apache_beam.utils to apache_beam.options.
You should now use:
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.options.pipeline_options import SetupOptions
from apache_beam.options.pipeline_options import GoogleCloudOptions
from apache_beam.options.pipeline_options import StandardOptions
Official documentation here: https://beam.apache.org/releases/pydoc/2.0.0/_modules/apache_beam/options/pipeline_options.html
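If the same script has to run against both an older and a newer SDK, one possible approach is to try the new module path first and fall back to the old one. The sketch below is only an illustration: `import_first_available`, `OPTION_MODULES`, and `build_dataflow_options` are hypothetical names made up for this example, not part of Beam, and the project and bucket values are the placeholders from the question.

```python
import importlib


def import_first_available(module_paths, attr_name):
    """Return attr_name from the first importable module in module_paths."""
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:
            # Module (or its parent package) is missing; try the next candidate.
            continue
        if hasattr(module, attr_name):
            return getattr(module, attr_name)
    raise ImportError(
        "%s not found in any of: %s" % (attr_name, ", ".join(module_paths))
    )


# New location first, then the old one mentioned in this answer.
OPTION_MODULES = [
    "apache_beam.options.pipeline_options",
    "apache_beam.utils.pipeline_options",
]


def build_dataflow_options():
    """Rebuild the options from the question (requires apache_beam installed)."""
    PipelineOptions = import_first_available(OPTION_MODULES, "PipelineOptions")
    GoogleCloudOptions = import_first_available(OPTION_MODULES, "GoogleCloudOptions")
    StandardOptions = import_first_available(OPTION_MODULES, "StandardOptions")

    options = PipelineOptions()
    google_cloud_options = options.view_as(GoogleCloudOptions)
    google_cloud_options.project = 'myproject'            # placeholder from the question
    google_cloud_options.job_name = 'mypipe'              # placeholder from the question
    google_cloud_options.staging_location = 'gs://mybucket/staging'
    google_cloud_options.temp_location = 'gs://mybucket/temp'
    options.view_as(StandardOptions).runner = 'DataflowRunner'
    return options
```

Calling `build_dataflow_options()` should give the same options object the question was trying to build. If you only target a current SDK, the plain `from apache_beam.options.pipeline_options import ...` lines above are simpler and are probably the better choice.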
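Answer 1 (score: 0)
You need to add a few more imports for the example to work. Note that the apache_beam.utils paths below match older SDK releases; newer releases use apache_beam.options instead: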
from apache_beam.io import ReadFromText
from apache_beam.io import WriteToText
from apache_beam.metrics import Metrics
from apache_beam.utils.pipeline_options import PipelineOptions
from apache_beam.utils.pipeline_options import SetupOptions
from apache_beam.utils.pipeline_options import GoogleCloudOptions
from apache_beam.utils.pipeline_options import StandardOptions
Answer 2 (score: 0)
from apache_beam.pipeline import PipelineOptions
options = PipelineOptions()