I am trying to run the Google Dataflow example with the Python SDK.
I can run it locally:
python -m apache_beam.examples.wordcount --output OUTPUT_FILE
However, when I try to run it on GCP:
python -m apache_beam.examples.wordcount \
--project myproject \
--job_name myproject-wordcount \
--runner DataflowRunner \
--staging_location gs://myproject/staging \
--output gs://myproject/output \
--network myproject-network \
--zone europe-west1-b \
--subnetwork regions/europe-west1/subnetworks/europe-west1 \
--temp_location gs://myproject/temp
I get the following error:
apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, Error:
(9d6636d5e214c789): Workflow failed. Causes: (ab9869cb8161ec27): Error:
Message: Invalid value for field 'resource.properties.networkInterfaces[0].subnetwork': ''. Network interface must specify a subnet if the network resource is in custom subnet mode.
HTTP Code: 400
I am using apache-beam Python SDK 0.6.0.
Can anyone help?
Answer 0 (score: 0)
This may be too late for you, but in case anyone else runs into the same problem, here is the issue.
This line:
--zone europe-west1-b \
should instead be:
--region europe-west1 \
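For completeness, a sketch of the full corrected invocation (resource names are taken from the question; substitute your own project, buckets, and network). With `--region` set, the relative subnetwork path from the question can be resolved against that region, which is consistent with the "must specify a subnet" error going away:

```shell
# Corrected invocation: --region replaces --zone.
# Names below (myproject, myproject-network, the gs:// buckets) come from
# the question and are placeholders for your own resources.
python -m apache_beam.examples.wordcount \
  --project myproject \
  --job_name myproject-wordcount \
  --runner DataflowRunner \
  --staging_location gs://myproject/staging \
  --output gs://myproject/output \
  --network myproject-network \
  --region europe-west1 \
  --subnetwork regions/europe-west1/subnetworks/europe-west1 \
  --temp_location gs://myproject/temp
```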