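I need to run a Dataproc cluster with both the BigQuery and Cloud Storage connectors installed.
I use a variant of this script (because I cannot access the general bucket), and everything works fine, but when I run a job once the cluster is up and running, it always fails with a Task was not acquired error.
I can fix this by simply restarting the Dataproc agent on every node, but I really need it to work properly so that I can run a job right after the cluster is created. It seems that this part of the script is not working properly: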
# Restarts Dataproc Agent after successful initialization
# WARNING: this function relies on undocumented and not officially supported Dataproc Agent
# "sentinel" files to determine successful Agent initialization and not guaranteed
# to work in the future. Use at your own risk!
restart_dataproc_agent() {
  # Because Dataproc Agent should be restarted after initialization, we need to wait until
  # it will create a sentinel file that signals initialization completion (success or failure)
  while [[ ! -f /var/lib/google/dataproc/has_run_before ]]; do
    sleep 1
  done
  # If Dataproc Agent didn't create a sentinel file that signals initialization
  # failure then it means that initialization succeeded and it should be restarted
  if [[ ! -f /var/lib/google/dataproc/has_failed_before ]]; then
    service google-dataproc-agent restart
  fi
}
export -f restart_dataproc_agent
# Schedule asynchronous Dataproc Agent restart so it will use updated connectors.
# It could not be restarted synchronously because Dataproc Agent should be restarted
# after its initialization, including init actions execution, has been completed.
bash -c restart_dataproc_agent & disown
My questions are:
Edit: this is the command I use to create the cluster (with image version 1.3):
gcloud dataproc --region europe-west1 \
    clusters create my-cluster \
    --bucket my-bucket \
    --subnet default \
    --zone europe-west1-b \
    --master-machine-type n1-standard-1 \
    --master-boot-disk-size 50 \
    --num-workers 2 \
    --worker-machine-type n1-standard-2 \
    --worker-boot-disk-size 100 \
    --image-version 1.3 \
    --scopes 'https://www.googleapis.com/auth/cloud-platform' \
    --project my-project \
    --initialization-actions gs://dataproc-initialization-actions/connectors/connectors.sh \
    --metadata 'gcs-connector-version=1.9.6' \
    --metadata 'bigquery-connector-version=0.13.6'
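Also, note that the connectors initialization script has since been fixed and runs fine, so I am using it now, but I still have to restart the Dataproc agent manually before I can run jobs.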
Answer 0 (score: 1)
After the initialization actions succeed, the Dataproc agent logs the message Custom initialization actions finished. in the /var/log/google-dataproc-agent.0.log file.
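If you want to wait for that message from a script on a node, something along these lines might work. This is only a rough sketch: the log path and the message text are the ones mentioned above, while the function name wait_for_init_actions, the timeout and the one-second poll interval are my own assumptions, not anything Dataproc provides.

```shell
#!/usr/bin/env bash
# Sketch: block until the Dataproc agent log contains the completion message.
# Assumptions (not part of Dataproc): the function name, the default timeout,
# and the one-second poll interval.
wait_for_init_actions() {
  local log_file="${1:-/var/log/google-dataproc-agent.0.log}"
  local timeout_secs="${2:-600}"
  local waited=0
  # grep -q exits 0 once the message is present; -s hides "no such file"
  # errors while the log does not exist yet; -F matches the text literally.
  until grep -qsF 'Custom initialization actions finished.' "$log_file"; do
    if (( waited >= timeout_secs )); then
      echo "Timed out waiting for initialization actions to finish" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}
```

Calling wait_for_init_actions with no arguments checks the default log path for up to ten minutes; it returns 0 when the message is found and 1 on timeout.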
No, you do not need to restart the Dataproc agent manually.
This issue is caused by the Dataproc agent service restart in the connectors initialization action and should be fixed by this PR.
Answer 1 (score: 0)
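As for knowing when the initialization actions have finished, you can check the cluster's status.state: if it is CREATING, the initialization actions are still running; if it is RUNNING, they have finished.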
See here.
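To check the state from a script, you could poll gcloud dataproc clusters describe until it reports RUNNING. A minimal sketch, assuming gcloud is installed and authenticated; the function names, the poll count and the interval are hypothetical choices of mine, and my-cluster / europe-west1 are just the values from the question.

```shell
#!/usr/bin/env bash
# Sketch: poll the cluster's status.state until it becomes RUNNING.
# The helper is kept separate so the loop does not depend on gcloud directly.
get_cluster_state() {
  gcloud dataproc clusters describe "$1" --region "$2" \
    --format='value(status.state)'
}

# Usage: wait_for_cluster_running CLUSTER REGION [MAX_POLLS] [INTERVAL_SECS]
# MAX_POLLS and INTERVAL_SECS are illustrative defaults, not Dataproc settings.
wait_for_cluster_running() {
  local cluster="$1" region="$2" max_polls="${3:-120}" interval="${4:-5}"
  local state i
  for (( i = 0; i < max_polls; i++ )); do
    # Treat a failed lookup as UNKNOWN and keep polling.
    state="$(get_cluster_state "$cluster" "$region")" || state="UNKNOWN"
    case "$state" in
      RUNNING) return 0 ;;   # initialization actions have finished
      ERROR)
        echo "Cluster $cluster entered ERROR state" >&2
        return 1 ;;
    esac
    sleep "$interval"        # still CREATING (or unknown), try again
  done
  echo "Gave up waiting for $cluster to reach RUNNING" >&2
  return 1
}
```

For the cluster in the question, wait_for_cluster_running my-cluster europe-west1 followed by the job submission would hold the job back until the state is RUNNING.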