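I have a startup script (user data) that runs at boot on AWS with an Ubuntu 16.04 image. The problem is that when it reaches the part that runs an Ansible playbook, the playbook fails with this basic error message: Could not get lock /var/lib/dpkg/lock
When I log in and run the Ansible playbook manually, it works, but when it runs from the AWS user data it fails with that error.
This is the full error: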
TASK [rabbitmq : install packages (Ubuntu default repo is used)] ***************
task path: /etc/ansible/roles/rabbitmq/tasks/main.yml:50
<localhost> ESTABLISH LOCAL CONNECTION FOR USER: root
<localhost> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo $HOME/.ansible/tmp/ansible-tmp-1480352390.01-116502531862586 `" && echo ansible-tmp-1480352390.01-116502531862586="` echo $HOME/.ansible/tmp/ansible-tmp-1480352390.01-116502531862586 `" ) && sleep 0'
<localhost> PUT /tmp/tmpGHaVRP TO /.ansible/tmp/ansible-tmp-1480352390.01-116502531862586/apt
<localhost> EXEC /bin/sh -c 'chmod u+x /.ansible/tmp/ansible-tmp-1480352390.01-116502531862586/ /.ansible/tmp/ansible-tmp-1480352390.01-116502531862586/apt && sleep 0'
<localhost> EXEC /bin/sh -c 'LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 /usr/bin/python /.ansible/tmp/ansible-tmp-1480352390.01-116502531862586/apt; rm -rf "/.ansible/tmp/ansible-tmp-1480352390.01-116502531862586/" > /dev/null 2>&1 && sleep 0'
fatal: [localhost]: FAILED! => {"cache_update_time": 0, "cache_updated":
false, "changed": false, "failed": true, "invocation": {"module_args":
{"allow_unauthenticated": false, "autoremove": false, "cache_valid_time":
null, "deb": null, "default_release": null, "dpkg_options": "force-
confdef,force-confold", "force": false, "install_recommends": null, "name":
"rabbitmq-server", "only_upgrade": false, "package": ["rabbitmq-server"],
"purge": false, "state": "present", "update_cache": false, "upgrade": null},
"module_name": "apt"}, "msg": "'/usr/bin/apt-get -y -o \"Dpkg::Options::=--
force-confdef\" -o \"Dpkg::Options::=--force-confold\" install
'rabbitmq-server'' failed: E: Could not get lock /var/lib/dpkg/lock - open
(11: Resource temporarily unavailable)\nE: Unable to lock the administration
directory (/var/lib/dpkg/), is another process using it?\n", "stderr": "E: Could
not get lock /var/lib/dpkg/lock - open (11: Resource temporarily
unavailable)\nE: Unable to lock the administration directory (/var/lib/dpkg/),
is another process using it?\n", "stdout": "", "stdout_lines": []}
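Answer 0 (score: 4)
I ran into the same lock problem. I found that Ubuntu installs some packages on first boot, and cloud-init does not wait for that to finish.
Before trying to install anything, I use the following script to check that the lock file has been free for at least 15 seconds.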
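#!/bin/bash
# Wait until no process has held the dpkg lock for 15 consecutive seconds.
i=0
while [ "$i" -lt 15 ]
do
    # fuser exits 0 when some process has the file open; restart the count.
    if fuser /var/lib/dpkg/lock >/dev/null 2>&1; then
        i=0
    fi
    sleep 1
    i=$((i + 1))
done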
I prefer this over sleep 5m because, in an auto scaling group, the instance may be removed before it has even been provisioned.
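One gap in the loop above is that it never gives up if something holds the lock indefinitely. A minimal sketch of the same idea with an upper bound follows. The function name wait_for_lock_free and its three parameters (lock file, quiet period, timeout) are hypothetical and are not part of the original answer; it assumes fuser (from the psmisc package) is installed and that the script runs as root, as user data does.

```shell
#!/bin/bash
# Hypothetical helper, not from the original answer.
# Waits until no process has held lock_file for `quiet` consecutive seconds.
# Returns 1 if `timeout` seconds pass before that happens.
wait_for_lock_free() {
    local lock_file="$1"
    local quiet="${2:-15}"
    local timeout="${3:-300}"
    local free=0
    local elapsed=0

    while [ "$free" -lt "$quiet" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "timed out waiting for $lock_file" >&2
            return 1
        fi
        # fuser exits 0 when at least one process has the file open.
        if fuser "$lock_file" >/dev/null 2>&1; then
            free=0
        else
            free=$((free + 1))
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

In user data this could be called as wait_for_lock_free /var/lib/dpkg/lock 15 300 before the playbook is started, checking the return status so that a timeout is reported instead of letting apt fail later.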