User-data script fails without giving a reason

Time: 2014-05-22 22:38:54

Tags: amazon-ec2

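I am launching an Amazon Linux instance (ami-fb8e9292) from the web console and pasting a script into the user data box so that it runs at startup. If I use the example given by Amazon to start a web server, it works. But when I use my own script (also a #!/bin/bash script), it does not run.

If I look at /var/log/cloud-init.log, it gives no useful information on the matter: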

May 22 21:06:12 cloud-init[1286]: util.py[DEBUG]: Running command ['/var/lib/cloud/instance/scripts/part-001'] with allowed return codes [0] (shell=True, capture=False)
May 22 21:06:16 cloud-init[1286]: util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/part-001 [2]
May 22 21:06:16 cloud-init[1286]: util.py[DEBUG]: Failed running /var/lib/cloud/instance/scripts/part-001 [2]
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/cloudinit/util.py", line 637, in runparts
    subp([exe_path], capture=False, shell=True)
  File "/usr/lib/python2.6/site-packages/cloudinit/util.py", line 1528, in subp
    cmd=args)
ProcessExecutionError: Unexpected error while running command.
Command: ['/var/lib/cloud/instance/scripts/part-001']
Exit code: 2
Reason: -
Stdout: ''
Stderr: ''

If I ssh into the instance, run sudo su, and execute the shell script directly:

/var/lib/cloud/instance/scripts/part-001

then it runs fine. It also works if I mimic the way cloud-init runs it:

python
>>> import cloudinit.util
>>> cloudinit.util.runparts("/var/lib/cloud/instance/scripts/")

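With either of these methods, if I deliberately introduce an error into the script, an error message is produced. How can I debug this selective lack of useful debug output?

4 answers:

Answer 0 (score: 2)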

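I had a similar problem and was able to solve it. I realized that the environment variable EC2_HOME is not set under sudo. I do a lot of work with the AWS CLI in my configset, and for that to work EC2_HOME needs to be set. So I went in and removed sudo from both my configset and my UserData. Earlier, when I was having the problem, my UserData looked like this: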

"UserData"       : { "Fn::Base64" : { "Fn::Join" : ["", [
                                "#!/bin/bash\n",
                                "sudo yum update -y aws-cfn-bootstrap\n",

                                "# Install the files and packages and run the commands from the metadata\n",
                                "sudo /opt/aws/bin/cfn-init -v --access-key ", { "Ref" : "IAMUserAccessKey" }, " --secret-key ", { "Ref" : "SecretAccessKey" },  
                                "         --stack ", { "Ref" : "AWS::StackName" },
                                "         --resource NAT2 ",
                                "         --configsets config ",
                                "         --region ", { "Ref" : "AWS::Region" }, "\n"
                        ]]}}

After the change, my UserData looks like this:

"UserData"       : { "Fn::Base64" : { "Fn::Join" : ["", [
                                "#!/bin/bash -xe\n",
                                "yum update -y aws-cfn-bootstrap\n",

                                "# Install the files and packages and run the commands from the metadata\n",
                                "/opt/aws/bin/cfn-init -v --access-key ", { "Ref" : "IAMUserAccessKey" }, " --secret-key ", { "Ref" : "SecretAccessKey" },  
                                "         --stack ", { "Ref" : "AWS::StackName" },
                                "         --resource NAT2 ",
                                "         --configsets config ",
                                "         --region ", { "Ref" : "AWS::Region" }, "\n"
                        ]]}}

Likewise, I removed all the sudo calls I was making in the configset.
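
One way to make this kind of failure visible is to check up front that the variables the script depends on are actually present in the environment it runs in. The following is only a minimal sketch, assuming EC2_HOME is the variable in question; require_env is a hypothetical helper name and is not part of cloud-init or the AWS tools.

```shell
#!/bin/bash
# require_env is a hypothetical helper, not part of cloud-init or the AWS tools.
# It prints a message to stderr and returns non-zero if the named variable
# is unset or empty.
require_env() {
    local name="$1"
    if [ -z "${!name:-}" ]; then
        echo "required environment variable $name is not set" >&2
        return 1
    fi
}

# Example: report whether EC2_HOME is visible to this shell.
if require_env EC2_HOME; then
    echo "EC2_HOME=$EC2_HOME"
else
    echo "EC2_HOME is missing; commands that depend on it will fail"
fi
```

On systems where sudoers enables env_reset (a common default), comparing the output of env and sudo env for the variable is a quick way to see whether sudo is dropping it.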

Answer 1 (score: 1)

Consider searching /var/log/cloud-init-output.log, rather than /var/log/cloud-init.log, for keywords such as "Failed", "ERROR", "WARNING", or "/var/lib/cloud/instance/scripts/"; it contains very clear error messages.

For example, running a bad command produces the following errors in /var/log/cloud-init-output.log:

/var/lib/cloud/instance/scripts/part-001: line 10: vncpasswd: command not found
cp: cannot stat '/lib/systemd/system/vncserver@.service': No such file or directory
sed: can't read /etc/systemd/system/vncserver@.service: No such file or directory
Failed to execute operation: No such file or directory
Failed to start vncserver@:1.service: Unit not found.
Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
Cleaning repos: amzn2-core amzn2extra-docker amzn2extra-epel

At the end of /var/log/cloud-init.log, by contrast, you only get a quite generic error message:

Aug 31 15:14:00 cloud-init[3532]: util.py[DEBUG]: Failed running /var/lib/cloud/instance/scripts/part-001 [1]
    Traceback (most recent call last):
      File "/usr/lib/python2.7/site-packages/cloudinit/util.py", line 910, in runparts
        subp(prefix + [exe_path], capture=False, shell=True)
      File "/usr/lib/python2.7/site-packages/cloudinit/util.py", line 2105, in subp
        cmd=args)
    ProcessExecutionError: Unexpected error while running command.
    Command: ['/var/lib/cloud/instance/scripts/part-001']
    Exit code: 1
    Reason: -
    Stdout: -
    Stderr: -
    cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/cloud/instance/scripts)

(*) Try using grep to show only the relevant error messages:

grep -C 10 '<search-keyword>' cloud-init-output.log
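
The keyword search above can be wrapped in a small function so that all of the suggested keywords are checked in one pass. This is a sketch only: scan_cloud_init_log is a hypothetical name, and the default path assumes the log lives in /var/log as described above.

```shell
#!/bin/bash
# scan_cloud_init_log is a hypothetical helper name.
# It prints lines matching the keywords suggested above, with line numbers
# and two lines of context on each side. Like grep, it returns non-zero
# when nothing matches.
scan_cloud_init_log() {
    local log="${1:-/var/log/cloud-init-output.log}"
    grep -n -C 2 -E 'Failed|ERROR|WARNING|/var/lib/cloud/instance/scripts/' "$log"
}
```

Calling scan_cloud_init_log with no argument reads the default path; pass a different file name to scan a copy of the log.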

Answer 2 (score: 1)

Hopefully this saves someone some debugging time. My /var/log/cloud-init-output.log contained no explicit error message, only this:

2021-04-07 10:36:57,748 - cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/cloud/instance/scripts)
2021-04-07 10:36:57,748 - util.py[WARNING]: Running module scripts-user () failed

After some investigation, I realized the cause was a typo in the shebang line: #!?bin/bash instead of #!/bin/bash
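
A typo like this can be caught before launching an instance by checking the first line of the script locally. The following is a rough sketch under stated assumptions: check_shebang is a hypothetical helper name, and it assumes the shebang has no space between "#!" and the interpreter path.

```shell
#!/bin/bash
# check_shebang is a hypothetical helper name.
# It succeeds only if the first line of the file starts with "#!/" and the
# interpreter path that follows is an existing executable.
check_shebang() {
    local script="$1" first interp
    # Read only the first line; tolerate a file without a trailing newline.
    IFS= read -r first < "$script" || true
    case "$first" in
        '#!/'*) ;;
        *) echo "bad shebang in $script: $first" >&2; return 1 ;;
    esac
    # Strip the leading "#!" and any arguments after the interpreter path.
    interp="${first#'#!'}"
    interp="${interp%% *}"
    if [ ! -x "$interp" ]; then
        echo "interpreter not executable: $interp" >&2
        return 1
    fi
}
```

Running check_shebang on the file before pasting it into the user data box would have flagged #!?bin/bash immediately.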

Answer 3 (score: 0)

I'm not sure whether this applies to everyone, but I ran into this problem and was able to fix it by changing my first line from:

#!/bin/bash -e -v

to this:

#!/bin/bash

Of course, now when my script fails I can't tell how far it got, but at least I'm past the point where it wouldn't run at all. :)
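
A likely explanation, offered as an assumption rather than something confirmed in this thread: on Linux, everything after the interpreter path on the shebang line is passed to the interpreter as a single argument, so bash would receive "-e -v" as one string and reject it. If that is the cause, the options can be kept by setting them inside the script, which also restores the record of how far the script got. A minimal sketch; the log path is a hypothetical choice.

```shell
#!/bin/bash
# Enable the options inside the script rather than on the shebang line.
set -e   # exit at the first command that fails
set -v   # print each script line as it is read

# Hypothetical log location; set LOG_FILE beforehand to write somewhere else.
LOG_FILE="${LOG_FILE:-/tmp/user-data-debug.log}"

# From here on, append stdout and stderr (including the -v echo) to the log.
exec >>"$LOG_FILE" 2>&1

echo "user-data script started"
```

After a failed boot, the last lines of the log file show the last command that was read before the script stopped.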