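I have been struggling for a long time to run a custom shell script in an Azure VM. The shell commands work fine on their own, but as soon as I bundle them into a shell script it fails. I defined the shell script in the settings section.
Terraform code: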
resource "azurerm_resource_group" "test" {
  name     = "acctestrg"
  location = "West US"
}

resource "azurerm_virtual_network" "test" {
  name                = "acctvn"
  address_space       = ["10.0.0.0/16"]
  location            = "West US"
  resource_group_name = "${azurerm_resource_group.test.name}"
}

resource "azurerm_subnet" "test" {
  name                 = "acctsub"
  resource_group_name  = "${azurerm_resource_group.test.name}"
  virtual_network_name = "${azurerm_virtual_network.test.name}"
  address_prefix       = "10.0.2.0/24"
}

resource "azurerm_public_ip" "pubip" {
  name                         = "tom-pip"
  location                     = "${azurerm_resource_group.test.location}"
  resource_group_name          = "${azurerm_resource_group.test.name}"
  public_ip_address_allocation = "Dynamic"
  idle_timeout_in_minutes      = 30

  tags {
    environment = "test"
  }
}

resource "azurerm_network_interface" "test" {
  name                = "acctni"
  location            = "West US"
  resource_group_name = "${azurerm_resource_group.test.name}"

  ip_configuration {
    name                          = "testconfiguration1"
    subnet_id                     = "${azurerm_subnet.test.id}"
    private_ip_address_allocation = "dynamic"
    public_ip_address_id          = "${azurerm_public_ip.pubip.id}"
  }
}

resource "azurerm_storage_account" "test" {
  name                     = "mostor"
  resource_group_name      = "${azurerm_resource_group.test.name}"
  location                 = "westus"
  account_tier             = "Standard"
  account_replication_type = "LRS"

  tags {
    environment = "staging"
  }
}

resource "azurerm_storage_container" "test" {
  name                  = "vhds"
  resource_group_name   = "${azurerm_resource_group.test.name}"
  storage_account_name  = "${azurerm_storage_account.test.name}"
  container_access_type = "private"
}

resource "azurerm_virtual_machine" "test" {
  name                  = "acctvm"
  location              = "West US"
  resource_group_name   = "${azurerm_resource_group.test.name}"
  network_interface_ids = ["${azurerm_network_interface.test.id}"]
  vm_size               = "Standard_A0"

  storage_image_reference {
    publisher = "Canonical"
    offer     = "UbuntuServer"
    sku       = "16.04-LTS"
    version   = "latest"
  }

  storage_os_disk {
    name          = "myosdisk1"
    vhd_uri       = "${azurerm_storage_account.test.primary_blob_endpoint}${azurerm_storage_container.test.name}/myosdisk1.vhd"
    caching       = "ReadWrite"
    create_option = "FromImage"
  }

  os_profile {
    computer_name  = "hostname"
    admin_username = "testadmin"
    admin_password = "Password1234!"
  }

  os_profile_linux_config {
    disable_password_authentication = false
  }

  tags {
    environment = "staging"
  }
}

resource "azurerm_virtual_machine_extension" "test" {
  name                 = "hostname"
  location             = "West US"
  resource_group_name  = "${azurerm_resource_group.test.name}"
  virtual_machine_name = "${azurerm_virtual_machine.test.name}"
  publisher            = "Microsoft.OSTCExtensions"
  type                 = "CustomScriptForLinux"
  type_handler_version = "1.2"

  settings = <<SETTINGS
  {
    "fileUris": ["https://sag.blob.core.windows.net/sagcont/install_nginx_ubuntu.sh"],
    "commandToExecute": "sh install_nginx_ubuntu.sh"
  }
SETTINGS

  tags {
    environment = "Production"
  }
}
I have removed every sudo from the commands in the script, because Azure runs all commands as root. FYR, the shell script is below:
Shell script:
#!/bin/bash
echo "Running apt update"
apt-get update
echo "Installing nginx"
apt-get install nginx
The error I get is nothing more than a timeout message, shown below:
Error:
azurerm_virtual_machine.test: Creation complete after 3m21s (ID: /subscriptions/b017dff9-5685-4a83-80d3-...crosoft.Compute/virtualMachines/acctvm)
azurerm_virtual_machine_extension.test: Creating...
location: "" => "westus"
name: "" => "hostname"
publisher: "" => "Microsoft.OSTCExtensions"
resource_group_name: "" => "acctestrg"
settings: "" => " {\n \"fileUris\": [\"https://sag.blob.core.windows.net/sagcont/install_nginx_ubuntu.sh\"],\n\t\"commandToExecute\": \"sh install_nginx_ubuntu.sh\"\n }\n"
tags.%: "" => "1"
tags.environment: "" => "Production"
type: "" => "CustomScriptForLinux"
type_handler_version: "" => "1.2"
virtual_machine_name: "" => "acctvm"
azurerm_virtual_machine_extension.test: Still creating... (10s elapsed)
azurerm_virtual_machine_extension.test: Still creating... (20s elapsed)
azurerm_virtual_machine_extension.test: Still creating... (30s elapsed)
azurerm_virtual_machine_extension.test: Still creating... (40s elapsed)
azurerm_virtual_machine_extension.test: Still creating... (50s elapsed)
azurerm_virtual_machine_extension.test: Still creating... (1m0s elapsed)
Error: Error applying plan:
1 error(s) occurred:
* azurerm_virtual_machine_extension.test: 1 error(s) occurred:
* azurerm_virtual_machine_extension.test: compute.VirtualMachineExtensionsClient#CreateOrUpdate: Failure sending request: StatusCode=200 -- Original Error: Long running operation terminated with status 'Failed': Code="VMExtensionProvisioningError" Message="VM has reported a failure when processing extension 'hostname'. Error message: \"Malformed status file [ExtensionError] Invalid status/status: failed\"."
Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.
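I can confirm that the script is accessible to everyone, because I can download it with wget. I am not sure what is wrong. I have dug through a lot of material on the web, but everywhere I look I only find an open bug or issue. There is also not much content about using Terraform with Azure. Any help is appreciated!
Answer 0 (score: 2)
Yes, your script needs -y.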
apt-get install nginx -y
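When the Azure custom script extension runs, the script has to be fully automatic; it cannot wait for parameters to be entered by hand.
If you do not add -y to the script, it hangs waiting for you to type yes. The Azure custom script extension waits a few minutes and then you get the error.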
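A minimal sketch of what a non-interactive version of install_nginx_ubuntu.sh could look like, written to disk and syntax-checked before it is uploaded. The `set -e` line and the `DEBIAN_FRONTEND=noninteractive` setting are additions that are not in the original script: the first makes the script stop at the first failing command, the second suppresses package configuration dialogs that -y alone does not answer. Because the extension invokes the file as `sh install_nginx_ubuntu.sh`, it runs under sh regardless of the shebang, so the sketch sticks to POSIX syntax.

```shell
# Write the corrected script to the current directory. This is the file that
# would then be uploaded to the location referenced by fileUris.
cat > install_nginx_ubuntu.sh <<'EOF'
#!/bin/bash
# Stop at the first failing command (addition, not in the original script).
set -e

# Suppress package configuration dialogs (addition, not in the original script).
export DEBIAN_FRONTEND=noninteractive

echo "Running apt update"
apt-get update

echo "Installing nginx"
# -y answers apt's "Do you want to continue? [Y/n]" question automatically.
apt-get install -y nginx
EOF

# Parse-only check: reports syntax errors without executing any command.
sh -n install_nginx_ubuntu.sh && echo "syntax OK"
```

The parse-only check does not prove that the installation succeeds; it only catches syntax mistakes before the VM extension has to.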
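Update from the comments:
"I am unable to find where the tar/script gets downloaded. Could you clarify that here?"
All execution output and errors from the script are logged in the script's download directory, /var/lib/waagent//download//, and the tail of the output is logged in the log directory specified in HandlerEnvironment.json and reported back to Azure.
The extension's operation log is the file /var/log/azure///extension.log.
For more details, see this link.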
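To look at those logs on the VM without knowing the exact extension directory names, something like the following sketch could be used. It assumes the logs live under /var/log/azure as described above; `show_extension_logs` and its two parameters are hypothetical names introduced only for this example.

```shell
# Print the last lines of every extension.log found under a log root.
# $1: log root (defaults to /var/log/azure), $2: number of lines per file.
show_extension_logs() {
  log_root="${1:-/var/log/azure}"
  lines="${2:-50}"
  # find errors (e.g. a missing directory) are silenced so the function
  # simply prints nothing when no logs exist.
  find "$log_root" -type f -name extension.log 2>/dev/null |
  while IFS= read -r log; do
    echo "==> $log"
    tail -n "$lines" "$log"
  done
}

show_extension_logs
```

Run as root on the VM, this prints the tail of each extension's operation log, which is usually where the actual script failure shows up.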
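Answer 1 (score: 0)
The problem is in your script, not in the Terraform file itself.
Problem
When you run the install_nginx_ubuntu.sh script in an Ubuntu VM, this is the output on the box (only the last part is shown):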
0 upgraded, 14 newly installed, 0 to remove and 162 not upgraded.
Need to get 3,000 kB of archives.
After this operation, 9,783 kB of additional disk space will be used.
Do you want to continue? [Y/n]
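So the script, and with it Terraform, simply waits for user input, which causes the process to time out.
Solution
The solution is simply to auto-approve the package installation, which should be familiar to Linux users. So change the install line in install_nginx_ubuntu.sh to the following:
apt-get install nginx -y
Lessons learned that may be beyond the scope of the question
You may want to look at how to debug Terraform. I feel that if you had at least seen some more verbose feedback, you would have been able to find the solution yourself.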
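As a rough illustration of what more verbose feedback can look like: Terraform reads the TF_LOG and TF_LOG_PATH environment variables to enable detailed logging and to write it to a file. The wrapper below is only a sketch; `tf_debug` and the default log file name are hypothetical choices made for this example.

```shell
# Run any terraform subcommand with debug logging written to a file.
# TF_LOG and TF_LOG_PATH are Terraform's own environment variables;
# the function name and default file name are illustrative.
tf_debug() {
  TF_LOG=DEBUG TF_LOG_PATH="${TF_LOG_PATH:-./terraform-debug.log}" terraform "$@"
}
```

Calling `tf_debug apply` then behaves like `terraform apply` while also leaving a terraform-debug.log file to read through afterwards. Note that this shows what Terraform and the provider are doing; the output of the script itself still has to be read from the extension logs on the VM.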