Terraform: SSH authentication failed (user@:22): ssh: handshake failed

Time: 2019-09-03 15:45:33

Tags: azure terraform terraform-provider-azure

I wrote some Terraform code to create a new VM and want to run a command on it through remote-exec, but it fails with an SSH connection error:

Error: timeout - last error: SSH authentication failed (admin@:22): ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain.

My Terraform code:

# Create a resource group if it doesn’t exist
resource "azurerm_resource_group" "rg" {
  name     = "${var.deployment}-mp-rg"
  location = "${var.azure_environment}"

  tags = {
    environment = "${var.deployment}"
  }
}

# Create virtual network
resource "azurerm_virtual_network" "vnet" {
  name                = "${var.deployment}-mp-vnet"
  address_space       = ["10.0.0.0/16"]
  location            = "${var.azure_environment}"
  resource_group_name = "${azurerm_resource_group.rg.name}"

  tags = {
    environment = "${var.deployment}"
  }
}

# Create subnet
resource "azurerm_subnet" "subnet" {
  name                 = "${var.deployment}-mp-subnet"
  resource_group_name  = "${azurerm_resource_group.rg.name}"
  virtual_network_name = "${azurerm_virtual_network.vnet.name}"
  address_prefix       = "10.0.1.0/24"
}

# Create public IPs
resource "azurerm_public_ip" "publicip" {
  name                = "${var.deployment}-mp-publicip"
  location            = "${var.azure_environment}"
  resource_group_name = "${azurerm_resource_group.rg.name}"
  allocation_method   = "Dynamic"

  tags = {
    environment = "${var.deployment}"
  }
}

# Create Network Security Group and rule
resource "azurerm_network_security_group" "nsg" {
  name                = "${var.deployment}-mp-nsg"
  location            = "${var.azure_environment}"
  resource_group_name = "${azurerm_resource_group.rg.name}"

  security_rule {
    name                       = "SSH"
    priority                   = 1001
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "22"
    source_address_prefix      = "*"
    destination_address_prefix = "*"
  }

  tags = {
    environment = "${var.deployment}"
  }
}

# Create network interface
resource "azurerm_network_interface" "nic" {
  name                      = "${var.deployment}-mp-nic"
  location                  = "${var.azure_environment}"
  resource_group_name       = "${azurerm_resource_group.rg.name}"
  network_security_group_id = "${azurerm_network_security_group.nsg.id}"

  ip_configuration {
    name                          = "${var.deployment}-mp-nicconfiguration"
    subnet_id                     = "${azurerm_subnet.subnet.id}"
    private_ip_address_allocation = "Dynamic"
    public_ip_address_id          = "${azurerm_public_ip.publicip.id}"
  }

  tags = {
    environment = "${var.deployment}"
  }
}

# Generate random text for a unique storage account name
resource "random_id" "randomId" {
  keepers = {
    # Generate a new ID only when a new resource group is defined
    resource_group = "${azurerm_resource_group.rg.name}"
  }

  byte_length = 8
}

# Create storage account for boot diagnostics
resource "azurerm_storage_account" "storageaccount" {
  name                     = "diag${random_id.randomId.hex}"
  resource_group_name      = "${azurerm_resource_group.rg.name}"
  location                 = "${var.azure_environment}"
  account_tier             = "Standard"
  account_replication_type = "LRS"

  tags = {
    environment = "${var.deployment}"
  }
}

# Create virtual machine
resource "azurerm_virtual_machine" "vm" {
  name                  = "${var.deployment}-mp-vm"
  location              = "${var.azure_environment}"
  resource_group_name   = "${azurerm_resource_group.rg.name}"
  network_interface_ids = ["${azurerm_network_interface.nic.id}"]
  vm_size               = "Standard_DS1_v2"

  storage_os_disk {
    name              = "${var.deployment}-mp-disk"
    caching           = "ReadWrite"
    create_option     = "FromImage"
    managed_disk_type = "Premium_LRS"
  }

  storage_image_reference {
    publisher = "Canonical"
    offer     = "UbuntuServer"
    sku       = "16.04-LTS"
    version   = "latest"
  }

  os_profile {
    computer_name  = "${var.deployment}-mp-ansible"
    admin_username = "${var.ansible_user}"
  }

  os_profile_linux_config {
    disable_password_authentication = true
    ssh_keys {
      path     = "/home/${var.ansible_user}/.ssh/authorized_keys"
      key_data = "${var.public_key}"
    }
  }

  boot_diagnostics {
    enabled     = "true"
    storage_uri = "${azurerm_storage_account.storageaccount.primary_blob_endpoint}"
  }

  tags = {
    environment = "${var.deployment}"
  }
}

resource "null_resource" "ssh_connection" {

  connection {
    host        = "${azurerm_public_ip.publicip.ip_address}"
    type        = "ssh"
    private_key = "${file(var.private_key)}"
    port        = 22
    user        = "${var.ansible_user}"
    agent       = false
    timeout     = "1m"
  }

  provisioner "remote-exec" {
    inline = ["sudo apt-get -qq install python"]
  }
}

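I tried to SSH into the new VM manually with admin@xx.xx.xx.xx:22 and it works. After seeing the error message I output ${azurerm_public_ip.publicip.ip_address} and found that it was null, so I think that is why SSH authentication fails, but I don't know why it is null. How should I change the code so that the Terraform script can SSH to the server?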

1 answer:

Answer 0 (score: 1)

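Your issue is that Terraform has built a dependency graph in which the only dependency of null_resource.ssh_connection is the azurerm_public_ip.publicip resource, so it starts trying to connect before the instance has been created.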

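This in itself isn't a massive issue, because the provisioner would normally retry if SSH isn't available yet, but the connection details are determined as soon as the null resource starts. And with azurerm_public_ip set to an allocation_method of Dynamic, it won't get its IP address until after it has been attached to a resource:

  Note: Dynamic public IP addresses aren't allocated until they're assigned to a resource (such as a Virtual Machine or a Load Balancer), by design within Azure.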
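The note above concerns Dynamic allocation only. As a minimal sketch of the contrast, and not something the original answer proposes, the same public IP resource from the question could request a Static address, which Azure reserves when the public IP resource itself is created, so ip_address should already be populated before the VM exists. This assumes a static address is acceptable for the deployment (it may be billed differently, and this variant has not been tested against the question's setup).

```hcl
# Sketch only: the public IP resource from the question, with the
# allocation method switched from "Dynamic" to "Static".
resource "azurerm_public_ip" "publicip" {
  name                = "${var.deployment}-mp-publicip"
  location            = "${var.azure_environment}"
  resource_group_name = "${azurerm_resource_group.rg.name}"

  # Assumption: a static address is acceptable here. It is reserved when
  # this resource is created rather than when it is attached to a VM.
  allocation_method = "Static"

  tags = {
    environment = "${var.deployment}"
  }
}
```

The provisioner should still wait for the VM itself, so this complements the dependency fixes rather than replacing them.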

There are a few ways to solve this. You can make the null_resource depend on the azurerm_virtual_machine.vm resource, either through interpolation or through depends_on:

# Option 1: implicit dependency, created by interpolating the VM's id below
resource "null_resource" "ssh_connection" {

  connection {
    host        = "${azurerm_public_ip.publicip.ip_address}"
    type        = "ssh"
    private_key = "${file(var.private_key)}"
    port        = 22
    user        = "${var.ansible_user}"
    agent       = false
    timeout     = "1m"
  }

  provisioner "remote-exec" {
    inline = [
      "echo ${azurerm_virtual_machine.vm.id}",
      "sudo apt-get -qq install python",
    ]
  }
}

# Option 2: explicit dependency on the VM through depends_on
resource "null_resource" "ssh_connection" {
  depends_on = ["azurerm_virtual_machine.vm"]

  connection {
    host        = "${azurerm_public_ip.publicip.ip_address}"
    type        = "ssh"
    private_key = "${file(var.private_key)}"
    port        = 22
    user        = "${var.ansible_user}"
    agent       = false
    timeout     = "1m"
  }

  provisioner "remote-exec" {
    inline = ["sudo apt-get -qq install python"]
  }
}

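A better approach here is to run the provisioner as part of the azurerm_virtual_machine.vm resource rather than a null_resource. The usual reasons to launch a provisioner from a null_resource are that you need to wait until something else has happened to a resource (such as attaching a disk), or that there is not an appropriate resource to attach it to, but neither really applies here. So instead of your existing null_resource, you would move the provisioner into the azurerm_virtual_machine.vm resource: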

resource "azurerm_virtual_machine" "vm" {
  # ...

  provisioner "remote-exec" {
    connection {
      host        = "${azurerm_public_ip.publicip.ip_address}"
      type        = "ssh"
      private_key = "${file(var.private_key)}"
      port        = 22
      user        = "${var.ansible_user}"
      agent       = false
      timeout     = "1m"
    }

    inline = ["sudo apt-get -qq install python"]
  }
}

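For many resources this also lets you use the self keyword to refer to the outputs of the resource being provisioned. Unfortunately, the azurerm_virtual_machine resource does not seem to expose the VM's IP address easily, because it is set through network_interface_ids.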
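Since the VM resource does not expose the address directly, one possible workaround is to read the public IP back through the azurerm_public_ip data source once the VM exists. This is a sketch that resembles the pattern in the azurerm provider documentation for retrieving the dynamic public IP of a new VM. It is not part of the original answer, and the data source label publicip is chosen here purely for illustration. Referencing an attribute of azurerm_virtual_machine.vm inside the data source is intended to make Terraform defer the lookup until the VM, and with it the attached address, has been created.

```hcl
# Sketch only: look the public IP up again after the VM has been created.
# The reference to azurerm_virtual_machine.vm gives this data source an
# implicit dependency on the VM.
data "azurerm_public_ip" "publicip" {
  name                = "${azurerm_public_ip.publicip.name}"
  resource_group_name = "${azurerm_virtual_machine.vm.resource_group_name}"
}

resource "null_resource" "ssh_connection" {
  connection {
    # Read the address from the data source instead of the resource, so the
    # value reflects the IP assigned once the NIC was attached to the VM.
    host        = "${data.azurerm_public_ip.publicip.ip_address}"
    type        = "ssh"
    private_key = "${file(var.private_key)}"
    port        = 22
    user        = "${var.ansible_user}"
    agent       = false
    timeout     = "1m"
  }

  provisioner "remote-exec" {
    inline = ["sudo apt-get -qq install python"]
  }
}
```

Because the connection block references the data source, and the data source references the VM, the null_resource should only start after the VM is up, without an explicit depends_on.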