How to resolve "An unknown error occurred" when creating multiple Google Cloud SQL instances with private IP simultaneously?

Time: 2019-05-05 09:50:20

Tags: google-cloud-platform google-cloud-sql terraform-provider-gcp

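Our cloud backend setup contains 5 Cloud SQL for Postgres instances. We manage our infrastructure with Terraform. We connect to the instances from GKE using a public IP and the Cloud SQL proxy container.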

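To simplify our setup, we want to get rid of the proxy container by moving to a private IP. I tried following the Terraform guide. Creating a single instance works fine, but trying to create 5 instances simultaneously ends with 4 failed instances and one successful one (screenshot: failed instance list in the GCP console).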

The error shown in the Google Cloud console for the failed instances is "An unknown error occurred" (screenshot: failed instance with error message in the GCP console).

Here is the code that reproduces it. Note the count = 5 line:

resource "google_compute_network" "private_network" {
  provider = "google-beta"

  name = "private-network"
}

resource "google_compute_global_address" "private_ip_address" {
  provider = "google-beta"

  name = "private-ip-address"
  purpose = "VPC_PEERING"
  address_type = "INTERNAL"
  prefix_length = 16
  network = "${google_compute_network.private_network.self_link}"
}

resource "google_service_networking_connection" "private_vpc_connection" {
  provider = "google-beta"

  network = "${google_compute_network.private_network.self_link}"
  service = "servicenetworking.googleapis.com"
  reserved_peering_ranges = ["${google_compute_global_address.private_ip_address.name}"]
}

resource "google_sql_database_instance" "instance" {
  provider = "google-beta"
  count = 5

  name = "private-instance-${count.index}"
  database_version = "POSTGRES_9_6"

  depends_on = [
    "google_service_networking_connection.private_vpc_connection"
  ]

  settings {
    tier = "db-custom-1-3840"
    availability_type = "REGIONAL"
    ip_configuration {
      ipv4_enabled = "false"
      private_network = "${google_compute_network.private_network.self_link}"
    }
  }
}

provider "google-beta" {
  version = "~> 2.5"
  credentials = "credentials.json"
  project = "PROJECT_ID"
  region = "us-central1"
  zone = "us-central1-a"
}

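I tried several alternatives:

  • Waiting one minute after creating the google_service_networking_connection and then creating all the instances simultaneously, but I got the same error.
  • Creating an address range and a google_service_networking_connection per instance, but I got an error saying that google_service_networking_connection resources cannot be created simultaneously.
  • Creating an address range per instance and a single google_service_networking_connection that links to all of them, but I got the same error.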

3 Answers:

Answer 0: (score: 1)

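I found an ugly but working solution. There is a bug in GCP: it does not prevent instances from being created simultaneously, even though the creation cannot complete. There is no documentation about it and no meaningful error message. It also appears in the Terraform Google provider issue tracker.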

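One option is to add a dependency between the instances. This lets their creation complete successfully, but each instance takes several minutes to create, so the total time adds up quickly. If we instead add an artificial delay of 60 seconds between instance creations, we manage to avoid the failures. Notes:

  • The number of seconds of delay needed depends on the instance tier. For example, 30 seconds were enough for db-f1-micro, but not for db-custom-1-3840.
  • I am not sure what the exact number of seconds needed for db-custom-1-3840 is. 30 seconds were not enough; 60 seconds were.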

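Below is a code sample that works around the issue. It shows only 2 instances because, due to depends_on limitations, I could not use the count feature, and the full code for 5 instances would be very long. It works the same way for 5 instances: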

resource "google_compute_network" "private_network" {
  provider = "google-beta"

  name = "private-network"
}

resource "google_compute_global_address" "private_ip_address" {
  provider = "google-beta"

  name = "private-ip-address"
  purpose = "VPC_PEERING"
  address_type = "INTERNAL"
  prefix_length = 16
  network = "${google_compute_network.private_network.self_link}"
}

resource "google_service_networking_connection" "private_vpc_connection" {
  provider = "google-beta"

  network = "${google_compute_network.private_network.self_link}"
  service = "servicenetworking.googleapis.com"
  reserved_peering_ranges = ["${google_compute_global_address.private_ip_address.name}"]
}

locals {
  db_instance_creation_delay_factor_seconds = 60
}

resource "null_resource" "delayer_1" {
  depends_on = ["google_service_networking_connection.private_vpc_connection"]

  provisioner "local-exec" {
    command = "echo Gradual DB instance creation && sleep ${local.db_instance_creation_delay_factor_seconds * 0}"
  }
}

resource "google_sql_database_instance" "instance_1" {
  provider = "google-beta"

  name = "private-instance-delayed-1"
  database_version = "POSTGRES_9_6"

  depends_on = [
    "google_service_networking_connection.private_vpc_connection",
    "null_resource.delayer_1"
  ]

  settings {
    tier = "db-custom-1-3840"
    availability_type = "REGIONAL"
    ip_configuration {
      ipv4_enabled = "false"
      private_network = "${google_compute_network.private_network.self_link}"
    }
  }
}

resource "null_resource" "delayer_2" {
  depends_on = ["google_service_networking_connection.private_vpc_connection"]

  provisioner "local-exec" {
    command = "echo Gradual DB instance creation && sleep ${local.db_instance_creation_delay_factor_seconds * 1}"
  }
}

resource "google_sql_database_instance" "instance_2" {
  provider = "google-beta"

  name = "private-instance-delayed-2"
  database_version = "POSTGRES_9_6"

  depends_on = [
    "google_service_networking_connection.private_vpc_connection",
    "null_resource.delayer_2"
  ]

  settings {
    tier = "db-custom-1-3840"
    availability_type = "REGIONAL"
    ip_configuration {
      ipv4_enabled = "false"
      private_network = "${google_compute_network.private_network.self_link}"
    }
  }
}

provider "google-beta" {
  version = "~> 2.5"
  credentials = "credentials.json"
  project = "PROJECT_ID"
  region = "us-central1"
  zone = "us-central1-a"
}

provider "null" {
  version = "~> 1.0"
}
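
The first option mentioned in this answer, chaining the instances through explicit dependencies instead of timed delays, could look roughly like the sketch below. It is not from the original answer: the resource names chained_1 and chained_2 and the instance names are hypothetical, and it assumes the network, address range, and google_service_networking_connection resources from the sample above are already declared. As the answer notes, this serializes creation, so total apply time grows with each added instance.

```hcl
# Hypothetical sketch: serialize instance creation by chaining depends_on.
# Assumes google_compute_network.private_network and
# google_service_networking_connection.private_vpc_connection exist as above.
resource "google_sql_database_instance" "chained_1" {
  provider = "google-beta"

  name = "private-instance-chained-1"
  database_version = "POSTGRES_9_6"

  depends_on = [
    "google_service_networking_connection.private_vpc_connection"
  ]

  settings {
    tier = "db-custom-1-3840"
    availability_type = "REGIONAL"
    ip_configuration {
      ipv4_enabled = "false"
      private_network = "${google_compute_network.private_network.self_link}"
    }
  }
}

resource "google_sql_database_instance" "chained_2" {
  provider = "google-beta"

  name = "private-instance-chained-2"
  database_version = "POSTGRES_9_6"

  # Creation starts only after chained_1 has been fully created.
  depends_on = [
    "google_service_networking_connection.private_vpc_connection",
    "google_sql_database_instance.chained_1"
  ]

  settings {
    tier = "db-custom-1-3840"
    availability_type = "REGIONAL"
    ip_configuration {
      ipv4_enabled = "false"
      private_network = "${google_compute_network.private_network.self_link}"
    }
  }
}
```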

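Answer 1: (score: 0)

If anyone lands here with a slightly different case (creating a google_sql_database_instance in a private network results in an "Unknown error"):

  1. Launch one Cloud SQL instance manually (this seems to enable servicenetworking.googleapis.com and some other APIs for the project).
  2. Run your manifest.
  3. Terminate the instance created in step 1.

It works for me after that.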

¯\_(ツ)_/¯
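
If the manual step helps only because it enables the Service Networking API (the answer suspects this but does not confirm it), the same effect might be achievable from Terraform. The sketch below is an assumption, not part of the original answer: the resource name service_networking is hypothetical, and the "other APIs" the answer mentions are not known, so they are not covered here. The existing connection resource would then wait for the API through depends_on.

```hcl
# Assumption: the manual workaround works because it enables this API.
resource "google_project_service" "service_networking" {
  provider = "google-beta"

  service = "servicenetworking.googleapis.com"

  # Keep the API enabled even if this resource is later destroyed.
  disable_on_destroy = false
}

# The connection from the question, now waiting for the API to be enabled.
resource "google_service_networking_connection" "private_vpc_connection" {
  provider = "google-beta"

  depends_on = ["google_project_service.service_networking"]

  network = "${google_compute_network.private_network.self_link}"
  service = "servicenetworking.googleapis.com"
  reserved_peering_ranges = ["${google_compute_global_address.private_ip_address.name}"]
}
```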

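Answer 2: (score: 0)

I landed here with a slightly different case, the same as @Grigorash Vasilij (creating a google_sql_database_instance in a private network results in an "Unknown error").

I was using the UI to deploy a SQL instance on a private VPC, and for some reason it also gave me an "Unknown error". I finally solved it by using the gcloud command instead (why does that work when the UI doesn't? I don't know; maybe the UI does not do the same thing as the command):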

gcloud --project=[PROJECT_ID] beta sql instances create [INSTANCE_ID] \
       --network=[VPC_NETWORK_NAME] \
       --no-assign-ip

follow this for more details