DNS errors on a k8s cluster on AWS created with kops

Date: 2018-09-01 12:49:56

Tags: amazon-web-services dns kubernetes kops

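We created a k8s cluster on AWS with kops and are getting intermittent DNS errors (unknown hostname). We replaced kube-dns with CoreDNS, but the errors persist. They affect both the DNS names of services inside the k8s cluster and external DNS names. The errors usually arrive in bursts: within a short period of time, all pods fail to resolve a variety of names. We have been debugging this for weeks. Thanks for your help.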
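Because the failures come in short bursts, it can help to record exactly when lookups fail so the timestamps can be compared with node logs. The following is a minimal sketch of such a probe, not a definitive tool. The names in `NAMES` are assumptions (a typical in-cluster service name and a placeholder external name), and `probe` and `run` are hypothetical helpers that are not part of kops or Kubernetes.

```python
import socket
import time
from datetime import datetime, timezone

# Assumed names to probe: one in-cluster service name and one external name.
# Replace them with names that actually fail in the affected cluster.
NAMES = ["kubernetes.default.svc.cluster.local", "example.com"]


def probe(names, resolve=socket.getaddrinfo):
    """Resolve each name once and return (name, error) pairs for failures."""
    failures = []
    for name in names:
        try:
            resolve(name, None)
        except socket.gaierror as exc:
            failures.append((name, str(exc)))
    return failures


def run(names=NAMES, interval=1.0, rounds=60):
    """Probe repeatedly and print a UTC timestamp for every failed lookup."""
    for _ in range(rounds):
        stamp = datetime.now(timezone.utc).isoformat()
        for name, err in probe(names):
            print(f"{stamp} FAIL {name}: {err}")
        time.sleep(interval)
```

Calling `run()` from inside a pod on an affected node would print one line per failed lookup. A burst of failures would then show up as a cluster of closely spaced timestamps.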

kops config:

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: kops/v1alpha2
kind: Cluster
metadata:
  name: cluster-name
spec:
  additionalPolicies:
    node: "[\n    {\n        \"Effect\": \"Allow\",\n        \"Action\": [\n          \"cloudwatch:GetMetricData\",\n
      \         \"cloudwatch:GetMetricStatistics\",\n          \"cloudwatch:ListMetrics\",\n
      \         \"cloudwatch:PutMetricData\",\n          \"autoscaling:DescribeAutoScalingGroups\",\n
      \         \"autoscaling:DescribeAutoScalingInstances\",\n          \"autoscaling:SetDesiredCapacity\",\n
      \         \"autoscaling:DescribeTags\",\n          \"autoscaling:TerminateInstanceInAutoScalingGroup\",\n
      \         \"sqs:*\"\n        ],\n        \"Resource\": [\n            \"*\"\n
      \       ]\n    },\n    {\n        \"Effect\": \"Allow\",\n        \"Action\":
      [\n            \"SNS:Publish\",\n\"SNS:CreateTopic\"\n        ],\n        \"Resource\":
      \"arn:aws:sns:us-east-1:333449552137:XXX-*\",\n        \"Principal\":
      {\n           \"AWS\": [ \n               \"333449552137\"\n           ]\n        }\n
      \   }\n]\n"
  api:
    loadBalancer:
      type: Internal
  authorization:
    rbac: {}
  channel: stable
  cloudConfig: {}
  cloudProvider: spotinst
  configBase: s3://via-k8s-state-lab/cluster-name
  etcdClusters:
  - etcdMembers:
    - instanceGroup: master-us-east-1a-1
      name: "1"
    - instanceGroup: master-us-east-1a-2
      name: "2"
    - instanceGroup: master-us-east-1a-3
      name: "3"
    name: main
  - etcdMembers:
    - instanceGroup: master-us-east-1a-1
      name: "1"
    - instanceGroup: master-us-east-1a-2
      name: "2"
    - instanceGroup: master-us-east-1a-3
      name: "3"
    name: events
  iam:
    allowContainerRegistry: true
    legacy: false
  kubernetesApiAccess:
  - 0.0.0.0/0
  kubernetesVersion: 1.10.4
  masterInternalName: api.internal.cluster-name
  masterPublicName: api.cluster-name
  networkCIDR: 10.251.0.0/17
  networking:
    calico: {}
  nonMasqueradeCIDR: 100.64.0.0/10
  sshAccess:
  - 0.0.0.0/0
  subnets:
  - cidr: 10.251.16.0/20
    name: us-east-1a
    type: Private
    zone: us-east-1a
  - cidr: 10.251.32.0/20
    name: us-east-1b
    type: Private
    zone: us-east-1b
  - cidr: 10.251.0.0/23
    name: utility-us-east-1a
    type: Utility
    zone: us-east-1a
  - cidr: 10.251.2.0/23
    name: utility-us-east-1b
    type: Utility
    zone: us-east-1b
  topology:
    dns:
      type: Public
    masters: private
    nodes: private
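
As a sanity check on the address plan in the config above, the sketch below verifies that every subnet lies inside `networkCIDR`, that no two subnets overlap, and that `nonMasqueradeCIDR` does not collide with the VPC range. It only rules out one class of misconfiguration and is not a diagnosis of the DNS errors. `find_problems` is a hypothetical helper, and the CIDR values are copied from the config.

```python
import ipaddress
from itertools import combinations

NETWORK = ipaddress.ip_network("10.251.0.0/17")    # networkCIDR
NON_MASQ = ipaddress.ip_network("100.64.0.0/10")   # nonMasqueradeCIDR
SUBNETS = {
    "us-east-1a": "10.251.16.0/20",
    "us-east-1b": "10.251.32.0/20",
    "utility-us-east-1a": "10.251.0.0/23",
    "utility-us-east-1b": "10.251.2.0/23",
}


def find_problems(network, subnets, non_masq):
    """Return a list of human-readable problems found in the address plan."""
    problems = []
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    # Every subnet must be contained in the VPC range.
    for name, net in nets.items():
        if not net.subnet_of(network):
            problems.append(f"{name} ({net}) is outside {network}")
    # No two subnets may share addresses.
    for (a, net_a), (b, net_b) in combinations(nets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{a} ({net_a}) overlaps {b} ({net_b})")
    # The pod/service range must not collide with the VPC range.
    if network.overlaps(non_masq):
        problems.append(f"{network} overlaps {non_masq}")
    return problems


print(find_problems(NETWORK, SUBNETS, NON_MASQ))
```

For the values in this config the list is empty, so the CIDR layout itself is consistent.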

If we look at /var/log/kern.log on one of the nodes where the DNS problem occurs, we see the following: [image: kern.log excerpt]

These are NETDEV_UP and NETDEV_CHANGE errors, which means a network interface is going down.

We are not yet sure how this affects DNS.
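
One way to check whether the interface events line up with the DNS failures is to pull the matching lines out of kern.log and compare their timestamps with the times of the failed lookups. The sketch below only filters lines by the two event names mentioned above and makes no assumption about the rest of the log format. `netdev_lines` and `scan` are hypothetical helpers.

```python
# Event names taken from the question; the surrounding log format is not assumed.
EVENTS = ("NETDEV_UP", "NETDEV_CHANGE")


def netdev_lines(lines, events=EVENTS):
    """Return the lines that mention any of the given event names."""
    return [line.rstrip("\n") for line in lines if any(e in line for e in events)]


def scan(path="/var/log/kern.log"):
    """Read the kernel log at `path` and return the matching lines."""
    with open(path, errors="replace") as fh:
        return netdev_lines(fh)
```

Running `scan()` on an affected node returns the matching lines in file order. If their timestamps cluster around the DNS error bursts, the interface flapping is a plausible cause worth investigating further.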

Any help would be greatly appreciated.

0 answers:

No answers