Debugging a cert-manager certificate creation failure on AKS

Time: 2020-05-20 13:23:51

Tags: ssl kubernetes azure-aks cert-manager

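I'm deploying cert-manager on Azure AKS and trying to have it request a Let's Encrypt certificate. It fails with a "certificate signed by unknown authority" error, and I'm having trouble troubleshooting it further.

I'm not sure whether this is a problem with trusting the LE server, the tunnelfront pod, or an internal AKS self-generated CA. So my questions are:

  • How can I force cert-manager to show more (debug) information about the certificate it doesn't trust?
  • Maybe this problem occurs regularly and there is a known solution?
  • What steps should be taken to debug the issue further?

I created an issue on the jetstack/cert-manager GitHub page but got no reply, so I came here.

The whole story is as follows:

Certificates are not created. The following errors are reported: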

the certificate: Error from server: conversion webhook for &{map[apiVersion:cert-manager.io/v1alpha2 kind:Certificate metadata:map[creationTimestamp:2020-05-13T17:30:48Z generation:1 name:xxx-tls namespace:test ownerReferences:[map[apiVersion:extensions/v1beta1 blockOwnerDeletion:true controller:true kind:Ingress name:xxx-ingress uid:6d73b182-bbce-4834-aee2-414d2b3aa802]] uid:d40bc037-aef7-4139-868f-bd615a423b38] spec:map[dnsNames:[xxx.test.domain.com] issuerRef:map[group:cert-manager.io kind:ClusterIssuer name:letsencrypt-prod] secretName:xxx-tls] status:map[conditions:[map[lastTransitionTime:2020-05-13T18:55:31Z message:Waiting for CertificateRequest "xxx-tls-1403681706" to complete reason:InProgress status:False type:Ready]]]]} failed: Post https://cert-manager-webhook.cert-manager.svc:443/convert?timeout=30s: x509: certificate signed by unknown authority

the cert-manager-webhook container: cert-manager 2020/05/15 14:22:58 http: TLS handshake error from 10.20.0.19:35350: remote error: tls: bad certificate

10.20.0.19 is the IP of the tunnelfront pod.

Debugging with https://cert-manager.io/docs/faq/acme/ sort of "fails" at the kubectl describe order... step, because kubectl describe certificaterequest... returns the CSR contents along with the error (as above), but not the order ID.

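Environment details:

  • Kubernetes version: 1.15.10
  • Cloud provider/provisioner: Azure (AKS)
  • cert-manager version: 0.14.3
  • Install method: static manifests (see below) + cluster issuer (see below) + regular CRDs (not legacy)

Cluster issuer: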

kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
  namespace: cert-manager
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: x
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - dns01:
          azuredns:
            clientID: x
            clientSecretSecretRef:
              name: cert-manager-stage
              key: CLIENT_SECRET
            subscriptionID: x
            tenantID: x
            resourceGroupName: dns-stage
            hostedZoneName: x

Manifest:

  imagePullSecrets: []
  isOpenshift: false

  priorityClassName: ""
  rbac:
    create: true

  podSecurityPolicy:
    enabled: false

  logLevel: 2

  leaderElection:
    namespace: "kube-system"

replicaCount: 1

strategy: {}


image:
  repository: quay.io/jetstack/cert-manager-controller
  pullPolicy: IfNotPresent

  tag: v0.14.3

clusterResourceNamespace: ""

serviceAccount:
  create: true
  name:
  annotations: {}

extraArgs: []

extraEnv: []

resources: {}

securityContext:
  enabled: false
  fsGroup: 1001
  runAsUser: 1001

podAnnotations: {}

podLabels: {}

nodeSelector: {}

ingressShim:
  defaultIssuerName: letsencrypt-prod
  defaultIssuerKind: ClusterIssuer

prometheus:
  enabled: true
  servicemonitor:
    enabled: false
    prometheusInstance: default
    targetPort: 9402
    path: /metrics
    interval: 60s
    scrapeTimeout: 30s
    labels: {}


affinity: {}

tolerations: []

webhook:
  enabled: true
  replicaCount: 1

  strategy: {}

  podAnnotations: {}

  extraArgs: []

  resources: {}

  nodeSelector: {}

  affinity: {}

  tolerations: []

  image:
    repository: quay.io/jetstack/cert-manager-webhook
    pullPolicy: IfNotPresent
    tag: v0.14.3

  injectAPIServerCA: true

  securePort: 10250

cainjector:
  replicaCount: 1

  strategy: {}

  podAnnotations: {}

  extraArgs: []

  resources: {}

  nodeSelector: {}

  affinity: {}

  tolerations: []

  image:
    repository: quay.io/jetstack/cert-manager-cainjector
    pullPolicy: IfNotPresent
    tag: v0.14.3

1 answer:

Answer 0 (score: 2)

It seems v0.14.3 has some kind of bug. The problem does not occur with v0.15.0.
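
If upgrading is not immediately possible, one way to narrow the error down is to check whether the CA bundle that the API server uses for the conversion webhook matches the CA that signed the webhook's serving certificate. The following is only a rough sketch and not part of the original answer: it assumes both values have already been fetched as base64 strings (for example with kubectl get ... -o jsonpath), and the function names cert_fingerprints and bundles_share_ca are hypothetical helpers made up for illustration. The name of the webhook's TLS secret is not given in the post, so it is left as a placeholder in the comments.

```python
import base64
import hashlib
import re

# Matches the base64 body of each certificate in a PEM bundle.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----(.+?)-----END CERTIFICATE-----", re.DOTALL
)


def cert_fingerprints(pem_text):
    """Return the set of SHA-256 fingerprints of all certificates in a PEM bundle."""
    prints = set()
    for body in PEM_BLOCK.findall(pem_text):
        der = base64.b64decode("".join(body.split()))  # strip line breaks first
        prints.add(hashlib.sha256(der).hexdigest())
    return prints


def bundles_share_ca(injected_b64, webhook_ca_b64):
    """True if the two base64-encoded PEM bundles have at least one certificate in common.

    Assumed inputs (not taken from the original post):
      injected_b64   - the caBundle field of the Certificate CRD's conversion
                       webhook client config, as printed by kubectl
      webhook_ca_b64 - the ca.crt entry of the webhook's TLS secret
                       (secret name: placeholder, check your own install)
    """
    injected = cert_fingerprints(base64.b64decode(injected_b64).decode())
    webhook = cert_fingerprints(base64.b64decode(webhook_ca_b64).decode())
    return bool(injected & webhook)
```

If bundles_share_ca returns False for the two values, the API server is validating the webhook against a CA that did not issue its certificate, which would be consistent with the "certificate signed by unknown authority" message. A True result does not prove the chain is valid; it only rules out an obvious mismatch.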