CrashLoopBackOff in Prometheus AlertManager

Time: 2018-11-29 18:50:14

Tags: docker kubernetes prometheus prometheus-alertmanager prometheus-operator

I am trying to set up AlertManager for my Kubernetes cluster. I followed this document (https://github.com/coreos/prometheus-operator/blob/master/Documentation/user-guides/getting-started.md), and everything worked.

To set up AlertManager, I am working through this document (https://github.com/coreos/prometheus-operator/blob/master/Documentation/user-guides/alerting.md).

I get a CrashLoopBackOff on the AlertManager pod. Please check the attached logs:

First image: alertmanager-example-0

Second image: $ kubectl logs -f prometheus-operator-88fcf6d95-zctgw -n monitoring

Can anyone point out what I am doing wrong? Thanks in advance.

1 Answer:

Answer 0 (score: 1)

It sounds like an RBAC problem: the Service Account used by your AlertManager pods (system:serviceaccount:monitoring:prometheus-operator) does not have enough permissions to talk to the kube-apiserver.
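One way to test this diagnosis is to ask the API server directly what that service account may do. The sketch below uses `kubectl auth can-i` with `--as` impersonation; the helper name `check_perm` and the three verb/resource pairs are illustrative choices, not part of the original answer, and impersonation assumes your own user is allowed to impersonate service accounts (cluster admins usually are).

```shell
#!/bin/sh
# Service account named in the answer.
SA="system:serviceaccount:monitoring:prometheus-operator"

# check_perm VERB RESOURCE
# Prints "VERB RESOURCE: " followed by kubectl's yes/no verdict for $SA.
# (check_perm is a hypothetical helper name used only in this sketch.)
check_perm() {
  printf '%s %s: ' "$1" "$2"
  kubectl auth can-i "$1" "$2" --as="$SA"
}

# Query the cluster only when kubectl is actually on PATH.
if command -v kubectl >/dev/null 2>&1; then
  check_perm list pods
  check_perm create statefulsets.apps
  check_perm get secrets
else
  echo "kubectl not found; skipping permission checks"
fi
```

Any line ending in `no` points at a rule missing from the role bound to the service account.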

In your case, the Prometheus Operator has a ClusterRoleBinding named prometheus-operator that looks like this:

$ kubectl get clusterrolebinding prometheus-operator -o=yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app: prometheus-operator
  name: prometheus-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-operator
subjects:
- kind: ServiceAccount
  name: prometheus-operator
  namespace: monitoring

More importantly, the ClusterRole should look like this:

$ kubectl get clusterrole prometheus-operator -o=yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app: prometheus-operator
  name: prometheus-operator
rules:
- apiGroups:
  - extensions
  resources:
  - thirdpartyresources
  verbs:
  - '*'
- apiGroups:
  - apiextensions.k8s.io
  resources:
  - customresourcedefinitions
  verbs:
  - '*'
- apiGroups:
  - monitoring.coreos.com
  resources:
  - alertmanager
  - alertmanagers
  - prometheus
  - prometheuses
  - service-monitor
  - servicemonitors
  - prometheusrules
  verbs:
  - '*'
- apiGroups:
  - apps
  resources:
  - statefulsets
  verbs:
  - '*'
- apiGroups:
  - ""
  resources:
  - configmaps
  - secrets
  verbs:
  - '*'
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - list
  - delete
- apiGroups:
  - ""
  resources:
  - services
  - endpoints
  verbs:
  - get
  - create
  - update
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - namespaces
  verbs:
  - list
  - watch
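If the ClusterRole in your cluster is missing some of these rules, one possible way to bring it in line is to save the manifest above (without the leading `$ kubectl get ...` line) to a file and apply it. This is only a sketch: the file name `prometheus-operator-clusterrole.yaml` and the helper `apply_rbac` are hypothetical, and the pod listing searches all namespaces because the thread does not say which namespace the AlertManager pod runs in.

```shell
#!/bin/sh
# Hypothetical file name: save the ClusterRole manifest to this path first,
# or pass a different path as the first argument.
MANIFEST="${1:-prometheus-operator-clusterrole.yaml}"

# apply_rbac FILE
# Applies the manifest, then lists AlertManager pods in every namespace
# so their status and restart count can be compared before and after.
apply_rbac() {
  kubectl apply -f "$1" &&
    kubectl get pods --all-namespaces | grep alertmanager
}

# Run only when both kubectl and the manifest file are available.
if command -v kubectl >/dev/null 2>&1 && [ -f "$MANIFEST" ]; then
  apply_rbac "$MANIFEST"
else
  echo "skipping: need kubectl on PATH and $MANIFEST on disk"
fi
```

A pod that was crash-looping may take a while to be restarted after the permissions change, so the listing is worth repeating.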