在AWS上使用kops install k8s群集。
使用Helm
安装Prometheus
:
$ helm install stable/prometheus \
--set server.persistentVolume.enabled=false \
--set alertmanager.persistentVolume.enabled=false
然后按照此注释执行port-forward
:
Get the Prometheus server URL by running these commands in the same shell:
export POD_NAME=$(kubectl get pods --namespace default -l "app=prometheus,component=server" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace default port-forward $POD_NAME 9090
我在AWS上的EC2实例公共IP是12.29.43.14
(不是这样)。当我尝试从浏览器访问它时:
http://12.29.43.14:9090
无法访问该页面。为什么呢?
另一个问题是,在安装prometheus
图表后,alertmanager
窗格未运行:
ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff-qhhw4 1/2 CrashLoopBackOff 1 9s
ungaged-woodpecker-prometheus-kube-state-metrics-5fd97698cktsj5 1/1 Running 0 9s
ungaged-woodpecker-prometheus-node-exporter-45jtn 1/1 Running 0 9s
ungaged-woodpecker-prometheus-node-exporter-ztj9w 1/1 Running 0 9s
ungaged-woodpecker-prometheus-pushgateway-57b67c7575-c868b 0/1 Running 0 9s
ungaged-woodpecker-prometheus-server-7f858db57-w5h2j 1/2 Running 0 9s
检查窗格详情:
$ kubectl describe po ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff-qhhw4
Name: ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff-qhhw4
Namespace: default
Node: ip-100.200.0.1.ap-northeast-1.compute.internal/100.200.0.1
Start Time: Fri, 26 Jan 2018 02:45:10 +0000
Labels: app=prometheus
component=alertmanager
pod-template-hash=2959465499
release=ungaged-woodpecker
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff","uid":"ec...
kubernetes.io/limit-ranger=LimitRanger plugin set: cpu request for container prometheus-alertmanager; cpu request for container prometheus-alertmanager-configmap-reload
Status: Running
IP: 100.96.6.91
Created By: ReplicaSet/ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff
Controlled By: ReplicaSet/ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff
Containers:
prometheus-alertmanager:
Container ID: docker://e9fe9d7bd4f78354f2c072d426fa935d955e0d6748c4ab67ebdb84b51b32d720
Image: prom/alertmanager:v0.9.1
Image ID: docker-pullable://prom/alertmanager@sha256:ed926b227327eecfa61a9703702c9b16fc7fe95b69e22baa656d93cfbe098320
Port: 9093/TCP
Args:
--config.file=/etc/config/alertmanager.yml
--storage.path=/data
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Fri, 26 Jan 2018 02:45:26 +0000
Finished: Fri, 26 Jan 2018 02:45:26 +0000
Ready: False
Restart Count: 2
Requests:
cpu: 100m
Readiness: http-get http://:9093/%23/status delay=30s timeout=30s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/data from storage-volume (rw)
/etc/config from config-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-wppzm (ro)
prometheus-alertmanager-configmap-reload:
Container ID: docker://9320a0f157aeee7c3947027667aa6a2e00728d7156520c19daec7f59c1bf6534
Image: jimmidyson/configmap-reload:v0.1
Image ID: docker-pullable://jimmidyson/configmap-reload@sha256:2d40c2eaa6f435b2511d0cfc5f6c0a681eeb2eaa455a5d5ac25f88ce5139986e
Port: <none>
Args:
--volume-dir=/etc/config
--webhook-url=http://localhost:9093/-/reload
State: Running
Started: Fri, 26 Jan 2018 02:45:11 +0000
Ready: True
Restart Count: 0
Requests:
cpu: 100m
Environment: <none>
Mounts:
/etc/config from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-wppzm (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: ungaged-woodpecker-prometheus-alertmanager
Optional: false
storage-volume:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
default-token-wppzm:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-wppzm
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.alpha.kubernetes.io/notReady:NoExecute for 300s
node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 34s default-scheduler Successfully assigned ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff-qhhw4 to ip-100.200.0.1.ap-northeast-1.compute.internal
Normal SuccessfulMountVolume 34s kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal MountVolume.SetUp succeeded for volume "storage-volume"
Normal SuccessfulMountVolume 34s kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal MountVolume.SetUp succeeded for volume "config-volume"
Normal SuccessfulMountVolume 34s kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal MountVolume.SetUp succeeded for volume "default-token-wppzm"
Normal Pulled 33s kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal Container image "jimmidyson/configmap-reload:v0.1" already present on machine
Normal Created 33s kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal Created container
Normal Started 33s kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal Started container
Normal Pulled 18s (x3 over 34s) kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal Container image "prom/alertmanager:v0.9.1" already present on machine
Normal Created 18s (x3 over 34s) kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal Created container
Normal Started 18s (x3 over 33s) kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal Started container
Warning BackOff 2s (x4 over 32s) kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal Back-off restarting failed container
Warning FailedSync 2s (x4 over 32s) kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal Error syncing pod
不确定原因FailedSync
。
答案 0 :(得分:1)
当您使用该命令执行kubectl port-forward
时,它会使您的localhost上的端口可用。所以运行命令然后点击http://localhost:9090。
您将无法直接从群集外的公共IP命中prometheus端口。从长远来看,您可能希望通过入口(图表支持)将prometheus暴露在一个不错的域名中,这就是我如何做到的。要使用图表对入口的支持,您需要在群集中安装入口控制器(例如nginx入口控制器),然后通过设置--set service.ingress.enabled=true
和--set server.ingress.hosts[0]=prometheus.yourdomain.com
来启用入口。 Ingress本身就是一个相当大的话题,所以我只想把你推荐给那个官方文档:
https://kubernetes.io/docs/concepts/services-networking/ingress/
这是nginx入口控制器:
https://github.com/kubernetes/ingress-nginx
对于显示FailedSync
的广告连播,请使用kubectl logs ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff-qhhw4
查看日志,看看是否有任何其他信息。