fluentd cannot connect to elasticsearch in the cluster

Asked: 2021-07-23 22:42:14

Tags: elasticsearch kubernetes fluentd rke

I am trying to set up an EFK stack. Elasticsearch and Kibana work fine in the default namespace, but the Fluentd containers cannot connect to Elasticsearch.

kubectl get services -n default
NAME                            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
elasticsearch-master            ClusterIP   10.43.40.136    <none>        9200/TCP,9300/TCP   92m
elasticsearch-master-headless   ClusterIP   None            <none>        9200/TCP,9300/TCP   92m
kibana-kibana                   ClusterIP   10.43.152.189   <none>        5601/TCP            74m
kubernetes                      ClusterIP   10.43.0.1       <none>        443/TCP             14d

I installed fluentd from this repo and changed the URL to point to elasticsearch:

https://github.com/fluent/fluentd-kubernetes-daemonset/blob/master/fluentd-daemonset-elasticsearch-rbac.yaml

kubectl -n kube-system get pods | grep fluentd
fluentd-4fd2s                                1/1     Running     0          51m
fluentd-7t2v5                                1/1     Running     0          49m
fluentd-dfnfg                                1/1     Running     0          50m
fluentd-lvrsv                                1/1     Running     0          48m
fluentd-rv4td                                1/1     Running     0          50m

But the logs tell me:

2021-07-23 21:38:59 +0000 [info]: starting fluentd-1.13.2 pid=7 ruby="2.6.8"
2021-07-23 21:38:59 +0000 [info]: spawn command to main:  cmdline=["/usr/local/bin/ruby", "-Eascii-8bit:ascii-8bit", "/fluentd/vendor/bundle/ruby/2.6.0/bin/fluentd", "-c", "/fluentd/etc/fluent.conf", "-p", "/fluentd/plugins", "--gemfile", "/fluentd/Gemfile", "-r", "/fluentd/vendor/bundle/ruby/2.6.0/gems/fluent-plugin-elasticsearch-5.0.5/lib/fluent/plugin/elasticsearch_simple_sniffer.rb", "--under-supervisor"]
2021-07-23 21:39:01 +0000 [info]: adding match in @FLUENT_LOG pattern="fluent.**" type="null"
2021-07-23 21:39:01 +0000 [info]: adding filter pattern="kubernetes.**" type="kubernetes_metadata"
2021-07-23 21:39:01 +0000 [warn]: #0 [filter_kube_metadata] !! The environment variable 'K8S_NODE_NAME' is not set to the node name which can affect the API server and watch efficiency !!
2021-07-23 21:39:01 +0000 [info]: adding match pattern="**" type="elasticsearch"
2021-07-23 21:39:09 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. connect_write timeout reached
2021-07-23 21:39:09 +0000 [warn]: #0 [out_es] Remaining retry: 14. Retry to communicate after 2 second(s).
2021-07-23 21:39:18 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. connect_write timeout reached
2021-07-23 21:39:18 +0000 [warn]: #0 [out_es] Remaining retry: 13. Retry to communicate after 4 second(s).
2021-07-23 21:39:31 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. connect_write timeout reached
2021-07-23 21:39:31 +0000 [warn]: #0 [out_es] Remaining retry: 12. Retry to communicate after 8 second(s).
2021-07-23 21:39:52 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. connect_write timeout reached
2021-07-23 21:39:52 +0000 [warn]: #0 [out_es] Remaining retry: 11. Retry to communicate after 16 second(s).
2021-07-23 21:40:29 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. connect_write timeout reached
2021-07-23 21:40:29 +0000 [warn]: #0 [out_es] Remaining retry: 10. Retry to communicate after 32 second(s).
2021-07-23 21:41:38 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. connect_write timeout reached

I installed dig in one of the fluentd pods, and the service name resolves correctly:

root@fluentd-dfnfg:/home/fluent# nslookup elasticsearch-master.default.svc.cluster.local
Server:     10.43.0.10
Address:    10.43.0.10#53

Name:   elasticsearch-master.default.svc.cluster.local
Address: 10.43.40.136
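
Since the name resolves, the open question is whether a TCP connection to port 9200 can actually be opened from the pod, which is what the `connect_write timeout reached` warning is about. Below is a minimal sketch of such a check. It assumes Python 3 is available in the pod where it runs (the fluentd image may not include it), and the helper name `can_connect` is made up for illustration.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError as exc:
        # OSError covers DNS failures, refused connections, and timeouts.
        print(f"connect to {host}:{port} failed: {exc!r}")
        return False
```

Calling `can_connect("elasticsearch-master.default.svc.cluster.local", 9200)` from a fluentd pod would exercise the same path as the output plugin. A timeout (rather than a refused connection) would suggest that packets are being dropped somewhere between the pod and the service.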

I'm out of ideas.

PS: I am using a hardened RKE2. (https://github.com/rancherfederal/rke2-ansible)

0 answers:

No answers