I have deployed Elasticsearch 2.2.0 and am now using td-agent 2.3.0-0 to send logs to it.
The last match block in the tag chain is:
<match extra.geoip.processed5.**>
  type copy
  <store>
    type file
    path /var/log/td-agent/sp_l5
    time_slice_format %Y%m%d
    time_slice_wait 10m
    time_format %Y%m%dT%H%M%S%z
    compress gzip
    utc
  </store>
  <store>
    type elasticsearch
    host 11.0.0.174
    port 9200
    logstash_format true
    logstash_prefix logstash_business
    logstash_dateformat %Y.%m
    flush_interval 5s
  </store>
</match>
I have now added a timeout to the elasticsearch store block:
request_timeout 45s
This is td-agent.log with debug enabled:
2016-02-08 15:58:07 +0100 [info]: plugin/in_syslog.rb:176:listen: listening syslog socket on 0.0.0.0:5514 with udp
2016-02-08 15:59:03 +0100 [info]: plugin/out_elasticsearch.rb:77:client: Connection opened to Elasticsearch cluster => {:host=>"11.0.0.174", :port=>9200, :scheme=>"http"}
2016-02-08 15:59:03 +0100 [info]: plugin/out_elasticsearch.rb:77:client: Connection opened to Elasticsearch cluster => {:host=>"11.0.0.174", :port=>9200, :scheme=>"http"}
2016-02-08 16:03:33 +0100 [warn]: plugin/out_elasticsearch.rb:200:rescue in send: Could not push logs to Elasticsearch, resetting connection and trying again. read timeout reached
2016-02-08 16:03:33 +0100 [warn]: plugin/out_elasticsearch.rb:200:rescue in send: Could not push logs to Elasticsearch, resetting connection and trying again. read timeout reached
2016-02-08 16:03:35 +0100 [info]: plugin/out_elasticsearch.rb:77:client: Connection opened to Elasticsearch cluster => {:host=>"11.0.0.174", :port=>9200, :scheme=>"http"}
2016-02-08 16:03:35 +0100 [info]: plugin/out_elasticsearch.rb:77:client: Connection opened to Elasticsearch cluster => {:host=>"11.0.0.174", :port=>9200, :scheme=>"http"}
2016-02-08 16:08:05 +0100 [warn]: plugin/out_elasticsearch.rb:200:rescue in send: Could not push logs to Elasticsearch, resetting connection and trying again. read timeout reached
2016-02-08 16:08:05 +0100 [warn]: plugin/out_elasticsearch.rb:200:rescue in send: Could not push logs to Elasticsearch, resetting connection and trying again. read timeout reached
2016-02-08 16:08:09 +0100 [info]: plugin/out_elasticsearch.rb:77:client: Connection opened to Elasticsearch cluster => {:host=>"11.0.0.174", :port=>9200, :scheme=>"http"}
2016-02-08 16:08:09 +0100 [info]: plugin/out_elasticsearch.rb:77:client: Connection opened to Elasticsearch cluster => {:host=>"11.0.0.174", :port=>9200, :scheme=>"http"}
2016-02-08 16:12:40 +0100 [warn]: fluent/output.rb:354:rescue in try_flush: temporarily failed to flush the buffer. next_retry=2016-02-08 15:59:04 +0100 error_class="Fluent::ElasticsearchOutput::ConnectionFailure" error="Could not push logs to Elasticsearch after 2 retries. read timeout reached" plugin_id="object:fd9738"
2016-02-08 16:12:40 +0100 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-elasticsearch-1.3.0/lib/fluent/plugin/out_elasticsearch.rb:204:in `rescue in send'
2016-02-08 16:12:40 +0100 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-elasticsearch-1.3.0/lib/fluent/plugin/out_elasticsearch.rb:194:in `send'
2016-02-08 16:12:40 +0100 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-elasticsearch-1.3.0/lib/fluent/plugin/out_elasticsearch.rb:188:in `write'
2016-02-08 16:12:40 +0100 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.19/lib/fluent/buffer.rb:345:in `write_chunk'
2016-02-08 16:12:40 +0100 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.19/lib/fluent/buffer.rb:324:in `pop'
2016-02-08 16:12:40 +0100 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.19/lib/fluent/output.rb:321:in `try_flush'
2016-02-08 16:12:40 +0100 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.19/lib/fluent/output.rb:140:in `run'
2016-02-08 16:12:40 +0100 [warn]: fluent/output.rb:354:rescue in try_flush: temporarily failed to flush the buffer. next_retry=2016-02-08 15:59:04 +0100 error_class="Fluent::ElasticsearchOutput::ConnectionFailure" error="Could not push logs to Elasticsearch after 2 retries. read timeout reached" plugin_id="object:1034980"
2016-02-08 16:12:40 +0100 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-elasticsearch-1.3.0/lib/fluent/plugin/out_elasticsearch.rb:204:in `rescue in send'
2016-02-08 16:12:40 +0100 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-elasticsearch-1.3.0/lib/fluent/plugin/out_elasticsearch.rb:194:in `send'
2016-02-08 16:12:40 +0100 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-elasticsearch-1.3.0/lib/fluent/plugin/out_elasticsearch.rb:188:in `write'
2016-02-08 16:12:40 +0100 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.19/lib/fluent/buffer.rb:345:in `write_chunk'
2016-02-08 16:12:40 +0100 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.19/lib/fluent/buffer.rb:324:in `pop'
2016-02-08 16:12:40 +0100 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.19/lib/fluent/output.rb:321:in `try_flush'
2016-02-08 16:12:40 +0100 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.19/lib/fluent/output.rb:140:in `run'
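To make the timing in the log above easier to read, here is a minimal Python sketch that measures how long each attempt hangs between "Connection opened" and the following "read timeout reached". The timestamps are copied from the log; the "opened"/"timeout" labels and the function names are my own, chosen only for illustration.

```python
from datetime import datetime

# Timestamps copied from the td-agent.log excerpt above.
# The "opened" / "timeout" labels are illustrative, not part of the log format.
LOG_EVENTS = [
    ("2016-02-08 15:59:03 +0100", "opened"),
    ("2016-02-08 16:03:33 +0100", "timeout"),
    ("2016-02-08 16:03:35 +0100", "opened"),
    ("2016-02-08 16:08:05 +0100", "timeout"),
    ("2016-02-08 16:08:09 +0100", "opened"),
    ("2016-02-08 16:12:40 +0100", "timeout"),
]


def parse(ts):
    """Parse a td-agent log timestamp such as '2016-02-08 15:59:03 +0100'."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S %z")


def stall_seconds(events):
    """Return the seconds between each 'opened' event and the next 'timeout'."""
    gaps = []
    opened_at = None
    for ts, kind in events:
        when = parse(ts)
        if kind == "opened":
            opened_at = when
        elif kind == "timeout" and opened_at is not None:
            gaps.append(int((when - opened_at).total_seconds()))
            opened_at = None
    return gaps


if __name__ == "__main__":
    print(stall_seconds(LOG_EVENTS))
```

Each attempt stalls for about four and a half minutes before failing, which is much longer than the 45s I configured. I do not know whether the fact that 270 s is six times 45 s is meaningful or a coincidence.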
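I am running this on AWS, on Ubuntu 14.04 c3.large instances. I have tested access from the td-agent machine: with curl I created an index, added a document, and deleted the index, all without any problem. (Of course, I have opened all traffic in my security group.)
Another test from the td-agent machine: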
root@bilbo:~# telnet 11.0.0.174 9200
Trying 11.0.0.174...
Connected to 11.0.0.174.
Escape character is '^]'.
GET / HTTP/1.0

HTTP/1.0 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 320

{
  "name" : "gandalf-gandalf",
  "cluster_name" : "aaaa_dev",
  "version" : {
    "number" : "2.2.0",
    "build_hash" : "8ff36d139e16f8720f2947ef62c8167a888992fe",
    "build_timestamp" : "2016-01-27T13:32:39Z",
    "build_snapshot" : false,
    "lucene_version" : "5.4.1"
  },
  "tagline" : "You Know, for Search"
}
Connection closed by foreign host.
root@bilbo:~#
With strace I can see this:
[pid 10774] connect(23, {sa_family=AF_INET, sin_port=htons(9200), sin_addr=inet_addr("11.0.0.174")}, 16) = -1 EINPROGRESS (Operation now in progress)
[pid 10774] clock_gettime(CLOCK_MONOTONIC, {4876, 531526283}) = 0
[pid 10774] select(24, NULL, [23], NULL, {45, 0}) = 1 (out [23], left {44, 999925})
[pid 10774] fcntl(23, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
[pid 10774] connect(23, {sa_family=AF_INET, sin_port=htons(9200), sin_addr=inet_addr("11.0.0.174")}, 16) = 0
[pid 10774] fcntl(23, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
[pid 10774] write(23, "POST /_bulk HTTP/1.1\r\nUser-Agent: Faraday v0.9.2\r\nHost: 11.0.0.174:9200\r\nContent-Length: 4798\r\n\r\n", 97) = 97
[pid 10774] fcntl(23, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
[pid 10774] write(23, "{\"index\":{\"_index\":\"logstash_apache-2016.02.08\",\"_type\":\"fluentd\"}}\n{\"message\":\"Feb 8 16:18:47 bilbo ::apache::PRE::access: - 10.0.0.15 - control [08/Feb/2016:16:18:47 +0100] \\\"GET /status/memcached.php HTTP/1.1\\\" 200 1785 \\\"-\\\" \\\"check_http/v2.0 (monitoring-plugins 2.0)\\\" control-pre.fluzo.com:443 0\",\"n\":\"bilbo\",\"s\":\"info\",\"f\":\"local3\",\"t\":\"Feb 8 16:18:47\",\"h\":\"bilbo\",\"a\":\"apache\",\"e\":\"PRE\",\"o\":\"access\",\"ip\":\"-\",\"ip2\":\"10.0.0.15\",\"rl\":\"-\",\"ru\":\"control\",\"rt\":\"[08/Feb/2016:16:18:47 +0100]\",\"met\":\"GET\",\"pqf\":\"status/memcached.php\",\"hv\":\"HTTP/1.1\",\"st\":\"200\",\"bs\":\"1785\",\"ref\":\"-\",\"ua\":\"check_http/v2.0 (monitoring-plugins 2.0)\",\"vh\":\"control-pre.aaaa.com\",\"p\":\"443\",\"rpt\":\"0\",\"co\":null,\"ci\":null,\"la\":null,\"lo\":null,\"ar\":null,\"dm\":null,\"re\":null,\"@timestamp\":\"2016-02-08T16:19:47+01:00\"}\n{\"index\":{\"_index\":\"logstash_apache-2016.02.08\",\"_type\":\"fluentd\"}}\n{\"message\":\"Feb 8 16:18:47 bilbo ::apache::PRE::access: - 10.0.0.15 - control [08/Feb/2016:16:18:47 +0100] \\\"GET /status/core.php HTTP/1.1\\\" 200 1898 \\\"-\\\" \\\"c"..., 4798) = 4798
So the connection does appear to be opened, and the bulk request is written to the socket.
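To take td-agent out of the picture, the same kind of request could be replayed by hand. This is only a sketch under my own assumptions: it uses Python's standard urllib rather than the Faraday client the plugin uses, the helper names are made up, and the 45-second timeout simply mirrors my request_timeout setting.

```python
import json
import urllib.request


def build_bulk_body(index, doc_type, docs):
    """Serialize docs into the newline-delimited format that /_bulk expects.

    Each document is preceded by an {"index": ...} action line, as in the
    strace output above, and the body ends with a trailing newline.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    return ("\n".join(lines) + "\n").encode("utf-8")


def post_bulk(host, port, body, timeout=45):
    """POST a bulk body; raises an exception if no response arrives within timeout seconds."""
    url = "http://%s:%d/_bulk" % (host, port)
    req = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")
```

For example, calling post_bulk("11.0.0.174", 9200, build_bulk_body("logstash_test-2016.02.08", "fluentd", [{"message": "hello"}])) should show whether a small bulk request also hangs outside td-agent. The index name logstash_test-2016.02.08 is a placeholder, not one of my real indices.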
To avoid confusion, I also installed td-agent on the Elasticsearch machine itself, with the same output.
This is my Elasticsearch configuration:
### MANAGED BY PUPPET ###
---
bootstrap:
  mlockall: true
cluster:
  name: aaaa0_dev
discovery:
  zen:
    minimum_master_nodes: 1
    ping:
      multicast:
        enabled: false
      unicast:
        hosts:
          - 11.0.0.174
gateway:
  expected_nodes: 1
  recover_after_nodes: 1
  recover_after_time: 5m
hostname: gandalf
http:
  compression: true
index:
  store:
    compress:
      stored: true
    type: niofs
network:
  bind_host: 11.0.0.174
  publish_host: 11.0.0.174
node:
  name: gandalf-gandalf
path:
  data: /var/lib/elasticsearch-gandalf
  logs: /var/log/elasticsearch/gandalf
transport:
  tcp:
    compress: true
Any ideas?
Thanks.
UPDATE
It works against a real Elasticsearch cluster (3 nodes).
This is the configuration:
### MANAGED BY PUPPET ###
---
bootstrap:
  mlockall: true
cluster:
  name: aaaa
discovery:
  zen:
    minimum_master_nodes: 2
    ping:
      multicast:
        enabled: false
      unicast:
        hosts:
          - el0
          - el1
          - el2
gateway:
  expected_nodes: 3
  recover_after_nodes: 2
  recover_after_time: 5m
hostname: kili
http:
  compression: true
index:
  store:
    compress:
      stored: true
    type: niofs
network:
  bind_host: 11.0.0.253
  publish_host: 11.0.0.253
node:
  name: kili-kili
path:
  data:
    - /var/lib/elasticsearch0
    - /var/lib/elasticsearch1
  logs: /var/log/elasticsearch/kili
transport:
  tcp:
    compress: true
However, Kibana does manage to create its .kibana index on the single-node Elasticsearch. I have also tested that node with curl.