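I'm trying to set up an ELK stack that receives indexes from Kafka consumers running in a separate Docker container.
The structure is:
A Kafka producer (running locally for testing) sends messages to the Kafka cluster (http://xxx.xxx.xxx.xxx:9000). The cluster forwards the messages to the Kafka consumer (Docker container 1). The consumer runs a Nutch crawl on the URL in the Kafka message. At the end, the crawl should be sent via the indexer to the ELK stack (Docker container 2), but this step fails.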
Segment dir is complete: crawl2019-05-09150441/segments/20190509150506.
Indexer: starting at 2019-05-09 15:05:56
Indexer: deleting gone documents: false
Indexer: URL filtering: false
Indexer: URL normalizing: false
No exchange was configured. The documents will be routed to all index writers.
Active IndexWriters :
ElasticRestIndexWriter:
...
Indexing job did not succeed, job status:FAILED, reason: NA
Indexer: java.lang.RuntimeException: Indexing job did not succeed, job status:FAILED, reason: NA
at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:150)
at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:231)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:240)
Error running:
/WebContentExtraction/NutchScript/nutch/dist/apache-nutch-1.16-SNAPSHOT-bin/bin/nutch index crawl2019-05-09150441/crawldb -linkdb crawl2019-05-09150441/linkdb crawl2019-05-09150441/segments/20190509150506
Failed with exit value 255.
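Both Docker containers are running on localhost, and I'm using the default ports for the ELK stack.
Nutch version: 1.16 (master)
ELK version: sebp/elk:622
I'm using the elastic.rest indexer and have tried several different versions of the ELK stack.
I'm restricted to the master branch of Nutch because it has a plugin that I need.
Kafka consumer run command: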
docker run -dit \
--name kafka1 \
-e HOST=localhost \
-e PORT=9200 \
-e INDEX=nutch \
-e TOPIC=health \
-e INDEXER=elastic.rest \
wce ./startup
ELK stack setup:
docker-compose -f elkDockerComp.yaml up elk
elkDockerComp.yaml:
elk:
  image: sebp/elk:622
  ports:
    - "5601:5601"
    - "9200:9200"
    - "5044:5044"
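I think this is a networking issue between the Docker containers, but I can't seem to find a solution.
I've added both containers to a Docker network, and a ping from the consumer reaches the ELK stack, so they can now reach each other.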
docker network inspect myNet
[
    {
        "Name": "myNet",
        "Id": "ecb47fb89a70ce4979420663287e67c2324dbd3ccf2367ead18261eb3e6089d8",
        "Created": "2019-05-09T15:57:19.3169609Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "172.19.0.0/16",
                    "Gateway": "172.19.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "5b3764610f448098124b6a6dc2a28621858795394f3f280e6acc9aed8c13c0bf": {
                "Name": "kafka1",
                "EndpointID": "7380523f048bc516e6b0eb5811f3314804d53e0f758c1ef0aa9f7b049b1f7b4b",
                "MacAddress": "02:42:ac:13:00:02",
                "IPv4Address": "172.19.0.2/16",
                "IPv6Address": ""
            },
            "87b87f2a62abeca39babd592ffc02025397e3d7207eea603168475f578c25002": {
                "Name": "scripts_elk_1",
                "EndpointID": "914aee56fe28f9b0182aeea5a6091ca6c0b6e4ef83f6dcf22564e554047ac858",
                "MacAddress": "02:42:ac:13:00:03",
                "IPv4Address": "172.19.0.3/16",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {}
    }
]
Ping (from inside the consumer to ELK):
docker exec -it kafka1 ping 172.19.0.3
PING 172.19.0.3 (172.19.0.3) 56(84) bytes of data.
64 bytes from 172.19.0.3: icmp_seq=1 ttl=64 time=0.228 ms
64 bytes from 172.19.0.3: icmp_seq=2 ttl=64 time=0.125 ms
64 bytes from 172.19.0.3: icmp_seq=3 ttl=64 time=0.128 ms
64 bytes from 172.19.0.3: icmp_seq=4 ttl=64 time=0.138 ms
64 bytes from 172.19.0.3: icmp_seq=5 ttl=64 time=0.155 ms
^C
--- 172.19.0.3 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4146ms
rtt min/avg/max/mdev = 0.125/0.154/0.228/0.041 ms
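A successful ping only shows that ICMP traffic gets through; it does not show that Elasticsearch is accepting TCP connections on port 9200. Below is a minimal sketch of a TCP-level check that could be run from inside the consumer container, assuming Python 3 is available there. The function name `can_connect` is my own and is not part of Nutch or the `wce` image.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `can_connect("172.19.0.3", 9200)` would test the ELK container's address from the inspect output above, while `can_connect("localhost", 9200)` would test what the consumer is actually configured to use.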
I ended up making one large docker-compose file and putting both containers on the same network:
version: "3.7"
services:
  elk:
    image: sebp/elk:670
    container_name: elk1
    ports:
      - "5601:5601"
      - "9200:9200"
      - "5044:5044"
    networks:
      - myNet
  kafka:
    image: wce:latest
    container_name: kafka1
    environment:
      - HOST=172.20.0.2
      - PORT=9200
      - INDEX=nutch
      - TOPIC=health
      - INDEXER=elastic.rest
    command:
      tail -F startup
    networks:
      - myNet
networks:
  myNet:
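I had also forgotten that Docker containers are not localhost relative to each other. So I changed the HOST value in index-writer.xml to the new, correct IP, and it works as expected.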
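The underlying problem can be sketched in a few lines: inside a container, `localhost` resolves to the container itself, so an indexer pointed at `localhost:9200` never leaves the consumer container. The helper below is hypothetical (it is not part of Nutch or the `wce` image) and assumes the `HOST` environment variable from the run command above; it flags a loopback value before a crawl starts.

```python
import ipaddress
import os


def is_loopback(host: str) -> bool:
    """Return True if host is 'localhost' or a loopback IP literal."""
    host = host.strip()
    if host.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, e.g. a container or service name.
        return False


def check_index_host() -> str:
    """Read HOST from the environment and reject loopback values."""
    host = os.environ.get("HOST", "localhost")
    if is_loopback(host):
        raise ValueError(f"HOST={host!r} points at this container, not at ELK")
    return host
```

On a user-defined network such as myNet, Docker also resolves container names, so a name like `elk1` can generally be used as the host instead of a hard-coded IP, which may change between runs.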