I have a (development) single-node Elasticsearch server running under Docker on Windows 10. Creating the server and the index completed successfully. After populating the index with roughly 1.5 million documents, the shard fails, leaving the Elasticsearch index in red status:
{"log":"[2018-12-29T18:04:31,279][WARN ][o.e.i.e.Engine ] [elasticsearch01] [epg_v21][0] failed to rollback writer on close\n","stream":"stdout","time":"2018-12-29T18:04:31.2819642Z"}
{"log":"java.nio.file.NoSuchFileException: /usr/share/elasticsearch/data/nodes/0/indices/VBTjn1OcSvSJq6iEnqmLAg/0/index/_4o.cfs\n","stream":"stdout","time":"2018-12-29T18:04:31.2820693Z"}
{"log":"\u0009at sun.nio.fs.UnixException.translateToIOException(UnixException.java:92) ~[?:?]\n","stream":"stdout","time":"2018-12-29T18:04:31.2820818Z"}
{"log":"\u0009at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) ~[?:?]\n","stream":"stdout","time":"2018-12-29T18:04:31.2820904Z"}
{"log":"\u0009at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116) ~[?:?]\n","stream":"stdout","time":"2018-12-29T18:04:31.2820977Z"}
{"log":"\u0009at sun.nio.fs.UnixFileSystemProvider.implDelete(UnixFileSystemProvider.java:245) ~[?:?]\n","stream":"stdout","time":"2018-12-29T18:04:31.2821048Z"}
{"log":"\u0009at sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:105) ~[?:?]\n","stream":"stdout","time":"2018-12-29T18:04:31.2821222Z"}
{"log":"\u0009at java.nio.file.Files.delete(Files.java:1141) ~[?:?]\n","stream":"stdout","time":"2018-12-29T18:04:31.2821298Z"}
{"log":"\u0009at org.apache.lucene.store.FSDirectory.privateDeleteFile(FSDirectory.java:371) ~[lucene-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:01:13]\n","stream":"stdout","time":"2018-12-29T18:04:31.2821359Z"}
{"log":"\u0009at org.apache.lucene.store.FSDirectory.deleteFile(FSDirectory.java:340) ~[lucene-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:01:13]\n","stream":"stdout","time":"2018-12-29T18:04:31.2821425Z"}
{"log":"\u0009at org.apache.lucene.store.FilterDirectory.deleteFile(FilterDirectory.java:63) ~[lucene-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:01:13]\n","stream":"stdout","time":"2018-12-29T18:04:31.2821493Z"}
{"log":"\u0009at org.elasticsearch.index.store.ByteSizeCachingDirectory.deleteFile(ByteSizeCachingDirectory.java:175) ~[elasticsearch-6.5.1.jar:6.5.1]\n","stream":"stdout","time":"2018-12-29T18:04:31.2821562Z"}
{"log":"\u0009at org.apache.lucene.store.FilterDirectory.deleteFile(FilterDirectory.java:63) ~[lucene-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:01:13]\n","stream":"stdout","time":"2018-12-29T18:04:31.282163Z"}
{"log":"\u0009at org.elasticsearch.index.store.Store$StoreDirectory.deleteFile(Store.java:733) ~[elasticsearch-6.5.1.jar:6.5.1]\n","stream":"stdout","time":"2018-12-29T18:04:31.2821703Z"}
{"log":"\u0009at org.elasticsearch.index.store.Store$StoreDirectory.deleteFile(Store.java:738) ~[elasticsearch-6.5.1.jar:6.5.1]\n","stream":"stdout","time":"2018-12-29T18:04:31.2821764Z"}
{"log":"\u0009at org.apache.lucene.store.LockValidatingDirectoryWrapper.deleteFile(LockValidatingDirectoryWrapper.java:38) ~[lucene-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:01:13]\n","stream":"stdout","time":"2018-12-29T18:04:31.2821833Z"}
{"log":"\u0009at org.apache.lucene.index.IndexFileDeleter.deleteFile(IndexFileDeleter.java:696) ~[lucene-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:01:13]\n","stream":"stdout","time":"2018-12-29T18:04:31.2821921Z"}
{"log":"\u0009at org.apache.lucene.index.IndexFileDeleter.deleteFiles(IndexFileDeleter.java:690) ~[lucene-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:01:13]\n","stream":"stdout","time":"2018-12-29T18:04:31.2822Z"}
{"log":"\u0009at org.apache.lucene.index.IndexFileDeleter.refresh(IndexFileDeleter.java:449) ~[lucene-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:01:13]\n","stream":"stdout","time":"2018-12-29T18:04:31.2822075Z"}
{"log":"\u0009at org.apache.lucene.index.IndexWriter.rollbackInternalNoCommit(IndexWriter.java:2339) ~[lucene-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:01:13]\n","stream":"stdout","time":"2018-12-29T18:04:31.2822259Z"}
{"log":"\u0009at org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2281) ~[lucene-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:01:13]\n","stream":"stdout","time":"2018-12-29T18:04:31.2822336Z"}
{"log":"\u0009at org.apache.lucene.index.IndexWriter.rollback(IndexWriter.java:2274) ~[lucene-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:01:13]\n","stream":"stdout","time":"2018-12-29T18:04:31.2822401Z"}
{"log":"\u0009at org.elasticsearch.index.engine.InternalEngine.closeNoLock(InternalEngine.java:2093) [elasticsearch-6.5.1.jar:6.5.1]\n","stream":"stdout","time":"2018-12-29T18:04:31.2822888Z"}
...
{"log":"\u0009at java.lang.Thread.run(Thread.java:834) [?:?]\n","stream":"stdout","time":"2018-12-29T18:04:31.3199944Z"}
{"log":"[2018-12-29T18:04:31,319][INFO ][o.e.c.r.a.AllocationService] [elasticsearch01] Cluster health status changed from [GREEN] to [RED] (reason: [shards failed [[epg_v21][0]] ...]).\n","stream":"stdout","time":"2018-12-29T18:04:31.3293225Z"}
{"log":"[2018-12-29T18:07:50,808][WARN ][o.e.i.e.Engine ] [elasticsearch01] [epg_v21][0] failed to rollback writer on close\n","stream":"stdout","time":"2018-12-29T18:07:50.810894Z"}
{"log":"java.nio.file.NoSuchFileException: /usr/share/elasticsearch/data/nodes/0/indices/VBTjn1OcSvSJq6iEnqmLAg/0/index/_7.cfs\n","stream":"stdout","time":"2018-12-29T18:07:50.8109745Z"}
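For readability: the entries above come straight from Docker's json-file logging driver, where the actual log line sits in the `log` field (tabs encoded as `\u0009`). A minimal Python sketch to unwrap one such entry (the sample line is copied from the output above):

```python
import json

# One raw entry from Docker's json-file logging driver (copied from the output above).
raw = ('{"log":"java.nio.file.NoSuchFileException: /usr/share/elasticsearch/'
       'data/nodes/0/indices/VBTjn1OcSvSJq6iEnqmLAg/0/index/_4o.cfs\\n",'
       '"stream":"stdout","time":"2018-12-29T18:04:31.2820693Z"}')

entry = json.loads(raw)
# "log" holds the original line, "time" the container timestamp.
print(entry["time"], entry["log"].rstrip("\n"))
```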
Watching the index during population (_cat/indices), the document count increases at first, but after this error occurs the count is reset to 0 and the index appears to be corrupted. The index status remains red after the error. Only a small number of documents were indexed successfully.
I use a Docker volume mount to make the index persistent. It seems to work after creation, but fails once the error above occurs.
When I look at the files created on the Windows side, I can see that the index was being populated at first (judging by its size), but the file the log complains about has disappeared. Creating a file on Windows also makes it show up inside the container, so the mount is still intact.
I start everything with the following docker-compose.yml:
version: "2"
services:
  elasticsearch01:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.5.1
    container_name: elasticsearch01
    ports:
      - "9200:9200"
      - "9300:9300"
    volumes:
      - /host_mnt/c/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
      - /host_mnt/c/elasticsearch/config/log4j2.properties:/usr/share/elasticsearch/config/log4j2.properties
      - /host_mnt/c/elasticsearch/lib:/usr/share/elasticsearch/data
      - /host_mnt/c/elasticsearch/log:/usr/share/elasticsearch/logs
    environment:
      - node.name=elasticsearch01
    networks:
      - esnet
networks:
  esnet:
    driver: bridge
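Would switching the data directory to a named Docker volume instead of the Windows bind mount make a difference? Something roughly like the following fragment (the volume name `esdata01` is just a placeholder); I am asking because the missing-file error suggests the bind-mounted filesystem is losing Lucene segment files:

```yaml
version: "2"
services:
  elasticsearch01:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.5.1
    volumes:
      # named volume instead of the /host_mnt/c/... bind mount for the data path
      - esdata01:/usr/share/elasticsearch/data
volumes:
  esdata01:
```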
I first tried starting docker-compose from the Windows WSL bash prompt (using the mount points from within WSL), but that had the same effect.
I use elasticdump for the bulk import. I tried several batch sizes, with the same result. Somehow the problem shows up after roughly 1,700,000 documents, no matter in what order I run the import (split up, in small chunks, in random order, or in blocks).
Other things I have tried include increasing memory, etc.
Can anyone help me fix my index corruption?