Larger index size after Elasticsearch reindex

Date: 2019-02-07 16:54:38

Tags: elasticsearch reindex

After reindexing a 75GB index, the new index comes out at 79GB.

Both indices have the same document count (54,123,676) and exactly the same mappings. The original index has 6 * 2 shards, while the new index has 3 * 2 shards.
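For reference, reindexing into a new shard layout is normally a create-then-_reindex sequence; the sketch below uses placeholder index names and an assumed replica count, not the actual values from this setup:

PUT /new_index
{
    "settings": {
        "number_of_shards": 3,
        "number_of_replicas": 1
    }
}

POST /_reindex
{
    "source": { "index": "old_index" },
    "dest": { "index": "new_index" }
}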

The original index also carries 75,857 deleted documents that were not copied across, so we're puzzled that it could be smaller than the new index at all, let alone by a whole 4GB.
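The figures below appear to come from the index stats API; requests like the following (index name is a placeholder) return the docs, store and segments sections shown:

GET /old_index/_stats

GET /_cat/segments/old_index?v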

Original index

{
    "_shards": {
        "total": 12,
        "successful": 12,
        "failed": 0
    },
    "_all": {
        "primaries": {
            "docs": {
                "count": 54123676,
                "deleted": 75857
            },
            "store": {
                "size_in_bytes": 75357819717,
                "throttle_time_in_millis": 0
            },
            ...
            "segments": {
                "count": 6,
                "memory_in_bytes": 173650124,
                "terms_memory_in_bytes": 152493380,
                "stored_fields_memory_in_bytes": 17914688,
                "term_vectors_memory_in_bytes": 0,
                "norms_memory_in_bytes": 79424,
                "points_memory_in_bytes": 2728328,
                "doc_values_memory_in_bytes": 434304,
                "index_writer_memory_in_bytes": 0,
                "version_map_memory_in_bytes": 0,
                "fixed_bit_set_memory_in_bytes": 0,
                "max_unsafe_auto_id_timestamp": -1,
                "file_sizes": {}
            }
            ...

New index

{
    "_shards": {
        "total": 6,
        "successful": 6,
        "failed": 0
    },
    "_all": {
        "primaries": {
            "docs": {
                "count": 54123676,
                "deleted": 0
            },
            "store": {
                "size_in_bytes": 79484557149,
                "throttle_time_in_millis": 0
            },
            ...
            "segments": {
                "count": 3,
                "memory_in_bytes": 166728713,
                "terms_memory_in_bytes": 145815659,
                "stored_fields_memory_in_bytes": 17870464,
                "term_vectors_memory_in_bytes": 0,
                "norms_memory_in_bytes": 37696,
                "points_memory_in_bytes": 2683802,
                "doc_values_memory_in_bytes": 321092,
                "index_writer_memory_in_bytes": 0,
                "version_map_memory_in_bytes": 0,
                "fixed_bit_set_memory_in_bytes": 0,
                "max_unsafe_auto_id_timestamp": -1,
                "file_sizes": {}
            }
            ...

Any clues?

1 answer:

Answer 0 (score: 0)

You should use a force merge. Since segments are immutable, Elasticsearch always writes new segments and only merges them together gradually in the background. The request below will help: with only_expunge_deletes it rewrites the segments that contain deleted documents and reclaims the space they occupy. Be aware that a force merge is an expensive operation, so run it during off-peak hours.

POST /_forcemerge?only_expunge_deletes=true
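
As a usage note, only_expunge_deletes only touches segments that contain deleted documents. To merge each shard down to a single segment and then verify the result, requests like the following can be used (my_index is a placeholder for the real index name):

POST /my_index/_forcemerge?max_num_segments=1

GET /_cat/segments/my_index?v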