ElasticSearch speed problem

Time: 2014-07-11 22:38:01

Tags: performance elasticsearch

We are using ElasticSearch to serve 15M records. The records are split across indices of different sizes; some of the indices hold 1.5 million records.

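We have plenty of memory (80 GB), and the entire 60 GB index fits in RAM. According to ElasticSearch's own statistics the query execution time is 7 ms, but we only receive the results from ElasticSearch after 300 ms. What is wrong here? Where can we look to find out where the time goes?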
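A minimal sketch of how the gap between the server-side and client-side timings could be measured. It assumes the search response body carries the `took` field (milliseconds, as reported by Elasticsearch), which does not cover network transfer or client-side JSON parsing. `run_search` and `fake_search` are hypothetical names introduced for illustration; `fake_search` stands in for a real HTTP call.

```python
import time


def compare_timings(run_search):
    """Run one search and compare server-side `took` with client wall-clock time.

    `run_search` is a hypothetical callable that sends the query and returns
    the parsed JSON response as a dict.
    """
    start = time.perf_counter()
    response = run_search()
    wall_ms = (time.perf_counter() - start) * 1000.0
    took_ms = response.get("took")  # milliseconds reported by Elasticsearch
    overhead_ms = None if took_ms is None else wall_ms - took_ms
    return {"took_ms": took_ms, "wall_ms": wall_ms, "overhead_ms": overhead_ms}


def fake_search():
    # Stand-in for a real HTTP request; the sleep simulates transfer and parsing.
    time.sleep(0.05)
    return {"took": 7, "hits": {"total": 0, "hits": []}}


if __name__ == "__main__":
    print(compare_timings(fake_search))
```

If `overhead_ms` is large while `took_ms` stays small, the time is probably being spent outside query execution, for example in response serialization, the network, or the client.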

ES setup:

2 Nodes on 2 different hosts

Each index has 1 primary shard; we have 2 shards per index

3,762 Total Shards

3,762 Successful Shards

85 Indices

20,347,989 Documents

40.5GB Size

elasticsearch.yml

index.cache.field.type: soft

indices.cache.filter.size: 50%

index.fielddata.cache: soft

index.cache.field.expire: 60m

indices.fielddata.cache.size: 50%

indices.fielddata.cache.expire : 60m

index.store.type: mmapfs

transport.tcp.compress: true;

bootstrap.mlockall: true

index.search.slowlog.threshold.query.warn: 10s

index.search.slowlog.threshold.query.info: 5s

index.search.slowlog.threshold.query.debug: 2s

index.search.slowlog.threshold.query.trace: 500ms
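With the slow log thresholds above, a query that takes 300 to 400 ms would stay below the lowest (trace) threshold of 500 ms and so would probably not be logged at all. The sketch below only mirrors the configured values to make that visible; `THRESHOLDS_MS` and `slowlog_level` are hypothetical names, not part of Elasticsearch.

```python
# Thresholds copied from the elasticsearch.yml above, in milliseconds.
THRESHOLDS_MS = {"warn": 10000, "info": 5000, "debug": 2000, "trace": 500}


def slowlog_level(duration_ms):
    """Return the most severe slow log level a duration would reach, or None."""
    for level in ("warn", "info", "debug", "trace"):
        if duration_ms > THRESHOLDS_MS[level]:
            return level
    return None  # below every threshold: nothing is written to the slow log


if __name__ == "__main__":
    for ms in (7, 300, 400, 600):
        print(ms, slowlog_level(ms))
```

Lowering the trace threshold below the observed response time would be one way to make such queries show up in the slow log.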

Example: we have an index for the country DE with 1.5M documents. This index has 2 shards.

ES startup:

/usr/lib/jvm/java-7-openjdk-amd64//bin/java -Xms32g -Xmx32g -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -Delasticsearch -Des.pidfile=/var/run/elasticsearch.pid -Des.path.home=/usr/share/elasticsearch -cp :/usr/share/elasticsearch/lib/elasticsearch-1.1.2.jar:/usr/share/elasticsearch/lib/*:/usr/share/elasticsearch/lib/sigar/* -Des.default.config=/etc/elasticsearch/elasticsearch.yml -Des.default.path.home=/usr/share/elasticsearch -Des.default.path.logs=/var/log/elasticsearch -Des.default.path.data=/var/lib/elasticsearch -Des.default.path.work=/tmp/elasticsearch -Des.default.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch

OS:

24 Cores

80 GB of RAM

60 GB are used

Disk space: 1,2 TB

350 GB used / 780GB free

Disc type: SAS

MySQL is also running on this machine

Example query: when searching for a city, we pass the location_id to ES:

{
    "query": {
        "match_all": {}
    },
    "sort": {},
    "facets": {
        "location_id": {
            "facet_filter": {
                "bool": {
                    "must": [{
                        "terms": {
                            "sponsored": [
                                1,
                                0
                            ]
                        }
                    }, {
                        "geo_distance": {
                            "distance": "50km",
                            "geo_point": {
                                "lat": -33.42628,
                                "lon": -70.56656
                            }
                        }
                    }]
                }
            },
            "terms": {
                "field": "location_facet",
                "all_terms": true,
                "size": 100,
                "script": "doc['geo_point'].empty ? null : ceil(doc['geo_point'].arcDistanceInKm(-33.42628,    -70.56656)) + '|' + doc['location_facet'].value\n + '|' + doc['location_id'].value"
            }
        },
        "company_id": {
            "facet_filter": {
                "bool": {
                    "must": [{
                        "terms": {
                            "sponsored": [
                                1,
                                0
                            ]
                        }
                    }, {
                        "geo_distance": {
                            "distance": "50km",
                            "geo_point": {
                                "lat": -33.42628,
                                "lon": -70.56656
                            }
                        }
                    }, {
                        "terms": {
                            "location_id": [
                                25717
                            ]
                        }
                    }]
                }
            },
            "terms": {
                "field": "company_facet",
                "order": "count",
                "script": "doc['company_facet'].value + '|' + doc['company_id'].value"
            }
        },
        "job_type_id": {
            "facet_filter": {
                "bool": {
                    "must": [{
                        "terms": {
                            "sponsored": [
                                1,
                                0
                            ]
                        }
                    }, {
                        "geo_distance": {
                            "distance": "50km",
                            "geo_point": {
                                "lat": -33.42628,
                                "lon": -70.56656
                            }
                        }
                    }]
                }
            },
            "terms": {
                "field": "jobtype_facet",
                "order": "term",
                "all_terms": true
            }
        }
    },
    "filter": {},
    "size": 10,
    "from": 0,
    "explain": false,
    "highlight": {
        "order": "score",
        "require_field_match": false,
        "pre_tags": [
            "<b>"
        ],
        "post_tags": [
            "</b>"
        ],
        "fields": {
            "description": {
                "type": "fvh",
                "force_source": true,
                "no_match_size": 200,
                "index_options": "offsets",
                "fragment_size": 200,
                "number_of_fragments": 2,
                "matched_fields": [
                    "description",
                    "title"
                ]
            }
        }
    }
}

Response time for this query: > 400 ms, which is very slow. We also disabled the facets, but nothing changed.

1 answer:

Answer 0 (score: 0)

For a single point, a "geo_bounding_box" filter may be faster than "geo_distance".
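A minimal sketch of how the suggested filter could be built from the same centre point and radius used in the question. It assumes the `top_left` / `bottom_right` form of the `geo_bounding_box` filter and a spherical Earth approximation; `bounding_box_filter` and `EARTH_RADIUS_KM` are hypothetical names. Note that a box also matches documents in its corners that lie farther than 50 km away, so results can differ slightly from `geo_distance`.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def bounding_box_filter(field, lat, lon, distance_km):
    """Build a geo_bounding_box filter roughly enclosing a circle around (lat, lon).

    Ignores the poles and the antimeridian, so it is only suitable for small
    radii at moderate latitudes.
    """
    dlat = math.degrees(distance_km / EARTH_RADIUS_KM)
    dlon = math.degrees(
        distance_km / (EARTH_RADIUS_KM * math.cos(math.radians(lat)))
    )
    return {
        "geo_bounding_box": {
            field: {
                "top_left": {"lat": lat + dlat, "lon": lon - dlon},
                "bottom_right": {"lat": lat - dlat, "lon": lon + dlon},
            }
        }
    }


if __name__ == "__main__":
    import json
    print(json.dumps(bounding_box_filter("geo_point", -33.42628, -70.56656, 50), indent=4))
```

The returned dict could take the place of each `geo_distance` entry inside the `bool.must` arrays of the `facet_filter` sections in the query.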