I'm working with Elasticsearch 6.2, configured as a cluster of 2 nodes.
GET _cluster/health:
{
  "cluster_name": "cluster_name",
  "status": "green",
  "timed_out": false,
  "number_of_nodes": 2,
  "number_of_data_nodes": 2,
  "active_primary_shards": 47,
  "active_shards": 94,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 0,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 100
}
GET myindex/_settings:
{
  "myindex": {
    "settings": {
      "index": {
        "number_of_shards": "3",
        "analysis": {
          "analyzer": {
            "url_split_analyzer": {
              "filter": "lowercase",
              "tokenizer": "url_split"
            }
          },
          "tokenizer": {
            "url_split": {
              "pattern": "[^a-zA-Z0-9]",
              "type": "pattern"
            }
          }
        },
        "number_of_replicas": "1",
        "version": {
          "created": "6020499"
        }
      }
    }
  }
}
Here is a snapshot of the _mappings structure:
"myindex": {
  "mappings": {
    "mytype": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        ............
        "active": {
          "type": "short"
        },
        "id_domain": {
          "type": "short",
          "ignore_malformed": true
        },
        "url": {
          "type": "text",
          "similarity": "boolean",
          "analyzer": "url_split_analyzer"
        }
      }
      .......
I stumbled on some documents in the index that I cannot find when I query the index by the id_domain property.
For example:
GET /myindex/mytype/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": { "active": 1 }
        }
      ]
    }
  }
}
Sample output:
{
  "_index": "myindex",
  "_type": "mytype",
  "_id": "myurl",
  "_score": 1,
  "_source": {
    "id_domain": "73993",
    "active": 1,
    "url": "myurl",
    "@timestamp": "2018-05-21T10:55:16.247Z"
  }
}
....
This returns a list of documents in which I can see the id_domain value, yet a query on that id_domain finds nothing, as shown here:
GET /myindex/mytype/_search
{
  "query": {
    "match": {
      "id_domain": 73993 // same result with or without quotes
    }
  }
}
Output:
{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}
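I cannot understand why this happens. I also tried reindexing the index, but I got the same result.
I'm sure I'm missing something. Is there a reason for this behaviour?
Thanks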
Answer 0 (score: 0)
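In your mapping, id_domain has type short, but in your document the value 73993 is outside the range of a short ([-32,768 to 32,767]). Because the field also sets ignore_malformed to true, the out-of-range value is silently skipped at index time: the document is still indexed and the value still appears in _source, but it is not searchable.
You need to change the type to integer and everything will work. The type of an existing field cannot be changed in place, so this means creating a new index with the corrected mapping and reindexing into it.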
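As a rough sketch of that fix, the requests below create a new index with id_domain mapped as integer, copy the documents across, and rerun the failing query. The index name myindex_v2 is hypothetical, and the mapping lists only the fields shown in the question; the fields elided there would need to be added. Reindexing into an index that still maps id_domain as short would presumably drop the value again, which may be why the earlier reindex did not help.

```
# myindex_v2 is a hypothetical name for the new index.
# Settings are copied from the GET myindex/_settings output in the question.
PUT myindex_v2
{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "number_of_replicas": 1,
      "analysis": {
        "analyzer": {
          "url_split_analyzer": {
            "filter": "lowercase",
            "tokenizer": "url_split"
          }
        },
        "tokenizer": {
          "url_split": {
            "type": "pattern",
            "pattern": "[^a-zA-Z0-9]"
          }
        }
      }
    }
  },
  "mappings": {
    "mytype": {
      "properties": {
        "@timestamp": { "type": "date" },
        "active": { "type": "short" },
        "id_domain": { "type": "integer", "ignore_malformed": true },
        "url": {
          "type": "text",
          "similarity": "boolean",
          "analyzer": "url_split_analyzer"
        }
      }
    }
  }
}

# Copy every document from the old index into the new one.
POST _reindex
{
  "source": { "index": "myindex" },
  "dest": { "index": "myindex_v2" }
}

# The same query as in the question, now against the new index.
GET /myindex_v2/mytype/_search
{
  "query": {
    "match": { "id_domain": 73993 }
  }
}
```

Keeping ignore_malformed on the new field is optional. Leaving it off makes out-of-range values fail at index time instead of being dropped quietly, which would have surfaced this problem earlier.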