Elasticsearch multi_field type: search and sort problems

Date: 2014-03-10 16:31:25

Tags: elasticsearch

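I'm having a problem with the multi_field mapping type in one of my indexes, and I can't tell what is wrong. I use a very similar mapping in another index and don't see these problems there. The ES version is 0.90.12.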

The index is set up with a mapping that looks like this:

{
  "settings": {
    "index": {
      "number_of_shards": 10,
      "number_of_replicas": 1
    }
  },
  "mappings": {
    "production": {
      "properties": {
        "production_title": {
          "type": "multi_field",
          "fields": {
            "production_title_edgengram": {
              "type": "string",
              "index": "analyzed",
              "index_analyzer": "autocomplete_index",
              "search_analyzer": "autocomplete_search"
            },
            "production_title": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
  }
}

The .yml looks like this:

index:
  mapper:
    dynamic: true
  analysis:
    analyzer:
      autocomplete_index:
        tokenizer: keyword
        filter: ["lowercase", "autocomplete_ngram"]
      autocomplete_search:
        tokenizer: keyword
        filter: lowercase
      ngram_index:
        tokenizer: keyword
        filter: ["ngram_filter"]
      ngram_search:
        tokenizer: keyword
        filter: lowercase
    filter:
      autocomplete_ngram:
        type: edgeNGram
        min_gram: 1
        max_gram: 15
        side: front
      ngram_filter:
        type: nGram
        min_gram: 2
        max_gram: 8
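
To make the analyzer chain above concrete, here is a minimal Python sketch of what `autocomplete_index` and `autocomplete_search` would be expected to emit for one title. It only imitates the keyword tokenizer, the lowercase filter, and the front edge n-gram filter with the `min_gram`/`max_gram` values from the config. The function names are illustrative, and this is not Elasticsearch's own implementation.

```python
def autocomplete_index_tokens(text, min_gram=1, max_gram=15):
    """Imitate keyword tokenizer -> lowercase -> front edge n-grams."""
    # The keyword tokenizer keeps the whole input as a single token.
    token = text.lower()
    # Front edge n-grams are the prefixes of that token, from min_gram
    # up to max_gram characters (or the token length, if shorter).
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


def autocomplete_search_tokens(text):
    """Imitate keyword tokenizer -> lowercase (no n-grams at search time)."""
    return [text.lower()]


if __name__ == "__main__":
    indexed = autocomplete_index_tokens("IL, 'Hoodoo Love'")
    searched = autocomplete_search_tokens("Il")
    print(indexed[:4])
    print(searched[0] in indexed)
```

Under these assumptions a search for `Il` would become `il`, which is one of the prefixes indexed for that title.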

When I run these two requests:

curl -XGET 'http://localhost:9200/productionindex/production/_search' -d '{
  "sort": [
    {
      "production_title": "asc"
    }
  ],
  "size": 1
}'

curl -XGET 'http://localhost:9200/productionindex/production/_search' -d '{
  "sort": [
    {
      "production_title": "desc"
    }
  ],
  "size": 1
}'

I end up with exactly the same result for both, a title from somewhere in the middle of the alphabet:

"production_title": "IL, 'Hoodoo Love'"

However, if I do this:

{
  "query": {
    "term": {
      "production_title": "IL, 'Hoodoo Love'"
    }
  }
}

I get zero results.

Also, if I do this:

{
  "query": {
    "match": {
      "production_title_edgengram": "Il"
    }
  }
}

I also get zero results.
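
For comparison, the `multi_field` documentation for the 0.90 line describes non-default sub-fields as addressable through dotted path notation (`parent.sub_field`), while the sub-field that shares the parent's name is reached through the parent name itself. The sketch below builds the two request bodies in that form. It is an illustration under that assumption, the helper names are hypothetical, and whether it changes the results for this index has not been verified.

```python
import json

PARENT = "production_title"
EDGE_SUBFIELD = "production_title_edgengram"


def subfield_path(parent, sub_field):
    """Return the dotted path for a multi_field sub-field."""
    # The default sub-field has the parent's name and needs no suffix.
    if sub_field == parent:
        return parent
    return parent + "." + sub_field


def autocomplete_query(prefix):
    """match query against the edge n-gram sub-field via its dotted path."""
    return {"query": {"match": {subfield_path(PARENT, EDGE_SUBFIELD): prefix}}}


def exact_title_query(title):
    """term query against the not_analyzed default sub-field."""
    return {"query": {"term": {subfield_path(PARENT, PARENT): title}}}


if __name__ == "__main__":
    print(json.dumps(autocomplete_query("Il"), indent=2))
    print(json.dumps(exact_title_query("IL, 'Hoodoo Love'"), indent=2))
```

Each printed body can be sent to the same `_search` endpoint with `curl -d`.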

If I don't use multi_field and map the two as separate fields, I can search on them (term and autocomplete), but I still can't sort.

When indexing, I only send production_title for the multi_field.

Does anyone know what is going on here?

Below is the explain output (only the last result, for brevity):

{
  "_shard": 6,
  "_node": "j-D2SYPCT0qZt1lD1RcKOg",
  "_index": "productionindex",
  "_type": "production",
  "_id": "casting_call.productiondatetime.689",
  "_score": null,
  "_source": {
    "venue_state": "WA",
    "updated_date": "2014-03-10T12:08:13.927273",
    "django_id": 689,
    "production_types": [
      69,
      87,
      89
    ],
    "production_title": "WA, 'Footloose'"
  },
  "sort": [
    null
  ],
  "_explanation": {
    "value": 1.0,
    "description": "ConstantScore(cache(_type:audition)), product of:",
    "details": [
      {
        "value": 1.0,
        "description": "boost"
      },
      {
        "value": 1.0,
        "description": "queryNorm"
      }
    ]
  }
}

It comes from this curl request:

curl -XPOST 'http://localhost:9200/productionindex/production/_search?pretty=true&explain=true' -d '{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "production_title": {
        "order": "desc"
      }
    }
  ]
}'
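
One thing the explain output leaves open is whether the live index actually carries the intended mapping: the hit has a `null` sort value, and the filter in `_explanation` names `_type:audition` although the request targets the `production` type. Below is a minimal sketch for checking this, assuming the type mapping (the object holding `properties`, as returned by the `_mapping` endpoint) has already been fetched and parsed into a Python dict. `describe_field` is a hypothetical helper, not part of any client library.

```python
def describe_field(type_mapping, field):
    """Summarise how `field` is mapped inside one type's mapping dict."""
    props = type_mapping.get("properties", {})
    if field not in props:
        return "missing: %s is not in the mapping" % field
    spec = props[field]
    if spec.get("type") != "multi_field":
        # A dynamically created field would show up as a plain type here.
        return "not multi_field: %s is mapped as %s" % (field, spec.get("type"))
    parts = []
    for name, sub in sorted(spec.get("fields", {}).items()):
        # String fields are analyzed unless "index" says otherwise.
        parts.append("%s=%s" % (name, sub.get("index", "analyzed")))
    return "multi_field: " + ", ".join(parts)


if __name__ == "__main__":
    # Intended mapping from the question, reduced to the relevant keys.
    intended = {"properties": {"production_title": {
        "type": "multi_field",
        "fields": {
            "production_title_edgengram": {"type": "string", "index": "analyzed"},
            "production_title": {"type": "string", "index": "not_analyzed"},
        },
    }}}
    print(describe_field(intended, "production_title"))
```

If the live mapping reported a plain `string` for `production_title`, that would suggest the field was mapped dynamically rather than through the mapping shown earlier, which is only one possible explanation.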

0 answers:

No answers yet.