Can I extract the actual value of a not_analyzed field when _source is disabled?

Time: 2016-05-27 16:31:50

Tags: elasticsearch

I have the following mapping:

{
   "articles":{
      "mappings":{
         "article":{
            "_all":{
               "enabled":false
            },
            "_source":{
               "enabled":false
            },
            "properties":{
               "content":{
                  "type":"string",
                  "norms":{
                     "enabled":false
                  }
               },
               "url":{
                  "type":"string",
                  "index":"not_analyzed"
               }
            }
         }
      },
      "settings":{
         "index":{
            "refresh_interval":"30s",
            "number_of_shards":"20",
            "analysis":{
               "analyzer":{
                  "default":{
                     "filter":[
                        "icu_folding",
                        "icu_normalizer"
                     ],
                     "type":"custom",
                     "tokenizer":"icu_tokenizer"
                  }
               }
            },
            "number_of_replicas":"1"
         }
      }
   }
}

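The question: is it possible to somehow extract the actual value of the url field, given that it is not_analyzed and _source is not enabled? I only need to do this once for this index, so even a hacky way is acceptable.

I know that not_analyzed means the string is not tokenized, so I assume the value must be stored somewhere, but I don't know whether it is stored as a hash or 1:1, and I couldn't find this in the documentation.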

My server is running ES version 1.4.4 with JVM 1.8.0_31.

1 answer:

Answer 0 (score: 1)

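You can read the field data to retrieve the URL from each document. This reads directly from the ES index, so we get back exactly what searches "match" on; in this case, because you are not analyzing the field, that is the exact URL you indexed.

Using the example index you provided, I indexed two URLs (on a smaller subset of the index you provided):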

POST /articles/article/1
{
    "url":"https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-fielddata-fields.html"
}
POST /articles/article/2
{
    "url":"http://stackoverflow.com/questions/37488389/can-i-extract-the-actual-value-of-not-analyzed-field-when-source-is-disabled"
}

Then this query gives me a new "fields" object for each hit:

GET /articles/article/_search
{
    "fielddata_fields" : ["url"]
}
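
Since this only has to be done once for the whole index, the same request could be issued page by page from a script. The following is a minimal sketch, not a tested export tool: the host `http://localhost:9200`, the helper names `build_search_body` and `build_request`, and the use of `from`/`size` paging with a `match_all` query are all assumptions that are not part of the original answer. With 20 shards, deep `from`/`size` paging may be slow, and a scroll-based search may be preferable for a large index.

```python
import json
import urllib.request

# Assumption: a node reachable on the default local address.
ES_URL = "http://localhost:9200"


def build_search_body(offset, page_size):
    """Build one page of the fielddata_fields query shown above."""
    return {
        "query": {"match_all": {}},
        "fielddata_fields": ["url"],
        "from": offset,
        "size": page_size,
    }


def build_request(offset, page_size):
    """Wrap the body in an HTTP request; nothing is sent until urlopen is called."""
    body = json.dumps(build_search_body(offset, page_size)).encode("utf-8")
    return urllib.request.Request(
        ES_URL + "/articles/article/_search",
        data=body,
        headers={"Content-Type": "application/json"},
    )
```

Passing the result of `build_request(0, 100)` to `urllib.request.urlopen` would send the first page; because a body is attached, urllib sends it as a POST rather than a GET.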

This gives us these results:

"hits": [
         {
            "_index": "articles",
            "_type": "article",
            "_id": "2",
            "_score": 1,
            "fields": {
               "url": [
                  "http://stackoverflow.com/questions/37488389/can-i-extract-the-actual-value-of-not-analyzed-field-when-source-is-disabled"
               ]
            }
         },
         {
            "_index": "articles",
            "_type": "article",
            "_id": "1",
            "_score": 1,
            "fields": {
               "url": [
                  "https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-fielddata-fields.html"
               ]
            }
         }
      ]
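
To turn hits like these into a plain id-to-URL mapping, a small helper is enough. This is only a sketch: `extract_urls` is a hypothetical name, and it assumes the full search response has been parsed into a dict, with the array above nested under the outer `hits` object.

```python
def extract_urls(response):
    """Map each hit's _id to the first value of its url field data."""
    urls = {}
    for hit in response.get("hits", {}).get("hits", []):
        values = hit.get("fields", {}).get("url", [])
        if values:  # skip documents that have no url value
            urls[hit["_id"]] = values[0]
    return urls


# Trimmed-down copy of the response shown above.
sample = {
    "hits": {
        "hits": [
            {"_id": "2", "fields": {"url": [
                "http://stackoverflow.com/questions/37488389/can-i-extract-the-actual-value-of-not-analyzed-field-when-source-is-disabled"
            ]}},
            {"_id": "1", "fields": {"url": [
                "https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-fielddata-fields.html"
            ]}},
        ]
    }
}

urls_by_id = extract_urls(sample)
```

Hits without a `fields.url` entry are simply left out of the mapping instead of raising an error.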

Hope that helps!