Elasticsearch: default template does not detect dates

Time: 2016-05-09 14:25:28

Tags: json elasticsearch elasticsearch-2.0

I have a default template that looks like this:

PUT /_template/abtemp
{
  "template": "abt*",
  "settings": {
    "index.refresh_interval": "5s",
    "number_of_shards": 5,
    "number_of_replicas": 1,
    "index.codec": "best_compression"
  },
  "mappings": {
    "_default_": {
      "_all": {
        "enabled": false
      },
      "_source": {
        "enabled": true
      },
      "dynamic_templates": [
        {
          "message_field": {
            "match": "message",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "analyzed",
              "omit_norms": true,
              "fielddata": {
                "format": "disabled"
              }
            }
          }
        },
        {
          "string_fields": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "analyzed",
              "omit_norms": true,
              "fielddata": {
                "format": "disabled"
              },
              "fields": {
                "raw": {
                  "type": "string",
                  "index": "not_analyzed",
                  "ignore_above": 256
                }
              }
            }
          }
        }
      ]
    }
  }
}

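The idea here is this:

  1. Apply the template to all indices whose name matches abt*.
  2. Analyze a string field only if it is named message. All other string fields are not_analyzed and have a corresponding .raw field.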
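As a rough illustration of how the dynamic templates in this mapping are chosen, the sketch below models the "first matching template wins" rule for a field name and the type detected from its JSON value. It is a simplified model, not Elasticsearch code: `pick_template` is a hypothetical helper, and Python's `fnmatchcase` only stands in for Elasticsearch's own wildcard matching.

```python
from fnmatch import fnmatchcase

# Simplified view of the dynamic_templates list from the template:
# (template name, "match" pattern, "match_mapping_type").
DYNAMIC_TEMPLATES = [
    ("message_field", "message", "string"),
    ("string_fields", "*", "string"),
]


def pick_template(field_name, detected_type):
    """Return the name of the first template that matches, or None.

    Hypothetical helper: templates are checked in order and the first
    one whose type and name pattern both match is used.
    """
    for name, pattern, mapping_type in DYNAMIC_TEMPLATES:
        if mapping_type == detected_type and fnmatchcase(field_name, pattern):
            return name
    return None  # no template applies, so the default dynamic mapping is used


if __name__ == "__main__":
    for field, detected in [("message", "string"), ("FIELD3", "string"), ("FIELD1", "long")]:
        print(field, detected, "->", pick_template(field, detected))
```

Because both templates only match the string type, a numeric value such as FIELD1 falls through to the default dynamic mapping.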
Now I try to index some data as:

    curl -s -XPOST hostName:port/indexName/_bulk --data-binary @myFile.json
    

This is the file:

    { "index" : { "_index" : "abtclm3","_type" : "test"} }
    {   "FIELD1":1,   "FIELD2":"2015-11-18 15:32:18",   "FIELD3":"MATTHEWS",   "FIELD4":"GARY",   "FIELD5":"",   "FIELD6":"STARMX",   "FIELD7":"AL",   "FIELD8":"05/15/2010 11:30",   "FIELD9":"05/19/2010 7:00",   "FIELD10":"05/19/2010 23:00",   "FIELD11":3275,   "FIELD12":"LC",   "FIELD13":"WIN",   "FIELD14":"05/15/2010 11:30",   "FIELD15":"LC",   "FIELD16":"POTUS",   "FIELD17":"WH",   "FIELD18":"S GROUNDS",   "FIELD19":"OFFICE",   "FIELD20":"VISITORS",   "FIELD21":"STATE ARRIVAL - MEXICO**",   "FIELD22":"08/27/2010 07:00:00 AM +0000",   "FIELD23":"MATTHEWS",   "FIELD24":"GARY",   "FIELD25":"",   "FIELD26":"STARMX",   "FIELD27":"AL",   "FIELD28":"05/15/2010 11:30",   "FIELD29":"05/19/2010 7:00",   "FIELD30":"05/19/2010 23:00",   "FIELD31":3275,   "FIELD32":"LC",   "FIELD33":"WIN",   "FIELD34":"05/15/2010 11:30",   "FIELD35":"LC",   "FIELD36":"POTUS",   "FIELD37":"WH",   "FIELD38":"S GROUNDS",   "FIELD39":"OFFICE",   "FIELD40":"VISITORS",   "FIELD41":"STATE ARRIVAL - MEXICO**",   "FIELD42":"08/27/2010 07:00:00 AM +0000" }
    

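Note that some fields, such as FIELD2, should be classified as date. Also, FIELD31 should be classified as long. The indexing goes through, and when I look at the data I see that the numbers have been classified correctly, but everything else has been put under string. How do I make sure that the fields containing timestamps are classified as dates?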

1 Answer:

Answer 0: (score: 1)

You have a lot of date formats. You need a template like this:

{
  "template": "abt*",
  "settings": {
    "index.refresh_interval": "5s",
    "number_of_shards": 5,
    "number_of_replicas": 1,
    "index.codec": "best_compression"
  },
  "mappings": {
    "_default_": {
      "dynamic_date_formats":["dateOptionalTime||yyyy-MM-dd HH:mm:ss||MM/dd/yyyy HH:mm||MM/dd/yyyy hh:mm:ss aa ZZ"],
      "_all": {
        "enabled": false
      },
      "_source": {
        "enabled": true
      },
      "dynamic_templates": [
        {
          "message_field": {
            "match": "message",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "analyzed",
              "omit_norms": true,
              "fielddata": {
                "format": "disabled"
              }
            }
          }
        },
        {
          "dates": {
            "match": "*",
            "match_mapping_type": "date",
            "mapping": {
              "type": "date",
              "format": "dateOptionalTime||yyyy-MM-dd HH:mm:ss||MM/dd/yyyy HH:mm||MM/dd/yyyy hh:mm:ss aa ZZ"
            }
          }
        },
        {
          "string_fields": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "analyzed",
              "omit_norms": true,
              "fielddata": {
                "format": "disabled"
              },
              "fields": {
                "raw": {
                  "type": "string",
                  "index": "not_analyzed",
                  "ignore_above": 256
                }
              }
            }
          }
        }
      ]
    }
  }
}

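This probably does not cover all the formats you have in there, so you need to add the rest. The idea is to list them under dynamic_date_formats, separated by ||, and then to specify the same formats under the format field of the date mapping itself.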
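Before adding more formats, it can help to check offline which sample values a candidate list actually covers. The sketch below is only an approximation: it uses Python `strptime` patterns as rough stand-ins for the date patterns in the template, so it does not reproduce Elasticsearch's parser exactly. `first_matching_format` and the `SAMPLES` dictionary are hypothetical names, with values copied from the bulk file in the question.

```python
from datetime import datetime

# Rough strptime stand-ins for the custom patterns (an approximation,
# not the parser Elasticsearch uses).
CANDIDATE_FORMATS = [
    "%Y-%m-%d %H:%M:%S",        # year-month-day, 24-hour time with seconds
    "%m/%d/%Y %H:%M",           # month/day/year, 24-hour time without seconds
    "%m/%d/%Y %I:%M:%S %p %z",  # month/day/year, 12-hour time, AM/PM, UTC offset
]

# Sample values taken from the bulk file in the question.
SAMPLES = {
    "FIELD2": "2015-11-18 15:32:18",
    "FIELD8": "05/15/2010 11:30",
    "FIELD9": "05/19/2010 7:00",
    "FIELD22": "08/27/2010 07:00:00 AM +0000",
    "FIELD3": "MATTHEWS",
}


def first_matching_format(value):
    """Return the first candidate format that parses value, or None."""
    for fmt in CANDIDATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return fmt
        except ValueError:
            continue
    return None


if __name__ == "__main__":
    for field, value in SAMPLES.items():
        print(field, repr(value), "->", first_matching_format(value))
```

Any timestamp value that comes back as `None` points to a format that still has to be added to both lists.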

To see what you need to do to define them, see this section of the documentation for the built-in formats and this piece of documentation for any custom formats you plan on using.
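Keep in mind that an index template is applied only when an index is created, so an existing index such as abtclm3 keeps its old mapping until it is deleted and re-created. To confirm the result afterwards, the mapping returned by GET /abtclm3/_mapping can be reduced to a field-to-type table. The sketch below assumes the response has already been fetched and parsed. `field_types` is a hypothetical helper, and `SAMPLE_RESPONSE` is a trimmed, made-up example of the response shape rather than real output.

```python
import json


def field_types(mapping_response, index, doc_type):
    """Map each top-level field name to its mapped type.

    Hypothetical helper: expects the parsed JSON body of a
    GET /<index>/_mapping call, nested as index -> mappings -> type.
    """
    properties = mapping_response[index]["mappings"][doc_type]["properties"]
    # Fields without an explicit "type" are object fields with sub-properties.
    return {name: spec.get("type", "object") for name, spec in properties.items()}


# Trimmed, made-up example of the response shape (not real output).
SAMPLE_RESPONSE = json.loads("""
{
  "abtclm3": {
    "mappings": {
      "test": {
        "properties": {
          "FIELD1": {"type": "long"},
          "FIELD2": {"type": "date"},
          "FIELD3": {"type": "string"}
        }
      }
    }
  }
}
""")

if __name__ == "__main__":
    for name, mapped_type in sorted(field_types(SAMPLE_RESPONSE, "abtclm3", "test").items()):
        print(name, "->", mapped_type)
```

Any timestamp field that still shows up as string indicates a value whose format is not yet listed.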