Applying an analyzer to nested data in Elasticsearch

Time: 2015-04-24 06:44:26

Tags: java search elasticsearch lucene elasticsearch-plugin

I have the following JSON data:

{
  "employee_id": "190",
  "working_office": "India",
  "skillsDetails": {
    "CSS": "Beginner",
    "Financial Services & Insurance": "Beginner",
    "Planning": "Moderate",
    "Pig": "Beginner",
    "SQL": "Moderate",
    "Go": "Beginner",
    "iOS": "Beginner",
    "Storytelling & Storyboarding": "Beginner",
    "Relationship building": "Moderate",
    "Facilitation": "Moderate"
  }
}

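I have already applied analyzers to the top-level fields such as employee_id and working_office, but I can't figure out how to apply one to the nested field skillsDetails. This is my mapping so far:
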
{
  "profiles": {
    "dynamic": "true",
    "properties": {
      "employee_id": {
        "type": "integer",
        "analyzer": "standard"
      },
      "working_office": {
        "type": "string",
        "analyzer": "edge_ngram_analyzer"
      },
      "skillsDetails": {
        // I need to apply an analyzer here. I also don't want to hardcode field names like java, sql, etc.; the analyzer should apply to everything under skillsDetails.
      }
    }
  }
}

Any pointers or help would be much appreciated.

1 Answer:

Answer 0 (score: 0)

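It depends on how your data is structured. From your input, it looks like each document will carry a dynamic set of skills together with their levels. Based on that, you could use a mapping like this: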

{
  "profiles": {
    "dynamic": "true",
    "properties": {
      "employee_id": {
        "type": "integer"
      },
      "working_office": {
        "type": "string",
        "analyzer": "edge_ngram_analyzer"
      },
      "skillsDetails": {
        "type": "nested",
        "properties": {
          "name": {"type": "string", "analyzer": "your-analyzer"},
          "experience": {"type": "string", "index": "not_analyzed"}
        }
      }
    }
  }
}

With this mapping, your input JSON needs to look like this:

{
  "employee_id": "190",
  "working_office": "India",
  "skillsDetails": [
    {"name": "CSS", "experience": "Beginner"},
    {"name": "Financial Services & Insurance", "experience": "Beginner"},
    {"name": "Planning", "experience": "Moderate"},
    {"name": "Pig", "experience": "Beginner"},
    {"name": "SQL", "experience": "Moderate"},
    {"name": "Go", "experience": "Beginner"},
    {"name": "iOS", "experience": "Beginner"},
    {"name": "Storytelling & Storyboarding", "experience": "Beginner"},
    {"name": "Relationship building", "experience": "Moderate"},
    {"name": "Facilitation", "experience": "Moderate"}
  ]
}

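With this approach, each document can carry its own dynamic set of skills, and the analyzer is applied to every skill name without hardcoding any of them in the mapping.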
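If your existing documents use the original key/value shape, they would need to be reshaped before indexing against the nested mapping. A minimal sketch of that conversion, in Python for brevity (the helper name `to_nested_skills` is made up for this example and is not part of any Elasticsearch client):

```python
import json


def to_nested_skills(doc):
    """Return a copy of doc with skillsDetails as a list of {name, experience} objects."""
    skills = doc.get("skillsDetails", {})
    converted = dict(doc)  # shallow copy; the input document is left untouched
    converted["skillsDetails"] = [
        {"name": name, "experience": level} for name, level in skills.items()
    ]
    return converted


# Shortened version of the document from the question.
original = {
    "employee_id": "190",
    "working_office": "India",
    "skillsDetails": {"CSS": "Beginner", "SQL": "Moderate"},
}

print(json.dumps(to_nested_skills(original), indent=2))
```

The same reshaping is easy to do in Java or whatever language feeds your index; the point is only that the skill name moves from a key into a `name` value.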

If your input data uses a fixed, known set of skills, the mapping could look different, but I don't think that is the general case.
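For reference, fields under a nested mapping like the one above are normally searched through a `nested` query, so that the name and the experience level are matched within the same skill entry. A rough sketch of such a request body, built as a Python dict (the helper name `skill_query` is an assumption for this example, and the body has not been run against a particular Elasticsearch version):

```python
import json


def skill_query(name, experience):
    """Build a nested query that matches one skill entry by name and experience."""
    return {
        "query": {
            "nested": {
                "path": "skillsDetails",
                "query": {
                    "bool": {
                        "must": [
                            # Analyzed field: goes through the analyzer set on "name".
                            {"match": {"skillsDetails.name": name}},
                            # not_analyzed field: must equal the stored value exactly.
                            {"term": {"skillsDetails.experience": experience}},
                        ]
                    }
                },
            }
        }
    }


print(json.dumps(skill_query("SQL", "Moderate"), indent=2))
```

You would send the printed JSON as the body of a search request against the index that holds the profiles type.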

Edit: adding a mapping that uses the object type:

If you don't want to change your JSON input, you can map skillsDetails as type object instead, as follows:

{
  "profiles": {
    "dynamic": "true",
    "properties": {
      "employee_id": {
        "type": "integer"
      },
      "working_office": {
        "type": "string",
        "analyzer": "edge_ngram_analyzer"
      },
      "skillsDetails": {
        "type": "object",
        "dynamic": true
      }
    }
  }
}
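Note that the object mapping on its own does not attach any analyzer to the dynamically created skill fields. One way that should work without hardcoding field names is a dynamic template whose `path_match` covers everything under skillsDetails. The sketch below builds such a mapping as a Python dict; the template name `skills_as_analyzed_strings` and the helper `profiles_mapping` are made up for illustration, `your-analyzer` is the same placeholder as above, and the whole thing assumes the same Elasticsearch generation (with the `string` type) as the other mappings in this answer:

```python
import json


def profiles_mapping(analyzer="your-analyzer"):
    """Build a profiles mapping whose dynamic template analyzes every skillsDetails.* field."""
    return {
        "profiles": {
            "dynamic": "true",
            "dynamic_templates": [
                {
                    # Hypothetical template name; any unique name will do.
                    "skills_as_analyzed_strings": {
                        "path_match": "skillsDetails.*",
                        "match_mapping_type": "string",
                        "mapping": {"type": "string", "analyzer": analyzer},
                    }
                }
            ],
            "properties": {
                "employee_id": {"type": "integer"},
                "working_office": {
                    "type": "string",
                    "analyzer": "edge_ngram_analyzer",
                },
                "skillsDetails": {"type": "object", "dynamic": True},
            },
        }
    }


print(json.dumps(profiles_mapping(), indent=2))
```

Keep in mind that with this shape the analyzer runs on the values ("Beginner", "Moderate"), because the skill names themselves have become field names.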

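The main drawback of the object mapping is that it creates each skill as a field name, with the expertise level as its value. That makes searching more complex. It also increases the number of fields in the mapping as more skills appear, which in turn increases the cluster state size.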
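To make the field growth concrete, here is a small Python sketch that lists the dotted field paths an object mapping would end up with for a couple of documents. The second document (employee 191) is invented for this example, and `field_paths` is just an illustrative helper:

```python
def field_paths(doc, prefix=""):
    """Yield the dotted field path of every leaf value in a document."""
    for key, value in doc.items():
        path = prefix + key
        if isinstance(value, dict):
            # Descend into sub-objects, extending the dotted prefix.
            yield from field_paths(value, path + ".")
        else:
            yield path


docs = [
    {"employee_id": "190", "skillsDetails": {"CSS": "Beginner", "SQL": "Moderate"}},
    # Hypothetical second employee with a partly different skill set.
    {"employee_id": "191", "skillsDetails": {"Go": "Beginner", "SQL": "Beginner"}},
]

all_fields = set()
for doc in docs:
    all_fields.update(field_paths(doc))

print(sorted(all_fields))
# ['employee_id', 'skillsDetails.CSS', 'skillsDetails.Go', 'skillsDetails.SQL']
```

Every new skill name adds another field, whereas the nested mapping keeps just skillsDetails.name and skillsDetails.experience no matter how many skills exist.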

In the end, it depends on you and on what you want to achieve.