How to handle importing nested JSON into ElasticSearch

Date: 2019-02-14 17:30:18

Tags: json elasticsearch

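I have a JSON document with nested objects that describes Nobel Prize winners in detail. The laureates are grouped by year and category; see snippet 1, Physics Nobel Prize Winners for 2018, for an example.

  1. What is the correct way to handle this kind of data structure in ElasticSearch?
  2. Should I flatten all the data and insert it with separate REST calls, as in snippet 2?
  3. Does ElasticSearch have a built-in feature to map and flatten the objects?

If I use the mapping in snippet 3 and import the JSON, I end up with nested documents inside the _doc document.

  1. Is there a way to keep and map the original nested JSON while telling ElasticSearch to flatten the data, similar to what you would get from the CSV file (Link to same API but as a CSV file) shown in snippet 4?

Thanks.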

  

Snippet 1: JSON with nested laureates objects

{
"prizes": [
{
"year": "2018",
"category": "physics",
"overallMotivation": "“for groundbreaking inventions in the field of laser physics”",
"laureates": [
{
"id": "960",
"firstname": "Arthur",
"surname": "Ashkin",
"motivation": "\"for the optical tweezers and their application to biological systems\"",
"share": "2"
},
{
"id": "961",
"firstname": "Gérard",
"surname": "Mourou",
"motivation": "\"for their method of generating high-intensity, ultra-short optical pulses\"",
"share": "4"
},
{
"id": "962",
"firstname": "Donna",
"surname": "Strickland",
"motivation": "\"for their method of generating high-intensity, ultra-short optical pulses\"",
"share": "4"
}
]
}
]
}
  

Snippet 2: three separate REST calls to insert flattened records

PUT /nobel1/_doc/960
{
"year": "2018",
"category": "physics",
"overallMotivation": "“for groundbreaking inventions in the field of laser physics”",
"id": "960",
"firstname": "Arthur",
"surname": "Ashkin",
"motivation": "for the optical tweezers and their application to biological systems",
"share": "2"
}

PUT /nobel1/_doc/961
{
"year": "2018",
"category": "physics",
"overallMotivation": "“for groundbreaking inventions in the field of laser physics”",
"id": "961",
"firstname": "Gérard",
"surname": "Mourou",
"motivation": "for their method of generating high-intensity, ultra-short optical pulses",
"share": "4"
}

PUT /nobel1/_doc/962
{
"year": "2018",
"category": "physics",
"overallMotivation": "“for groundbreaking inventions in the field of laser physics”",
"id": "962",
"firstname": "Donna",
"surname": "Strickland",
"motivation": "for their method of generating high-intensity, ultra-short optical pulses",
"share": "4"
}
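
For illustration, the flattened records in snippet 2 could probably be generated from the nested JSON in snippet 1 on the client side before indexing. The following is only a minimal sketch under that assumption: `flatten_prizes`, `to_bulk_body`, and the shortened `SAMPLE` data are hypothetical names made up for this example, and the output is newline-delimited JSON of the kind the `_bulk` endpoint accepts, with the index (and, on older versions, the type) given in the request URL.

```python
import json

# Shortened, hypothetical sample in the same shape as snippet 1.
SAMPLE = {
    "prizes": [
        {
            "year": "2018",
            "category": "physics",
            "laureates": [
                {"id": "960", "firstname": "Arthur", "surname": "Ashkin", "share": "2"},
                {"id": "961", "firstname": "Gérard", "surname": "Mourou", "share": "4"},
            ],
        }
    ]
}


def flatten_prizes(data):
    """Yield one flat record per laureate, with the prize-level fields copied onto it."""
    for prize in data.get("prizes", []):
        # Fields shared by every laureate of this prize (everything except the nested list).
        shared = {key: value for key, value in prize.items() if key != "laureates"}
        for laureate in prize.get("laureates", []):
            # Laureate fields win if a key ever appears at both levels.
            yield {**shared, **laureate}


def to_bulk_body(records):
    """Build a newline-delimited body: one action line plus one source line per record."""
    lines = []
    for record in records:
        # Use the laureate id as the document id, as snippet 2 does.
        lines.append(json.dumps({"index": {"_id": record["id"]}}))
        lines.append(json.dumps(record, ensure_ascii=False))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_bulk_body(flatten_prizes(SAMPLE)), end="")
```

Sending the resulting body in a single request would replace the three separate PUT calls above; whether that is preferable to keeping the nested structure depends on how the data will be queried.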
  

Snippet 3: ElasticSearch mapping for nested objects

PUT /nobel
{
  "mappings": {
    "_doc": {
      "properties": {
        "year": {
          "type": "integer" 
        },
        "category": {
          "type": "keyword",
          "fields":{
            "raw": {
              "type": "text"
            }
          }
        },
        "overallMotivation": {
          "type": "text"
        },
        "laureates": {
          "type": "nested",
          "properties": {
              "id": {
                "type": "integer"
              },
              "firstname": {
                "type": "text"
              },
              "surname": {
                "type": "text"
              },
              "motivation": {
                "type": "text"
              },
              "share": {
                "type": "integer"
              }
          }
        }
      }
    }
  }
}
  

Snippet 4: Nobel laureates API in CSV format

year,category,overallMotivation,id,firstname,surname,motivation,share
2018,physics,"“for groundbreaking inventions in the field of laser physics”",960,Arthur,Ashkin,"""for the optical tweezers and their application to biological systems""",2
2018,physics,"“for groundbreaking inventions in the field of laser physics”",961,Gérard,Mourou,"""for their method of generating high-intensity, ultra-short optical pulses""",4
2018,physics,"“for groundbreaking inventions in the field of laser physics”",962,Donna,Strickland,"""for their method of generating high-intensity, ultra-short optical pulses""",4
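
To make the comparison concrete, here is a rough sketch of how CSV rows like the ones above could be turned into flat documents with Python's standard `csv` module. The helper name `rows_to_docs` and the embedded `CSV_TEXT` sample are hypothetical, and casting `year`, `id`, and `share` to integers is an assumption based on the field types in the mapping in snippet 3.

```python
import csv
import io

# Two rows copied from the CSV above; single-quoted so the doubled quotes survive.
CSV_TEXT = '''year,category,overallMotivation,id,firstname,surname,motivation,share
2018,physics,"“for groundbreaking inventions in the field of laser physics”",960,Arthur,Ashkin,"""for the optical tweezers and their application to biological systems""",2
2018,physics,"“for groundbreaking inventions in the field of laser physics”",961,Gérard,Mourou,"""for their method of generating high-intensity, ultra-short optical pulses""",4
'''


def rows_to_docs(csv_text):
    """Parse CSV text into a list of flat dicts, one per laureate."""
    reader = csv.DictReader(io.StringIO(csv_text))
    docs = []
    for row in reader:
        doc = dict(row)
        # Assumption: these fields are integers, as in the mapping in snippet 3.
        for field in ("year", "id", "share"):
            doc[field] = int(doc[field])
        docs.append(doc)
    return docs


if __name__ == "__main__":
    for doc in rows_to_docs(CSV_TEXT):
        print(doc["id"], doc["surname"], doc["motivation"])
```

Each resulting dict has the same flat shape as the request bodies in snippet 2, apart from the integer casts.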

0 answers:

No answers