Elasticsearch 6.2: mapping with the bulk API

Date: 2018-05-25 10:01:14

Tags: json elasticsearch curl

I'm having trouble inserting data into Elasticsearch. I have 1000 JSON files, and I want to iterate over all of them with curl and load them through the bulk API.

My JSON file looks like this:

{"index": {"_index": "stuff", "_type": "text", "_id": "1"}}{"lastversion":"2018-01-19","attribution":[],"description":"","notes":[],"alt_names":[],"sources":[],"urls":["https://www.fireeye.com/blog/threat-research/2018/01/microsoft-office-vulnerabilities-used-to-distribute-zyklon-malware.html"],"common_name":"anonym","samples":[{"status":"dumped","sha256":"8d0be4dd8b0ca7608bf3a02a2d212ce845ac733d150b4428376a5a939f1679ec","version":""}]}

What I did:

1. Created an index named "stuff".

curl -H 'Content-Type: application/json' -XPUT "localhost:9200/stuff/"; echo

2. Created the mapping (for most of the JSON file only, because I don't know how to create a mapping for this part):

"samples": [
  {
    "status": "dumped",
    "sha256": "8d0be4dd8b0ca7608bf3a02a2d212ce845ac733d150b4428376a5a939f1679ec",
    "version": ""
  }
]

I ran curl:

curl -H 'Content-Type: application/json' -XPUT "localhost:9200/stuff" -d'
{
 "mappings": {
  "doc": {
   "properties": {
    "updated": {"type": "keyword"},
    "attribution": {"type": "keyword"},
    "description": {"type": "keyword"},
    "notes": {"type": "keyword"},
    "alt_names": {"type": "keyword"},
    "sources": {"type": "keyword"},
    "urls": {"type": "keyword"},
    "common_name": {"type": "keyword"}
   }
  }
 }
}
'

3. Tried to upload to the Elasticsearch cluster using curl:

curl -H 'Content-Type: application/x-ndjson' -XPOST "localhost:9200/stuff/_bulk" --data-binary @our.json
{"took":5,"errors":true,"items":[{"index":{"_index":"stuff","_type":"text","_id":"1","status":400,"error":{"type":"mapper_parsing_exception","reason":"failed to parse, document is empty"}}}]}

What am I doing wrong here? How do I provide the correct mapping for this JSON?

I would appreciate any feedback.

1 answer:

Answer 0 (score: 0)

First of all, you are creating a mapping named doc and then trying to index data with:

{"index": {"_index": "stuff", "_type": "text", "_id": "1"}}

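This tells ES that the next document belongs to _type: text, which doesn't exist. If you are using ES 6.0 or later, this causes an error, because an index cannot contain more than one type.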
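One way to keep the two names in sync is to derive both the index-creation body and the bulk action line from a single constant. The snippet below is only a sketch under assumptions: DOC_TYPE and the helper names are made up for illustration, the field list mirrors the mapping in the question, and mapping "samples" as an object with keyword sub-fields is a guess at what the asker wants (ES has no dedicated array type, so the list itself needs nothing special).

```python
import json

DOC_TYPE = "doc"  # assumption: reuse the type name from the question's mapping


def keyword():
    """Return a fresh keyword field definition."""
    return {"type": "keyword"}


def build_index_body(doc_type=DOC_TYPE):
    """Return a body for PUT /stuff with every field under one type."""
    fields = ["updated", "attribution", "description", "notes",
              "alt_names", "sources", "urls", "common_name"]
    properties = {name: keyword() for name in fields}
    # Assumption: "samples" is mapped as an object field; each element of
    # the array is an object with these three string sub-fields.
    properties["samples"] = {
        "properties": {
            "status": keyword(),
            "sha256": keyword(),
            "version": keyword(),
        }
    }
    return {"mappings": {doc_type: {"properties": properties}}}


def build_action(doc_id, index="stuff", doc_type=DOC_TYPE):
    """Return a bulk action whose _type matches the mapping's type name."""
    return {"index": {"_index": index, "_type": doc_type, "_id": str(doc_id)}}


if __name__ == "__main__":
    print(json.dumps(build_index_body(), indent=1))
    print(json.dumps(build_action(1)))
```

The printed mapping could be sent as the body of the PUT request from step 2, and the printed action line used in place of the one with _type: text.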

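On the other hand, I guess your our.json was not created correctly: you have to separate each JSON object with \n, so that the action line and the document each sit on their own line.

Your example would then look like this: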

{"index": {"_index": "stuff", "_type": "text", "_id": "1"}}
{"lastversion":"2018-01-19","attribution":[],"description":"","notes":[],"alt_names":[],"sources":[],"urls":["https://www.fireeye.com/blog/threat-research/2018/01/microsoft-office-vulnerabilities-used-to-distribute-zyklon-malware.html"],"common_name":"anonym","samples":[{"status":"dumped","sha256":"8d0be4dd8b0ca7608bf3a02a2d212ce845ac733d150b4428376a5a939f1679ec","version":""}]}
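For files like the one in the question, where the action and the document are fused on a single line, a small script can split them apart again. The following is a sketch rather than a tested migration tool: split_concatenated and to_bulk_lines are hypothetical helper names, and the sample input is a shortened stand-in for the real file. It relies only on Python's standard json module.

```python
import json


def split_concatenated(text):
    """Yield each JSON value found back to back in text."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here
        if text[pos].isspace():
            pos += 1
            continue
        value, pos = decoder.raw_decode(text, pos)
        yield value


def to_bulk_lines(text):
    """Return a bulk body: one JSON object per line, ending with a newline."""
    lines = [json.dumps(value, separators=(",", ":"))
             for value in split_concatenated(text)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Shortened stand-in for the fused line shown in the question
    fused = ('{"index": {"_index": "stuff", "_type": "text", "_id": "1"}}'
             '{"common_name": "anonym", "urls": []}')
    print(to_bulk_lines(fused), end="")
```

The output can be written to a file and posted with the same curl --data-binary command as in the question. The trailing newline matters, because the bulk API expects the last line of the body to end with \n as well.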