Unable to use ElasticDump

Date: 2016-08-24 11:48:24

Tags: database elasticsearch import

I have some JSON files that came from an ElasticSearch database, and I am trying to import them with ElasticDump.

This is the mapping file, "mylog.mapping.json":

[
"{\"mylog\":{\"mappings\":{\"search_log\":{\"_timestamp\":{\"enabled\":true,\"store\":true},\"properties\":{\"preArray\":{\"type\":\"long\"},\"preId\":{\"type\":\"string\"},\"filteredSearch\":{\"type\":\"string\"},\"hits\":{\"type\":\"long\"},\"search\":{\"type\":\"string\"},\"searchType\":{\"properties\":{\"name\":{\"type\":\"string\"}}}}}}}}"
]

And this is the file containing the data itself, "mylog.json":

{"_index":"mylog","_type":"search_log","_id":"AU5AcRy7dbXLQfUndnNS","_score":1,"_source":{"searchType":{"name":"TypeSearchOne"},"search":"test","filteredSearch":"test","hits":1470,"preId":"","preArray":[47752,51493,52206,50159,52182,53243,43237,51329,42772,44938,44945,44952,42773,58319,43238,48963,52856,52185,47751,61542,51327,42028,51341,45356,44853,44939,48587,42774,43063,98779,46235,53533,47745,48844,44979,53209,47738,98781,47757,44948,44950,48832,97529,52186,96033,53002,48419,44943,44955,52179]},"fields":{"_timestamp":1435600231611}}
{"_index":"mylog","_type":"search_log","_id":"AU5AcSdcdbXLQfUndnNd","_score":1,"_source":{"searchType":{"name":"TypeSearchTwo"},"search":"squared","filteredSearch":"squared","hits":34,"preId":null,"preArray":null},"fields":{"_timestamp":1435600234333}}
{"_index":"mylog","_type":"search_log","_id":"AU5AcSiZdbXLQfUndnNj","_score":1,"_source":{"searchType":{"name":"TypeSearchOne"},"search":"test","filteredSearch":"test","hits":1354,"preId":"","preArray":[55808,53545,53543,53651,55937,53544,54943,54942,54941]},"fields":{"_timestamp":1435600234649}}

...

{"_index":"mylog","_type":"search_log","_id":"AU5DSVzLdbXLQfUndnPp","_score":1,"_source":{"searchType":{"name":"TypeSearchOne"},"search":"lee","filteredSearch":"lee","hits":39,"preId":"53133","preArray":null},"fields":{"_timestamp":1435647958219}}
{"_index":"mylog","_type":"search_log","_id":"AU5D7M42dbXLQfUndnR9","_score":1,"_source":{"searchType":{"name":"TypeSearchOne"},"search":"leerwww","filteredSearch":"leerwww","hits":39,"preId":"53133","preArray":null},"fields":{"_timestamp":1435658669622}}

To import this data into my ElasticSearch server, I ran the following ElasticDump commands:

elasticdump --input=/home/user/Desktop/LOGDATA/mylog.mapping.json --output=http://localhost:9200/mylog --type=mapping
elasticdump --input=/home/user/Desktop/LOGDATA/mylog.json --output=http://localhost:9200/mylog --type=data

After that, the data is available, but the _timestamp field is nowhere to be seen. If I check the mapping, this is what I get:

user@computer:~$ curl -XGET 'localhost:9200/mylog/_mapping'

{
    "mylog":{
        "mappings":{
            "search_log":{
                "properties":{
                    "preArray":{"type":"long"},
                    "preId":{"type":"string"},
                    "filteredSearch":{"type":"string"},
                    "hits":{"type":"long"},
                    "search":{"type":"string"},
                    "searchType":{"properties":{"name":{"type":"string"}}}
                }
            }
        }
    }
}

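As you can see, the _timestamp field is missing, even though it is specified in the mapping file. Why does this happen, and how can I import the data without losing the timestamps?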

1 Answer:

Answer 0 (score: 1)

As of 2.0, _timestamp is deprecated. It is a special type of field known as a meta-field. It still exists in 5.0 (at least for now), but you should not rely on it, and you should expect it to be removed.

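As with other meta-fields, you should not expect to be able to modify its mapping (for example, by specifying "store": true, as your mapping file does), and it is not meant to be set as part of the document.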

What you should do instead is set the field as a request parameter:

PUT my_index/my_type/1?timestamp=1435600231611
{"searchType":{"name":"TypeSearchOne"},"search":"test","filteredSearch":"test","hits":1470,"preId":"","preArray":[47752,51493,52206,50159,52182,53243,43237,51329,42772,44938,44945,44952,42773,58319,43238,48963,52856,52185,47751,61542,51327,42028,51341,45356,44853,44939,48587,42774,43063,98779,46235,53533,47745,48844,44979,53209,47738,98781,47757,44948,44950,48832,97529,52186,96033,53002,48419,44943,44955,52179]}
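As a rough sketch of how such a request could be derived from one line of the dump: the function below is hypothetical (its name and return shape are not part of ElasticDump or any Elasticsearch client). It only builds the method, path and body; it sends nothing. It assumes the dump line has the same shape as the lines in "mylog.json" and that the target server still accepts the timestamp request parameter.

```python
import json
from urllib.parse import quote, urlencode


def index_request_from_dump_line(line):
    """Build (method, path, body) for indexing one dumped record.

    Hypothetical helper: it reads fields._timestamp from the dump line
    and passes it as the `timestamp` request parameter shown above.
    """
    record = json.loads(line)
    path = "/{}/{}/{}".format(
        quote(record["_index"], safe=""),
        quote(record["_type"], safe=""),
        quote(record["_id"], safe=""),
    )
    timestamp = (record.get("fields") or {}).get("_timestamp")
    if timestamp is not None:
        path += "?" + urlencode({"timestamp": timestamp})
    # Only the _source object is sent as the document body.
    body = json.dumps(record["_source"], separators=(",", ":"))
    return "PUT", path, body
```

The resulting tuple could then be handed to whichever HTTP client you prefer.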

I don't know ElasticDump well enough to say whether it can be told to "do the right thing", but there is actually a better option here:

Modify your JSON input to remove _timestamp and replace it with a normal field named timestamp (or any name you choose), mapped as a date:

"mappings": {
  "my_type": {
    "properties": {
      "timestamp": {
        "type": "date"
      },
      ...
    }
  }
}
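The same change could be applied to the mapping file programmatically. The sketch below assumes the file has the shape shown in the question (a JSON array whose entries are JSON-encoded strings); the function name is made up for illustration, and whether ElasticDump accepts the rewritten file unchanged is an assumption that should be checked against your version.

```python
import json


def rewrite_mapping(mapping_file_text, field_name="timestamp"):
    """Drop the _timestamp meta-field and add a plain date field instead.

    Hypothetical helper; expects text shaped like mylog.mapping.json:
    a JSON array of JSON-encoded mapping strings.
    """
    entries = json.loads(mapping_file_text)
    rewritten = []
    for entry in entries:
        mapping = json.loads(entry)  # {"<index>": {"mappings": {"<type>": {...}}}}
        for index_body in mapping.values():
            for type_body in index_body.get("mappings", {}).values():
                type_body.pop("_timestamp", None)
                properties = type_body.setdefault("properties", {})
                properties[field_name] = {"type": "date"}
        rewritten.append(json.dumps(mapping, separators=(",", ":")))
    # Re-encode in the same array-of-strings shape as the input.
    return json.dumps(rewritten)
```

Reading "mylog.mapping.json", passing its text through this function and writing the result to a new file leaves the original untouched in case something goes wrong.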

Note that your ElasticDump input puts _timestamp under fields, separate from _source, so you must make sure your find/replace joins the two together correctly:

},"fields":{"_timestamp"

should become:

,"timestamp"
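If a textual find/replace feels fragile, the same move can be done by parsing each line as JSON. This is a minimal sketch, assuming one JSON record per line as in "mylog.json"; the function names and the output path are placeholders, not anything ElasticDump provides.

```python
import json


def move_timestamp_into_source(line, field_name="timestamp"):
    """Move fields._timestamp into _source[field_name] for one dump line."""
    record = json.loads(line)
    fields = record.get("fields") or {}
    if "_timestamp" in fields:
        record.setdefault("_source", {})[field_name] = fields.pop("_timestamp")
    if fields:
        record["fields"] = fields
    else:
        # Nothing left under "fields", so drop the key entirely.
        record.pop("fields", None)
    return json.dumps(record, separators=(",", ":"))


def convert_file(src_path, dst_path):
    """Rewrite every non-empty line of src_path into dst_path."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if line.strip():
                dst.write(move_timestamp_into_source(line) + "\n")
```

For example, convert_file("mylog.json", "mylog.fixed.json") would produce a file that can be passed to the --type=data import in place of the original.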