I want to import a text file into Elasticsearch. Each line of the text file contains 3 values. After several hours of struggling I haven't managed it. Help would be much appreciated.
Elasticsearch 5.4.0 with Logstash installed.
Sample data:
username email hash
username email hash
username email hash
username email hash
username email hash
I also built a Python script, but it is too slow:
from elasticsearch import Elasticsearch

es = Elasticsearch([{'host': 'localhost', 'port': 9200}])

i = 1
with open("my2") as fileobject:
    for line in fileobject:
        # pw_hash rather than hash, to avoid shadowing the built-in hash()
        username, email, pw_hash = line.strip('\n').split(' ')
        body = {"username": username, "email": email, "password": pw_hash}
        # One HTTP request per document -- this is what makes the script slow.
        es.index(index='dbs', doc_type='db1', id=i, body=body)
        i += 1
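Indexing one document per request is the bottleneck; the bulk API sends many documents in a single request. A minimal sketch using the bulk helper that ships with the elasticsearch-py client (same file name, index, and type as the script above; assumes the 5.x client):

from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch([{'host': 'localhost', 'port': 9200}])

def actions(path):
    # Yield one bulk action per input line instead of issuing a request each.
    with open(path) as fileobject:
        for i, line in enumerate(fileobject, start=1):
            username, email, pw_hash = line.strip('\n').split(' ')
            yield {
                "_index": "dbs",
                "_type": "db1",
                "_id": i,
                "_source": {"username": username, "email": email, "password": pw_hash},
            }

bulk(es, actions("my2"))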
Edit: Thanks, it works, but I guess my filter is bad, because I want it to look like this:
{
  "_index": "logstash-2017.06.01",
  "_type": "db",
  "_id": "AVxinqK5XRvft8kN7Q6M",
  "_version": 1,
  "_score": null,
  "_source": {
    "username": "Marlb0ro",
    "email": "Marlb0ro@site.com",
    "hash": "123456"
  }
}
Instead, it produces documents like this:
{
  "_index": "logstash-2017.06.01",
  "_type": "logs",
  "_id": "AVxinqK5XRvft8kN7Q6M",
  "_version": 1,
  "_score": null,
  "_source": {
    "path": "C:/Users/user/Desktop/user/log.txt",
    "@timestamp": "2017-06-01T07:46:22.488Z",
    "@version": "1",
    "host": "DESKTOP-FNGSJ6C",
    "message": "username email password",
    "tags": [
      "_grokparsefailure"
    ]
  },
  "fields": {
    "@timestamp": [
      1496303182488
    ]
  },
  "sort": [
    1496303182488
  ]
}
Answer 0 (score: 1)
Simply put this in a file named grok.conf:
input {
  file {
    path => "/path/to/your/file.log"
    start_position => beginning
    sincedb_path => "/dev/null"
  }
}
filter {
  grok {
    # %{WORD} cannot match an address like Marlb0ro@site.com, which is what
    # triggers the _grokparsefailure tag; EMAILADDRESS is a stock grok pattern.
    match => { "message" => "%{WORD:username} %{EMAILADDRESS:email} %{WORD:hash}" }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
Then run Logstash with bin/logstash -f grok.conf and you should be fine.
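To also get documents shaped like the desired example above (_type "db" and only the three fields in _source), the filter and output sections can be extended. A sketch, assuming the stock mutate filter and the document_type option of the 5.x elasticsearch output plugin:

filter {
  grok {
    match => { "message" => "%{WORD:username} %{EMAILADDRESS:email} %{WORD:hash}" }
  }
  mutate {
    # Drop the Logstash bookkeeping fields so _source keeps only the three values.
    # @timestamp stays, since the default logstash-%{+YYYY.MM.dd} index name needs it.
    remove_field => ["message", "path", "host", "@version"]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    document_type => "db"
  }
}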