I need to migrate several million records from a SQL database to Elasticsearch. We currently insert records into ES through GELF HTTP, but doing it one record at a time is not feasible at that scale.
I've been working on this for a few days and I'm new to both Graylog and Elasticsearch. I'm trying to find a way to bulk-insert messages into ES and have them show up in Graylog. I've been using Cerebro to monitor the indices and the message count in each one. When I perform a bulk insert, the message count does go up in the correct index, but I can't see the messages in Graylog.
Here is what I have:
// ElasticsearchCRUD context (not used by the bulk call below)
var _elasticsearchContext = new ElasticsearchContext(ConnectionString, new ElasticsearchMappingResolver());

// NEST client, with Auditing_Dev mapped to the auditing-dev_0 index by default
var connectionSettings = new ConnectionSettings(new Uri(ConnectionString))
    .MapDefaultTypeIndices(m => m.Add(typeof(Auditing_Dev), "auditing-dev_0"));
var elasticClient = new ElasticClient(connectionSettings);

var items = new List<Auditing_Dev>();
// I loop through a DataReader creating new Auditing_Dev objects
// and add them to the items collection

var bulkResponse = elasticClient.Bulk(b => b.IndexMany(items,
    (d, doc) => d.Document(doc).Index("auditing-dev_0").Type("message")));
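As a sanity check, the bulk response can also be inspected for per-item failures; a minimal sketch, assuming NEST 2.x (which exposes Errors and ItemsWithErrors on the response):

if (bulkResponse.Errors)
{
    // Each failed item carries the server-side error for that document
    foreach (var itemWithError in bulkResponse.ItemsWithErrors)
    {
        Console.WriteLine("Failed to index document {0}: {1}",
            itemWithError.Id, itemWithError.Error);
    }
}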
I get a valid response back, and in Cerebro I can see the document count increase in the auditing-dev_0 index. When I compare a message inserted via Bulk with one inserted via the HTTP request, the index and type are identical.
The message I insert via Bulk:
{
  "_index" : "auditing-dev_0",
  "_type" : "message",
  "_id" : "AVsWWn-jNp2NX1vOria1",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "level" : 5,
    "origin" : "10.80.3.2",
    "success" : true,
    "type" : "Company.Enterprise",
    "user" : "stupid@dropdown.test",
    "gl2_source_input" : "57193c1d0cf25a44afc31c15",
    "gl2_source_node" : "5866cc80-382e-4287-ae5b-8a0a68a9a1f1",
    "gl2_remote_ip" : "10.100.20.164",
    "gl2_remote_port" : 52273,
    "streams" : [ "578fbabe738a897c6d91336b" ]
  }
}
Compared to one inserted via HTTP:
{
  "_index" : "auditing-dev_0",
  "_type" : "message",
  "_id" : "e3d34d50-0a8a-11e7-84bb-00155d007a32",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "level" : 5,
    "gl2_remote_ip" : "192.168.211.114",
    "origin" : "192.168.211.35",
    "gl2_remote_port" : 2960,
    "streams" : [ "578fbabe738a897c6d91336b" ],
    "gl2_source_input" : "57193c1d0cf25a44afc31c15",
    "success" : "True",
    "gl2_source_node" : "5866cc80-382e-4287-ae5b-8a0a68a9a1f1",
    "user" : "admin@purple-pink.test",
    "timestamp" : "2017-03-16 22:43:44.000"
  }
}
I notice the _id is in a different format, but does that matter?
There is only one input in Graylog, and it is a GELF HTTP input. Do I need to add a new input?
Answer 0 (score: 0)
It turned out the timestamp field was missing. Who knew? Graylog's search is time-based, so documents without a timestamp never show up in its results, even though they are in the index.
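A minimal sketch of the fix, assuming a hypothetical Timestamp property on Auditing_Dev and hypothetical DataReader column names; the "yyyy-MM-dd HH:mm:ss.fff" format matches the HTTP-inserted message above, and NEST camel-cases property names by default, so the property is serialized as "timestamp":

public class Auditing_Dev
{
    public int Level { get; set; }
    public string Origin { get; set; }
    public bool Success { get; set; }
    public string User { get; set; }

    // Hypothetical property added for illustration; NEST's default field
    // naming serializes "Timestamp" as "timestamp" in the JSON document.
    public string Timestamp { get; set; }
}

// While looping through the DataReader (column names are assumptions):
items.Add(new Auditing_Dev
{
    Level = (int)reader["Level"],
    Origin = reader["Origin"].ToString(),
    Success = (bool)reader["Success"],
    User = reader["User"].ToString(),
    Timestamp = ((DateTime)reader["AuditDate"]).ToString("yyyy-MM-dd HH:mm:ss.fff")
});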