I have built an index that pulls data from a table in SQL Server:
{
"type":"jdbc",
"jdbc":
{
"driver":"com.microsoft.sqlserver.jdbc.SQLServerDriver",
"url":"jdbc:sqlserver://[my_ip];databaseName=mega",
"user":"sa","password":"******",
"sql":"SELECT [OrderID],[CustomerName],[UserFullName],[Status] FROM [Orders_Table]",
"poll":"5s",
"index": "mega",
"type": "orders_search",
"schedule" : "0 0-59 0-23 ? * *"
}
}
The problem is that I get irrelevant query results.
For example, [ 5220668 ] is a row key that appears only once in the database.
{
"from" : 0, "size" : 5,
"query": {
"multi_match": {
"query": "5220668",
"fields": [ "_all" ]
}
}
}
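Result: the result is wrong. I expected a single hit, matching the single row in the database. Instead, the query returns the entire life cycle of the row's status: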
{
"took": 12,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 4,
"max_score": null,
"hits": [
{
"_index": "mega",
"_type": "handledorders_search",
"_id": "AU3OlBkh6JN7xIrOkzjm",
"_score": null,
"_source": {
"Status": "NEW",
"Date": "2015-06-07T03:00:12.110Z",
"UserFullName": "my name",
"CustomerName": "cust name",
"OrderID": 5220668
},
"sort": [
1433646012110
]
},
{
"_index": "mega",
"_type": "handledorders_search",
"_id": "AU3Ok0E-6JN7xIrOkvpF",
"_score": null,
"_source": {
"Status": "NEW",
"Date": "2015-06-07T03:00:12.110Z",
"UserFullName": "my name",
"CustomerName": "cust name",
"OrderID": 5220668
},
"sort": [
1433646012110
]
},
{
"_index": "mega",
"_type": "handledorders_search",
"_id": "AU3Ole0-6JN7xIrOk7Yo",
"_score": null,
"_source": {
"Status": "FIX",
"Date": "2015-06-07T03:00:12.110Z",
"UserFullName": "my name",
"CustomerName": "cust name",
"OrderID": 5220668
},
"sort": [
1433646012110
]
},
{
"_index": "mega",
"_type": "handledorders_search",
"_id": "AU3OlQL86JN7xIrOk3eH",
"_score": null,
"_source": {
"Status": "CLOSE",
"Date": "2015-06-07T03:00:12.110Z",
"UserFullName": "my name",
"CustomerName": "cust name",
"ExternalOrderID": 5220668
},
"sort": [
1433646012110
]
}
]
}
}
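Answer 0 (score: 1)
I understand that you are using the _river plugin or something similar, and that you rely on Elasticsearch polling the MSSQL data.
The tricky part is that when a row changes, Elasticsearch does not know whether it should update an existing document or create a new one. You know the document is the same, but ES does not, so every poll indexes the row again under a freshly generated _id. You need to tell ES that it is the same document.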
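To illustrate the difference, here is a minimal sketch of a bulk request body in which both index actions carry the same explicit _id. The _id value "5220668" and the reduced field set are assumptions chosen for illustration; the index and type names are taken from the river definition in the question. Because the _id is the same, the second action should replace the first document instead of adding a second one:

```json
{ "index": { "_index": "mega", "_type": "orders_search", "_id": "5220668" } }
{ "OrderID": 5220668, "Status": "NEW" }
{ "index": { "_index": "mega", "_type": "orders_search", "_id": "5220668" } }
{ "OrderID": 5220668, "Status": "CLOSE" }
```

Without the "_id" entries, ES generates a new identifier for each action (like the "AU3O..." values in your hits) and both documents are kept.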
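There are two ways to do this. The first is to tell ES that a particular field is the unique identifier. You would create a mapping similar to this:
{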
"mega" : {
"_id" : {
"path" : "OrderID"
}
}
}
Note that this approach has been deprecated since 1.5.0:
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-id-field.html
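The other option, and the simplest one, is to map OrderID to _id in the SQL statement of the river definition.
More information: http://blog.pluralsight.com/elasticsearch-and-sql-server
Quoting that post: "The select statement with the alias tells SQL Server to return the primary key field 'ID' as '_id'. This is the default key convention that Elasticsearch uses for all documents. It is important to keep this terminology when selecting data so that Elasticsearch knows to update the document rather than creating a new one on each poll."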
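Applied to the river definition from the question, this could look roughly like the sketch below. It assumes the JDBC river treats a column aliased as _id as the document identifier, as the quoted post describes, and that OrderID is unique per order. Everything except the added "[OrderID] AS _id" column is copied from the question:

```json
{
  "type": "jdbc",
  "jdbc": {
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    "url": "jdbc:sqlserver://[my_ip];databaseName=mega",
    "user": "sa", "password": "******",
    "sql": "SELECT [OrderID] AS _id, [OrderID], [CustomerName], [UserFullName], [Status] FROM [Orders_Table]",
    "poll": "5s",
    "index": "mega",
    "type": "orders_search",
    "schedule": "0 0-59 0-23 ? * *"
  }
}
```

Selecting [OrderID] a second time without the alias keeps it available as a regular searchable field. Documents that were already indexed under auto-generated IDs are not touched by this change, so the existing duplicates would still need to be deleted, or the index rebuilt, before the search returns a single hit.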