Question

This is my JDBC river command, which fetches all records from the database.

localhost:9200/_river/my_update_river/_meta
{
  "type" : "jdbc",
  "jdbc" : {
    "url" : "jdbc:mysql://localhost:3306/admin",
    "user" : "root",
    "password" : "",
    "poll" : "6s",
    "index" : "updateauto",
    "type" : "users",
    "schedule" : "0/10 * * ? * *",
    "strategy" : "simple",
    "sql" : "select * from users"
  }
}

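When I run this command, I run into two problems:

1. Duplicate records.
2. When I add a new record to the database, the indexed documents are not updated.

I am searching with:

{ "query": { "filtered": { "filter": { "term": { "Name": "testing" } } } } }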

This is my result.

   {
     "took" : 4,
     "timed_out" : false,
      "_shards" : {
      "total" : 5,
      "successful" : 5,
      "failed" : 0
   },
     "hits" : {
     "total" : 37551,
      "max_score" : 1.0,
      "hits" : [ {
      "_index" : "updateauto",
      "_type" : "users",
      "_id" : "AUvjnNHmMKBTPrby96Jg",
      "_score" : 1.0,
      "_source":{"ID":23,"Name":"Abudul  Rafay","Email":"a","Password":"afasd"}
}, {
      "_index" : "updateauto",
     "_type" : "users",
     "_id" : "AUvjnNHnMKBTPrby96Jk",
    "_score" : 1.0,
     "_source":{"ID":25,"Name":"r rafay ","Email":"r rafay","Password":"r rafay"}
}, {
      "_index" : "updateauto",
      "_type" : "users",
       "_id" : "AUvjngk0MKBTPrby96Ka",
      "_score" : 1.0,
      "_source":{"ID":23,"Name":"Abudul Rafay","Email":"a","Password":"afasd"}
}, {
     "_index" : "updateauto",
     "_type" : "users",
     "_id" : "AUvjngk0MKBTPrby96Kf",
     "_score" : 1.0,
     "_source":{"ID":24,"Name":"rafay","Email":"hello","Password":"fasfas"}
}, {
      "_index" : "updateauto",
      "_type" : "users",
     "_id" : "AUvjnjA0MKBTPrby96Kh",
     "_score" : 1.0,
     "_source":{"ID":23,"Name":"Abudul Rafay","Email":"a","Password":"afasd"}
}, {
     "_index" : "updateauto",
      "_type" : "users",
    "_id" : "AUvjnjA0MKBTPrby96Km",
    "_score" : 1.0,
    "_source":{"ID":24,"Name":"rafay","Email":"hello","Password":"fasfas"}
},  {
    "_index" : "updateauto",
    "_type" : "users",
    "_id" : "AUvjnZP0MKBTPrby96KD",
    "_score" : 1.0,
    "_source":{"ID":24,"Name":"rafay","Email":"hello","Password":"fasfas"}
}, {
    "_index" : "updateauto",
    "_type" : "users",
    "_id" : "AUvjnPe-MKBTPrby96Jq",
   "_score" : 1.0,
    "_source":{"ID":25,"Name":"r rafay ","Email":"r rafay","Password":"r rafay"}
}, {
    "_index" : "updateauto",
    "_type" : "users",
   "_id" : "AUvjnR7NMKBTPrby96Ju",
    "_score" : 1.0,
    "_source":{"ID":26,"Name":"New User","Email":"New","Password":"new"}
}, {
    "_index" : "updateauto",
    "_type" : "users",
    "_id" : "AUvjnbuLMKBTPrby96KO",
    "_score" : 1.0,
    "_source":{"ID":26,"Name":"New User","Email":"New","Password":"new"}
    } ]
   }
 }

I want results without duplicate records, and I want the index to update automatically.

Answer 1

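I didn't fully understand your second problem, but for the duplicates, here is what you need to do:

You need to specify the document ID in the river definition, like this: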

"sql" : "select *, ID as _id from users"

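This way, the river overwrites each user's document by looking at its ID. Without an explicit _id, Elasticsearch generates a new ID for every row on every run (the auto-generated IDs such as "AUvjnNHmMKBTPrby96Jg" in your result), so each scheduled run indexes the same rows again as new documents.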
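
To illustrate why a fixed ID removes the duplicates, here is a minimal Python sketch. It does not talk to Elasticsearch or the river plugin; `TinyIndex` and `poll` are hypothetical names that only model the behaviour described above (auto-generated ID when none is given, overwrite when the ID already exists), and the two rows are taken from your result.

```python
import itertools


class TinyIndex:
    """Hypothetical in-memory stand-in for an index: maps _id to a document."""

    def __init__(self):
        self.docs = {}
        self._auto = itertools.count(1)

    def index(self, source, doc_id=None):
        # No explicit ID: a fresh one is generated, so the same row
        # becomes a new document every time it is indexed.
        if doc_id is None:
            doc_id = f"auto-{next(self._auto)}"
        # Explicit ID: an existing document with that ID is overwritten.
        self.docs[doc_id] = source


def poll(index, rows, use_row_id):
    """Model one scheduled run of 'select ... from users'."""
    for row in rows:
        index.index(row, doc_id=str(row["ID"]) if use_row_id else None)


rows = [
    {"ID": 23, "Name": "Abudul Rafay"},
    {"ID": 24, "Name": "rafay"},
]

without_id = TinyIndex()  # models "select * from users"
with_id = TinyIndex()     # models "select *, ID as _id from users"

for _ in range(3):  # three scheduled runs over the same table
    poll(without_id, rows, use_row_id=False)
    poll(with_id, rows, use_row_id=True)

print(len(without_id.docs))  # 6: every run added both rows again
print(len(with_id.docs))     # 2: one document per user
```

In the same model, a row added to the table later shows up as one new document on the next run, while the existing ones are simply rewritten in place.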
