I am using mongo-connector to sync data from a MongoDB replica set to Elasticsearch, with elastic2_doc_manager as the doc manager.
I run mongo-connector like this:
$ mongo-connector --auto-commit-interval=5 --verbose -m 127.0.0.1:27017 -t localhost:9200 -d elastic2_doc_manager --namespace-set=db.collection1,db.collection2 --fields=f1,f2,f3
At some point I get this exception:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
self.run()
File "/usr/local/lib/python2.7/dist-packages/mongo_connector/util.py", line 85, in wrapped
func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/mongo_connector/oplog_manager.py", line 261, in run
docman.upsert(doc, ns, timestamp)
File "/usr/local/lib/python2.7/dist-packages/mongo_connector/util.py", line 32, in wrapped
return f(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/mongo_connector/doc_managers/elastic2_doc_manager.py", line 150, in upsert
doc_id = u(doc.pop("_id"))
I wrapped the method at File "/usr/local/lib/python2.7/dist-packages/mongo_connector/doc_managers/elastic2_doc_manager.py", line 148, in a try/except so that the offending document is printed when the exception occurs.
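A minimal sketch of that kind of wrapper, not the actual change made to elastic2_doc_manager.py: the decorator name log_doc_on_key_error, the logger name, and the stand-in upsert function are all hypothetical. It assumes the failure is a KeyError raised by doc.pop("_id"), which the traceback suggests but does not show.

```python
import functools
import logging

logger = logging.getLogger("upsert_debug")  # hypothetical logger name


def log_doc_on_key_error(func):
    """Log the offending document when func raises KeyError, then re-raise."""
    @functools.wraps(func)
    def wrapper(doc, *args, **kwargs):
        # Copy before the call: the wrapped function may mutate the document.
        snapshot = dict(doc)
        try:
            return func(doc, *args, **kwargs)
        except KeyError:
            logger.error("document without expected key: %r", snapshot)
            raise
    return wrapper


@log_doc_on_key_error
def upsert(doc, namespace, timestamp):
    # Stand-in for the doc manager's upsert; it mirrors only the failing line.
    doc_id = str(doc.pop("_id"))
    return doc_id, doc
```

Re-raising after logging keeps the connector's behaviour unchanged, and taking the snapshot before the call means the log shows the document as it arrived rather than after any fields were popped.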
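Somehow, _id is missing from the printed document. However, if I query mongo directly from the interactive shell, I can fetch the same document, and the _id key is present.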
So I don't understand why mongo-connector/elastic2_doc_manager does not see the _id attribute for some documents.
Answer 0 (score: 0)
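For whatever reason, mongo-connector seems to remove _id from your document. However, the string representation of the MongoDB ObjectId is stored as _id in Elasticsearch. It is still there, just not inside the document itself, or the "source" as Elasticsearch calls it.
Look at a query result; it is structured like this: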
{
"took" : 7,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"failed" : 0
},
"hits" : {
"total" : 135513,
"max_score" : 1.0,
"hits" : [ {
"_index" : "myIndex",
"_type" : "myType",
"_id" : "5294b93e6c255bb82d0000c0", <-- ID from mongodb
"_score" : 1.0,
"_source":{
"some": "data",
"my": "document"
}
}, {
"_index" : "myIndex",
"_type" : "myType",
"_id" : "5294b93e6c255bb82d0000de", <-- ID from mongodb
"_score" : 1.0,
"_source":{
"some": "data2",
"my": "document2"
}
}]
}
}
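My impression is that mongo-connector does this on purpose, so that _id is stored only in the corresponding ES field, although I see no reason to also remove _id from the document's _source. I also noticed that documents in ES lack the id when using elastic_doc_manager (v1).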
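If the id is needed next to the other fields when reading results back, it can be copied from the hit metadata into each document. The following is a minimal sketch under that assumption: hits_with_ids is a hypothetical helper, not part of mongo-connector or the Elasticsearch client, and the response dict is a trimmed copy of the example above.

```python
def hits_with_ids(response, id_field="_id"):
    """Return each hit's _source with the hit-level _id copied back in."""
    docs = []
    for hit in response["hits"]["hits"]:
        doc = dict(hit.get("_source", {}))  # copy so the response is not mutated
        doc[id_field] = hit["_id"]
        docs.append(doc)
    return docs


# Trimmed version of the search response shown above.
response = {
    "hits": {
        "hits": [
            {"_id": "5294b93e6c255bb82d0000c0",
             "_source": {"some": "data", "my": "document"}},
            {"_id": "5294b93e6c255bb82d0000de",
             "_source": {"some": "data2", "my": "document2"}},
        ]
    }
}

docs = hits_with_ids(response)
```

Each entry in docs then carries the MongoDB id as an ordinary field, while the stored _source in Elasticsearch stays as it was.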
Answer 1 (score: 0)
Run mongo-connector -c config.json. Below is a sample config.json in which you can configure _id correctly.
Define the following in the .json file:
"docManagers": [
{
"docManager": "elastic2_doc_manager",
"__targetURL": "localhost:9200",
"bulkSize": 5000,
"uniqueKey": "_id",
"__autoCommitInterval": null,
"args": {
"aws": {
"region_name": "your-choice"
}
}