I am quite new to MongoDB and am using it in my Java project.
I have the following document structure in my collection:
{
"_id":"ProcessX",
"tasks":[
{
"taskName":"TaskX",
"taskTime":"2018-08-09T13:38:58.317Z",
"crawledList":[
"http://dbpedia.org/ontology/birthYear"
]
},
{
"taskName":"TaskX",
"taskTime":"2018-08-10T06:19:32.006Z",
"crawledList":[
"http://dbpedia.org/ontology/birthYear",
"http://dbpedia.org/page/Mo_Chua_of_Balla"
]
},
{
"taskName":"TaskY",
"taskTime":"2018-08-10T06:21:58.737Z",
"crawledList":[
"http://dbpedia.org/page/Mo_Chua_of_Balla"
]
}
]
}
I want to put a "newURI" into a task's crawledList if it does not already exist there. Here is the process:
I don't want to retrieve documents into memory and work with plain Java types (Lists etc.). Can you help me write the most efficient code using the MongoDB Java driver's commands?
I don't have any indexes defined because I don't know which indexes I should define. I can also change the document structure if there is a better way to represent them and do this operation faster.
Thank you in advance.
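Answer 0 (score: 0)
In the end, after reading the Java driver documentation and searching the web, I managed to implement the following two functions: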
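// Static imports used by both functions:
// import static com.mongodb.client.model.Filters.*;
// import static com.mongodb.client.model.Updates.push;

// Returns true if the current task's crawledList already contains the IRI.
public boolean crawledBefore(IRI iri) {
    return collection.countDocuments(
            and(eq("_id", CrawlProcess.getProcessName()),
                elemMatch("tasks", and(eq("taskName", CrawlProcess.getTaskName()),
                                       eq("taskTime", CrawlProcess.getCreationTime()),
                                       in("crawledList", iri.toString()))))) != 0;
}

// Appends the IRI to the current task's crawledList unless it is already there.
// Note: the check and the push are two separate round trips, so they are not atomic.
public void addToStore(IRI iri) {
    if (!crawledBefore(iri)) {
        collection.updateOne(
                and(eq("_id", CrawlProcess.getProcessName()),
                    elemMatch("tasks", and(eq("taskName", CrawlProcess.getTaskName()),
                                           eq("taskTime", CrawlProcess.getCreationTime())))),
                // "$" refers to the first tasks element matched by the filter above
                push("tasks.$.crawledList", iri.toString()));
    }
}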
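This is how it works:
The crawledBefore() function takes an IRI and checks whether any document exists that has this IRI in the crawledList array of a task document, where the task document is embedded in the process document. A process document with the given process name, task name and time always exists in my collection; all I check here is whether the IRI is present in that document.
If it is not, the second function adds the new IRI to the crawledList of that particular task document within the process document.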
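As a possible alternative to the check-then-push pair, MongoDB's $addToSet update operator appends a value to an array only when it is not already there, so the existence check and the insert can happen in a single updateOne call. The Java driver exposes it as Updates.addToSet, which could be used in place of push with the same "tasks.$.crawledList" path. The plain-Java sketch below does not talk to a database; it only models that behaviour on an in-memory list so the semantics are easy to see. The class AddToSetSketch, the Task type and the addToSet helper are hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class AddToSetSketch {

    /** Hypothetical in-memory stand-in for one element of the "tasks" array. */
    static final class Task {
        final String taskName;
        final String taskTime;
        final List<String> crawledList = new ArrayList<>();

        Task(String taskName, String taskTime) {
            this.taskName = taskName;
            this.taskTime = taskTime;
        }
    }

    /**
     * Models an update whose filter matches a task by name and time and whose
     * update is {$addToSet: {"tasks.$.crawledList": uri}}: the first matching
     * task is located and the URI is appended only if it is absent.
     *
     * @return true if the list was changed, false if the URI was already
     *         present or no task matched
     */
    static boolean addToSet(List<Task> tasks, String taskName, String taskTime, String uri) {
        for (Task task : tasks) {
            if (task.taskName.equals(taskName) && task.taskTime.equals(taskTime)) {
                if (task.crawledList.contains(uri)) {
                    return false; // already crawled: nothing to do
                }
                task.crawledList.add(uri);
                return true;
            }
        }
        return false; // no task with that name and time
    }

    public static void main(String[] args) {
        List<Task> tasks = new ArrayList<>();
        tasks.add(new Task("TaskX", "2018-08-09T13:38:58.317Z"));

        String uri = "http://dbpedia.org/ontology/birthYear";
        // First call changes the list, the repeated call does not.
        System.out.println(addToSet(tasks, "TaskX", "2018-08-09T13:38:58.317Z", uri));
        System.out.println(addToSet(tasks, "TaskX", "2018-08-09T13:38:58.317Z", uri));
        System.out.println(tasks.get(0).crawledList);
    }
}
```

With the real driver, UpdateResult.getModifiedCount() should play the role of the boolean returned here, so a separate crawledBefore() call would no longer be needed just to avoid duplicates.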
Cheers.