Inserting an element into an Array in an embedded document in MongoDB

Time: 2018-08-22 13:57:21

Tags: java mongodb document

I am quite new to MongoDB and am using it in my Java project.

I have the following document structure in my collection:

{
  "_id": "ProcessX",
  "tasks": [
    {
      "taskName": "TaskX",
      "taskTime": "2018-08-09T13:38:58.317Z",
      "crawledList": [ "http://dbpedia.org/ontology/birthYear" ]
    },
    {
      "taskName": "TaskX",
      "taskTime": "2018-08-10T06:19:32.006Z",
      "crawledList": [
        "http://dbpedia.org/ontology/birthYear",
        "http://dbpedia.org/page/Mo_Chua_of_Balla"
      ]
    },
    {
      "taskName": "TaskY",
      "taskTime": "2018-08-10T06:21:58.737Z",
      "crawledList": [ "http://dbpedia.org/page/Mo_Chua_of_Balla" ]
    }
  ]
}

I want to put a "newURI" into a task's crawledList if it does not already exist there. Here is the process:

  • Find the process document with _id = "someProcessName"
  • Find the task document, in the tasks array, with taskName = "someTaskName" and taskTime = "someTaskTime"
  • Check if the "newURI" exists in the crawledList of that task document
  • If it does not exist, insert the newURI into the crawledList of the task document

I don't want to load documents into memory and manipulate them with plain Java collections (Lists, etc.). Can you help me write the most efficient code using the MongoDB Java driver's commands?

I don't have any indexes defined because I don't know which indexes I should define. I can also change the document structure if there is a better way to represent them and do this operation faster.

Thank you in advance.

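1 answer:

Answer 0 (score: 0)

Finally, after reading the Java driver documentation and searching the web, I managed to implement the following two functions: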

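// Static imports from the MongoDB Java driver's filter and update builders.
import static com.mongodb.client.model.Filters.and;
import static com.mongodb.client.model.Filters.elemMatch;
import static com.mongodb.client.model.Filters.eq;
import static com.mongodb.client.model.Filters.in;
import static com.mongodb.client.model.Updates.push;

// Returns true if the current task's crawledList already contains the IRI.
public boolean crawledBefore(IRI iri) {
    return collection.countDocuments(
            and(eq("_id", CrawlProcess.getProcessName()),
                elemMatch("tasks", and(eq("taskName", CrawlProcess.getTaskName()),
                                       eq("taskTime", CrawlProcess.getCreationTime()),
                                       in("crawledList", iri.toString()))))) != 0;
}

// Appends the IRI to the current task's crawledList unless it is already there.
// The check and the update are two separate round trips, so they are not atomic.
public void addToStore(IRI iri) {
    if (!crawledBefore(iri)) {
        collection.updateOne(
                and(eq("_id", CrawlProcess.getProcessName()),
                    elemMatch("tasks", and(eq("taskName", CrawlProcess.getTaskName()),
                                           eq("taskTime", CrawlProcess.getCreationTime())))),
                // "$" refers to the tasks element matched by the filter above.
                push("tasks.$.crawledList", iri.toString()));
    }
}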

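Here is how it works:

The crawledBefore() function takes an IRI and checks whether any document exists that has this IRI in the crawledList array of the task document, which is an embedded document inside the process document. A process document with the given process name, task name, and time always exists in my collection; all I check here is whether the IRI exists in that document.

If it does not, the second function adds the new IRI to the crawledList of that particular task document in the process document.

Cheers.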
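As a possible alternative (not part of the answer above), the check and the insert can be folded into one update by using MongoDB's $addToSet operator, which appends a value only when the array does not already contain it. The sketch below builds the filter and update documents as plain nested maps so their shape is easy to see. The class name CrawlStoreUpdate and both method names are hypothetical, and taskTime is assumed to be stored as a string, as in the sample document from the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds the two documents for a single-round-trip update.
final class CrawlStoreUpdate {

    // Filter selecting the process document and, via $elemMatch, the one task
    // whose name and time both match:
    // { _id: <process>, tasks: { $elemMatch: { taskName: <name>, taskTime: <time> } } }
    static Map<String, Object> taskFilter(String processName, String taskName, String taskTime) {
        Map<String, Object> task = new LinkedHashMap<>();
        task.put("taskName", taskName);
        task.put("taskTime", taskTime); // assumed to be a string, as in the sample data

        Map<String, Object> elemMatch = new LinkedHashMap<>();
        elemMatch.put("$elemMatch", task);

        Map<String, Object> filter = new LinkedHashMap<>();
        filter.put("_id", processName);
        filter.put("tasks", elemMatch);
        return filter;
    }

    // Update adding the URI to the matched task's crawledList only if absent:
    // { $addToSet: { "tasks.$.crawledList": <uri> } }
    static Map<String, Object> addUriUpdate(String uri) {
        Map<String, Object> target = new LinkedHashMap<>();
        // "$" is the positional operator: the tasks element matched by the filter.
        target.put("tasks.$.crawledList", uri);

        Map<String, Object> update = new LinkedHashMap<>();
        update.put("$addToSet", target);
        return update;
    }
}
```

With the driver's builders, the same pair should correspond to collection.updateOne(and(eq("_id", processName), elemMatch("tasks", and(eq("taskName", taskName), eq("taskTime", taskTime)))), addToSet("tasks.$.crawledList", uri)), where addToSet comes from com.mongodb.client.model.Updates. Because an update to a single document is atomic, this avoids the gap between crawledBefore() and updateOne() in which two concurrent writers could both push the same URI. The filter starts from _id, which MongoDB indexes automatically, so this lookup should not need an additional index.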