I have a collection of documents with the following structure:
id: ObjectId
name: String
placeSeen: String
dateTimeSeen: Date
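I need to find pairs of documents with a matching name, where each pair represents a "trip". The goal is to see how long it takes to travel from one point to another. A person can go from any place to any other place they want.
E.g. (using the sample data below): I need results that make it easy to derive information such as: "John went from A1 to B1 and it took him 2 minutes. John went from B1 to C1 and it took him 2 minutes. John went from C1 to A1 and it took him 3 minutes."
At the moment I am thinking of doing this by iterating over the whole collection: for each document's name field, I could search for the first document with a matching name, taking its placeSeen, with the results sorted by dateTimeSeen in ascending order. It would sort of work, but it does not seem very efficient: there are a lot of rows to iterate over.
What would be a better approach?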
Sample data:
{
"_id" : ObjectId("56e933a186983c6f2978e8a1"),
"name" : "John",
"placeSeen" : "A1",
"dateTimeSeen" : ISODate("2016-03-16T10:25:41.000+0000")
}
{
"_id" : ObjectId("56e9354486983c6f2978e8a9"),
"name" : "John",
"placeSeen" : "B1",
"dateTimeSeen" : ISODate("2016-03-16T10:27:41.000+0000")
}
{
"_id" : ObjectId("56e9355186983c6f2978e8ab"),
"name" : "John",
"placeSeen" : "C1",
"dateTimeSeen" : ISODate("2016-03-16T10:29:41.000+0000")
}
{
"_id" : ObjectId("56e9355186983c6f2978e8ac"),
"name" : "John",
"placeSeen" : "A1",
"dateTimeSeen" : ISODate("2016-03-16T10:32:41.000+0000")
}
{
"_id" : ObjectId("56e9358186983c6f2978e8ad"),
"name" : "Sue",
"placeSeen" : "B1",
"dateTimeSeen" : ISODate("2016-03-16T10:21:41.000+0000")
}
{
"_id" : ObjectId("56e9358c86983c6f2978e8af"),
"name" : "Sue",
"placeSeen" : "A1",
"dateTimeSeen" : ISODate("2016-03-16T10:24:41.000+0000")
}
{
"_id" : ObjectId("56e9359686983c6f2978e8b1"),
"name" : "Sue",
"placeSeen" : "C1",
"dateTimeSeen" : ISODate("2016-03-16T10:29:41.000+0000")
}
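Answer 0 (score: 1)
You can do this with aggregation. The key is figuring out how to pair up the dates/places; grouping by person is the easy part.
I used your sample data but added one more data point for "Sue" at a place she had already visited. This shows that repeat visits work, as long as the times are checked correctly.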
db.went.find({},{_id:0})
{ "name" : "John", "placeSeen" : "A1", "dateTimeSeen" : ISODate("2016-03-16T10:25:41Z") }
{ "name" : "John", "placeSeen" : "B1", "dateTimeSeen" : ISODate("2016-03-16T10:27:41Z") }
{ "name" : "John", "placeSeen" : "C1", "dateTimeSeen" : ISODate("2016-03-16T10:29:41Z") }
{ "name" : "Sue", "placeSeen" : "B1", "dateTimeSeen" : ISODate("2016-03-16T10:21:41Z") }
{ "name" : "Sue", "placeSeen" : "A1", "dateTimeSeen" : ISODate("2016-03-16T10:24:41Z") }
{ "name" : "Sue", "placeSeen" : "C1", "dateTimeSeen" : ISODate("2016-03-16T10:29:41Z") }
{ "name" : "Sue", "placeSeen" : "B1", "dateTimeSeen" : ISODate("2016-03-16T10:35:00Z") }
{ "name" : "John", "placeSeen" : "A1", "dateTimeSeen" : ISODate("2016-03-16T10:32:41Z") }
Here is the aggregation:
db.went.aggregate( [
/* we want time to be sorted for each person in the next step */
{$sort:{name:1,dateTimeSeen:1}},
/* group each person's places and times into a single document */
{$group:{ _id:"$name", places:{$push:{place:"$placeSeen",time:"$dateTimeSeen"}}}},
/* this duplicates the "places" arrays into identical field "trips" */
{$project:{trips:"$places",places:1}},
/* unwind one of the arrays */
{$unwind:"$places"},
/* $filter keeps only elements of "trips" that are at a different place and
* "later" than "places", then we only want the first of the remaining ones */
{$project:{ "places":1,
"to": {$arrayElemAt:[
{$filter: {
input:"$trips",
as:"trip",
cond:{$and:[
{$ne:["$places.place","$$trip.place"]},
{$lt:["$places.time","$$trip.time"]}
]}
}},
0
]}
}},
/* if "to" is null then it's the last point (no destination, remove) */
{$match:{to:{$ne:null}}},
/* format the "trip" output and calculate duration */
{$project:{ _id:0,
name:"$_id",
trip:{$concat:["$places.place","-","$to.place"]},
durationSeconds:{$divide:[{$subtract:["$to.time","$places.time"]},1000]}
}}
] )
Output:
{ "name" : "Sue", "trip" : "B1-A1", "durationSeconds" : 180 }
{ "name" : "Sue", "trip" : "A1-C1", "durationSeconds" : 300 }
{ "name" : "Sue", "trip" : "C1-B1", "durationSeconds" : 319 }
{ "name" : "John", "trip" : "A1-B1", "durationSeconds" : 120 }
{ "name" : "John", "trip" : "B1-C1", "durationSeconds" : 120 }
{ "name" : "John", "trip" : "C1-A1", "durationSeconds" : 180 }
You must be on version 3.2.x or later: I am using several aggregation expressions that were introduced in 3.2.0.
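If it helps to check the pairing rule without a MongoDB server, the same logic can be sketched in plain JavaScript. This is only an illustration under stated assumptions: `findTrips` is a hypothetical helper name (not a MongoDB API), the `sample` array mirrors the eight documents used in this answer, and everything runs in memory, so it is not a substitute for the aggregation on a large collection.

```javascript
// Hypothetical helper (not part of MongoDB): mirrors the pipeline's pairing rule in memory.
function findTrips(sightings) {
  // Group sightings by person, like the $group stage.
  const byName = new Map();
  for (const s of sightings) {
    if (!byName.has(s.name)) byName.set(s.name, []);
    byName.get(s.name).push(s);
  }

  const trips = [];
  for (const [name, visits] of byName) {
    // Order each person's sightings by time, like the $sort stage.
    visits.sort((a, b) => a.dateTimeSeen - b.dateTimeSeen);
    for (const from of visits) {
      // Same condition as the $filter stage: a different place, seen strictly later.
      // find() returns the first match, which plays the role of $arrayElemAt 0.
      const to = visits.find(
        (v) => v.placeSeen !== from.placeSeen && v.dateTimeSeen > from.dateTimeSeen
      );
      if (!to) continue; // last sighting for this person: no destination
      trips.push({
        name,
        trip: `${from.placeSeen}-${to.placeSeen}`,
        // Subtracting Date objects yields milliseconds; convert to seconds.
        durationSeconds: (to.dateTimeSeen - from.dateTimeSeen) / 1000,
      });
    }
  }
  return trips;
}

// Assumed in-memory copy of the sample data, including the extra "Sue" sighting.
const sample = [
  { name: "John", placeSeen: "A1", dateTimeSeen: new Date("2016-03-16T10:25:41Z") },
  { name: "John", placeSeen: "B1", dateTimeSeen: new Date("2016-03-16T10:27:41Z") },
  { name: "John", placeSeen: "C1", dateTimeSeen: new Date("2016-03-16T10:29:41Z") },
  { name: "Sue", placeSeen: "B1", dateTimeSeen: new Date("2016-03-16T10:21:41Z") },
  { name: "Sue", placeSeen: "A1", dateTimeSeen: new Date("2016-03-16T10:24:41Z") },
  { name: "Sue", placeSeen: "C1", dateTimeSeen: new Date("2016-03-16T10:29:41Z") },
  { name: "Sue", placeSeen: "B1", dateTimeSeen: new Date("2016-03-16T10:35:00Z") },
  { name: "John", placeSeen: "A1", dateTimeSeen: new Date("2016-03-16T10:32:41Z") },
];

console.log(findTrips(sample));
```

The sketch yields the same six trips and durations as the aggregation output, though the order of the people may differ, since `$group` does not guarantee an output order.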