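I have a few collections in my MongoDB database, and I want to use pymongo to create a new collection by comparing collection 1 with collection 2.
The third collection should contain, for each Prof_Name, its subjects, the UUIDs of the collection 2 documents with matching subjects between specific timestamps, and the UUID count.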
Collection 1:
Object id     timestamp                            Prof_Name  subjects1
abc67478898k  ISODate("2018-01-03T09:26:37.541Z")  ABDC       "sub1, sub2, sub3"
jjjjjjjjjj    ISODate("2018-01-03T09:26:37.541Z")  XYZ        "sub2, sub4, sub8"
Collection 2:
Object id  timestamp                UUID  subjects2          rating  score
3333333    ISODate("2018-01-03TZ")  7897  "sub1,sub4, sub7"  7       10
444444     ISODate("2018-01-03TZ")  4532  "sub2"             4       6
777777     ISODate("2018-01-03TZ")  7876  "sub1,sub2,sub3"   8       8
1111111    ISODate("2018-01-03TZ")  654   "sub1,sub3"        7       8
Collection 3:
objectid  Prof_name  subjects_list   UUID_list           UUID-count  subject_count
12        ABDC       sub1,sub2,sub3  7897,4532,7876,654  4           3
34        XYZ        sub2,sub4,sub8  7897,4532,7876      2           3
Answer 0 (score: 1)
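You can try the following aggregation in MongoDB 3.6.
The pipeline uses $split to turn the subjects1 string into an array of string values and $unwind to emit one document per subject. A $lookup with a sub-pipeline then filters the collection_2 documents down to those whose subjects2 contains that subject and outputs their UUID.
A $group on Prof_Name uses $addToSet to collect the distinct subjects1 and UUID values, followed by $size to count the subjects1 and UUID entries.
db.collection_1.aggregate([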
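  // Turn the comma-separated subjects1 string into an array
  {"$addFields":{"subjects1":{"$split":["$subjects1",", "]}}},
  // One document per (professor, subject) pair
  {"$unwind":"$subjects1"},
  // Find collection_2 documents whose subjects2 contains the subject
  {"$lookup":{
    "from":"collection_2",
    "let":{"subjects1":"$subjects1"},
    "pipeline":[
      {"$addFields":{"subjects2":{"$split":["$subjects2",","]}}},
      {"$match":{"$expr":{"$in":["$$subjects1","$subjects2"]}}},
      {"$project":{"UUID":1,"_id":0}}
    ],
    "as":"ref_data"}},
  // Keep subjects that matched nothing so they still count as subjects
  {"$unwind":{"path":"$ref_data","preserveNullAndEmptyArrays":true}},
  // Collect distinct subjects and UUIDs per professor
  {"$group":{
    "_id":"$Prof_Name",
    "subjects_list":{"$addToSet":"$subjects1"},
    "UUID_list":{"$addToSet":"$ref_data.UUID"}}},
  // Add the name and the two counts
  {"$addFields":{
    "Prof_name":"$_id",
    "UUID_count":{"$size":"$UUID_list"},
    "subject_count":{"$size":"$subjects_list"}}},
  // Drop the group key so $out generates a fresh _id
  {"$project":{"_id":0}},
  {"$out":"collection_3"}
])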
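Since the question asks about pymongo, the same stages can be written as a list of Python dicts. This is a minimal sketch rather than a tested deployment: build_pipeline and run are hypothetical helper names, and db is assumed to be a pymongo Database handle. The only change from the shell version is that the JavaScript literal true becomes Python's True.

```python
def build_pipeline(lookup_collection="collection_2", out_collection="collection_3"):
    """Return the aggregation stages as plain dicts (hypothetical helper)."""
    return [
        {"$addFields": {"subjects1": {"$split": ["$subjects1", ", "]}}},
        {"$unwind": "$subjects1"},
        {"$lookup": {
            "from": lookup_collection,
            "let": {"subjects1": "$subjects1"},
            "pipeline": [
                {"$addFields": {"subjects2": {"$split": ["$subjects2", ","]}}},
                {"$match": {"$expr": {"$in": ["$$subjects1", "$subjects2"]}}},
                {"$project": {"UUID": 1, "_id": 0}},
            ],
            "as": "ref_data",
        }},
        {"$unwind": {"path": "$ref_data", "preserveNullAndEmptyArrays": True}},
        {"$group": {
            "_id": "$Prof_Name",
            "subjects_list": {"$addToSet": "$subjects1"},
            "UUID_list": {"$addToSet": "$ref_data.UUID"},
        }},
        {"$addFields": {
            "Prof_name": "$_id",
            "UUID_count": {"$size": "$UUID_list"},
            "subject_count": {"$size": "$subjects_list"},
        }},
        {"$project": {"_id": 0}},
        {"$out": out_collection},
    ]


def run(db):
    """Run the pipeline; db is assumed to be a pymongo Database object."""
    # With $out as the last stage the returned cursor is empty, so the
    # result is read back from the output collection afterwards.
    db.collection_1.aggregate(build_pipeline())
    return list(db.collection_3.find())
```

With a live server this would be called as run(MongoClient()["mydb"]), where the database name mydb is a placeholder.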
The final $out stage writes the result to a new collection, collection_3, replacing it if it already exists.
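To cross-check what ends up in collection_3 for the sample data, the stages can be emulated in plain Python without a server. This is only an illustrative sketch: emulate is a hypothetical name, the UUIDs are assumed to be integers, the timestamp, rating, and score fields are left out, and the lists are sorted for readability, whereas $addToSet does not guarantee any order.

```python
from collections import defaultdict

collection_1 = [
    {"Prof_Name": "ABDC", "subjects1": "sub1, sub2, sub3"},
    {"Prof_Name": "XYZ", "subjects1": "sub2, sub4, sub8"},
]
collection_2 = [
    {"UUID": 7897, "subjects2": "sub1,sub4, sub7"},
    {"UUID": 4532, "subjects2": "sub2"},
    {"UUID": 7876, "subjects2": "sub1,sub2,sub3"},
    {"UUID": 654, "subjects2": "sub1,sub3"},
]


def emulate(coll_1, coll_2):
    """Mimic $split, $unwind, $lookup, $group and $size in memory."""
    subjects = defaultdict(set)
    uuids = defaultdict(set)
    for doc in coll_1:
        name = doc["Prof_Name"]
        uuids[name]  # a professor with no matches still gets an empty list
        for subject in doc["subjects1"].split(", "):      # $split + $unwind
            subjects[name].add(subject)                   # $addToSet
            for ref in coll_2:                            # $lookup
                if subject in ref["subjects2"].split(","):
                    uuids[name].add(ref["UUID"])          # $addToSet
    return [
        {
            "Prof_name": name,
            "subjects_list": sorted(subjects[name]),
            "UUID_list": sorted(uuids[name]),
            "UUID_count": len(uuids[name]),               # $size
            "subject_count": len(subjects[name]),         # $size
        }
        for name in subjects
    ]


for row in emulate(collection_1, collection_2):
    print(row)
```

For ABDC this gives four UUIDs and three subjects, as in the question. For XYZ it gives the three UUIDs 4532, 7876 and 7897, which agrees with the UUID_list in the question, so the UUID-count of 2 shown there may be a typo. Note also that splitting subjects2 on "," leaves a leading space in " sub7", so a value written that way would not match "sub7".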