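MongoDB aggregation query: how to split array elements

Date: 2018-02-06 18:17:54

Tags: python arrays mongodb mongodb-query aggregation-framework

I have a few collections in my MongoDB database, and I want to create a new collection by comparing collection 1 and collection 2 using pymongo.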

Collection 1:
Object id      timestamp                             Prof_Name   SUBJECT
abc67478898k   ISODate("2018-01-03T09:26:37.541Z")   ABDC        "sub1, sub2, sub3"
jjjjjjjjjj     ISODate("2018-01-03T09:26:37.541Z")   XYZ         "sub2, sub4, sub8"

Collection 2:
Object id   timestamp                 UUID   SUBJECT_ID          rating   score
3333333     ISODate("2018-01-03TZ")   7897   "sub1,sub4, sub7"   7        10
444444      ISODate("2018-01-03TZ")   4532   "sub2"              4        6
777777      ISODate("2018-01-03TZ")   7876   "sub1,sub2,sub3"    8        8
1111111     ISODate("2018-01-03TZ")   654    "sub1,sub3"         7        8

The JSON is as follows:

data1 :
{ "_id" : ObjectId("7a563a5a5560fd08da86dc44"), "Prof_Name" : "Jack", "timestamp" : ISODate("2018-01-10T16:08:26.613Z"), "SUBJECT" : ["Maths", "Chemistry", "Machinery1", "Ele1"] }
{ "_id" : ObjectId("7a563a5a5560fd08da86dc45"), "Prof_Name" : "Mac", "timestamp" : ISODate("2018-01-10T16:08:26.613Z"), "SUBJECT" : ["Chemistry", "CS", "German"] }
{ "_id" : ObjectId("7a563a5a5560fd08da86dc46"), "Prof_Name" : "Bill", "timestamp" : ISODate("2018-01-10T16:08:26.613Z"), "SUBJECT" : ["German"] }

data2 :
{ "_id" : ObjectId("7a563a5a5560fd08da86dc46"), "Rating" : 6, "UUID" : 8123, "timestamp" : ISODate("2018-01-10T16:08:26.613Z"), "SUBJECT_ID" : "Maths", "ID" : "OI-123" }
{ "_id" : ObjectId("7a563a5a5560fd08da86dc47"), "Rating" : 7, "UUID" : 8123, "timestamp" : ISODate("2018-01-10T16:08:26.613Z"), "SUBJECT_ID" : "Machinery1, Maths, French, German", "ID" : "OI-98" }

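I am trying to generate a third collection in which, for each subject of a Prof_Name, the matching subjects are looked up in collection 2 within a certain timestamp range, together with the UUID and UUID_count. My mongo query is as follows: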

db.data1.aggregate([
  {"$lookup":{
    "from":"data2",
    "let":{"subject":{"$split":["$SUBJECT",", "]}},
    "pipeline":[
      {"$match": {"$expr":{"$and":[{"$eq":[{"$year":"$timestamp"}, 2016]}, {"$eq":[{"$month":"$timestamp"}, 1]}]}}},
      {"$addFields":{"SUBJECT_ID":{"$split":["$SUBJECT_ID",", "]},"SUBJECT":"$$subject"}},
      {"$unwind":"$SUBJECT"},
      {"$match":{"$expr":{"$in":["$SUBJECT","$SUBJECT_ID"]}}},
      {"$facet":{
        "UUID":[{"$group":{"_id":{"id":"$_id","UUID":"$UUID"}}},{"$count":"UUID_Count"}],
        "REST":[
          {"$group":{"_id":null,"subjects_list":{"$addToSet":"$SUBJECT"},"UUID_distinct_list":{"$addToSet":"$UUID"}}},
          {"$addFields":{"subject_count":{"$size":"$subjects_list"},"UUID_distinct_count":{"$size":"$UUID_distinct_list"}}},
          {"$project":{"_id":0}}
         ]
      }},
      {"$replaceRoot":{"newRoot":{"$mergeObjects":[{"$arrayElemAt":["$UUID",0]},{"$arrayElemAt":["$REST",0]}]}}}
    ],
    "as":"ref_data"
  }},
  {"$unwind":{"path":"$ref_data","preserveNullAndEmptyArrays":true}},
  {"$addFields":{"ref_data.Prof_Name":"$Prof_Name"}},
  {"$replaceRoot":{"newRoot":"$ref_data"}},
  {"$out":"data3"}
])
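
To make the intent of the pipeline easier to follow, here is a rough plain-Python emulation of what the `$lookup` sub-pipeline computes for each professor. It is only a sketch under stated assumptions: `summarize` is a hypothetical helper, the documents are trimmed copies of the samples from the question, the year is set to 2018 so that the samples match (the query filters on 2016), and a professor with no matches gets zero counts here, whereas the real pipeline would leave those fields out.

```python
from datetime import datetime

# Trimmed copies of the sample documents from the question (ObjectIds omitted).
data1 = [
    {"Prof_Name": "Jack", "SUBJECT": ["Maths", "Chemistry", "Machinery1", "Ele1"]},
    {"Prof_Name": "Mac", "SUBJECT": ["Chemistry", "CS", "German"]},
    {"Prof_Name": "Bill", "SUBJECT": ["German"]},
]
data2 = [
    {"UUID": 8123, "timestamp": datetime(2018, 1, 10, 16, 8, 26),
     "SUBJECT_ID": "Maths"},
    {"UUID": 8123, "timestamp": datetime(2018, 1, 10, 16, 8, 26),
     "SUBJECT_ID": "Machinery1, Maths, French, German"},
]


def summarize(prof, ratings, year, month):
    """Hypothetical helper: emulate the $lookup sub-pipeline for one professor."""
    matched_subjects, uuids, matched_docs = set(), set(), 0
    for doc in ratings:
        ts = doc["timestamp"]
        # Equivalent of the first $match on $year / $month.
        if (ts.year, ts.month) != (year, month):
            continue
        # Equivalent of $split on SUBJECT_ID followed by $unwind + $in.
        ids = doc["SUBJECT_ID"].split(", ")
        hits = [s for s in prof["SUBJECT"] if s in ids]
        if hits:
            matched_docs += 1              # one (_id, UUID) group per matched document
            matched_subjects.update(hits)  # $addToSet on SUBJECT
            uuids.add(doc["UUID"])         # $addToSet on UUID
    return {
        "Prof_Name": prof["Prof_Name"],
        "UUID_Count": matched_docs,
        "subjects_list": sorted(matched_subjects),
        "subject_count": len(matched_subjects),
        "UUID_distinct_list": sorted(uuids),
        "UUID_distinct_count": len(uuids),
    }


# Year 2018 is an assumption chosen to match the sample data.
results = [summarize(prof, data2, 2018, 1) for prof in data1]
for row in results:
    print(row)
```

Note that `UUID_Count` counts matched documents in data2, while `UUID_distinct_count` counts distinct UUID values, which is why the two can differ.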

The above query works fine if SUBJECT is a string:

SUBJECT
"sub1, sub2, sub3"
"sub2, sub4, sub8"

My question is: how should the query be changed if the SUBJECT column is an array of elements? For example:

subjects1
["sub1", "sub2", "sub3"]
["sub2", "sub4", "sub8"]

If I run the same query, I get an error along the lines of an array being found where a string was expected.
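
The error is consistent with `$split` accepting only a string as its first argument. One possible way around it, shown here as a minimal pymongo-style sketch rather than a verified fix, is to split only when the field is not already an array, using `$cond` with `$isArray`. The helper name `as_subject_array` is hypothetical, only the stages that touch the subject fields are shown, and the fragment has not been run against a live server.

```python
def as_subject_array(field):
    """Hypothetical helper: build an aggregation expression that yields an array.

    Arrays pass through unchanged; strings are split on ", ".
    """
    return {
        "$cond": [
            {"$isArray": [field]},
            field,
            {"$split": [field, ", "]},
        ]
    }


# Only the parts of the $lookup stage that touch the two subject fields;
# the timestamp $match, $facet and $replaceRoot stages are left out.
lookup_stage = {
    "$lookup": {
        "from": "data2",
        "let": {"subject": as_subject_array("$SUBJECT")},
        "pipeline": [
            {"$addFields": {
                "SUBJECT_ID": as_subject_array("$SUBJECT_ID"),
                "SUBJECT": "$$subject",
            }},
            {"$unwind": "$SUBJECT"},
            {"$match": {"$expr": {"$in": ["$SUBJECT", "$SUBJECT_ID"]}}},
        ],
        "as": "ref_data",
    }
}
```

With pymongo, a stage like this would be passed as part of the list given to `aggregate`, for example `db.data1.aggregate([lookup_stage])`, where `db` stands for an assumed database handle. If SUBJECT is always an array, the `let` binding could simply be `"$SUBJECT"` with no `$split` at all.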

1 answer:

Answer 0 (score: 0)

I think you want to use

> new Array("sub1, sub2, sub3")

["sub1, sub2, sub3"]
