想法是将 N 个字典与单个标准字典进行比较,其中每个键值对比较具有不同的条件规则。
例如,
标准字典-
{'ram': 16,
'storage': [512, 1, 2],
'manufacturers': ['Dell', 'Apple', 'Asus', 'Alienware'],
'year': 2018,
'drives': ['A', 'B', 'C', 'D', 'E']
}
字典列表 -
{'ram': 8,
'storage': 1,
'manufacturers': 'Apple',
'year': 2018,
'drives': ['C', 'D', 'E']
},
{'ram': 16,
'storage': 4,
'manufacturers': 'Asus',
'year': 2021,
'drives': ['F', 'G','H']
},
{'ram': 4,
'storage': 2,
'manufacturers': 'ACER',
'year': 2016,
'drives': ['F', 'G', 'H']
}
条件-
因此,预期的输出是显示所有具有不匹配值的不匹配值和匹配值的 none/null。
预期输出 -
{'ram': 8,
'storage': 1,
'manufacturers': None,
'year': None,
'drives': ['C', 'D', 'E']
},
{'ram': None,
'storage': None,
'manufacturers': None,
'year': None,
'drives': ['F','G','H']
},
{'ram': 4,
'storage': 2,
'manufacturers': 'ACER',
'year': 2016,
'drives': None
}
在使用 MongoDB 时,我遇到了这个问题,即应该将数据集合中的每个文档与标准集合进行比较。任何 MongoDB 直接查询也会非常有帮助。
答案 0 :(得分:0)
要达到使用 MongoDB 聚合的条件,请使用以下查询:
db.collection.aggregate([
{
"$project": {
"ram": {
"$cond": {
"if": {
"$gt": [
"$ram",
8
]
},
"then": null,
"else": "$ram",
}
},
"storage": {
"$cond": {
"if": {
"$and": [
{
"$gte": [
"$ram",
8
]
},
{
"$gte": [
"$storage",
2
]
},
],
},
"then": null,
"else": "$storage",
}
},
"manufacturers": {
"$cond": {
"if": {
"$in": [
"$manufacturers",
[
"Dell",
"Apple",
"Asus",
"Alienware"
],
]
},
"then": null,
"else": "$manufacturers",
}
},
"year": {
"$cond": {
"if": {
"$gte": [
"$year",
2018
]
},
"then": null,
"else": "$year",
}
},
"drives": {
"$cond": {
"if": {
"$gt": [
"$year",
2018
]
},
"then": {
"$setIntersection": [
"$drives",
[
"A",
"B",
"C",
"D",
"E"
]
]
},
"else": "$drives",
}
},
}
}
])
Mongo Playground Sample Execution
您可以将其与 Python 中的 for 循环结合使用
for std_doc in std_col.find({}, {
"ram": 1,
"storage": 1,
"manufacturers": 1,
"year": 1,
"drives": 1,
}):
print(list(list_col.aggregate([
{
"$project": {
"ram": {
"$cond": {
"if": {
"$gt": [
"$ram",
8
]
},
"then": None,
"else": "$ram",
}
},
"storage": {
"$cond": {
"if": {
"$and": [
{
"$gte": [
"$ram",
8
]
},
{
"$gte": [
"$storage",
2
]
},
],
},
"then": None,
"else": "$storage",
}
},
"manufacturers": {
"$cond": {
"if": {
"$in": [
"$manufacturers",
[
"Dell",
"Apple",
"Asus",
"Alienware"
],
]
},
"then": None,
"else": "$manufacturers",
}
},
"year": {
"$cond": {
"if": {
"$gte": [
"$year",
2018
]
},
"then": None,
"else": "$year",
}
},
"drives": {
"$cond": {
"if": {
"$gt": [
"$year",
2018
]
},
"then": {
"$setIntersection": [
"$drives",
[
"A",
"B",
"C",
"D",
"E"
]
]
},
"else": "$drives",
}
},
}
}
])))
最优化的解决方案是执行查找,但这取决于您的要求:
db.std_col.aggregate([
{
"$lookup": {
"from": "dict_col",
"let": {
"cmpRam": "$ram",
"cmpStorage": "$storage",
"cmpManufacturers": "$manufacturers",
"cmpYear": "$year",
"cmpDrives": "$drives",
},
"pipeline": [
{
"$project": {
"ram": {
"$cond": {
"if": {
"$gt": [
"$ram",
"$$cmpRam",
]
},
"then": null,
"else": "$ram",
}
},
"storage": {
"$cond": {
"if": {
"$and": [
{
"$gte": [
"$ram",
"$$cmpRam"
]
},
{
"$gte": [
"$storage",
"$$cmpStorage"
]
},
],
},
"then": null,
"else": "$storage",
}
},
"manufacturers": {
"$cond": {
"if": {
"$in": [
"$manufacturers",
"$$cmpManufacturers",
]
},
"then": null,
"else": "$manufacturers",
}
},
"year": {
"$cond": {
"if": {
"$gte": [
"$year",
"$$cmpYear",
]
},
"then": null,
"else": "$year",
}
},
"drives": {
"$cond": {
"if": {
"$gt": [
"$year",
"$$cmpYear"
]
},
"then": {
"$setIntersection": [
"$drives",
"$$cmpDrives"
]
},
"else": "$drives",
}
},
}
},
],
"as": "inventory_docs"
}
}
])