I have a JSON document imported into MongoDB that looks similar to the following test data:
[
  {
    "subject_id": "1",
    "name": "Bob",
    "dob": "12/31/00",
    "gender": "Male",
    "visits": {
      "12/31/15": {
        "age": "17",
        "visit_category": "Baseline Visit"
      },
      "12/31/16": {
        "age": "18",
        "visit_category": "Follow Up Visit"
      },
      "12/31/17": {
        "age": "18",
        "visit_category": "Follow Up Visit"
      }
    },
    "samples": {
      "XXX123": {
        "completed_by": "Sally",
        "label_on_sample": "1"
      }
    }
  },
  {
    "subject_id": "2",
    "name": null,
    "dob": "1/1/01",
    "gender": "Female",
    "visits": {
      "1/1/11": {
        "age": "10",
        "visit_category": "Baseline Visit"
      },
      "1/1/12": {
        "age": "11",
        "visit_category": "Follow Up Visit"
      },
      "1/1/13": {
        "age": "12",
        "visit_category": "Follow Up Visit"
      },
      "1/1/14": {
        "age": "13",
        "visit_category": "Follow Up Visit"
      },
      "1/1/15": {
        "age": "14",
        "visit_category": "Follow Up Visit"
      }
    },
    "samples": {
      "YYY456": {
        "completed_by": null,
        "label_on_sample": "2"
      },
      "ZZZ789": {
        "completed_by": "Sally",
        "label_on_sample": "2"
      }
    }
  }
]
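I want to query information inside the visits (keyed by date) or the samples, but I think the variable key names are what is confusing me. What is the best way to query across all of the subdocuments?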
filter_by = {'subject.samples': {'$elemMatch': {'visit_category': "Follow Up Visit" }}}
data = db['subject'].find(filter_by)
print(data.count())
This returns 0. How can I put some kind of wildcard after 'subject.samples' to make this work?
Thanks.
Answer 0 (score: 1)
First, you may want to correct the document structure so that the visits key holds an array of visits.
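As a rough illustration of that restructuring, here is a minimal sketch in plain Python. The helper name visits_to_array and the "date" field name are my own assumptions for illustration, not anything from the question or from MongoDB; the idea is only to move each date key into the visit itself before the documents are inserted.

```python
from copy import deepcopy


def visits_to_array(doc, key="visits", label="date"):
    """Return a copy of doc where doc[key] is a list instead of a mapping.

    Hypothetical helper: each former key (the visit date) is stored
    inside its entry under `label`.
    """
    out = deepcopy(doc)
    mapping = out.get(key) or {}
    out[key] = [{label: k, **v} for k, v in mapping.items()]
    return out


# Trimmed-down version of subject 1 from the question.
subject = {
    "subject_id": "1",
    "visits": {
        "12/31/15": {"age": "17", "visit_category": "Baseline Visit"},
        "12/31/16": {"age": "18", "visit_category": "Follow Up Visit"},
    },
}

reshaped = visits_to_array(subject)

# A filter of this shape should match once documents are stored with
# `visits` as an array (not executed here, it needs a live database).
filter_by = {"visits": {"$elemMatch": {"visit_category": "Follow Up Visit"}}}
```

With the data stored that way, no wildcard is needed, because the field path no longer contains a variable key. The same reshaping could be applied to samples.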
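Mongo does let you run a pipeline query that converts an object to an array, but I don't think it will scale easily to large collections unless you also consider other ways to optimize the search.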
For now, here is a query for the total number of visits that match "Follow Up Visit":
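# `db` is the pymongo database handle from the question. The collection is
# called `patient` here; the question uses db['subject'], so adjust as needed.
pipeline = [
    # Turn the date-keyed `visits` object into an array of {k: date, v: visit} pairs.
    {
        '$project': {
            'visits': {'$objectToArray': '$visits'}
        }
    },
    # Emit one document per visit.
    {
        '$unwind': '$visits'
    },
    # Keep only the follow-up visits.
    {
        '$match': {
            'visits.v.visit_category': 'Follow Up Visit'
        }
    },
    # Count the documents that are left.
    {
        '$count': 'count'
    }
]
cur = db.patient.aggregate(pipeline)
# $count emits no document at all when nothing matches, so fall back to 0.
result = next(cur, None)
print(result['count'] if result else 0)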
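To make it clearer why the match is on the v part of each pair, here is a small plain-Python sketch that mimics the stages on an abbreviated copy of the test data. It does not talk to MongoDB; object_to_array and count_matching are hypothetical helpers written only to mirror what $objectToArray, $unwind, $match and $count do here.

```python
def object_to_array(obj):
    """Mimic $objectToArray: {"a": 1} becomes [{"k": "a", "v": 1}]."""
    return [{"k": k, "v": v} for k, v in obj.items()]


def count_matching(docs, field, key, value):
    """Count subdocuments under `field` whose `key` equals `value`."""
    total = 0
    for doc in docs:
        # One pair per variable key, like $objectToArray followed by $unwind.
        for pair in object_to_array(doc.get(field) or {}):
            # Same test as matching on '<field>.v.<key>' in the pipeline.
            if pair["v"].get(key) == value:
                total += 1
    return total


# Visits from the question's test data, reduced to the field being matched.
docs = [
    {"subject_id": "1", "visits": {
        "12/31/15": {"visit_category": "Baseline Visit"},
        "12/31/16": {"visit_category": "Follow Up Visit"},
        "12/31/17": {"visit_category": "Follow Up Visit"},
    }},
    {"subject_id": "2", "visits": {
        "1/1/11": {"visit_category": "Baseline Visit"},
        "1/1/12": {"visit_category": "Follow Up Visit"},
        "1/1/13": {"visit_category": "Follow Up Visit"},
        "1/1/14": {"visit_category": "Follow Up Visit"},
        "1/1/15": {"visit_category": "Follow Up Visit"},
    }},
]

print(count_matching(docs, "visits", "visit_category", "Follow Up Visit"))  # 6
```

The same approach should carry over to samples by projecting '$samples' instead and matching on, for example, 'samples.v.completed_by'.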