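I want to check a field in the document (pv_time) and insert data into pv_time and the other fields only if that value is not already present. Duplicates are allowed in the other fields. I am trying to do this with $addToSet.
Here is my Python code: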
for row in results.get('rows'):
    path = row[0]
    feedbackId = row[1]
    pvDate = row[2] + ' ' + row[3] + ':' + row[4]
    city = row[5]
    country = row[6]
    pageviews = int(row[7])
    db.customer_feedback_requests_archive.update(
        {'feedback_request_id': ObjectId(feedbackId)},
        {'$addToSet': {'pv_time.' + path: pvDate},
         '$push': {'pv_city.' + path: city, 'pv_country.' + path: country},
         '$inc': {'pv_count.' + path: pageviews}}
    )
The first time I run it, it gives:
{
    "_id" : ObjectId("558d3900996f95a24aa69ef3"),
    "feedback_request_id" : ObjectId("5665015a882a5174379d4dbd"),
    "pv_count" : {
        "main-rating" : 2
    },
    "pv_city" : {
        "main-rating" : [
            "Bengaluru",
            "Bengaluru"
        ]
    },
    "pv_country" : {
        "main-rating" : [
            "India",
            "India"
        ]
    },
    "pv_time" : {
        "main-rating" : [
            "20151208 10:00",
            "20151208 10:01"
        ]
    }
}
But if I run it a second time, it gives:
{
    "_id" : ObjectId("558d3900996f95a24aa69ef3"),
    "feedback_request_id" : ObjectId("5665015a882a5174379d4dbd"),
    "pv_count" : {
        "main-rating" : 4
    },
    "pv_city" : {
        "main-rating" : [
            "Bengaluru",
            "Bengaluru",
            "Bengaluru",
            "Bengaluru"
        ]
    },
    "pv_country" : {
        "main-rating" : [
            "India",
            "India",
            "India",
            "India"
        ]
    },
    "pv_time" : {
        "main-rating" : [
            "20151208 10:00",
            "20151208 10:01"
        ]
    }
}
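I want duplicate values in pv_city and pv_country to appear only when pv_time is different. On the second run, where pv_time is not updated, I expect pv_city and pv_country not to be updated either.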
Answer 0 (score: 1)
It's fairly simple; you just need to extend your query a little.
db.customer_feedback_requests_archive.update(
    {'feedback_request_id': ObjectId(feedbackId), 'pv_time.' + path: {'$ne': pvDate}},
    {'$addToSet': {'pv_time.' + path: pvDate},
     '$push': {'pv_city.' + path: city, 'pv_country.' + path: country},
     '$inc': {'pv_count.' + path: pageviews}}
)
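The extra query condition checks whether the pv_time array already contains the date. The document only matches, and the update only fires, if the date is not there yet, which solves your problem. Because the whole update is skipped when the filter does not match, pv_count is not incremented a second time either.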
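To make the guard easier to inspect, here is a minimal sketch that only builds the filter and update documents for one row, without talking to a database. The helper name build_guarded_update is hypothetical, and the sample call passes the id as a plain string so the snippet runs without bson installed; in the real loop you would pass ObjectId(feedbackId) as in the question.

```python
def build_guarded_update(feedback_id, path, pv_date, city, country, pageviews):
    """Build (filter, update) documents for one row.

    Hypothetical helper for illustration only. The filter matches the
    document only when pv_time.<path> does not already contain pv_date,
    so none of the update operators run for a timestamp seen before.
    """
    time_field = 'pv_time.' + path
    query = {
        'feedback_request_id': feedback_id,
        # On an array field, $ne matches only if no element equals pv_date
        # (it also matches when the field does not exist yet).
        time_field: {'$ne': pv_date},
    }
    update = {
        '$addToSet': {time_field: pv_date},
        '$push': {'pv_city.' + path: city, 'pv_country.' + path: country},
        '$inc': {'pv_count.' + path: pageviews},
    }
    return query, update


# Sample values taken from the documents shown in the question.
# Assumption: a plain string stands in for ObjectId(feedbackId) here.
query, update = build_guarded_update(
    '5665015a882a5174379d4dbd', 'main-rating',
    '20151208 10:00', 'Bengaluru', 'India', 1,
)

# With a real collection this would be passed along as, for example:
#   db.customer_feedback_requests_archive.update_one(query, update)
```

In recent PyMongo versions the legacy update method is no longer available, so update_one is the likely replacement for the call above; the modified_count of its result should then tell you whether the row was new or was skipped by the guard.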