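I have a large dataset of numbers embedded in documents. I want to extract all of them, put them in a sorted list, and then write a "pvalue" back to each document: the number's position in the sorted list divided by the length of the list. I've had a lot of trouble finding out how to do this in Python.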
movie_records = db.movies.find()
list = []
for i in movie_records:
    num = i["total_tickets"]
    # put them all in a list, order the list
for i in movie_records:
    number = i["total_tickets"]
    tickets_pvalue = 1 - ( #position of number /len(list) )
    shows.update({"id": i["id"]}, {'$set': {"total_tickets_pvalue": tickets_pvalue}})
Answer 0 (score: 0)
Without relying on any MongoDB-specific knowledge (you would be better off letting MongoDB do the sorting, as suggested in the comments):
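# Sort on the ticket count only. Sorting (count, document) tuples would
# fall back to comparing dicts on ties, which raises TypeError in Python 3.
movie_records = sorted(db.movies.find(), key=lambda m: m['total_tickets'])

# Select distinct on total_tickets (keeps the first movie for each value)
distinct_records = []
prev = None
for movie in movie_records:
    if movie['total_tickets'] != prev:
        prev = movie['total_tickets']
        distinct_records.append(movie)

for index, movie in enumerate(distinct_records):
    tickets_pvalue = 1 - (index + 1.0) / len(distinct_records)
    # `shows` is the collection from the question. update_one replaces the
    # older Collection.update, which was removed in PyMongo 4.
    shows.update_one({"id": movie["id"]},
                     {'$set': {"total_tickets_pvalue": tickets_pvalue}})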
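One thing to watch for: the loop above only writes a pvalue to one movie per distinct ticket count, so movies that tie with an earlier one are never updated. If every document should get a value, a possible variation is to build a lookup from ticket count to pvalue first and then apply it to all records. The sketch below does this on plain dicts so it can be run without a database. The function name `ticket_pvalues` and the sample records are made up for illustration and are not part of the original code.

```python
def ticket_pvalues(movies, field="total_tickets"):
    """Map each distinct value of `field` to 1 - rank / number_of_distinct_values.

    Ranks are 1-based positions in ascending order, matching the formula
    used in the answer above.
    """
    distinct = sorted({m[field] for m in movies})
    n = len(distinct)
    return {value: 1 - (rank + 1.0) / n for rank, value in enumerate(distinct)}


# Hypothetical in-memory stand-in for the result of db.movies.find()
movies = [
    {"id": 1, "total_tickets": 50},
    {"id": 2, "total_tickets": 10},
    {"id": 3, "total_tickets": 50},
    {"id": 4, "total_tickets": 30},
]

pvalues = ticket_pvalues(movies)

# Every record gets a value, including the two that tie at 50 tickets.
for m in movies:
    m["total_tickets_pvalue"] = pvalues[m["total_tickets"]]
```

Against a real collection, the last loop would presumably become one update per document (for example on the `shows` collection from the question) instead of an in-memory assignment. That part is left out here because it needs a live MongoDB instance.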