The line for doc in collection.find({'is_timeline_valid': True}): raises a message length error. How can I fetch the whole collection without the error? I know about find().limit(), but I don't know how to use it.
Code:
from openpyxl import load_workbook
import pymongo
import os

wb = load_workbook('concilia.xlsx')
ws = wb.active

client = pymongo.MongoClient('...')
db = client['...']
collection = db['...']

r = 2
for doc in collection.find({'is_timeline_valid': True}):
    for dic in doc['timeline']['datas']:
        if 'concilia' in dic['tramite'].lower():
            ws.cell(row=r, column=1).value = doc['id_process_unformatted']
            ws.cell(row=r, column=2).value = dic['data']
            ws.cell(row=r, column=3).value = dic['tramite']
            wb.save('concilia.xlsx')  # saves the workbook on every match
            print('*****************************')
            print(dic['tramite'])
            # print('check!')
            r += 1
Answer 0 (score: 1)
Here is a simple paginator that splits the query execution into paginated queries.
from itertools import count

class PaginatedCursor(object):
    def __init__(self, cur, limit=100):
        self.cur = cur
        self.limit = limit
        self.count = cur.count()

    def __iter__(self):
        # Page boundaries: 0, limit, 2*limit, ... up to the result count.
        skipper = count(start=0, step=self.limit)
        for skip in skipper:
            if skip >= self.count:
                break
            for document in self.cur.skip(skip).limit(self.limit):
                yield document
            # Rewind so skip()/limit() can be applied to the next page.
            self.cur.rewind()

...
cur = collection.find({'is_timeline_valid': True})
...
for doc in PaginatedCursor(cur, limit=100):
    ...
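Note that Cursor.count() was deprecated in PyMongo 3.x and removed in PyMongo 4, so the class above only runs on older drivers. A minimal sketch of the same idea for newer drivers, computing the total with count_documents() and issuing a fresh bounded query per page (the class name and structure here are illustrative, not part of the original answer):

class PaginatedFind(object):
    # Skip/limit paginator for PyMongo 4+, where Cursor.count() is gone.
    def __init__(self, collection, query, limit=100):
        self.collection = collection
        self.query = query
        self.limit = limit
        self.count = collection.count_documents(query)

    def __iter__(self):
        for skip in range(0, self.count, self.limit):
            # A fresh, bounded query per page replaces skip()/rewind()
            # on a single shared cursor.
            page = self.collection.find(self.query).skip(skip).limit(self.limit)
            for document in page:
                yield document

for doc in PaginatedFind(collection, {'is_timeline_valid': True}, limit=100):
    ...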
Answer 1 (score: 1)
I ran into this problem today, and it turned out to be related to specific documents in the collection exceeding the max_bson_size limit. When adding documents to the collection, make sure each document stays under max_bson_size.
import json

document_size_limit = client.max_bson_size
assert len(json.dumps(data)) < document_size_limit
I'm still investigating why the collection accepted documents larger than max_bson_size in the first place.
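Note that len(json.dumps(data)) only approximates the real document size, since BSON adds type tags and length prefixes that JSON does not have. For an exact check you can encode the document with the bson package that ships with PyMongo (bson.encode() is available from PyMongo 3.9 onward; the helper name below is just for illustration):

import bson

def fits_bson_limit(client, data):
    # bson.encode() returns the exact bytes that would go over the wire,
    # so its length can be compared directly against the server limit.
    return len(bson.encode(data)) <= client.max_bson_size

assert fits_bson_limit(client, data)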
Answer 2 (score: 0)
We can pass batch_size to find() to reduce the size of each message.
for doc in collection.find({'is_timeline_valid': True}):
becomes
for doc in collection.find({'is_timeline_valid': True}, batch_size=1):
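batch_size caps how many documents the server packs into each reply, so every message stays under the size limit; a batch size of 1 is the most conservative choice, at the cost of one round trip per document. The same setting can also be applied on the cursor itself (100 here is an arbitrary, less chatty value):

cursor = collection.find({'is_timeline_valid': True}).batch_size(100)
for doc in cursor:
    # Each reply from the server now carries at most 100 documents.
    print(doc['id_process_unformatted'])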