pymongo - message length is larger than server max message size

Time: 2018-02-27 17:35:01

Tags: python mongodb pymongo

for doc in collection.find({'is_timeline_valid': True}): raises a message-length error. How can I iterate over the whole collection without the error? I know about find().limit(), but I don't know how to use it here.

Code:

from openpyxl import load_workbook
import pymongo
import os

wb = load_workbook('concilia.xlsx')
ws = wb.active
client = pymongo.MongoClient('...')
db = client['...']
collection = db['...']

r = 2
for doc in collection.find({'is_timeline_valid': True}):
    for dic in doc['timeline']['datas']:
        if 'concilia' in dic['tramite'].lower():
            ws.cell(row=r, column=1).value = doc['id_process_unformatted']
            ws.cell(row=r, column=2).value = dic['data']
            ws.cell(row=r, column=3).value = dic['tramite']
            wb.save('concilia.xlsx')
            print('*****************************')
            print(dic['tramite'])
            # print('check!')
            r += 1

3 Answers:

Answer 0 (score: 1)

Here is a simple paginator that splits the query into a sequence of paginated skip/limit queries.

from itertools import count

class PaginatedCursor(object):
    def __init__(self, cur, limit=100):
        self.cur = cur
        self.limit = limit
        # Cursor.count() was deprecated in PyMongo 3.7 and removed in 4.0;
        # on modern PyMongo, use collection.count_documents(filter) instead.
        self.count = cur.count()

    def __iter__(self):
        # Yields offsets 0, limit, 2*limit, ... one per page.
        skipper = count(start=0, step=self.limit)

        for skip in skipper:
            if skip >= self.count:
                break

            for document in self.cur.skip(skip).limit(self.limit):
                yield document

            # rewind() resets the cursor so skip/limit can be reapplied.
            self.cur.rewind()

...
cur = collection.find({'is_timeline_valid': True})
...
for doc in PaginatedCursor(cur, limit=100):
   ...
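The skip/limit control flow above can be sketched without a live MongoDB server by substituting a plain list for the cursor. Here `paginate` is a hypothetical helper for illustration, not a PyMongo API; slicing the list stands in for `cur.skip(skip).limit(limit)`:

```python
from itertools import count

def paginate(items, limit=100):
    """Yield items page by page, mimicking the skip/limit loop above."""
    total = len(items)  # stands in for cur.count()
    for skip in count(start=0, step=limit):
        if skip >= total:
            break
        # items[skip:skip + limit] stands in for cur.skip(skip).limit(limit)
        for item in items[skip:skip + limit]:
            yield item

docs = list(range(250))
assert list(paginate(docs, limit=100)) == docs  # order and content preserved
```

Each "page" issues a fresh bounded fetch, which is why the real version needs `rewind()` between pages.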

Answer 1 (score: 1)

I ran into this today, and it turned out the problem was that the size of a particular document in the collection exceeded the max_bson_size limit. When adding documents to the collection, make sure the document size stays below max_bson_size.

import json

document_size_limit = client.max_bson_size
assert len(json.dumps(data)) < document_size_limit

I am currently investigating why the collection accepted documents larger than max_bson_size in the first place.
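Note that len(json.dumps(data)) only approximates the BSON size, since BSON adds per-field headers and fixed-width types, so this is a heuristic guard rather than an exact check. A minimal sketch of such a pre-insert guard, assuming the default 16 MB server limit (in practice read client.max_bson_size as above; `fits_in_document_limit` is an illustrative helper):

```python
import json

# Default server limit; in practice read client.max_bson_size instead.
MAX_BSON_SIZE = 16 * 1024 * 1024

def fits_in_document_limit(data, limit=MAX_BSON_SIZE):
    """Heuristic check: JSON length as a rough proxy for BSON size."""
    return len(json.dumps(data)) < limit

assert fits_in_document_limit({'small': 'doc'})
assert not fits_in_document_limit({'big': 'x' * (17 * 1024 * 1024)})
```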

Answer 2 (score: 0)

We can pass batch_size to find() to reduce the size of each message.

for doc in collection.find({'is_timeline_valid': True}):

becomes

for doc in collection.find({'is_timeline_valid': True}, batch_size=1):
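batch_size controls how many documents the server returns per network message, not how many documents the query matches; the cursor still yields every result, just fetched in smaller batches. The batching idea can be sketched in plain Python, with `chunked` as an illustrative helper (not a PyMongo API) standing in for the driver's per-batch fetches:

```python
from itertools import islice

def chunked(iterable, batch_size):
    """Yield successive batches, like a cursor fetching batch_size docs per message."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

batches = list(chunked(range(7), batch_size=3))
assert batches == [[0, 1, 2], [3, 4, 5], [6]]
```

A very small batch_size (such as 1) keeps each message tiny at the cost of more round trips, so a moderate value is usually a better trade-off.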