pymongo.errors.CursorNotFound: cursor ID '...' not found on server

Time: 2018-06-08 12:41:19

Tags: python mongodb pymongo

I am trying to read 1M documents from MongoDB into a CSV file using pymongo. My code is as follows:

import csv
import json
from pymongo import MongoClient
from datetime import datetime
from bson import json_util
from tempfile import NamedTemporaryFile

client = MongoClient('mongodb://login:pass@server:port')
db = client.some_mongo_database
collection = db.some_mongo_collection

fromDate = datetime.strptime("2018-05-15 21:00", '%Y-%m-%d %H:%M')
tillDate = datetime.strptime("2018-05-16 21:00", '%Y-%m-%d %H:%M')
query = {
        "$or": [
                 {"LastUpdated": {"$gte": fromDate
                                , "$lt": tillDate}
                 },
                 {"$and": [
                            {"Created": {"$gte": fromDate
                                       , "$lt": tillDate}
                            },
                            {"LastUpdated": None}
                       ]
                  }
            ]
        }

cursor = collection.find(query, no_cursor_timeout=True)

After that, if I do this:

for row in cursor:
    print(row)
cursor.close()

everything works and I get all the documents. But if I do this instead:

with NamedTemporaryFile("w", delete=False) as temp:
    csv_writer = csv.writer(temp, delimiter='\t', quotechar='\b', quoting=csv.QUOTE_MINIMAL)
    for row in cursor:
        csv_row = [ [[row['_id']], str(json.dumps(row,default=json_util.default))] ]
        csv_writer.writerows(csv_row)
cursor.close()

then after about 2 minutes, having received roughly 200K documents, I get:

Traceback (most recent call last):
  File "mongo_data_loader.py", line 25, in <module>
    for row in cursor:

  File "/Library/Python/2.7/site-packages/pymongo/cursor.py", line 1169, in next
    if len(self.__data) or self._refresh():
  File "/Library/Python/2.7/site-packages/pymongo/cursor.py", line 1106, in _refresh
    self.__send_message(g)
  File "/Library/Python/2.7/site-packages/pymongo/cursor.py", line 975, in __send_message
    helpers._check_command_response(first)
  File "/Library/Python/2.7/site-packages/pymongo/helpers.py", line 142, in _check_command_response
    raise CursorNotFound(errmsg, code, response)
pymongo.errors.CursorNotFound: cursor id 184972541202 not found

What am I doing wrong?

Python 2.7.10
pymongo 3.6.1
mongo db.version() 3.6.5

1 answer:

Answer 0 (score: 1):

As a temporary workaround, I did this:

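from pymongo.errors import CursorNotFound

processed = 0  # number of documents already written

while True:
    # Reopen the cursor, skipping the documents written so far.
    # Note: without a sort, MongoDB does not guarantee a stable order across
    # queries, so skip() may repeat or miss documents.
    cursor = collection.find(query, no_cursor_timeout=True).skip(processed)

    try:
        for row in cursor:
            csv_row = [ [[row['_id']], str(json.dumps(row,default=json_util.default))] ]
            csv_writer.writerows(csv_row)
            processed += 1
        cursor.close()
        break
    except CursorNotFound:
        print("Lost cursor. Retry with skip")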
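
A side note on the row-building line used in the loop: it wraps `_id` in a list and passes a nested list to `writerows`, so the first column ends up as the text of a Python list. A flatter variant might look like the sketch below. The helper names `doc_to_row` and `write_docs` are hypothetical, the default CSV quote character is used instead of `'\b'`, and `default=str` is only a stand-in so the sketch runs without pymongo installed; passing `json_util.default` from `bson`, as in the question, would produce extended JSON instead.

```python
import csv
import io
import json


def doc_to_row(doc, default=str):
    """Return a flat [id, json] row for one document.

    `default` is the fallback serializer handed to json.dumps for values
    it cannot encode itself (str here is an assumption for illustration).
    """
    return [str(doc["_id"]), json.dumps(doc, default=default)]


def write_docs(docs, out, default=str):
    """Write one tab-separated row per document and return the row count."""
    writer = csv.writer(out, delimiter="\t", quoting=csv.QUOTE_MINIMAL)
    count = 0
    for doc in docs:
        writer.writerow(doc_to_row(doc, default))
        count += 1
    return count


if __name__ == "__main__":
    buf = io.StringIO()
    written = write_docs([{"_id": 1, "a": "x"}, {"_id": 2, "a": "y"}], buf)
    print(written, "rows written")
```

With a real cursor, `write_docs(cursor, temp, default=json_util.default)` would be the closest equivalent to the original loop body.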

But the question of why the behavior described above occurs is still open.
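
One alternative that avoids a long-lived cursor altogether is to page through the collection in `_id` order, issuing a fresh, short query for each batch and resuming from the last `_id` seen. This is not the method used in the answer above and it does not explain the original error; it is only a minimal sketch, assuming `_id` values are unique and comparable (true for the default ObjectId). The function name `iter_by_id` and the `batch_size` default are hypothetical. It relies only on `find`, `sort`, and `limit`, so it needs no import of its own.

```python
def iter_by_id(collection, query, batch_size=1000):
    """Yield documents matching `query` in ascending _id order.

    Each batch is fetched with its own short-lived cursor, so no single
    cursor has to survive for the whole export.
    """
    last_id = None
    while True:
        if last_id is None:
            page_query = query
        else:
            # Resume strictly after the last document already yielded.
            page_query = {"$and": [query, {"_id": {"$gt": last_id}}]}

        # 1 means ascending order (same value as pymongo.ASCENDING).
        batch = list(collection.find(page_query).sort("_id", 1).limit(batch_size))
        if not batch:
            return

        for doc in batch:
            yield doc
        last_id = batch[-1]["_id"]
```

Usage would then be `for row in iter_by_id(collection, query): ...` in place of iterating the cursor directly. Unlike `skip(processed)`, resuming by `_id` does not depend on the server returning documents in the same order on every query, though documents inserted or deleted during the export can still affect what is returned.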