The source code is as follows:
# total_record holds more than 4 million records, loaded from a file
n = 200
idlist = [total_record[i:i + n] for i in range(0, len(total_record), n)]
for sub_list in idlist:
    ret = self.table.find({'_id': {'$in': sub_list}})
    for r in ret:
        # some logic to process r
This code processes part of the data correctly and then fails with the error below. Note that this is only a read operation, not a write.
    for r in ret:
  File "/data/yard/base/miniconda3/lib/python3.6/site-packages/pymongo/cursor.py", line 1189, in next
    if len(self.__data) or self._refresh():
  File "/data/yard/base/miniconda3/lib/python3.6/site-packages/pymongo/cursor.py", line 1126, in _refresh
    self.__send_message(g)
  File "/data/yard/base/miniconda3/lib/python3.6/site-packages/pymongo/cursor.py", line 982, in __send_message
    helpers._check_command_response(first)
  File "/data/yard/base/miniconda3/lib/python3.6/site-packages/pymongo/helpers.py", line 155, in _check_command_response
    raise OperationFailure(msg % errmsg, code, response)
pymongo.errors.OperationFailure: operation was interrupted
My Python version is 3.6.4 and my pymongo version is 3.7.1.
Answer 0 (score: 0)
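You can try creating the MongoDB cursor with no_cursor_timeout and limiting the batch size with batch_size(). Because no_cursor_timeout=True means the server will not clean up an idle cursor on its own, close the cursor explicitly when you are done with it: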
for sub_list in idlist:
    ret = self.table.find({'_id': {'$in': sub_list}}, no_cursor_timeout=True).batch_size(40)
    try:
        for r in ret:
            # some logic to process r
    finally:
        ret.close()
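For reference, here is a minimal, self-contained sketch of the same pattern wrapped in a function. The names chunked, process_ids and handle are hypothetical and not part of pymongo. The collection argument is assumed to behave like a pymongo Collection, meaning its find() accepts no_cursor_timeout and returns a cursor that supports batch_size() and close(). The chunk size of 200 and batch size of 40 are simply the values used above. This may not remove every cause of "operation was interrupted", but it keeps each cursor small and guarantees it is closed.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def process_ids(collection, ids, handle, chunk_size=200, batch_size=40):
    """Look up `ids` chunk by chunk and pass every matching document to `handle`.

    `collection` is assumed to behave like a pymongo Collection; `handle` is a
    hypothetical callback standing in for "some logic to process r".
    Returns the number of documents handled.
    """
    processed = 0
    for chunk in chunked(ids, chunk_size):
        # One cursor per chunk, with no idle timeout and a small batch size.
        cursor = collection.find(
            {'_id': {'$in': chunk}}, no_cursor_timeout=True
        ).batch_size(batch_size)
        try:
            for doc in cursor:
                handle(doc)
                processed += 1
        finally:
            # Always release the server-side cursor, even if handle() raises.
            cursor.close()
    return processed
```

Inside the original class this could be called as process_ids(self.table, total_record, some_callback). Since chunked() is a generator, the 4 million ids are sliced lazily instead of being copied into one large list of sublists up front.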