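I use MongoDB to store data that I scrape from the web with Scrapy. The problem is that when I run a long crawl with several spiders at once, Mongo crashes and the spiders start receiving the following message: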
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/twisted/internet/defer.py", line 654, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/home/ubuntu/search/decapod/updater/updater/pipelines.py", line 90, in process_item
self.db[self.collection_name].insert_one(dict(item))
File "/usr/local/lib/python3.5/dist-packages/pymongo/collection.py", line 693, in insert_one
session=session),
File "/usr/local/lib/python3.5/dist-packages/pymongo/collection.py", line 607, in _insert
bypass_doc_val, session)
File "/usr/local/lib/python3.5/dist-packages/pymongo/collection.py", line 595, in _insert_one
acknowledged, _insert_command, session)
File "/usr/local/lib/python3.5/dist-packages/pymongo/mongo_client.py", line 1242, in _retryable_write
with self._tmp_session(session) as s:
File "/usr/lib/python3.5/contextlib.py", line 59, in __enter__
return next(self.gen)
File "/usr/local/lib/python3.5/dist-packages/pymongo/mongo_client.py", line 1571, in _tmp_session
s = self._ensure_session(session)
File "/usr/local/lib/python3.5/dist-packages/pymongo/mongo_client.py", line 1558, in _ensure_session
return self.__start_session(True, causal_consistency=False)
File "/usr/local/lib/python3.5/dist-packages/pymongo/mongo_client.py", line 1511, in __start_session
server_session = self._get_server_session()
File "/usr/local/lib/python3.5/dist-packages/pymongo/mongo_client.py", line 1544, in _get_server_session
return self._topology.get_server_session()
File "/usr/local/lib/python3.5/dist-packages/pymongo/topology.py", line 427, in get_server_session
None)
File "/usr/local/lib/python3.5/dist-packages/pymongo/topology.py", line 199, in _select_servers_loop
self._error_message(selector))
pymongo.errors.ServerSelectionTimeoutError: mongodb.getmore.com.br:27017: timed out
How can I restart Mongo automatically when it crashes, or prevent this from happening in the first place?
I am currently running Mongo on a t2.small EC2 instance.
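For context, here is a minimal sketch of what I could wrap around the `insert_one` call in `process_item` so that a spider survives a short Mongo outage instead of failing on the first timeout. It does not restart Mongo or explain the crash. The helper name `call_with_retry` and its parameters are hypothetical, and the PyMongo usage shown in the trailing comment is an assumption about how it would be wired in (`ServerSelectionTimeoutError` is a subclass of `pymongo.errors.AutoReconnect`).

```python
import time


def call_with_retry(operation, retry_on, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call operation(); if it raises one of retry_on, wait and try again.

    The wait doubles after each failure (base_delay, 2 * base_delay, ...).
    The final failure is re-raised so the caller still sees the error.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original exception
            sleep(base_delay * (2 ** attempt))


# Hypothetical use inside process_item (needs pymongo, not imported here):
#   from pymongo.errors import AutoReconnect
#   call_with_retry(
#       lambda: self.db[self.collection_name].insert_one(dict(item)),
#       retry_on=(AutoReconnect,),
#   )
```

Note that `time.sleep` blocks Scrapy's Twisted reactor while it waits, so this only makes sense for short delays and a small number of attempts.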