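I have a Python thread that queries MongoDB, and this is the first time I have run into this error. It is also the first time I have queried this database, which is large: 500 million documents. Here is the error: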
IV
BIVV
adding AAXN to retry list
adding AABA to retry list
Fatal Python error: Cannot recover from stack overflow.
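I did not write the code that prints the "adding ... to retry list" messages. Python appears to be adding these stock symbols back to the queue right before the stack overflow error, and then all the threads die.
I tried calling gc.collect on every iteration that pulls from the queue, but that did not fix it. It happens with both 15 threads and 5 threads, on the same stock symbols. I am fairly sure I do not have a memory leak. Should I just delete all the variables each thread holds on every iteration? Or maybe try multiprocessing instead of multithreading? Any suggestions?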
Answer 0 (score: -1)
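I looked more closely at the "adding" messages: they are printed by pandas_datareader, specifically mstar/daily.py.
It looks like the retry count is never decremented before the method calls itself again (the decrement comes after the recursive call), so symbols that keep failing cause unbounded recursion and the stack overflow.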
def _dl_mult_symbols(self, symbols):
    failed = []
    symbol_data = []
    for symbol in symbols:
        params = self._url_params()
        params.update({"ticker": symbol})
        try:
            resp = requests.get(self.url, params=params)
        except Exception:
            if symbol not in failed:
                if self.retry_count == 0:
                    warn("skipping symbol %s: number of retries "
                         "exceeded." % symbol)
                    pass
                else:
                    print("adding %s to retry list" % symbol)
                    failed.append(symbol)
        else:
            if resp.status_code == requests.codes.ok:
                jsondata = resp.json()
                if jsondata is None:
                    failed.append(symbol)
                    continue
                jsdata = self._restruct_json(symbol=symbol,
                                             jsondata=jsondata)
                symbol_data.extend(jsdata)
            else:
                raise Exception("Request Error!: %s : %s" % (
                    resp.status_code, resp.reason))
        time.sleep(self.pause)
    if len(failed) > 0 and self.retry_count > 0:
        # TODO: This appears to do nothing since
        # TODO: successful symbols are not added to
        self._dl_mult_symbols(symbols=failed)
        self.retry_count -= 1
    else:
        self.retry_count = 0
    if not symbol_data:
        raise ValueError('All symbols were invalid')
    elif self.retry_count == 0 and len(failed) > 0:
        warn("The following symbols were excluded do to http "
             "request errors: \n %s" % failed, SymbolWarning)
    symbols_df = DataFrame(data=symbol_data)
    dfx = symbols_df.set_index(["Symbol", "Date"])
    return dfx
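To make the failure mode easier to see, here is a minimal sketch of the same retry shape next to a bounded alternative. It is not pandas_datareader code: `retry_recursive`, `retry_bounded`, the `fetch` callback and the `state` dict are hypothetical names invented for illustration, and `fetch` is assumed to return True on success and False on failure.

```python
def retry_recursive(symbols, fetch, state):
    """Toy version of the quoted pattern (hypothetical, not library code).

    The counter is decremented only AFTER the recursive call returns,
    so while any symbol keeps failing the guard never changes.
    """
    failed = [s for s in symbols if not fetch(s)]
    if failed and state["retry_count"] > 0:
        retry_recursive(failed, fetch, state)
        state["retry_count"] -= 1  # not reached while failures persist


def retry_bounded(symbols, fetch, retry_count=3):
    """Iterative alternative: at most retry_count + 1 passes, no recursion.

    Returns (succeeded, still_failing).
    """
    pending = list(symbols)
    succeeded = []
    for _ in range(retry_count + 1):
        still_failing = []
        for symbol in pending:
            if fetch(symbol):
                succeeded.append(symbol)
            else:
                still_failing.append(symbol)
        pending = still_failing
        if not pending:
            break
    return succeeded, pending


if __name__ == "__main__":
    def fake_fetch(symbol):
        # Assumption for the demo: only "IV" ever succeeds.
        return symbol == "IV"

    print(retry_bounded(["IV", "AAXN", "AABA"], fake_fetch))
    try:
        retry_recursive(["AAXN"], fake_fetch, {"retry_count": 3})
    except RecursionError as exc:
        print("recursive version blew the stack:", exc)
```

In this toy version Python raises a RecursionError once the recursion limit is hit. The bounded loop gives up after a fixed number of passes and hands back the symbols that never succeeded, so the caller can decide whether to warn about them or skip them.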