Using multiprocessing

Time: 2018-02-25 21:00:09

Tags: python parsing beautifulsoup multiprocessing file-writing

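I'm working on an HTML parser that uses a Python multiprocessing pool, because it has to run through a huge number of pages. The output for each page is saved to a separate CSV file. The problem is that I sometimes get an unexpected error and the whole program crashes, even though I handle errors almost everywhere: reading pages, parsing pages, even writing files. On top of that, the script seems to crash after it has finished writing a batch of files, so there shouldn't be anything left that could break. After a whole day of debugging, I'm at a loss.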

Error:

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "D:\Programy\Python36-32\lib\multiprocessing\pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "D:\Programy\Python36-32\lib\multiprocessing\pool.py", line 44, in mapstar
    return list(map(*args))
  File "D:\ppp\Python\parser\run.py", line 244, in media_process
    save_media_product(DIRECTORY, category, media_data)
  File "D:\ppp\Python\parser\manage_output.py", line 180, in save_media_product
    _file_manager(target_file, temp, temp2)
  File "D:\ppp\Python\store_parser\manage_output.py", line 214, in _file_manager
    file_to_write.close()
UnboundLocalError: local variable 'file_to_write' referenced before assignment
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\ppp\Python\store_parser\run.py", line 356, in <module>
    main()
  File "D:\Rzeczy Mariusza\Python\store_parser\run.py", line 318, in main
    process.map(media_process, batch)
  File "D:\Programy\Python36-32\lib\multiprocessing\pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "D:\Programy\Python36-32\lib\multiprocessing\pool.py", line 644, in get
    raise self._value
UnboundLocalError: local variable 'file_to_write' referenced before assignment

It looks like a variable assignment error, but it isn't:

try:
    file_to_write = open(target_file, 'w')
except OSError:
    message = 'OSError while writing file name - {}'.format(target_file)
    log_error(message)
except UnboundLocalError:
    message = 'UnboundLocalError while writing file name - {}'.format(target_file)
    log_error(message)
except Exception as e:
    message = 'Total failure "{}" while writing file name - {}'.format(e, target_file)
    log_error(message)
else:
    file_to_write.write(temp)
    file_to_write.write(temp2)
finally:
    file_to_write.close()
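
One reading of the traceback, offered as a sketch rather than a diagnosis of the full script: the `UnboundLocalError` is raised by `file_to_write.close()` inside `finally`, which runs even when `open()` itself failed and the name was never bound. An exception raised in `finally` is outside the reach of the `except` clauses of the same `try`. The example below reproduces that shape and contrasts it with a `with` statement. `write_unsafe`, `write_safe`, the `log_error` stand-in, and the temporary paths are hypothetical names introduced for illustration.

```python
import os
import tempfile


def log_error(message):
    # Hypothetical stand-in for the question's log_error helper.
    print(message)


def write_unsafe(target_file, temp, temp2):
    # Same shape as the snippet above: finally runs even when open() failed.
    try:
        file_to_write = open(target_file, 'w')
    except OSError:
        log_error('OSError while writing file name - {}'.format(target_file))
    else:
        file_to_write.write(temp)
        file_to_write.write(temp2)
    finally:
        # If open() raised, file_to_write was never bound, so this line
        # raises UnboundLocalError, and no except clause above can catch it.
        file_to_write.close()


def write_safe(target_file, temp, temp2):
    # The with statement closes the file only if open() succeeded.
    try:
        with open(target_file, 'w') as file_to_write:
            file_to_write.write(temp)
            file_to_write.write(temp2)
    except OSError as e:
        log_error('OSError "{}" while writing file name - {}'.format(e, target_file))
        return False
    return True


# A path inside a directory that does not exist, so open() raises an OSError.
base_dir = tempfile.mkdtemp()
bad_path = os.path.join(base_dir, 'missing_dir', 'out.csv')

try:
    write_unsafe(bad_path, 'a', 'b')
    unsafe_result = 'no error'
except UnboundLocalError:
    unsafe_result = 'UnboundLocalError'

safe_result = write_safe(bad_path, 'a', 'b')
```

If this is what happens in `_file_manager`, the logged `OSError` message for the same file name should appear just before the crash, and it would point at the underlying cause (for example an invalid character in the file name or a missing directory).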

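The line except Exception as e: doesn't help at all; the whole thing still crashes. So far I have only ruled out running out of memory: the script is designed to run on a low-spec VPS, but during testing I run it in an environment with 8 GB of RAM. So if you have any theories, please share.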
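
On the "whole thing still crashes" part: the second traceback shows `Pool.map` re-raising the worker's exception in the parent process, which is what ends the run. A minimal sketch of one way to keep a single bad page from aborting a batch is to catch exceptions inside the worker and return them as data. Here `media_process` is a toy stand-in for the real worker, and `safe_media_process` and `run_batch` are hypothetical helpers, not part of the original script.

```python
import multiprocessing


def media_process(item):
    # Hypothetical stand-in for the question's worker; fails on negative input.
    if item < 0:
        raise ValueError('cannot process {}'.format(item))
    return item * 2


def safe_media_process(item):
    # Catch everything inside the worker so Pool.map has nothing to re-raise.
    try:
        return (True, item, media_process(item))
    except Exception as e:
        # Return the error as text; some exception objects do not pickle cleanly.
        return (False, item, '{}: {}'.format(type(e).__name__, e))


def run_batch(batch, processes=2):
    # Run one batch and split the outcomes into successes and failures.
    with multiprocessing.Pool(processes) as pool:
        results = pool.map(safe_media_process, batch)
    succeeded = [(item, value) for ok, item, value in results if ok]
    failed = [(item, error) for ok, item, error in results if not ok]
    return succeeded, failed
```

On Windows, `run_batch(...)` should be called from under an `if __name__ == '__main__':` guard, as with any use of `multiprocessing.Pool`. The failed items can then be logged or retried after the batch finishes instead of stopping the program.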

0 answers:

No answers