我创建了一个使用PyQt5“渲染”HTML并返回结果的函数。它如下:
def render(source_html):
"""Fully render HTML, JavaScript and all."""
import sys
from PyQt5.QtWidgets import QApplication
from PyQt5.QtWebKitWidgets import QWebPage
class Render(QWebPage):
def __init__(self, html):
self.html = None
self.app = QApplication(sys.argv)
QWebPage.__init__(self)
self.loadFinished.connect(self._loadFinished)
self.mainFrame().setHtml(html)
self.app.exec_()
def _loadFinished(self, result):
self.html = self.mainFrame().toHtml()
self.app.quit()
return Render(source_html).html
偶尔它的线程将无限期挂起,我将不得不杀死整个程序。不幸的是,PyQt5也可能是一个黑盒子,因为我不知道如果它行为不当会如何杀死它。
理想情况下,我可以实现 n 秒的超时。作为一种解决方法,我将函数放在它自己的脚本render.py
中,并通过subprocess
通过这个怪物来调用它:
def render(html):
"""Return fully rendered HTML, JavaScript and all."""
args = ['render.py', '-']
timeout = 20
try:
return subprocess.check_output(args,
input=html,
timeout=timeout,
universal_newlines=True)
# Python 2's subprocess.check_output doesn't support input or timeout
except TypeError:
class SubprocessError(Exception):
"""Base exception from subprocess module."""
pass
class TimeoutExpired(SubprocessError):
"""
This exception is raised when the timeout expires while
waiting for a child process.
"""
def __init__(self, cmd, timeout, output=None):
super(TimeoutExpired, self).__init__()
self.cmd = cmd
self.timeout = timeout
self.output = output
def __str__(self):
return ('Command %r timed out after %s seconds' %
(self.cmd, self.timeout))
process = subprocess.Popen(['timeout', str(timeout)] + args,
stderr=subprocess.PIPE,
stdin=subprocess.PIPE,
stdout=subprocess.PIPE)
# pipe html into render.py's stdin
output = process.communicate(
html.encode('utf8'))[0].decode('latin1')
retcode = process.poll()
if retcode == 124:
raise TimeoutExpired(args, timeout)
return output
multiprocessing
模块似乎大大简化了事情:
from multiprocessing import Pool
pool = Pool(1)
rendered_html = pool.apply_async(render, args=(html,)).get(timeout=20)
pool.terminate()
有没有办法实现不需要这种恶作剧的超时?
答案 0 :(得分:0)
我也在寻找一种解决方案,显然没有一个是故意的。
如果您使用的是Linux,而您想要的只是Python在N秒内尝试执行某项操作,然后在这N秒后超时并处理错误情况,您可以执行以下操作:
import time
import signal
# This stuff is so when we get SIGALRM from the timeout functionality we can handle it instead of
# crashing to the ground
class TimeOutError(Exception):
pass
def raise_timeout(var1, var2):
raise TimeOutError
signal.signal(signal.SIGALRM, raise_timeout)
# Turn the alarm on
signal.alarm(1)
# Try your thing
try:
time.sleep(2)
except TimeOutError as e:
print(" We hit our timeout value and we bailed out of whatever that BS was.")
# Remember to turn the alarm back off if your attempt succeeds!
signal.alarm(0)
一个缺点是您不能嵌套 signal.alarm()挂钩;如果您在try语句中调用了其他内容,然后又设置了signal.alarm(),它将覆盖第一个内容并弄乱您的内容。