I have a Python script containing a big loop that reads a file and does some processing (I am using several packages such as urllib2, httplib2, and BeautifulSoup).
It looks like this:
try:
    with open(fileName, 'r') as file:
        for i, line in enumerate(file):
            try:
                # a lot of code
                # ....
                # ....
            except urllib2.HTTPError:
                print "\n >>> HTTPError"
            # a lot of other exceptions
            # ....
            except (KeyboardInterrupt, SystemExit):
                print "Process manually stopped"
                raise
            except Exception, e:
                print(repr(e))
except (KeyboardInterrupt, SystemExit):
    print "Process manually stopped"
    # some stuff
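The problem is that the program stops when I hit Ctrl-C, but the interrupt is not caught by either of my two KeyboardInterrupt handlers, even though I am sure execution is inside the loop at that moment (and therefore at least inside the big try/except).
How is that possible? At first I thought one of the packages I use was not handling exceptions correctly (for example by using a bare "except:"), but if that were the case my script would not stop. The script does stop, so the interrupt should be caught by at least one of my two except clauses, right?
Where am I wrong?
Thanks in advance!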
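EDIT
After adding a finally: clause after the try-except and printing the traceback in both try-except blocks, it usually shows None when I hit Ctrl-C.
But I once managed to get this (it seems to come from urllib2, but I don't know whether it is the reason I can't catch a KeyboardInterrupt):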
Traceback (most recent call last):
  File "/home/darcot/code/Crawler/crawler.py", line 294, in get_articles_from_file
    content = Extractor(extractor='ArticleExtractor', url=url).getText()
  File "/usr/local/lib/python2.7/site-packages/boilerpipe/extract/__init__.py", line 36, in __init__
    connection = urllib2.urlopen(request)
  File "/usr/local/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/local/lib/python2.7/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/local/lib/python2.7/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/local/lib/python2.7/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/local/lib/python2.7/urllib2.py", line 1173, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/local/lib/python2.7/urllib2.py", line 1148, in do_open
    raise URLError(err)
URLError: <urlopen error [Errno 4] Interrupted system call>
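Answer 0 (score: 2)
As I already suggested in the comments, this problem is probably caused by the parts of the code that were left out of the question. However, the exact code should not be relevant, because Python normally raises a KeyboardInterrupt exception when Python code is interrupted by Ctrl-C.
You mentioned in the comments that you use the boilerpipe Python package. This package uses JPype to create language bindings to Java... I can reproduce your problem with the following Python program: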
from boilerpipe.extract import Extractor
import time

try:
    for i in range(10):
        time.sleep(1)
except KeyboardInterrupt:
    print "Keyboard Interrupt Exception"
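If you interrupt this program with Ctrl-C, no exception is raised. The program appears to terminate immediately, without giving the Python interpreter a chance to raise the exception. When the import of boilerpipe is removed, the problem disappears...
A debugging session with gdb shows that Python starts a lot of threads if boilerpipe is imported: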
gdb --args python boilerpipe_test.py
[...]
(gdb) run
Starting program: /home/fabian/Experimente/pykeyinterrupt/bin/python boilerpipe_test.py
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[New Thread 0x7fffef62b700 (LWP 3840)]
[New Thread 0x7fffef52a700 (LWP 3841)]
[New Thread 0x7fffef429700 (LWP 3842)]
[New Thread 0x7fffef328700 (LWP 3843)]
[New Thread 0x7fffed99a700 (LWP 3844)]
[New Thread 0x7fffed899700 (LWP 3845)]
[New Thread 0x7fffed798700 (LWP 3846)]
[New Thread 0x7fffed697700 (LWP 3847)]
[New Thread 0x7fffed596700 (LWP 3848)]
[New Thread 0x7fffed495700 (LWP 3849)]
[New Thread 0x7fffed394700 (LWP 3850)]
[New Thread 0x7fffed293700 (LWP 3851)]
[New Thread 0x7fffed192700 (LWP 3852)]
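For a quick cross-check without gdb, the number of OS-level threads can also be read from inside the process. The following is a minimal sketch written for Python 3 and assuming a Linux-style /proc filesystem; count_os_threads and TASK_DIR are hypothetical names used for illustration, not part of boilerpipe or JPype. threading.enumerate() is not used because it only lists threads known to Python's threading module, so it would likely miss threads started natively by a JVM.

```python
import os

TASK_DIR = "/proc/self/task"  # assumption: Linux-style procfs is available


def count_os_threads():
    """Return the number of OS-level threads in this process, or None if unknown."""
    try:
        # Each entry in this directory corresponds to one thread of the process.
        return len(os.listdir(TASK_DIR))
    except OSError:
        # No procfs (e.g. macOS or Windows): the count cannot be read this way.
        return None


if __name__ == "__main__":
    print("threads before import:", count_os_threads())
    # Assumption: the suspect import would go here, e.g.
    #   from boilerpipe.extract import Extractor
    print("threads after import:", count_os_threads())
```

If the count jumps after the import, that would be consistent with the gdb output above.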
gdb session without importing boilerpipe:
gdb --args python boilerpipe_test.py
[...]
(gdb) r
Starting program: /home/fabian/Experimente/pykeyinterrupt/bin/python boilerpipe_test.py
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
^C
Program received signal SIGINT, Interrupt.
0x00007ffff7529533 in __select_nocancel () from /usr/lib/libc.so.6
(gdb) signal 2
Continuing with signal SIGINT.
Keyboard Interrupt Exception
[Inferior 1 (process 3904) exited normally]
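So I assume that your Ctrl-C signal is handled in a different thread, or that jpype does other odd things that break the handling of Ctrl-C.
EDIT: As a possible workaround, you can register a signal handler that catches the SIGINT signal the process receives when you hit Ctrl-C. The signal handler fires even if boilerpipe and JPype are imported. This way you are notified when the user hits Ctrl-C, and you can handle the event at a central point in your program. You can terminate the script in this handler if you want to. If you don't, the script continues running after the signal handler function returns. See the following example: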
from boilerpipe.extract import Extractor
import time
import signal
import sys

def interrupt_handler(signum, frame):
    print "Signal handler!!!"
    sys.exit(-2)  # Terminate process here as catching the signal removes the close process behaviour of Ctrl-C

signal.signal(signal.SIGINT, interrupt_handler)

try:
    for i in range(10):
        time.sleep(1)
        # your_url = "http://www.zeit.de"
        # extractor = Extractor(extractor='ArticleExtractor', url=your_url)
except KeyboardInterrupt:
    print "Keyboard Interrupt Exception"
Answer 1 (score: 0)
Most likely you issued Ctrl-C while your script was outside the try block, so the signal was not caught.
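If that is the cause, one way to rule it out is to run all of the work, including setup and cleanup, inside a single outermost try block. A minimal sketch in Python 3, where run_guarded and interrupted_main are hypothetical names used only for illustration:

```python
def run_guarded(main):
    """Run main() and report whether it finished without being interrupted."""
    try:
        main()
    except KeyboardInterrupt:
        print("Process manually stopped")
        return False
    return True


def interrupted_main():
    """Hypothetical stand-in that simulates Ctrl-C arriving during setup code."""
    raise KeyboardInterrupt()


if __name__ == "__main__":
    print(run_guarded(interrupted_main))
```

This only helps when the interpreter actually raises KeyboardInterrupt; it does not cover the case where the process is terminated before Python can raise the exception.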