Question

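I use this script to download JSON about once a minute and save it to a unique filename. Sometimes it just hangs. The printed line shows that it saved the file successfully, but then it sits there for hours until I notice.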

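My questions are: 1) is there something obvious that I have not added (some kind of timeout?), and 2) what can I do to find out where it is stuck while it is stuck (other than putting a print statement between every other line)?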

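If the internet connection is not responding, I see the "has FAILED" line once a minute, as expected, until the connection comes back, so that does not seem to be the problem.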

Note: I save and reload n this way to guard against random crashes, restarts, and so on.

import json
import time      # needed for time.sleep() at the bottom of the loop
import urllib2
import numpy as np

n = np.load("n.npy")
print "loaded n: ", n

n += 10 # leave a gap

np.save("n", n) 
print "saved: n: ", n

url = "http:// etc..."

for i in range(10000):

    n = np.load("n.npy")
    n += 1

    try:
        req        = urllib2.Request(url, headers={"Connection":"keep-alive",
                                                   "User-Agent":"Mozilla/5.0"})
        response   = urllib2.urlopen(req)

        dictionary = json.loads(response.read())

        filename   = "info_" + str(100000+n)[1:]
        with open(filename, 'w') as outfile:
            json.dump(dictionary, outfile)

        np.save("n", n)
        print "n, i = ", n, i, filename, "len = ", len(dictionary)
    except Exception:
        # Catch ordinary errors only, so that Ctrl-C can still stop the script
        print "n, i = ", n, i, " has FAILED, now continuing..."
    time.sleep(50)
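
On the first question, a timeout does look like the missing piece. `urllib2.urlopen` accepts a `timeout` argument in seconds; without it the call falls back to the global default socket timeout, which means blocking indefinitely unless `socket.setdefaulttimeout` has been called. Below is a minimal sketch of the download step with a timeout. The helper name `fetch_json`, the 30 second default, and the `opener` parameter (there only so the function can be exercised without a network) are assumptions for illustration and not part of the original script. The import fallback lets the same sketch run on Python 3.

```python
import json

try:
    import urllib2 as urlrequest          # Python 2, as in the script above
except ImportError:
    import urllib.request as urlrequest   # Python 3 equivalent


def fetch_json(url, timeout=30, opener=urlrequest.urlopen):
    """Download `url` and parse it as JSON, giving up on a stalled socket.

    `timeout` is passed to urlopen and applies to each blocking socket
    operation (connecting and reading), not to the download as a whole.
    `opener` is a hypothetical hook so a fake can be injected in tests.
    """
    req = urlrequest.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    response = opener(req, timeout=timeout)
    try:
        # read() can also block, which is why the timeout matters here too
        return json.loads(response.read().decode("utf-8"))
    finally:
        response.close()
```

In the loop, `dictionary = fetch_json(url)` would replace the `urlopen` and `json.loads` lines. A stalled connection should then raise an exception after the timeout and land in the existing "has FAILED" branch instead of hanging.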

How can I see what is happening when a urllib2 script hangs?
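
On the second question, one option that avoids scattering print statements is to have the process dump its own call stack on request. The sketch below uses `sys._current_frames()` and the `traceback` module, both available in Python 2 and 3, and hooks them to `SIGUSR1` on platforms that have that signal. The function names `format_all_stacks` and `install_stack_dump` are made up for this example.

```python
import signal
import sys
import threading
import traceback


def format_all_stacks():
    """Return a string with the current call stack of every thread."""
    names = dict((t.ident, t.name) for t in threading.enumerate())
    chunks = []
    for thread_id, frame in sys._current_frames().items():
        chunks.append("Thread %s (%s):\n"
                      % (thread_id, names.get(thread_id, "?")))
        chunks.extend(traceback.format_stack(frame))
    return "".join(chunks)


def install_stack_dump(signum=None):
    """Write all stacks to stderr whenever the process receives `signum`.

    Defaults to SIGUSR1 and returns False where that signal does not
    exist (for example on Windows). Must be called from the main thread.
    """
    if signum is None:
        signum = getattr(signal, "SIGUSR1", None)
    if signum is None:
        return False

    def handler(sig, frame):
        sys.stderr.write(format_all_stacks())

    signal.signal(signum, handler)
    return True
```

Calling `install_stack_dump()` once at the top of the script and then running `kill -USR1 <pid>` from another terminal while it is hung should print the line it is blocked on. On Python 2 the signal may interrupt the blocked socket call with an error, which would itself unstick the loop. On Python 3 the standard library's `faulthandler.register(signal.SIGUSR1)` does much the same job.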

0 answers: