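I am using Python to collect data from API URLs that return JSON. At the end of my code there is a for loop that iterates over a long list containing all the IDs I need. Inside the loop, I collect the corresponding keys and values for each ID. Here is the sample code (simplified for brevity).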
import requests

def first_part(id):
    url = "www.example1.com"
    try:
        # 10-second timeout on the request.
        data = requests.get(url, timeout=10).json()
        # ...skipped this part; getting the keys and values that I want to get...
        return ["value", keys]
    except Exception:
        print("Something went wrong in the first part")

def second_part(id):
    url_2 = "www.example2.com"
    try:
        # 10-second timeout on the request.
        data = requests.get(url_2, timeout=10).json()
        # ...skipped this part; getting the keys and values that I want to get...
        return ["value", keys_2]
    except Exception:
        print("Something went wrong in the second part")

def main():
    ids = [id1, id2, id3, id4, id5, id6, ...]
    # Build the CSV header from the first ID.
    header = ""
    header += second_part(ids[0])[0] + ","
    header += first_part(ids[0])[0]
    with open("test.csv", "w") as file_w:
        file_w.write(header + "\n")
        ctr = 1
        for i in ids:
            try:
                # Print the ID and sequence number so that I know Python is actually running.
                print(i, ctr)
                body = ""
                body += second_part(i)[1] + ","
                body += first_part(i)[1]
                file_w.write(body + "\n")
                ctr += 1
            except Exception:
                # Skip this ID if either part failed.
                pass

main()
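My problem is that after a certain number of iterations, the Python terminal just hangs: the process is still running, but it is no longer collecting data and never moves on to the next iteration.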
Output printed example looks like this:
id1 1
id2 2
id3 3
...
id564 564
id565 565 -> should move on to 566th but python just stops here and doesn't move, doesn't make any error message, doesn't end.
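When the output stops like this with no error, it may help to find out which line the script is stuck on. Below is a minimal sketch, assuming the standard-library faulthandler module is acceptable here. It does not rely on SIGALRM, so it should also work on Windows. The idea is to arm a watchdog at the top of each iteration; if the iteration runs longer than the limit, Python dumps the traceback of every thread. The helper names arm_watchdog and disarm_watchdog and the 60-second default are hypothetical and not part of the original script.

```python
import faulthandler
import sys


def arm_watchdog(seconds=60.0, file=sys.stderr):
    """Dump the traceback of every thread to `file` if the watchdog is not
    re-armed or disarmed within `seconds`.

    Calling this again replaces the previous timer, so calling it at the
    top of each loop iteration gives every iteration a fresh limit.
    `file` must stay open until the watchdog fires or is disarmed.
    """
    faulthandler.dump_traceback_later(seconds, repeat=False, file=file, exit=False)


def disarm_watchdog():
    """Cancel the pending traceback dump, if any."""
    faulthandler.cancel_dump_traceback_later()


# Hypothetical use inside the loop from the question:
#
#     for i in ids:
#         arm_watchdog(60)   # this iteration has 60 seconds before a dump
#         ...                # second_part(i), first_part(i), write the row
#     disarm_watchdog()
```

The dump only reports where the script is waiting; it does not interrupt the stuck call. If the traceback ends inside requests or the socket module, that would point at the network call rather than at memory.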
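One interesting thing is that I had been collecting data without any problem for the past month, and this only started two days ago. I have tried everything I can think of to fix it. I am not sure, but I don't think it is a memory issue, because the machine has 32 GB of RAM. I searched online and tried interruptingcow, but it did not work. This is the error message I got: AttributeError: module 'signal' has no attribute 'SIGALRM'. Judging from other Q&A posts, is that because I am on Windows? Can anyone help me? It is frustrating because I really have to babysit this code: whenever it hangs, I kill the terminal and restart the iteration from where it stopped.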
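One possible explanation, offered as a guess rather than a diagnosis: the timeout argument of requests.get is not a limit on the whole call. It bounds the wait for the connection and for each chunk of data from the server, so a response that keeps trickling in can run far longer than 10 seconds. Since signal.SIGALRM is not available on Windows, one signal-free way to get a wall-clock limit is to run the call in a daemon thread and stop waiting after a deadline. The sketch below assumes that approach; call_with_deadline, DeadlineExceeded, and the 30-second default are hypothetical names and values, not something from requests or from the original script.

```python
import threading


class DeadlineExceeded(Exception):
    """Raised when the wrapped call does not finish before the deadline."""


def call_with_deadline(func, *args, deadline=30.0):
    """Run func(*args) in a daemon thread and wait at most `deadline` seconds.

    Returns the function's result, re-raises its exception, or raises
    DeadlineExceeded if it is still running when the deadline passes.
    The stuck thread is abandoned, not killed; because it is a daemon
    thread, it does not keep the process alive at exit.
    """
    outcome = {}

    def worker():
        try:
            outcome["value"] = func(*args)
        except Exception as exc:
            outcome["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(deadline)

    if thread.is_alive():
        raise DeadlineExceeded(f"call did not finish within {deadline} seconds")
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

In the loop, this could be used as call_with_deadline(second_part, i, deadline=30) in place of second_part(i), catching DeadlineExceeded to print the ID and move on to the next one. An abandoned thread may hold its socket open until the process exits, so this is a workaround for unattended runs rather than a fix for whatever the server started doing two days ago.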