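I'm using the following code to batch-download JSON files from a text list of URLs. The links aren't standardized: they can be https or http, and they may or may not end in '.json'.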
import os
import urllib2

def save_json(url):
    filename = url.replace('/','').replace(':','').replace('.','|').replace('|json','.json').replace('|JSON','.json').replace('|','').replace('?','').replace('=','').replace('&','')
    path = "U:/location/json"
    fullpath = os.path.join(path, filename)
    response = urllib2.urlopen(url)
    webContent = response.read()
    f = open(fullpath, 'w')
    f.write(webContent)
    f.close()

f = open('U:/location/index_dl.txt')
p = f.read()
url_list = p.split('\n')  # '\n' is the line-break delimiter; change it if the list uses a different one
for url in url_list:
    save_json(url)
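I keep running into this error:
Errno 10054 An existing connection was forcibly closed by the remote host.
Question: Does anyone know another way to batch-download a list of JSON links from the web, or a way to handle this error?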
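For reference, the kind of handling I have in mind is catching the connection error and retrying after a pause. Below is a minimal sketch of that idea, not a confirmed fix. The name fetch_with_retry and its retries, delay, opener and sleep parameters are made up for illustration and are not part of urllib2; the try/except import is only there so the sketch also runs on Python 3.

```python
import socket
import time

try:
    from urllib2 import urlopen, URLError  # Python 2, as in the code above
except ImportError:
    from urllib.request import urlopen  # Python 3 equivalent
    from urllib.error import URLError


def fetch_with_retry(url, retries=3, delay=2.0, opener=urlopen, sleep=time.sleep):
    """Return the body of url, retrying when the connection fails.

    Hypothetical helper: retries, delay, opener and sleep are assumptions
    made for this sketch, not options of urllib2 itself.
    """
    for attempt in range(1, retries + 1):
        try:
            response = opener(url)
            try:
                return response.read()
            finally:
                response.close()
        except (URLError, socket.error):
            # Errno 10054 arrives as socket.error, sometimes wrapped in URLError.
            # HTTP errors such as 404 are URLError subclasses and get retried too.
            if attempt == retries:
                raise  # out of attempts, let the caller see the error
            sleep(delay * attempt)  # wait a little longer after each failure
```

Inside save_json, the two lines that call urllib2.urlopen(url) and response.read() would then become webContent = fetch_with_retry(url). Whether retrying is enough depends on why the server drops the connection, which I don't know; if it is rate limiting, a longer delay between requests may be needed.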
Thanks in advance! SJB