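I have a script that (ideally) iterates over multiple pages X of JSON data for each entity Y (in this case, multiple loans X for each team Y). Given how the API is built, I believe I have to change the subdirectory in the URL itself in order to iterate over multiple entities. Here are the relevant documentation and URL: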
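GET /teams/:id/loans
Returns loans belonging to a particular team. Example: http://api.kivaws.org/v1/teams/2/loans.json
Parameters
id (number): Required. The ID of the team whose loans are returned.
page (number): The page position of results to return. Default: 1
sort_by (string): The order by which to sort results. One of: oldest, newest. Default: newest
app_id (string): The application ID in reverse DNS notation.
ids_only (string): Return IDs only, to make the returned object smaller. One of: true, false. Default: false
Response
loan_listing - HTML, JSON, XML, RSS
Status
Production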
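As a rough sketch of how the documented parameters map onto a request URL, the helper below builds the address for one team and one page. The function name `team_loans_url` and its defaults are illustrative assumptions, not part of the Kiva API; only the path and the parameter names come from the documentation quoted above.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.kivaws.org/v1/teams/"  # base URL taken from the example above


def team_loans_url(team_id, page=1, sort_by="newest", ids_only=False):
    """Build the loans URL for one team (hypothetical helper, for illustration only)."""
    query = urlencode({
        "page": page,
        "sort_by": sort_by,
        # the docs describe ids_only as a string: "true" or "false"
        "ids_only": "true" if ids_only else "false",
    })
    return f"{BASE_URL}{team_id}/loans.json?{query}"


print(team_loans_url(2))
print(team_loans_url(2, page=3, sort_by="oldest"))
```

The team ID goes into the path while everything else goes into the query string, which is why looping over teams means changing the URL itself and looping over pages only means changing `page`.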
Here is my script. It runs and appears to pull the correct data, but it does not seem to write any data to the outfile:
# -*- coding: utf-8 -*-
import urllib.request as urllib
import json
import time

# dict of team loans: the key is the team id, the value is the list of loans
team_loans = {}
url = "http://api.kivaws.org/v1/teams/"
# team ids range from 1 to 11885
for i in range(1, 100):
    try:
        handle = urllib.urlopen(url + str(i) + "/loans.json")
        print(handle)
    except Exception:
        print("Could not handle url")
        continue
    # reading the response (bytes) and decoding it to str
    item_html = handle.read().decode('utf-8')
    # parsing the JSON text
    data = json.loads(item_html)
    # getting number of pages to crawl
    numPages = data['paging']['pages']
    # deleting paging data
    data.pop('paging')
    # calling additional pages
    if numPages > 1:
        for pa in range(2, numPages + 1):
            handle = urllib.urlopen(url + str(i) + "/loans.json?page=" + str(pa))
            print("Pulling loan data from team " + str(i) + "...")
            # reading and decoding the response
            item_html = handle.read().decode('utf-8')
            # parsing the JSON text
            datatemp = json.loads(item_html)
            # paging blocks are redundant headers
            datatemp.pop('paging')
            # adding data to initial list
            data['loans'].extend(datatemp['loans'])
            time.sleep(2)
    # recording loans by team in dict
    team_loans[i] = data['loans']
    if data['loans']:
        print("===Data added to the team_loan dictionary===")
    else:
        print("!!!FAILURE to add data to team_loan dictionary!!!")
    print("===Finished pulling from page " + str(i) + "===")
    # recording data to file each time 10 teams have been read
    if i % 10 == 0:
        outfile = open("team_loan.json", "w")
        print("===Now writing data to outfile===")
        json.dump(team_loans, outfile, sort_keys=True, indent=2, ensure_ascii=True)
        outfile.close()
    else:
        print("!!!FAILURE to write data to outfile!!!")
    # compliance with the API's request limits
    time.sleep(2)
print('Done! Check your outfile (team_loan.json)')
I know this is a lot of code, but it is a fairly sequential process.
Again, the program pulls the correct data but does not write it to the outfile. Can anyone see why?
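Answer 0 (score: 0)
For anyone else who may read this post: the script does write data to the outfile. This was just a logic error in my test code. Ignore the print statements I added.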
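To illustrate the point: in the script as posted, the "FAILURE to write" message appears to sit in the else branch of the `i % 10` check, so it would print for every team id that is not a multiple of 10 even though nothing has gone wrong. The sketch below is not the original script. It replaces the API calls with placeholder data and uses hypothetical helper names (`save_checkpoint`, `run`) to show one way of reporting a failure only when the write itself raises an error.

```python
import json
import os
import tempfile


def save_checkpoint(team_loans, path):
    """Write the whole dict to path; return True on success, False on an I/O error."""
    try:
        with open(path, "w") as outfile:
            json.dump(team_loans, outfile, sort_keys=True, indent=2)
    except OSError as err:
        print(f"Could not write {path}: {err}")
        return False
    return True


def run(team_ids, path, every=10):
    """Simulate the loop and return the ids at which the file was written."""
    team_loans = {}
    written_at = []
    for i in team_ids:
        team_loans[i] = [{"id": i}]  # placeholder standing in for the API response
        if i % every != 0:
            continue  # not a checkpoint: nothing to write, and nothing has failed
        if save_checkpoint(team_loans, path):
            written_at.append(i)
    return written_at


# a temporary directory is used here so the sketch does not touch real files
path = os.path.join(tempfile.mkdtemp(), "team_loan.json")
checkpoints = run(range(1, 100), path)
print(checkpoints)
```

One side effect worth noting: with `range(1, 100)` the last checkpoint is team 90, so teams 91 to 99 only reach the file if one more write is done after the loop ends.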