urllib.request: data is not written to the outfile

Date: 2016-10-05 23:34:25

Tags: python json rest python-3.x urllib

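I have a script that (ideally) iterates over multiple pages X of JSON data for each entity Y (in this case, multiple pages of loans X for each team Y). Given the way the API is built, I believe I have to physically change the subdirectory in the URL in order to iterate over multiple entities. Here is the relevant documentation and URL: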

  

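  GET /teams/:id/loans

  Returns loans belonging to a particular team.
  Example: http://api.kivaws.org/v1/teams/2/loans.json

  Parameters
    id (number): Required. The team ID for which to return loans.
    page (number): The page position of results to return. Default: 1
    sort_by (string): The order by which to sort results. One of: oldest, newest. Default: newest
    app_id (string): The application ID in reverse DNS notation.
    ids_only (string): Return IDs only, to make the returned object smaller. One of: true, false. Default: false

  Response
    loan_listing - HTML, JSON, XML, RSS

  Status
    Production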
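To make the mapping from the documented parameters to a concrete URL explicit, here is a minimal sketch. The helper name `team_loans_url` and the `BASE_URL` constant are illustrative assumptions, not part of the Kiva API or of the script in this question; only the path layout and the parameter names `page`, `sort_by` and `ids_only` come from the documentation quoted above.

```python
from urllib.parse import urlencode

# Base path taken from the documentation example above.
BASE_URL = "http://api.kivaws.org/v1/teams/"


def team_loans_url(team_id, page=1, sort_by=None, ids_only=None):
    """Build the loans.json URL for one team (hypothetical helper).

    The team ID goes into the path; everything else is a query parameter.
    Parameters left at their documented defaults are omitted.
    """
    params = {}
    if page != 1:
        params["page"] = page
    if sort_by is not None:
        params["sort_by"] = sort_by
    if ids_only is not None:
        params["ids_only"] = "true" if ids_only else "false"

    url = BASE_URL + str(team_id) + "/loans.json"
    query = urlencode(params)
    return url + "?" + query if query else url
```

Calling `team_loans_url(2)` reproduces the documentation example, and `team_loans_url(2, page=3)` appends `?page=3`, so only the team ID changes the path while paging stays in the query string.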

Here is my script. It runs and appears to pull the correct data, but it does not seem to write any data to the outfile:

# -*- coding: utf-8 -*-

import urllib.request as urllib
import json
import time

# dict of team loans: the key is the team id, the value is that team's list of loans
team_loans = {}

url = "http://api.kivaws.org/v1/teams/"

#teams_id range 1 - 11885
for i in range(1, 100):

  params = dict(
    id = i
  )

  #i =1
  try:
    handle = urllib.urlopen(str(url+str(i)+"/loans.json"))
    print(handle)
  except:
    print("Could not handle url")
    continue
  # reading response
  item_html =  handle.read().decode('utf-8')
  # converting bytes to str
  data = str(item_html)
  # converting to json
  data = json.loads(data)
  # getting number of pages to crawl
  numPages = data['paging']['pages']
  # deleting paging data
  data.pop('paging')

  # calling additional pages
  if numPages >1:
    for pa in range(2,numPages+1,1):
        #pa = 2
        handle = urllib.urlopen(str(url+str(i)+"/loans.json?page="+str(pa)))
        print("Pulling loan data from team " + str(i) + "...")
        # reading response
        item_html =  handle.read().decode('utf-8')
        # converting bytes to str
        datatemp = str(item_html)
        # converting to json
        datatemp = json.loads(datatemp)
        #Pagings are redundant headers
        datatemp.pop('paging')
        # adding data to initial list
        for loan in datatemp['loans']:
            data['loans'].append(loan)
        time.sleep(2)

  # recording loans by team in dict
  team_loans[i] = data['loans']
  if (data['loans']):
    print("===Data added to the team_loan dictionary===")
  else:
    print("!!!FAILURE to add data to team_loan dictionary!!!")

  # recording data to file after every 10 teams
  print("===Finished pulling from page " + str(i) + "===")
  if (int(i) % 10 == 0):
    outfile = open("team_loan.json", "w")
    print("===Now writing data to outfile===")
    json.dump(team_loans, outfile, sort_keys = True, indent = 2, ensure_ascii=True)
    outfile.close()
  else:
    print("!!!FAILURE to write data to outfile!!!")

  # compliance with API # of requests
  time.sleep(2)

print ('Done! Check your outfile (team_loan.json)')

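I know this may be a lot of code, but it is a fairly sequential process.

Again, the program is pulling the correct data, but it does not appear to be writing this data to the outfile. Can anyone see why?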

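1 answer:

Answer 0 (score: 0)

For anyone else who may read this post: the script does in fact write data to the outfile. The problem was only a logic error in my test code. Ignore the print statements I added.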
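The logic error is visible in the question's code: the `else` branch is attached to the `i % 10 == 0` check, so the "FAILURE to write data" message prints for every team ID that is not a multiple of 10, even though nothing has failed. The sketch below is one possible way to separate "is it time to write?" from "did the write succeed?". The function names `should_checkpoint` and `save_checkpoint` are assumptions made for illustration and do not appear in the original script.

```python
import json


def should_checkpoint(i, every=10):
    """Return True when team number i falls on a checkpoint boundary."""
    return i % every == 0


def save_checkpoint(team_loans, path="team_loan.json"):
    """Write the whole dict to path (hypothetical helper).

    Returns True on success and False if the file could not be written,
    so a failure message is printed only when a write was attempted
    and actually failed.
    """
    try:
        # The with-block closes the file even if json.dump raises.
        with open(path, "w") as outfile:
            json.dump(team_loans, outfile, sort_keys=True, indent=2,
                      ensure_ascii=True)
    except OSError as exc:
        print("!!!FAILURE to write data to outfile: " + str(exc) + "!!!")
        return False
    print("===Wrote " + str(len(team_loans)) + " teams to " + path + "===")
    return True
```

Inside the loop this would be used as `if should_checkpoint(i): save_checkpoint(team_loans)`, with no `else` branch. Because the question's loop runs over `range(1, 100)`, the last periodic write happens at team 90, so one more `save_checkpoint(team_loans)` call after the loop would be needed to capture teams 91 to 99.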