Question

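I have a script here that (ideally) iterates over multiple pages X of JSON data for each entity Y (in this case, multiple loans X for each team Y). Given the way the API is built, I believe I have to physically change the subdirectory in the URL in order to iterate over multiple entities. Here are the relevant documentation and URL: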

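GET /teams/:id/loans

Returns loans belonging to a particular team.
Example: http://api.kivaws.org/v1/teams/2/loans.json

Parameters
  id (number) Required. The team ID for which to return loans.
  page (number) The page position of results to return. Default: 1
  sort_by (string) The order by which to sort results. One of: oldest, newest. Default: newest
  app_id (string) The application ID in reverse DNS notation.
  ids_only (string) Return IDs only to make the return object smaller. One of: true, false. Default: false

Response
  loan_listing - HTML, JSON, XML, RSS

Status: Production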
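As an illustration of the documentation above, the team ID goes into the URL path while the optional parameters go into the query string. This is a minimal sketch, not part of the original script: `build_loans_url` is a hypothetical helper name, and only the base URL and the parameter names come from the documentation quoted here.

```python
from urllib.parse import urlencode

# Base URL taken from the documentation example above.
BASE_URL = "http://api.kivaws.org/v1/teams/"


def build_loans_url(team_id, **params):
    """Hypothetical helper: build the loans URL for one team.

    Keyword arguments (e.g. page, sort_by) become query parameters.
    """
    url = f"{BASE_URL}{team_id}/loans.json"
    # Skip parameters left as None so the API defaults apply.
    query = {key: value for key, value in params.items() if value is not None}
    return f"{url}?{urlencode(query)}" if query else url


print(build_loans_url(2))
print(build_loans_url(2, page=3, sort_by="newest"))
```

Letting `urlencode` assemble the query string avoids hand-concatenating `?page=` fragments, which gets error-prone once more than one parameter is involved.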

Here is my script. It runs and seems to pull the correct data, but it does not seem to write any data to the outfile:

# -*- coding: utf-8 -*-

import urllib.request as urllib
import json
import time

# storing team loans dict. The key is the team id, the value is the list of loans
team_loans = {}

url = "http://api.kivaws.org/v1/teams/"

#teams_id range 1 - 11885
for i in range(1, 100):

  params = dict(
    id = i
  )

  #i =1
  try:
    handle = urllib.urlopen(str(url+str(i)+"/loans.json"))
    print(handle)
  except:
    print("Could not handle url")
    continue
  # reading response
  item_html =  handle.read().decode('utf-8')
  # converting bytes to str
  data = str(item_html)
  # converting to json
  data = json.loads(data)
  # getting number of pages to crawl
  numPages = data['paging']['pages']
  # deleting paging data
  data.pop('paging')

  # calling additional pages
  if numPages >1:
    for pa in range(2,numPages+1,1):
        #pa = 2
        handle = urllib.urlopen(str(url+str(i)+"/loans.json?page="+str(pa)))
        print("Pulling loan data from team " + str(i) + "...")
        # reading response
        item_html =  handle.read().decode('utf-8')
        # converting bytes to str
        datatemp = str(item_html)
        # converting to json
        datatemp = json.loads(datatemp)
        #Pagings are redundant headers
        datatemp.pop('paging')
        # adding data to initial list
        for loan in datatemp['loans']:
            data['loans'].append(loan)
        time.sleep(2)

  # recording loans by team in dict
  team_loans[i] = data['loans']
  if (data['loans']):
    print("===Data added to the team_loan dictionary===")
  else:
    print("!!!FAILURE to add data to team_loan dictionary!!!")

  # recording data to file after every 10 teams are read
  print("===Finished pulling from page " + str(i) + "===")
  if (int(i) % 10 == 0):
    outfile = open("team_loan.json", "w")
    print("===Now writing data to outfile===")
    json.dump(team_loans, outfile, sort_keys = True, indent = 2, ensure_ascii=True)
    outfile.close()
  else:
    print("!!!FAILURE to write data to outfile!!!")

  # pause to comply with the API's limit on the number of requests
  time.sleep(2)

print ('Done! Check your outfile (team_loan.json)')

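I know this may be a lot of code, but it is a fairly sequential process.

Again, this program is pulling the correct data but is not writing it to the outfile. Can anyone see why?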

Answer 1

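For anyone else who may read this post: the script does in fact write data to the outfile. I was simply testing the code's logic incorrectly. Ignore the print statements I added.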
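To make that concrete: in the script, the file is only written when `i % 10 == 0`, and the `else` branch prints a "FAILURE" message on every other iteration even though nothing has failed. With `range(1, 100)`, the last write also happens at team 90, so teams 91 to 99 never reach the file. The following is a minimal sketch of one way to structure the write step, assuming hypothetical helper names `save_checkpoint` and `record_team` that are not in the original script:

```python
import json


def save_checkpoint(team_loans, path="team_loan.json"):
    """Write the whole dictionary to disk; the with-block closes the file."""
    with open(path, "w") as outfile:
        json.dump(team_loans, outfile, sort_keys=True, indent=2, ensure_ascii=True)


def record_team(team_loans, team_id, loans, every=10, path="team_loan.json"):
    """Store one team's loans and checkpoint after every `every`-th team.

    Returns True if the file was written on this call, False otherwise.
    A False result only means "no checkpoint due yet", not a failure.
    """
    team_loans[team_id] = loans
    if team_id % every == 0:
        save_checkpoint(team_loans, path)
        return True
    return False
```

Calling `save_checkpoint(team_loans)` once more after the loop ends would also capture any teams collected since the last checkpoint.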

urllib.request: data not written to outfile
