How do I create a valid JSON file from an API using pagination in Python?

Time: 2018-02-12 05:46:03

Tags: python json api

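I'm trying to pull all the people from the Star Wars API into a single valid JSON file.

The API limits the size of the result set, so the full list of "people" spans 9 separate calls (i.e. 'https://swapi.co/api/people/?page=1', 'https://swapi.co/api/people/?page=2', etc.).

When I loop over multiple requests, I end up with invalid JSON, because there is no comma separator between the documents and no enclosing start and end brackets.

Please help me figure out what I'm doing wrong in my calls.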

import requests
import json

for x in range(1,9):  
    response = requests.get("https://swapi.co/api/people/?page="+str(x))
    data = response.json()

    next_page = data["next"] 
    results = data["results"]

    for result in results:  
        with open('data.json', 'a') as outfile:
            json.dump(result, outfile)
print('Done!')

JSON file output:

{"name": "Luke Skywalker", "height": "172", "mass": "77", "hair_color": "blond", "skin_color": "fair", "eye_color": "blue", "birth_year": "19BBY", "gender": "male", "homeworld": "https://swapi.co/api/planets/1/", "films": ["https://swapi.co/api/films/2/", "https://swapi.co/api/films/6/", "https://swapi.co/api/films/3/", "https://swapi.co/api/films/1/", "https://swapi.co/api/films/7/"], "species": ["https://swapi.co/api/species/1/"], "vehicles": ["https://swapi.co/api/vehicles/14/", "https://swapi.co/api/vehicles/30/"], "starships": ["https://swapi.co/api/starships/12/", "https://swapi.co/api/starships/22/"], "created": "2014-12-09T13:50:51.644000Z", "edited": "2014-12-20T21:17:56.891000Z", "url": "https://swapi.co/api/people/1/"}{"name": "C-3PO", "height": "167", "mass": "75", "hair_color": "n/a", "skin_color": "gold", "eye_color": "yellow", "birth_year": "112BBY", "gender": "n/a", "homeworld": "https://swapi.co/api/planets/1/", "films": ["https://swapi.co/api/films/2/", "https://swapi.co/api/films/5/", "https://swapi.co/api/films/4/", "https://swapi.co/api/films/6/", "https://swapi.co/api/films/3/", "https://swapi.co/api/films/1/"], "species": ["https://swapi.co/api/species/2/"], "vehicles": [], "starships": [], "created": "2014-12-10T15:10:51.357000Z", "edited": "2014-12-20T21:17:50.309000Z", "url": "https://swapi.co/api/people/2/"}
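A minimal sketch of why this output is rejected: a JSON document must contain a single top-level value, so two objects written back to back cannot be parsed as one document. The two small dictionaries below are stand-ins for the API records, not real responses.

```python
import json

# Two stand-in records (hypothetical, much smaller than the real API results).
records = [{"name": "Luke Skywalker"}, {"name": "C-3PO"}]

# Dumping each record separately reproduces the file above:
# objects written back to back, with no commas and no brackets.
concatenated = "".join(json.dumps(record) for record in records)

try:
    json.loads(concatenated)
    parsed_ok = True
except json.JSONDecodeError as err:
    # The parser stops after the first object and reports the leftover text.
    parsed_ok = False
    print("Invalid JSON:", err.msg)
```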

3 answers:

Answer 0 (score: 2)

Keep the results in memory and then write them out with a single json.dump call, which solves the problem:

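import requests
import json

results = []
# range(1, 10) covers pages 1 through 9; range(1, 9) would stop at page 8
for x in range(1, 10):
    response = requests.get("https://swapi.co/api/people/?page="+str(x))
    data = response.json()

    # collect this page's records in memory instead of writing them right away
    results.extend(data["results"])

# write everything once, as a single JSON array
with open('data.json', 'w') as outfile:
    json.dump(results, outfile)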
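Each response also carries a "next" field (read as data["next"] in the question's code). As a sketch, assuming that field holds the URL of the following page and is null on the last page (an assumption about the response format, not something shown in the question), the page count would not need to be hardcoded. The names fetch_json, collect_all and save_all are hypothetical helpers introduced for illustration.

```python
import json


def fetch_json(url):
    """Fetch one page and decode its JSON body (hypothetical helper)."""
    # Imported lazily so collect_all can be tried with a stub and no network.
    import requests
    response = requests.get(url)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()


def collect_all(first_url, fetch=fetch_json):
    """Follow "next" links until there are none, gathering every result."""
    results = []
    url = first_url
    while url:  # assumes "next" is None (JSON null) on the last page
        data = fetch(url)
        results.extend(data["results"])
        url = data.get("next")
    return results


def save_all(first_url, path, fetch=fetch_json):
    """Write every collected result to `path` as one JSON array."""
    with open(path, "w") as outfile:
        json.dump(collect_all(first_url, fetch), outfile)
```

Calling save_all("https://swapi.co/api/people/?page=1", "data.json") would then perform the real requests. Passing a different fetch function makes it possible to try the loop against canned pages first.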

Answer 1 (score: 0)

Instead of serializing as you go, append the data to a list and serialize it once when you reach the end.

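import requests
import json

all_results = []

# range(1, 10) covers pages 1 through 9; range(1, 9) would stop at page 8
for x in range(1, 10):
    response = requests.get("https://swapi.co/api/people/?page="+str(x))
    data = response.json()

    results = data["results"]
    # accumulate this page's records; nothing is written inside the loop
    all_results.extend(results)

# serialize the complete list exactly once
with open('data.json', 'w') as outfile:
    json.dump(all_results, outfile)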
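One way to confirm that this approach produces a valid file is to load it back with json.load. The sketch below uses made-up pages in place of the API responses and writes to a temporary directory, so it runs without network access.

```python
import json
import os
import tempfile

# Stand-in for the pages returned by the API (hypothetical data).
pages = [
    {"next": "page 2", "results": [{"name": "Luke Skywalker"}, {"name": "C-3PO"}]},
    {"next": None, "results": [{"name": "R2-D2"}]},
]

all_results = []
for data in pages:
    all_results.extend(data["results"])  # accumulate, do not write yet

# Serialize exactly once, after the loop has finished.
path = os.path.join(tempfile.mkdtemp(), "data.json")
with open(path, "w") as outfile:
    json.dump(all_results, outfile)

# Reading the file back succeeds because it holds a single JSON array.
with open(path) as infile:
    loaded = json.load(infile)

print(len(loaded), "records loaded")
```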

Answer 2 (score: 0)

I hope this solves your missing comma problem:

import requests
import json

for x in range(1,9):  
    response = requests.get("https://swapi.co/api/people/?page="+str(x))
    data = response.json()

    next_page = data["next"] 
    results = data["results"]

    res = ''
    for result in results:  
        temp = str(result) + ',' 
        res = res + temp 
    with open('data.json', 'a') as outfile:
      outfile.write(res)

print('Done!')

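I just converted the 'result' variable to a string and accumulated those strings for each page. Once all the dictionaries of a single page have been processed, the accumulated string is appended to the file "data.json".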
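One caveat worth checking with this approach: str() on a Python dictionary produces Python's own representation, with single-quoted strings, which a JSON parser rejects. A trailing comma with no enclosing brackets is not valid JSON either. Below is a hedged sketch of a string-based variant that does parse, using json.dumps for each record and made-up records in place of the API results.

```python
import json

# Hypothetical records standing in for one page of API results.
results = [{"name": "Luke Skywalker"}, {"name": "C-3PO"}]

# str() gives Python syntax rather than JSON: {'name': 'Luke Skywalker'}
python_repr = str(results[0])
try:
    json.loads(python_repr)
    repr_is_json = True
except json.JSONDecodeError:
    repr_is_json = False

# json.dumps gives JSON text for each record; join the pieces with commas
# (so there is no trailing comma) and add the enclosing brackets by hand.
body = ",".join(json.dumps(result) for result in results)
document = "[" + body + "]"

# The assembled text parses back into the original list.
assert json.loads(document) == results
```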