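I'm trying to pull all people from the Star Wars API into one valid JSON file.
The API limits the size of each result set, so the full list of "people" spans 9 separate calls (i.e. 'https://swapi.co/api/people/?page=1', 'https://swapi.co/api/people/?page=2', etc.).
When I loop over the multiple requests, I end up with invalid JSON, because there is no comma delimiter between the documents and no opening and closing brackets around them.
Please help me figure out what I'm doing wrong in my calls.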
import requests
import json

for x in range(1,9):
    response = requests.get("https://swapi.co/api/people/?page="+str(x))
    data = response.json()

    next_page = data["next"]
    results = data["results"]

    for result in results:
        with open('data.json', 'a') as outfile:
            json.dump(result, outfile)

print('Done!')
JSON file output:
{"name": "Luke Skywalker", "height": "172", "mass": "77", "hair_color": "blond", "skin_color": "fair", "eye_color": "blue", "birth_year": "19BBY", "gender": "male", "homeworld": "https://swapi.co/api/planets/1/", "films": ["https://swapi.co/api/films/2/", "https://swapi.co/api/films/6/", "https://swapi.co/api/films/3/", "https://swapi.co/api/films/1/", "https://swapi.co/api/films/7/"], "species": ["https://swapi.co/api/species/1/"], "vehicles": ["https://swapi.co/api/vehicles/14/", "https://swapi.co/api/vehicles/30/"], "starships": ["https://swapi.co/api/starships/12/", "https://swapi.co/api/starships/22/"], "created": "2014-12-09T13:50:51.644000Z", "edited": "2014-12-20T21:17:56.891000Z", "url": "https://swapi.co/api/people/1/"}{"name": "C-3PO", "height": "167", "mass": "75", "hair_color": "n/a", "skin_color": "gold", "eye_color": "yellow", "birth_year": "112BBY", "gender": "n/a", "homeworld": "https://swapi.co/api/planets/1/", "films": ["https://swapi.co/api/films/2/", "https://swapi.co/api/films/5/", "https://swapi.co/api/films/4/", "https://swapi.co/api/films/6/", "https://swapi.co/api/films/3/", "https://swapi.co/api/films/1/"], "species": ["https://swapi.co/api/species/2/"], "vehicles": [], "starships": [], "created": "2014-12-10T15:10:51.357000Z", "edited": "2014-12-20T21:17:50.309000Z", "url": "https://swapi.co/api/people/2/"}
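To see why a parser rejects output like this, here is a minimal sketch. The two shortened records are illustrative stand-ins, not the full API output; the point is only that objects written back to back are not one JSON document, while the same objects inside a list are.

```python
import json

# Two objects written back to back, as repeated json.dump calls in append mode produce
concatenated = json.dumps({"name": "Luke Skywalker"}) + json.dumps({"name": "C-3PO"})

try:
    json.loads(concatenated)
    parsed_ok = True
except json.JSONDecodeError as err:
    # The parser stops after the first object and reports the leftover text
    parsed_ok = False
    print("Invalid JSON:", err.msg)

# The same records wrapped in a list serialize to a single valid document
as_array = json.dumps([{"name": "Luke Skywalker"}, {"name": "C-3PO"}])
print(json.loads(as_array))
```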
Answer 0 (score: 2)
Keep the results in memory and then write them out with a single json.dump call. That solves the problem:
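import requests
import json

results = []
# The question states that the people span 9 pages, so iterate pages 1 through 9
for x in range(1, 10):
    response = requests.get("https://swapi.co/api/people/?page=" + str(x))
    data = response.json()
    next_page = data["next"]
    # Accumulate every page's records in one list
    results.extend(data["results"])

# Serialize once, so the file holds a single valid JSON array
with open('data.json', 'w') as outfile:
    json.dump(results, outfile)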
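The loop above reads data["next"] but never uses it. As a hedged alternative to hard-coding the page count, the sketch below follows that link until it is empty. It assumes "next" is None on the last page; `fetch_json`, `collect_results`, and the `fake_pages` data are hypothetical names made up for illustration, and the demonstration runs offline against the fake pages rather than the real API.

```python
import json

def fetch_json(url):
    """Default fetcher: GET the URL and decode the JSON body."""
    import requests  # imported here so the paging logic can be tried without network access
    response = requests.get(url)
    response.raise_for_status()
    return response.json()

def collect_results(first_url, fetch=fetch_json):
    """Follow each page's "next" link until it is empty, gathering all records."""
    collected = []
    url = first_url
    while url:
        page = fetch(url)
        collected.extend(page["results"])
        url = page["next"]  # assumed to be None on the last page
    return collected

# Offline demonstration with two made-up pages standing in for the API
fake_pages = {
    "page1": {"next": "page2", "results": [{"name": "Luke Skywalker"}]},
    "page2": {"next": None, "results": [{"name": "C-3PO"}]},
}
people = collect_results("page1", fetch=fake_pages.get)
print(json.dumps(people))
```

Against the real API the call would presumably be collect_results("https://swapi.co/api/people/?page=1"), followed by the same single json.dump to data.json.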
Answer 1 (score: 0)
Don't serialize as you go. Instead, append the data to a list and serialize it once, at the end.
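import requests
import json

all_results = []
# The question states that the people span 9 pages, so iterate pages 1 through 9
for x in range(1, 10):
    response = requests.get("https://swapi.co/api/people/?page=" + str(x))
    data = response.json()
    next_page = data["next"]
    results = data["results"]
    all_results.extend(results)

# One dump of the whole list produces one valid JSON array
with open('data.json', 'w') as outfile:
    json.dump(all_results, outfile)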
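One quick way to confirm that a file written like this is valid is to load it back. The sketch below is a round trip under stated assumptions: the two-record list is a hypothetical stand-in for the accumulated API data, and a temporary directory is used so no existing data.json is touched.

```python
import json
import os
import tempfile

# Hypothetical stand-in for the list accumulated across pages
all_results = [{"name": "Luke Skywalker"}, {"name": "C-3PO"}]

path = os.path.join(tempfile.mkdtemp(), "data.json")
with open(path, "w") as outfile:
    json.dump(all_results, outfile)

# json.load raises json.JSONDecodeError if the file is not one valid document
with open(path) as infile:
    loaded = json.load(infile)

print(type(loaded).__name__, len(loaded))
```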
Answer 2 (score: 0)
I hope this solves your missing-comma problem:
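import requests
import json

# The question states that the people span 9 pages, so iterate pages 1 through 9
for x in range(1, 10):
    response = requests.get("https://swapi.co/api/people/?page=" + str(x))
    data = response.json()
    next_page = data["next"]
    results = data["results"]
    res = ''
    for result in results:
        # json.dumps emits double-quoted JSON; str(result) would emit a Python repr
        temp = json.dumps(result) + ','
        res = res + temp
    # Append this page's comma-separated records to the file
    with open('data.json', 'a') as outfile:
        outfile.write(res)
print('Done!')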
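I just converted the 'result' variable to a string and accumulated the strings for each page. Once all the dictionaries of a single page have been processed, the accumulated string is appended to the file "data.json".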
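Records joined with commas still need enclosing brackets, and no trailing comma, before the file is a single JSON array. The sketch below shows one way to keep writing page by page while producing valid output. `write_json_array` is a hypothetical helper, and the in-memory pages stand in for the per-page "results" lists from the API.

```python
import io
import json

def write_json_array(pages, outfile):
    """Write every record from an iterable of pages as one JSON array."""
    outfile.write("[")
    first = True
    for page in pages:
        for record in page:
            # Put a separator before every record except the first
            if not first:
                outfile.write(", ")
            json.dump(record, outfile)
            first = False
    outfile.write("]")

# Illustrative pages; in the real script each one would be data["results"]
pages = [[{"name": "Luke Skywalker"}], [{"name": "C-3PO"}]]
buffer = io.StringIO()
write_json_array(pages, buffer)
print(buffer.getvalue())
```

Passing a file opened with open('data.json', 'w') instead of the StringIO buffer would write the same text to disk.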