I want to parse this JSON response:
{
  "status": "ok",
  "results_time": "0.6756 sec.",
  "results_count": 1,
  "results": [
    {
      "date": "2017-01-01",
      "site_url": "asana.com",
      "site_title": "Use Asana to track your team’s work & manage projects · Asana",
      "site_description": "It’s free to use, simple to get started, and powerful enough to run your entire business. Sign up for free today.",
      "audience": {
        "visits": 19952871,
        "time_on_site_avg": "00:09:25",
        "page_views_avg": 6.9773123942789,
        "bounce_rate": 35.85
      },
      "traffic": {
        "value": 19952871,
        "percent": 100,
        "countries": [
          {
            "country": "United States",
            "value": 6864349,
            "percent": 34.4
          },
          {
            "country": "United Kingdom",
            "value": 1133338,
            "percent": 5.68
          },
          {
            "country": "Brazil",
            "value": 705693,
            "percent": 3.54
          },
          {
            "country": "Canada",
            "value": 703566,
            "percent": 3.53
          },
          {
            "country": "Poland",
            "value": 700182,
            "percent": 3.51
          },
          {
            "country": "Other",
            "value": 984474655,
            "percent": 49.34
          }
        ],
......... }
I want to export a CSV with these fields:
audience.visits
audience.time_on_site_avg
audience.page_views_avg
audience.bounce_rate
traffic.countries.country
traffic.countries.value
traffic.countries.percent
I have this code, but it doesn't work:
import json
import pandas as pd
from pandas.io.json import json_normalize

with open('dict.competitor') as f:
    d = json.load(f)

traffic1 = json_normalize(data=d['results'], record_path=['traffic', 'countries'])
print(traffic1)
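I feel like I'm almost there. I've tried several combinations and suggestions from other SO posts to get the remaining data, but so far nothing has worked. I know the problem comes from the nesting; I just need a way to get the desired result. Thanks for your help!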
Answer 0 (score: 0)
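There may be a smarter pandas way to do this (which would be nice to see), but here is a loop that should produce the desired result.
Based on my understanding of your question, it builds a DataFrame with one row per country, each row also carrying the audience fields, which can then be exported to CSV: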
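import pandas as pd

# the_json is the dict returned by json.load()
data = []
for r in the_json["results"]:
    # One output row per country entry
    for d in r["traffic"]["countries"]:
        row = {}
        for key in d.keys():
            row["traffic.{}".format(key)] = d[key]
        # Repeat the site-level audience fields on every country row
        for key in r["audience"]:
            row["audience.{}".format(key)] = r["audience"][key]
        data.append(row)

df = pd.DataFrame(data)
df.to_csv("filename.csv")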
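As a possible variation on the loop above, here is a minimal sketch that lets pandas build the per-country rows and names the columns exactly as listed in the question. The `sample` dict is a cut-down copy of the response from the question, and `flatten_results`, `AUDIENCE_FIELDS` and `COLUMNS` are hypothetical names introduced only for this illustration. It assumes every result has both an "audience" and a "traffic.countries" entry.

```python
import pandas as pd

# Cut-down copy of the response from the question (two countries only).
sample = {
    "results": [
        {
            "audience": {
                "visits": 19952871,
                "time_on_site_avg": "00:09:25",
                "page_views_avg": 6.9773123942789,
                "bounce_rate": 35.85,
            },
            "traffic": {
                "value": 19952871,
                "percent": 100,
                "countries": [
                    {"country": "United States", "value": 6864349, "percent": 34.4},
                    {"country": "United Kingdom", "value": 1133338, "percent": 5.68},
                ],
            },
        }
    ]
}

# Hypothetical helper names, not part of pandas.
AUDIENCE_FIELDS = ["visits", "time_on_site_avg", "page_views_avg", "bounce_rate"]
COLUMNS = ["audience." + k for k in AUDIENCE_FIELDS] + [
    "traffic.countries.country",
    "traffic.countries.value",
    "traffic.countries.percent",
]


def flatten_results(payload):
    """Return one row per country, with the audience fields repeated."""
    frames = []
    for result in payload["results"]:
        # The list of country dicts becomes a small DataFrame.
        countries = pd.DataFrame(result["traffic"]["countries"])
        countries = countries.add_prefix("traffic.countries.")
        # A scalar assignment is broadcast to every country row.
        for key in AUDIENCE_FIELDS:
            countries["audience." + key] = result["audience"][key]
        frames.append(countries)
    # Stack all results and put the columns in the requested order.
    return pd.concat(frames, ignore_index=True)[COLUMNS]


df = flatten_results(sample)
print(df)
```

Calling `df.to_csv("filename.csv", index=False)` afterwards would write the rows without the extra index column; whether you want that column is up to you.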