The output I pull from the API looks like this (formatted as best I could):
{
    "other":{
        Not important.. (ignored later)
    },
    "resultList":[
        {
            "date": "2017-10-26T21:52:59.840Z",
            "uniqueId": "c0a9c665-0f6f-c8",
            "children":[
                {
                    "identifier": "FAMR@316069707@3160697070",
                    "score": 1,
                    "parentId": "c0a9c665-0f6f-4fc8"
                },
                {
                    Same format as first child...
                },
                {
                    Same format as first child...
                }
            ],
            "weights":[
                60,
                20,
                20
            ],
            "type": "ABC"
        },
        {
            Same format as first dictionary…
        }
    ]
}
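After searching Stack Overflow, I handled it by extracting the JSON, normalizing only resultList (the only part I care about), orienting the result by columns, and converting it to a pandas DataFrame.
Here is the code: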
import getpass

import pandas as pd
import requests

# Get JSON from API
user = str(input("Enter User Name: "))
password = getpass.getpass("Enter Password: ")
url = 'https://API_url'
req = requests.post(url=url, auth=(user, password))
out = req.json()

# Create normalized dataframe from API
# (pd.json_normalize needs pandas >= 1.0; older versions use
# `from pandas.io.json import json_normalize` instead)
solr_df = pd.DataFrame.from_dict(pd.json_normalize(out["resultList"]), orient='columns')
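However, while this flattens resultList into columns, the children column is still nested as a list of dictionaries (with a u prefix on the strings, which I don't want), and the weights column is still a list.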
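To make the symptom concrete, here is a minimal sketch that reproduces it without calling the API. It assumes pandas 1.0 or later, and the one-record `sample` list is a hypothetical stand-in shaped like the output above. As a side note, the u prefix is only how Python 2 prints unicode strings; it is not part of the data.

```python
import pandas as pd

# Hypothetical stand-in for out["resultList"], shaped like the API output.
sample = [
    {
        "date": "2017-10-26T21:52:59.840Z",
        "uniqueId": "c0a9c665-0f6f-c8",
        "children": [
            {"identifier": "child-a", "score": 1, "parentId": "p-1"},
            {"identifier": "child-b", "score": 1, "parentId": "p-1"},
        ],
        "weights": [60, 40],
        "type": "ABC",
    }
]

# Without record_path, json_normalize only flattens nested dicts.
# List-valued fields are left untouched, one list per cell.
df = pd.json_normalize(sample)

print(df.columns.tolist())
print(type(df.loc[0, "children"]))  # still a Python list of dicts
print(type(df.loc[0, "weights"]))   # still a Python list of numbers
```

So the DataFrame has one row per record, and the nested lists sit inside single cells rather than being spread over rows or columns.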
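Can you help me restructure this so that it returns a result in which children and weights are flattened into columns?
Thanks in advance!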
Answer 0 (score: 0)
I can't think of a more efficient way to do this, though I'm sure one exists.
Loop through the JSON object and flatten the data manually.
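r = out  # the parsed JSON response from the question
frames = []
for record in r['resultList']:
    conc = []          # one DataFrame per list-valued field
    otherFields = {}   # scalar fields, copied onto every row below
    for field in record:
        if isinstance(record[field], list):
            if len(record[field]) > 0:
                if isinstance(record[field][0], dict):
                    # list of dicts (children): one column per key
                    conc.append(pd.DataFrame(record[field]))
                else:
                    # list of scalars (weights): a single column
                    conc.append(pd.DataFrame(record[field], columns=[field]))
        else:
            otherFields[field] = record[field]
    # place the list-valued fields side by side, aligned by position
    df = pd.concat(conc, axis=1)
    for field in otherFields:
        df[field] = otherFields[field]
    frames.append(df)
# DataFrame.append was removed in pandas 2.0, so concatenate once at the end
dfAll = pd.concat(frames)
dfAll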
weights identifier parentId score \
0 60 FAMR@316069707@3160697070 c0a9c665-0f6f-4fc8 1
1 20 FAMR@316069707@3160697070 c0a9c665-0f6f-4fc8 1
2 20 FAMR@316069707@3160697070 c0a9c665-0f6f-4fc8 1
0 10 FAMR@316069707@3160697070 c0a9c665-0f6f-4fc8 1
1 20 FAMR@316069707@3160697070 c0a9c665-0f6f-4fc8 1
2 30 FAMR@316069707@3160697070 c0a9c665-0f6f-4fc8 1
date type uniqueId
0 2017-10-26T21:52:59.840Z ABC c0a9c665-0f6f-c8
1 2017-10-26T21:52:59.840Z ABC c0a9c665-0f6f-c8
2 2017-10-26T21:52:59.840Z ABC c0a9c665-0f6f-c8
0 2015-10-26T21:52:59.840Z ABC 123
1 2015-10-26T21:52:59.840Z ABC 123
2 2015-10-26T21:52:59.840Z ABC 123
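As a possible alternative to the manual loop, json_normalize itself accepts record_path and meta arguments that expand a nested list into rows while carrying the parent fields along. The following is only a sketch under stated assumptions: pandas 1.0 or later, every record has date, uniqueId and type, and weights holds exactly one entry per child, in the same order. The sample data and the helper name flatten_result_list are hypothetical and not part of the original code.

```python
import pandas as pd

# Hypothetical sample shaped like the API output in the question.
out = {
    "resultList": [
        {
            "date": "2017-10-26T21:52:59.840Z",
            "uniqueId": "c0a9c665-0f6f-c8",
            "children": [
                {"identifier": "child-a", "score": 1, "parentId": "p-1"},
                {"identifier": "child-b", "score": 2, "parentId": "p-1"},
                {"identifier": "child-c", "score": 3, "parentId": "p-1"},
            ],
            "weights": [60, 20, 20],
            "type": "ABC",
        },
        {
            "date": "2015-10-26T21:52:59.840Z",
            "uniqueId": "123",
            "children": [
                {"identifier": "child-d", "score": 4, "parentId": "p-2"},
                {"identifier": "child-e", "score": 5, "parentId": "p-2"},
            ],
            "weights": [10, 90],
            "type": "ABC",
        },
    ]
}


def flatten_result_list(result_list):
    """Return one row per child, with parent fields and weights as columns."""
    # record_path expands each record's children into rows;
    # meta repeats the listed parent fields on every child row.
    flat = pd.json_normalize(
        result_list,
        record_path="children",
        meta=["date", "uniqueId", "type"],
    )
    # Assumption: weights[i] belongs to children[i] within each record.
    # A length mismatch makes this assignment raise a ValueError.
    flat["weights"] = [w for rec in result_list for w in rec["weights"]]
    return flat


flat = flatten_result_list(out["resultList"])
print(flat)
```

Unlike the loop above, this yields a fresh 0..n-1 index instead of restarting at 0 for each record, and it only handles the two list fields named explicitly rather than discovering them.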