I have a complex, nested JSON that I need to convert to a DataFrame (Python). I can get the first part, but I'm struggling with the second part.
import requests
from pandas import DataFrame
import json
url = 'url'
headers = {'api-key':'key'}
resp = requests.get(url, headers = headers)
print(resp.status_code)
r = resp.content
r
responses = json.loads(r.decode('utf-8'))
responses
Output (responses):
{'count': 39,
'requestAt': '2020-06-09T20:10:23.201+00:00',
'data': {'Id1': {'id': 'Id1',
'groupId': '1',
'label': 'Question 1',
'options': {'1_1': {'id': '1_1',
'prefix': 'A',
'label': 'Alternative A',
'isCorrect': True},
'1_2': {'id': '1_2',
'prefix': 'B',
'label': 'Alternative B',
'isCorrect': False},
'1_3': {'id': '1_3',
'prefix': 'C',
'label': 'Alternative C',
'isCorrect': False}}}}}
df = DataFrame(responses['data'])
df.T
Output (DataFrame.T):
+-----+---------+------------+-------------+
| id | groupId | label | options |
+-----+---------+------------+-------------+
| Id1 | 1 | Question 1 | **JSON 2** |
+-----+---------+------------+-------------+
**JSON 2** (all inside the cell above)
{'1_1': {'id': '1_1',
'prefix': 'A',
'label': 'Alternative A',
'isCorrect': True},
'1_2': {'id': '1_2',
'prefix': 'B',
'label': 'Alternative B',
'isCorrect': False},
'1_3': {'id': '1_3',
'prefix': 'C',
'label': 'Alternative C',
'isCorrect': False}}
I need to expand JSON 2 into the DataFrame as well.
Desired output:
+-----+---------+------------+--------+---------------+-----------+
| id | groupId | label | prefix | label | isCorrect |
+-----+---------+------------+--------+---------------+-----------+
| Id1 | 1 | Question 1 | A | Alternative A | True |
| Id1 | 1 | Question 1 | B | Alternative B | False |
| Id1 | 1 | Question 1 | C | Alternative C | False |
+-----+---------+------------+--------+---------------+-----------+
How can I get the desired output? Thanks.
Answer 0 (score: 1)
Here is one way to do it:
import pandas as pd
responses = {
    'count': 39,
    'requestAt': '2020-06-09T20:10:23.201+00:00',
    'data': {
        'Id1': {
            'id': 'Id1',
            'groupId': '1',
            'label': 'Question 1',
            'options': {
                '1_1': {
                    'id': '1_1',
                    'prefix': 'A',
                    'label': 'Alternative A',
                    'isCorrect': True},
                '1_2': {
                    'id': '1_2',
                    'prefix': 'B',
                    'label': 'Alternative B',
                    'isCorrect': False},
                '1_3': {
                    'id': '1_3',
                    'prefix': 'C',
                    'label': 'Alternative C',
                    'isCorrect': False}
            }
        }
    }
}
# Refactor the response into a list of dicts,
# where each dict holds the keys and values
# for a single row of the DataFrame
response_list = []
for question_id in responses['data']:
    question = responses['data'][question_id]
    # get the keys of interest
    data = {k: v for k, v in question.items() if k in ['id', 'groupId', 'label']}
    # rename the 'label' key, since a deeper level of the JSON has another key named 'label'
    # and the DataFrame should not have two columns with the same name
    data['label_'] = data.pop('label')
    # dig deeper inside the current question
    for key in question['options']:
        # get the keys of interest
        inner_data = {k: v for k, v in question['options'][key].items() if k in ['prefix', 'label', 'isCorrect']}
        # combine the two dicts and append the result to the final list
        response_list.append({**data, **inner_data})
print(pd.DataFrame(response_list))
Here is the output:
    id groupId      label_ prefix          label  isCorrect
0  Id1       1  Question 1      A  Alternative A       True
1  Id1       1  Question 1      B  Alternative B      False
2  Id1       1  Question 1      C  Alternative C      False
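If you would rather not write the nested loop by hand, a possible alternative is pd.json_normalize (available in pandas 1.0 and later). This is only a sketch under a few assumptions: the real payload has the same shape as the sample in the question, every question has an 'options' dict, and the 'question_' prefix is a name I picked to keep the parent columns apart from the option columns. Since json_normalize expects a list of records at record_path, each 'options' dict is turned into a list of its values first.

```python
import pandas as pd

# Sample payload with the same shape as the response in the question.
responses = {
    'count': 39,
    'requestAt': '2020-06-09T20:10:23.201+00:00',
    'data': {
        'Id1': {
            'id': 'Id1',
            'groupId': '1',
            'label': 'Question 1',
            'options': {
                '1_1': {'id': '1_1', 'prefix': 'A', 'label': 'Alternative A', 'isCorrect': True},
                '1_2': {'id': '1_2', 'prefix': 'B', 'label': 'Alternative B', 'isCorrect': False},
                '1_3': {'id': '1_3', 'prefix': 'C', 'label': 'Alternative C', 'isCorrect': False},
            },
        },
    },
}

# json_normalize needs a *list* of child records under record_path,
# so replace each 'options' dict with a list of its values.
records = [
    {**question, 'options': list(question['options'].values())}
    for question in responses['data'].values()
]

# One row per option; the parent fields are repeated on every row.
# meta_prefix ('question_' is an arbitrary choice) avoids a name clash,
# because both levels have 'id' and 'label' keys.
df = pd.json_normalize(
    records,
    record_path='options',
    meta=['id', 'groupId', 'label'],
    meta_prefix='question_',
)

# Put the parent columns first and drop the option-level 'id'.
df = df[['question_id', 'question_groupId', 'question_label',
         'prefix', 'label', 'isCorrect']]
print(df)
```

The explicit loop is easier to adapt when some questions lack 'options' or when you need custom filtering; json_normalize is shorter when the structure is regular.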