I'm trying to load data from a JSON API into a pandas DataFrame, but pandas is not reading the data correctly. Here are my code and the output:
import pandas as pd
import requests
r = requests.get('https://api.covid19india.org/raw_data5.json')
j = r.json()
df = pd.DataFrame.from_dict(j)
However, the output is not what I expect:
raw_data
0 {'agebracket': '', 'contractedfromwhichpatient...
1 {'agebracket': '', 'contractedfromwhichpatient...
2 {'agebracket': '', 'contractedfromwhichpatient...
3 {'agebracket': '', 'contractedfromwhichpatient...
4 {'agebracket': '', 'contractedfromwhichpatient...
When I run df.info(), I get:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20409 entries, 0 to 20408
Data columns (total 1 columns):
raw_data 20409 non-null object
dtypes: object(1)
memory usage: 159.5+ KB
Can anyone help me with this?
Answer 0 (score: 0)
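Use j = r.json()['raw_data'] to select the raw_data key from the JSON, then build the DataFrame from j as before. The response is a dict with a single top-level key, raw_data, whose value is the list of records, so passing the whole dict produces one column of dicts.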
df.info()
Output:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20409 entries, 0 to 20408
Data columns (total 20 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 agebracket 20409 non-null object
1 contractedfromwhichpatientsuspected 20409 non-null object
2 currentstatus 20409 non-null object
3 dateannounced 20409 non-null object
4 detectedcity 20409 non-null object
5 detecteddistrict 20409 non-null object
6 detectedstate 20409 non-null object
7 entryid 20409 non-null object
8 gender 20409 non-null object
9 nationality 20409 non-null object
10 notes 20409 non-null object
11 numcases 20409 non-null object
12 patientnumber 20409 non-null object
13 source1 20409 non-null object
14 source2 20409 non-null object
15 source3 20409 non-null object
16 statecode 20409 non-null object
17 statepatientnumber 20409 non-null object
18 statuschangedate 20409 non-null object
19 typeoftransmission 20409 non-null object
dtypes: object(20)
memory usage: 3.1+ MB
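To see the difference without calling the API, here is a minimal offline sketch. The `sample` dict is a made-up stand-in for `r.json()`: only the `raw_data` key and the field names come from the question, and the values are invented.

```python
import pandas as pd

# Hypothetical stand-in for r.json(); the values are invented for illustration.
sample = {
    "raw_data": [
        {"agebracket": "", "detectedstate": "Kerala", "gender": "F"},
        {"agebracket": "35", "detectedstate": "Delhi", "gender": "M"},
    ]
}

# Passing the whole payload gives a single column named after the top-level key,
# with one dict per row (the behaviour described in the question).
wrapped = pd.DataFrame.from_dict(sample)

# Passing the list of records gives one column per field.
records = pd.DataFrame(sample["raw_data"])

print(wrapped.shape)
print(list(records.columns))
```

With the real response, the same idea should apply as `pd.DataFrame(r.json()['raw_data'])`.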
Answer 1 (score: 0)
Try expanding the existing raw_data column into separate columns:
df = df['raw_data'].apply(pd.Series)
df.info()
Output:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20409 entries, 0 to 20408
Data columns (total 20 columns):
agebracket 20409 non-null object
contractedfromwhichpatientsuspected 20409 non-null object
currentstatus 20409 non-null object
dateannounced 20409 non-null object
detectedcity 20409 non-null object
detecteddistrict 20409 non-null object
detectedstate 20409 non-null object
entryid 20409 non-null object
gender 20409 non-null object
nationality 20409 non-null object
notes 20409 non-null object
numcases 20409 non-null object
patientnumber 20409 non-null object
source1 20409 non-null object
source2 20409 non-null object
source3 20409 non-null object
statecode 20409 non-null object
statepatientnumber 20409 non-null object
statuschangedate 20409 non-null object
typeoftransmission 20409 non-null object
dtypes: object(20)
memory usage: 3.1+ MB
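A small sketch of this approach on invented data, next to an alternative that is not part of the answer above: rebuilding the frame from a list of the dicts. The alternative is shown only as an assumption-laden comparison. `apply(pd.Series)` constructs one Series per row, so it tends to be slow on large frames, though the actual difference depends on the data and the pandas version.

```python
import pandas as pd

# Hypothetical frame shaped like the one in the question: one column of dicts.
df = pd.DataFrame({"raw_data": [
    {"agebracket": "", "gender": "F"},
    {"agebracket": "35", "gender": "M"},
]})

# Approach from the answer: turn each dict into a Series, one row at a time.
expanded = df["raw_data"].apply(pd.Series)

# Alternative (not from the answer): build the frame from the list of dicts.
from_list = pd.DataFrame(df["raw_data"].tolist(), index=df.index)

print(list(expanded.columns))
print(list(from_list.columns))
```

Both calls yield the same columns and values for this sample.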