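I have been trying to load a nested JSON file into a pandas DataFrame, but I seem to be missing something.
How can I extract just the time series into a pandas DataFrame? I have been struggling to pull out all the numbered fields, and when it does work, I end up with some of the metadata in the DataFrame as well.
Please help!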
{
"Meta Data": {
"1. Information": "Intraday (60min) prices and volumes",
"2. Symbol": "BHP.AX",
"3. Last Refreshed": "2018-02-09 00:00:00",
"4. Interval": "60min",
"5. Output Size": "Compact",
"6. Time Zone": "US/Eastern"
},
"Time Series (60min)": {
"2018-02-09 00:00:00": {
"1. open": "29.1100",
"2. high": "29.1950",
"3. low": "29.1000",
"4. close": "29.1300",
"5. volume": "788213"
},
"2018-02-08 23:00:00": {
"1. open": "29.0000",
"2. high": "29.2000",
"3. low": "29.0000",
"4. close": "29.1100",
"5. volume": "768704"
},
"2018-02-08 22:00:00": {
"1. open": "29.1000",
"2. high": "29.1000",
"3. low": "28.9600",
"4. close": "29.0000",
"5. volume": "830235"
},
"2018-02-08 21:00:00": {
"1. open": "29.0850",
"2. high": "29.2250",
"3. low": "29.0750",
"4. close": "29.1050",
"5. volume": "803142"
},
"2018-02-08 20:00:00": {
"1. open": "28.9200",
"2. high": "29.1500",
"3. low": "28.8900",
"4. close": "29.0900",
"5. volume": "1231131"
}
}
}
Any ideas?
Answer 0 (score: 1)
You can use pd.DataFrame.from_dict and specify orient='index', so that each timestamp becomes a row.
If needed, you can also clean up the column headers.
import json
import pandas as pd

data = json.loads(json_data)  # json_data is the JSON string from the question
df = pd.DataFrame.from_dict(data['Time Series (60min)'], orient='index')
df.columns = df.columns.str.split('. ').str[1]  # optional step: '1. open' -> 'open'
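For reference, here is a minimal self-contained sketch of the same approach, using a shortened copy of the JSON from the question (two of the five timestamps). The variable name meta and the regex-based header clean-up are my own illustrative choices, not part of the original answer.

```python
import json
import pandas as pd

# Shortened copy of the JSON from the question (two timestamps only).
json_data = '''
{
  "Meta Data": {"2. Symbol": "BHP.AX", "4. Interval": "60min"},
  "Time Series (60min)": {
    "2018-02-09 00:00:00": {"1. open": "29.1100", "2. high": "29.1950",
                            "3. low": "29.1000", "4. close": "29.1300",
                            "5. volume": "788213"},
    "2018-02-08 23:00:00": {"1. open": "29.0000", "2. high": "29.2000",
                            "3. low": "29.0000", "4. close": "29.1100",
                            "5. volume": "768704"}
  }
}
'''

data = json.loads(json_data)

# Select only the time-series sub-dictionary; orient="index" turns each
# outer key (a timestamp) into a row and each inner key into a column.
df = pd.DataFrame.from_dict(data["Time Series (60min)"], orient="index")

# Optional: strip the leading "1. ", "2. ", ... numbering from the headers.
# An explicit regex is used here instead of str.split (illustrative choice).
df.columns = df.columns.str.replace(r"^\d+\.\s*", "", regex=True)

# The metadata stays out of the DataFrame; keep it separately if needed.
meta = pd.Series(data["Meta Data"])

print(df)
print(meta)
```

Because only data["Time Series (60min)"] is passed to pandas, none of the "Meta Data" entries can end up in the resulting frame.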
Answer 1 (score: 0)
If you only need the time series, tell pandas so!
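import json
import pandas as pd

js = '''Your_JSON_String'''  # placeholder: paste the JSON from the question here
jsdata = json.loads(js)
jsts = jsdata["Time Series (60min)"]  # keep only the time series, dropping "Meta Data"
ts = pd.DataFrame(jsts).T.sort_index()  # transpose so timestamps become rows, oldest first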
# 1. open 2. high 3. low 4. close 5. volume
# 2018-02-08 20:00:00 28.9200 29.1500 28.8900 29.0900 1231131
# 2018-02-08 21:00:00 29.0850 29.2250 29.0750 29.1050 803142
# 2018-02-08 22:00:00 29.1000 29.1000 28.9600 29.0000 830235
# 2018-02-08 23:00:00 29.0000 29.2000 29.0000 29.1100 768704
# 2018-02-09 00:00:00 29.1100 29.1950 29.1000 29.1300 788213
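Note that the values in the output above are still strings, and the index holds plain strings as well. As a possible follow-up step (not part of the original answer), the sketch below converts the index to timestamps and the columns to numbers, using two rows copied from the question.

```python
import pandas as pd

# Two rows copied from the question, as they look after json.loads().
jsts = {
    "2018-02-09 00:00:00": {"1. open": "29.1100", "2. high": "29.1950",
                            "3. low": "29.1000", "4. close": "29.1300",
                            "5. volume": "788213"},
    "2018-02-08 23:00:00": {"1. open": "29.0000", "2. high": "29.2000",
                            "3. low": "29.0000", "4. close": "29.1100",
                            "5. volume": "768704"},
}

ts = pd.DataFrame(jsts).T            # timestamps become rows

ts.index = pd.to_datetime(ts.index)  # string index -> DatetimeIndex
ts = ts.sort_index()                 # oldest row first
ts = ts.apply(pd.to_numeric)         # convert each column from str to a number

print(ts.dtypes)
print(ts)
```

With numeric columns and a DatetimeIndex, the usual time-series operations (resampling, rolling windows, plotting) should work without further conversion.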