Please let me know how to convert a JSON file in the following format into a DataFrame.
JSON file data:
{
"series_id":"STEO",
"f":"A",
"data":[["2018",5.8400705041],["2017",3.5671511014],["2016",2.3014617486],["2015",2.4989178082],["2014",2.2089452055]]
}
I tried the following code:
sourcePath = r'D:\source\STEO.txt'
data = pd.read_json(sourcePath, lines=True)
I need the following output from the JSON above:
series_id f date value
STEO A 2018 5.840070504
STEO A 2017 3.567151101
STEO A 2016 2.301461749
STEO A 2015 2.498917808
STEO A 2014 2.208945206
Answer 0 (score: 1)
One way might be as follows:
import pandas as pd
df = pd.read_json('input.txt')
print(df)
Output:
data f series_id
0 [2018, 5.8400705041] A STEO
1 [2017, 3.5671511014] A STEO
2 [2016, 2.3014617486] A STEO
3 [2015, 2.4989178082] A STEO
4 [2014, 2.2089452055] A STEO
Then split the list column into separate columns and drop the original:
# splitting into multiple columns for list
# https://stackoverflow.com/a/35491399/5916727
df[['Date','Value']] = pd.DataFrame([item for item in df.data])
# removing initial data column now
del df['data']
print(df)
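To get exactly the layout requested in the question (series_id, f, date, value), the steps above can be combined into one self-contained sketch. It assumes the sample from the question is the whole file; io.StringIO stands in for the file on disk so the snippet runs without it. read_json is called without lines=True because the sample is a single JSON document spread over several lines, not one object per line.

```python
import io

import pandas as pd

# Sample document copied from the question; StringIO stands in for the real file.
raw = """{
"series_id":"STEO",
"f":"A",
"data":[["2018",5.8400705041],["2017",3.5671511014],["2016",2.3014617486],["2015",2.4989178082],["2014",2.2089452055]]
}"""

df = pd.read_json(io.StringIO(raw))

# Each cell of 'data' is a [date, value] pair; expand the pairs into two columns.
df[["date", "value"]] = pd.DataFrame(df["data"].tolist(), index=df.index)

# Drop the list column and put the columns in the order the question asks for.
df = df.drop(columns="data")[["series_id", "f", "date", "value"]]

# The dates arrive as strings in the JSON; convert them if integers are wanted.
df["date"] = df["date"].astype(int)

print(df)
```

Selecting the columns explicitly at the end avoids depending on the column order read_json happens to return, which differs between pandas versions.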
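Answer 1 (score: 1)
You can use read_json, then pop to remove the data column, and create the new columns with the DataFrame constructor from the popped values converted to a list: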
df = pd.read_json('file.json')
df[['date','value']] = pd.DataFrame(df.pop('data').values.tolist())
#if necessary convert to int
df['date'] = df['date'].astype(int)
print (df)
f series_id date value
0 A STEO 2018 5.840071
1 A STEO 2017 3.567151
2 A STEO 2016 2.301462
3 A STEO 2015 2.498918
4 A STEO 2014 2.208945
Another solution:
You can use json_normalize, then rename the columns and, if necessary, reorder them with reindex (reindex_axis was removed in pandas 1.0):
import json
import pandas as pd
with open('file.json') as data_file:
    d = json.load(data_file)
d_cols = {0:'date', 1:'value'}
names_cols = ['series_id','f','date','value']
# pd.json_normalize replaces pandas.io.json.json_normalize in pandas 1.0+
df = pd.json_normalize(d, 'data', ['f', 'series_id']) \
    .rename(columns=d_cols) \
    .reindex(columns=names_cols)
df['date'] = df['date'].astype(int)
print (df)
series_id f date value
0 STEO A 2018 5.840071
1 STEO A 2017 3.567151
2 STEO A 2016 2.301462
3 STEO A 2015 2.498918
4 STEO A 2014 2.208945
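If the real file turns out to hold one such object per line (an assumption, suggested only by the lines=True in the question's attempt), a rough sketch could read it as JSON Lines and then explode the data column. The second record below is hypothetical and made up purely for illustration, and the ignore_index argument of explode needs pandas 1.1 or later.

```python
import io

import pandas as pd

# Two records, one per line. The first is shortened from the question;
# the second (series_id "HYPOTHETICAL") is invented for illustration only.
jsonl = (
    '{"series_id":"STEO","f":"A","data":[["2018",5.8400705041],["2017",3.5671511014]]}\n'
    '{"series_id":"HYPOTHETICAL","f":"A","data":[["2018",1.0],["2017",2.0]]}\n'
)

# lines=True parses each line as its own JSON object, giving one row per series.
df = pd.read_json(io.StringIO(jsonl), lines=True)

# explode turns each [date, value] pair inside 'data' into its own row.
long_df = df.explode("data", ignore_index=True)

# Split the pairs into two columns, then drop the list column.
long_df[["date", "value"]] = pd.DataFrame(long_df["data"].tolist(), index=long_df.index)
long_df = long_df.drop(columns="data")
long_df["date"] = long_df["date"].astype(int)
long_df["value"] = long_df["value"].astype(float)

print(long_df)
```

For a file that contains only the single multi-line object shown in the question, lines=True should be left out, as in the answers above.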