{'endDate': '2017-12-31',
'results': [{'data': [{'period': '2017-01-01', 'ratio': 26.91301},
{'period': '2017-02-01', 'ratio': 19.77063},
{'period': '2017-03-01', 'ratio': 20.40775},
{'period': '2017-04-01', 'ratio': 16.02843},
{'period': '2017-05-01', 'ratio': 10.38159},
{'period': '2017-06-01', 'ratio': 8.2087},
{'period': '2017-07-01', 'ratio': 8.67815},
{'period': '2017-08-01', 'ratio': 19.58956},
{'period': '2017-09-01', 'ratio': 36.94587},
{'period': '2017-10-01', 'ratio': 36.28194},
{'period': '2017-11-01', 'ratio': 16.64543},
{'period': '2017-12-01', 'ratio': 1.67661}],
'keywords': ['Data_spec'],
'title': 'Data_spec'},
{'data': [{'period': '2017-01-01', 'ratio': 17.65139},
{'period': '2017-02-01', 'ratio': 14.52618},
{'period': '2017-03-01', 'ratio': 15.00234},
{'period': '2017-04-01', 'ratio': 12.1521},
{'period': '2017-05-01', 'ratio': 10.63644},
{'period': '2017-06-01', 'ratio': 8.59767},
{'period': '2017-07-01', 'ratio': 8.95312},
{'period': '2017-08-01', 'ratio': 13.05747},
{'period': '2017-09-01', 'ratio': 48.00482},
{'period': '2017-10-01', 'ratio': 23.7811},
{'period': '2017-11-01', 'ratio': 16.90027},
{'period': '2017-12-01', 'ratio': 0.89866}],
'keywords': ['Data_rate'],
'title': 'Date_rate'},
{'data': [], 'keywords': ['Data_over'], 'title': 'Data_over'},
{'data': [{'period': '2017-01-01', 'ratio': 79.17644},
{'period': '2017-02-01', 'ratio': 84.01851},
{'period': '2017-03-01', 'ratio': 100.0},
{'period': '2017-04-01', 'ratio': 91.19442},
{'period': '2017-05-01', 'ratio': 93.21976},
{'period': '2017-06-01', 'ratio': 93.42096},
{'period': '2017-07-01', 'ratio': 89.14895},
{'period': '2017-08-01', 'ratio': 91.85165},
{'period': '2017-09-01', 'ratio': 91.24136},
{'period': '2017-10-01', 'ratio': 90.35611},
{'period': '2017-11-01', 'ratio': 81.88585},
{'period': '2017-12-01', 'ratio': 7.49111}],
'keywords': ['Data_under'],
'title': 'Data_under'},
{'data': [{'period': '2017-01-01', 'ratio': 0.70417},
{'period': '2017-02-01', 'ratio': 1.11997},
{'period': '2017-03-01', 'ratio': 1.81074},
{'period': '2017-04-01', 'ratio': 1.38823},
{'period': '2017-05-01', 'ratio': 0.97914},
{'period': '2017-06-01', 'ratio': 1.14009},
{'period': '2017-07-01', 'ratio': 0.78465},
{'period': '2017-08-01', 'ratio': 1.07973},
{'period': '2017-09-01', 'ratio': 0.94561},
{'period': '2017-10-01', 'ratio': 0.85172},
{'period': '2017-11-01', 'ratio': 1.27422},
{'period': '2017-12-01', 'ratio': 0.08718}],
'keywords': ['Data_tune'],
'title': 'Data_tune'}],
'startDate': '2017-01-01',
'timeUnit': 'month'}
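The above is the result of 'my_dict = json.loads(js)', where js is a string (the code below refers to this dict as d). I am trying to put this data into a Pandas DataFrame, using the code below.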
lst = [pd.DataFrame.from_dict(r['data']).set_index('period').rename(columns={'ratio' : r['title']})
for r in d['results']]
df = pd.concat(lst, axis=1)
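My code works perfectly until the JSON contains an entry with no values. As you may notice, the entry with 'keywords': ['Data_over'] has an empty 'data' list, so I cannot set the index to 'period'. I still want 'Data_over' in my pandas DataFrame, just empty. Is it possible to build the DataFrame with 'Data_over' as a column name but with empty values? That way my final code could convert the JSON to a DataFrame whether or not 'data' is present.
Below is the output I want. (The actual values are different, but you get the idea.)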
Data_spec Data_rate Data_over Data_under Data_tune
2017-01-01 0.55116 NaN NaN 7.12056 2.25329
2017-02-01 0.32016 0.08915 NaN 6.43161 1.19959
2017-03-01 0.32421 0.10131 NaN 6.48024 1.30091
2017-04-01 0.33232 0.01215 NaN 6.05471 1.26038
2017-05-01 0.39311 0.12968 NaN 6.19655 1.21985
2017-06-01 0.47011 0.03647 NaN 5.71023 1.03748
2017-07-01 4.32016 NaN NaN 11.85005 0.84295
2017-08-01 8.81053 0.04052 NaN 51.44072 0.89564
2017-09-01 14.46808 0.02836 NaN 100.00000 0.85511
2017-10-01 4.27152 0.10942 NaN 34.65451 0.87132
2017-11-01 0.29989 0.05673 NaN 13.02127 0.77811
2017-12-01 0.00810 0.06079 NaN 0.80243 NaN
Answer 0 (score: 1)
You can check the length with len(r['data']) > 0 and filter out the empty DataFrames inside the list comprehension:
lst = [pd.DataFrame(r['data']).set_index('period').rename(columns={'ratio' : r['title']})
for r in d['results'] if len(r['data']) > 0]
df = pd.concat(lst, axis=1)
print(df)
Data_spec Date_rate Data_under Data_tune
period
2017-01-01 26.91301 17.65139 79.17644 0.70417
2017-02-01 19.77063 14.52618 84.01851 1.11997
2017-03-01 20.40775 15.00234 100.00000 1.81074
2017-04-01 16.02843 12.15210 91.19442 1.38823
2017-05-01 10.38159 10.63644 93.21976 0.97914
2017-06-01 8.20870 8.59767 93.42096 1.14009
2017-07-01 8.67815 8.95312 89.14895 0.78465
2017-08-01 19.58956 13.05747 91.85165 1.07973
2017-09-01 36.94587 48.00482 91.24136 0.94561
2017-10-01 36.28194 23.78110 90.35611 0.85172
2017-11-01 16.64543 16.90027 81.88585 1.27422
2017-12-01 1.67661 0.89866 7.49111 0.08718
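If the filtered-out entry should still show up as a column, one possible follow-up is to reindex the columns of the filtered result against every title. The sketch below is only an illustration: the dict d is a trimmed-down stand-in for the parsed JSON (values copied from the question), and all_titles is a name introduced here, not part of the original code.

```python
import pandas as pd

# Trimmed-down stand-in for the parsed JSON (`d`); values copied from the question.
d = {
    'results': [
        {'title': 'Data_spec', 'data': [{'period': '2017-01-01', 'ratio': 26.91301},
                                        {'period': '2017-02-01', 'ratio': 19.77063}]},
        {'title': 'Data_over', 'data': []},
        {'title': 'Data_under', 'data': [{'period': '2017-01-01', 'ratio': 79.17644},
                                         {'period': '2017-02-01', 'ratio': 84.01851}]},
    ]
}

# Same filtering as above: entries with empty 'data' are skipped.
lst = [pd.DataFrame(r['data']).set_index('period').rename(columns={'ratio': r['title']})
       for r in d['results'] if len(r['data']) > 0]
df = pd.concat(lst, axis=1)

# Reindex the columns against every title, so skipped entries
# come back as all-NaN columns in their original position.
all_titles = [r['title'] for r in d['results']]  # hypothetical helper name
df = df.reindex(columns=all_titles)
print(df)
```

This keeps the filtering logic unchanged and restores the missing columns in a single extra step, which may be easier to read than a conditional inside the comprehension.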
EDIT:
If r['data'] is empty, you can create a custom DataFrame instead, indexed by d['startDate'] so that it aligns with the other frames:
import numpy as np

lst = [pd.DataFrame(r['data']).set_index('period').rename(columns={'ratio' : r['title']})
       if len(r['data']) > 0
       # empty 'data': one-row NaN placeholder indexed by the start date, so it aligns on concat
       else pd.DataFrame([np.nan], columns=[r['title']], index=[d['startDate']])
       for r in d['results']]
df = pd.concat(lst, axis=1)
print(df)
Data_spec Date_rate Data_over Data_under Data_tune
2017-01-01 26.91301 17.65139 NaN 79.17644 0.70417
2017-02-01 19.77063 14.52618 NaN 84.01851 1.11997
2017-03-01 20.40775 15.00234 NaN 100.00000 1.81074
2017-04-01 16.02843 12.15210 NaN 91.19442 1.38823
2017-05-01 10.38159 10.63644 NaN 93.21976 0.97914
2017-06-01 8.20870 8.59767 NaN 93.42096 1.14009
2017-07-01 8.67815 8.95312 NaN 89.14895 0.78465
2017-08-01 19.58956 13.05747 NaN 91.85165 1.07973
2017-09-01 36.94587 48.00482 NaN 91.24136 0.94561
2017-10-01 36.28194 23.78110 NaN 90.35611 0.85172
2017-11-01 16.64543 16.90027 NaN 81.88585 1.27422
2017-12-01 1.67661 0.89866 NaN 7.49111 0.08718
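As an alternative sketch (not the approach used in the answer above), each entry can be turned into a Series keyed by period and the DataFrame built from a dict of those Series. An empty 'data' list then yields an empty Series, which pandas fills with NaN when aligning the columns. The function name results_to_frame and the small sample list are assumptions made for this illustration.

```python
import pandas as pd

def results_to_frame(results):
    """Build one column per result; entries with empty 'data' become all-NaN columns."""
    columns = {
        r['title']: pd.Series({row['period']: row['ratio'] for row in r['data']},
                              dtype='float64')
        for r in results
    }
    # A dict of Series is aligned on the union of their indexes.
    return pd.DataFrame(columns)

# Small sample in the same shape as d['results']; values copied from the question.
sample = [
    {'title': 'Data_spec', 'data': [{'period': '2017-01-01', 'ratio': 26.91301},
                                    {'period': '2017-02-01', 'ratio': 19.77063}]},
    {'title': 'Data_over', 'data': []},
    {'title': 'Data_tune', 'data': [{'period': '2017-01-01', 'ratio': 0.70417},
                                    {'period': '2017-02-01', 'ratio': 1.11997}]},
]

out = results_to_frame(sample)
print(out)
```

With the full data this would be called as results_to_frame(d['results']). It avoids set_index entirely, so there is no special case for empty entries.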