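I can't find a way to save a pandas DataFrame with MultiIndex columns and read it back so that the MultiIndex column structure is preserved. Here is a reproducible example: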
toy_data.to_json()
'{"["GOOG","Shares"]":{"1521849600000":null,"1521936000000":null,"1522368000000":null,"1522454400000":694548763.0,"1522540800000":null},"["GOOG","ROE"]":{"1521849600000":null,"1521936000000":null,"1522368000000":null,"1522454400000":0.1076,"1522540800000":null},"["FB","Shares"]":{"1521849600000":null,"1521936000000":null,"1522368000000":null,"1522454400000":2398606201.0,"1522540800000":null},"["FB","ROE"]":{"1521849600000":null,"1521936000000":null,"1522368000000":null,"1522454400000":0.2465,"1522540800000":null}}'
toy_data.to_csv('toy_data.csv')
toy_data1 = pd.read_csv('toy_data.csv')
Any suggestions would be greatly appreciated.
Answer 0 (score: 2)
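read_csv
Passing the header and index_col parameters to read_csv does what you need: header=[0, 1] tells pandas that the first two rows of the file hold the two column levels, and index_col=[0] makes the first column the index.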
toy_data.to_csv('sample.csv')
pd.read_csv('sample.csv', header=[0, 1], index_col=[0])
Company        GOOG             FB
Indicators   Shares  ROE  Shares  ROE
Quarter_end
2018-03-24      NaN  NaN     NaN  NaN
2018-03-25      NaN  NaN     NaN  NaN
2018-03-30      NaN  NaN     NaN  NaN
2018-03-31      1.0  2.0     3.0  4.0
2018-04-01      NaN  NaN     NaN  NaN
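To check the round trip rather than eyeball the printed frame, something like the following sketch may help. It rebuilds a shortened version of the toy frame, writes it to an in-memory buffer (an assumption made here to avoid touching the disk), and compares what comes back. The parse_dates=True argument is an addition not used above; without it the index comes back as plain strings rather than a DatetimeIndex.

```python
import io

import numpy as np
import pandas as pd

# Shortened copy of the toy frame: two column levels, a named date index.
cols = pd.MultiIndex.from_product(
    [['GOOG', 'FB'], ['Shares', 'ROE']],
    names=['Company', 'Indicators'],
)
idx = pd.to_datetime(
    ['2018-03-30', '2018-03-31', '2018-04-01']
).rename('Quarter_end')
toy_data = pd.DataFrame(
    [[np.nan] * 4, [1, 2, 3, 4], [np.nan] * 4],
    index=idx, columns=cols,
)

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
toy_data.to_csv(buffer)
buffer.seek(0)

# header=[0, 1] rebuilds both column levels; index_col=0 restores the index;
# parse_dates=True asks pandas to parse the index values as dates.
restored = pd.read_csv(buffer, header=[0, 1], index_col=0, parse_dates=True)

columns_match = restored.columns.equals(toy_data.columns)
values_match = np.allclose(
    restored.to_numpy(), toy_data.to_numpy(), equal_nan=True
)
print(columns_match, values_match, type(restored.index).__name__)
```

If either comparison comes out False, the usual suspects are a missing header level or the index column being read as data.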
read_hdf
Saving to HDF may be a better option, since the format stores the index and column structure along with the data.
toy_data.to_hdf('sample.h5', key='toy_key')
pd.read_hdf('sample.h5', key='toy_key')
Company        GOOG             FB
Indicators   Shares  ROE  Shares  ROE
Quarter_end
2018-03-24      NaN  NaN     NaN  NaN
2018-03-25      NaN  NaN     NaN  NaN
2018-03-30      NaN  NaN     NaN  NaN
2018-03-31      1.0  2.0     3.0  4.0
2018-04-01      NaN  NaN     NaN  NaN
Setup
import numpy as np
import pandas as pd

# Two column levels: company on top, indicator underneath.
cols = pd.MultiIndex.from_product(
    [['GOOG', 'FB'], ['Shares', 'ROE']],
    names=['Company', 'Indicators']
)
# Named DatetimeIndex for the rows.
idx = pd.to_datetime(
    ['2018-03-24', '2018-03-25', '2018-03-30',
     '2018-03-31', '2018-04-01']
).rename('Quarter_end')
toy_data = pd.DataFrame([
    [np.nan, np.nan, np.nan, np.nan],
    [np.nan, np.nan, np.nan, np.nan],
    [np.nan, np.nan, np.nan, np.nan],
    [1, 2, 3, 4],
    [np.nan, np.nan, np.nan, np.nan],
], index=idx, columns=cols)
Answer 1 (score: 0)
You haven't provided usable sample data, but I'm fairly sure all you need to do is pass header=[0, 1] and index_col=0 as arguments to read_csv.
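As a rough illustration of why those two arguments matter, here is a minimal sketch comparing the default read_csv call with the suggested one. The frame and the names df, naive, and fixed are made up for this example, and the CSV is kept in memory rather than written to disk.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical two-row frame with the same column layout as in the question.
cols = pd.MultiIndex.from_product(
    [['GOOG', 'FB'], ['Shares', 'ROE']],
    names=['Company', 'Indicators'],
)
idx = pd.Index(['2018-03-31', '2018-04-01'], name='Quarter_end')
df = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0], [np.nan] * 4],
    index=idx, columns=cols,
)

# to_csv() with no path returns the CSV text as a string.
csv_text = df.to_csv()

# Default call: only the first row is treated as the header,
# so the second column level ends up in the data rows.
naive = pd.read_csv(io.StringIO(csv_text))

# Suggested call: two header rows, first column as the index.
fixed = pd.read_csv(io.StringIO(csv_text), header=[0, 1], index_col=0)

print(type(naive.columns).__name__, naive.shape)
print(type(fixed.columns).__name__, fixed.shape)
```

With the default call the columns come back as a flat index and the extra header rows leak into the data, which matches the problem described in the question.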