I want to extend the DataFrames of a Panel along the minor axis in pandas. I start by creating a dict of DataFrames to generate a Panel.
import pandas as pd
import numpy as np
rng = pd.date_range('1/1/2013',periods=100,freq='D')
df1 = pd.DataFrame(np.random.randn(100, 4), index = rng, columns = ['A','B','C','D'])
df2 = pd.DataFrame(np.random.randn(100, 4), index = rng, columns = ['A','B','C','D'])
df3 = pd.DataFrame(np.random.randn(100, 4), index = rng, columns = ['A','B','C','D'])
pf = pd.Panel({'df1':df1,'df2':df2,'df3':df3})
As expected, my panel has the following dimensions:
Dimensions: 3 (items) x 100 (major_axis) x 4 (minor_axis)
Items axis: df1 to df3
Major_axis axis: 2013-01-01 00:00:00 to 2013-04-10 00:00:00
Minor_axis axis: A to D
I would now like to add a new data set to the minor axis:
pf['df1']['E'] = pd.DataFrame(np.random.randn(100, 1), index = rng)
pf['df2']['E'] = pd.DataFrame(np.random.randn(100, 1), index = rng)
pf['df3']['E'] = pd.DataFrame(np.random.randn(100, 1), index = rng)
After adding this new minor-axis entry, I find that the shape of the panel has not changed:
shape(pf)
[3,100,4]
I can access the data for each item along the major_axis:
pf.ix['df1',-10:,'E']
2013-04-01    0.168205
2013-04-02    0.677929
2013-04-03    0.845444
2013-04-04    0.431610
2013-04-05    0.501003
2013-04-06   -0.403605
2013-04-07   -0.185033
2013-04-08    0.270093
2013-04-09    1.569180
2013-04-10   -1.374779
Freq: D, Name: E
But if I extend the slice to cover more than one item:
pf.ix[:,:,'E']
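Then I hit an error saying that 'E' is unknown.
Can anyone suggest where I am going wrong, or a better way to do this?
Answer 0: (score: 5)
This doesn't work right now (see https://github.com/pydata/pandas/issues/2578), but you can accomplish what you want this way. It is a fairly cheap operation, since nothing is copied.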
In [18]: x = pf.transpose(2,0,1)
In [19]: x
Out[19]:
<class 'pandas.core.panel.Panel'>
Dimensions: 4 (items) x 3 (major_axis) x 100 (minor_axis)
Items axis: A to D
Major_axis axis: df1 to df3
Minor_axis axis: 2013-01-01 00:00:00 to 2013-04-10 00:00:00
In [20]: x['E'] = new_df
In [21]: x.transpose(1,2,0)
Out[21]:
<class 'pandas.core.panel.Panel'>
Dimensions: 3 (items) x 100 (major_axis) x 5 (minor_axis)
Items axis: df1 to df3
Major_axis axis: 2013-01-01 00:00:00 to 2013-04-10 00:00:00
Minor_axis axis: A to E
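The "no copy" point can be checked on a plain NumPy array. The following is only a sketch under the assumption that the panel's values behave like an ordinary 3-D ndarray; the names values, new_slice and x_ext are illustrative and do not come from the answer. It shows that the transpose itself is a view, while appending the new slice is the step that allocates.

```python
import numpy as np

# Stand-in for the panel's values: 3 items x 100 major x 4 minor (assumed layout).
values = np.random.randn(3, 100, 4)

# Move the minor axis to the front. NumPy returns a view, so no data is copied.
x = values.transpose(2, 0, 1)
print(x.shape)                       # (4, 3, 100)
print(np.shares_memory(values, x))   # True

# Appending the new slice along the first axis is what allocates a new array.
new_slice = np.random.randn(1, 3, 100)   # hypothetical data for "E"
x_ext = np.concatenate([x, new_slice], axis=0)

# Transpose back to items x major x minor.
result = x_ext.transpose(1, 2, 0)
print(result.shape)                  # (3, 100, 5)
```

The original four minor-axis columns are unchanged in result; only the fifth is new.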
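Answer 1: (score: 3)
It seems the issue has been fixed, but your question intrigued me.
Since you can effectively add slices to a panel on the major and minor axes without transposing, the following two lines save some head-scratching over the dimensions of the DataFrame...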
pf.ix[:,'another major axis',:] = pd.DataFrame(np.random.randn(pf.minor_axis.shape[0],pf.items.shape[0]), index=pf.minor_axis, columns=pf.items)
pf.ix[:, :, 'another minor axis'] = pd.DataFrame(np.random.randn(pf.major_axis.shape[0],pf.items.shape[0]), index=pf.major_axis, columns=pf.items)
I wonder whether there is something simpler?
Below is a snippet that adds slices along the various axes.
import pandas as pd
import numpy as np
rng = pd.date_range('25/11/2014', periods=2, freq='D')
df1 = pd.DataFrame(np.random.randn(2, 5), index=rng, columns=['A', 'B', 'C', 'D', 'E'])
df2 = pd.DataFrame(np.random.randn(2, 5), index=rng, columns=['A', 'B', 'C', 'D', 'E'])
df3 = pd.DataFrame(np.random.randn(2, 5), index=rng, columns=['A', 'B', 'C', 'D', 'E'])
pf = pd.Panel({'df1': df1, 'df2': df2, 'df3': df3})
# print("slice before adding df4:\n")
# for i in pf.items:
# print("{}:\n{}".format(i, pf[i]))
pf['df4'] = pd.DataFrame(np.random.randn(pf.major_axis.shape[0], pf.minor_axis.shape[0]), index=pf.major_axis, columns=pf.minor_axis)
print(pf)
# print("slice after df4 before transposing 1:\n")
# for i in pf.items:
# print("{}:\n{}".format(i, pf[i]))
x = pf.transpose(1, 0, 2)
x['new major axis item'] = pd.DataFrame(np.random.randn(pf.items.shape[0], pf.minor_axis.shape[0]),
                                        index=pf.items, columns=pf.minor_axis)
pf = x.transpose(1, 0, 2)
print(pf)
# print("slice after:\n")
# for i in pf.items:
# print("{}:\n{}".format(i, pf[i]))
print("success on adding slice on major axis:")
print(pf.major_xs(key='new major axis item'))
print("trying to add major axis directly")
pf.ix[:,'another major axis',:] = pd.DataFrame(np.random.randn(pf.minor_axis.shape[0],pf.items.shape[0]), index=pf.minor_axis, columns=pf.items)
print(pf.major_xs(key='another major axis'))
print(pf)
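Panel and .ix were removed in later pandas releases, so the snippets above only run on old versions. As a rough sketch of one possible substitute (an assumption, not something proposed in this thread), the same data can be kept as a dict of DataFrames and combined with pd.concat into a frame whose columns are an (item, minor) MultiIndex. The names frames, wide and e_all are illustrative.

```python
import numpy as np
import pandas as pd

rng = pd.date_range('2013-01-01', periods=100, freq='D')
frames = {
    name: pd.DataFrame(np.random.randn(100, 4), index=rng, columns=list('ABCD'))
    for name in ['df1', 'df2', 'df3']
}

# Add the new minor-axis column 'E' to every item before combining.
for frame in frames.values():
    frame['E'] = np.random.randn(len(rng))

# The dict keys become the outer column level, the original columns the inner one.
wide = pd.concat(frames, axis=1)
print(wide.shape)    # (100, 15)

# Rough counterpart of pf.ix[:, :, 'E']: column 'E' for every item.
e_all = wide.xs('E', axis=1, level=1)
print(e_all.shape)   # (100, 3)
```

Here wide['df1'] plays the role of a single item, and xs on the inner level plays the role of a minor-axis slice.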