在Dataframe Pandas行上应用权重公式

时间:2016-09-15 19:44:23

标签: python pandas dataframe apply

我下面有一个df1。我将其复制到df2以保存df1;然后我使用df3来计算df2

df2=df1.copy()

我想计算一个权重,例如Weight(A) = Price(A) / Sum(row_Prices)并将其返回到df2以下的价格,例如每行我得到3行数据,价格,标准和权重行。我还想计算行上的std,我想它的形式类似。

我试过这个

df3 = df2.iloc[1:,1:].div(df2.iloc[1:,1:].sum(axis=1), axis=0)

获取权重然后打印df3但它不起作用。

为了获得每个日期的2行我尝试堆叠.stack(),但我可能做错了。救命!谢谢

                       A      B      C        D     E
2006-04-27 00:00:00                                    
2006-04-28 00:00:00  69.62  69.62  6.518   65.09  69.62
2006-05-01 00:00:00   71.5   71.5  6.522   65.16   71.5
2006-05-02 00:00:00  72.34  72.34  6.669   66.55  72.34
2006-05-03 00:00:00  70.22  70.22  6.662   66.46  70.22
2006-05-04 00:00:00  68.32  68.32  6.758   67.48  68.32
2006-05-05 00:00:00     68     68  6.805   67.99     68
2006-05-08 00:00:00  67.88  67.88  6.768   67.56  67.88

我希望它可以很好地输出:

                            A      B      C        D     E
2006-04-27 00:00:00

2006-04-28 00:00:00                                    
            price        69.62  69.62  6.518   65.09  69.62
            weight
            std
2006-05-01 00:00:00  
            price         71.5   71.5  6.522   65.16   71.5
            weight
            std
2006-05-02 00:00:00   
            price        72.34  72.34  6.669   66.55  72.34
            weight
            std

1 个答案:

答案 0 :(得分:1)

据我所知,没有任何一种快速而肮脏的方式来实现你想要做的事情。 您需要计算所有数据,然后将其全部合并到使用多级索引的DataFrame中:

# Making weight/std DataFrames
cols = list('ABCDE')
weight = pd.DataFrame([df[col] / df.sum(axis=1) for col in df], index=cols).T
std = pd.DataFrame([df.std(axis=1) for col in df], index=cols).T

# Making MultiIndex DataFrame
mindex = pd.MultiIndex.from_product([['price', 'weight', 'std'], df.index])
new_df = pd.DataFrame(index=mindex, columns=cols)

# Inserting data
new_df.ix['price'] = df.values
new_df.ix['weight'] = weight.values
new_df.ix['std'] = std.values

# Swapping levels
new_df = new_df.swaplevel(0, 1).sort_index()

结果new_df应该看起来像这样:

2006-04-28 price      69.62     69.62      6.518     65.09     69.62
           std      27.7829   27.7829    27.7829   27.7829   27.7829
           weight  0.248228  0.248228  0.0232397  0.232076  0.248228
2006-05-01 price       71.5      71.5      6.522     65.16      71.5
           std      28.4828   28.4828    28.4828   28.4828   28.4828
           weight  0.249841  0.249841  0.0227897  0.227687  0.249841
2006-05-02 price      72.34     72.34      6.669     66.55     72.34
           std      28.8308   28.8308    28.8308   28.8308   28.8308
           weight  0.249243  0.249243  0.0229776  0.229294  0.249243
2006-05-03 price      70.22     70.22      6.662     66.46     70.22
           std      28.0509   28.0509    28.0509   28.0509   28.0509
           weight  0.247443  0.247443  0.0234758  0.234194  0.247443
2006-05-04 price      68.32     68.32      6.758     67.48     68.32
           std      27.4399   27.4399    27.4399   27.4399   27.4399
           weight  0.244701  0.244701   0.024205  0.241692  0.244701
2006-05-05 price         68        68      6.805     67.99        68
           std      27.3661   27.3661    27.3661   27.3661   27.3661
           weight  0.243907  0.243907  0.0244086  0.243871  0.243907
2006-05-08 price      67.88     67.88      6.768     67.56     67.88
           std      27.2947   27.2947    27.2947   27.2947   27.2947
           weight  0.244201  0.244201  0.0243481   0.24305  0.244201

作为旁注,我不确定你想要计算什么样的std,所以我只是假设它是行式价格std(每行的单个/重复值)。