Filling a MultiIndexed pandas Series

Time: 2017-10-13 05:08:51

Tags: python pandas

I have a pandas DataFrame filled with data:

import pandas as pd
import numpy as np

varNames = ["point1","point2","point3","point4","point5"]
df = pd.DataFrame(np.random.randn(5,2),index=varNames,columns=["data1","data2"])

I want to use it to create a Series with a MultiIndex. The index itself I can build:

iterables=[["point1","point2","point3"],["point4","point5"]]
index=pd.MultiIndex.from_product(iterables, names=['numerator', 'denominator'])

What I don't know is how to fill the Series. I'd like something along the lines of
s = pd.Series(max(df.loc[index["numerator"]]/df.loc[index["denominator"]]),index=index)

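I want to take each row of the first DataFrame that is listed as a numerator, divide it by each row of the first DataFrame that is listed as a denominator, find the maximum value in the resulting row, and store it at the corresponding position in the Series (s[variableN, variableM]).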

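This is my first time working with a MultiIndex. Short of filling the Series row by row, computing each value and storing it (and I'm not sure I understand the index well enough even for that), I can't figure out how to do this.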

1 Answer:

Answer 0: (score: 0)

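You can use reindex with the level parameter to broadcast df onto each level of the MultiIndex, divide the two results, and then take the maximum per numerator. The original form of this answer used max(level=0); the level argument of max was deprecated in pandas 1.3 and removed in pandas 2.0, so groupby(level=0).max() is used here instead:

df3 = df.reindex(index, level=0).div(df.reindex(index, level=1)).groupby(level=0).max()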

Sample:

np.random.seed(456)
varNames = ["point1","point2","point3","point4","point5"]
df = pd.DataFrame(np.random.randn(5,2),index=varNames,columns=["data1","data2"])
print (df)
           data1     data2
point1 -0.668129 -0.498210
point2  0.618576  0.568692
point3  1.350509  1.629589
point4  0.301966  0.449483
point5 -0.345811 -0.315231

iterables=[["point1","point2","point3"],["point4","point5"]]
index=pd.MultiIndex.from_product(iterables, names=['numerator', 'denominator'])
df1 = df.reindex(index, level=0)
print (df1)
                          data1     data2
numerator denominator                    
point1    point4      -0.668129 -0.498210
          point5      -0.668129 -0.498210
point2    point4       0.618576  0.568692
          point5       0.618576  0.568692
point3    point4       1.350509  1.629589
          point5       1.350509  1.629589

df2 = df.reindex(index, level=1)
print (df2)
                          data1     data2
numerator denominator                    
point1    point4       0.301966  0.449483
          point5      -0.345811 -0.315231
point2    point4       0.301966  0.449483
          point5      -0.345811 -0.315231
point3    point4       0.301966  0.449483
          point5      -0.345811 -0.315231

print (df1.div(df2))
                          data1     data2
numerator denominator                    
point1    point4      -2.212594 -1.108405
          point5       1.932062  1.580459
point2    point4       2.048493  1.265214
          point5      -1.788768 -1.804050
point3    point4       4.472386  3.625472
          point5      -3.905339 -5.169509
# groupby(level=0).max() replaces max(level=0), which pandas 2.0 removed
df3 = df.reindex(index, level=0).div(df.reindex(index, level=1)).groupby(level=0).max()
print (df3)
              data1     data2
numerator                    
point1     1.932062  1.580459
point2     2.048493  1.265214
point3     4.472386  3.625472


# broadcast the per-numerator maxima back onto the full MultiIndex
df3 = (df.reindex(index, level=0).div(df.reindex(index, level=1))
        .groupby(level=0).max()
        .reindex(index, level=0))
print (df3)
                          data1     data2
numerator denominator                    
point1    point4       1.932062  1.580459
          point5       1.932062  1.580459
point2    point4       2.048493  1.265214
          point5       2.048493  1.265214
point3    point4       4.472386  3.625472
          point5       4.472386  3.625472
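
Note that the result above takes the maximum over the denominator level, so both denominators of a numerator share one value. The question reads as if it wants one value per (numerator, denominator) pair, namely the largest ratio across the data columns. If that is the goal, a minimal sketch might look like the following. The frame values and the Series name max_ratio are made up for illustration, and the row lookup via get_level_values is just one way to do it, not necessarily what the answerer had in mind.

```python
import pandas as pd

# Small deterministic frame (hypothetical values, chosen so the ratios are easy to check).
df = pd.DataFrame(
    {"data1": [1.0, 2.0, 3.0, 4.0, 5.0],
     "data2": [8.0, 6.0, 4.0, 2.0, 1.0]},
    index=["point1", "point2", "point3", "point4", "point5"],
)

iterables = [["point1", "point2", "point3"], ["point4", "point5"]]
index = pd.MultiIndex.from_product(iterables, names=["numerator", "denominator"])

# Look up the numerator row and the denominator row for every pair in the index.
# to_numpy() drops the labels, so the division below is purely positional.
num = df.loc[index.get_level_values("numerator")].to_numpy()
den = df.loc[index.get_level_values("denominator")].to_numpy()

# One value per (numerator, denominator) pair: the largest ratio across the columns.
s = pd.Series((num / den).max(axis=1), index=index, name="max_ratio")
print(s)
```

With this layout, s["point1", "point4"] is the larger of 1/4 and 8/2, and a single pair can be read back with s.loc[("point3", "point5")].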