How to apply a per-column weight factor in a DataFrame

Time: 2016-05-08 23:27:41

Tags: python pandas dataframe

I have the following DataFrame:

                AAPL        F       IBM
    Date            
    2016-05-02  93.073328   13.62   143.881476
    2016-05-03  94.604009   13.43   142.752373
    2016-05-04  93.620002   13.31   142.871221
    2016-05-05  93.239998   13.32   145.070003
    2016-05-06  92.720001   13.44   147.289993

I have a list of weights, say w = [20, 30, 50].

I want to divide each column by its first-row value and then multiply it by the corresponding weight:

                     AAPL             F               IBM
    2016-05-02  93.07/93.07*20  13.62/13.62*30  143.88/143.88*50
    2016-05-03  94.60/93.07*20  13.43/13.62*30  142.75/143.88*50

Is there a simple way to do this?

2 answers:

Answer 0: (score: 4)

Another way:

w = [20, 30, 50]

In [110]: df /= df.iloc[0]/w

In [111]: df
Out[111]:
                 AAPL          F        IBM
Date
2016-05-02  20.000000  30.000000  50.000000
2016-05-03  20.328919  29.581498  49.607627
2016-05-04  20.117472  29.317181  49.648928
2016-05-05  20.035815  29.339207  50.413023
2016-05-06  19.924076  29.603524  51.184488

Or like this (depending on what you want to achieve):

In [103]: df /= (df.iloc[0]*w)

In [104]: df
Out[104]:
                AAPL         F       IBM
Date
2016-05-02  0.050000  0.033333  0.020000
2016-05-03  0.050822  0.032868  0.019843
2016-05-04  0.050294  0.032575  0.019860
2016-05-05  0.050090  0.032599  0.020165
2016-05-06  0.049810  0.032893  0.020474
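
The two in-place expressions above are easy to confuse, so here is a minimal sketch that spells out what each one computes without modifying df. The names first, scaled_up and scaled_down are illustrative and not part of the original answer. The first variant is equivalent to df / first * w (what the question asks for); the second is equivalent to df / first / w.

```python
import numpy as np
import pandas as pd

# Sample prices copied from the question.
df = pd.DataFrame(
    {
        "AAPL": [93.073328, 94.604009, 93.620002, 93.239998, 92.720001],
        "F": [13.62, 13.43, 13.31, 13.32, 13.44],
        "IBM": [143.881476, 142.752373, 142.871221, 145.070003, 147.289993],
    },
    index=pd.to_datetime(
        ["2016-05-02", "2016-05-03", "2016-05-04", "2016-05-05", "2016-05-06"]
    ),
)
df.index.name = "Date"

w = np.array([20, 30, 50])  # one weight per column, in column order
first = df.iloc[0]          # first-row value of each column (illustrative name)

# Variant 1: df / (first / w) is the same as df / first * w,
# so the first row of the result equals the weights.
scaled_up = df / (first / w)

# Variant 2: df / (first * w) is the same as df / first / w,
# so the first row of the result equals 1 / w.
scaled_down = df / (first * w)
```

Because these use plain `/` instead of `/=`, the original prices in df stay intact and both results can be compared side by side.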

Answer 1: (score: 3)

Setup

from io import StringIO
import pandas as pd
import numpy as np


text = """Date                AAPL        F       IBM         
    2016-05-02  93.073328   13.62   143.881476
    2016-05-03  94.604009   13.43   142.752373
    2016-05-04  93.620002   13.31   142.871221
    2016-05-05  93.239998   13.32   145.070003
    2016-05-06  92.720001   13.44   147.289993"""

df = pd.read_csv(StringIO(text), sep=r"\s+", parse_dates=[0], index_col=0)

Solution

df.div(df.iloc[0]).mul([20, 30, 50])
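
Passing a plain list to mul relies on the weights being in the same order as the columns. As a hedged variation (not part of the original answer), the weights can instead be stored in a Series keyed by column name, so that pandas aligns them by label. The names weights, normalized and weighted below are assumptions made for illustration, and only the first two rows of the question's data are used to keep the sketch short.

```python
import pandas as pd

# First two rows of the question's data.
df = pd.DataFrame(
    {
        "AAPL": [93.073328, 94.604009],
        "F": [13.62, 13.43],
        "IBM": [143.881476, 142.752373],
    },
    index=pd.to_datetime(["2016-05-02", "2016-05-03"]),
)
df.index.name = "Date"

# Hypothetical weights keyed by ticker, deliberately listed in a different order.
weights = pd.Series({"IBM": 50, "AAPL": 20, "F": 30})

# Divide every column by its first-row value, so each column starts at 1.0.
normalized = df.div(df.iloc[0])

# Multiply by the weights; a Series operand is matched on column labels.
weighted = normalized.mul(weights)
```

With label alignment, reordering the columns or the weights does not silently pair a weight with the wrong ticker.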