Why does Pandas .loc only change one row?

Time: 2019-11-18 15:39:20

Tags: python pandas

I have a DataFrame that looks like this:

              Open     High      Low    Close       Volume         MA  Status  Portfolio
Date
1958-03-12    42.41    42.41    42.41    42.41    2688889.0    41.3016     1.0      100.0
1958-03-13    42.46    42.46    42.46    42.46    3144444.0    41.3442     1.0        NaN
1958-03-14    42.33    42.33    42.33    42.33    2388889.0    41.3734     1.0        NaN
1958-03-17    42.04    42.04    42.04    42.04    2366667.0    41.4006     1.0        NaN
1958-03-18    41.89    41.89    41.89    41.89    2300000.0    41.4184     1.0        NaN
1958-03-19    42.09    42.09    42.09    42.09    2677778.0    41.4404     1.0        NaN
1958-03-20    42.11    42.11    42.11    42.11    2533333.0    41.4676     1.0        NaN
1958-03-21    42.42    42.42    42.42    42.42    2700000.0    41.5086     1.0        NaN
1958-03-24    42.58    42.58    42.58    42.58    2866667.0    41.5504     1.0        NaN

Where Status equals 1, I want the Portfolio column to be the previous day's value grown by the current day's return. I have this line:

spx_daily.loc['1958-03-13':].loc[spx_daily['Status'] == 1, 'Portfolio'] = ((spx_daily.Close / spx_daily.Close.shift(1))) * spx_daily.Portfolio.shift(1)

However, when I run the code, the output is:

               Open     High      Low    Close       Volume         MA  Status   Portfolio
Date
1958-03-12    42.41    42.41    42.41    42.41    2688889.0    41.3016     1.0  100.000000
1958-03-13    42.46    42.46    42.46    42.46    3144444.0    41.3442     1.0  100.117897
1958-03-14    42.33    42.33    42.33    42.33    2388889.0    41.3734     1.0         NaN
1958-03-17    42.04    42.04    42.04    42.04    2366667.0    41.4006     1.0         NaN
1958-03-18    41.89    41.89    41.89    41.89    2300000.0    41.4184     1.0         NaN
1958-03-19    42.09    42.09    42.09    42.09    2677778.0    41.4404     1.0         NaN
1958-03-20    42.11    42.11    42.11    42.11    2533333.0    41.4676     1.0         NaN
1958-03-21    42.42    42.42    42.42    42.42    2700000.0    41.5086     1.0         NaN
1958-03-24    42.58    42.58    42.58    42.58    2866667.0    41.5504     1.0         NaN

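Only the first row is computed. Is that because the operation happens "all at once", so the remaining rows still see NaN as the previous value?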

How can I fix this without looping over the rows?

1 answer:

Answer 0: (score: 0)

Use Series.fillna + Series.cumprod:

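# Fill the NaN rows with the daily growth factor Close / previous Close;
# rows where Status != 1 get a neutral factor of 1.
df['Portfolio'] = df['Portfolio'].fillna((df['Close'] / df['Close'].shift()).mask(df.Status.ne(1), 1))
# The starting value (100.0) times the running product of the factors gives the portfolio value.
df['Portfolio'] = df['Portfolio'].cumprod()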
print(df)
             Open   High    Low  Close     Volume       MA  Status   Portfolio
Date                                                                          
1958-03-12  42.41  42.41  42.41  42.41  2688889.0  41.3016     1.0  100.000000
1958-03-13  42.46  42.46  42.46  42.46  3144444.0  41.3442     1.0  100.117897
1958-03-14  42.33  42.33  42.33  42.33  2388889.0  41.3734     1.0   99.811365
1958-03-17  42.04  42.04  42.04  42.04  2366667.0  41.4006     1.0   99.127564
1958-03-18  41.89  41.89  41.89  41.89  2300000.0  41.4184     1.0   98.773874
1958-03-19  42.09  42.09  42.09  42.09  2677778.0  41.4404     1.0   99.245461
1958-03-20  42.11  42.11  42.11  42.11  2533333.0  41.4676     1.0   99.292620
1958-03-21  42.42  42.42  42.42  42.42  2700000.0  41.5086     1.0  100.023579
1958-03-24  42.58  42.58  42.58  42.58  2866667.0  41.5504     1.0  100.400849

I used df instead of spx_daily; the point is only to show the idea.
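As a rough illustration of why the original assignment filled only one row, the sketch below rebuilds the right-hand side of that expression on a small stand-in frame. The frame is an assumption made for the example: it uses only the first five Close values from the question, with Status fixed at 1. The right-hand side is evaluated once, before any assignment happens, so Portfolio.shift(1) contains a real number only in the second row.

```python
import numpy as np
import pandas as pd

# Stand-in for spx_daily (assumption): first five rows of the question's data.
df = pd.DataFrame(
    {
        "Close": [42.41, 42.46, 42.33, 42.04, 41.89],
        "Status": [1.0, 1.0, 1.0, 1.0, 1.0],
        "Portfolio": [100.0, np.nan, np.nan, np.nan, np.nan],
    },
    index=pd.to_datetime(
        ["1958-03-12", "1958-03-13", "1958-03-14", "1958-03-17", "1958-03-18"]
    ),
)

# The whole right-hand side is computed in one vectorized step from the
# Portfolio column as it is now; later rows never see freshly written values.
rhs = (df["Close"] / df["Close"].shift(1)) * df["Portfolio"].shift(1)
print(rhs)

# Count how many rows of the right-hand side are usable (not NaN).
n_valid = int(rhs.notna().sum())
print(n_valid)
```

Every row after the second multiplies by a NaN previous value, which is why a running product (cumprod) is used instead of a single shifted assignment.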


Checking with one row where Status == 0:

df.iloc[4,6]=0
print(df)
             Open   High    Low  Close     Volume       MA  Status  Portfolio
Date                                                                         
1958-03-12  42.41  42.41  42.41  42.41  2688889.0  41.3016     1.0      100.0
1958-03-13  42.46  42.46  42.46  42.46  3144444.0  41.3442     1.0        NaN
1958-03-14  42.33  42.33  42.33  42.33  2388889.0  41.3734     1.0        NaN
1958-03-17  42.04  42.04  42.04  42.04  2366667.0  41.4006     1.0        NaN
1958-03-18  41.89  41.89  41.89  41.89  2300000.0  41.4184     0.0        NaN
1958-03-19  42.09  42.09  42.09  42.09  2677778.0  41.4404     1.0        NaN
1958-03-20  42.11  42.11  42.11  42.11  2533333.0  41.4676     1.0        NaN
1958-03-21  42.42  42.42  42.42  42.42  2700000.0  41.5086     1.0        NaN
1958-03-24  42.58  42.58  42.58  42.58  2866667.0  41.5504     1.0        NaN

df['Portfolio']=df['Portfolio'].fillna( (df['Close']/df['Close'].shift()).mask(df.Status.ne(1),1) )
df['Portfolio']=df['Portfolio'].cumprod()
print(df)
             Open   High    Low  Close     Volume       MA  Status   Portfolio
Date                                                                          
1958-03-12  42.41  42.41  42.41  42.41  2688889.0  41.3016     1.0  100.000000
1958-03-13  42.46  42.46  42.46  42.46  3144444.0  41.3442     1.0  100.117897
1958-03-14  42.33  42.33  42.33  42.33  2388889.0  41.3734     1.0   99.811365
1958-03-17  42.04  42.04  42.04  42.04  2366667.0  41.4006     1.0   99.127564
1958-03-18  41.89  41.89  41.89  41.89  2300000.0  41.4184     0.0   99.127564
1958-03-19  42.09  42.09  42.09  42.09  2677778.0  41.4404     1.0   99.600840
1958-03-20  42.11  42.11  42.11  42.11  2533333.0  41.4676     1.0   99.648167
1958-03-21  42.42  42.42  42.42  42.42  2700000.0  41.5086     1.0  100.381744
1958-03-24  42.58  42.58  42.58  42.58  2866667.0  41.5504     1.0  100.760365
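To double-check the vectorized result, it can be compared against a plain loop that applies the recurrence row by row. This is a minimal sketch: the function names fill_portfolio and fill_portfolio_loop are hypothetical helpers introduced here, and it assumes that only the first Portfolio value is set while all later ones are NaN, as in the question.

```python
import numpy as np
import pandas as pd

def fill_portfolio(df):
    """Vectorized fill: daily growth factors, neutral where Status != 1, then cumprod."""
    factors = (df["Close"] / df["Close"].shift()).mask(df["Status"].ne(1), 1)
    return df["Portfolio"].fillna(factors).cumprod()

def fill_portfolio_loop(df):
    """Reference loop for the same recurrence (assumes only the first Portfolio value is set)."""
    values = [df["Portfolio"].iloc[0]]
    for i in range(1, len(df)):
        if df["Status"].iloc[i] == 1:
            growth = df["Close"].iloc[i] / df["Close"].iloc[i - 1]
        else:
            growth = 1.0  # carry the previous value forward unchanged
        values.append(values[-1] * growth)
    return pd.Series(values, index=df.index, name="Portfolio")

# Close values from the question, with Status set to 0 on the fifth row.
df = pd.DataFrame(
    {
        "Close": [42.41, 42.46, 42.33, 42.04, 41.89, 42.09, 42.11, 42.42, 42.58],
        "Status": [1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0],
        "Portfolio": [100.0] + [np.nan] * 8,
    },
    index=pd.to_datetime(
        ["1958-03-12", "1958-03-13", "1958-03-14", "1958-03-17", "1958-03-18",
         "1958-03-19", "1958-03-20", "1958-03-21", "1958-03-24"]
    ),
)

vectorized = fill_portfolio(df)
looped = fill_portfolio_loop(df)
print(pd.DataFrame({"vectorized": vectorized, "loop": looped}))
```

The loop is only a reference for checking; on large frames the fillna + cumprod version avoids the per-row Python overhead.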