I have the following DataFrame:
50d-200d Regime
Date
2017-02-22 NaN 0
2017-02-23 NaN 0
2017-02-24 NaN 0
2017-02-27 0.52 1
2017-02-28 0.92 1
...
2017-04-04 0.39 1
2017-04-05 0.16 1
2017-04-06 -0.08 -1
2017-04-07 -0.30 -1
2017-04-10 -0.51 -1
...
2017-08-09 -1.15 -1
2017-08-10 -0.52 -1
2017-08-11 0.07 1
2017-08-17 2.67 1
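I want to modify this DataFrame so that the "Regime" column is set to 0 until the first occurrence of -1. From that row onward, the DataFrame should stay unmodified. How can I do this?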
TIA
Answer 0 (score: 2)
Use idxmax to get the index label of the first -1, then set every row before it to 0:
idx = df['Regime'].eq(-1).idxmax()
df.iloc[:df.index.get_loc(idx), df.columns.get_loc('Regime')] = 0
print(df)
50d-200d Regime
Date
2017-02-22 NaN 0
2017-02-23 NaN 0
2017-02-24 NaN 0
2017-02-27 0.52 0
2017-02-28 0.92 0
2017-04-04 0.39 0
2017-04-05 0.16 0
2017-04-06 -0.08 -1
2017-04-07 -0.30 -1
2017-04-10 -0.51 -1
2017-08-09 -1.15 -1
2017-08-10 -0.52 -1
2017-08-11 0.07 1
2017-08-17 2.67 1
Another solution from piRSquared, thank you:
df.iloc[:df.Regime.eq(-1).values.argmax(), df.columns.get_loc('Regime')] = 0
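One thing to watch with both snippets: if the column contains no -1 at all, idxmax and argmax both point at the first row, so the slice is empty and nothing is changed. Below is a minimal self-contained sketch of the same positional approach with an explicit guard for that case. The helper name zero_until_first, the small sample frame (a few rows taken from the question), and the choice to zero the whole column when no -1 exists are my assumptions, not part of the original answer.

```python
import pandas as pd

def zero_until_first(df, column="Regime", marker=-1):
    """Return a copy of df with `column` set to 0 before the first `marker` row.

    Hypothetical helper: if `marker` never occurs, the whole column is zeroed
    (an assumption about the intended behaviour).
    """
    out = df.copy()
    hits = out[column].eq(marker).to_numpy()
    # argmax returns 0 when there is no True at all, so check explicitly.
    stop = hits.argmax() if hits.any() else len(out)
    out.iloc[:stop, out.columns.get_loc(column)] = 0
    return out

# A few rows from the question, rebuilt by hand.
sample = pd.DataFrame(
    {
        "50d-200d": [float("nan"), 0.52, 0.16, -0.08, -0.30, 0.07],
        "Regime": [0, 1, 1, -1, -1, 1],
    },
    index=pd.to_datetime(
        ["2017-02-24", "2017-02-27", "2017-04-05",
         "2017-04-06", "2017-04-07", "2017-08-11"]
    ),
)
sample.index.name = "Date"

result = zero_until_first(sample)
print(result)
```

Working on a copy keeps the original frame intact, which makes it easy to compare before and after. Drop the copy if you want the in-place behaviour of the snippets above.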
Answer 1 (score: 2)
Option 1
np.logical_and.accumulate
import numpy as np
# assign returns a new DataFrame; df itself is not modified
df.assign(Regime=df.Regime.mask(np.logical_and.accumulate(df.Regime.ne(-1)), 0))
50d-200d Regime
Date
2017-02-22 NaN 0
2017-02-23 NaN 0
2017-02-24 NaN 0
2017-02-27 0.52 0
2017-02-28 0.92 0
2017-04-04 0.39 0
2017-04-05 0.16 0
2017-04-06 -0.08 -1
2017-04-07 -0.30 -1
2017-04-10 -0.51 -1
2017-08-09 -1.15 -1
2017-08-10 -0.52 -1
2017-08-11 0.07 1
2017-08-17 2.67 1
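To see why this works, it may help to look at the intermediate mask on a tiny example. This is only an illustrative sketch: the six-value Series and the variable names are made up, and I convert to a NumPy array before calling accumulate so the ufunc operates on a plain boolean array.

```python
import numpy as np
import pandas as pd

regime = pd.Series([0, 1, 1, -1, -1, 1], name="Regime")

# True wherever the value is not -1.
not_marker = regime.ne(-1).to_numpy()

# Running logical AND: stays True only until the first False is seen.
before_first = np.logical_and.accumulate(not_marker)

# mask replaces the values where the condition is True.
masked = regime.mask(before_first, 0)

print(before_first.tolist())  # [True, True, True, False, False, False]
print(masked.tolist())        # [0, 0, 0, -1, -1, 1]
```

Note that the trailing 1 after the -1 run is left alone, because the running AND can never become True again once it has seen a False.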
Option 2
df.assign(Regime=df.Regime.mask(df.Regime.ne(-1).cumprod().astype(bool), 0))
50d-200d Regime
Date
2017-02-22 NaN 0
2017-02-23 NaN 0
2017-02-24 NaN 0
2017-02-27 0.52 0
2017-02-28 0.92 0
2017-04-04 0.39 0
2017-04-05 0.16 0
2017-04-06 -0.08 -1
2017-04-07 -0.30 -1
2017-04-10 -0.51 -1
2017-08-09 -1.15 -1
2017-08-10 -0.52 -1
2017-08-11 0.07 1
2017-08-17 2.67 1
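A small sketch of the cumprod variant, wrapped in a function so the edge case is easy to check. The function name mask_until_first and the two toy Series are hypothetical. With the mask-based options, a column that never contains -1 ends up entirely 0, whereas the slicing snippets in the first answer would leave such a column unchanged, so pick whichever matches what you need.

```python
import pandas as pd

def mask_until_first(regime, marker=-1):
    """Return a copy of `regime` with 0 in every position before the first `marker`."""
    # The cumulative product stays truthy until the first False, then is 0 for good.
    before_first = regime.ne(marker).cumprod().astype(bool)
    return regime.mask(before_first, 0)

with_marker = pd.Series([0, 1, 1, -1, -1, 1])
without_marker = pd.Series([0, 1, 1, 1])

print(mask_until_first(with_marker).tolist())     # [0, 0, 0, -1, -1, 1]
print(mask_until_first(without_marker).tolist())  # [0, 0, 0, 0]
```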