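I have a DataFrame in which I need to find Close values of 0.00 and, when certain conditions are met, replace each one with the value from the row directly below it. I have been looking for documentation on this behavior but have not found a working Pythonic solution.
The logic is as follows:
If [Symbol] = 'VIX' and [QuoteDateTime] contains '09:31:00' and [Close] = '0.00'
then I want to replace that [Close] value with the [Close] value from the row below it.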
+----+--------+---------------------+---------+
| | Symbol | QuoteDateTime | Close |
+----+--------+---------------------+---------+
| 0 | VIX | 2019-04-11 09:31:00 | 0.00 |
| 1 | VIX | 2019-04-11 09:32:00 | 14.24 |
| 2 | VIX | 2019-04-11 09:33:00 | 14.40 |
| 3 | SPX | 2019-04-11 09:31:00 | 2911.09 |
| 4 | SPX | 2019-04-11 09:32:00 | 2911.55 |
| 5 | SPX | 2019-04-11 09:33:00 | 2915.22 |
| 6 | VIX | 2019-04-12 09:31:00 | 0.00 |
| 7 | VIX | 2019-04-12 09:32:00 | 15.64 |
| 8 | VIX | 2019-04-12 09:33:00 | 15.80 |
| 9 | SPX | 2019-04-12 09:31:00 | 2901.09 |
| 10 | SPX | 2019-04-12 09:32:00 | 2901.55 |
| 11 | SPX | 2019-04-12 09:33:00 | 2905.22 |
+----+--------+---------------------+---------+
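For anyone who wants to reproduce this, the input table above could be rebuilt roughly as follows. This is only a sketch: it assumes Close is stored as a float and QuoteDateTime as a parsed datetime, which the question does not state outright (the quoted '0.00' may mean the real column holds strings).

```python
import pandas as pd

# Rebuild the sample input; all values are copied from the table above.
# Assumption: Close is numeric and QuoteDateTime is a datetime column.
df = pd.DataFrame({
    "Symbol": ["VIX"] * 3 + ["SPX"] * 3 + ["VIX"] * 3 + ["SPX"] * 3,
    "QuoteDateTime": pd.to_datetime([
        "2019-04-11 09:31:00", "2019-04-11 09:32:00", "2019-04-11 09:33:00",
        "2019-04-11 09:31:00", "2019-04-11 09:32:00", "2019-04-11 09:33:00",
        "2019-04-12 09:31:00", "2019-04-12 09:32:00", "2019-04-12 09:33:00",
        "2019-04-12 09:31:00", "2019-04-12 09:32:00", "2019-04-12 09:33:00",
    ]),
    "Close": [0.00, 14.24, 14.40, 2911.09, 2911.55, 2915.22,
              0.00, 15.64, 15.80, 2901.09, 2901.55, 2905.22],
})
print(df)
```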
The expected output has [Close] at index 0 set to 14.24 and [Close] at index 6 set to 15.64. Everything else stays the same.
+----+--------+---------------------+---------+
| | Symbol | QuoteDateTime | Close |
+----+--------+---------------------+---------+
| 0 | VIX | 2019-04-11 09:31:00 | 14.24 |
| 1 | VIX | 2019-04-11 09:32:00 | 14.24 |
| 2 | VIX | 2019-04-11 09:33:00 | 14.40 |
| 3 | SPX | 2019-04-11 09:31:00 | 2911.09 |
| 4 | SPX | 2019-04-11 09:32:00 | 2911.55 |
| 5 | SPX | 2019-04-11 09:33:00 | 2915.22 |
| 6 | VIX | 2019-04-12 09:31:00 | 15.64 |
| 7 | VIX | 2019-04-12 09:32:00 | 15.64 |
| 8 | VIX | 2019-04-12 09:33:00 | 15.80 |
| 9 | SPX | 2019-04-12 09:31:00 | 2901.09 |
| 10 | SPX | 2019-04-12 09:32:00 | 2901.55 |
| 11 | SPX | 2019-04-12 09:33:00 | 2905.22 |
+----+--------+---------------------+---------+
Answer 0 (score: 2)
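Use Series.eq (the method form of ==) to build the boolean masks, Series.dt.strftime to turn the datetimes into strings so the time of day can be compared, and Series.mask together with Series.shift to set the new values: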
import pandas as pd

# Convert to datetimes if necessary
df['QuoteDateTime'] = pd.to_datetime(df['QuoteDateTime'])
# True only for VIX rows stamped 09:31:00 whose Close is 0
mask = (df['Symbol'].eq('VIX') &
        df['QuoteDateTime'].dt.strftime('%H:%M:%S').eq('09:31:00') &
        df['Close'].eq(0))
# Where the mask is True, take the Close from the next row
df['Close'] = df['Close'].mask(mask, df['Close'].shift(-1))
# Alternative 1
# df.loc[mask, 'Close'] = df['Close'].shift(-1)
# Alternative 2 (requires: import numpy as np)
# df['Close'] = np.where(mask, df['Close'].shift(-1), df['Close'])
print(df)
Symbol QuoteDateTime Close
0 VIX 2019-04-11 09:31:00 14.24
1 VIX 2019-04-11 09:32:00 14.24
2 VIX 2019-04-11 09:33:00 14.40
3 SPX 2019-04-11 09:31:00 2911.09
4 SPX 2019-04-11 09:32:00 2911.55
5 SPX 2019-04-11 09:33:00 2915.22
6 VIX 2019-04-12 09:31:00 15.64
7 VIX 2019-04-12 09:32:00 15.64
8 VIX 2019-04-12 09:33:00 15.80
9 SPX 2019-04-12 09:31:00 2901.09
10 SPX 2019-04-12 09:32:00 2901.55
11 SPX 2019-04-12 09:33:00 2905.22
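One caveat with a plain shift(-1): it takes the next row in the frame, whatever symbol or day that row belongs to. That works for the sample data because each VIX 09:31 row is followed directly by the VIX 09:32 row. If the rows were interleaved, a grouped shift might be safer. The following is a sketch of that variant, not part of the original answer; the interleaved sample frame and the choice of grouping by symbol and calendar date are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical interleaved data: the row after the VIX 09:31 quote is an SPX row.
df = pd.DataFrame({
    "Symbol": ["VIX", "SPX", "VIX", "SPX"],
    "QuoteDateTime": pd.to_datetime([
        "2019-04-11 09:31:00", "2019-04-11 09:31:00",
        "2019-04-11 09:32:00", "2019-04-11 09:32:00",
    ]),
    "Close": [0.00, 2911.09, 14.24, 2911.55],
})

# Same condition as in the answer above.
mask = (df["Symbol"].eq("VIX")
        & df["QuoteDateTime"].dt.strftime("%H:%M:%S").eq("09:31:00")
        & df["Close"].eq(0))

# Shift within each (symbol, calendar date) group so the replacement
# always comes from the same symbol on the same day.
keys = [df["Symbol"], df["QuoteDateTime"].dt.date]
next_close = df.groupby(keys)["Close"].shift(-1)

df["Close"] = df["Close"].mask(mask, next_close)
print(df)
```

If a masked row is the last one in its group, next_close is NaN there, so the zero would become NaN rather than stay 0.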
Answer 1 (score: 1)
I'm no expert, but you could try using the index:
First, get the matching index labels with this one-liner:
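# astype(str) / astype(float) make the tests work whether the columns hold strings or parsed values
idx = df.index[(df['Symbol'] == 'VIX') & (df['QuoteDateTime'].astype(str).str.contains("09:31:00")) & (df['Close'].astype(float) == 0)]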
Then use that index to set each value to the value in the row below it:
df.loc[idx, 'Close'] = df.loc[idx+1, 'Close'].values
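Note that idx+1 relies on the frame having a default integer index (0, 1, 2, ...) and on no match falling on the last row; otherwise the lookup can raise a KeyError. A position-based variant avoids both issues. This is only a sketch under stated assumptions: the three-row frame with a non-default index is made up for illustration, and QuoteDateTime is assumed to be a string column, as in this answer.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-default index and a match on the last row.
df = pd.DataFrame({
    "Symbol": ["VIX", "VIX", "VIX"],
    "QuoteDateTime": ["2019-04-11 09:31:00", "2019-04-11 09:32:00",
                      "2019-04-12 09:31:00"],
    "Close": [0.0, 14.24, 0.0],
}, index=[10, 20, 30])

cond = ((df["Symbol"] == "VIX")
        & df["QuoteDateTime"].str.contains("09:31:00")
        & (df["Close"] == 0))

# Integer positions of the matching rows, independent of the index labels.
pos = np.flatnonzero(cond.to_numpy())
# Skip any match on the last row, since there is no row below it.
pos = pos[pos + 1 < len(df)]

close_col = df.columns.get_loc("Close")
df.iloc[pos, close_col] = df.iloc[pos + 1, close_col].to_numpy()
print(df)
```

Here the first zero is replaced by 14.24, while the zero on the last row is left alone because nothing follows it.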