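I have a dataframe like the one below, and I only want to keep the rows where at least one of the columns whose name starts with "Model" equals "BUY" or "SELL".
Input: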
Date Ticker IssuerTier Action ... ModelG1 ModelG2 ModelG3 ModelG4
0 2020-05-28 AAPL gold None ... STAND STAND STAND STAND
1 2020-05-28 ABBV gold reiterate ... STAND STAND STAND STAND
2 2020-05-28 ABMD standard None ... STAND STAND SELL STAND
3 2020-05-28 ACAD gold None ... BUY STAND STAND STAND
4 2020-05-28 ADSK standard None ... STAND STAND STAND STAND
.. ... ... ... ... ... ... ... ... ...
130 2020-05-28 WEX gold None ... STAND STAND STAND STAND
131 2020-05-28 WYNN gold None ... STAND STAND STAND STAND
132 2020-05-28 ZEN gold None ... BUY STAND STAND STAND
133 2020-05-28 ZEN gold reiterate ... STAND STAND STAND STAND
134 2020-05-28 ZEN silver None ... STAND STAND STAND STAND
[135 rows x 58 columns]
Output:
Date Ticker IssuerTier Action ... ModelG1 ModelG2 ModelG3 ModelG4
2 2020-05-28 ABMD standard None ... STAND STAND SELL STAND
3 2020-05-28 ACAD gold None ... BUY STAND STAND STAND
132 2020-05-28 ZEN gold None ... BUY STAND STAND STAND
I tried the following masks, but for some reason I get NaN across the whole dataframe:
mask1 = signals.loc[:, 'ModelA1':] == 'BUY'
mask2 = signals.loc[:, 'ModelA1':] == 'SELL'
signals = signals[mask1 & mask2]
The model columns run from A1, A2, A3, A4 ... to G1, G2, G3, G4.
Thanks for your help!
Answer 0 (score: 0)
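# Collect the model columns in a list (a generator may not be accepted as a column indexer in every pandas version).
model_cols = [col for col in signals.columns if col.startswith("Model")]
# Flag each row that contains at least one BUY or SELL among its model columns.
mask = signals[model_cols].apply(lambda s: "BUY" in s.values or "SELL" in s.values, axis=1)
signals = signals[mask]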
We first collect the "Model" columns, then build a row mask from your condition and use it to filter the dataframe.
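The same two steps can also be written without a row-wise `apply`. The following is only a sketch under assumptions: the tiny `signals` frame is made-up sample data standing in for the 58-column frame in the question, and `filtered` is a name chosen here for illustration.

```python
import pandas as pd

# Hypothetical miniature of the question's data (only a few columns shown).
signals = pd.DataFrame({
    "Ticker": ["AAPL", "ABMD", "ACAD", "ADSK"],
    "ModelG1": ["STAND", "STAND", "BUY", "STAND"],
    "ModelG3": ["STAND", "SELL", "STAND", "STAND"],
})

# Step 1: select every column whose name starts with "Model".
model_cols = [c for c in signals.columns if c.startswith("Model")]

# Step 2: isin() returns a boolean frame; any(axis=1) reduces it to one flag per row.
mask = signals[model_cols].isin(["BUY", "SELL"]).any(axis=1)

# Indexing with a 1-D boolean Series keeps only the flagged rows.
filtered = signals[mask]
print(filtered)
```

Because `isin` and `any` work on whole columns at once, this tends to be faster than `apply(..., axis=1)` on larger frames, though for 135 rows either form should be fine.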
Answer 1 (score: 0)
Could you try this?
# df is the dataframe from the question (called signals there).
cols_to_check = [col for col in df.columns if col.startswith('Model')]
def should_allow(row):
    # Keep the row if any model column is BUY or SELL.
    return any(row[col] in ('BUY', 'SELL') for col in cols_to_check)
df = df[df.apply(should_allow, axis=1)]
print(df)
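As for why the masks in the question produce NaN everywhere, two things seem to be at play: `mask1 & mask2` asks for a cell to be both BUY and SELL at once, which is never true, and indexing a dataframe with a two-dimensional boolean frame behaves like `DataFrame.where`, keeping the shape and blanking every False cell. A small sketch, using made-up sample data in place of the real `signals` frame:

```python
import pandas as pd

# Hypothetical sample: model columns are contiguous from ModelA1 to the end.
signals = pd.DataFrame({
    "Ticker": ["ABMD", "ACAD", "ADSK"],
    "ModelA1": ["SELL", "BUY", "STAND"],
    "ModelA2": ["STAND", "STAND", "STAND"],
})

mask1 = signals.loc[:, "ModelA1":] == "BUY"
mask2 = signals.loc[:, "ModelA1":] == "SELL"

# A cell cannot be both BUY and SELL, so the AND-ed frame is False everywhere.
both = mask1 & mask2

# Indexing with a 2-D boolean frame acts like where(): same shape, False cells become NaN.
all_nan = signals[both]

# OR the cell masks, then reduce across columns to get a 1-D row mask.
row_mask = (mask1 | mask2).any(axis=1)
kept = signals[row_mask]
print(kept)
```

The key difference is that `row_mask` is a Series with one boolean per row, so `signals[row_mask]` selects rows instead of masking individual cells.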