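Filter a DataFrame to show rows based on the values in certain columns

Time: 2020-05-28 14:24:09

Tags: python pandas dataframe filter

I have a DataFrame like the one below, and I only want to keep the rows where at least one of the columns whose name starts with "Model" equals "BUY" or "SELL".

Input: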

          Date Ticker IssuerTier     Action  ... ModelG1 ModelG2  ModelG3  ModelG4
0   2020-05-28   AAPL       gold       None  ...   STAND   STAND    STAND    STAND
1   2020-05-28   ABBV       gold  reiterate  ...   STAND   STAND    STAND    STAND
2   2020-05-28   ABMD   standard       None  ...   STAND   STAND    SELL     STAND
3   2020-05-28   ACAD       gold       None  ...   BUY     STAND    STAND    STAND
4   2020-05-28   ADSK   standard       None  ...   STAND   STAND    STAND    STAND
..         ...    ...        ...        ...  ...     ...     ...      ...      ...
130 2020-05-28    WEX       gold       None  ...   STAND   STAND    STAND    STAND
131 2020-05-28   WYNN       gold       None  ...   STAND   STAND    STAND    STAND
132 2020-05-28    ZEN       gold       None  ...   BUY     STAND    STAND    STAND
133 2020-05-28    ZEN       gold  reiterate  ...   STAND   STAND    STAND    STAND
134 2020-05-28    ZEN     silver       None  ...   STAND   STAND    STAND    STAND

[135 rows x 58 columns]

Output:

          Date Ticker IssuerTier     Action  ... ModelG1 ModelG2  ModelG3  ModelG4
2   2020-05-28   ABMD   standard       None  ...   STAND   STAND    SELL     STAND
3   2020-05-28   ACAD       gold       None  ...   BUY     STAND    STAND    STAND
132 2020-05-28    ZEN       gold       None  ...   BUY     STAND    STAND    STAND

I tried the following masks, but for some reason I get NaN across the entire DataFrame:

mask1 = signals.loc[:, 'ModelA1':] == 'BUY'
mask2 = signals.loc[:, 'ModelA1':] == 'SELL'
signals = signals[mask1 & mask2]

The Model columns run from A1, A2, A3, A4 ... to G1, G2, G3, G4.

Thanks for your help!

2 answers:

Answer 0: (score: 0)

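# Collect the names of the "Model..." columns into a list
model_cols = [col for col in signals.columns if col.startswith("Model")]
# Keep a row if any of its Model columns holds "BUY" or "SELL"
mask = signals[model_cols].apply(lambda s: "BUY" in s.values or "SELL" in s.values, axis=1)
signals = signals[mask]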

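We first collect the "Model" columns, then build a row-wise mask from your condition and use it to select the rows.

Answer 1: (score: 0)

Could you try this?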
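If the row-wise `apply` turns out to be slow on a larger frame, the same mask can probably be built with vectorized calls instead. This is only a sketch: the small `signals` frame below is a made-up stand-in for the one in the question, and `filtered` is a name chosen for illustration.

```python
import pandas as pd

# Hypothetical miniature of the `signals` frame from the question
signals = pd.DataFrame({
    "Ticker": ["AAPL", "ABMD", "ACAD", "ADSK"],
    "ModelG1": ["STAND", "STAND", "BUY", "STAND"],
    "ModelG3": ["STAND", "SELL", "STAND", "STAND"],
})

# Names of all columns starting with "Model"
model_cols = [c for c in signals.columns if c.startswith("Model")]

# isin() marks every cell that is BUY or SELL;
# any(axis=1) reduces that to one boolean per row
mask = signals[model_cols].isin(["BUY", "SELL"]).any(axis=1)

filtered = signals[mask]
print(filtered)
```

Because `mask` is a boolean Series with one value per row, `signals[mask]` selects whole rows and leaves their values untouched.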

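# df is the DataFrame from the question (called signals there)
cols_to_check = [col for col in df.columns if col.startswith('Model')]

def should_allow(row):
    # any(), not all(): one BUY/SELL among the Model columns is enough to keep the row
    return any(row[col] in ('BUY', 'SELL') for col in cols_to_check)

df = df[df.apply(should_allow, axis=1)]
print(df)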
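As for why the attempt in the question returned NaN everywhere, two things seem to combine. `mask1 & mask2` asks for cells that are both 'BUY' and 'SELL' at once, which is never true. And indexing a DataFrame with a boolean DataFrame (rather than a boolean Series) keeps the shape and blanks out every cell where the mask is False. The sketch below reproduces this on a made-up three-row frame; `both`, `either`, `all_nan` and `rows` are names chosen only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the frame in the question
signals = pd.DataFrame({
    "Ticker": ["AAPL", "ABMD", "ACAD"],
    "ModelA1": ["STAND", "SELL", "BUY"],
    "ModelA2": ["STAND", "STAND", "STAND"],
})

mask1 = signals.loc[:, "ModelA1":] == "BUY"
mask2 = signals.loc[:, "ModelA1":] == "SELL"

# A cell cannot be BUY and SELL at the same time, so this is all False
both = mask1 & mask2

# Indexing with a boolean DataFrame masks cell by cell (like where()),
# so an all-False mask leaves nothing but NaN
all_nan = signals[both]

# Combine with | instead, then reduce to one boolean per row
either = mask1 | mask2
rows = signals[either.any(axis=1)]
print(rows)
```

So the two changes that matter are `|` in place of `&`, and `.any(axis=1)` to turn the cell-level mask into a row-level one before indexing.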