Question

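I have a pandas DataFrame containing information about tracks and how they were matched. There are many columns, but I want to drop rows based on these columns: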

     ID          Hard_Match  Soft_Match
75   205487000      False       True
91   205487000      False       True
47   205487000       True      False
0    209845000       True      False
62   210842000       True      False
81   212085000      False       True
96   229132000      False      False
90   229550000      False      False
66   229758000       True      False

I want to delete the "Soft_Match" rows if a "Hard_Match" row with the same ID exists:

for each row in dataframe:
    if row[hardmatched] == True somewhere else:
       remove row
    else:
       keep row

So, in the example above, ID 205487000 would be dropped at indices 91 and 75 and kept at index 47.

Answer 1

IIUC, the following does what you want:

In [112]:
ids = df.loc[df['Soft_Match'] == True, 'ID'].unique()
ids

Out[112]:
array([205487000, 212085000], dtype=int64)

In [113]:    
hard_matches = df.loc[(df['Hard_Match'] == True) & df['ID'].isin(ids), 'ID']
hard_matches

Out[113]:
47    205487000
Name: ID, dtype: int64

In [116]:
df.loc[~((df['ID'].isin(hard_matches)) & (df['Hard_Match'] == False))]

Out[116]:
           ID Hard_Match Soft_Match
47  205487000       True      False
0   209845000       True      False
62  210842000       True      False
81  212085000      False       True
96  229132000      False      False
90  229550000      False      False
66  229758000       True      False

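So, first we find the IDs where 'Soft_Match' is True, then we find where 'Hard_Match' is True and the ID matches one of those IDs, and finally we filter out the rows where the ID matches and 'Hard_Match' is False.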
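For comparison, here is a minimal one-step sketch of the same idea. It is not the approach used in the answer above but an alternative built on `groupby(...).transform("any")`. The frame is rebuilt from the sample in the question, and the names `has_hard` and `result` are only illustrative. Note that it filters on 'Soft_Match' being True rather than 'Hard_Match' being False, which gives the same rows for this sample but could differ on other data.

```python
import pandas as pd

# Sample data copied from the question, with the original index labels
df = pd.DataFrame(
    {
        "ID": [205487000, 205487000, 205487000, 209845000, 210842000,
               212085000, 229132000, 229550000, 229758000],
        "Hard_Match": [False, False, True, True, True,
                       False, False, False, True],
        "Soft_Match": [True, True, False, False, False,
                       True, False, False, False],
    },
    index=[75, 91, 47, 0, 62, 81, 96, 90, 66],
)

# For every row, True if any row sharing its ID is a hard match
has_hard = df.groupby("ID")["Hard_Match"].transform("any")

# Keep everything except soft-match rows whose ID also has a hard match
result = df[~(df["Soft_Match"] & has_hard)]
print(result)
```

On the sample data this drops the rows at index 75 and 91 and keeps index 81, because ID 212085000 has no hard match.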

Pandas: drop rows based on the values of other rows
