How do I remove paired duplicates in pandas?

Asked: 2018-07-12 10:42:25

Tags: python pandas dataframe duplicates

I have a dataset that contains paired duplicates. Here is my data:

Id    antecedent           descendant
1     one                  two
2     two                  one
3     two                  three
4     one                  three
5     three                two

This is what I need. Since (one, two) is the same pair as (two, one), I want to remove the duplicate pairs:

Id    antecedent           descendant
1     one                  two
3     two                  three
4     one                  three

1 answer:

Answer 0: (score: 3)

Use numpy.sort to sort the values within each row, then use duplicated to build a boolean mask:

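import numpy as np
import pandas as pd

# Sort each pair within its row so (two, one) and (one, two) compare equal;
# reuse df's index so the mask below aligns even if the index is not 0..n-1
df1 = pd.DataFrame(np.sort(df[['antecedent','descendant']], axis=1), index=df.index)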

Or:

# slower alternative: treat each pair as an unordered frozenset
#df1 = df[['antecedent','descendant']].apply(frozenset, axis=1)

# Keep only the first row of each unordered pair
df = df[~df1.duplicated()]
print(df)
   Id antecedent descendant
0   1        one        two
2   3        two      three
3   4        one      three
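
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the numpy.sort approach above. The sample frame is rebuilt from the question's data, and the helper name drop_unordered_pair_duplicates is my own, made up for illustration; it is not a pandas function.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5],
    "antecedent": ["one", "two", "two", "one", "three"],
    "descendant": ["two", "one", "three", "three", "two"],
})


def drop_unordered_pair_duplicates(frame, cols):
    """Keep the first row for each unordered combination of values in `cols`.

    Hypothetical helper written for this example only.
    """
    # Sort the values within each row so (two, one) becomes (one, two).
    sorted_pairs = pd.DataFrame(
        np.sort(frame[cols].to_numpy(), axis=1),
        index=frame.index,  # keep the original index so the mask aligns
    )
    # duplicated() flags every repeat after the first occurrence.
    return frame[~sorted_pairs.duplicated()]


result = drop_unordered_pair_duplicates(df, ["antecedent", "descendant"])
print(result)
```

With the question's data this should keep the rows with Id 1, 3 and 4. Passing index=frame.index is a small precaution: without it the boolean mask gets a fresh 0..n-1 index and would not line up with a frame that has been filtered or re-indexed earlier.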