Python Pandas - Remove duplicates with reversed values

Time: 2018-03-23 14:41:04

Tags: python pandas duplicates

Suppose df looks like this:

import pandas as pd
df = pd.DataFrame({"col1": ["banana", "apple", "grapes", "banana"],
                   "col2": ["apple", "banana", "apple", "grapes"]})
col1    col2
banana  apple
apple   banana
grapes  apple
banana  grapes

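How can we remove the reversed duplicates, i.e. treat the banana - apple and apple - banana combinations as the same pair and keep only one of them?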

I tried:

df["col_to_check"] = df.apply(lambda x: set(x), axis=1).astype(str)
idx_to_remove = df[df.duplicated(["col_to_check"])].index
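
One alternative to the attempt above, as a minimal sketch: instead of relying on the string form of a set (whose element order is not guaranteed), sort the two values within each row and use the sorted pair as the comparison key. The names key and deduped are introduced here for illustration only, and keeping the first occurrence of each pair is an assumption about the desired result.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": ["banana", "apple", "grapes", "banana"],
                   "col2": ["apple", "banana", "apple", "grapes"]})

# Sort the values within each row so that (banana, apple) and
# (apple, banana) both become (apple, banana).
key = pd.DataFrame(np.sort(df[["col1", "col2"]].to_numpy(), axis=1),
                   index=df.index)

# Keep only the first occurrence of each unordered pair (assumed policy).
deduped = df[~key.duplicated()]
print(deduped)
```

With the sample data this keeps rows 0, 2 and 3 and drops row 1 (apple, banana), since it is the reverse of row 0. Because the key is built separately, no helper column is added to df.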

Any other suggestions?

0 answers:

No answers yet