Question

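I have a set of x, y, z data like this:

I want to select the duplicates (by the x and y columns) and remove them, like this:

Not duplicated:

Duplicates:

Then I want to do the same thing again (recursively):

Not duplicated: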

x  y  z
1  1  2
2  2  4

Duplicates:

x  y  z
1  1  3

How can I achieve this (keep excluding subsets until no subset is left)? This is what I have so far:

notDuplicate = df.drop_duplicates(subset=['x', 'y'], keep='first')

Thanks a lot!

Answer 1

Never mind, the pandas function duplicated() is what I was looking for.
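
For anyone landing here later, this is a rough sketch of how duplicated() could be used for the repeated split described in the question. The helper name peel_duplicates and the sample DataFrame are made up for illustration (the original data was not included in the post), so treat it as one possible approach rather than the exact code I ended up with. It uses a loop instead of actual recursion.

```python
import pandas as pd

def peel_duplicates(df, subset):
    """Repeatedly split df into first occurrences and leftover duplicates.

    Hypothetical helper: returns a list of DataFrames, one per round.
    """
    layers = []
    remaining = df
    while not remaining.empty:
        # True for every row whose (x, y) key has already appeared above it
        mask = remaining.duplicated(subset=subset, keep='first')
        layers.append(remaining[~mask])   # the "not duplicated" rows of this round
        remaining = remaining[mask]       # the "duplicates" go into the next round
    return layers

# Assumed sample data, not the data from the question
df = pd.DataFrame({
    'x': [1, 1, 1, 2, 2, 3],
    'y': [1, 1, 1, 2, 2, 3],
    'z': [1, 2, 3, 3, 4, 5],
})

layers = peel_duplicates(df, ['x', 'y'])
for i, layer in enumerate(layers):
    print(f"round {i}:")
    print(layer.to_string(index=False))
```

Each round strips at least one row, so the loop always terminates. With keep='first', the row order decides which row of a duplicate group survives each round, so sort the frame first if that matters.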