I have a DataFrame like this:
import pandas as pd

dict = {'col_a': ['A', 'A', 'A', 'A', 'B', 'B', 'C', 'C'],
        'col_b': ['xyz', 'xyz', 'xyw', 'xyw', 'abc', 'abe', 'pqr', 'pqr']}
dt = pd.DataFrame(dict)
print(dt)

  col_a col_b
0     A   xyz
1     A   xyz
2     A   xyw
3     A   xyw
4     B   abc
5     B   abe
6     C   pqr
7     C   pqr
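I want to get all rows that are duplicated on both col_a and col_b, but only when that col_a value never appears with a different col_b value. For example: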
  col_a col_b
6     C   pqr
7     C   pqr
Note that a plain duplicated call is not enough, because it also returns the A rows:
dt[dt.duplicated(subset=['col_a', 'col_b'], keep=False)]

  col_a col_b
0     A   xyz
1     A   xyz
2     A   xyw
3     A   xyw
6     C   pqr
7     C   pqr
Thanks for your help and attention.
Answer 0 (score: 3)
It seems you need
dt[dt.duplicated(keep=False)&(dt.groupby(['col_a'])['col_b'].transform('nunique').eq(1))]
Out[662]:
col_a col_b
6 C pqr
7 C pqr
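To make the two conditions in that one-liner easier to follow, here is a minimal sketch that splits it into two named boolean masks. The names `is_dup`, `single_b` and `result` are mine, chosen for illustration, and the frame is rebuilt from the question's data so the snippet runs on its own.

```python
import pandas as pd

# Same data as in the question
dt = pd.DataFrame({
    'col_a': ['A', 'A', 'A', 'A', 'B', 'B', 'C', 'C'],
    'col_b': ['xyz', 'xyz', 'xyw', 'xyw', 'abc', 'abe', 'pqr', 'pqr'],
})

# Mask 1: rows that have at least one exact copy (all columns compared);
# keep=False marks every member of a duplicate set, not only the later ones
is_dup = dt.duplicated(keep=False)

# Mask 2: rows whose col_a group contains exactly one distinct col_b value;
# transform('nunique') broadcasts the per-group count back onto each row
single_b = dt.groupby('col_a')['col_b'].transform('nunique').eq(1)

# Keep only rows that satisfy both conditions
result = dt[is_dup & single_b]
print(result)
```

Both masks seem to be needed: `is_dup` alone keeps the A rows, while `single_b` alone would also keep a col_a value that occurs in just one row, since such a group has one distinct col_b but no duplicate.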