Question

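I'm trying to drop columns that contain duplicate data in pandas. For example, in the data below the columns 'one' and 'three' hold the same data under different column names: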

df1 = pd.DataFrame({'one': [1, 2, 3, 4], 'two': ['a', 'b', 'c', 'd'], 'three': [1, 2, 3, 4]})
   one two  three
0    1   a      1
1    2   b      2
2    3   c      3
3    4   d      4

I would like to get this result:

  one two
0   1   a
1   2   b
2   3   c
3   4   d

The method I'm using right now is:

df2 = df1.T.drop_duplicates().T

But this is too inefficient. Is there a better way?

Any help would be appreciated, thanks.

Answer 1

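I tried to make it a little more efficient like this:

In [935]: df_int = df1.select_dtypes(include=['int'])

In [933]: df_other = df1.select_dtypes(exclude=['int'])

In [949]: if df_int.T.drop_duplicates().shape[0] == 1:
     ...:     res = pd.concat([df_int.iloc[:,0], df_other], axis=1)
     ...:

In [950]: res
Out[950]:
   one two
0    1   a
1    2   b
2    3   c
3    4   d

This transposes only the integer columns. When they all turn out to be identical, it keeps the first one and concatenates it with the remaining columns. Let me know if this helps.

It is also possible to remove the transpose completely.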
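One way to avoid the transpose, as a rough sketch rather than the answer's original code: compare columns pairwise with `Series.equals` and keep only the first column of each identical group. The helper name `drop_duplicate_columns` is made up for illustration. Note that `Series.equals` also requires matching dtypes, so an int column and a float column with the same values would both be kept.

```python
import pandas as pd


def drop_duplicate_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: keep the first of each group of identical columns."""
    keep = []  # positions of the columns kept so far
    for i in range(df.shape[1]):
        col = df.iloc[:, i]
        # Series.equals compares index, values and dtype, but not the column name.
        if not any(col.equals(df.iloc[:, j]) for j in keep):
            keep.append(i)
    # Select by position so this also works if column labels repeat.
    return df.iloc[:, keep]


df1 = pd.DataFrame({'one': [1, 2, 3, 4],
                    'two': ['a', 'b', 'c', 'd'],
                    'three': [1, 2, 3, 4]})
df2 = drop_duplicate_columns(df1)
print(df2)
```

Because nothing is transposed, each column should keep its original dtype, whereas a round trip through `.T` on a mixed-dtype frame tends to leave the columns as object dtype. The comparison is pairwise, so with a very large number of distinct columns it may still be slow.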

