Question

I have a dataframe that I created using the "duplicated" function. It looks like this:

IX  Campaign_Response   Gender  Presence_of_Child   Marital_Status  Age_Group_ID    Cluster Income_Group    Payer_Type  Race    dwell_type  education   Region  is_duplicated
 7         0               0              1                1             1              18        D                 NK  W           S           2           3   True
27          0              0              1                1             2              13        E                 PK  W           S           5             4 True
43          0              0              1                 1            2              8         H                  NK H            S           5           3  True
(The rest of these rows are spaced roughly as above.)
80  1   0   1   1   4   7   F   NK  H   S   1   3   True
81  1   0   1   1   4   7   F   NK  H   S   1   3   True
82  1   0   1   1   4   7   F   NK  H   S   1   3   True

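What I would like is to find the index numbers of the duplicated rows, together with each instance of the row. I want to be able to see every occurrence of a duplicated row along with its contents, so I can see what the characteristics of the duplicated rows are.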

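I was thinking of some kind of groupby, but that eliminates the index numbers. I also need to see Campaign_Response, which was not included in the "find duplicates" step: I expect some otherwise identical records to have different responses, and of course different index numbers.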

So the desired output might look like the following (any alternative way of displaying it is fine):

80  1   0   1   1   4   7   F   NK  H   S   1   3   True
81  1   0   1   1   4   7   F   NK  H   S   1   3   True *** <<< indicating dupe of prior record (as many occurrences as required)
82  1   0   1   1   4   7   F   NK  H   S   1   3   True
391  1   0   1   1   4   7   F   NK  H   S   1   3   True****
508  1   0   1   1   4   7   F   NK  H   S   1   3   True****
83  1   0   1   1   4   7   F   NK  H   S   1   3   True
108  1   0   1   1   4   7   F   NK  H   S   1   3   True *** another dupe

Answer 1

Assuming your DataFrame is named df, you can get the index values of the duplicates simply with:

idx_dups = df[df.duplicated()].index
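
Note that df.duplicated() with its default keep='first' flags only the second and later occurrences, and it compares every column, Campaign_Response included. To see every occurrence of a duplicate with its original index and its response, one possible approach is to pass keep=False and restrict the comparison with subset. The sketch below uses a small made-up frame; the values and the names feature_cols, dups, and dup_index_lists are illustrative assumptions, not taken from the question's data.

```python
import pandas as pd

# Hypothetical toy data loosely modeled on the question (values are made up).
df = pd.DataFrame(
    {
        "Campaign_Response": [0, 1, 1, 0, 0],
        "Gender": [0, 0, 0, 1, 0],
        "Cluster": [18, 7, 7, 7, 7],
        "Income_Group": ["D", "F", "F", "F", "F"],
    },
    index=[7, 80, 81, 83, 391],
)

# Compare on every column except the response, so that otherwise
# identical records with different responses still count as duplicates.
feature_cols = [c for c in df.columns if c != "Campaign_Response"]

# keep=False flags every member of a duplicate group, not only the repeats.
mask = df.duplicated(subset=feature_cols, keep=False)

# Sorting by the feature columns places members of a group next to each other;
# the original index labels and Campaign_Response stay visible.
dups = df[mask].sort_values(feature_cols)
print(dups)

# One list of original index labels per duplicate group.
dup_index_lists = [list(idx) for idx in dups.groupby(feature_cols).groups.values()]
print(dup_index_lists)
```

Because the filtering is done with a boolean mask rather than an aggregation, the original index labels survive, which addresses the concern that groupby drops them. Here groupby is used only at the end to collect the labels belonging to each group.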
