Grouping with pandas

Time: 2016-05-20 10:00:44

Tags: python excel pandas

I have the following data:

ID_panel                               id_vk    Profile       Audio        Video
03a63f1c5a89fb89fcc4d7cf60e2e6b1    100334438                                1
1ea192ddd5c042d71910de18595553a5    100897602                               0.25
1ea192ddd5c042d71910de18595553a5    123581809                               0.35
0038ccb3a47d51a68de51ffeb9607906    35226722    0.058823529
03a63f1c5a89fb89fcc4d7cf60e2e6b1    100334438   0.003552398
03a63f1c5a89fb89fcc4d7cf60e2e6b1    117790896   0.011545293
18441890537f6d9a0559a5f44c28ff67    39356974                  0.974025974
1ea192ddd5c042d71910de18595553a5    123581809                    0.15

Desired output:

ID_panel                               id_vk    Profile       Audio        Video
03a63f1c5a89fb89fcc4d7cf60e2e6b1    100334438   0.003552398                  1
                                    117790896   0.011545293
1ea192ddd5c042d71910de18595553a5    100897602                               0.25
                                    123581809                  0.15         0.35
0038ccb3a47d51a68de51ffeb9607906    35226722    0.058823529
18441890537f6d9a0559a5f44c28ff67    39356974                  0.974025974

I tried using

print df.groupby(['ID_panel', 'id_vk'])['Profile', 'Audio', 'Video'].apply(lambda x: "{%s}" % ', '.join(x))

but it returns

0038ccb3a47d51a68de51ffeb9607906  13312        {Profile, Audio, Video}
                                  35226722     {Profile, Audio, Video}
03a63f1c5a89fb89fcc4d7cf60e2e6b1  795020       {Profile, Audio, Video}
                                  2412315      {Profile, Audio, Video}

which contains the column names, not the numbers. If I write %f instead of %s, it returns an error.

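So the first version runs but gives the wrong result, and the second raises an error. How can I get the desired output from my data?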

1 answer:

Answer 0 (score: 1)

Here you go:

# The column names have to be passed as a list (note the double brackets).
df.groupby(['ID_panel', 'id_vk'])[['Profile', 'Audio', 'Video']].count()
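Note that count() returns the number of non-null values in each group, not the values themselves. The following is a minimal sketch of one way to get closer to the desired output. It assumes that the blank cells in the question are NaN and that each (ID_panel, id_vk) pair has at most one value per column. Under those assumptions, first() picks the first non-null value of each column within each group. The DataFrame below is rebuilt by hand from the rows in the question.

```python
import numpy as np
import pandas as pd

# Rows copied from the question; blank cells are assumed to be NaN.
a = '03a63f1c5a89fb89fcc4d7cf60e2e6b1'
b = '1ea192ddd5c042d71910de18595553a5'
c = '0038ccb3a47d51a68de51ffeb9607906'
d = '18441890537f6d9a0559a5f44c28ff67'
nan = np.nan

df = pd.DataFrame({
    'ID_panel': [a, b, b, c, a, a, d, b],
    'id_vk': [100334438, 100897602, 123581809, 35226722,
              100334438, 117790896, 39356974, 123581809],
    'Profile': [nan, nan, nan, 0.058823529, 0.003552398, 0.011545293, nan, nan],
    'Audio': [nan, nan, nan, nan, nan, nan, 0.974025974, 0.15],
    'Video': [1, 0.25, 0.35, nan, nan, nan, nan, nan],
})

# Iterating over a DataFrame yields its column labels, which is why
# ', '.join(x) in the question produced "Profile, Audio, Video".
labels = list(df[['Profile', 'Audio', 'Video']])

# first() keeps the first non-null value of each column within each group,
# so rows sharing the same (ID_panel, id_vk) collapse into one row.
result = df.groupby(['ID_panel', 'id_vk'])[['Profile', 'Audio', 'Video']].first()
print(result)
```

By default the groups come out sorted by ID_panel and id_vk, so the row order differs from the desired output in the question. If several values per column can occur within a group, an aggregation such as sum() or max() may be a better fit than first().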