Merging multiple values and sorting

Time: 2019-06-05 17:20:27

Tags: python pandas

I have the detailed invoice data shown below.

+------------+---------------+--------+-----+-------+------------+-------------+
| Invoice No | Invoice Total | Item # | qty | price | Item Total | Inventory # |
+------------+---------------+--------+-----+-------+------------+-------------+
|          1 |            42 |    123 |   1 |    10 |         10 |           0 |
|          1 |            42 |    234 |   2 |    12 |         24 |          10 |
|          1 |            42 |    345 |   1 |     8 |          8 |           0 |
|          2 |           224 |    123 |   3 |    10 |         30 |           4 |
|          2 |           220 |    234 |   2 |    12 |         24 |           3 |
|          2 |           220 |    345 |   8 |     1 |          8 |           0 |
|          2 |           220 |    456 |  10 |    12 |        120 |           2 |
|          2 |           220 |    567 |   7 |     6 |         42 |           4 |
|          3 |            34 |    123 |   1 |    10 |         10 |          10 |
|          3 |            34 |    234 |   2 |    12 |         24 |           0 |
|          4 |            30 |    123 |   1 |    10 |         10 |           0 |
|          4 |            30 |    234 |   2 |    12 |         24 |           3 |
+------------+---------------+--------+-----+-------+------------+-------------+

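For each unique Invoice No, I want to concatenate the Inventory # values and replace that column with the concatenated values, sorted in ascending order from left to right. Any duplicate values should also be removed. For example, Invoice No 2 has Inventory # 4 twice.

The result I want is as follows: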

+------------+---------------+--------+-----+-------+------------+-------------+
| Invoice No | Invoice Total | Item # | qty | price | Item Total | Inventory # |
+------------+---------------+--------+-----+-------+------------+-------------+
|          1 |            42 |    123 |   1 |    10 |         10 | 0,10        |
|          1 |            42 |    234 |   2 |    12 |         24 | 0,10        |
|          1 |            42 |    345 |   1 |     8 |          8 | 0,10        |
|          2 |           224 |    123 |   3 |    10 |         30 | 0,2,3,4     |
|          2 |           220 |    234 |   2 |    12 |         24 | 0,2,3,4     |
|          2 |           220 |    345 |   8 |     1 |          8 | 0,2,3,4     |
|          2 |           220 |    456 |  10 |    12 |        120 | 0,2,3,4     |
|          2 |           220 |    567 |   7 |     6 |         42 | 0,2,3,4     |
|          3 |            34 |    123 |   1 |    10 |         10 | 0,10        |
|          3 |            34 |    234 |   2 |    12 |         24 | 0,10        |
|          4 |            30 |    123 |   1 |    10 |         10 | 0,3         |
|          4 |            30 |    234 |   2 |    12 |         24 | 0,3         |
+------------+---------------+--------+-----+-------+------------+-------------+

Please guide me on how to solve this.

1 answer:

Answer 0 (score: 3)

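I would use transform: set removes the duplicates, sorted puts the values in ascending order (a set on its own is unordered, so it does not sort), and then all that remains is the join. Sorting happens before the conversion to strings so that 2 comes before 10.

df['Inventory #'] = df.groupby('Invoice No')['Inventory #'].\
                      transform(lambda x: ','.join(map(str, sorted(set(x)))))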
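For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds only the Invoice No, Item # and Inventory # columns from the question's table, and the helper name join_sorted_unique is made up for illustration. It assumes Inventory # holds integers, so the sort is numeric rather than lexicographic.

```python
import pandas as pd

# Subset of the question's table: only the columns needed for the grouping.
df = pd.DataFrame({
    "Invoice No": [1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4],
    "Item #": [123, 234, 345, 123, 234, 345, 456, 567, 123, 234, 123, 234],
    "Inventory #": [0, 10, 0, 4, 3, 0, 2, 4, 10, 0, 0, 3],
})


def join_sorted_unique(values):
    """Drop duplicates, sort numerically, then join as a comma-separated string."""
    return ",".join(str(v) for v in sorted(set(values)))


# transform broadcasts the single string back to every row of its group.
df["Inventory #"] = (
    df.groupby("Invoice No")["Inventory #"].transform(join_sorted_unique)
)

print(df)
```

Every row of an invoice ends up with the same string, for example 0,2,3,4 on all five rows of Invoice No 2.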