I have detailed invoice data, shown below.
+------------+---------------+--------+-----+-------+------------+-------------+
| Invoice No | Invoice Total | Item # | qty | price | Item Total | Inventory # |
+------------+---------------+--------+-----+-------+------------+-------------+
| 1          | 42            | 123    | 1   | 10    | 10         | 0           |
| 1          | 42            | 234    | 2   | 12    | 24         | 10          |
| 1          | 42            | 345    | 1   | 8     | 8          | 0           |
| 2          | 224           | 123    | 3   | 10    | 30         | 4           |
| 2          | 220           | 234    | 2   | 12    | 24         | 3           |
| 2          | 220           | 345    | 8   | 1     | 8          | 0           |
| 2          | 220           | 456    | 10  | 12    | 120        | 2           |
| 2          | 220           | 567    | 7   | 6     | 42         | 4           |
| 3          | 34            | 123    | 1   | 10    | 10         | 10          |
| 3          | 34            | 234    | 2   | 12    | 24         | 0           |
| 4          | 30            | 123    | 1   | 10    | 10         | 0           |
| 4          | 30            | 234    | 2   | 12    | 24         | 3           |
+------------+---------------+--------+-----+-------+------------+-------------+
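For each unique Invoice No, I want to concatenate its Inventory # values and replace the column with the concatenated, sorted result (ascending from left to right). Any duplicate values should also be removed. For example, Invoice No 2 contains Inventory # 4 twice.
My desired result is as follows: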
+------------+---------------+--------+-----+-------+------------+-------------+
| Invoice No | Invoice Total | Item # | qty | price | Item Total | Inventory # |
+------------+---------------+--------+-----+-------+------------+-------------+
| 1          | 42            | 123    | 1   | 10    | 10         | 0,10        |
| 1          | 42            | 234    | 2   | 12    | 24         | 0,10        |
| 1          | 42            | 345    | 1   | 8     | 8          | 0,10        |
| 2          | 224           | 123    | 3   | 10    | 30         | 0,2,3,4     |
| 2          | 220           | 234    | 2   | 12    | 24         | 0,2,3,4     |
| 2          | 220           | 345    | 8   | 1     | 8          | 0,2,3,4     |
| 2          | 220           | 456    | 10  | 12    | 120        | 0,2,3,4     |
| 2          | 220           | 567    | 7   | 6     | 42         | 0,2,3,4     |
| 3          | 34            | 123    | 1   | 10    | 10         | 0,10        |
| 3          | 34            | 234    | 2   | 12    | 24         | 0,10        |
| 4          | 30            | 123    | 1   | 10    | 10         | 0,3         |
| 4          | 30            | 234    | 2   | 12    | 24         | 0,3         |
+------------+---------------+--------+-----+-------+------------+-------------+
Please guide me on how to solve this.
Answer 0 (score: 3)
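I would use transform. set removes the duplicates, but it does not guarantee any ordering, so the values go through sorted before the join:
df['Inventory #'] = df.groupby('Invoice No')['Inventory #'].transform(
    # deduplicate, sort numerically, then join into one comma-separated string
    lambda x: ','.join(map(str, sorted(set(x)))))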
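To try the approach against the sample data from the question, here is a minimal, self-contained sketch. It rebuilds only the two columns that matter (Invoice No and Inventory #) and assumes the inventory values are integers; the helper name join_unique_sorted is made up for illustration. The values are sorted while still numeric and only then converted to strings, so that 10 sorts after 2 rather than before it.

```python
import pandas as pd

# Sample data: only the two relevant columns from the question's table.
df = pd.DataFrame({
    "Invoice No":  [1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4],
    "Inventory #": [0, 10, 0, 4, 3, 0, 2, 4, 10, 0, 0, 3],
})


def join_unique_sorted(values: pd.Series) -> str:
    """Hypothetical helper: drop duplicates, sort numerically, join with commas."""
    return ",".join(str(v) for v in sorted(set(values)))


# transform broadcasts the single string returned per group
# back onto every row of that group.
df["Inventory #"] = (
    df.groupby("Invoice No")["Inventory #"].transform(join_unique_sorted)
)

print(df.to_string(index=False))
```

If the Inventory # column were already stored as strings, sorted would compare lexicographically instead, so a numeric conversion (or a key=int argument) would likely be needed first.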