Concatenating consecutive rows by grouping on a column

Time: 2018-05-01 10:31:32

Tags: python pandas

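I am trying to use Python to concatenate ProdID values based on ProdCategory. What I need are the last two columns, MainProdConcat and MainProdConcat_PConly.

Please let me know whether this is possible.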

OrderN0 ProdID  ProdCategory    ItemNo  ProdType    MainItem MainProdConcat MainProdConcat_PConly
123334  1   PC  100 Main    100 1,2,3,4,5,6 1,2,3,4
123334  2   PC  110 Option  100 1,2,3,4,5,6 1,2,3,4
123334  3   PC  120 Option  100 1,2,3,4,5,6 1,2,3,4
123334  4   PC  130 Option  100 1,2,3,4,5,6 1,2,3,4
123334  5   Accessories 140 Option  100 1,2,3,4,5,6 
123334  6   Accessories 150 Option  100 1,2,3,4,5,6 
123334  7   PC  200 Main    200 7,8,9,10,11 7,8,9,10
123334  8   PC  210 Option  200 7,8,9,10,11 7,8,9,10
123334  9   PC  220 Option  200 7,8,9,10,11 7,8,9,10
123334  10  PC  240 Option  200 7,8,9,10,11 7,8,9,10
123334  11  Accessories 260 Option  200 7,8,9,10,11 

for index, row in df_OrderNo_WithBase.iterrows():
    orderid = row['Legacy Sales Order Identifier']
    dealid = row['Deal ID']
    # Rows of df_Master that belong to this order/deal pair
    mask = (df_Master['OrderNo'] == orderid) & (df_Master['Deal ID'] == dealid)
    # Cast to str so .str.cat also works when ProdID is numeric
    df_Master.loc[mask, 'ProductConcatMain'] = df_Master.loc[mask, 'ProdID'].astype(str).str.cat(sep=',')

2 Answers:

Answer 0: (score: 0)

input = '''
OrderN0 ProdID  ProdCategory    ItemNo  ProdType    MainItem MainProdConcat MainProdConcat_PConly
123334  1   PC  100 Main    100 1,2,3,4,5,6 1,2,3,4
123334  2   PC  110 Option  100 1,2,3,4,5,6 1,2,3,4
123334  3   PC  120 Option  100 1,2,3,4,5,6 1,2,3,4
123334  4   PC  130 Option  100 1,2,3,4,5,6 1,2,3,4
123334  5   Accessories 140 Option  100 1,2,3,4,5,6 
123334  6   Accessories 150 Option  100 1,2,3,4,5,6 
123334  7   PC  200 Main    200 7,8,9,10,11 7,8,9,10
123334  8   PC  210 Option  200 7,8,9,10,11 7,8,9,10
123334  9   PC  220 Option  200 7,8,9,10,11 7,8,9,10
123334  10  PC  240 Option  200 7,8,9,10,11 7,8,9,10
123334  11  Accessories 260 Option  200 7,8,9,10,11'''

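from itertools import groupby

# Split the text into rows of fields; table[0] is the empty line before the header.
table = [x.split() for x in input.split("\n")]
heading = table[1]
data = [dict(zip(heading, x)) for x in table[2:]]

# groupby only merges adjacent rows, so the data must already be ordered by MainItem.
for _, rows in groupby(data, key=lambda x: x['MainItem']):
    rows = list(rows)
    MainProdConcat = ','.join(z['ProdID'] for z in rows)
    MainProdConcat_PConly = ','.join(z['ProdID'] for z in rows if z['ProdCategory'] == 'PC')
    for t in rows:
        # Only PC rows get the PC-only concatenation.
        if t['ProdCategory'] == 'PC':
            print(t['ProdID'], MainProdConcat, MainProdConcat_PConly)
        else:
            print(t['ProdID'], MainProdConcat)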

Output:

1 1,2,3,4,5,6 1,2,3,4
2 1,2,3,4,5,6 1,2,3,4
3 1,2,3,4,5,6 1,2,3,4
4 1,2,3,4,5,6 1,2,3,4
5 1,2,3,4,5,6
6 1,2,3,4,5,6
7 7,8,9,10,11 7,8,9,10
8 7,8,9,10,11 7,8,9,10
9 7,8,9,10,11 7,8,9,10
10 7,8,9,10,11 7,8,9,10
11 7,8,9,10,11
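
One thing to keep in mind with this approach: itertools.groupby only merges adjacent items that share a key, so it relies on the rows already being ordered by MainItem, as they are in the sample. A minimal sketch of the difference, using made-up (MainItem, ProdID) pairs and a hypothetical helper concat_by_key that are not part of the question:

```python
from itertools import groupby

# Hypothetical (MainItem, ProdID) pairs, deliberately out of order.
rows = [("100", "1"), ("200", "7"), ("100", "2"), ("200", "8")]

def concat_by_key(pairs):
    """Join ProdIDs for each run of adjacent pairs that share the same key."""
    return [(key, ",".join(prod_id for _, prod_id in grp))
            for key, grp in groupby(pairs, key=lambda r: r[0])]

# Without sorting, every change of key starts a new group.
unsorted_result = concat_by_key(rows)

# sorted() is stable, so ProdIDs keep their relative order within each key.
sorted_result = concat_by_key(sorted(rows, key=lambda r: r[0]))

print(unsorted_result)
print(sorted_result)
```

If the real data is not guaranteed to be ordered, sorting by the grouping key first (as in the second call) is probably the safer choice.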

Answer 1: (score: 0)

Given print(df):

    OrderN0  ProdID ProdCategory  ItemNo ProdType  MainItem
0    123334       1           PC     100     Main       100
1    123334       2           PC     110   Option       100
2    123334       3           PC     120   Option       100
3    123334       4           PC     130   Option       100
4    123334       5  Accessories     140   Option       100
5    123334       6  Accessories     150   Option       100
6    123334       7           PC     200     Main       200
7    123334       8           PC     210   Option       200
8    123334       9           PC     220   Option       200
9    123334      10           PC     240   Option       200
10   123334      11  Accessories     260   Option       200

Then we can use the following to populate 'MainProdConcat' and 'MainProdConcat_PConly':

df['MainProdConcat_PConly'] = (df[df.ProdCategory == 'PC']
                                 .groupby([df.ProdType.eq('Main').cumsum()])['ProdID']
                                 .transform(lambda x: ','.join(x.astype(str))))

df['MainProdConcat'] = (df.groupby([df.ProdType.eq('Main').cumsum()])['ProdID']
                          .transform(lambda x: ','.join(x.astype(str))))

Output of print(df):

    OrderN0  ProdID ProdCategory  ItemNo ProdType  MainItem MainProdConcat_PConly MainProdConcat
0    123334       1           PC     100     Main       100               1,2,3,4    1,2,3,4,5,6
1    123334       2           PC     110   Option       100               1,2,3,4    1,2,3,4,5,6
2    123334       3           PC     120   Option       100               1,2,3,4    1,2,3,4,5,6
3    123334       4           PC     130   Option       100               1,2,3,4    1,2,3,4,5,6
4    123334       5  Accessories     140   Option       100                   NaN    1,2,3,4,5,6
5    123334       6  Accessories     150   Option       100                   NaN    1,2,3,4,5,6
6    123334       7           PC     200     Main       200              7,8,9,10    7,8,9,10,11
7    123334       8           PC     210   Option       200              7,8,9,10    7,8,9,10,11
8    123334       9           PC     220   Option       200              7,8,9,10    7,8,9,10,11
9    123334      10           PC     240   Option       200              7,8,9,10    7,8,9,10,11
10   123334      11  Accessories     260   Option       200                   NaN    7,8,9,10,11
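
The grouping key in this answer, df.ProdType.eq('Main').cumsum(), works because every 'Main' row increments a running count, so each Main row and the Option rows after it share one block number. The sketch below rebuilds the sample frame to make that key visible. It is a variation rather than the answer's exact code: it converts ProdID to strings up front, and it assumes that blanks (as in the question's table) are preferred over NaN for non-PC rows. The names block, ids and is_pc are introduced here for illustration.

```python
import pandas as pd

# Rebuild the sample frame from the question (input columns only).
df = pd.DataFrame({
    "OrderN0": [123334] * 11,
    "ProdID": list(range(1, 12)),
    "ProdCategory": ["PC"] * 4 + ["Accessories"] * 2 + ["PC"] * 4 + ["Accessories"],
    "ItemNo": [100, 110, 120, 130, 140, 150, 200, 210, 220, 240, 260],
    "ProdType": ["Main"] + ["Option"] * 5 + ["Main"] + ["Option"] * 4,
    "MainItem": [100] * 6 + [200] * 5,
})

# Each 'Main' row starts a new block, so the running count labels the blocks.
block = df["ProdType"].eq("Main").cumsum()
print(block.tolist())

# Work on string IDs so that ",".join can be applied to each group directly.
ids = df["ProdID"].astype(str)
df["MainProdConcat"] = ids.groupby(block).transform(",".join)

# Restrict to PC rows, then align back to the full index.
# Assumption: non-PC rows should be blank rather than NaN.
is_pc = df["ProdCategory"].eq("PC")
df["MainProdConcat_PConly"] = (
    ids[is_pc].groupby(block[is_pc]).transform(",".join)
    .reindex(df.index, fill_value="")
)

print(df[["ProdID", "MainProdConcat", "MainProdConcat_PConly"]])
```

In this sample the MainItem column is also constant within each block, so grouping by OrderN0 and MainItem would likely give the same result without depending on row order.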