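I am trying to use Python to concatenate ProdID values based on ProdCategory. What I need are the last two columns, MainProdConcat and MainProdConcat_PConly.
Please let me know whether this is possible.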
OrderN0 ProdID ProdCategory ItemNo ProdType MainItem MainProdConcat MainProdConcat_PConly
123334 1 PC 100 Main 100 1,2,3,4,5,6 1,2,3,4
123334 2 PC 110 Option 100 1,2,3,4,5,6 1,2,3,4
123334 3 PC 120 Option 100 1,2,3,4,5,6 1,2,3,4
123334 4 PC 130 Option 100 1,2,3,4,5,6 1,2,3,4
123334 5 Accessories 140 Option 100 1,2,3,4,5,6
123334 6 Accessories 150 Option 100 1,2,3,4,5,6
123334 7 PC 200 Main 200 7,8,9,10,11 7,8,9,10
123334 8 PC 210 Option 200 7,8,9,10,11 7,8,9,10
123334 9 PC 220 Option 200 7,8,9,10,11 7,8,9,10
123334 10 PC 240 Option 200 7,8,9,10,11 7,8,9,10
123334 11 Accessories 260 Option 200 7,8,9,10,11
for index, row in df_OrderNo_WithBase.iterrows():
    orderid = row['Legacy Sales Order Identifier']
    dealid = row['Deal ID']
    # Select the rows of df_Master that belong to this order and deal
    mask = (df_Master['OrderNo'] == orderid) & (df_Master['Deal ID'] == dealid)
    df_Master.loc[mask, 'ProductConcatMain'] = df_Master.loc[mask, 'ProdID'].str.cat(sep=',')
Answer 0 (score: 0)
input = '''
OrderN0 ProdID ProdCategory ItemNo ProdType MainItem MainProdConcat MainProdConcat_PConly
123334 1 PC 100 Main 100 1,2,3,4,5,6 1,2,3,4
123334 2 PC 110 Option 100 1,2,3,4,5,6 1,2,3,4
123334 3 PC 120 Option 100 1,2,3,4,5,6 1,2,3,4
123334 4 PC 130 Option 100 1,2,3,4,5,6 1,2,3,4
123334 5 Accessories 140 Option 100 1,2,3,4,5,6
123334 6 Accessories 150 Option 100 1,2,3,4,5,6
123334 7 PC 200 Main 200 7,8,9,10,11 7,8,9,10
123334 8 PC 210 Option 200 7,8,9,10,11 7,8,9,10
123334 9 PC 220 Option 200 7,8,9,10,11 7,8,9,10
123334 10 PC 240 Option 200 7,8,9,10,11 7,8,9,10
123334 11 Accessories 260 Option 200 7,8,9,10,11'''
from itertools import groupby

table = [x.split() for x in input.split("\n")]
heading = table[1]  # table[0] is empty because the string starts with a newline
data = [dict(zip(heading, x)) for x in table[2:]]
for x, y in groupby(data, key=lambda x: x['MainItem']):
    y = list(y)
    MainProdConcat = ','.join([z['ProdID'] for z in y])
    MainProdConcat_PConly = ','.join([z['ProdID'] for z in y if z['ProdCategory'] == 'PC'])
    for t in y:
        # Python 3 print calls; end=' ' keeps the next value on the same line
        print(t['ProdID'], MainProdConcat, end=' ')
        if t['ProdCategory'] == 'PC':
            print(MainProdConcat_PConly)
        else:
            print()
Output:
1 1,2,3,4,5,6 1,2,3,4
2 1,2,3,4,5,6 1,2,3,4
3 1,2,3,4,5,6 1,2,3,4
4 1,2,3,4,5,6 1,2,3,4
5 1,2,3,4,5,6
6 1,2,3,4,5,6
7 7,8,9,10,11 7,8,9,10
8 7,8,9,10,11 7,8,9,10
9 7,8,9,10,11 7,8,9,10
10 7,8,9,10,11 7,8,9,10
11 7,8,9,10,11
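One caveat with the approach above: itertools.groupby only merges consecutive items that share a key, so it relies on the rows already being ordered by MainItem, as they are in the sample. The following is a minimal sketch of that behavior; the three-row `rows` list and the helper `concat_by_main_item` are made up for illustration and are not part of the original data.

```python
from itertools import groupby

# Hypothetical rows: the two MainItem "100" rows are NOT adjacent.
rows = [
    {"ProdID": "1", "MainItem": "100"},
    {"ProdID": "7", "MainItem": "200"},
    {"ProdID": "2", "MainItem": "100"},
]


def concat_by_main_item(rows):
    """Return (MainItem, 'id,id,...') for each consecutive run groupby yields."""
    return [
        (key, ",".join(r["ProdID"] for r in group))
        for key, group in groupby(rows, key=lambda r: r["MainItem"])
    ]


unsorted_result = concat_by_main_item(rows)
# sorted() is stable, so rows keep their relative order within each MainItem.
sorted_result = concat_by_main_item(sorted(rows, key=lambda r: r["MainItem"]))

print(unsorted_result)  # [('100', '1'), ('200', '7'), ('100', '2')]
print(sorted_result)    # [('100', '1,2'), ('200', '7')]
```

If the real data is not guaranteed to be ordered, sorting by the grouping key first (as in the second call) is probably the simplest safeguard.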
Answer 1 (score: 0)
Given print(df):
OrderN0 ProdID ProdCategory ItemNo ProdType MainItem
0 123334 1 PC 100 Main 100
1 123334 2 PC 110 Option 100
2 123334 3 PC 120 Option 100
3 123334 4 PC 130 Option 100
4 123334 5 Accessories 140 Option 100
5 123334 6 Accessories 150 Option 100
6 123334 7 PC 200 Main 200
7 123334 8 PC 210 Option 200
8 123334 9 PC 220 Option 200
9 123334 10 PC 240 Option 200
10 123334 11 Accessories 260 Option 200
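We can then use groupby and transform to populate 'MainProdConcat' and 'MainProdConcat_PConly'. The grouping key, df.ProdType.eq('Main').cumsum(), increases by one at every 'Main' row, so each main item and the options that follow it share the same label: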
df['MainProdConcat_PConly'] = (df[df.ProdCategory == 'PC']
                               .groupby([df.ProdType.eq('Main').cumsum()])['ProdID']
                               .transform(lambda x: ','.join(x.astype(str))))
df['MainProdConcat'] = (df.groupby([df.ProdType.eq('Main').cumsum()])['ProdID']
                        .transform(lambda x: ','.join(x.astype(str))))
Output of print(df):
OrderN0 ProdID ProdCategory ItemNo ProdType MainItem MainProdConcat_PConly MainProdConcat
0 123334 1 PC 100 Main 100 1,2,3,4 1,2,3,4,5,6
1 123334 2 PC 110 Option 100 1,2,3,4 1,2,3,4,5,6
2 123334 3 PC 120 Option 100 1,2,3,4 1,2,3,4,5,6
3 123334 4 PC 130 Option 100 1,2,3,4 1,2,3,4,5,6
4 123334 5 Accessories 140 Option 100 NaN 1,2,3,4,5,6
5 123334 6 Accessories 150 Option 100 NaN 1,2,3,4,5,6
6 123334 7 PC 200 Main 200 7,8,9,10 7,8,9,10,11
7 123334 8 PC 210 Option 200 7,8,9,10 7,8,9,10,11
8 123334 9 PC 220 Option 200 7,8,9,10 7,8,9,10,11
9 123334 10 PC 240 Option 200 7,8,9,10 7,8,9,10,11
10 123334 11 Accessories 260 Option 200 NaN 7,8,9,10,11
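As a possible variation, and not something the answer itself proposes: the sample data already carries a MainItem column, so grouping on OrderN0 and MainItem directly might be less dependent on row order than the cumulative-sum key. The sketch below rebuilds the sample table by hand; the names `ids`, `group_keys`, `is_pc` and `pc_keys` are illustrative only.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "OrderN0": [123334] * 11,
    "ProdID": list(range(1, 12)),
    "ProdCategory": ["PC"] * 4 + ["Accessories"] * 2 + ["PC"] * 4 + ["Accessories"],
    "ItemNo": [100, 110, 120, 130, 140, 150, 200, 210, 220, 240, 260],
    "ProdType": ["Main"] + ["Option"] * 5 + ["Main"] + ["Option"] * 4,
    "MainItem": [100] * 6 + [200] * 5,
})

# Convert once to strings so that ",".join can be applied to each group.
ids = df["ProdID"].astype(str)

# All products belonging to the same order and main item.
group_keys = [df["OrderN0"], df["MainItem"]]
df["MainProdConcat"] = ids.groupby(group_keys).transform(",".join)

# PC rows only; the assignment aligns on the index, so other rows become NaN.
is_pc = df["ProdCategory"].eq("PC")
pc_keys = [df.loc[is_pc, "OrderN0"], df.loc[is_pc, "MainItem"]]
df["MainProdConcat_PConly"] = ids[is_pc].groupby(pc_keys).transform(",".join)

print(df[["ProdID", "MainProdConcat", "MainProdConcat_PConly"]])
```

Including OrderN0 in the key is an assumption about the real data: it keeps main items from different orders apart if ItemNo values happen to repeat across orders.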