我试图运行Python(pandas)脚本,该脚本将initial_order_DT作为列返回,将initial_order_value作为另一列返回。
输入:
Order ID Customer ID Value Date
00000001 0001 1500 10/01/2017
00000002 0001 1300 11/15/2017
00000003 0001 1200 12/12/2017
00000004 0002 500 12/13/2017
00000005 0003 1750 09/15/2017
00000006 0003 1800 10/01/2017
00000007 0003 1600 11/01/2017
00000008 0003 1400 12/01/2017
输出:
Order ID Customer ID Value Date First_Order_DT First_Value
00000001 0001 1500 // 10/01/2017 1500
00000002 0001 1300 // 10/01/2017 1500
00000003 0001 1200 // 10/01/2017 1500
00000004 0002 500 // 12/13/2017 500
00000005 0003 1750 // 09/15/2017 1750
00000006 0003 1800 // 09/15/2017 1750
00000007 0003 1600 // 09/15/2017 1750
00000008 0003 1400 etc 09/15/2017 1750
答案 0 :(得分:0)
您可以使用groupby
+ transform
-
v = df.groupby('Customer ID')['Date', 'Value'].transform('first').add_prefix('First_')
v
First_Date First_Value
0 10/01/2017 1500
1 10/01/2017 1500
2 10/01/2017 1500
3 12/13/2017 500
4 09/15/2017 1750
5 09/15/2017 1750
6 09/15/2017 1750
7 09/15/2017 1750
将v
与原始df
-
df = pd.concat([df, v], 1)
df
Order ID Customer ID Value Date First_Date First_Value
0 1 1 1500 10/01/2017 10/01/2017 1500
1 2 1 1300 11/15/2017 10/01/2017 1500
2 3 1 1200 12/12/2017 10/01/2017 1500
3 4 2 500 12/13/2017 12/13/2017 500
4 5 3 1750 09/15/2017 09/15/2017 1750
5 6 3 1800 10/01/2017 09/15/2017 1750
6 7 3 1600 11/01/2017 09/15/2017 1750
7 8 3 1400 12/01/2017 09/15/2017 1750