My input DataFrames are as follows:
import pandas as pd
Input1 = pd.DataFrame({'LOT': {0: 'A1', 1: 'A2', 2: 'A3', 3: 'A4', 4: 'A5'},
'OPERATION': {0: 100.0, 1: 100.0, 2: 100.0, 3: 100.0, 4: 100.0},
'TXN_DATE': {0: '12/6/2016',
1: '12/5/2016',
2: '11/30/2016',
3: '11/27/2016',
4: '11/22/2016'}})
Input2 = pd.DataFrame({'LOT': {0: 'B1', 1: 'B2', 2: 'B3', 3: 'B4', 4: 'B5', 5: 'B6'},
'OPERATION': {0: 500, 1: 500, 2: 500, 3: 500, 4: 500, 5: 500},
'TXN_DATE': {0: '12/7/2016',
1: '12/3/2016',
2: '11/17/2016',
3: '11/22/2016',
4: '12/4/2016',
5: '12/3/2016'}})
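For each lot in Input1, I want to find its companion lot from Input2, chosen by the smallest TXN_DATE delta between them (the companion is the lot with the minimum time delta):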
Final DataFrame:
Expected_out = pd.DataFrame({'COMPANION_LOT': {0: 'B5', 1: 'B5', 2: 'B4', 3: 'B4', 4: 'B4'},
'COMPANION_LOT TXN_DATE': {0: '12/4/2016',
1: '12/4/2016',
2: '11/22/2016',
3: '11/22/2016',
4: '11/22/2016'},
'LOT': {0: 'A1', 1: 'A2', 2: 'A3', 3: 'A4', 4: 'A5'},
'OPERATION': {0: 100, 1: 100, 2: 100, 3: 100, 4: 100},
'TXN_DATE': {0: '12/6/2016',
1: '12/5/2016',
2: '11/30/2016',
3: '11/27/2016',
4: '11/22/2016'}})
Thanks.
Answer 0 (score: 1)
You can do most of the work with pandas.merge_asof, then add the new column with map:
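# Parse the date strings so they compare as dates, not text
Input1.TXN_DATE = pd.to_datetime(Input1.TXN_DATE)
Input2.TXN_DATE = pd.to_datetime(Input2.TXN_DATE)
# merge_asof requires both frames to be sorted by the join key
Input1 = Input1.sort_values('TXN_DATE')
Input2 = Input2.sort_values('TXN_DATE')
# Default direction is backward: each Input1 row gets the last Input2 row
# whose TXN_DATE is on or before its own
df = pd.merge_asof(Input1, Input2, on='TXN_DATE', suffixes=('','_COMPANION')) \
.sort_values('LOT') \
.drop('OPERATION_COMPANION', axis=1)
# Look up each companion lot's own transaction date
df['LOT_TXN_DATE'] = df.LOT_COMPANION.map(Input2.set_index('LOT')['TXN_DATE'])
print(df)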
  LOT  OPERATION   TXN_DATE LOT_COMPANION LOT_TXN_DATE
4  A1      100.0 2016-12-06            B5   2016-12-04
3  A2      100.0 2016-12-05            B5   2016-12-04
2  A3      100.0 2016-11-30            B4   2016-11-22
1  A4      100.0 2016-11-27            B4   2016-11-22
0  A5      100.0 2016-11-22            B4   2016-11-22
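Note that merge_asof matches backward by default, so every companion above has a TXN_DATE on or before the Input1 lot's date; that is what reproduces the expected output, even though B1 (12/7) is closer in absolute terms to A1 (12/6). As a rough sketch of the difference, the snippet below rebuilds the sample data with only the LOT and TXN_DATE columns and compares the default with direction='nearest'. The lowercase names input1, input2 and right are illustrative and not part of the original answer, and copying the key into LOT_TXN_DATE before merging is just one possible way to avoid the separate map step.

```python
import pandas as pd

# Sample data from the question, trimmed to the two columns used here
input1 = pd.DataFrame({
    'LOT': ['A1', 'A2', 'A3', 'A4', 'A5'],
    'TXN_DATE': pd.to_datetime(
        ['12/6/2016', '12/5/2016', '11/30/2016', '11/27/2016', '11/22/2016'],
        format='%m/%d/%Y'),
}).sort_values('TXN_DATE')

input2 = pd.DataFrame({
    'LOT': ['B1', 'B2', 'B3', 'B4', 'B5', 'B6'],
    'TXN_DATE': pd.to_datetime(
        ['12/7/2016', '12/3/2016', '11/17/2016', '11/22/2016', '12/4/2016', '12/3/2016'],
        format='%m/%d/%Y'),
}).sort_values('TXN_DATE')

# Rename the lot column and keep a copy of the companion's own date,
# because the merged TXN_DATE column holds the left-hand (input1) value
right = (input2.rename(columns={'LOT': 'LOT_COMPANION'})
               .assign(LOT_TXN_DATE=lambda d: d['TXN_DATE']))

# Default: last companion on or before each input1 date
backward = pd.merge_asof(input1, right, on='TXN_DATE').sort_values('LOT')

# Alternative: closest companion in either direction
nearest = pd.merge_asof(input1, right, on='TXN_DATE',
                        direction='nearest').sort_values('LOT')

print(backward)
print(nearest)
```

With direction='nearest', A1 is paired with B1 instead of B5, so this option only fits if a companion lot is allowed to have a later transaction date than the lot it accompanies.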