如何选择最接近mtc_date的plr_date行? 优选地,不在每行上循环。 我想最后得到第13,28,43行。
我应该澄清一下:从具有相同mtc_date的行集中选择一行与最接近的plr_date。
mtc_date mtc_id plr_date plr_id plr_measure
12 2010-03-15 1 2010-01-31 1 0
13 2010-03-15 1 2010-02-28 1 1
24 2010-06-15 2 2010-01-31 2 12
25 2010-06-15 2 2010-02-28 2 13
26 2010-06-15 2 2010-03-31 2 14
27 2010-06-15 2 2010-04-30 2 15
28 2010-06-15 2 2010-05-31 2 16
36 2010-09-15 2 2010-01-31 2 12
37 2010-09-15 2 2010-02-28 2 13
38 2010-09-15 2 2010-03-31 2 14
39 2010-09-15 2 2010-04-30 2 15
40 2010-09-15 2 2010-05-31 2 16
41 2010-09-15 2 2010-06-30 2 17
42 2010-09-15 2 2010-07-31 2 18
43 2010-09-15 2 2010-08-31 2 19
import pandas as pd
import numpy as np
### 1 df
plr = pd.concat(\
[pd.DataFrame({'plr_date': pd.date_range("01/01/2010", "12/31/2010", freq="m"),\
'plr_id': 1 , 'plr_measure': 10}),\
pd.DataFrame({'plr_date': pd.date_range("01/01/2010", "12/31/2010", freq="m"),\
'plr_id': 2 , 'plr_measure': 20})])
#plr = plr.reindex(index=range(0, plr.shape[0]), columns=None, fill_value=0)
plr['plr_measure']=np.arange(0, plr.shape[0])
### 2 df
mtc = pd.DataFrame({'mtc_date':pd.to_datetime(['15/01/2010','15/03/2010','15/06/2010','15/09/2010']),\
'mtc_id':(1,1,2,2)})
mtc = mtc.merge(plr, left_on='mtc_id', right_on='plr_id', how='left')
#select only smaller
mtc = mtc[(mtc.plr_date < mtc.mtc_date)]
#leave the bigest