Python equivalent of VLOOKUP between a Series and a DataFrame

Time: 2018-09-22 13:17:18

Tags: python pandas

I have a DataFrame df:

  entrydate  exitdate    ddmax    
1 2012-02-15 2012-02-17    -1        
2 2012-02-18 2012-02-19    -2       
3 2012-02-20 2012-02-21    -3     
4 2012-02-22 2012-02-22    -2      
5 2012-02-24 2012-02-24    -6    

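I want to add a column df['location'] holding the DATE on which ddmax occurred. That date lies between the entry and exit dates (inclusive).

To find that date, though, I need to do a lookup against another Series: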

s = 
2012-02-15   -3
2012-02-16   -1
2012-02-17   -2
2012-02-18   -2
2012-02-19   -1
2012-02-20   -1
2012-02-21   -3
2012-02-22   -2
2012-02-23   -3
2012-02-24   -6
2012-02-25   -9

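So I look up by the value and take the corresponding date.

How can I do this?

I tried the map function and a pd left merge, but to no avail...

Expected output: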

  entrydate  exitdate    ddmax      location  
1 2012-02-15 2012-02-17    -1      2012-02-16 
2 2012-02-18 2012-02-19    -2      2012-02-18
3 2012-02-20 2012-02-21    -3      2012-02-21
4 2012-02-22 2012-02-22    -2      2012-02-22
5 2012-02-24 2012-02-24    -6      2012-02-24

1 answer:

Answer 0: (score: 1)

Not that this is pretty, but it should help if the data is small (which it appears to be):

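def lookup(x):
    # Rows of s whose date d lies within [entrydate, exitdate]
    is_ = s.loc[(s.d >= x.entrydate) & (s.d <= x.exitdate), ['i', 'd']]
    # Date of the first row in that window whose value i equals ddmax
    return is_.loc[is_.i == x.ddmax, 'd'].iloc[0]

# Apply lookup to each row of df
df['location'] = df.apply(lookup, axis=1)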

Output

    entrydate   exitdate    ddmax   location
1   2012-02-15  2012-02-17  -1  2012-02-16
2   2012-02-18  2012-02-19  -2  2012-02-18
3   2012-02-20  2012-02-21  -3  2012-02-21
4   2012-02-22  2012-02-22  -2  2012-02-22
5   2012-02-24  2012-02-24  -6  2012-02-24

The code above assumes that your s is a DataFrame such as

    d           i
0   2012-02-15  -3
1   2012-02-16  -1
2   2012-02-17  -2
3   2012-02-18  -2
4   2012-02-19  -1
5   2012-02-20  -1
6   2012-02-21  -3
7   2012-02-22  -2
8   2012-02-23  -3
9   2012-02-24  -6
10  2012-02-25  -9
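If s starts out as a date-indexed Series, one way to get it into this two-column shape is sketched below. The names `series`, `entry`, `exit_`, `window` and `match` are introduced here for illustration only; the column names d and i follow the frame shown above. The sketch also runs the two filtering steps of lookup by hand for the first trade, so the intermediate result is visible.

```python
import pandas as pd

# The question's data as a date-indexed Series (hypothetical variable name).
series = pd.Series(
    [-3, -1, -2, -2, -1, -1, -3, -2, -3, -6, -9],
    index=pd.date_range("2012-02-15", "2012-02-25", freq="D"),
)

# Name the index "d" and the values "i", then move the index into a column.
s = series.rename_axis("d").rename("i").reset_index()

# The two steps of lookup, done by hand for the trade 2012-02-15 .. 2012-02-17.
entry, exit_ = pd.Timestamp("2012-02-15"), pd.Timestamp("2012-02-17")
window = s.loc[(s.d >= entry) & (s.d <= exit_), ["i", "d"]]
match = window.loc[window.i == -1, "d"].iloc[0]

print(window)
print(match)
```

Here window holds the three rows for 15, 16 and 17 February, and match is the date whose value equals the ddmax of -1.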

If you have a pd.Series instead, such as

d
2012-02-15   -3
2012-02-16   -1
2012-02-17   -2
2012-02-18   -2
2012-02-19   -1
2012-02-20   -1
2012-02-21   -3
2012-02-22   -2
2012-02-23   -3
2012-02-24   -6
2012-02-25   -9
Name: i, dtype: int64

then the lookup function changes slightly to

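def lookup(x):
    # Values of s whose index date lies within [entrydate, exitdate]
    is_ = s.loc[(s.index >= x.entrydate) & (s.index <= x.exitdate)]
    # Return the index label (the date), not the matched value
    return is_.loc[is_ == x.ddmax].index[0]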
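For reference, here is a self-contained sketch of the Series variant using the data from the question. In a Series the date lives in the index, so the sketch reads the match from `.index`. The label slice `s.loc[start:end]` and the `pd.NaT` fallback for rows with no match are additions of this sketch, not part of the answer above, and it assumes s has a sorted DatetimeIndex.

```python
import pandas as pd

# Daily values indexed by date, as in the question.
s = pd.Series(
    [-3, -1, -2, -2, -1, -1, -3, -2, -3, -6, -9],
    index=pd.date_range("2012-02-15", "2012-02-25", freq="D"),
    name="i",
)

df = pd.DataFrame(
    {
        "entrydate": pd.to_datetime(
            ["2012-02-15", "2012-02-18", "2012-02-20", "2012-02-22", "2012-02-24"]
        ),
        "exitdate": pd.to_datetime(
            ["2012-02-17", "2012-02-19", "2012-02-21", "2012-02-22", "2012-02-24"]
        ),
        "ddmax": [-1, -2, -3, -2, -6],
    },
    index=range(1, 6),
)


def lookup(row):
    # Label slicing on a sorted DatetimeIndex includes both endpoints.
    window = s.loc[row.entrydate:row.exitdate]
    # Dates inside the window whose value equals this row's ddmax.
    hits = window[window == row.ddmax].index
    # Assumption of this sketch: return NaT instead of raising when nothing matches.
    return hits[0] if len(hits) else pd.NaT


df["location"] = df.apply(lookup, axis=1)
print(df)
```

If several dates in the window share the same value, this returns the earliest one, which matches the `[0]` used in the answer.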