Python Pandas: compare date columns, check for non-null values, apply conditional >= / < logic, and return a value

Time: 2018-10-24 16:48:03

Tags: python pandas

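I am using Python 3 and pandas to compute a result. I keep getting a ValueError saying the truth value is ambiguous. Do I need to first test that every date column I compare against is not null to avoid the error? The final result should mimic: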

# check if D3_UNTIL is not empty
if df.RUNNING_DATE.isna() == False:
    if df.D3_UNTIL.isna() == False:
        if df.RUNNING_DATE >= df.D3_UNTIL:
            df.RESULT = df.DVAL3
        elif (df.RUNNING_DATE >= df.D2_UNTIL) & (df.RUNNING_DATE < df.D3_UNTIL):
            df.RESULT = df.DVAL2
        elif (df.RUNNING_DATE >= df.D1_UNTIL) & (df.RUNNING_DATE < df.D2_UNTIL):
            df.RESULT = df.DVAL1
        else:
            df.RESULT = None
    # check if D2_UNTIL is not empty
    elif df.D2_UNTIL.isna() == False:
        if df.RUNNING_DATE >= df.D2_UNTIL:
            df.RESULT = df.DVAL2
        elif (df.RUNNING_DATE >= df.D1_UNTIL) & (df.RUNNING_DATE < df.D2_UNTIL):
            df.RESULT = df.DVAL1
        else:
            df.RESULT = None
    # check if D1_UNTIL is not empty
    elif df.D1_UNTIL.isna() == False:
        if df.RUNNING_DATE >= df.D1_UNTIL:
            df.RESULT = df.DVAL1
        else:
            df.RESULT = None
else:
    df.RESULT = None



RUNNING_DATE  D1_UNTIL  DVAL1  D2_UNTIL  DVAL2  D3_UNTIL  DVAL3  RESULT
1/1/2018      1/1/2018  10                                       10             
1/2/2018                                
1/3/2018      1/1/2018                          
1/4/2018      1/1/2018  10     1/3/2018  15             
1/5/2018      1/1/2018  10     1/3/2018  20     1/31/2018 100    20 
1/6/2018      1/1/2018  10               999                
1/7/2018      1/1/2018  10     1/4/2018  25     1/6/2018  300    300    

1 answer:

Answer 0 (score: 2)

For if-elif-else logic like this, you can use np.select. Also, checking df.RUNNING_DATE.isna()==False is redundant; just use df.RUNNING_DATE.notnull().
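The error in the question comes from handing a whole boolean Series to a plain `if`, which needs a single True or False. A minimal sketch of that behaviour, assuming a made-up two-row Series `s` that is not part of the question's data:

```python
import pandas as pd

# Hypothetical two-row date column, used only for illustration.
s = pd.Series(pd.to_datetime(["2018-01-01", None]))

# Element-wise null check: one boolean per row, not a single value.
mask = s.notnull()

# A plain `if` cannot reduce several booleans to one, so pandas raises.
try:
    if mask:
        pass
    raised = False
except ValueError:
    raised = True

# .notnull() produces the same mask as the longer .isna() == False.
same = mask.equals(s.isna() == False)
```

Here `raised` ends up True, and `same` confirms that the two spellings of the null check agree. Row-wise decisions therefore have to be expressed as boolean masks rather than as `if` branches.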

In addition, the logic here can be simplified considerably:

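  • Any >=, == or <= comparison against a NaT date returns False, so there is no need to check that a value is non-null first when you are already checking whether RUNNING_DATE is greater than or equal to it. Likewise, any such comparison of NaT with NaT returns False, which gives us a built-in check for whether RUNNING_DATE is null.
  • Since the date checks cover all possibilities, it is enough to check >= in order, from D3_UNTIL down to D1_UNTIL.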
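The NaT behaviour described in the list above can be checked directly. This is a small sketch, assuming two made-up Series named `running` and `until` (hypothetical names, not columns from the question):

```python
import pandas as pd

# Hypothetical columns: row 0 has both dates, row 1 is missing the
# threshold, row 2 is missing the running date.
running = pd.Series(pd.to_datetime(["2018-01-05", "2018-01-05", None]))
until = pd.Series(pd.to_datetime(["2018-01-03", None, "2018-01-03"]))

# Ordering and equality comparisons are False wherever either side is NaT.
ge = running >= until
le = running <= until
eq = running == until

# Rows that pass the >= test are therefore guaranteed to be non-null
# on both sides, without an explicit notnull() check.
both_present = running.notnull() & until.notnull()
implied = bool((~ge | both_present).all())
```

Note that this applies to the comparisons named above; `!=` is the exception and evaluates to True when a NaT is involved.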

Code

import pandas as pd
import numpy as np

# Ensure Datetime
#df['RUNNING_DATE'] = pd.to_datetime(df.RUNNING_DATE, errors='coerce')
#df['D1_UNTIL'] = pd.to_datetime(df.D1_UNTIL, errors='coerce')
#df['D2_UNTIL'] = pd.to_datetime(df.D2_UNTIL, errors='coerce')
#df['D3_UNTIL'] = pd.to_datetime(df.D3_UNTIL, errors='coerce')

conds = [
    df.RUNNING_DATE >= df.D3_UNTIL,
    df.RUNNING_DATE >= df.D2_UNTIL,
    df.RUNNING_DATE >= df.D1_UNTIL]

choices = [
    df.DVAL3,
    df.DVAL2,
    df.DVAL1]

df['RESULT'] = np.select(conds, choices, default=None)

Output:

(I added extra rows at the end to illustrate the logic)

  RUNNING_DATE   D1_UNTIL  DVAL1   D2_UNTIL  DVAL2   D3_UNTIL  DVAL3 RESULT
0   2018-01-01 2018-01-01   10.0        NaT    NaN        NaT    NaN     10
1   2018-01-02        NaT    NaN        NaT    NaN        NaT    NaN   None
2   2018-01-03 2018-01-01    NaN        NaT    NaN        NaT    NaN    NaN
3   2018-01-04 2018-01-01   10.0 2018-01-03   15.0        NaT    NaN     15
4   2018-01-05 2018-01-01   10.0 2018-01-03   20.0 2018-01-31  100.0     20
5   2018-01-06 2018-01-01   10.0        NaT  999.0        NaT    NaN     10
6   2018-01-07 2018-01-01   10.0 2018-01-04   25.0 2018-01-06  300.0    300
7          NaT        NaT    NaN        NaT    NaN        NaT    NaN   None
8          NaT        NaT    NaN 2018-01-01   24.0        NaT    NaN   None
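For anyone who wants to reproduce this, here is a self-contained sketch that rebuilds the nine rows shown above and applies the same conditions. Two details are assumptions rather than part of the answer: the dates are parsed with an explicit `format="%m/%d/%Y"`, and `default=np.nan` is used in place of `default=None` so that RESULT stays a float column instead of becoming object dtype.

```python
import numpy as np
import pandas as pd

# Rows retyped from the output table; None / np.nan mark empty cells.
df = pd.DataFrame({
    "RUNNING_DATE": ["1/1/2018", "1/2/2018", "1/3/2018", "1/4/2018",
                     "1/5/2018", "1/6/2018", "1/7/2018", None, None],
    "D1_UNTIL": ["1/1/2018", None, "1/1/2018", "1/1/2018", "1/1/2018",
                 "1/1/2018", "1/1/2018", None, None],
    "DVAL1": [10, np.nan, np.nan, 10, 10, 10, 10, np.nan, np.nan],
    "D2_UNTIL": [None, None, None, "1/3/2018", "1/3/2018", None,
                 "1/4/2018", None, "1/1/2018"],
    "DVAL2": [np.nan, np.nan, np.nan, 15, 20, 999, 25, np.nan, 24],
    "D3_UNTIL": [None, None, None, None, "1/31/2018", None,
                 "1/6/2018", None, None],
    "DVAL3": [np.nan, np.nan, np.nan, np.nan, 100, np.nan, 300,
              np.nan, np.nan],
})

# Convert the date columns; missing or unparseable entries become NaT.
for col in ["RUNNING_DATE", "D1_UNTIL", "D2_UNTIL", "D3_UNTIL"]:
    df[col] = pd.to_datetime(df[col], format="%m/%d/%Y", errors="coerce")

# np.select takes the first condition that is True for each row,
# so the latest threshold is listed first.
conds = [
    df.RUNNING_DATE >= df.D3_UNTIL,
    df.RUNNING_DATE >= df.D2_UNTIL,
    df.RUNNING_DATE >= df.D1_UNTIL,
]
choices = [df.DVAL3, df.DVAL2, df.DVAL1]

# Assumption: NaN as the fallback keeps RESULT numeric.
df["RESULT"] = np.select(conds, choices, default=np.nan)
```

With this variant, rows that match no condition hold NaN rather than None, which makes the column easier to use in later arithmetic. Row 2 also ends up as NaN, but for a different reason: its D1_UNTIL condition is True and the selected DVAL1 is itself missing.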