Dynamic pandas DataFrame filter not working

Time: 2020-07-06 12:41:41

Tags: python pandas filter

I can't get this dynamic filter to work.

df_dates

print(df_dates)

    Type  Entry      Exit 
    0     2008-03-03 2008-03-17  
    1     2010-05-19 2010-06-10 

This hard-coded filter works:

df_to_filter = df_to_filter[
    (df_to_filter['date']>='2008-03-03 00:00:00') & (df_to_filter['date']<='2008-03-17 00:00:00') | 
    (df_to_filter['date']>='2010-05-19 00:00:00') & (df_to_filter['date']<='2010-06-10 00:00:00')
]

The dynamic filter does not work, even though the generated string looks exactly the same:

df_str = "df_to_filter['date']"
    
filter_mask = ' | '.join(f'({df_str}>=\'{start}\') & ({df_str}<=\'{stop}\')' for start,stop in zip(df_dates['Entry'],df_dates['Exit']))
filter_mask = filter_mask + ']'

print(filter_mask)

(df_to_filter['date']>='2008-03-03 00:00:00') & (df_to_filter['date']<='2008-03-17 00:00:00') | (df_to_filter['date']>='2010-05-19 00:00:00') & (df_to_filter['date']<='2010-06-10 00:00:00')]
    
df_to_filter = df_to_filter[filter_mask]

Error:

KeyError: "(df_to_filter['date']>='2008-03-03 00:00:00') & (df_to_filter['date']<='2008-03-17 00:00:00') | (df_to_filter['date']>='2010-05-19 00:00:00') & (df_to_filter['date']<='2010-06-10 00:00:00')]"

1 Answer:

Answer 0 (score: 1)

For example, given these DataFrames:

df_dates:

Type  Entry      Exit 
0     2008-03-03 2008-03-17  
1     2010-05-19 2010-06-10 

df_to_filter:

date
2008-03-03 
2010-06-11

Then you can filter it with the filter_mask expression:

df_str = "df_to_filter['date']"
filter_mask = ' | '.join(f'({df_str}>=\'{start}\') & ({df_str}<=\'{stop}\')' for start,stop in zip(df_dates['Entry'],df_dates['Exit']))
# filter_mask now holds this string (note: no trailing ']'):
# "(df_to_filter['date']>='2008-03-03') & (df_to_filter['date']<='2008-03-17') | (df_to_filter['date']>='2010-05-19') & (df_to_filter['date']<='2010-06-10')"

print(df_to_filter[eval(filter_mask)])

Result:

         date
0  2008-03-03

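Indexing a DataFrame with a plain string looks up a column with that label, which is why passing the string directly raises a KeyError. To evaluate an expression stored in a string, call the eval() function on it first.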
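
If you would rather not use eval() (it executes whatever code the string contains), one alternative is to build the boolean mask directly instead of building a string. This is only a minimal sketch, assuming the date columns hold datetimes; the sample frames below just mirror the data shown above, and the names mask and filtered are made up for illustration.

```python
import pandas as pd

# Sample data mirroring the frames above (assumed here to be datetime values).
df_dates = pd.DataFrame({
    "Entry": pd.to_datetime(["2008-03-03", "2010-05-19"]),
    "Exit": pd.to_datetime(["2008-03-17", "2010-06-10"]),
})
df_to_filter = pd.DataFrame({"date": pd.to_datetime(["2008-03-03", "2010-06-11"])})

# Start from an all-False mask and OR in one inclusive range per row of df_dates.
mask = pd.Series(False, index=df_to_filter.index)
for start, stop in zip(df_dates["Entry"], df_dates["Exit"]):
    # Series.between includes both endpoints by default, matching >= and <=.
    mask |= df_to_filter["date"].between(start, stop)

filtered = df_to_filter[mask]
print(filtered)
```

Because the mask is an ordinary boolean Series, it can be passed straight to df_to_filter[...] with no string parsing involved, and it keeps working however many rows df_dates has.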