Pandas: delete rows by date

Time: 2016-08-10 11:09:33

Tags: python datetime pandas indexing dataframe

I have a df:

ID,"address","used_at","active_seconds","pageviews"
71ecd2aa165114e5ee292131f1167d8c,"auto.drom.ru",2014-05-17 10:58:59,166,2
71ecd2aa165114e5ee292131f1167d8c,"auto.drom.ru",2016-07-17 17:34:07,92,4
70150aba267f671045f147767251d169,"avito.ru/*/avtomobili",2014-06-15 11:52:09,837,40
bc779f542049bcabb9e68518a215814e,"auto.yandex.ru",2014-01-16 22:23:56,8,1
bc779f542049bcabb9e68518a215814e,"avito.ru/*/avtomobili",2014-01-18 14:38:33,313,5
bc779f542049bcabb9e68518a215814e,"avito.ru/*/avtomobili",2016-07-18 18:12:07,20,1

I need to delete all rows where used_at is later than 2016-06-30. How can I do that?

1 answer:

Answer 0: (score: 4)

Use dt.date with boolean indexing:

print (df.used_at.dt.date > pd.to_datetime('2016-06-30').date())
0    False
1     True
2    False
3    False
4    False
5     True
Name: used_at, dtype: bool
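
The .dt accessor only works when used_at has a datetime dtype. Below is a minimal sketch of loading the sample data so that it does, assuming the data arrives as CSV text. The csv_text name and the io.StringIO wrapper are illustrative and not part of the question.

```python
import io
import pandas as pd

# Sample rows copied from the question; csv_text is an illustrative name
csv_text = '''ID,"address","used_at","active_seconds","pageviews"
71ecd2aa165114e5ee292131f1167d8c,"auto.drom.ru",2014-05-17 10:58:59,166,2
71ecd2aa165114e5ee292131f1167d8c,"auto.drom.ru",2016-07-17 17:34:07,92,4
70150aba267f671045f147767251d169,"avito.ru/*/avtomobili",2014-06-15 11:52:09,837,40
bc779f542049bcabb9e68518a215814e,"auto.yandex.ru",2014-01-16 22:23:56,8,1
bc779f542049bcabb9e68518a215814e,"avito.ru/*/avtomobili",2014-01-18 14:38:33,313,5
bc779f542049bcabb9e68518a215814e,"avito.ru/*/avtomobili",2016-07-18 18:12:07,20,1
'''

# parse_dates converts used_at to a datetime column while reading
df = pd.read_csv(io.StringIO(csv_text), parse_dates=['used_at'])

# The comparison from the answer now yields a boolean mask
mask = df.used_at.dt.date > pd.to_datetime('2016-06-30').date()
print(mask.tolist())
```

If the column is already loaded as strings, pd.to_datetime(df.used_at) would be the usual way to convert it before using .dt.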

print (df[df.used_at.dt.date > pd.to_datetime('2016-06-30').date()])
                                 ID                address  \
1  71ecd2aa165114e5ee292131f1167d8c           auto.drom.ru   
5  bc779f542049bcabb9e68518a215814e  avito.ru/*/avtomobili   

              used_at  active_seconds  pageviews  
1 2016-07-17 17:34:07              92          4  
5 2016-07-18 18:12:07              20          1  
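
The selection above returns the rows after 2016-06-30, whereas the question asks to remove them. Here is a short sketch of the inverse, negating the mask with ~. The three-row frame is a cut-down stand-in for the question's data, and the cutoff and kept names are illustrative.

```python
import pandas as pd

# Cut-down stand-in for the question's DataFrame
df = pd.DataFrame({
    'ID': ['71ecd2aa165114e5ee292131f1167d8c',
           '71ecd2aa165114e5ee292131f1167d8c',
           '70150aba267f671045f147767251d169'],
    'used_at': pd.to_datetime(['2014-05-17 10:58:59',
                               '2016-07-17 17:34:07',
                               '2014-06-15 11:52:09']),
})

cutoff = pd.to_datetime('2016-06-30').date()
mask = df.used_at.dt.date > cutoff  # True for the rows to remove

# ~mask keeps every row dated on or before the cutoff
kept = df[~mask]
print(kept)
```

Writing the condition as df.used_at.dt.date <= cutoff should give the same result without the negation.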

Or you can define the date by year, month, and day (pd.datetime was removed in pandas 2.0, so pd.Timestamp is used here):

print (df[df.used_at.dt.date > pd.Timestamp(2016, 6, 30).date()])
                                 ID                address  \
1  71ecd2aa165114e5ee292131f1167d8c           auto.drom.ru   
5  bc779f542049bcabb9e68518a215814e  avito.ru/*/avtomobili   

              used_at  active_seconds  pageviews  
1 2016-07-17 17:34:07              92          4  
5 2016-07-18 18:12:07              20          1
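
Converting to date objects produces an object-dtype column, which can be slow on large frames. A possible alternative, offered as an assumption and not as part of the original answer, is to compare the datetime column directly with a pd.Timestamp for the start of the following day. The sample rows are illustrative, and the 2016-06-30 23:59:59 row is invented to show the boundary case.

```python
import pandas as pd

# Illustrative data; the middle row is a made-up boundary case
df = pd.DataFrame({
    'used_at': pd.to_datetime(['2014-01-16 22:23:56',
                               '2016-06-30 23:59:59',
                               '2016-07-18 18:12:07']),
    'pageviews': [1, 5, 1],
})

# Anything strictly before 2016-07-01 00:00:00 falls on or before 2016-06-30
cutoff = pd.Timestamp('2016-07-01')
kept = df[df['used_at'] < cutoff]
print(kept)
```

This keeps the comparison on the native datetime column, so no per-row Python date objects are created.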