Get the data for a given date from a pandas DataFrame

Time: 2018-05-17 14:32:15

Tags: python python-3.x pandas datetime

I have a DataFrame df that looks like this:

date1               item_id
2000-01-01 00:00:00    0
2000-01-01 10:01:00    1
2000-01-01 00:02:00    2
2000-01-01 00:03:00    3
2000-01-01 00:04:00    4
2000-01-01 00:05:00    5
2000-01-01 00:06:00    6
2000-01-01 12:07:00    7
2000-01-02 00:08:00    8
2000-01-02 00:00:00    0
2000-01-02 00:01:00    1
2000-01-02 03:02:00    2
2000-01-02 00:03:00    3
2000-01-02 00:04:00    4
2000-01-02 00:05:00    5
2000-01-02 04:06:00    6
2000-01-02 00:07:00    7
2000-01-02 00:08:00    8

I need the data for a single day, 2000-01-01. The following query gives me the result I want, but is there a way to do this by passing just "2000-01-01"?

result= df[(df['date1'] > '2000-01-01 00:00') & (df['date1'] < '2000-01-01 23:59')]

2 answers:

Answer 0: (score: 3)

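Use partial string indexing, which first requires a DatetimeIndex. Select the rows through .loc: the bare df['2000-01-01'] form for row selection was deprecated in pandas 1.2 and no longer works in pandas 2.x.

result = df.set_index('date1').loc['2000-01-01']
print(result)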
                     item_id
date1                       
2000-01-01 00:00:00        0
2000-01-01 10:01:00        1
2000-01-01 00:02:00        2
2000-01-01 00:03:00        3
2000-01-01 00:04:00        4
2000-01-01 00:05:00        5
2000-01-01 00:06:00        6
2000-01-01 12:07:00        7
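
As a rough illustration of why this is called partial string indexing, the sketch below selects first a single day and then a whole month with the same .loc call. The frame is a small made-up sample in the same shape as the question's data, and the names indexed, one_day and one_month are only illustrative.

```python
import pandas as pd

# Small illustrative frame (made-up rows shaped like the question's data).
df = pd.DataFrame({
    'date1': pd.to_datetime([
        '2000-01-01 00:00:00',
        '2000-01-01 10:01:00',
        '2000-01-02 00:08:00',
        '2000-02-01 03:02:00',
    ]),
    'item_id': [0, 1, 8, 2],
})

# Partial string indexing needs a DatetimeIndex. Sorting is an extra step
# added here so that range slices on the index are also well defined.
indexed = df.set_index('date1').sort_index()

one_day = indexed.loc['2000-01-01']   # every row on 1 January 2000
one_month = indexed.loc['2000-01']    # every row in January 2000

print(one_day)
print(one_month)
```

If date1 is needed as a regular column again afterwards, one_day.reset_index() restores it.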

Another solution is to convert the datetimes to strings with strftime and then filter with boolean indexing:

result = df[df['date1'].dt.strftime('%Y-%m-%d') == '2000-01-01']
print(result)
                date1  item_id
0 2000-01-01 00:00:00        0
1 2000-01-01 10:01:00        1
2 2000-01-01 00:02:00        2
3 2000-01-01 00:03:00        3
4 2000-01-01 00:04:00        4
5 2000-01-01 00:05:00        5
6 2000-01-01 00:06:00        6
7 2000-01-01 12:07:00        7
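
A possible variation on the strftime approach, sketched here under the assumption that date1 is already a datetime64 column, is to compare normalized timestamps instead of formatted strings. The sample rows below are made up for illustration; dt.normalize() and pd.Timestamp are standard pandas calls.

```python
import pandas as pd

# Made-up sample in the same shape as the question's data.
df = pd.DataFrame({
    'date1': pd.to_datetime([
        '2000-01-01 00:00:00',
        '2000-01-01 10:01:00',
        '2000-01-01 12:07:00',
        '2000-01-02 00:08:00',
    ]),
    'item_id': [0, 1, 7, 8],
})

# dt.normalize() sets the time of day to midnight but keeps the datetime64
# dtype, so the comparison is done on timestamps rather than on strings.
mask = df['date1'].dt.normalize() == pd.Timestamp('2000-01-01')
result = df[mask]
print(result)
```

The original index labels and the date1 column are both preserved, as with the strftime version.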

Answer 1: (score: 2)

Another option is to build a boolean mask:

df[df.date1.dt.date.astype(str) == '2000-01-01']

Full example:

from io import StringIO

import pandas as pd

data = '''\
date1                  item_id
2000-01-01T00:00:00    0
2000-01-01T10:01:00    1
2000-01-01T00:02:00    2
2000-01-01T00:03:00    3
2000-01-01T00:04:00    4
2000-01-01T00:05:00    5
2000-01-01T00:06:00    6
2000-01-01T12:07:00    7
2000-01-02T00:08:00    8
2000-01-02T00:00:00    0
2000-01-02T00:01:00    1
2000-01-02T03:02:00    2'''

# pd.compat.StringIO was removed from pandas; io.StringIO is the standard replacement
df = pd.read_csv(StringIO(data), sep=r'\s+', parse_dates=['date1'])

res = df[df.date1.dt.date.astype(str) == '2000-01-01']
print(res)

Output:

                date1  item_id
0 2000-01-01 00:00:00        0
1 2000-01-01 10:01:00        1
2 2000-01-01 00:02:00        2
3 2000-01-01 00:03:00        3
4 2000-01-01 00:04:00        4
5 2000-01-01 00:05:00        5
6 2000-01-01 00:06:00        6
7 2000-01-01 12:07:00        7

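You can also compare against a datetime.date object directly, which avoids the string conversion:

import datetime
df[df.date1.dt.date == datetime.date(2000, 1, 1)]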
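
One more point worth noting: the strict > and < comparison in the question leaves out a row stamped exactly 00:00:00 and anything from 23:59:00 onwards. A minimal sketch of a half-open range filter that avoids this is shown below. The helper name rows_on_day and the sample rows are hypothetical and not part of pandas or of the original post.

```python
import pandas as pd


def rows_on_day(frame, column, day):
    """Return the rows of frame whose column falls on day (hypothetical helper)."""
    start = pd.Timestamp(day).normalize()
    end = start + pd.Timedelta(days=1)
    # Half-open interval [start, end): midnight of `day` is included,
    # midnight of the following day is excluded.
    return frame[(frame[column] >= start) & (frame[column] < end)]


# Made-up sample with rows sitting on both edges of the day.
df = pd.DataFrame({
    'date1': pd.to_datetime([
        '2000-01-01 00:00:00',
        '2000-01-01 12:07:00',
        '2000-01-01 23:59:30',
        '2000-01-02 00:00:00',
    ]),
    'item_id': [0, 1, 2, 3],
})

res = rows_on_day(df, 'date1', '2000-01-01')

# The question's strict comparison, for contrast: it drops both edge rows.
strict = df[(df['date1'] > '2000-01-01 00:00') & (df['date1'] < '2000-01-01 23:59')]

print(res)
print(strict)
```

Because the comparison stays on datetime64 values, the same helper should also work for any other day by passing a different string or Timestamp.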