Comparing a date Timestamp with datetime64[ns, UTC] in Python pandas

Time: 2020-05-05 13:22:40

Tags: python pandas date datetime timestamp

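I'm confused by the result of this filtering step on a pandas DataFrame. I'm trying to select the rows that fall between two dates, but the resulting DataFrame is empty. I'm sure there is data for that period.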

df.info() reports the type of 'opendate' as: opendate 440383 non-null datetime64[ns, UTC]

Code snippet:

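from datetime import timedelta
from datetime import datetime
import pandas as pd

# pd.datetime is deprecated and no longer available in recent pandas; pd.Timestamp.now() replaces it
current_date = pd.Timestamp.now()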
t_delta_week = timedelta(days=7)
t_delta_year = timedelta(days=365)

#CurrentDate
date_start2020 = pd.Timestamp(current_date - t_delta_week, unit='ms')
date_end2020 = pd.Timestamp(current_date, unit='ms')

date_start2020 = date_start2020.tz_localize('utc')
date_end2020 = date_end2020.tz_localize('utc')


#LastYearDate
date_start2019 = pd.Timestamp(current_date - t_delta_year - t_delta_week, unit='ms')
date_end2019 = pd.Timestamp(current_date - t_delta_year, unit='ms')

date_start2019 = date_start2019.tz_localize('utc')
date_end2019 = date_end2019.tz_localize('utc')


df2020_2019['opendate'] = pd.to_datetime(df2020_2019['opendate'], unit='ms') 
mask = (df2020_2019['opendate'] > date_start2020) & (df2020_2019['opendate'] <= date_end2020)
df_currYear = df2020_2019.loc[mask]

df_currYear

The returned DataFrame is empty.
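When a date mask like this comes back empty, one quick sanity check is to compare the range the column actually covers with the filter window. The following is only a minimal sketch: the sample rows, the fixed window end, and the names `window_start`, `window_end` and `overlaps` are assumptions made up for illustration, not part of the original data.

```python
import pandas as pd

# Hypothetical stand-in for df2020_2019: the data ends long before the window.
df = pd.DataFrame({
    "opendate": pd.to_datetime(
        ["2019-01-10T08:00:00", "2019-03-02T12:30:00", "2019-06-15T23:59:59"],
        utc=True,
    )
})

# A fixed, tz-aware window keeps the example reproducible.
window_end = pd.Timestamp("2020-05-05 13:22:40", tz="UTC")
window_start = window_end - pd.Timedelta(days=7)

data_min = df["opendate"].min()
data_max = df["opendate"].max()

# The mask can only match rows if the window overlaps the data range.
overlaps = (data_min <= window_end) and (data_max > window_start)

mask = (df["opendate"] > window_start) & (df["opendate"] <= window_end)
matches = int(mask.sum())

print(f"data covers {data_min} .. {data_max}")
print(f"window is   {window_start} .. {window_end}")
print(f"overlap: {overlaps}, matching rows: {matches}")
```

If the two ranges do not overlap, the empty result comes from the data itself rather than from the comparison logic.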

Thanks for your help! :)

Edit:

This may help: 'opendate' is a generated column, created with the following snippet:

import pandas as pd
fmt = '%Y-%m-%dT%H:%M:%S'
df2020_2019.dropna(subset=['opentime_TS'], inplace=True)
df2020_2019['opendate'] = pd.to_datetime(df2020_2019['opentime_TS'], utc=True, format=fmt, errors='ignore')

I also printed a data sample with head(). For privacy reasons I can't share the DataFrame's records :)
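One thing worth checking in a snippet like the one above is whether the conversion really succeeded. With errors='ignore', pd.to_datetime returns the input unchanged when parsing fails, so the column can silently stay as strings. The sketch below uses errors='coerce' instead, which turns unparseable values into NaT so they can be counted. The sample values and the names `raw` and `n_bad` are hypothetical.

```python
import pandas as pd

# Hypothetical raw values; the last one does not match the format.
raw = pd.DataFrame({
    "opentime_TS": ["2020-05-01T10:00:00", "2020-05-03T18:45:10", "not a date"]
})

fmt = "%Y-%m-%dT%H:%M:%S"

# errors="coerce" replaces values that cannot be parsed with NaT.
raw["opendate"] = pd.to_datetime(
    raw["opentime_TS"], utc=True, format=fmt, errors="coerce"
)

# Count the rows that failed to parse and confirm the column is tz-aware.
n_bad = int(raw["opendate"].isna().sum())
is_tz_aware = isinstance(raw["opendate"].dtype, pd.DatetimeTZDtype)

print(raw["opendate"].dtype)
print(f"unparseable rows: {n_bad}, tz-aware: {is_tz_aware}")
```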

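1 Answer:

Answer 0: (score: 0)

OK, my bad. I was so focused on the tz TypeErrors that I was simply blind to it... I had picked the wrong data source, one that was already out of date :) The final solution, which works with the correct data: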

from datetime import timedelta
from datetime import datetime
import pandas as pd

fmt = '%Y-%m-%dT%H:%M:%S'
df2020_2019.dropna(subset=['opentime_TS'], inplace=True)
df2020_2019['opendate'] = pd.to_datetime(df2020_2019['opentime_TS'], utc=True, format=fmt, errors='ignore')
df2020_2019.info()

current_date = pd.Timestamp.now()
t_delta_week = timedelta(days=7)
t_delta_year = timedelta(days=365)

#CurrentDate
date_start2020 = pd.Timestamp(current_date - t_delta_week, unit='ms')
date_end2020 = pd.Timestamp(current_date, unit='ms')

date_start2020 = date_start2020.tz_localize('utc')
date_end2020 = date_end2020.tz_localize('utc')


#LastYearDate
date_start2019 = pd.Timestamp(current_date - t_delta_year - t_delta_week, unit='ms')
date_end2019 = pd.Timestamp(current_date - t_delta_year, unit='ms')

date_start2019 = date_start2019.tz_localize('utc')
date_end2019 = date_end2019.tz_localize('utc')

df2020_2019['opendate'] = pd.to_datetime(df2020_2019['opendate'], unit='ms') 

df_currYear = df2020_2019[df2020_2019["opendate"] > date_start2020]
df_lastYear = df2020_2019[df2020_2019["opendate"].between(date_start2019, date_end2019)]

df_currYear
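The same windows can also be built from a tz-aware "now", which avoids the separate tz_localize step. Note that pd.Timestamp.now() returns local wall-clock time, so localizing it to UTC afterwards shifts the window by the machine's UTC offset unless the machine already runs in UTC; pd.Timestamp.now(tz="UTC") gives the actual current UTC time. This is a minimal sketch, not the author's code: the helper `last_week_window`, the fixed `now` value, and the sample rows are assumptions for illustration.

```python
import pandas as pd


def last_week_window(now, years_back=0):
    """Return (start, end) of the 7-day window ending `years_back` * 365 days before `now`."""
    end = now - pd.Timedelta(days=365 * years_back)
    return end - pd.Timedelta(days=7), end


# Fixed here for reproducibility; in real use: now = pd.Timestamp.now(tz="UTC")
now = pd.Timestamp("2020-05-05 13:22:40", tz="UTC")

# Hypothetical sample rows standing in for df2020_2019.
df = pd.DataFrame({
    "opendate": pd.to_datetime(
        [
            "2020-05-01T10:00:00",  # inside the current window
            "2020-04-20T09:00:00",  # before the current window
            "2019-05-02T08:00:00",  # inside last year's window
            "2019-01-01T00:00:00",  # outside both windows
        ],
        utc=True,
    )
})

start_2020, end_2020 = last_week_window(now)
start_2019, end_2019 = last_week_window(now, years_back=1)

# Both sides of each comparison are tz-aware (UTC), so no TypeError is raised.
df_curr = df[(df["opendate"] > start_2020) & (df["opendate"] <= end_2020)]
df_last = df[df["opendate"].between(start_2019, end_2019)]

print(df_curr)
print(df_last)
```

As in the answer's code, a "year" here is a fixed 365 days, so the window drifts by one day across a leap year.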