pandas fillna datetime column with timezone now

Time: 2017-07-04 10:20:10

Tags: pandas datetime python-3.6

I have a pandas datetime column with None values, which I would like to fill with datetime.now() in a specific timezone.

This is my MWE dataframe:

df = pd.DataFrame([
    {'end': "2017-07-01 12:00:00"},
    {'end': "2017-07-02 18:13:00"},
    {'end': None},
    {'end': "2017-07-04 10:45:00"}
])

If I fill with fillna:

pd.to_datetime(df['end']).fillna(datetime.now())

the result is a series with the expected dtype: datetime64[ns]. But when I specify the timezone, for example:

pd.to_datetime(df['end']).fillna(
    datetime.now(pytz.timezone('US/Pacific')))

this returns a series with dtype: object

1 answer:

Answer 0 (score: 1)

It seems you need to convert the date with to_datetime inside fillna:

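from datetime import datetime
import pytz

# pd.datetime was removed in pandas 2.0, so the standard-library datetime is used instead
df['end'] = pd.to_datetime(df['end'])
df['end'] = df['end'].fillna(pd.to_datetime(datetime.now(pytz.timezone('US/Pacific'))))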
print (df)
                                end
0               2017-07-01 12:00:00
1               2017-07-02 18:13:00
2  2017-07-04 03:35:08.499418-07:00
3               2017-07-04 10:45:00

print (df['end'].apply(type))
0    <class 'pandas._libs.tslib.Timestamp'>
1    <class 'pandas._libs.tslib.Timestamp'>
2    <class 'pandas._libs.tslib.Timestamp'>
3    <class 'pandas._libs.tslib.Timestamp'>
Name: end, dtype: object

But the dtype is still not datetime64:

print (df['end'].dtype)
object
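A likely reason for the object dtype is that a tz-naive datetime64 column cannot store a tz-aware value, so the column falls back to Python objects. The following is a minimal sketch, not part of the original answer, that checks the awareness of both sides. The variable names are illustrative, and stripping the timezone with tz_localize(None) is only one possible alternative, assuming naive local wall-clock time is acceptable for the column.

```python
import pandas as pd

# A tz-naive datetime column with one missing value
naive = pd.to_datetime(pd.Series(["2017-07-01 12:00:00", None]))

# A tz-aware "now" in the target timezone
aware_now = pd.Timestamp.now(tz='US/Pacific')

print(naive.dt.tz)   # the column carries no timezone
print(aware_now.tz)  # the fill value does

# Assumption: keeping the column tz-naive is acceptable.
# tz_localize(None) drops the timezone but keeps the local wall-clock time.
filled = naive.fillna(aware_now.tz_localize(None))
print(filled.dtype)
```

If the column should carry the timezone itself, localizing the column first is the other way to make both sides compatible.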

I think the solution is to pass utc to to_datetime:

    utc : boolean, default None

    Return UTC DatetimeIndex if True (converting any tz-aware datetime.datetime objects as well).

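# datetime here is the standard-library class (from datetime import datetime); pd.datetime was removed in pandas 2.0
df['end'] = df['end'].fillna(datetime.now(pytz.timezone('US/Pacific')))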
df['end'] = pd.to_datetime(df['end'], utc=True)

#print (df)

print (df['end'].apply(type))
0    <class 'pandas._libs.tslib.Timestamp'>
1    <class 'pandas._libs.tslib.Timestamp'>
2    <class 'pandas._libs.tslib.Timestamp'>
3    <class 'pandas._libs.tslib.Timestamp'>
Name: end, dtype: object

print (df['end'].dtypes)
datetime64[ns]
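To make the effect of utc=True concrete, here is a small self-contained sketch. It assumes a recent pandas version, where tz-naive inputs are treated as UTC and the result is expected to be tz-aware, so the reported dtype may differ from the 2017 output above. The series s and the conversion to US/Pacific with tz_convert are illustrative additions, not part of the original answer.

```python
import pandas as pd

# Illustrative data: naive timestamp strings with one missing value
s = pd.Series(["2017-07-01 12:00:00", None, "2017-07-04 10:45:00"])

# utc=True interprets tz-naive inputs as UTC and returns a tz-aware result
utc = pd.to_datetime(s, utc=True)
print(utc.dtype)

# Convert the same instants to another timezone for display
pacific = utc.dt.tz_convert('US/Pacific')
print(pacific)
```

Note that this reinterprets the naive values as UTC instants, which is different from declaring them to be US/Pacific wall-clock times.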

Final solution, from the OP's comment:

df['end'] = pd.to_datetime(df['end']).dt.tz_localize('US/Pacific')
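# datetime here is the standard-library class (from datetime import datetime); pd.datetime was removed in pandas 2.0
df['end'] = df['end'].fillna(datetime.now(pytz.timezone('US/Pacific')))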

print (df.end.dtype)
datetime64[ns, US/Pacific]
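For reference, the final approach can be written as one self-contained script. This is a sketch under the assumption of a recent pandas version; it uses pd.Timestamp.now(tz=...) in place of pytz, which is a substitution and not what the OP wrote, and the tz variable is introduced only for illustration.

```python
import pandas as pd

df = pd.DataFrame([
    {'end': "2017-07-01 12:00:00"},
    {'end': "2017-07-02 18:13:00"},
    {'end': None},
    {'end': "2017-07-04 10:45:00"},
])

tz = 'US/Pacific'  # illustrative name for the target timezone

# Localize the column first so that it is tz-aware ...
end = pd.to_datetime(df['end']).dt.tz_localize(tz)

# ... then fill missing values with a tz-aware "now" in the same timezone
df['end'] = end.fillna(pd.Timestamp.now(tz=tz))

print(df['end'].dtype)
```

Because the column and the fill value share the same timezone, the column should keep a tz-aware datetime dtype instead of falling back to object.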