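How do I convert a string to datetime while ignoring the time information?

Time: 2019-02-27 20:47:18

Tags: python pandas datetime

In Python 3 with pandas, I have a DataFrame with a column of strings that represent dates, the "DataFim" column: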

df_lotacoes.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 52725 entries, 0 to 52724
Data columns (total 5 columns):
DataFim            48854 non-null object
DataInicio         52725 non-null object
IdUA               52725 non-null object
NomeFuncionario    52725 non-null object
NomeUA             52725 non-null object
dtypes: object(5)
memory usage: 1.0+ MB

print(df_lotacoes['DataFim'])

DataFim
0   2018-11-05T00:00:00-02:00
1   2008-08-28T00:00:00-03:00
2   2002-08-08T00:00:00-03:00
3   2007-03-14T00:00:00-03:00
4   2005-05-06T00:00:00-03:00

I tried to convert it to a date, but the column still has the object dtype:

df_lotacoes['DataFim'] = pd.to_datetime(df_lotacoes['DataFim'])

DataFim
0   2018-11-05 00:00:00-02:00
1   2008-08-28 00:00:00-03:00
2   2002-08-08 00:00:00-03:00
3   2007-03-14 00:00:00-03:00
4   2005-05-06 00:00:00-03:00

DataFim            48854 non-null object

I only need the year, month, and day. I want to ignore the rest of the time data.

Does anyone know how I can convert this format?

1 answer:

Answer 0: (score: 2)

Use str.extract to pull out the date part (everything before the "T") and convert it to datetime:

df['DataFim'] = pd.to_datetime(df['DataFim'].str.extract('(.*)T')[0], format = '%Y-%m-%d')

    DataFim
0   2018-11-05
1   2008-08-28
2   2002-08-08
3   2007-03-14
4   2005-05-06

Option 2: you can also use str.split and keep the first piece:

df['DataFim'] = pd.to_datetime(df['DataFim'].str.split('T').str[0], format = '%Y-%m-%d')

Just for fun, with a regular expression that strips everything from the "T" onward:

df['DataFim'] = pd.to_datetime(df['DataFim'].str.replace('T.*', '', regex = True), format = '%Y-%m-%d')
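To check that the three options agree, here is a minimal self-contained sketch. It assumes a small stand-in DataFrame built from the sample values in the question, plus one missing entry (the info() output shows that DataFim contains nulls). The variable names by_extract, by_split, and by_replace are only illustrative.

```python
import pandas as pd

# Stand-in data: two sample strings from the question and one missing value
df = pd.DataFrame({
    "DataFim": [
        "2018-11-05T00:00:00-02:00",
        "2008-08-28T00:00:00-03:00",
        None,
    ]
})
raw = df["DataFim"]

# Option 1: capture everything before the last "T"
by_extract = pd.to_datetime(raw.str.extract(r"(.*)T")[0], format="%Y-%m-%d")

# Option 2: split on "T" and keep the first piece
by_split = pd.to_datetime(raw.str.split("T").str[0], format="%Y-%m-%d")

# Option 3: delete "T" and everything after it
by_replace = pd.to_datetime(raw.str.replace(r"T.*", "", regex=True), format="%Y-%m-%d")

# Each result is a datetime column; the missing entry becomes NaT
print(by_extract.dtype, by_split.dtype, by_replace.dtype)
print(by_split)
```

Because the offset is dropped as text before parsing, each date stays exactly as written in the string, and no time zone conversion takes place.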