Converting a series of strings to datetime objects to extract the month and year

Time: 2016-10-20 12:26:22

Tags: python pandas

I'm using Python 3 for ML and ran into this problem today. I have a pandas DataFrame in which one column contains dates.

data['Allfast time'].head()
0    31-Dec-14 17:55:00
1    31-Dec-14 22:55:00
2    31-Dec-14 09:30:00
3    01-Jan-15 10:55:00
4    01-Jan-15 21:15:00
Name: Allfast time, dtype: object

Calling to_datetime() on this column gives the following error:

TypeError: object of type 'datetime.time' has no len()

How can I create a new column data['month'] that contains only the month, extracted from data['Allfast time']?

Thanks!

2 answers:

Answer 0: (score: 3)

The error message implies that your Series contains not only strings but also at least one datetime.time object. For example, the error message can be reproduced this way:

In [35]: test = pd.Series(['31-Dec-14 17:55:00', DT.time(21,15,00),])
In [36]: pd.to_datetime(test)
TypeError: object of type 'datetime.time' has no len()

Therefore, to convert this motley group of objects to Pandas Timestamps, pass errors='coerce' to pd.to_datetime. Invalid date strings and datetime.time objects will be replaced by NaT (Not-a-Time) objects:

import pandas as pd
import datetime as DT
df = pd.DataFrame(
    {'Allfast time': 
     ['31-Dec-14 17:55:00', '31-Dec-14 22:55:00', '31-Dec-14 09:30:00',
      '01-Jan-15 10:55:00', '01-Jan-15 21:15:00', 
      DT.time(21,15,00), DT.date(2000,1,1), DT.datetime(2000,1,1,8,10,20)]})

df['Allfast time'] = pd.to_datetime(df['Allfast time'], errors='coerce')
print(df['Allfast time'].dt.month)

yields

0    12.0
1    12.0
2    12.0
3     1.0
4     1.0
5     NaN
6     1.0
7     1.0
Name: Allfast time, dtype: float64

Since a datetime.time has no month, the best you can do is assign NaN to represent the missing month.
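
Before coercing, it can help to see which rows are not plain strings, since those are the ones that will end up as NaT. The following is only a minimal sketch: the sample Series and the names `s` and `not_str` are made up for illustration and are not from the question.

```python
import datetime as DT

import pandas as pd

# Hypothetical sample mixing date strings with a stray datetime.time object
s = pd.Series(
    ['31-Dec-14 17:55:00', DT.time(21, 15, 0), '01-Jan-15 10:55:00'],
    name='Allfast time')

# True wherever the entry is anything other than a str
not_str = ~s.map(lambda v: isinstance(v, str))

# Show the offending rows together with their Python types
print(s[not_str].map(type))
```

Inspecting these rows first makes it easier to decide whether they should be dropped, repaired at the source, or left to become NaT.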

Answer 1: (score: 1)

I think you need to_datetime to convert the Allfast time column to datetime, and then use dt.month and dt.year:

print (df)
         Allfast time
0  31-Dec-14 17:55:00
1  31-Dec-14 22:55:00
2  31-Dec-14 09:30:00
3  01-Jan-15 10:55:00
4  01-Jan-15 21:15:00

print (df.dtypes)
Allfast time    object
dtype: object

df['Allfast time'] = pd.to_datetime(df['Allfast time'])
df['months'] = df['Allfast time'].dt.month
df['year'] = df['Allfast time'].dt.year
print (df)
         Allfast time  months  year
0 2014-12-31 17:55:00      12  2014
1 2014-12-31 22:55:00      12  2014
2 2014-12-31 09:30:00      12  2014
3 2015-01-01 10:55:00       1  2015
4 2015-01-01 21:15:00       1  2015

print (df.dtypes)
Allfast time    datetime64[ns]
months                   int64
year                     int64
dtype: object
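
If every entry follows the layout shown in the question (for example 31-Dec-14 17:55:00), an explicit format string can be passed so that pandas does not have to guess it. This is a sketch under that assumption: the format string is inferred from the sample rows, and the small DataFrame and the name `parsed` are illustrative only.

```python
import pandas as pd

# Two rows copied from the question's sample output
df = pd.DataFrame(
    {'Allfast time': ['31-Dec-14 17:55:00', '01-Jan-15 10:55:00']})

# %d = day, %b = abbreviated month name, %y = two-digit year
parsed = pd.to_datetime(df['Allfast time'], format='%d-%b-%y %H:%M:%S')

df['month'] = parsed.dt.month
df['year'] = parsed.dt.year
print(df)
```

With an explicit format, a row that does not match raises an error instead of being parsed in some other way. Adding errors='coerce' turns such rows into NaT instead.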