to_datetime assemblage error due to extra keys

Time: 2019-01-12 23:55:05

Tags: python pandas

My pandas version is 0.23.4.

I tried to run the following code:

df['date_time'] = pd.to_datetime(df[['year','month','day','hour_scheduled_departure','minute_scheduled_departure']])  

and got the following error:

extra keys have been passed to the datetime assemblage: [hour_scheduled_departure,minute_scheduled_departure]

Any ideas on how to get this done with pd.to_datetime?

@anky_91

The image shows an extract of the first 10 rows. First column [int32]: year; second column [int32]: month; third column [int32]: day; fourth column [object]: hour; fifth column [object]: minute. Each object value has length 2.

2 answers:

Answer 0: (score: 5)

Another solution:

>>pd.concat([df.A,pd.to_datetime(pd.Series(df[df.columns[1:]].fillna('').values.tolist(),name='Date').map(lambda x: '0'.join(map(str,x))))],axis=1)

    A   Date
0   a   2002-07-01 05:07:00
1   b   2002-08-03 03:08:00
2   c   2002-09-05 06:09:00
3   d   2002-04-07 09:04:00
4   e   2002-02-01 02:02:00
5   f   2002-03-05 04:03:00

For the example you added as an image (I skipped the last three columns to save time):

>>df.month=df.month.map("{:02}".format)
>>df.day = df.day.map("{:02}".format)
>>pd.concat([df.A,pd.to_datetime(pd.Series(df[df.columns[1:]].fillna('').values.tolist(),name='Date').map(lambda x: ''.join(map(str,x))))],axis=1)

    A   Date
0   a   2015-01-01 00:05:00
1   b   2015-01-01 00:01:00
2   c   2015-01-01 00:02:00
3   d   2015-01-01 00:02:00
4   e   2015-01-01 00:25:00
5   f   2015-01-01 00:25:00
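Note that the '0'.join variant above only produces a valid timestamp string when every month, day, hour, and minute is a single digit. As a rough sketch of a more general version of the same string-concatenation idea, each part can be zero-padded to a fixed width and parsed with an explicit format. The sample values, the `parts` and `widths` lists, and the `stamp` variable are assumptions made up for illustration, not part of the original answer:

```python
import pandas as pd

# Made-up sample that mixes one-digit and two-digit values.
df = pd.DataFrame({
    'A': list('abc'),
    'year': [2002, 2002, 2015],
    'month': [7, 12, 1],
    'day': [1, 25, 1],
    'hour_scheduled_departure': [5, 13, 0],
    'minute_scheduled_departure': [7, 45, 5],
})

parts = ['year', 'month', 'day',
         'hour_scheduled_departure', 'minute_scheduled_departure']
widths = [4, 2, 2, 2, 2]

# Zero-pad every component to a fixed width so the pieces line up.
padded = [df[col].astype(str).str.zfill(width)
          for col, width in zip(parts, widths)]

# Concatenate the padded pieces row-wise into strings like '200212251345'.
stamp = padded[0].str.cat(padded[1:])

# An explicit format avoids any guessing about where each field starts.
df['Date'] = pd.to_datetime(stamp, format='%Y%m%d%H%M')
print(df[['A', 'Date']])
```

Because str.zfill leaves strings that are already wide enough unchanged, this should also cope with hour and minute columns stored as two-character strings.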

Answer 1: (score: 3)

You can use rename on the columns, so that pandas.to_datetime can be used with the columns year, month, day, hour, minute:

import pandas as pd

df = pd.DataFrame({
        'A':list('abcdef'),
         'year':[2002,2002,2002,2002,2002,2002],
         'month':[7,8,9,4,2,3],
         'day':[1,3,5,7,1,5],
         'hour_scheduled_departure':[5,3,6,9,2,4],
         'minute_scheduled_departure':[7,8,9,4,2,3]
})

print (df)
   A  year  month  day  hour_scheduled_departure  minute_scheduled_departure
0  a  2002      7    1                         5                           7
1  b  2002      8    3                         3                           8
2  c  2002      9    5                         6                           9
3  d  2002      4    7                         9                           4
4  e  2002      2    1                         2                           2
5  f  2002      3    5                         4                           3

cols = ['year','month','day','hour_scheduled_departure','minute_scheduled_departure']
d = {'hour_scheduled_departure':'hour','minute_scheduled_departure':'minute'}
df['date_time'] = pd.to_datetime(df[cols].rename(columns=d)) 
# remove the source columns if they are no longer needed
df = df.drop(cols, axis=1) 
print (df)
   A           date_time
0  a 2002-07-01 05:07:00
1  b 2002-08-03 03:08:00
2  c 2002-09-05 06:09:00
3  d 2002-04-07 09:04:00
4  e 2002-02-01 02:02:00
5  f 2002-03-05 04:03:00
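Why the rename helps: when given a DataFrame, pd.to_datetime assembles each datetime from columns whose names it recognizes as units (year, month, and day are required; hour, minute, second, and finer units are optional), and any other column name is reported as an extra key. A minimal sketch that reproduces the error from the question and then resolves it; the two-row frame and the names `error_message`, `renamed`, and `result` are assumptions chosen for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2002, 2002],
    'month': [7, 8],
    'day': [1, 3],
    'hour_scheduled_departure': [5, 3],
    'minute_scheduled_departure': [7, 8],
})

# Column names that are not recognized units raise a ValueError.
try:
    pd.to_datetime(df)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Renaming to the unit names to_datetime understands lets assembly succeed.
renamed = df.rename(columns={
    'hour_scheduled_departure': 'hour',
    'minute_scheduled_departure': 'minute',
})
result = pd.to_datetime(renamed)
print(result)
```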

Details (the renamed frame that is passed to to_datetime, taken before the source columns are dropped):

print (df[cols].rename(columns=d))
   year  month  day  hour  minute
0  2002      7    1     5       7
1  2002      8    3     3       8
2  2002      9    5     6       9
3  2002      4    7     9       4
4  2002      2    1     2       2
5  2002      3    5     4       3
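The question mentions that the hour and minute columns are of object dtype holding values of length 2. As a sketch of one way to handle that case without renaming, pd.to_datetime also accepts a dict-like whose keys are the unit names, so the columns can be mapped directly. The sample values below are made up, and the explicit astype(int) conversion is an assumption that every entry is a clean numeric string:

```python
import pandas as pd

# Hypothetical sample mirroring the dtypes described in the question:
# year/month/day as integers, hour/minute as two-character strings.
df = pd.DataFrame({
    'year': [2015, 2015, 2015],
    'month': [1, 1, 1],
    'day': [1, 1, 1],
    'hour_scheduled_departure': ['00', '00', '13'],
    'minute_scheduled_departure': ['05', '25', '40'],
})

# Map each recognized unit name to the column that holds it,
# converting the string columns to integers first.
df['date_time'] = pd.to_datetime({
    'year': df['year'],
    'month': df['month'],
    'day': df['day'],
    'hour': df['hour_scheduled_departure'].astype(int),
    'minute': df['minute_scheduled_departure'].astype(int),
})
print(df['date_time'])
```

If some entries may not be valid numbers, pd.to_numeric with errors='coerce' could replace astype(int), at the cost of turning those rows into missing values.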