Question

I have a DataFrame that looks like this:

date,time,metric_x
2016-02-27,00:00:28.0000000,31
2016-02-27,00:01:19.0000000,40
2016-02-27,00:02:55.0000000,39
2016-02-27,00:03:51.0000000,48
2016-02-27,00:05:22.0000000,42
2016-02-27,00:05:59.0000000,35

I want to generate a new column:

df['time_slot'] = df.apply(lambda row: time_slot_convert(pd.to_datetime(row['time'])), axis=1)

where

def time_slot_convert(time):
    return time.hour + 1

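This function returns the hour of the record plus 1.

This is very slow. I understand that the data is read in as strings. Is there a more efficient way to speed this up?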

Answer 1

It is faster to drop the apply and convert the whole column at once:

df['time_slot'] = pd.to_datetime(df['time']).dt.hour + 1

print(df)
         date              time  metric_x  time_slot
0  2016-02-27  00:00:28.0000000        31          1
1  2016-02-27  00:01:19.0000000        40          1
2  2016-02-27  00:02:55.0000000        39          1
3  2016-02-27  00:03:51.0000000        48          1
4  2016-02-27  00:05:22.0000000        42          1
5  2016-02-27  00:05:59.0000000        35          1
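
If parsing is still the bottleneck, one possible alternative is to skip datetime conversion and read the hour directly from the string. This is only a sketch, and it assumes that every value in the time column is a zero-padded "HH:MM:SS..." string like the ones in the question, so that the hour is always the first two characters. The sample rows below are made up for illustration and are not the original data.

```python
import pandas as pd

# Illustrative data in the same layout as the question (values are hypothetical).
df = pd.DataFrame({
    "date": ["2016-02-27"] * 3,
    "time": ["00:00:28.0000000", "09:01:19.0000000", "23:59:59.0000000"],
    "metric_x": [31, 40, 39],
})

# Assumption: the hour is always the first two characters of the string.
# Slice it off, convert to integer, and add 1, all as vectorized operations.
df["time_slot"] = df["time"].str[:2].astype(int) + 1

print(df["time_slot"].tolist())  # [1, 10, 24]
```

This avoids building datetime objects at all, but it silently gives wrong results if the strings are not fixed-width, so the pd.to_datetime version above is the safer default.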

Datetime conversion - slow in pandas