Question

I am struggling to round timestamps using pandas.

The timestamps look like this:

datetime.datetime(2017, 6, 25, 0, 31, 53, 993000)
datetime.datetime(2017, 6, 25, 0, 32, 31, 224000)
datetime.datetime(2017, 6, 25, 0, 33, 11, 223000)
datetime.datetime(2017, 6, 25, 0, 33, 53, 876000)
datetime.datetime(2017, 6, 25, 0, 34, 31, 219000)
datetime.datetime(2017, 6, 25, 0, 35, 12, 634000)

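How do I round them to the nearest second?

I previously tried some of the suggestions in this post, but they did not work: Rounding time off to the nearest second - Python

So far my code looks like this: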

import datetime
import pandas as pd
filename = 'data.csv'
readcsv = pd.read_csv(filename)

Import the data according to the file header information

log_date = readcsv.date
log_time = readcsv.time
log_lon = readcsv.lon
log_lat = readcsv.lat
log_heading = readcsv.heading

readcsv['date'] = pd.to_datetime(readcsv['date']).dt.date
readcsv['time'] = pd.to_datetime(readcsv['time']).dt.time

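Combine the date and time into a single variable

# Use the converted columns; log_date and log_time still hold the raw strings
timestamp = [datetime.datetime.combine(d, t) for d, t in zip(readcsv['date'], readcsv['time'])]

Create the data frame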

data = {'timestamp':timestamp,'log_lon':log_lon,'log_lat':log_lat,'log_heading':log_heading}
log_data = pd.DataFrame(data,columns=['timestamp','log_lon','log_lat','log_heading'])
log_data.index = log_data['timestamp']

I am still a complete novice, so please forgive my ignorance.

Answer 1

You can first use read_csv with the parse_dates parameter to create a datetime column from the date and time columns, and then use dt.round to round the datetimes:

from io import StringIO

import pandas as pd

temp = """date,time,lon,lat,heading
2017-06-25,00:31:53.993000,48.1254,17.1458,a
2017-06-25,00:32:31.224000,48.1254,17.1458,a
2017-06-25,00:33:11.223000,48.1254,17.1458,a
2017-06-25,00:33:53.876000,48.1254,17.1458,a
2017-06-25,00:34:31.219000,48.1254,17.1458,a
2017-06-25,00:35:12.634000,48.1254,17.1458,a"""
# After testing, replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp), parse_dates={'timestamp': ['date', 'time']})

print (df)
                timestamp      lon      lat heading
0 2017-06-25 00:31:53.993  48.1254  17.1458       a
1 2017-06-25 00:32:31.224  48.1254  17.1458       a
2 2017-06-25 00:33:11.223  48.1254  17.1458       a
3 2017-06-25 00:33:53.876  48.1254  17.1458       a
4 2017-06-25 00:34:31.219  48.1254  17.1458       a
5 2017-06-25 00:35:12.634  48.1254  17.1458       a

print (df.dtypes)
timestamp    datetime64[ns]
lon                 float64
lat                 float64
heading              object
dtype: object
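As a hedged alternative for building the same column: newer pandas releases may warn about, or no longer accept, combining columns through parse_dates, so the sketch below joins the two text columns and parses them with pd.to_datetime instead. It assumes the columns are named date and time as in the sample above and that both are read in as plain strings; the three-row sample is only for illustration.

```python
from io import StringIO

import pandas as pd

# Shortened copy of the sample data (assumed column names: date, time).
temp = """date,time,lon,lat,heading
2017-06-25,00:31:53.993000,48.1254,17.1458,a
2017-06-25,00:32:31.224000,48.1254,17.1458,a
2017-06-25,00:33:53.876000,48.1254,17.1458,a"""

df = pd.read_csv(StringIO(temp))

# Join the two string columns and parse the result as a single datetime column.
df['timestamp'] = pd.to_datetime(df['date'] + ' ' + df['time'])

# The separate date and time columns are no longer needed.
df = df.drop(columns=['date', 'time'])
print(df)
```

The resulting timestamp column still carries the milliseconds, so it can be rounded in the same way as the column created with parse_dates.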

df['timestamp'] = df['timestamp'].dt.round('1s')

print (df)
            timestamp      lon      lat heading
0 2017-06-25 00:31:54  48.1254  17.1458       a
1 2017-06-25 00:32:31  48.1254  17.1458       a
2 2017-06-25 00:33:11  48.1254  17.1458       a
3 2017-06-25 00:33:54  48.1254  17.1458       a
4 2017-06-25 00:34:31  48.1254  17.1458       a
5 2017-06-25 00:35:13  48.1254  17.1458       a

Edit:

If you want to set the datetime column as the index:

from io import StringIO

import pandas as pd

temp = """date,time,lon,lat,heading
2017-06-25,00:31:53.993000,48.1254,17.1458,a
2017-06-25,00:32:31.224000,48.1254,17.1458,a
2017-06-25,00:33:11.223000,48.1254,17.1458,a
2017-06-25,00:33:53.876000,48.1254,17.1458,a
2017-06-25,00:34:31.219000,48.1254,17.1458,a
2017-06-25,00:35:12.634000,48.1254,17.1458,a"""
# After testing, replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp), parse_dates={'timestamp': ['date', 'time']}, index_col=['timestamp'])
print (df)
                             lon      lat heading
timestamp                                        
2017-06-25 00:31:53.993  48.1254  17.1458       a
2017-06-25 00:32:31.224  48.1254  17.1458       a
2017-06-25 00:33:11.223  48.1254  17.1458       a
2017-06-25 00:33:53.876  48.1254  17.1458       a
2017-06-25 00:34:31.219  48.1254  17.1458       a
2017-06-25 00:35:12.634  48.1254  17.1458       a
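Once the timestamps are the index, the .dt accessor no longer applies, but a DatetimeIndex has its own round method. A minimal sketch, assuming a frame shaped like the one above (the index is built by hand here only to keep the example short):

```python
import pandas as pd

# Hand-built stand-in for the frame above, with the timestamps as the index.
idx = pd.DatetimeIndex(
    ['2017-06-25 00:31:53.993',
     '2017-06-25 00:32:31.224',
     '2017-06-25 00:35:12.634'],
    name='timestamp',
)
df = pd.DataFrame(
    {'lon': [48.1254] * 3, 'lat': [17.1458] * 3, 'heading': ['a'] * 3},
    index=idx,
)

# Round the index itself; round() returns a new index, so assign it back.
df.index = df.index.round('1s')
print(df)
```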

Answer 2

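dt.round is what you are looking for. I will just create a smaller version of your DataFrame; please leave a comment if you cannot adapt it to fit your case exactly, and I can help with that too.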

import datetime
import pandas as pd

ts1 = datetime.datetime(2017, 6, 25, 0, 31, 53, 993000)
ts2 = datetime.datetime(2017, 6, 25, 0, 32, 31, 224000)
ts3 = datetime.datetime(2017, 6, 25, 0, 33, 11, 223000)
df = pd.DataFrame({'timestamp':[ts1, ts2, ts3]})

df.timestamp.dt.round('1s')

This gives you the following:

Out[89]: 
0   2017-06-25 00:31:54
1   2017-06-25 00:32:31
2   2017-06-25 00:33:11
Name: timestamp, dtype: datetime64[ns]
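To make "nearest" concrete, here is a small sketch comparing dt.round with dt.floor and dt.ceil on two of the timestamps from the question. The variable names are only illustrative, and none of these calls modifies the Series in place, so assign the result back to the column if you want to keep it.

```python
import datetime

import pandas as pd

s = pd.Series([
    datetime.datetime(2017, 6, 25, 0, 31, 53, 993000),
    datetime.datetime(2017, 6, 25, 0, 32, 31, 224000),
])

rounded = s.dt.round('1s')  # nearest whole second
floored = s.dt.floor('1s')  # always down to the start of the second
ceiled = s.dt.ceil('1s')    # always up to the next whole second

print(pd.DataFrame({'round': rounded, 'floor': floored, 'ceil': ceiled}))
```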
