I am trying to format a column of timestamps in a df so that each time point represents 0.1 seconds. Here is the function I am using:
import numpy as np
import pandas as pd

d = ({
    'Time' : ['2010-07-27 09:25:31','2010-07-27 09:25:31.1000',
              '2010-07-27 09:25:31.2000','2010-07-27 09:25:31.3000',
              '2010-07-27 09:25:31.4000','2010-07-27 09:25:31.5000',
              '2010-07-27 09:25:32'],
    'Value' : [np.nan,np.nan,np.nan,np.nan,-8,np.nan,-6],
    })
df = pd.DataFrame(data=d)
df['Time']= pd.to_datetime(df['Time'])

def format_time(col):
    # col is the whole 'Time' column (a Series)
    t = col
    s = t.dt.strftime('%Y-%m-%d %H:%M:%S.%f')
    tail = s[-7:]
    f = round(float(tail), 3)
    temp = "%.1f" % f
    return "%s%s" % (s[:-7], temp[1:])

df['Time'] = format_time(df['Time'])
Expected output:
                    Time  Value
0  2010-07-27 09:25:31.1    NaN
1  2010-07-27 09:25:31.2    NaN
2  2010-07-27 09:25:31.3    NaN
3  2010-07-27 09:25:31.4    NaN
4  2010-07-27 09:25:31.5   -8.0
5  2010-07-27 09:25:31.6    NaN
6  2010-07-27 09:25:31.7   -6.0
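Answer 0 (score: 3)
I believe you need to apply the function to each value instead. Series.dt.strftime returns a whole Series of strings, so s[-7:] in your version slices rows rather than characters: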
def format_time(t):
    # t is a single Timestamp here, so .dt and t = col were removed
    s = t.strftime('%Y-%m-%d %H:%M:%S.%f')
    tail = s[-7:]                # fractional part, e.g. '.100000'
    print (tail)
    f = round(float(tail), 3)
    print (f)
    temp = "%.1f" % f            # one decimal place, e.g. '0.1'
    return "%s%s" % (s[:-7], temp[1:])

df['Time'] = df['Time'].apply(format_time)
print (df)
                    Time  Value
0  2010-07-27 09:25:31.0    NaN
1  2010-07-27 09:25:31.1    NaN
2  2010-07-27 09:25:31.2    NaN
3  2010-07-27 09:25:31.3    NaN
4  2010-07-27 09:25:31.4   -8.0
5  2010-07-27 09:25:31.5    NaN
6  2010-07-27 09:25:32.0   -6.0
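One edge case is worth checking in the per-value function: when the fractional part is 0.95 or more, "%.1f" formats it as "1.0" and temp[1:] keeps only ".0", so the carry into the seconds is lost. Below is a minimal sketch of one way around this, assuming it is acceptable to round the Timestamp itself with Timestamp.round('100ms') before formatting. The names format_time_original and format_time_tenths and the sample value are made up for illustration.

```python
import pandas as pd

def format_time_original(t):
    # Same logic as the per-value function above, without the debug prints.
    s = t.strftime('%Y-%m-%d %H:%M:%S.%f')
    tail = s[-7:]
    f = round(float(tail), 3)
    temp = "%.1f" % f
    return "%s%s" % (s[:-7], temp[1:])

def format_time_tenths(t):
    # Hypothetical alternative: round the Timestamp to the nearest 0.1 s
    # first, so a carry into the seconds is handled by pandas.
    rounded = t.round('100ms')
    # %f always yields six digits; drop the last five to keep one.
    return rounded.strftime('%Y-%m-%d %H:%M:%S.%f')[:-5]

ts = pd.Timestamp('2010-07-27 09:25:31.960')
print(format_time_original(ts))  # 2010-07-27 09:25:31.0 (carry lost)
print(format_time_tenths(ts))    # 2010-07-27 09:25:32.0
```

It can be used the same way, with df['Time'].apply(format_time_tenths). For the sample data in the question both functions give the same strings, since none of the fractions reach 0.95.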
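Alternatively, use Series.dt.round to round to milliseconds, then drop the last two digits of the milliseconds:
# 'ms' is the milliseconds alias; 'L' is its older spelling, deprecated in recent pandas
df['Time'] = df['Time'].dt.round('ms').astype(str).str[:-2]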
print (df)
                    Time  Value
0  2010-07-27 09:25:31.0    NaN
1  2010-07-27 09:25:31.1    NaN
2  2010-07-27 09:25:31.2    NaN
3  2010-07-27 09:25:31.3    NaN
4  2010-07-27 09:25:31.4   -8.0
5  2010-07-27 09:25:31.5    NaN
6  2010-07-27 09:25:32.0   -6.0
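The question is whether rounding is needed at all. I tested it, and it depends on the data: rounding is necessary when the fractional part of the timestamps (the %f part) carries more digits than milliseconds: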
d = ({
'Time' : ['2010-07-27 09:25:31','2010-07-27 09:25:31.1000',
'2010-07-27 09:25:31.2000','2010-07-27 09:25:31.3000',
'2010-07-27 09:25:31.499','2010-07-27 09:25:31.5153',
'2010-07-27 09:25:32'],
'Value' : [np.nan,np.nan,np.nan,np.nan,-8,np.nan,-6],
})
df = pd.DataFrame(data=d)
#print (df)
df['Time']= pd.to_datetime(df['Time'])
def format_time(t):
    #removed dt and t = col
    s = t.strftime('%Y-%m-%d %H:%M:%S.%f')
    tail = s[-7:]
    f = round(float(tail), 3)
    temp = "%.1f" % f
    return "%s%s" % (s[:-7], temp[1:])

df['Time0'] = df['Time'].apply(format_time)
df['Time1'] = df['Time'].dt.round('ms').astype(str).str[:-2]
df['Time2'] = df.Time.astype(str).str[:-2]
print (df)
                        Time  Value                  Time0  \
0 2010-07-27 09:25:31.000000    NaN  2010-07-27 09:25:31.0
1 2010-07-27 09:25:31.100000    NaN  2010-07-27 09:25:31.1
2 2010-07-27 09:25:31.200000    NaN  2010-07-27 09:25:31.2
3 2010-07-27 09:25:31.300000    NaN  2010-07-27 09:25:31.3
4 2010-07-27 09:25:31.499000   -8.0  2010-07-27 09:25:31.5
5 2010-07-27 09:25:31.515300    NaN  2010-07-27 09:25:31.5
6 2010-07-27 09:25:32.000000   -6.0  2010-07-27 09:25:32.0

                   Time1                     Time2
0  2010-07-27 09:25:31.0  2010-07-27 09:25:31.0000
1  2010-07-27 09:25:31.1  2010-07-27 09:25:31.1000
2  2010-07-27 09:25:31.2  2010-07-27 09:25:31.2000
3  2010-07-27 09:25:31.3  2010-07-27 09:25:31.3000
4  2010-07-27 09:25:31.4  2010-07-27 09:25:31.4990
5  2010-07-27 09:25:31.5  2010-07-27 09:25:31.5153
6  2010-07-27 09:25:32.0  2010-07-27 09:25:32.0000
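In the comparison above, Time0 rounds to the nearest tenth (31.499 becomes 31.5), Time1 rounds to milliseconds and then cuts characters, which truncates to the tenth (31.499 becomes 31.4), and Time2 only cuts two characters, so four fractional digits remain. If the choice between rounding and truncating should be explicit rather than depend on the string width, a possible sketch is to round or floor to 100 ms first and then format with a fixed pattern. This is an assumption on my part rather than part of the answer above; times, fmt, rounded and floored are illustrative names, and the third sample value is made up.

```python
import pandas as pd

# Sample values, all written with four fractional digits.
times = pd.to_datetime(pd.Series([
    '2010-07-27 09:25:31.4990',
    '2010-07-27 09:25:31.5153',
    '2010-07-27 09:25:31.9600',
]))

fmt = '%Y-%m-%d %H:%M:%S.%f'  # %f always produces six digits

# Nearest tenth of a second, then keep one fractional digit.
rounded = times.dt.round('100ms').dt.strftime(fmt).str[:-5]
# Truncate to the tenth of a second instead.
floored = times.dt.floor('100ms').dt.strftime(fmt).str[:-5]

print(pd.DataFrame({'rounded': rounded, 'floored': floored}))
```

Because strftime with %f always gives six fractional digits, the slice [:-5] leaves exactly one digit regardless of how the column would print with astype(str).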
Answer 1 (score: 2)
You can get what you want with:
df.Time.astype(str).str[:-2]
0 2010-07-27 09:25:31.0
1 2010-07-27 09:25:31.1
2 2010-07-27 09:25:31.2
3 2010-07-27 09:25:31.3
4 2010-07-27 09:25:31.4
5 2010-07-27 09:25:31.5
6 2010-07-27 09:25:32.0
Name: Time, dtype: object
Note, however, that the result now has object dtype and is no longer a timestamp.
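If the column still needs to behave like a timestamp afterwards, one option is to keep a rounded datetime column and derive the string only for display. The following is a rough sketch under that assumption; the column name TimeStr and the two sample rows are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Time': pd.to_datetime(['2010-07-27 09:25:31.100',
                                           '2010-07-27 09:25:31.499'])})

# Keep a real datetime column, rounded to 0.1 s, for further time arithmetic.
df['Time'] = df['Time'].dt.round('100ms')

# Derive a display-only string column with one fractional digit.
df['TimeStr'] = df['Time'].dt.strftime('%Y-%m-%d %H:%M:%S.%f').str[:-5]

# The strings can be parsed back into timestamps when needed.
restored = pd.to_datetime(df['TimeStr'])

print(df)
print(df.dtypes)
```

Here Time stays a datetime column while TimeStr holds the formatted text, so both uses remain available.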