Formatting timestamps in a column - Pandas

Time: 2019-10-24 03:32:20

Tags: python pandas

I am trying to format a column of timestamps in a df so that each time point represents 0.1 of a second. Below is the function I am using:

import numpy as np
import pandas as pd

d = ({
    'Time' : ['2010-07-27 09:25:31','2010-07-27 09:25:31.1000','2010-07-27 09:25:31.2000','2010-07-27 09:25:31.3000','2010-07-27 09:25:31.4000','2010-07-27 09:25:31.5000','2010-07-27 09:25:32'],
    'Value' : [np.nan,np.nan,np.nan,np.nan,-8,np.nan,-6],
   })

df = pd.DataFrame(data=d)

df['Time']= pd.to_datetime(df['Time']) 

def format_time(col):
    t = col    
    s = t.dt.strftime('%Y-%m-%d %H:%M:%S.%f')
    tail = s[-7:]
    f = round(float(tail), 3)
    temp = "%.1f" % f
    return "%s%s" % (s[:-7], temp[1:])

df['Time'] = format_time(df['Time'])

Expected output:

                     Time  Value
0 2010-07-27 09:25:31.1      NaN
1 2010-07-27 09:25:31.2      NaN
2 2010-07-27 09:25:31.3      NaN
3 2010-07-27 09:25:31.4      NaN
4 2010-07-27 09:25:31.5     -8.0
5 2010-07-27 09:25:31.6      NaN
6 2010-07-27 09:25:31.7     -6.0

2 answers:

Answer 0 (score: 3)

I believe you need:

def format_time(t):
    # t is a single Timestamp here, so the .dt accessor and `t = col` were removed
    s = t.strftime('%Y-%m-%d %H:%M:%S.%f')
    tail = s[-7:]
    print (tail)
    f = round(float(tail), 3)
    print (f)
    temp = "%.1f" % f
    return "%s%s" % (s[:-7], temp[1:])


df['Time'] = df['Time'].apply(format_time)
print (df)
                    Time  Value
0  2010-07-27 09:25:31.0    NaN
1  2010-07-27 09:25:31.1    NaN
2  2010-07-27 09:25:31.2    NaN
3  2010-07-27 09:25:31.3    NaN
4  2010-07-27 09:25:31.4   -8.0
5  2010-07-27 09:25:31.5    NaN
6  2010-07-27 09:25:32.0   -6.0
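
The function in the question fails because `t.dt.strftime(...)` returns a whole Series, so `s[-7:]` slices rows rather than characters and `float(tail)` cannot convert it. Applying the function to one Timestamp at a time avoids that. As a minimal sketch of a vectorized alternative, the string slicing can also be done on the Series itself through the `.str` accessor. The sample values and the name `formatted` are only illustrative, and note that this version truncates to one decimal digit instead of rounding:

```python
import pandas as pd

# Illustrative sample data (every string carries a fractional part so the
# datetime format is inferred consistently).
times = pd.to_datetime(pd.Series([
    "2010-07-27 09:25:31.000",
    "2010-07-27 09:25:31.100",
    "2010-07-27 09:25:32.000",
]))

# %f always emits six fractional digits, so dropping the last five
# characters leaves exactly one digit after the decimal point.
formatted = times.dt.strftime("%Y-%m-%d %H:%M:%S.%f").str[:-5]
print(formatted.tolist())
```

The result is a Series of strings, just like the output of `apply(format_time)`.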

Alternatively, round to milliseconds with Series.dt.round and then strip the last 2 digits of the milliseconds:

df['Time'] = df['Time'].dt.round('ms').astype(str).str[:-2]
print (df)
                    Time  Value
0  2010-07-27 09:25:31.0    NaN
1  2010-07-27 09:25:31.1    NaN
2  2010-07-27 09:25:31.2    NaN
3  2010-07-27 09:25:31.3    NaN
4  2010-07-27 09:25:31.4   -8.0
5  2010-07-27 09:25:31.5    NaN
6  2010-07-27 09:25:32.0   -6.0

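The question is whether rounding is necessary. I tested it, and it depends on the data: rounding is necessary if the timestamps carry more digits in the fractional seconds (the %f part, which holds microseconds):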

d = ({   
    'Time' : ['2010-07-27 09:25:31','2010-07-27 09:25:31.1000',
              '2010-07-27 09:25:31.2000','2010-07-27 09:25:31.3000',
              '2010-07-27 09:25:31.499','2010-07-27 09:25:31.5153',
              '2010-07-27 09:25:32'],
    'Value' : [np.nan,np.nan,np.nan,np.nan,-8,np.nan,-6],
   })

df = pd.DataFrame(data=d)
#print (df)

df['Time']= pd.to_datetime(df['Time']) 

def format_time(t):
    # t is a single Timestamp here, so the .dt accessor and `t = col` were removed
    s = t.strftime('%Y-%m-%d %H:%M:%S.%f')
    tail = s[-7:]
    f = round(float(tail), 3)
    temp = "%.1f" % f
    return "%s%s" % (s[:-7], temp[1:])


df['Time0'] = df['Time'].apply(format_time)


df['Time1'] = df['Time'].dt.round('ms').astype(str).str[:-2]

df['Time2'] = df.Time.astype(str).str[:-2]

print (df)
                        Time  Value                  Time0  \
0 2010-07-27 09:25:31.000000    NaN  2010-07-27 09:25:31.0   
1 2010-07-27 09:25:31.100000    NaN  2010-07-27 09:25:31.1   
2 2010-07-27 09:25:31.200000    NaN  2010-07-27 09:25:31.2   
3 2010-07-27 09:25:31.300000    NaN  2010-07-27 09:25:31.3   
4 2010-07-27 09:25:31.499000   -8.0  2010-07-27 09:25:31.5   
5 2010-07-27 09:25:31.515300    NaN  2010-07-27 09:25:31.5   
6 2010-07-27 09:25:32.000000   -6.0  2010-07-27 09:25:32.0   

                   Time1                     Time2  
0  2010-07-27 09:25:31.0  2010-07-27 09:25:31.0000  
1  2010-07-27 09:25:31.1  2010-07-27 09:25:31.1000  
2  2010-07-27 09:25:31.2  2010-07-27 09:25:31.2000  
3  2010-07-27 09:25:31.3  2010-07-27 09:25:31.3000  
4  2010-07-27 09:25:31.4  2010-07-27 09:25:31.4990  
5  2010-07-27 09:25:31.5  2010-07-27 09:25:31.5153  
6  2010-07-27 09:25:32.0  2010-07-27 09:25:32.0000  
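
The comparison above shows that the three variants disagree on 09:25:31.499: the string-based function rounds it to .5, while slicing characters off truncates it to .4. As a hedged sketch, the choice can be made explicit by rounding or flooring to 100 ms on the datetime values before formatting. The sample values and the names `rounded`, `floored` and `string_based` are assumptions for illustration. The third sample value also suggests that the string-based function does not carry into the seconds when the fraction rounds up to 1.0:

```python
import pandas as pd

def format_time(t):
    # Same string-based logic as in the answer above.
    s = t.strftime("%Y-%m-%d %H:%M:%S.%f")
    tail = s[-7:]
    f = round(float(tail), 3)
    temp = "%.1f" % f
    return "%s%s" % (s[:-7], temp[1:])

# Illustrative sample data, not taken from the question.
times = pd.to_datetime(pd.Series([
    "2010-07-27 09:25:31.4990",
    "2010-07-27 09:25:31.5153",
    "2010-07-27 09:25:31.9600",
]))

fmt = "%Y-%m-%d %H:%M:%S.%f"

# Round to the nearest tenth of a second, then keep one fractional digit.
rounded = times.dt.round("100ms").dt.strftime(fmt).str[:-5]

# Truncate to the tenth of a second instead.
floored = times.dt.floor("100ms").dt.strftime(fmt).str[:-5]

# For comparison: .96 becomes "1.0", whose leading digit is discarded.
string_based = times.apply(format_time)

print(pd.DataFrame({"rounded": rounded,
                    "floored": floored,
                    "string_based": string_based}))
```

With `dt.round("100ms")` the last value becomes 09:25:32.0, whereas the string-based function returns 09:25:31.0 for it.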

Answer 1 (score: 2)

You can achieve what you want with:

df.Time.astype(str).str[:-2]
0    2010-07-27 09:25:31.0
1    2010-07-27 09:25:31.1
2    2010-07-27 09:25:31.2
3    2010-07-27 09:25:31.3
4    2010-07-27 09:25:31.4
5    2010-07-27 09:25:31.5
6    2010-07-27 09:25:32.0
Name: Time, dtype: object

Note, however, that the column is now of object dtype and no longer a timestamp.
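
If the column has to stay a real datetime column, one possible approach (a sketch, not part of the original answers) is to round the values to 100 ms and only convert to strings when displaying or exporting. The sample data and the names `is_datetime` and `elapsed` are assumptions for illustration:

```python
import pandas as pd

# Illustrative sample data, not taken from the question.
df = pd.DataFrame({
    "Time": pd.to_datetime(["2010-07-27 09:25:31.4990",
                            "2010-07-27 09:25:31.7153"]),
    "Value": [-8.0, None],
})

# Rounding keeps the datetime64 dtype; only the resolution changes.
df["Time"] = df["Time"].dt.round("100ms")

# The column still supports datetime operations such as differences.
is_datetime = pd.api.types.is_datetime64_any_dtype(df["Time"])
elapsed = df["Time"].diff()

print(is_datetime)
print(elapsed)
```

The trade-off is that pandas chooses how many fractional digits to print for a datetime column, so the exact one-digit display still requires a string conversion at output time.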