我下面有一个Timestamp
列的csv文件。我想将格式更改为2013-08-12 10:29:19.673
或1秒钟的粒度。当前Timestamp
的类型为object
。
我可以在excel中手动更改其格式,但是文件太大,并且某些行会丢失。
Id Timestamp Data Group_Id
0 19929927 00:07.5 27.0 27
1 19929928 00:08.3 26.5 27
2 19929929 00:48.7 33.5 157
3 19929930 00:50.0 33.0 157
4 19929931 00:53.1 35.0 25
...
1048570 20978497 10:11.9 34.5 152
1048571 20978498 10:13.3 34.0 152
1048572 20978499 10:41.2 42.0 138
1048573 20978500 10:42.5 45.0 138
1048574 20978501 10:43.9 44.0 138
答案 0 :(得分:1)
编辑:如果将时间转换为日期时间而没有日期信息,则熊猫显然会添加实际日期。
如果还需要几天,请检查以下解决方案:
如果时间以0
开头,则创建连续的日期时间的想法:
df = df[['Timestamp']]
print (df)
Timestamp
0 00:08.3 <- first day
1 00:48.7
2 00:50.0
3 00:53.1
4 10:11.9
5 10:13.3
6 10:41.2
7 00:50.0 <- second day
8 00:53.1
9 10:42.5
10 10:43.9
11 00:07.5 <- third day
12 00:08.3
13 10:11.9
14 10:13.3
15 10:43.9
#convert to datetimes and get hours for test 0
df['h'] = pd.to_datetime(df['Timestamp']).dt.hour
#test first 0 for start of day
df['mask'] = df['h'].shift().ne(0) & df['h'].eq(0)
#create consecutive groups - starts by 1 if first time start by 0, else start by 1
df['g'] = df['mask'].cumsum()
#specify first day in origin parameter
df['days'] = pd.to_datetime(df['g'], origin='2016-01-01', unit='d')
#add to original Timestamps if HH:MM.SS
df['Timestamp1'] = df['days'] + pd.to_timedelta(df['Timestamp'].str.replace('\.',':'))
#add to original Timestamps if format without hours - MM:SS.SS
df['Timestamp2'] = df['days'] + pd.to_timedelta('00:' + df['Timestamp'])
print (df)
Timestamp h mask g days Timestamp1 \
0 00:08.3 0 True 1 2016-01-02 2016-01-02 00:08:03
1 00:48.7 0 False 1 2016-01-02 2016-01-02 00:48:07
2 00:50.0 0 False 1 2016-01-02 2016-01-02 00:50:00
3 00:53.1 0 False 1 2016-01-02 2016-01-02 00:53:01
4 10:11.9 10 False 1 2016-01-02 2016-01-02 10:11:09
5 10:13.3 10 False 1 2016-01-02 2016-01-02 10:13:03
6 10:41.2 10 False 1 2016-01-02 2016-01-02 10:41:02
7 00:50.0 0 True 2 2016-01-03 2016-01-03 00:50:00
8 00:53.1 0 False 2 2016-01-03 2016-01-03 00:53:01
9 10:42.5 10 False 2 2016-01-03 2016-01-03 10:42:05
10 10:43.9 10 False 2 2016-01-03 2016-01-03 10:43:09
11 00:07.5 0 True 3 2016-01-04 2016-01-04 00:07:05
12 00:08.3 0 False 3 2016-01-04 2016-01-04 00:08:03
13 10:11.9 10 False 3 2016-01-04 2016-01-04 10:11:09
14 10:13.3 10 False 3 2016-01-04 2016-01-04 10:13:03
15 10:43.9 10 False 3 2016-01-04 2016-01-04 10:43:09
Timestamp2
0 2016-01-02 00:00:08.300
1 2016-01-02 00:00:48.700
2 2016-01-02 00:00:50.000
3 2016-01-02 00:00:53.100
4 2016-01-02 00:10:11.900
5 2016-01-02 00:10:13.300
6 2016-01-02 00:10:41.200
7 2016-01-03 00:00:50.000
8 2016-01-03 00:00:53.100
9 2016-01-03 00:10:42.500
10 2016-01-03 00:10:43.900
11 2016-01-04 00:00:07.500
12 2016-01-04 00:00:08.300
13 2016-01-04 00:10:11.900
14 2016-01-04 00:10:13.300
15 2016-01-04 00:10:43.900