Converting a time format to seconds in a pandas DataFrame

Time: 2020-11-12 11:25:15

Tags: python pandas time

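I have a df whose time columns hold strings in minutes:seconds form, and I want to convert those values to seconds (see the example below).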


   Compression_level Size (M) Real time (s) User time (s) Sys time (s)
0                  0      265      0:19.938      0:24.649      0:3.062
1                  1       76      0:17.910      0:25.929      0:3.098
2                  2       74      1:02.619      0:27.724      0:3.014
3                  3       73      0:20.607      0:27.937      0:3.193
4                  4       67      0:19.598      0:28.853      0:2.925
5                  5       67      0:21.032      0:30.119      0:3.206
6                  6       66      0:27.013      0:31.462      0:3.106
7                  7       65      0:27.337      0:36.226      0:3.060
8                  8       64      0:37.651      0:47.246      0:2.933
9                  9       64      0:59.241       1:8.333      0:3.027


This is the output I would like to get.



df["Real time (s)"]
0    19.938
1    17.910
2    62.619
...


I have some code that works for a single value, but I don't know how to apply it across the data frame:

import datetime
import time
x = time.strptime("00:01:00", "%H:%M:%S")
datetime.timedelta(hours=x.tm_hour, minutes=x.tm_min, seconds=x.tm_sec).total_seconds()

1 answer:

Answer 0 (score: 2)

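Prepend 00: to each value with Series.radd so that every string gets a zero hours field, pass the result to to_timedelta, and then call Series.dt.total_seconds: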

# "0:19.938" becomes "00:0:19.938" (hours:minutes:seconds) before parsing
df["Real time (s)"] = pd.to_timedelta(df["Real time (s)"].radd('00:')).dt.total_seconds()
print(df)
   Compression_level  Size (M)  Real time (s) User time (s) Sys time (s)
0                  0       265         19.938      0:24.649      0:3.062
1                  1        76         17.910      0:25.929      0:3.098
2                  2        74         62.619      0:27.724      0:3.014
3                  3        73         20.607      0:27.937      0:3.193
4                  4        67         19.598      0:28.853      0:2.925
5                  5        67         21.032      0:30.119      0:3.206
6                  6        66         27.013      0:31.462      0:3.106
7                  7        65         27.337      0:36.226      0:3.060
8                  8        64         37.651      0:47.246      0:2.933
9                  9        64         59.241       1:8.333      0:3.027
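
The three steps in the one-liner above can be spelled out separately, which may make it easier to see what each call contributes. The following is only a minimal sketch: the small frame is rebuilt by hand from a few rows of the question's data, and the intermediate names prefixed, as_timedelta and real_seconds are illustrative and do not appear in the original answer.

```python
import pandas as pd

# A small slice of the question's data, rebuilt by hand for illustration.
df = pd.DataFrame({
    "Compression_level": [0, 2, 9],
    "Real time (s)": ["0:19.938", "1:02.619", "0:59.241"],
    "User time (s)": ["0:24.649", "0:27.724", "1:8.333"],
})

# Step 1: radd("00:") puts the prefix on the left of every string,
# turning "M:SS.fff" into "00:M:SS.fff" (hours:minutes:seconds).
prefixed = df["Real time (s)"].radd("00:")
print(prefixed.tolist())  # ['00:0:19.938', '00:1:02.619', '00:0:59.241']

# Step 2: parse the prefixed strings as timedeltas.
as_timedelta = pd.to_timedelta(prefixed)

# Step 3: express each timedelta as a float number of seconds.
real_seconds = as_timedelta.dt.total_seconds()
print(real_seconds.tolist())
```

The prefix matters because the raw strings carry only minutes and seconds, whereas the hours:minutes:seconds form is what the answer feeds to to_timedelta.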

Solution for processing multiple columns:

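def to_td(x):
    # x is one column (a Series): prepend "00:" as the hours field, then convert to seconds
    return pd.to_timedelta(x.radd('00:')).dt.total_seconds()

cols = ["Real time (s)", "User time (s)", "Sys time (s)"]
df[cols] = df[cols].apply(to_td)  # DataFrame.apply calls to_td once per column
print(df)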
   Compression_level  Size (M)  Real time (s)  User time (s)  Sys time (s)
0                  0       265         19.938         24.649         3.062
1                  1        76         17.910         25.929         3.098
2                  2        74         62.619         27.724         3.014
3                  3        73         20.607         27.937         3.193
4                  4        67         19.598         28.853         2.925
5                  5        67         21.032         30.119         3.206
6                  6        66         27.013         31.462         3.106
7                  7        65         27.337         36.226         3.060
8                  8        64         37.651         47.246         2.933
9                  9        64         59.241         68.333         3.027
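
If relying on timedelta string parsing feels opaque, the same numbers can also be obtained with plain arithmetic on the two fields. This is an alternative sketch, not part of the original answer: the helper name mmss_to_seconds is hypothetical, and it assumes every value has exactly one colon separating minutes from seconds, as in the question's data.

```python
import pandas as pd

def mmss_to_seconds(col: pd.Series) -> pd.Series:
    """Convert strings like "1:02.619" (minutes:seconds) to float seconds."""
    # Split once on ":" into a minutes part (column 0) and a seconds part (column 1).
    parts = col.str.strip().str.split(":", n=1, expand=True)
    minutes = parts[0].astype(float)
    seconds = parts[1].astype(float)
    return minutes * 60 + seconds

# Two rows rebuilt by hand from the question's data, for illustration only.
df = pd.DataFrame({
    "Real time (s)": ["0:19.938", "1:02.619"],
    "User time (s)": ["0:24.649", "1:8.333"],
})

cols = ["Real time (s)", "User time (s)"]
df[cols] = df[cols].apply(mmss_to_seconds)  # one call per column
print(df)
```

Unlike the to_timedelta approach, this version would need changes if some values also carried an hours field.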