I imported a txt file, and this is the result:
   Place  Bib  Athlete Name      City  State  Age  Gender  FinishTime  \
0      1  120      Runner 1     Bronx     NY   31       M     1:21:40
1      2  910      Runner 2     Bronx     NY   38       M     1:23:16
2      3  352      Runner 3  New York     NY   45       M     1:24:28

   Unnamed: 8
0         NaN
1         NaN
2         NaN
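I want to convert the FinishTime string objects to a time format. Using pd.to_datetime(race['FinishTime']) I get "ValueError: Given date string not likely a datetime." Any suggestions on how to do this? I would like to do calculations on the times, for example: Runner 1 is x% faster than Runner 2. Thanks.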
Answer 0: (score: 1)
to_datetime accepts a format parameter that specifies what the timestamp looks like:
>>> pd.to_datetime('1:23:16', format='%H:%M:%S')
Timestamp('1900-01-01 01:23:16')
For your data:
pd.to_datetime(race['FinishTime'], format='%H:%M:%S')
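One thing to be aware of with this approach: every parsed value lands on the placeholder date 1900-01-01, so the result is a point in time rather than a duration. Subtracting two of them still gives a usable Timedelta. A minimal sketch, assuming a race frame rebuilt by hand from the three rows in the question (the frame and the names parsed and gap are only for illustration):

```python
import pandas as pd

# Hypothetical frame rebuilt from the three rows shown in the question.
race = pd.DataFrame({
    'Athlete Name': ['Runner 1', 'Runner 2', 'Runner 3'],
    'FinishTime': ['1:21:40', '1:23:16', '1:24:28'],
})

# Each string becomes a Timestamp on the default date 1900-01-01.
parsed = pd.to_datetime(race['FinishTime'], format='%H:%M:%S')

# Subtracting two Timestamps yields a Timedelta (the gap between runners).
gap = parsed.iloc[1] - parsed.iloc[0]
print(gap)
```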
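Answer 1: (score: 1)
You can use to_timedelta to convert the column to timedelta. Then call set_index on the Athlete Name column and select the values with loc. Finally, take the difference of the timedeltas: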
df.FinishTime = pd.to_timedelta(df.FinishTime)
df = df.set_index('Athlete Name')
runner1 = df.loc['Runner 1', 'FinishTime']
runner2 = df.loc['Runner 2', 'FinishTime']
print('Runner 1 is {} faster than runner 2.'.format(runner2 - runner1))
Runner 1 is 0 days 00:01:36 faster than runner 2.
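For the "x% faster" part of the question, the timedeltas can be turned into seconds and divided. A rough sketch, assuming "x% faster" means the time saved relative to the slower runner's finish time (other definitions, such as a ratio of average speeds, give slightly different numbers); the frame is again rebuilt by hand from the question's rows:

```python
import pandas as pd

# Hypothetical frame rebuilt from the three rows shown in the question.
df = pd.DataFrame({
    'Athlete Name': ['Runner 1', 'Runner 2', 'Runner 3'],
    'FinishTime': ['1:21:40', '1:23:16', '1:24:28'],
})
df['FinishTime'] = pd.to_timedelta(df['FinishTime'])
df = df.set_index('Athlete Name')

# Durations as plain floats (seconds), indexed by athlete name.
seconds = df['FinishTime'].dt.total_seconds()

# Assumed definition: time saved as a share of Runner 2's finish time.
pct = (seconds['Runner 2'] - seconds['Runner 1']) / seconds['Runner 2'] * 100
print('Runner 1 took {:.2f}% less time than Runner 2.'.format(pct))
```

Dividing one Timedelta by another also works and returns a plain float, so the conversion to seconds is mostly a matter of taste.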
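Answer 2: (score: 0)
Strictly speaking, a finish time is not a time of day but a timedelta: a duration that says how long it took to complete the race.
The following code converts the strings to timedelta objects and does the calculation: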
from datetime import timedelta
finish_time_1 = '1:23:16'
finish_time_2 = '1:24:28'
hours, minutes, seconds = finish_time_1.split(':')
duration_1 = timedelta(hours=int(hours), minutes=int(minutes), seconds=int(seconds))
hours, minutes, seconds = finish_time_2.split(':')
duration_2 = timedelta(hours=int(hours), minutes=int(minutes), seconds=int(seconds))
print('Runner 1 is {} faster than runner 2'.format(duration_2 - duration_1))
The result is: "Runner 1 is 0:01:12 faster than runner 2"
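The repeated split-and-convert step could be pulled into a small helper, and the percentage asked about in the question falls out of dividing two timedeltas. A sketch using only the standard library; parse_duration is a made-up helper name, and the percentage is assumed to mean time saved relative to the slower finish time:

```python
from datetime import timedelta


def parse_duration(text):
    """Turn an 'H:MM:SS' string into a timedelta (hypothetical helper)."""
    hours, minutes, seconds = (int(part) for part in text.split(':'))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


# Same two sample strings as in the code above.
duration_1 = parse_duration('1:23:16')
duration_2 = parse_duration('1:24:28')

# timedelta / timedelta gives a float in Python 3.
pct = (duration_2 - duration_1) / duration_2 * 100
print('{:.2f}% less time'.format(pct))
```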
Answer 3: (score: 0)
This is how I ended up coding it:
#not sure if I needed to do this
race['FinishTime']=race.FinishTime.str.rjust(8, fillchar='0')
#split out each component of the string
race['hour'] = race.FinishTime.str.split(':').str[0]
race['mins'] = race.FinishTime.str.split(':').str[1]
race['secs'] = race.FinishTime.str.split(':', n=2).str[-1]
#concat each component into a new column
race['FTime'] = race['hour'] + ':' + race['mins'] + ':' + race['secs']
#convert the string to timedelta
race.FTime = pd.to_timedelta(race.FTime)
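On the "not sure if I needed to do this" comment: for strings shaped like the ones in the question, the padding and the split/concat steps are probably not needed, since to_timedelta seems to accept 'H:MM:SS' directly (the earlier answer relies on that). A quick check, assuming a hand-built FinishTime column with the question's three values:

```python
import pandas as pd

# Hypothetical column with the three finish times from the question.
race = pd.DataFrame({'FinishTime': ['1:21:40', '1:23:16', '1:24:28']})

# Direct conversion, no padding or splitting.
direct = pd.to_timedelta(race['FinishTime'])

# The zero-padded variant from the code above, for comparison.
padded = pd.to_timedelta(race['FinishTime'].str.rjust(8, fillchar='0'))

# True when both routes produce identical timedelta values.
print(direct.equals(padded))
```

If the real file contains other shapes (for example 'MM:SS' with no hour part), the padding or splitting may still be doing useful work, so it is worth running this check on the actual data.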