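My main goal is to find the total time spent on each job. I tried subtracting one time column from the other, but got this error: <class 'datetime.time'> is not convertible to datetime.
I ran .info() and saw that the time columns have dtype object. In the Excel file the cells are formatted as time only, not as date-time. I then tried converting the first time column to datetime, with the following result: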
import pandas as pd
df = pd.read_excel('C:/users/paul/desktop/data project/July.xls', index_col=0)
hrs_st = (pd.to_datetime(df['AST'].str.strip(), format='%H:%M:%S'))
print(hrs_st)
Work Order
BAEBRO-906063 NaT
BAEBRO-906191 NaT
BAEBRO-906207 NaT
BAEBRO-906079 NaT
BAEBRO-906095 NaT
BAEBRO-906159 NaT
...
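A minimal sketch of the situation described above, assuming the time-only Excel cells arrive as `datetime.time` objects (which is what the error message suggests). The DataFrame below is a made-up stand-in for the spreadsheet; only the column names `AST` and `AFT` come from the question.

```python
from datetime import time

import pandas as pd

# Hypothetical stand-in for the Excel data; the values are invented.
demo = pd.DataFrame(
    {
        "AST": [time(8, 0, 0), time(9, 15, 30)],
        "AFT": [time(9, 30, 15), time(10, 45, 45)],
    }
)

# A column holding datetime.time objects is stored with dtype object.
print(demo.dtypes)

# datetime.time does not define subtraction, so this raises TypeError.
try:
    time(9, 30, 15) - time(8, 0, 0)
    subtraction_failed = False
except TypeError as exc:
    subtraction_failed = True
    print("TypeError:", exc)
```

This is why the columns need to be turned into a type that supports arithmetic (datetime64 or timedelta64) before they can be subtracted.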
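Answer 0 (score: 0)
To be honest, this is a bit confusing. Could you explain the main goal in more detail and say more about how the times appear in the Excel file?
Second edit: I have tried to comment the code I wrote.
I put together a similar example just to work out how to help you.
This is what my Excel file looks like:
Here is some code that reads the file and computes the differences in a very simple way: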
import pandas as pd
df = pd.read_excel('dates.xlsx')  # read the Excel file
timeStart = []  # two lists to hold the start and end times
timeEnd = []
# Copy the values from the Excel columns into the lists
for value in df.get('col1'):
    timeStart.append(value)
for value in df.get('col2'):
    timeEnd.append(value)
# Both lists are assumed to have the same number of elements,
# so iterating over the length of either one is fine
for i in range(len(timeStart)):
    # datetime.time objects do not support the '-' operator, so the
    # difference is computed per component (hours, minutes, seconds).
    # Note: a component can come out negative when the end value is the
    # smaller one (e.g. 09:50 to 10:45 gives 1 hour and -5 minutes).
    hours = timeEnd[i].hour - timeStart[i].hour  # hours difference
    minutes = timeEnd[i].minute - timeStart[i].minute  # minutes difference
    seconds = timeEnd[i].second - timeStart[i].second  # seconds difference
    print(type(hours), type(minutes), type(seconds))  # all results are int
    print(hours, minutes, seconds)  # the difference between the two times
This is my output:
<class 'int'> <class 'int'> <class 'int'> #Here you can see I have 3 int types
1 30 15 #read as 1 hour 30 minutes and 15 seconds
<class 'int'> <class 'int'> <class 'int'>
1 30 15
<class 'int'> <class 'int'> <class 'int'>
1 30 15
<class 'int'> <class 'int'> <class 'int'>
1 30 15
<class 'int'> <class 'int'> <class 'int'>
1 30 15
<class 'int'> <class 'int'> <class 'int'>
1 30 15
<class 'int'> <class 'int'> <class 'int'>
1 30 15
<class 'int'> <class 'int'> <class 'int'>
1 30 15
<class 'int'> <class 'int'> <class 'int'>
1 30 15
[Finished in 0.5s]
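The component-wise subtraction above can give negative minutes or seconds when a borrow is needed. One possible alternative, sketched here with the standard library only, is to attach both times to the same arbitrary date and subtract the resulting `datetime` objects. The helper name `elapsed` and the anchor date are assumptions made for illustration, and the sketch assumes both times fall on the same day.

```python
from datetime import date, datetime, time, timedelta


def elapsed(start: time, end: time) -> timedelta:
    """Return end - start, assuming both times are on the same day."""
    anchor = date(1900, 1, 1)  # arbitrary date; it cancels out in the subtraction
    return datetime.combine(anchor, end) - datetime.combine(anchor, start)


# Same interval as in the output above: 1 hour, 30 minutes, 15 seconds.
print(elapsed(time(8, 0, 0), time(9, 30, 15)))

# A case where per-component subtraction would give "1 hour, -5 minutes".
print(elapsed(time(9, 50, 0), time(10, 45, 0)))
```

The result is a single `timedelta`, so durations can be added together directly instead of tracking three separate integers.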
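Answer 1 (score: 0)
I came up with a better solution that fits my original problem, which was to calculate the total time needed to complete a work order. It gets around the Excel time values being read as object dtype: once both columns are converted to datetime64, one can be subtracted directly from the other.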
import pandas as pd
df = pd.read_excel('C:/Users/Nativ_Zero/Desktop/work data/July.xls', index_col=0)
# .copy() avoids pandas' SettingWithCopyWarning on the assignments below
df_work = df[['WorkType', 'AST', 'AFT']].copy()
# convert the time columns, which are read as object, to datetime64
df_work['AFT'] = pd.to_datetime(df_work['AFT'], format='%H:%M:%S', errors='coerce')
df_work['AST'] = pd.to_datetime(df_work['AST'], format='%H:%M:%S', errors='coerce')
rm_work = df_work[df_work.WorkType == 'RM']
hrs_ft = rm_work['AFT']
hrs_st = rm_work['AST']
hrs_t = hrs_ft - hrs_st  # finish minus start: one timedelta per work order
print(hrs_t)
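The error message in the question suggests that `pd.to_datetime` may reject `datetime.time` objects, in which case `errors='coerce'` would turn them into NaT. A possible variant, sketched below, converts each time to its `HH:MM:SS` string first and parses that with `pd.to_timedelta`. The sample rows, the `WO-…` index labels, and the `duration` column name are invented for illustration; only `WorkType`, `AST`, and `AFT` come from the code above.

```python
from datetime import time

import pandas as pd

# Hypothetical sample data standing in for July.xls.
df = pd.DataFrame(
    {
        "WorkType": ["RM", "PM", "RM"],
        "AST": [time(8, 0, 0), time(9, 0, 0), time(13, 15, 0)],
        "AFT": [time(9, 30, 15), time(9, 45, 0), time(14, 0, 0)],
    },
    index=pd.Index(["WO-1", "WO-2", "WO-3"], name="Work Order"),
)

# str(datetime.time) yields "HH:MM:SS", which pd.to_timedelta can parse.
start = pd.to_timedelta(df["AST"].astype(str))
finish = pd.to_timedelta(df["AFT"].astype(str))

# Time spent on each work order, as timedelta64 values.
df["duration"] = finish - start

# Total time for one work type, and totals for every work type.
total_rm = df.loc[df["WorkType"] == "RM", "duration"].sum()
totals_by_type = df.groupby("WorkType")["duration"].sum()

print(df["duration"])
print(total_rm)
print(totals_by_type)
```

Working in timedeltas rather than datetimes avoids attaching a placeholder date to each value, and the `sum()` calls give the total time directly.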