Sum of time-format values in a pandas DataFrame

Time: 2019-09-26 00:36:00

Tags: python pandas

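I tried to use a pandas Series to sum several values in 'hh:mm:ss' format, whose data type is datetime.time / object, but it raises an error. Please advise on the best way to do this.

The code is as follows: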

    import pandas as pd

    school_diesel = pd.read_excel(r'*********************** Diesel Log 23-09-2019 17_59_05.xlsx', heading=[1,2])
    school_running = pd.read_excel(r'*************Daily Log 23-09-2019 18_09_41.xlsx', 0)
    school_diesel.columns = school_diesel.iloc[0]    # replace headings with next row values
    school_running.columns = school_running.iloc[0]  # replace headings with next row values
    school_running.columns
    school_diesel.columns
    school_diesel.drop(school_diesel.head(1).index, inplace=True)    # drop first row of the table, as it repeats the heading
    school_running.drop(school_running.head(1).index, inplace=True)  # drop first row of the table, as it repeats the heading

The data type of each field is:

Input:

    school_running.info()

Output:

    <class 'pandas.core.frame.DataFrame'>
    Int64Index: 12469 entries, 1 to 12469
    Data columns (total 25 columns):
    Sno                     12468 non-null object
    City                    12467 non-null object
    Zone                    12467 non-null object
    Branch                  12467 non-null object
    Building Code           12383 non-null object
     Branch Type            12305 non-null object
    AC or Non AC            11405 non-null object
     Student Strength       12467 non-null object
    Company Name            12381 non-null object
    Gen SNo                 12467 non-null object
    Capacity KVA            12381 non-null object
    Fuel Capacity           12467 non-null object
    Last Diesel Purchase    12467 non-null object
    Purchase Qty            12467 non-null object
    Amount                  12467 non-null object
    Last Fuel Filled        12467 non-null object
    Filling Qty             12467 non-null object
    Diesel Opening Qty      12467 non-null object
    Generator On Date       12467 non-null object
    Generator Off Date      12467 non-null object
    Running Hours           12466 non-null object
    Consumed Units          12466 non-null object
    Diesel Consumed         12466 non-null object
    Diesel Balance Qty      12466 non-null object
    Remarks                 6267 non-null object
    dtypes: object(25)
    memory usage: 1.3+ MB

The error occurs on this line:

    school_running['Running Hours'].sum()

The error is:

    ----> 1 school_running['Running Hours'].sum()
    TypeError: unsupported operand type(s) for +: 'datetime.time' and 'datetime.time'
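The error comes from Python itself rather than from pandas: `datetime.time` represents a time of day, not a duration, so it does not define `+`. A minimal sketch reproducing this with plain standard-library objects (the two values are illustrative, taken from the sample rows in the question):

```python
from datetime import time, timedelta

a = time(0, 25, 0)
b = time(1, 20, 0)

# Adding two time-of-day objects is not defined, so this raises TypeError.
try:
    a + b
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Durations are modelled by timedelta, which does support addition.
total = timedelta(minutes=25) + timedelta(hours=1, minutes=20)
print(total)  # 1:45:00
```

This is why `Series.sum()` fails on an object column of `datetime.time` values: it ends up adding the elements pairwise.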

The expected output is the total of all running hours.

**The time data is:**

    school_running['Running Hours'].head(10)

    1     00:00:00
    2     00:00:00
    3     00:25:00
    4     00:00:00
    5     00:00:00
    6     00:00:00
    7     00:00:00
    8     00:00:00
    9     00:00:00
    10    01:20:00
    Name: Running Hours, dtype: object

1 Answer:

Answer 0 (score: 0)

You have to convert them to Timedeltas. The following gives you the total number of seconds:

    # str() of a datetime.time is already 'HH:MM:SS', which pd.to_timedelta can parse
    df['Running Hours'] = df['Running Hours'].astype(str)

    pd.to_timedelta(df['Running Hours']).sum().total_seconds()
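As a self-contained sketch of this conversion, the following rebuilds the ten sample values from the question as `datetime.time` objects, converts them to Timedeltas and sums them. The small `running` Series and the variable names are illustrative stand-ins for `school_running['Running Hours']`, not part of the original code.

```python
from datetime import time

import pandas as pd

# Stand-in for school_running['Running Hours']: an object column of datetime.time values
running = pd.Series(
    [time(0, 0)] * 2 + [time(0, 25)] + [time(0, 0)] * 6 + [time(1, 20)],
    index=range(1, 11),
    name="Running Hours",
)

# datetime.time -> 'HH:MM:SS' string -> Timedelta
durations = pd.to_timedelta(running.astype(str))

total = durations.sum()                # a single pandas Timedelta
total_seconds = total.total_seconds()  # float number of seconds
total_hours = total_seconds / 3600     # convenient once the sum exceeds 24 hours

print(total)
print(total_seconds)
print(total_hours)
```

Dividing the seconds by 3600 is useful on the full data, because a Timedelta larger than 24 hours prints as days plus a remainder rather than as a plain hour count.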