Pandas DataFrame: converting integers to hh:mm

Date: 2017-08-08 07:01:53

Tags: python pandas dataframe

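I have the following df1, which shows times in hhmm form. The values represent literal clock times but are not formatted correctly. For example, 845 should be 08:45, and 1125 should be 11:25.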

CU                Parameters           31-07-2017  01-08-2017  02-08-2017  03-08-2017
CU0111-039820-L   Time of Full Charge  1125        0           1359        1112
CU0111-041796-H   Time of Full Charge  1233        0           0           1135
CU0111-046907-0   Time of Full Charge  845         0           1229        1028
CU0111-046933-6   Time of Full Charge  1053        0           0           1120
CU0111-050103-K   Time of Full Charge  932         0           1314        1108
CU0111-052525-J   Time of Full Charge  1214        1424        1307        1254
CU0111-052534-M   Time of Full Charge  944         0           0           1128
CU0111-052727-7   Time of Full Charge  1136        0           1443        1114

I need to convert all of these values to valid hh:mm timestamps and then compute the average of those timestamps, excluding the '0' values.

CU                Parameters           31-07-2017  01-08-2017  02-08-2017  03-08-2017
CU0111-039820-L   Time of Full Charge  11:25       0           13:59       11:12
CU0111-041796-H   Time of Full Charge  12:33       0           0           11:35
CU0111-046907-0   Time of Full Charge  08:45       0           12:29       10:28
CU0111-046933-6   Time of Full Charge  10:53       0           0           11:20
CU0111-050103-K   Time of Full Charge  09:32       0           13:14       11:08
CU0111-052525-J   Time of Full Charge  12:14       14:24       13:07       12:54
CU0111-052534-M   Time of Full Charge  09:44       0           0           11:28
CU0111-052727-7   Time of Full Charge  11:36       0           14:43       11:14

Final result:

Average time of charge:  hh:mm (excluding 0 values)

Number of no charges:   =count(number of 0)

I tried something along these lines, but to no avail:

text = df1[col_list].astype(str)
df1[col_list] = text.str[:-2] + ':' + text.str[-2:]
hhmm = df1[col_list]
minutes = (hhmm / 100).astype(int) * 60 + hhmm % 100
df[col_list] = pd.to_timedelta(minutes, 'm')
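One likely reason the attempt above fails is that `.str` is a Series accessor, so it is not available on a multi-column selection such as `df1[col_list]`, and the arithmetic then runs on columns that have already been overwritten with strings. The following is a minimal sketch of the arithmetic route applied directly to the integer columns. The two-row frame is a made-up sample with the same layout as df1, and the name `converted` is introduced here for illustration only.

```python
import pandas as pd

# Hypothetical two-row sample laid out like df1 (not the full data set).
df1 = pd.DataFrame({
    "CU": ["CU0111-039820-L", "CU0111-046907-0"],
    "31-07-2017": [1125, 845],
    "01-08-2017": [0, 0],
})
col_list = ["31-07-2017", "01-08-2017"]

hhmm = df1[col_list]                        # still integers at this point
minutes = (hhmm // 100) * 60 + hhmm % 100   # 845 -> 8 * 60 + 45 = 525

# pd.to_timedelta expects 1-D input, so convert one column at a time.
converted = minutes.apply(lambda s: pd.to_timedelta(s, unit="min"))
```

Keeping the values numeric until the final conversion avoids the string slicing entirely, and a 0 simply becomes a zero-length timedelta.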

1 Answer:

Answer 0: (score: 2)

I think you can first convert all of the values with to_timedelta:

cols = df.columns.difference(['CU','Parameters'])

# The outer parentheses let the method chain span several lines.
df[cols] = (df[cols].replace(0, '0000')
                    .astype(str)
                    .apply(lambda x: pd.to_timedelta(x.str[:-2] + ':' + x.str[-2:] + ':00')))
print (df)
                CU           Parameters 31-07-2017 01-08-2017 02-08-2017  \
0  CU0111-039820-L  Time of Full Charge   11:25:00   00:00:00   13:59:00   
1  CU0111-041796-H  Time of Full Charge   12:33:00   00:00:00   00:00:00   
2  CU0111-046907-0  Time of Full Charge   08:45:00   00:00:00   12:29:00   
3  CU0111-046933-6  Time of Full Charge   10:53:00   00:00:00   00:00:00   
4  CU0111-050103-K  Time of Full Charge   09:32:00   00:00:00   13:14:00   
5  CU0111-052525-J  Time of Full Charge   12:14:00   14:24:00   13:07:00   
6  CU0111-052534-M  Time of Full Charge   09:44:00   00:00:00   00:00:00   
7  CU0111-052727-7  Time of Full Charge   11:36:00   00:00:00   14:43:00   

  03-08-2017  
0   11:12:00  
1   11:35:00  
2   10:28:00  
3   11:20:00  
4   11:08:00  
5   12:54:00  
6   11:28:00  
7   11:14:00  
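The converted columns print as hh:mm:ss, while the question asks for hh:mm. If the shorter form is only needed for display, one option is to format each timedelta by hand. This is a sketch under that assumption; `to_hhmm` is a hypothetical helper written for this example, not a pandas function.

```python
import pandas as pd

def to_hhmm(td):
    """Format a Timedelta as 'hh:mm' (hypothetical helper, not a pandas API)."""
    if pd.isna(td):
        return ""
    total_minutes = int(td.total_seconds() // 60)
    return "{:02d}:{:02d}".format(total_minutes // 60, total_minutes % 60)

# Small made-up sample of already-converted values.
s = pd.to_timedelta(pd.Series(["08:45:00", "11:25:00", "00:00:00"]))
labels = s.map(to_hhmm)
print(labels.tolist())
```

The result is a column of plain strings, so it is best applied as a last step, after any averaging has been done on the timedelta values.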

Then create a new column holding the mean of the non-zero timedeltas, and count the zeros by summing the True values:

zero = pd.Timedelta(0)  # compare against a Timedelta rather than the integer 0
df['avg'] = df[cols][df[cols].ne(zero)].mean(axis=1)
df['number no changes'] = df[cols].eq(zero).sum(axis=1)
print (df)
                CU           Parameters 31-07-2017 01-08-2017 02-08-2017  \
0  CU0111-039820-L  Time of Full Charge   11:25:00   00:00:00   13:59:00   
1  CU0111-041796-H  Time of Full Charge   12:33:00   00:00:00   00:00:00   
2  CU0111-046907-0  Time of Full Charge   08:45:00   00:00:00   12:29:00   
3  CU0111-046933-6  Time of Full Charge   10:53:00   00:00:00   00:00:00   
4  CU0111-050103-K  Time of Full Charge   09:32:00   00:00:00   13:14:00   
5  CU0111-052525-J  Time of Full Charge   12:14:00   14:24:00   13:07:00   
6  CU0111-052534-M  Time of Full Charge   09:44:00   00:00:00   00:00:00   
7  CU0111-052727-7  Time of Full Charge   11:36:00   00:00:00   14:43:00   

  03-08-2017      avg  number no changes  
0   11:12:00 12:12:00                  1  
1   11:35:00 12:04:00                  2  
2   10:28:00 10:34:00                  1  
3   11:20:00 11:06:30                  2  
4   11:08:00 11:18:00                  1  
5   12:54:00 13:09:45                  0  
6   11:28:00 10:36:00                  2  
7   11:14:00 12:31:00                  1  
print (df[cols][df[cols].ne(pd.Timedelta(0))])
  01-08-2017 02-08-2017 03-08-2017 31-07-2017
0        NaT   13:59:00   11:12:00   11:25:00
1        NaT        NaT   11:35:00   12:33:00
2        NaT   12:29:00   10:28:00   08:45:00
3        NaT        NaT   11:20:00   10:53:00
4        NaT   13:14:00   11:08:00   09:32:00
5   14:24:00   13:07:00   12:54:00   12:14:00
6        NaT        NaT   11:28:00   09:44:00
7        NaT   14:43:00   11:14:00   11:36:00
print (df[cols].eq(pd.Timedelta(0)))
   01-08-2017  02-08-2017  03-08-2017  31-07-2017
0        True       False       False       False
1        True        True       False       False
2        True       False       False       False
3        True        True       False       False
4        True       False       False       False
5       False       False       False       False
6        True        True       False       False
7        True       False       False       False
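The columns above give one average and one zero count per row. If a single overall figure is wanted instead, as the "Final result" in the question suggests, the same masking idea can be applied to all values at once. This is a sketch on a made-up two-row frame; the names `values`, `no_charges` and `average` are introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample of already-converted timedelta columns.
df = pd.DataFrame({
    "31-07-2017": pd.to_timedelta(["11:25:00", "08:45:00"]),
    "01-08-2017": pd.to_timedelta(["00:00:00", "00:00:00"]),
    "02-08-2017": pd.to_timedelta(["13:59:00", "12:29:00"]),
})
cols = df.columns

zero = pd.Timedelta(0)

# Flatten every date column into one long Series of timedeltas.
values = pd.concat([df[c] for c in cols], ignore_index=True)

no_charges = int((values == zero).sum())   # how many entries are zero
average = values[values != zero].mean()    # mean over non-zero entries only

print("Average time of charge:", average)
print("Number of no charges:", no_charges)
```

Filtering the zeros out before calling `mean` matters here: leaving them in would pull the average towards midnight.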