Calculating time in pandas by extracting values

Time: 2018-03-11 05:46:57

Tags: python pandas

I have a set of data:

Time1  Time2
XY40M  XY35M
XY5H   XY45M
XY30M  XY20M
XY1H   XY2H
XY1H30M   XY2H

I need to calculate the total time in minutes:

Time1+Time2
75
345
50
180
210

How can I get this result?

1 answer:

Answer 0 (score: 2):

Use str.extract together with numpy.where:

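import numpy as np

# Extract the first number+unit token (e.g. '40M' or '5H') from each value
a = df['Time1'].str.extract(r'(\d+[MH])', expand=False)
a1 = a.str[:-1].astype(int)
b = df['Time2'].str.extract(r'(\d+[MH])', expand=False)
b1 = b.str[:-1].astype(int)

# Convert hours to minutes, leave minutes unchanged, then add both columns
df['Time'] = np.where(a.str[-1] == 'H', a1 * 60, a1) + np.where(b.str[-1] == 'H', b1 * 60, b1)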
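To make the snippet above reproducible, here is a minimal sketch that builds the five sample rows from the question (the DataFrame construction is an assumption about how the data is loaded) and applies the same str.extract plus numpy.where steps. Because the pattern captures only the first number+unit token, a value such as XY1H30M contributes 60 minutes rather than 90, so the last row comes out lower than the 210 the question expects.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; building it this way is an assumption.
df = pd.DataFrame({
    "Time1": ["XY40M", "XY5H", "XY30M", "XY1H", "XY1H30M"],
    "Time2": ["XY35M", "XY45M", "XY20M", "XY2H", "XY2H"],
})

# First number+unit token per value, e.g. '40M', '5H', '1H'
a = df["Time1"].str.extract(r"(\d+[MH])", expand=False)
b = df["Time2"].str.extract(r"(\d+[MH])", expand=False)

# Numeric part without the trailing unit letter
a1 = a.str[:-1].astype(int)
b1 = b.str[:-1].astype(int)

# Hours become minutes, minutes stay as they are
df["Time"] = (np.where(a.str[-1] == "H", a1 * 60, a1)
              + np.where(b.str[-1] == "H", b1 * 60, b1))

print(df)
```

The first four rows match the expected totals; only the combined hours-and-minutes value is undercounted.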

Another solution:

# Capture the number and the unit as separate columns (0 and 1)
a = df['Time1'].str.extract(r'(\d+)([MH])', expand=True)
a1 = a[0].astype(int)
b = df['Time2'].str.extract(r'(\d+)([MH])', expand=True)
b1 = b[0].astype(int)

df['Time'] = np.where(a[1] == 'H', a1 * 60, a1) + np.where(b[1] == 'H', b1 * 60, b1)
print(df)
   Time1  Time2  Time
0  XY40M  XY35M    75
1   XY5H  XY45M   345
2  XY30M  XY20M    50
3   XY1H   XY2H   180
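The two solutions differ mainly in what str.extract returns. As a small illustration (the Series `s` is a made-up sample, not part of the original answer), a single capture group with expand=False yields a Series, while two capture groups yield a DataFrame whose columns are labelled 0 and 1:

```python
import pandas as pd

# Hypothetical sample values in the same format as the question's data
s = pd.Series(["XY40M", "XY5H", "XY1H30M"])

# One group, expand=False: a Series holding number and unit together
one_group = s.str.extract(r"(\d+[MH])", expand=False)

# Two groups: a DataFrame, column 0 is the number, column 1 is the unit
two_groups = s.str.extract(r"(\d+)([MH])", expand=True)

print(type(one_group).__name__)
print(two_groups)
```

Splitting number and unit into separate columns avoids the string slicing (`.str[:-1]`, `.str[-1]`) used in the first solution.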

Edit:

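# Also capture an optional trailing minutes part, e.g. the '30M' in 'XY1H30M'
a = df['Time1'].str.extract(r'(\d+)([MH])(\d*)([M]*)', expand=True)
# Empty matches become 0 so the columns can be cast to int
a1 = a[[0,2]].replace('', 0).astype(int)
b = df['Time2'].str.extract(r'(\d+)([MH])(\d*)([M]*)', expand=True)
b1 = b[[0,2]].replace('', 0).astype(int)

# First part: hours converted to minutes (or plain minutes); then add the trailing minutes
df['Time'] = np.where(a[1] == 'H', a1[0] * 60, a1[0]) + a1[2] + \
             np.where(b[1] == 'H', b1[0] * 60, b1[0]) + b1[2]

print(df)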
     Time1  Time2  Time
0    XY40M  XY35M    75
1     XY5H  XY45M   345
2    XY30M  XY20M    50
3     XY1H   XY2H   180
4  XY1H30M   XY2H   210
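As an alternative to widening the pattern for each extra unit, one possible sketch is to collect every number+unit pair with re.findall and sum them per value. This is not the answer's method; the helper name `to_minutes` and the `UNIT_MINUTES` mapping are hypothetical, and the DataFrame construction is an assumption about how the data is loaded.

```python
import re

import pandas as pd

# Minutes per unit letter (hypothetical helper mapping)
UNIT_MINUTES = {"H": 60, "M": 1}


def to_minutes(code):
    """Sum every number+unit pair in a string such as 'XY1H30M'."""
    pairs = re.findall(r"(\d+)([HM])", code)
    return sum(int(number) * UNIT_MINUTES[unit] for number, unit in pairs)


# Sample data copied from the question
df = pd.DataFrame({
    "Time1": ["XY40M", "XY5H", "XY30M", "XY1H", "XY1H30M"],
    "Time2": ["XY35M", "XY45M", "XY20M", "XY2H", "XY2H"],
})

# Convert each column element-wise, then add the two results
df["Time"] = df["Time1"].map(to_minutes) + df["Time2"].map(to_minutes)

print(df)
```

Series.map calls the function once per row, so this is likely slower than the vectorized str.extract approach on large frames, but it handles any number of H and M parts without changing the pattern.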