Question

I have a dataframe called StaffHours_df that looks something like this:


Name          Hours                  Description
Maria         5 hours 10 minutes     Volunteer
Taylor        2 hours 4 minutes      Employee
Ben           4hrs 30mins            Employee
Gary          8 hours 40 mins        Volunteer

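I want to extract the hours and minutes to get the total time worked by all staff, but only for those classified as "Employee", not volunteers. I'd like this total as a value separate from the dataframe. For example, the table above should give timeWorked = [6, 34] or minutesWorked = 394 or similar. I have to account for differences in the format staff use to enter their hours, but I don't think that will be a problem if I use .isdigit.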

This is as far as I've got, although I'm still learning to code:

StaffHours_df[StaffHours_df['Description'].str.contains('Employee')]

s= [int(s) for s in str.split() if s.isdigit()]

Answer 1

This should give you what you need:

import re

# df is the question's StaffHours_df; .copy() avoids a SettingWithCopyWarning on the assignment below
df_emp = df[df['Description'] == 'Employee'].copy() # filter for employees
df_emp['total_minutes'] = (df_emp['Hours']
                          .map(lambda x: [int(i) for i in re.findall("[0-9]+", x)]) # get list of integers
                          .map(lambda x: 60 * x[0] + x[1]) # convert to minutes
                          )

print(df_emp.to_string())

     Name              Hours Description  total_minutes
1  Taylor  2 hours 4 minutes    Employee            124
2     Ben        4hrs 30mins    Employee            270
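
The code above stops at a per-row total_minutes column, while the question asks for one total kept separate from the dataframe. One way to get there is to sum the column and split the result with divmod. This is a minimal sketch, assuming every entry lists hours first and minutes second; the sample frame is rebuilt from the question's table, and to_minutes is a hypothetical helper name, not part of pandas.

```python
import re

import pandas as pd

# Sample data reconstructed from the table in the question.
StaffHours_df = pd.DataFrame({
    "Name": ["Maria", "Taylor", "Ben", "Gary"],
    "Hours": ["5 hours 10 minutes", "2 hours 4 minutes",
              "4hrs 30mins", "8 hours 40 mins"],
    "Description": ["Volunteer", "Employee", "Employee", "Volunteer"],
})


def to_minutes(text):
    """Hypothetical helper: convert e.g. '4hrs 30mins' to minutes.

    Assumes the first number is hours and the second is minutes.
    """
    hours, minutes = (int(n) for n in re.findall(r"[0-9]+", text)[:2])
    return 60 * hours + minutes


# Keep only employees, convert each entry, then add everything up.
employees = StaffHours_df[StaffHours_df["Description"] == "Employee"]
minutesWorked = int(employees["Hours"].map(to_minutes).sum())

# divmod splits the total into whole hours and leftover minutes.
timeWorked = list(divmod(minutesWorked, 60))

print(minutesWorked)  # 394
print(timeWorked)     # [6, 34]
```

An entry with only one number (for example "3 hours") would raise a ValueError in this sketch, so real data may need an extra check before unpacking.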
