I need to create a shift column based on an existing time column.
For example, I have a DataFrame df1 with the following data:
time
0 10:30
1 13:50
2 19:20
3 14:10
I need a DataFrame like the one shown below:
time shift
0 10:30 1
1 13:50 2
2 19:20 2
3 23:10 3
Answer 0 (score: 1)
The following uses a shifts dictionary to help determine which shift a given time belongs to:
import pandas as pd

df = pd.DataFrame({'time': ['00:00','08:29', '08:30', '08:31', '12:29', '12:30', '12:31', '20:29', '20:30', '20:31', '23:59', '10:30', '13:50', '19:20', '14:10', '23:10']})
# Convert the time column into datetime.time objects
df.time = pd.to_datetime(df.time).dt.time
# Set up a shifts dictionary
shifts = {('8:30', '12:30'): 1, ('12:30', '20:30'): 2, ('20:30', '8:30'): 3}
# Convert the keys to datetime objects
shifts = {tuple(map(pd.to_datetime, k)): v for k, v in shifts.items()}
# Push the end one day forward when an interval wraps past midnight (end is not later than start)
shifts = {(k if k[0].time() < k[1].time() else (k[0], k[1] + pd.Timedelta(days=1))): v for k, v in shifts.items()}

# Determine shift
def get_shift(time):
    try:
        # inclusive='left' needs pandas >= 1.4; older versions spell it closed='left'
        return shifts.get([k for k in shifts if time in pd.date_range(*k, freq='min', inclusive='left').time][0])
    except IndexError:
        # No interval contained this time
        return 'No Shift'

# Use .apply on the time column to get the shift column
df['shift'] = df.time.apply(get_shift)
print(df)
Output:
# time shift
# 0 00:00:00 3
# 1 08:29:00 3
# 2 08:30:00 1
# 3 08:31:00 1
# 4 12:29:00 1
# 5 12:30:00 2
# 6 12:31:00 2
# 7 20:29:00 2
# 8 20:30:00 3
# 9 20:31:00 3
# 10 23:59:00 3
# 11 10:30:00 1
# 12 13:50:00 2
# 13 19:20:00 2
# 14 14:10:00 2
# 15 23:10:00 3
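A lighter-weight variant of the same lookup, offered only as a sketch: instead of building a minute-by-minute date_range for every row, it compares datetime.time values directly and treats an interval whose start is later than its end as wrapping past midnight. The names SHIFTS and shift_for are hypothetical, and the half-open [start, end) convention is assumed so that the results match the output above.

```python
import datetime

# Hypothetical shift table: (start, end) -> shift number, half-open [start, end).
SHIFTS = {
    (datetime.time(8, 30), datetime.time(12, 30)): 1,
    (datetime.time(12, 30), datetime.time(20, 30)): 2,
    (datetime.time(20, 30), datetime.time(8, 30)): 3,  # wraps past midnight
}

def shift_for(t):
    """Return the shift whose [start, end) interval contains t, else 'No Shift'."""
    for (start, end), shift in SHIFTS.items():
        if start < end:
            inside = start <= t < end
        else:
            # Interval crosses midnight: t is either late evening or early morning.
            inside = t >= start or t < end
        if inside:
            return shift
    return 'No Shift'
```

With the time column already converted to datetime.time objects, this could be applied as df['shift'] = df.time.apply(shift_for).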
Answer 1 (score: 0)
You can do this by creating the shift column with the apply function.
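import datetime

def check_shift(row):
    # Look the value up by column name rather than by position
    shift_time = row['time']
    # Upper bounds are inclusive here, so 12:30 maps to shift 1 and 20:30 to shift 2
    if datetime.time(8, 30) <= shift_time <= datetime.time(12, 30):
        return 1
    elif datetime.time(12, 30) < shift_time <= datetime.time(20, 30):
        return 2
    else:
        return 3

df['shift'] = df.apply(check_shift, axis='columns')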
This produces the following DataFrame:
time shift
0 10:30:00 1
1 13:50:00 2
2 19:20:00 2
3 14:10:00 2
If we change the last time to 23:10 (as in your example output), we get the following result:
time shift
0 10:30:00 1
1 13:50:00 2
2 19:20:00 2
3 23:10:00 3
One important note: I converted the time column from strings to actual time objects:
df['time'] = pd.to_datetime(df['time'], format="%H:%M").dt.time
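To illustrate why that conversion matters, here is a minimal standard-library sketch. It assumes the column initially holds strings such as "10:30"; the variable names are made up for the example.

```python
import datetime

raw = "10:30"  # what the column holds before conversion

# Comparing a str against a datetime.time is not supported and raises TypeError.
try:
    datetime.time(8, 30) <= raw
    comparable = True
except TypeError:
    comparable = False

# After parsing, the comparison used in check_shift works as expected.
parsed = datetime.datetime.strptime(raw, "%H:%M").time()
in_first_shift = datetime.time(8, 30) <= parsed <= datetime.time(12, 30)
```

Without the conversion, check_shift would fail on its first comparison rather than silently returning a wrong shift.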
Answer 2 (score: 0)
Suppose we have the following DataFrame:
In [380]: df
Out[380]:
time
0 00:00
1 08:29
2 08:30
3 08:31
4 12:29
5 12:30
6 12:31
7 20:29
8 20:30
9 20:31
10 23:59
In [381]: df.dtypes
Out[381]:
time object
dtype: object
Consider this solution:
In [382]: bins = [-1, 830, 1230, 2030, 2400]
     ...: labels = [0,1,2,3]
     ...: df['shift'] = pd.cut(df.time.str.replace(':','').astype(int),
     ...:                      bins=bins, labels=labels, right=False)
     ...: df.loc[df['shift']==0, 'shift'] = 3
     ...:
In [383]: df
Out[383]:
time shift
0 00:00 3
1 08:29 3
2 08:30 1
3 08:31 1
4 12:29 1
5 12:30 2
6 12:31 2
7 20:29 2
8 20:30 3
9 20:31 3
10 23:59 3
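Explanation
time is converted to numeric values: 08:29 -> 829, 12:31 -> 1231, and so on.
Note: labels must be unique, which is why we cannot specify [3,1,2,3] directly; we use [0,1,2,3] and then replace 0 -> 3.
We have to split the 20:30 - 08:30 interval into two: 20:30 - 24:00 and 00:00 - 08:30.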
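The same binning idea can be sketched without pd.cut, using the standard-library bisect module on minutes since midnight. This is an illustrative alternative rather than part of the answer above; EDGES, LABELS, and shift_from_string are hypothetical names. Because a plain lookup list has no unique-label restriction, the 0 -> 3 replacement step is not needed here.

```python
import bisect

# Bin edges in minutes since midnight: 08:30, 12:30, 20:30.
EDGES = [8 * 60 + 30, 12 * 60 + 30, 20 * 60 + 30]
# One label per bin; the first and last bins both map to shift 3.
LABELS = [3, 1, 2, 3]

def shift_from_string(hhmm):
    """Map an 'HH:MM' string to a shift using left-closed bins."""
    hours, minutes = hhmm.split(':')
    total = int(hours) * 60 + int(minutes)
    # bisect_right counts how many edges are <= total, so a time equal to
    # an edge falls into the bin that starts at that edge.
    return LABELS[bisect.bisect_right(EDGES, total)]
```

On a DataFrame whose time column still holds strings, this could be applied as df['shift'] = df.time.apply(shift_from_string).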