How to determine the meals of the day from a column of timings in Python

Time: 2019-09-30 18:54:42

Tags: python pandas dataframe data-cleaning

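I want to derive the meals ("breakfast", "lunch", "dinner") from a column of opening times (for example 9:30am to 12noon, or 3pm to 12midnight). Here is a sample of the dataframe column: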

0                                10am – 1am
1                           12noon – 1:30am 
2                              9:30am – 1am
3         12noon – 3:30pm, 7pm – 12midnight
4        11am – 3:30pm, 6:30pm – 12midnight
                       ...                 
170                           11:40am – 4am
171                            7pm – 1:30am
172                            12noon – 1am
173                            6pm – 3:30am
174                              9am – 10pm

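I want to replace each timing with the meal(s) it covers. For example, if the timing is 11am – 3:30pm, replace it with ["breakfast", "lunch"].
If it is 9am – 10pm, replace it with ["breakfast", "lunch", "dinner"], and so on.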

1 Answer:

Answer 0: (score: 1)

My solution:

import re

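def parse_time(t):
    """Return (hours, minutes) on a 24-hour clock for strings like '9:30am' or '12noon'."""
    t = t.strip()
    # Taking the hour modulo 12 maps 12am/12midnight to 0 and lets 12noon/12pm become 12 below
    hours = int(re.findall(r'^[0-9]+', t)[0]) % 12
    m = re.findall(r':([0-9]+)', t)
    if len(m) > 0:
        minutes = int(m[0])
    else:
        minutes = 0
    afternoon = re.search(r'pm|noon', t)
    if afternoon:
        hours += 12
    return (hours, minutes)

def get_parts(s):
    # Split one range such as "11am – 3:30pm" on an en dash or a hyphen
    x = re.split('–|-', s)
    start, end = x[0].strip(), x[1].strip()
    start_hours, start_minutes = parse_time(start)
    end_hours, end_minutes = parse_time(end)
    if (end_hours, end_minutes) <= (start_hours, start_minutes):
        end_hours += 24  # the range runs past midnight, e.g. "7pm – 1:30am"
    parts = []
    if start_hours < 11: # or whenever you think breakfast ends
        parts.append("breakfast")
    # lunch if the range overlaps the 12:00 to 15:00 window
    if start_hours < 15 and (end_hours, end_minutes) > (12, 0):
        parts.append("lunch")
    if end_hours > 17:
        parts.append("dinner")
    return parts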

def get_all_parts(data):
    x = [set(get_parts(s)) for s in data.split(",")]
    return set.union(*x)

print(get_all_parts("10am-3:30pm"))
print(get_all_parts("11am - 3:30pm, 6:30pm - 12midnight"))
print(get_all_parts("10am - 11am, 5pm-7pm"))
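
If you would rather keep the meal boundaries in one place, here is a rough sketch of the same idea as an interval-overlap check in minutes since midnight. The window boundaries in MEAL_WINDOWS and the names to_minutes and meals_for are my own assumptions, not anything from the question; I picked the windows so that the two examples in the question come out as requested, so adjust them to your own definition of each meal. Ranges that end at or before their start are treated as running past midnight.

```python
import re

# Assumed meal windows as (start, end) in minutes since midnight; adjust to taste.
MEAL_WINDOWS = {
    "breakfast": (6 * 60, 12 * 60),
    "lunch": (12 * 60, 15 * 60),
    "dinner": (18 * 60, 23 * 60),
}

# Matches times such as "9:30am", "7pm", "12noon" and "12midnight".
TIME_RE = re.compile(r"(\d{1,2})(?::(\d{2}))?\s*(am|pm|noon|midnight)", re.IGNORECASE)


def to_minutes(text):
    """Convert a single time string to minutes since midnight."""
    match = TIME_RE.search(text)
    if match is None:
        raise ValueError(f"unrecognised time: {text!r}")
    hours = int(match.group(1)) % 12  # 12am and 12midnight become 0
    minutes = int(match.group(2) or 0)
    if match.group(3).lower() in ("pm", "noon"):
        hours += 12
    return hours * 60 + minutes


def meals_for(timing):
    """Return the meals covered by a cell such as '12noon – 3:30pm, 7pm – 12midnight'."""
    found = set()
    for span in timing.split(","):
        # Split on an en dash or a hyphen, whichever the cell uses.
        start_text, end_text = re.split(r"\s*[–-]\s*", span.strip(), maxsplit=1)
        start, end = to_minutes(start_text), to_minutes(end_text)
        if end <= start:
            end += 24 * 60  # the range runs past midnight
        for meal, (lo, hi) in MEAL_WINDOWS.items():
            if start < hi and end > lo:  # the two intervals overlap
                found.add(meal)
    # Return the meals in the order they are declared in MEAL_WINDOWS.
    return [meal for meal in MEAL_WINDOWS if meal in found]


print(meals_for("11am – 3:30pm"))
print(meals_for("9am – 10pm"))
print(meals_for("12noon – 3:30pm, 7pm – 12midnight"))
```

To apply it to the dataframe column, something like df["timings"].apply(meals_for) should work, where "timings" is only a placeholder for your actual column name. Note that this sketch does not look for meals on the following day, so a range ending after 6am the next morning would not pick up a second breakfast.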