\b not behaving as expected in Regex / Python

Time: 2019-02-04 03:09:30

Tags: regex python-3.x

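I am trying to use re.findall() to find all occurrences of weekday names. It works when I leave out the \b anchors, but not when I include them. This works: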

import re
any_week_day_long = "([Mm]onday|[Tt]uesday|[Ww]ednesday|[Tt]hursday|[Ff]riday|[Ss]aturday|[Ss]unday)"
match = re.findall(any_week_day_long, "Monday is a great day of the week. Tuesday is pretty good, but Wednesday has it beat.")

But this does not:

any_week_day_long = "\b([Mm]onday|[Tt]uesday|[Ww]ednesday|[Tt]hursday|[Ff]riday|[Ss]aturday|[Ss]unday)\b"
match = re.findall(any_week_day_long, "Monday is a great day of the week. Tuesday is pretty good, but Wednesday has it beat.")

As I understand it, the pattern with \b should still find Monday, Tuesday, and Wednesday, but when I print match, it is just an empty list.

2 Answers:

Answer 0: (Score: 2)

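Instead of \b, try \\b. In an ordinary (non-raw) Python string literal, \b is the backspace character, so the pattern never contains a word-boundary assertion. Writing \\b passes a literal backslash followed by b to the regex engine: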

any_week_day_long = "\\b([Mm]onday|[Tt]uesday|[Ww]ednesday|[Tt]hursday|[Ff]riday|[Ss]aturday|[Ss]unday)\\b"
match = re.findall(any_week_day_long, "Monday is a great day of the week. Tuesday is pretty good, but Wednesday has it beat.")

Output:

['Monday', 'Tuesday', 'Wednesday']
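
To see why the doubled backslash matters, here is a minimal sketch that inspects the string literals directly. The variable names (plain, escaped, raw, text) are made up for illustration and are not from the question.

```python
import re

plain = "\b"     # ordinary literal: Python turns this into one backspace character
escaped = "\\b"  # escaped backslash: two characters, a backslash and the letter b
raw = r"\b"      # raw literal: the same two characters as `escaped`

print(len(plain), repr(plain))  # one character, shown as '\x08'
print(escaped == raw)           # both spellings produce the same pattern text

text = "Monday is a great day"

# Here the regex engine looks for literal backspace characters around "Monday",
# and the text contains none, so nothing matches.
print(re.findall("\bMonday\b", text))

# Here the regex engine receives \b and treats it as a word boundary.
print(re.findall(r"\bMonday\b", text))
```

The first findall call returns an empty list and the second returns the match, which is the same behavior described in the question.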

Answer 1: (Score: 0)

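You can also get the same result with a raw string, which passes \b to the regex engine unchanged. Rather than writing character classes like [Mm], it is cleaner to use the re.IGNORECASE flag: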

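any_week_day_long = r'\b(?:mon|tues|wednes|thurs|fri|satur|sun)day\b'
# The pattern is all lowercase, so re.IGNORECASE is required to match "Monday" etc.
match = re.findall(any_week_day_long, "Monday is a great day of the week. Tuesday is pretty good, but Wednesday has it beat.", flags=re.IGNORECASE)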

Output:

['Monday', 'Tuesday', 'Wednesday']
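
As a rough sketch of how the raw string, the re.IGNORECASE flag, and the \b anchors work together, the pattern can be precompiled and tried on mixed-case input. The names WEEKDAY_RE and sample, and the sample sentence itself, are assumptions made up for this example.

```python
import re

# Raw string keeps \b intact; IGNORECASE lets the lowercase pattern match any casing.
WEEKDAY_RE = re.compile(
    r"\b(?:mon|tues|wednes|thurs|fri|satur|sun)day\b",
    flags=re.IGNORECASE,
)

sample = "MONDAY and friday are fine; Sundays are not matched."

# "Sundays" is skipped because the trailing "s" means there is no word boundary
# right after "Sunday".
found = WEEKDAY_RE.findall(sample)
print(found)
```

Because the group is non-capturing, findall returns the full matched words rather than only the prefix inside the parentheses.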