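import pandas as pd

time_sentences = ["Monday: The doctor's appointment is at 2:45pm.",
                  "Tuesday: The dentist's appointment is at 11:30 am.",
                  "Wednesday: At 7:00pm, there is a basketball game!",
                  "Thursday: Be back home by 11:15 pm at the latest.",
                  "Friday: Take the train at 08:10 am, arrive at 09:00am."]
# Assumed setup (not shown in the original post): one sentence per row in a 'text' column.
df = pd.DataFrame(time_sentences, columns=['text'])
# regex=True is required for a callable replacement in recent pandas versions.
df['text'].str.replace(r'(\w+day\b)', lambda x: x.group(0)[:3], regex=True)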
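Note that the pattern above contains a single group, so we access it with 0.
I expected that passing 1 as the group index would raise an out-of-range error, since there should be no such group, but no error is raised.
df['text'].str.replace(r'(\w+day\b)', lambda x: x.group(1)[:3], regex=True)
If we pass 2 as the group index, we do get an out-of-range error.
df['text'].str.replace(r'(\w+day\b)', lambda x: x.group(2)[:3], regex=True)
Why is that?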
Answer 0 (score: 2)
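Because the () capturing group stores the characters it captures under group index 1, not 0. .group() or .group(0) returns the entire match, whereas index 1 or n returns the characters captured by the corresponding capturing group 1 or n. Your pattern has exactly one capturing group, and it wraps the whole pattern, so .group(0) and .group(1) happen to return the same text. There is no second group, which is why .group(2) fails.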
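To see the indexing outside pandas, here is a minimal sketch using the standard `re` module directly on one of the sentences from the question. The second pattern, `(\w+)(day)\b`, is not from the question; it is a hypothetical variant added only to show that groups 1 and 2 follow the order of the parentheses.

```python
import re

sentence = "Monday: The doctor's appointment is at 2:45pm."

# Pattern from the question: one capturing group wrapping the whole pattern.
match = re.search(r'(\w+day\b)', sentence)
whole = match.group(0)   # the entire match
first = match.group(1)   # text captured by the only pair of parentheses
print(whole, first)      # both are 'Monday'

# Asking for a group that does not exist raises IndexError.
try:
    match.group(2)
    second_error = None
except IndexError as exc:
    second_error = exc
print(repr(second_error))

# Hypothetical variant with two groups, to show how indices map to parentheses.
two = re.search(r'(\w+)(day)\b', sentence)
parts = (two.group(0), two.group(1), two.group(2))
print(parts)             # ('Monday', 'Mon', 'day')
```

Group 0 always exists for a successful match, so `x.group(0)` works whether or not the pattern has parentheses.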
Remove the () capturing group from the regex and accessing x.group(1) will raise an error.
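A small sketch of that last point, again with plain `re` rather than pandas. The assumption here is that this carries over to `str.replace`, because the pandas documentation says a callable replacement is passed the regex match object, as in `re.sub`.

```python
import re

sentence = "Friday: Take the train at 08:10 am, arrive at 09:00am."

# No parentheses: the pattern has no capturing groups, only the whole match (group 0).
shortened = re.sub(r'\w+day\b', lambda m: m.group(0)[:3], sentence)
print(shortened)

# With no capturing group, group(1) no longer exists and raises IndexError.
try:
    re.sub(r'\w+day\b', lambda m: m.group(1)[:3], sentence)
    raised = False
except IndexError:
    raised = True
print(raised)
```

So for this task the parentheses are optional: `m.group(0)[:3]` is enough to shorten the day name.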