I am trying to print this string using a regular expression:
import re
trying = 'Mar 20th, 2009'
I am unable to print the comma after 20th. This is what I tried:
print (re.findall(r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*[\s]\d{2}[th , ]+', trying))
print (re.findall(r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*[\s]\d{2}[a-z,]+', trying))
print (re.findall(r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*[\s]\d{2}[a-z]+[,]', trying))
The desired output is the input string itself. What am I doing wrong?
Answer 0 (score: 3)
This will work:
>>> print (re.findall(r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[\s]\d{1,2}th[,][\s]\d{4}',trying))
['Mar 20th, 2009']
Now let's see why your attempts did not give the expected result.
print (re.findall(r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*[\s]\d{2}[th , ]+', trying))
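-> The character class [th , ] matches any single one of t, h, space, or comma, and + repeats it, so it consumes "th, " and then the pattern ends. Nothing in the pattern matches the year, so the result is ['Mar 20th, '].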
print (re.findall(r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*[\s]\d{2}[a-z,]+', trying))
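-> With +, the class [a-z,] matches one or more lowercase letters or commas, so the match runs through "th," and stops at the following space. The result is ['Mar 20th,'].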
print (re.findall(r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*[\s]\d{2}[a-z]+[,]', trying))
-> Similarly, this pattern ends with [,], so the match stops right after "th," and the result is again ['Mar 20th,'].
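To make the explanation above concrete, here is a minimal sketch that runs the three attempts and then extends the third one. The names month, attempts, results, and fixed are only illustrative, and appending \s\d{4} is just one possible fix, assuming the year is always four digits after a single whitespace character.

```python
import re

trying = 'Mar 20th, 2009'
month = r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)'

# The three attempts from the question; none of them has a part for the year.
attempts = [
    month + r'[a-z]*[\s]\d{2}[th , ]+',
    month + r'[a-z]*[\s]\d{2}[a-z,]+',
    month + r'[a-z]*[\s]\d{2}[a-z]+[,]',
]
results = [re.findall(pattern, trying) for pattern in attempts]
for result in results:
    print(result)

# Assumption: one whitespace character and a four-digit year follow the comma.
fixed = re.findall(attempts[2] + r'\s\d{4}', trying)
print(fixed)
```

Each attempt stops at or just after the comma because the pattern itself ends there. Once the pattern describes the year as well, the whole string is returned.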
Answer 1 (score: 2)
Try this regex:
r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) (?:[0-9]{2}|[0-9])[rdth]{2}, \d{4}'
It matches the following:
>>> x = re.findall(r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) (?:[0-9]{2}|[0-9])[rdth]{2}, \d{4}', trying)
>>> x
['Mar 20th, 2009']
>>> tryig = 'Jun 3rd, 2017'
>>> x = re.findall(r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) (?:[0-9]{2}|[0-9])[rdth]{2}, \d{4}', tryig)
>>> x
['Jun 3rd, 2017']
Update based on the comments:
>>> regex = r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d{1,2}[rdth]{2}, \d{4}'
>>> x = re.findall(regex, trying)
>>> x
['Mar 20th, 2009']
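One caveat with [rdth]{2}: it accepts any two of the characters r, d, t, h, so it matches "rd" and "th" but not "st" or "nd", and dates such as "Jan 1st, 2020" are missed. A possible variant, sketched here as an assumption and not as part of the answer above, spells out the suffixes and also allows full month names through [a-z]*. DATE_RE and samples are hypothetical names.

```python
import re

# Hypothetical variant: explicit ordinal suffixes instead of [rdth]{2}.
DATE_RE = re.compile(
    r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*'
    r'\s\d{1,2}(?:st|nd|rd|th),\s\d{4}'
)

samples = 'Mar 20th, 2009; Jun 3rd, 2017; Jan 1st, 2020; February 2nd, 2021'
matches = DATE_RE.findall(samples)
print(matches)

# For comparison, the [rdth]{2} pattern finds nothing in a "1st" date.
old = r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d{1,2}[rdth]{2}, \d{4}'
missed = re.findall(old, 'Jan 1st, 2020')
print(missed)
```

This still does not validate the date itself (for example, "Feb 31st" or "3th" would pass); it only checks the shape of the string.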