Question

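I am learning regular expressions and Beautiful Soup, working through Google's regular expressions tutorial. I am using the HTML files provided on the tutorial site (the exercise set from the tutorial's setup section).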

The code is as follows:

from bs4 import BeautifulSoup as bs
with open(filepath, "r") as f:
    soup = bs(f, 'lxml')
soup.title

Out:

<title>Popular Baby Names</title>

Code:

# find_all() returns every <h3> tag; this page has only one, which contains the year.
h3 = soup.find_all("h3")

h3[0].get_text()

Out:

u'Popularity in 1990'

Code:

import re
pattern = re.compile(r'.+(\d\d\d\d).+')
string = h3[0].get_text()
pattern.match(string).group(0)

Out:

AttributeError                            Traceback (most recent call last)
<ipython-input-61-2e4daef3292c> in <module>()
----> 1 pattern.match(string).group(0)

AttributeError: 'NoneType' object has no attribute 'group'

I can't explain why match() fails to capture the year.

Any suggestions would be greatly appreciated.

Answer 1

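Because the pattern expects at least one character after the year. The trailing .+ must match one or more characters, but in 'Popularity in 1990' the string ends right after the four digits, so match() returns None. Try .* instead of .+. Note also that group(0) returns the entire matched text; use group(1) to get only the year.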
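As a minimal sketch of the suggestion above, the snippet below compares the original pattern with the relaxed one on the same string. The variable names (text, strict, relaxed, year, year_search) are only illustrative, and the string is hard-coded here on the assumption that it equals what h3[0].get_text() returned in the question.

```python
import re

# Assumed to be the value of h3[0].get_text() from the question.
text = "Popularity in 1990"

# Original pattern: the trailing .+ needs at least one character after
# the four digits, but the string ends right after them.
strict = re.compile(r'.+(\d\d\d\d).+')
print(strict.match(text))

# Relaxed pattern: .* also accepts zero characters before or after the year.
relaxed = re.compile(r'.*(\d\d\d\d).*')
m = relaxed.match(text)

# Guard against None before calling group() to avoid the AttributeError;
# group(1) is the captured year, group(0) would be the whole match.
year = m.group(1) if m else None
print(year)

# Alternative: search() scans the whole string, so no padding is needed.
found = re.search(r'\d{4}', text)
year_search = found.group(0) if found else None
print(year_search)
```

If only the year is needed, the re.search() form may be the simpler choice, since it does not depend on what surrounds the digits.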

Pattern matching with a regular expression returns None when it shouldn't
