Question

I'm trying to use a regular expression in the find command. I want to find a span tag that contains the text 'example'.

I have tried this:

place = infoFrame.find('span',text = re.compile('.*example.*:'))

I get this error:

UnboundLocalError: local variable 're' referenced before assignment

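This is strange, because I wrote import re at the top of the file. The line above is inside a method of a class.

I know there is another way: find all the span tags and then check whether each one contains 'example'. But I'm curious how to use a regex inside the find command.

Can you tell me what I'm doing wrong?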

Answer 1

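You are using re = ... or import re somewhere else in your function, i.e. you are using re as a local variable. Python treats a name that is assigned anywhere in a function as local to the whole function, so the reference to re fails if it runs before that assignment. Rename or remove that local variable: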

>>> from bs4 import BeautifulSoup
>>> import re
>>> soup = BeautifulSoup('<span>example: foo</span>')
>>> def find_span():
...     return soup.find('span', text=re.compile('.*example.*:'))
...     re = 'oops'
... 
>>> find_span()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in find_span
UnboundLocalError: local variable 're' referenced before assignment
>>> def find_span():
...     return soup.find('span', text=re.compile('.*example.*:'))
... 
>>> find_span()
<span>example: foo</span>
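
The scoping rule behind this error can also be reproduced without BeautifulSoup. The following is a minimal sketch using only the standard library; the function names broken and fixed and the variable label are made up for illustration.

```python
import re


def broken(text):
    # The assignment further down makes `re` a local name for the whole
    # function, so this lookup fails before the module is ever reached.
    match = re.search('example', text)
    re = 'oops'
    return match


def fixed(text):
    # The local variable has been renamed, so `re` is the imported module.
    label = 'oops'
    return re.search('example', text)


error_name = None
try:
    broken('example: foo')
except UnboundLocalError as exc:
    error_name = type(exc).__name__

print(error_name)                     # UnboundLocalError
print(fixed('example: foo').group())  # example
```

The same thing happens if the local binding is an import re statement placed later in the function rather than a plain assignment.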

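Your use of re.compile() is not ideal; you can drop the leading .* and avoid backtracking problems. With the leading .*, the pattern will be very slow for any element that contains a lot of text and no 'example' text. Also make the second * non-greedy by adding ?: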

place = infoFrame.find('span', text=re.compile('example.*?:'))
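
As a rough check that dropping the leading .* does not change which elements are found, the two patterns can be compared directly with the re module. This sketch assumes, as the Beautiful Soup documentation describes, that a compiled pattern is applied with its search() method, so a match anywhere in the text is enough. The sample strings and variable names are invented for illustration.

```python
import re

original = re.compile('.*example.*:')
suggested = re.compile('example.*?:')

text = 'an example: foo: bar'

# Both patterns find a match, so both would select the same element;
# they differ only in how much of the text the match covers.
original_match = original.search(text).group()
suggested_match = suggested.search(text).group()
print(original_match)   # an example: foo:
print(suggested_match)  # example:

# Neither pattern matches text that lacks the keyword.
no_keyword = 'plenty of text: but not the keyword'
print(original.search(no_keyword), suggested.search(no_keyword))  # None None
```

Because find only needs to know whether a match exists, the shorter non-greedy match is sufficient.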

Using a regular expression in BeautifulSoup's find
