I have this code:
import requests
from bs4 import BeautifulSoup
import re
url = "http://www.rockefeller.edu/research/areas/summary.php?id=1"
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html.parser')
for x in (soup.find_all(string=re.compile('comment'))):
    print(x.parent)
    print(x.parent.name)
I expected it to print <a href="/about/comments">Comments</a>
and a,
but it prints nothing at all.
I am using:
requests: 2.7.0
beautifulsoup4: 4.4.0
Python: 3.4.3
Running in IDLE on a MacBook Pro
Answer 0 (score: 1)
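re.compile()
is case-sensitive by default. The link text on the page is "Comments" with a capital C, so the pattern 'comment' never matches it. You have to pass the flag re.I
to make the pattern case-insensitive. See the following demo: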
import requests
from bs4 import BeautifulSoup
import re
url = "http://www.rockefeller.edu/research/areas/summary.php?id=1"
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html.parser')
for x in (soup.find_all(True,text=re.compile(r'comment', re.I))):
    print(x)
Output
<a href="/about/comments">Comments</a>
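The difference can be checked without fetching the page at all. The sketch below uses only the standard `re` module. The variable `link_text` is a hypothetical stand-in for the text of the link, and the snippet assumes BeautifulSoup tests each string with the pattern's `search()` method, which is my understanding of its behaviour rather than something shown in the question.

```python
import re

# Hypothetical stand-in for the text node inside the <a> tag on the page.
link_text = "Comments"

# Same pattern as in the question: case-sensitive by default.
case_sensitive = re.compile("comment")

# Same pattern with the re.I (re.IGNORECASE) flag set.
case_insensitive = re.compile("comment", re.I)

# search() returns None when the pattern is not found anywhere in the string.
sensitive_match = case_sensitive.search(link_text)
insensitive_match = case_insensitive.search(link_text)

print(sensitive_match)             # None: lowercase "c" does not match "C"
print(insensitive_match.group(0))  # Comment
```

If the page always capitalises the word, compiling 'Comment' would also work, but re.I is the more robust choice when the capitalisation is not under your control.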