I have this code:
import requests
from bs4 import BeautifulSoup

url = "http://www.rockefeller.edu/research/areas/summary.php?id=1"
r = requests.get(url)
soup = BeautifulSoup(r.content)
a = 'Comments'
for x in soup.find_all('p'):
    if a in x:
        print(x)
    else:
        print('it is not there')
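Basically, I have a word and I want to know where it appears in the page. Let's say my word is 'Comments'. I want to find where that word occurs and print the tag that contains it (for example: <a href=#>Comments</a>).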
Updated code (this does not work for me):
import requests
from bs4 import BeautifulSoup
import re

url = "http://www.rockefeller.edu/research/areas/summary.php?id=1"
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html.parser')
for x in soup.find_all(string=re.compile('comment', flags=re.I)):
    print(x.parent)
    print(x.parent.name)
Answer 0 (score: 1)
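Pass a compiled regular expression object as the string keyword argument. find_all will then return the string objects whose text matches, and you can use each string's parent attribute to get the tag that contains it.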
import re
...
# Each match is a text node; .parent is the tag that encloses it.
for x in soup.find_all(string=re.compile('comment', flags=re.I)):
    print(x.parent)
    print(x.parent.name)
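To try this without fetching the live page, here is a minimal sketch that runs the same search against a small inline document. The HTML snippet and the variable names (html, matches, parent_names) are made up for illustration and are not taken from the real page. As far as I can tell, the original `if a in x` check fails because `in` on a tag compares against the tag's direct children rather than searching its text, which is why a regex search over the strings is used here instead.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the real page (an assumption for this demo).
html = """
<div>
  <p>Nothing relevant here.</p>
  <p>Read the <a href="#">Comments</a> section.</p>
  <span>No comment.</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# With only string= given, find_all returns the matching text nodes,
# not the tags that hold them.
matches = soup.find_all(string=re.compile("comment", flags=re.I))

# .parent moves from each text node up to its enclosing tag.
parent_names = [m.parent.name for m in matches]

for m in matches:
    print(m.parent.name, "->", m.parent)
```

In this sample only the text nodes inside the a and span tags contain the word, so those are the two parents printed; the outer p is not reported because none of its own text nodes match.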
Answer 1 (score: 0)
I found the answer. It now looks like this:
for x in soup.find_all(True, text=re.compile(r'comment', re.I)):
    print(x)
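For comparison, a small sketch of how this variant behaves on a made-up snippet (the HTML and variable names are assumptions for illustration). Passing True as the first argument makes find_all return tags rather than text nodes, keeping only tags whose .string matches the pattern. The sketch uses string=, which is the name Beautiful Soup has used for this argument since 4.4.0; text= is the older spelling of the same argument.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical HTML for the demo, not taken from the real page.
html = '<p>Read the <a href="#">Comments</a> section.</p><span>No comment.</span>'
soup = BeautifulSoup(html, "html.parser")
pattern = re.compile(r"comment", re.I)

# True matches every tag; string= then filters on each tag's .string.
tags = soup.find_all(True, string=pattern)
tag_names = [t.name for t in tags]

for t in tags:
    print(t)
```

Note that the p tag is not returned here: it has several children, so its .string is None and there is nothing for the pattern to match against. If you need the enclosing tag in mixed-content cases, searching the strings and walking up with .parent is the safer route.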