I have the following function, which checks whether a keyword exists on a web page:
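import re
import requests
from bs4 import BeautifulSoup

def checkString():
    url_a = 'https://launchstudio.bluetooth.com/ListingDetails/50756'
    r_a = requests.get(url_a)
    soup_a = BeautifulSoup(r_a.text)
    # Return True as soon as any text node contains RFCOMM
    for blem in soup_a(text=re.compile(r'RFCOMM')):
        return True
    return False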
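I have verified that my soup_a is identical to the page's view-source, but my search only seems to return results that sit inside the head tag, and I am having a hard time figuring out why. Any suggestions?
Python version: 2.7.5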
Answer 0 (score: 2)
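You need to pass 'lxml' as the parser argument to the BeautifulSoup class. Also, return True breaks out of the for loop as soon as a match is found. So if RFCOMM is indeed found in the head tag, the loop exits and no further matches are registered. It is better to use a list comprehension and then determine whether any matches were found: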
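# Python 3 (uses urllib.request)
from bs4 import BeautifulSoup as soup
import urllib.request as urllib
import re

def checkString():
    url_a = 'https://launchstudio.bluetooth.com/ListingDetails/50756'
    # Pass the raw bytes to BeautifulSoup and name the parser explicitly;
    # wrapping the bytes in str() would parse their repr (b'...') instead.
    s = soup(urllib.urlopen(url_a).read(), 'lxml')
    # Collect every matching text node; a non-empty list means a match exists.
    return bool([i for i in s(text=re.compile(r'RFCOMM'))])

print(checkString())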
Output:
True
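
If it is unclear whether the matches really come only from the head tag, one way to cross-check is to list the enclosing tag of every matching text node. The following is a minimal sketch that swaps in the standard library's html.parser (Python 3) instead of BeautifulSoup, so it does not depend on which third-party parser is installed. TextMatchFinder, find_matches, VOID_TAGS and sample_html are hypothetical names made up for this illustration, and the sample page is invented, not the real listing.

```python
import re
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be treated as open.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TextMatchFinder(HTMLParser):
    """Record the enclosing tag of every text node that matches a pattern."""

    def __init__(self, pattern):
        super().__init__()
        self.pattern = re.compile(pattern)
        self.open_tags = []   # stack of currently open tag names
        self.matches = []     # (enclosing tag, stripped text) pairs

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, discarding unclosed children.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        if self.pattern.search(data):
            parent = self.open_tags[-1] if self.open_tags else None
            self.matches.append((parent, data.strip()))


def find_matches(html, pattern):
    """Return (enclosing tag, text) for each text node matching pattern."""
    finder = TextMatchFinder(pattern)
    finder.feed(html)
    finder.close()
    return finder.matches


# Hypothetical sample page, used only to illustrate the idea.
sample_html = (
    "<html><head><title>RFCOMM listing</title></head>"
    "<body><p>Supports RFCOMM and L2CAP.</p><p>No match here.</p></body></html>"
)

matches = find_matches(sample_html, r"RFCOMM")
print(matches)        # one entry per matching text node, head and body alike
print(bool(matches))  # same yes/no answer that checkString() returns
```

Feeding the downloaded page text to find_matches instead of sample_html should show whether any match sits outside head. If it does while BeautifulSoup reports none there, that would point at the parser rather than at the search itself.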