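I would like to use BeautifulSoup in Python3 to find all occurrences of Php on a page (ignoring case).
Php could appear anywhere on the page, so I am basically just looking for the string itself, not for something inside a specific div or class.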
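I currently have:
from bs4 import BeautifulSoup  # BeautifulSoup 4 import path for Python 3
import requests
school_urls = ['somesite1.com','somesite2.com']
posting_keywords = ['PHP', 'Php', 'php']
for school in school_urls:
    ...  # loop body omitted
request contains the HTML markup from the school URL, which includes words such as php.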
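How does this look to you? Is there a way to do this in Beautiful Soup, finding all case-insensitive variants of php without having to loop over posting_keywords?
Thanks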
Answer 0: (score: 0)
Does posting_keywords.lower() work for you?
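Note that posting_keywords is a list, and .lower() is a str method, so the lowercasing would have to be applied to the text being searched (and to each keyword) rather than to the list itself. A minimal sketch of that idea, assuming the page text is already available as a plain string; the sample page_text and the helper name count_keyword are hypothetical and not from the original post:

```python
def count_keyword(page_text: str, keyword: str) -> int:
    """Count non-overlapping, case-insensitive occurrences of keyword."""
    # Lowercase both sides so 'PHP', 'Php' and 'php' all compare equal.
    return page_text.lower().count(keyword.lower())


# Hypothetical sample text standing in for the text of a fetched page.
page_text = "Learn PHP today. Php is popular; php runs many sites."
print(count_keyword(page_text, "php"))  # 3
```

With this approach a single lowercase keyword is enough, so the three spellings in posting_keywords would not need to be listed separately.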
Answer 1: (score: 0)
import re, bs4
# Sample document containing several spellings of "php"
text = """
<html><head><title>The Dormouse's story php</title></head>
<body>
<p class="title"><b>The Dormouse's story PHP</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">php</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Php</a> and
<a href="http://example.com/tillie" class="sister" id="link3">php Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
"""
soup = bs4.BeautifulSoup(text, 'lxml')
# Match every text node containing "php" in any letter case
# (string= is the current name of the older text= argument)
soup.find_all(string=re.compile(r'php', re.IGNORECASE))
Out:
["The Dormouse's story php",
"The Dormouse's story PHP",
'php',
'Php',
'php Tillie']
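find_all returns each matching text node once, even when the keyword appears several times inside it. If a count of individual occurrences is wanted, one option is to run re.findall over each returned string. A rough sketch, assuming the matched strings have already been collected into a list; the matches list below simply copies the output above, and count_occurrences is a hypothetical helper name:

```python
import re

# Same case-insensitive pattern as in the answer above.
PHP_PATTERN = re.compile(r'php', re.IGNORECASE)


def count_occurrences(strings):
    """Sum the case-insensitive matches of 'php' across all strings."""
    return sum(len(PHP_PATTERN.findall(s)) for s in strings)


# Copied from the output shown above.
matches = [
    "The Dormouse's story php",
    "The Dormouse's story PHP",
    'php',
    'Php',
    'php Tillie',
]
print(count_occurrences(matches))  # 5
```

Here every string contains the keyword exactly once, so the total equals the number of matched nodes. The two numbers would differ for a node such as 'php and PHP'.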