I have the following HTML fragment:
>>> a
<div class="headercolumn">
<h2>
<a class="results" data-name="result-name" href="/xxy> my text</a>
</h2>
I am trying to select the header column only when the attribute data-name="result-name" is present. I tried:
>>> a.select('a["data-name="result-name""]')
This gives:
ValueError: Unsupported or invalid CSS selector:
How can I make this work?
Answer 0 (score: 7)
You can do it like this:
soup = BeautifulSoup(html)
results = soup.findAll("a", {"data-name" : "result-name"})
Source: How to find tags with only certain attributes - BeautifulSoup
Answer 1 (score: 3)
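A self-contained sketch of this approach, assuming the `bs4` package is installed and using Python's built-in `html.parser`. The `html` string is modeled on the fragment from the question, with the `href` quote and the `div` closed for the example; `find_all` is the current spelling of `findAll`.

```python
from bs4 import BeautifulSoup

# Sample markup modeled on the question (href quote and div closed here).
html = """
<div class="headercolumn">
<h2>
<a class="results" data-name="result-name" href="/xxy"> my text</a>
</h2>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# The attrs dict matches <a> tags whose data-name equals "result-name".
results = soup.find_all("a", {"data-name": "result-name"})

for tag in results:
    print(tag["href"], tag.get_text(strip=True))
```

Passing the attribute as a dictionary is needed here because `data-name` contains a hyphen and so cannot be written as a Python keyword argument.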
html = """
<div class="headercolumn">
<h2>
<a class="results" data-name="result-name" href="/xxy> my text</a>
</h2>
"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(html)
for d in soup.findAll("div",{"class":"headercolumn"}):
    print(d.a.get("data-name"))
    print(d.select("a.results"))
Output:
result-name
[<a class="results" data-name="result-name" href="/xxy> my text</a></h2>"></a>]
Answer 2 (score: 2)
Selecting by class or ID:
soup.select('a.gamers') # select an `a` tag with the class gamers
soup.select('a#gamer') # select an `a` tag with the id gamer
Selecting by a single attribute:
soup.select('a[attr="value"]')
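Applied to the question, the failing selector wrapped the whole bracket contents in quotes, whereas only the attribute value should be quoted. A minimal sketch, assuming a recent `bs4` release and a cleaned-up copy of the question's markup; walking up with `find_parent` is one possible way to reach the header column, not the only one.

```python
from bs4 import BeautifulSoup

# Cleaned-up copy of the question's fragment (assumption: href quote closed).
html = (
    '<div class="headercolumn"><h2>'
    '<a class="results" data-name="result-name" href="/xxy"> my text</a>'
    '</h2></div>'
)
soup = BeautifulSoup(html, "html.parser")

# Quote only the attribute value, not the whole attribute expression.
links = soup.select('a[data-name="result-name"]')

# To get the enclosing header column, walk up from each matching link.
columns = [link.find_parent("div", class_="headercolumn") for link in links]

print(links)
print(columns[0]["class"])
```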
Selecting by multiple attributes:
attr_dict = {
    'attr1': 'val1',
    'attr2': 'val2',
    'attr3': 'val3'
}
soup.findAll('a', attr_dict)
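The same multi-attribute match can also be expressed as a CSS selector by chaining one bracket per attribute. The following is a sketch under stated assumptions: the sample markup is made up for illustration, and the selector builder does no escaping, so it only suits values without quotes or backslashes.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: only the first link carries all three attributes.
html = """
<a attr1="val1" attr2="val2" attr3="val3">all three</a>
<a attr1="val1" attr2="other">only one match</a>
"""
soup = BeautifulSoup(html, "html.parser")

attr_dict = {"attr1": "val1", "attr2": "val2", "attr3": "val3"}

# Build 'a[attr1="val1"][attr2="val2"][attr3="val3"]' from the dict.
selector = "a" + "".join(
    f'[{name}="{value}"]' for name, value in attr_dict.items()
)

by_dict = soup.find_all("a", attr_dict)  # dictionary lookup
by_css = soup.select(selector)           # chained attribute selector

print(selector)
print([tag.get_text() for tag in by_css])
```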
In soup.select you can