I can't figure out how to combine a span's `title` attribute with the text value of the span with class `num`.
<ul>
<li>
<span class="abc" title="HOUSES"> </span>
<span class="num">1</span>
</li>
<li>
<span class="def" title="CARS"> </span>
<span class="num">2</span>
</li>
<li>
<span class="ghj" title="AGE"> </span>
<span class="num">90</span>
</li>
</ul>
How can I get output like this?
HOUSES = 1
CARS = 2
AGE = 90
This is what I have so far, but it only prints the span tags and doesn't solve the problem:
for ul_tag in soup.find_all('ul'):
    for li_tag in ul_tag.find_all('li'):
        for span in li_tag.find_all('span'):
            print(span)
Answer 0 (score: 2)
Here is one way to get the desired result:
from bs4 import BeautifulSoup
content = """
<ul>
<li>
<span class="abc" title="HOUSES"> </span>
<span class="num">1</span>
</li>
<li>
<span class="abc" title="CARS"> </span>
<span class="num">2</span>
</li>
<li>
<span class="abc" title="AGE"> </span>
<span class="num">90</span>
</li>
</ul>
"""
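soup = BeautifulSoup(content, "lxml")
for items in soup.find_all("li"):
    # The first span in each <li> carries the title attribute.
    title = items.find("span").get("title")
    # The second span holds the number as its text.
    number = items.select_one("span:nth-of-type(2)").text
    print("{} = {}".format(title, number))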
You can also try it like this, starting from each `num` span and looking backwards:
for items in soup.find_all(class_="num"):
    title = items.find_previous_sibling()['title']
    number = items.text
    print("{} = {}".format(title, number))
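The loop above calls `find_previous_sibling()` rather than reading `.previous_sibling` directly, and the reason is worth spelling out: in markup like the sample, the newline between the two spans is itself a text node. The sketch below illustrates the difference on a single `<li>`. It assumes the built-in `html.parser` instead of `lxml` so it runs without extra packages, and the variable names are illustrative only.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# One <li> from the sample markup; the newline between the spans matters here.
html = """<li>
<span class="abc" title="HOUSES"> </span>
<span class="num">1</span>
</li>"""

# Assumption: html.parser is used so no third-party parser is required.
soup = BeautifulSoup(html, "html.parser")
num_span = soup.find("span", class_="num")

# .previous_sibling returns the adjacent node, which here is whitespace text.
raw_sibling = num_span.previous_sibling
print(isinstance(raw_sibling, NavigableString), repr(str(raw_sibling)))

# find_previous_sibling("span") skips text nodes and returns the nearest <span>.
title_span = num_span.find_previous_sibling("span")
print(title_span["title"], "=", num_span.text)
```

Reading `raw_sibling["title"]` would fail because a text node has no attributes, which is why the sibling search methods are the safer choice for whitespace-formatted HTML.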
Here is another way, selecting every element that has a `title` attribute:
for items in soup.select("[title]"):
    title = items.get("title")
    number = items.find_next().text
    print("{} = {}".format(title, number))
Or like this, using a function as the filter:
for items in soup.find_all(lambda e: e.get("title")):
    title = items.get("title")
    number = items.find_next_sibling().text
    print("{} = {}".format(title, number))
Output:
HOUSES = 1
CARS = 2
AGE = 90
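If the pairs are needed as data rather than printed lines, the same lookup can fill a dictionary. This is only a sketch under a few assumptions: every `<li>` holds one `span[title]` and one `span.num`, the `num` text is always an integer, items missing either span are skipped, and the built-in `html.parser` is used in place of `lxml`. The function name `collect_counts` is hypothetical.

```python
from bs4 import BeautifulSoup

content = """
<ul>
<li>
<span class="abc" title="HOUSES"> </span>
<span class="num">1</span>
</li>
<li>
<span class="def" title="CARS"> </span>
<span class="num">2</span>
</li>
<li>
<span class="ghj" title="AGE"> </span>
<span class="num">90</span>
</li>
</ul>
"""


def collect_counts(html):
    """Map each span title to the integer found in the span.num of the same <li>."""
    # Assumption: html.parser is used so no third-party parser is required.
    soup = BeautifulSoup(html, "html.parser")
    counts = {}
    for li in soup.find_all("li"):
        title_span = li.select_one("span[title]")
        num_span = li.select_one("span.num")
        # Skip list items that lack either span instead of raising an error.
        if title_span is None or num_span is None:
            continue
        # Assumption: the num text is always a valid integer.
        counts[title_span["title"]] = int(num_span.get_text(strip=True))
    return counts


counts = collect_counts(content)
for title, number in counts.items():
    print("{} = {}".format(title, number))
```

Selecting by `span[title]` and `span.num` avoids depending on the order of the spans inside each `<li>`, which the positional variants rely on.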