Combining the values of a span's class and title attribute

Time: 2018-05-01 21:39:03

Tags: python html python-3.x beautifulsoup

I can't figure out how to combine the value of a span's 'title' attribute with the text of the span whose class is 'num'.

<ul>
       <li>
          <span class="abc" title="HOUSES"> </span>
          <span class="num">1</span>
       </li>
       <li>
          <span class="def" title="CARS"> </span>
          <span class="num">2</span>
       </li>
       <li>
          <span class="ghj" title="AGE"> </span>
          <span class="num">90</span>
       </li>
</ul>

How can I get output like the following?

HOUSES = 1

CARS = 2

AGE = 90

This is what I have so far, but it has not solved the problem:

    for li_tag in soup.find_all('ul'):
        for span_tag in li_tag.find_all('li'):
            for span in span_tag.find_all('span'):
                print(span)

1 answer:

Answer 0: (score: 2)

Here is one way to get the desired result:

from bs4 import BeautifulSoup

content = """
<ul>
   <li>
      <span class="abc" title="HOUSES"> </span>
      <span class="num">1</span>
   </li>
   <li>
      <span class="abc" title="CARS"> </span>
      <span class="num">2</span>
   </li>
   <li>
      <span class="abc" title="AGE"> </span>
      <span class="num">90</span>
   </li>
</ul>
"""

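# "lxml" requires the lxml package; the built-in "html.parser" also works here
soup = BeautifulSoup(content, "lxml")
for items in soup.find_all("li"):
    # the first span in each li carries the title attribute
    title = items.find("span").get("title")
    # the second span in each li holds the number
    number = items.select_one("span:nth-of-type(2)").text
    print("{} = {}".format(title, number))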
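The loop above relies on the position of each span inside its li. If the order of the spans might vary, a variant that looks each span up by what identifies it (the title attribute and the num class) may be more robust. The following is only a sketch under that assumption: the function name pair_title_with_num and the extra li without a title are made up for illustration, and "html.parser" is used so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Sample markup from the question, plus one hypothetical <li> that has
# no title span, to show how incomplete items are skipped.
html = """
<ul>
  <li><span class="abc" title="HOUSES"> </span><span class="num">1</span></li>
  <li><span class="def" title="CARS"> </span><span class="num">2</span></li>
  <li><span class="ghj" title="AGE"> </span><span class="num">90</span></li>
  <li><span class="num">7</span></li>
</ul>
"""


def pair_title_with_num(markup):
    """Return (title, number_text) tuples, one per complete <li>."""
    soup = BeautifulSoup(markup, "html.parser")
    pairs = []
    for li in soup.find_all("li"):
        title_span = li.select_one("span[title]")  # span that has a title attribute
        num_span = li.select_one("span.num")       # span with class "num"
        if title_span is None or num_span is None:
            continue  # skip items that lack either span
        pairs.append((title_span["title"], num_span.get_text(strip=True)))
    return pairs


pairs = pair_title_with_num(html)
for title, number in pairs:
    print("{} = {}".format(title, number))
```

Selecting by attribute and class means the result does not change if the two spans swap places, at the cost of two lookups per li.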

You could also try it like this:

for items in soup.find_all(class_="num"):
    title = items.find_previous_sibling()['title']
    number = items.text
    print("{} = {}".format(title,number))

Here is another way:

for items in soup.select("[title]"):
    title = items.get("title")
    number = items.find_next().text
    print("{} = {}".format(title,number))

Or like this:

for items in soup.find_all(lambda e: e.get("title")):
    title = items.get("title")
    number = items.find_next_sibling().text
    print("{} = {}".format(title,number))

Output:

HOUSES = 1
CARS = 2
AGE = 90
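If the values are needed for further processing rather than just printing, the same sibling-based lookup can fill a dictionary. This is a minimal sketch, assuming every num span contains an integer; the function name titles_to_numbers is hypothetical, and "html.parser" is used so that lxml is not required.

```python
from bs4 import BeautifulSoup

content = """
<ul>
   <li>
      <span class="abc" title="HOUSES"> </span>
      <span class="num">1</span>
   </li>
   <li>
      <span class="def" title="CARS"> </span>
      <span class="num">2</span>
   </li>
   <li>
      <span class="ghj" title="AGE"> </span>
      <span class="num">90</span>
   </li>
</ul>
"""


def titles_to_numbers(markup):
    """Map each title attribute to the integer in the following num span."""
    soup = BeautifulSoup(markup, "html.parser")
    result = {}
    for num_span in soup.find_all("span", class_="num"):
        # nearest preceding sibling <span> that has a title attribute
        title_span = num_span.find_previous_sibling("span", title=True)
        if title_span is not None:
            # assumption: the text is always a valid integer
            result[title_span["title"]] = int(num_span.get_text(strip=True))
    return result


result = titles_to_numbers(content)
for title, number in result.items():
    print("{} = {}".format(title, number))
```

Passing "span" and title=True to find_previous_sibling restricts the match to span tags that carry a title, so whitespace between the tags does not get in the way.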