Pythonic way to parse a small HTML snippet with BeautifulSoup?

Time: 2018-08-10 13:28:37

Tags: python beautifulsoup

What is the most Pythonic way to parse the following HTML with BeautifulSoup?

<html>

<body>
  <div class="bet_group">
    <div class="bet-title bet-title_justify"><span class="bet-title__star"></span> Total
      <!-- -->
    </div>
    <div class="bets betCols2">
      <div class=""><span class="bet_type" data-type="9">Total Over 4.5</span> <span class="koeff" data-coef="3.38"><i>3.38</i></span></div>
      <div class=""><span class="bet_type" data-type="10">Total Under 4.5</span> <span class="koeff" data-coef="1.34"><i>1.34</i></span></div>
      <div class=""><span class="bet_type" data-type="9">Total Over 5.5</span> <span class="koeff" data-coef="12.5"><i>12.5</i></span></div>
      <div class=""><span class="bet_type" data-type="10">Total Under 5</span> <span class="koeff" data-coef="1.04"><i>1.04</i></span></div>
      <div class="bets__empty-cell"> </div>
      <div class=""><span class="bet_type" data-type="10">Total Under 5.5</span> <span class="koeff" data-coef="1.02"><i>1.02</i></span></div>
    </div>
  </div>
</body>

</html>

I am trying to get this output:

Title: Total

Total Over 4.5: 3.38, Total Under 4.5: 1.34

Total Over 5.5: 12.5, Total Under 5.5: 1.02

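I tried the following code, but it doesn't quite get there: it reads the bet names and the odds in two separate loops, so they are never paired.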

from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'lxml')

# Bet names, e.g. "Total Over 4.5"
infos = soup.find_all('span', class_='bet_type')
for info in infos:
    print(info.get_text())
# Odds, e.g. "3.38"
odds = soup.find_all('span', class_='koeff')
for odd in odds:
    print(odd.get_text())

2 answers:

Answer 0: (score: 0)

Try:

from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'lxml')
output = ""
for i in soup.find("div", class_="bet_group").text.splitlines():
    if i.strip():
        output += i.strip()+"\n"
print(output)

Output:

Total
Total Over 4.5 3.38
Total Under 4.5 1.34
Total Over 5.5 12.5
Total Under 5 1.04
Total Under 5.5 1.02
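The approach above depends on how whitespace happens to be laid out in the markup. As a rough alternative, here is a minimal sketch that walks the rows and pairs each bet name with its coefficient. The helper name `parse_bets`, the shortened `HTML` sample, and the use of the built-in `html.parser` instead of `lxml` are assumptions made for illustration, not part of the original answer.

```python
from bs4 import BeautifulSoup

# Shortened copy of the question's markup (assumption: the real page has the same structure).
HTML = """
<div class="bet_group">
  <div class="bet-title bet-title_justify"><span class="bet-title__star"></span> Total
    <!-- -->
  </div>
  <div class="bets betCols2">
    <div class=""><span class="bet_type" data-type="9">Total Over 4.5</span> <span class="koeff" data-coef="3.38"><i>3.38</i></span></div>
    <div class=""><span class="bet_type" data-type="10">Total Under 4.5</span> <span class="koeff" data-coef="1.34"><i>1.34</i></span></div>
    <div class="bets__empty-cell"> </div>
  </div>
</div>
"""


def parse_bets(html):
    """Return (title, [(bet name, coefficient), ...]) for the first bet group."""
    soup = BeautifulSoup(html, "html.parser")
    group = soup.find("div", class_="bet_group")
    title = group.find("div", class_="bet-title").get_text(strip=True)
    bets = []
    # Only the direct children of the "bets" container are rows.
    for row in group.find("div", class_="bets").find_all("div", recursive=False):
        name = row.find("span", class_="bet_type")
        coef = row.find("span", class_="koeff")
        if name is None or coef is None:
            continue  # skip filler cells such as bets__empty-cell
        bets.append((name.get_text(strip=True), float(coef["data-coef"])))
    return title, bets


title, bets = parse_bets(HTML)
print("Title:", title)
for name, coef in bets:
    print(f"{name}: {coef}")
```

Reading the value from the `data-coef` attribute rather than from the nested `<i>` text keeps the number independent of how the page renders it.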

Answer 1: (score: 0)

This may help you:

    from bs4 import BeautifulSoup

    st = """
        <html>

<body>
  <div class="bet_group">
    <div class="bet-title bet-title_justify"><span class="bet-title__star"></span> Total
      <!-- -->
    </div>
    <div class="bets betCols2">
      <div class=""><span class="bet_type" data-type="9">Total Over 4.5</span> <span class="koeff" data-coef="3.38"><i>3.38</i></span></div>
      <div class=""><span class="bet_type" data-type="10">Total Under 4.5</span> <span class="koeff" data-coef="1.34"><i>1.34</i></span></div>
      <div class=""><span class="bet_type" data-type="9">Total Over 5.5</span> <span class="koeff" data-coef="12.5"><i>12.5</i></span></div>
      <div class=""><span class="bet_type" data-type="10">Total Under 5</span> <span class="koeff" data-coef="1.04"><i>1.04</i></span></div>
      <div class="bets__empty-cell"> </div>
      <div class=""><span class="bet_type" data-type="10">Total Under 5.5</span> <span class="koeff" data-coef="1.02"><i>1.02</i></span></div>
    </div>
  </div>
</body>

</html>
    """
    soup = BeautifulSoup(st, 'lxml')
    title = soup.find('div', attrs={'class': 'bet-title'}).get_text().strip()
    print(title)
    for spn in soup.find_all('span', attrs={'class': 'bet_type'}):
        bet_text = spn.get_text()
        print(bet_text)


    # Output as: Total
    #            Total Over 4.5
    #            Total Under 4.5
    #            Total Over 5.5
    #            Total Under 5
    #            Total Under 5.5
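The snippet above prints the title and the bet names but not the odds. To get closer to the paired layout the question asks for, one possible sketch groups the rows by their trailing threshold (4.5, 5.5, ...) and keeps only the thresholds that have both an Over and an Under entry. The function name `paired_lines`, the grouping rule, and the choice of `html.parser` are assumptions made here; the question does not say how an unmatched row such as "Total Under 5" should be handled.

```python
from bs4 import BeautifulSoup

# Rows copied from the question's markup.
HTML = """
<div class="bet_group">
  <div class="bet-title bet-title_justify"><span class="bet-title__star"></span> Total
    <!-- -->
  </div>
  <div class="bets betCols2">
    <div class=""><span class="bet_type" data-type="9">Total Over 4.5</span> <span class="koeff" data-coef="3.38"><i>3.38</i></span></div>
    <div class=""><span class="bet_type" data-type="10">Total Under 4.5</span> <span class="koeff" data-coef="1.34"><i>1.34</i></span></div>
    <div class=""><span class="bet_type" data-type="9">Total Over 5.5</span> <span class="koeff" data-coef="12.5"><i>12.5</i></span></div>
    <div class=""><span class="bet_type" data-type="10">Total Under 5</span> <span class="koeff" data-coef="1.04"><i>1.04</i></span></div>
    <div class="bets__empty-cell"> </div>
    <div class=""><span class="bet_type" data-type="10">Total Under 5.5</span> <span class="koeff" data-coef="1.02"><i>1.02</i></span></div>
  </div>
</div>
"""


def paired_lines(html):
    """Return (title, lines) where each line joins the bets sharing a threshold."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.select_one("div.bet-title").get_text(strip=True)
    groups = {}  # threshold -> ["name: coef", ...]; dicts keep insertion order
    for row in soup.select("div.bets > div"):
        name = row.select_one("span.bet_type")
        coef = row.select_one("span.koeff")
        if name is None or coef is None:
            continue  # empty filler cell
        label = name.get_text(strip=True)
        threshold = label.rsplit(" ", 1)[-1]  # "Total Over 4.5" -> "4.5"
        groups.setdefault(threshold, []).append(f"{label}: {coef['data-coef']}")
    # Assumption: a threshold with only one side (e.g. "5") is left out.
    lines = [", ".join(items) for items in groups.values() if len(items) == 2]
    return title, lines


title, lines = paired_lines(HTML)
print("Title:", title)
for line in lines:
    print(line)
# Title: Total
# Total Over 4.5: 3.38, Total Under 4.5: 1.34
# Total Over 5.5: 12.5, Total Under 5.5: 1.02
```

Keying on the threshold instead of on position means the blank `bets__empty-cell` and the unmatched "Total Under 5" row do not shift the pairing.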