Question

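I'd like to know how to get all the items from a list using a Python scraper. I want to extract the content shown at http://prntscr.com/dged67. I've worked out how to get it, but the output includes some ugly tags that I'd like to get rid of.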

info = soup.findAll('ul', {'class': 'list-unstyled pull-left custom-stats'})
print(info)

This is the code I'm using. This is what I get: http://prntscr.com/dgedzu

Answer 1

Based on your HTML:

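from bs4 import BeautifulSoup

# Parse the saved page; "web.html" is a local copy of the HTML.
with open("web.html") as f:
    soup = BeautifulSoup(f, "html.parser")

# Find the container div, then print the text of each list item without tags.
divs = soup.find_all('div', {'class': 'pull-left custom-photo-modal-stats'})
for div in divs:
    for list_item in div.find_all('li'):
        print(list_item.get_text())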
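
The loop above reads from a local file named web.html. For a version that runs without that file, here is a minimal sketch that parses an inline snippet instead. The snippet is a shortened copy of the markup from this thread, and the names html and lines are only illustrative. It also uses a CSS selector in place of the nested loops, which is an alternative approach and not what the answer originally used.

```python
from bs4 import BeautifulSoup

# Shortened copy of the markup from this thread (assumed structure).
html = """
<div class="pull-left custom-photo-modal-stats">
  <h3>Stats</h3>
  <ul class="list-unstyled pull-left custom-stats">
    <li><span class="highlight">Eye color:</span> hazel</li>
    <li><span class="highlight">Hair color:</span> brown</li>
  </ul>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# select() takes a CSS selector; get_text() drops the tags and keeps the text.
# The " " separator joins the label and the value, strip=True trims whitespace.
lines = [li.get_text(" ", strip=True) for li in soup.select("ul.custom-stats li")]

for line in lines:
    print(line)
```

Matching on the single class custom-stats, rather than the full class string, should also pick up the second list, which carries the extra class custom-stats-right.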

Answer 2

    <div class="pull-left custom-photo-modal-stats">
        <h3>Stats</h3>
        <ul class="list-unstyled pull-left custom-stats">
            <li><span class="highlight">Eye color:</span> hazel</li>
            <li><span class="highlight">Hair color:</span> brown</li>
            <li><span class="highlight">Height:</span> 5'5"</li>
            <li><span class="highlight">Weight:</span> 110 lbs</li>
        </ul>
        <ul class="list-unstyled pull-left custom-stats custom-stats-right">
            <li><span class="highlight">Breasts:</span> medium</li>
            <li><span class="highlight">Size:</span> 34/24/37</li>
            <li><span class="highlight">Shaved:</span> shaved</li>
            <li><span class="highlight">Ethnicity:</span> Caucasian</li>
        </ul>
    </div>
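
Given markup like this, one way to get clean values is to split each item into its label (the span) and the text that follows it. The sketch below assumes every li holds one span with class highlight followed by the value; the names stats, label and value are illustrative, and the inline snippet is a trimmed copy of the markup above.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup above (assumed layout: <span> label, then value).
html = """
<ul class="list-unstyled pull-left custom-stats">
  <li><span class="highlight">Eye color:</span> hazel</li>
  <li><span class="highlight">Weight:</span> 110 lbs</li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

stats = {}
for li in soup.find_all("li"):
    span = li.find("span", class_="highlight")
    if span is None:
        continue  # skip items that do not follow the assumed layout
    label = span.get_text(strip=True).rstrip(":")
    # Remove the label span so only the value text remains in the <li>.
    span.extract()
    value = li.get_text(strip=True)
    stats[label] = value

print(stats)
```

A dict keeps each label attached to its value, which is usually easier to work with afterwards than a flat list of strings.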

Get all content from a list (HTML scraping with Python)

2 answers: