BS4 can't find all the divs

Time: 2018-03-20 23:20:42

Tags: python beautifulsoup findall

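I am trying to reach the table at the bottom of the website, but findall() keeps returning empty objects. So I went through all the divs on the same level one by one, and noticed that when I try to get the last two it gives me []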

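import urllib.request
from bs4 import BeautifulSoup

the_page = urllib.request.urlopen("https://theunderminejournal.com/#eu/sylvanas/item/124105")
bsObj = BeautifulSoup(the_page, 'html.parser')
# Look for the div with class "page" and id "item-page"
test = bsObj.findAll('div', {'class': 'page', 'id': "item-page"})
print(test)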

I have gone through the bs4 object I get back, and the 2 divs I am looking for are not in it. Why does this happen?

The div I am looking for is at https://theunderminejournal.com/#eu/sylvanas/item/124105

This is the div I'm trying to extract.

1 answer:

Answer 0: (score: 1)

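You need to use selenium instead of a plain request library. Everything after the # in the URL is a fragment that is never sent to the server, so the item page is most likely built by JavaScript in the browser, and the HTML that urllib downloads does not contain the rendered table.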
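To see what a plain HTTP client actually asks the server for, the URL from the question can be split with the standard library. This is only a small illustration; the variable names are made up for the example, and it makes no network request.

```python
from urllib.parse import urldefrag, urlsplit

url = "https://theunderminejournal.com/#eu/sylvanas/item/124105"
parts = urlsplit(url)

# Everything after '#' is the fragment. It stays on the client side
# and is not part of the HTTP request.
print(parts.path)      # /
print(parts.fragment)  # eu/sylvanas/item/124105

# The URL without its fragment is what the server is asked for,
# i.e. the site's front page rather than a specific item page.
requested = urldefrag(url).url
print(requested)       # https://theunderminejournal.com/
```

A real browser, which is what selenium drives, runs the page's scripts, and those scripts can read the fragment and fill in the item data.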

Note that I can't post the full output, because the parsed HTML is huge.

Code:

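from bs4 import BeautifulSoup
from selenium import webdriver

# Start a real browser so the page's JavaScript gets executed
driver = webdriver.Chrome()
driver.get("https://theunderminejournal.com/#eu/sylvanas/item/124105")
# page_source is the HTML of the page as the browser currently has it
bsObj = BeautifulSoup(driver.page_source, 'html.parser')
driver.quit()  # close the browser once the HTML has been captured

test = bsObj.find('div', id='item-page')
print(test.prettify())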

Output:

<div class="page" id="item-page" style="display: block;">
 <div class="item-stats">
  <table>
   <tr class="available">
    <th>
     Available Quantity
    </th>
    <td>
     <span>
      30,545
     </span>
    </td>
   </tr>
   <tr class="spacer">
    <td colspan="3">
    </td>
   </tr>
   <tr class="current-price">
    <th>
     Current Price
    </th>
    <td>
     <span class="money-gold">
      27.34
     </span>
    </td>
   </tr>
   <tr class="median-price">
    <th>
     Median Price
    </th>
    <td>
     <span class="money-gold">
      30.11
     </span>
    </td>
   </tr>
   <tr class="mean-price">
    <th>
     Mean Price
    </th>
    <td>
     <span class="money-gold">
      30.52
     </span>
    </td>
   </tr>
   <tr class="standard-deviation">
    <th>
     Standard Deviation
    </th>
    <td>
     <span class="money-gold">
      .
      .
      .
       </span>
      </abbr>
     </td>
    </tr>
   </table>
  </div>
 </div>
</div>
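Once the rendered div is available, the label/value pairs can be read out of the table. The following is a minimal sketch, assuming the markup looks like the output above. `SAMPLE_HTML` is a hand-shortened copy of the first rows, and `stats_to_dict` is a hypothetical helper name, not part of bs4.

```python
from bs4 import BeautifulSoup

# Shortened copy of the first rows of the output above (assumption: the
# real page has the same th/td layout for the remaining rows).
SAMPLE_HTML = """
<div class="page" id="item-page" style="display: block;">
 <div class="item-stats">
  <table>
   <tr class="available"><th>Available Quantity</th><td><span>30,545</span></td></tr>
   <tr class="spacer"><td colspan="3"></td></tr>
   <tr class="current-price"><th>Current Price</th><td><span class="money-gold">27.34</span></td></tr>
  </table>
 </div>
</div>
"""


def stats_to_dict(item_div):
    """Map each row's <th> label to the text of its <td> cell."""
    stats = {}
    for row in item_div.select("div.item-stats tr"):
        label = row.find("th")
        value = row.find("td")
        if label is None or value is None:
            continue  # spacer rows have no <th>, so skip them
        stats[label.get_text(strip=True)] = value.get_text(strip=True)
    return stats


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
stats = stats_to_dict(soup.find("div", id="item-page"))
print(stats)  # {'Available Quantity': '30,545', 'Current Price': '27.34'}
```

The values are kept as strings on purpose, since numbers such as 30,545 contain a thousands separator and would need cleaning before conversion.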