Table is empty when I try to get it with BeautifulSoup

Date: 2019-03-23 19:57:48

Tags: python parsing web-scraping beautifulsoup screen-scraping

I'm trying to parse a table from the site https://www.kp.ru/best/kazan/abiturient_2018/ivmit/. Chrome's DevTools shows me that the table is:

<div class="t431__table-wapper" data-auto-correct-mobile-width="false"> 
<table class="t431__table " style="">
...
</table>
</div>

But when I do this:

import requests
from bs4 import BeautifulSoup

url = "https://www.kp.ru/best/kazan/abiturient_2018/ivmit/"
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
# Collect every wrapper <div> that surrounds a table
tag = soup.find_all('div', {'class': 't431__table-wapper'})
print(tag)

It returns this, as if the <table> were empty:

[<div class="t431__table-wapper" data-auto-correct-mobile-width="false">
<table class="t431__table" style=""></table></div>, 
<div class="t431__table-wapper" data-auto-correct-mobile-width="false">
<table class="t431__table" style=""></table></div>,
<div class="t431__table-wapper" data-auto-correct-mobile-width="false">
<table class="t431__table" style=""></table></div>,
<div class="t431__table-wapper" data-auto-correct-mobile-width="false">
<table class="t431__table" style=""></table></div>]

Is it JavaScript or something else? How can I fix it?

1 Answer:

Answer 0: (score: 1)

You can get that information from another tag:

import requests
from bs4 import BeautifulSoup as bs

url = 'https://www.kp.ru/best/kazan/abiturient_2018/ivmit/'
soup = bs(requests.get(url).content, 'lxml')
# Read the text from the element with class t431__data-part2 instead of the <table>
print(soup.select_one('.t431__data-part2').text)

Output:

[screenshot of the printed text]
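
The empty <table> from requests and the filled-in one in DevTools are consistent with the table being built in the browser by JavaScript, while requests only receives the HTML the server sent. The sketch below imitates that situation and then splits the text of the data element into rows. The HTML snippet, the cell values, the helper name parse_cells, and the assumed layout (one row per line, cells separated by ";") are all hypothetical, so compare them with the text the answer's code actually prints before relying on any of it.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical stand-in for the server's HTML, NOT copied from the real page:
# the <table> is empty and the values sit in a separate element.
SAMPLE_HTML = """
<div class="t431">
  <div class="t431__data-part2">Program A;120;25
Program B;95;10</div>
  <div class="t431__table-wapper" data-auto-correct-mobile-width="false">
    <table class="t431__table" style=""></table>
  </div>
</div>
"""


def parse_cells(raw_text, delimiter=";"):
    """Split delimiter-separated text into a list of rows (lists of cells).

    Assumption: one table row per line, cells separated by `delimiter`.
    """
    reader = csv.reader(io.StringIO(raw_text.strip()), delimiter=delimiter)
    # Strip whitespace around each cell and skip blank lines.
    return [[cell.strip() for cell in row] for row in reader if row]


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# In the static HTML the table has no rows at all ...
table = soup.select_one(".t431__table")
table_is_empty = len(table.find_all("tr")) == 0

# ... but the raw values can be read from the data element.
data = soup.select_one(".t431__data-part2")
rows = parse_cells(data.get_text())

print(table_is_empty)
for row in rows:
    print(row)
```

With the real page, the same helper could be fed the string from the answer, for example parse_cells(soup.select_one('.t431__data-part2').text), provided the text really follows the assumed layout. If the data is not present anywhere in the downloaded HTML, a browser-driving tool would be needed instead.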