Web Scraping a JavaScript-Loaded Table Using BeautifulSoup

Time: 2018-04-30 20:51:40

Tags: web-scraping beautifulsoup

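I am relatively new to web scraping and have been prototyping against various websites. I am having trouble scraping what appears to be a table loaded by JavaScript. Any help would be greatly appreciated. Here is my code: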

import requests
from bs4 import BeautifulSoup


url = 'https://onlineservice.cvo.org/webs/cvo/register/#/search/toronto/0/1/0/10'

response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
tables = soup.find_all(class_='table')
print(tables)
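
A likely reason this prints an empty list: everything after the `#` in the URL is a fragment, which an HTTP client never sends to the server. `requests.get` therefore fetches only the base page, and the table is filled in afterwards by JavaScript in the browser. The sketch below uses only the standard library to show which part of the URL is actually requested; the variable names are illustrative and not part of the original code.

```python
from urllib.parse import urlsplit, urlunsplit

page_url = 'https://onlineservice.cvo.org/webs/cvo/register/#/search/toronto/0/1/0/10'

parts = urlsplit(page_url)

# The fragment is interpreted by the browser-side JavaScript only.
fragment = parts.fragment

# This is the URL an HTTP client actually asks the server for.
requested = urlunsplit(parts._replace(fragment=''))

print('fragment (never sent):', fragment)
print('requested from server:', requested)
```

Since the search terms live entirely in the fragment, the HTML that comes back cannot depend on them, which is why parsing it with BeautifulSoup finds no populated table.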

1 answer:

Answer 0 (score: 1)

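Try the following URL to get all the information in the blink of an eye. You can find it with Chrome DevTools, among the XHR requests under the Network tab. Give it a try: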

import requests

# JSON endpoint behind the search page, taken from the XHR requests in the Network tab
URL = 'https://onlineservice.cvo.org/rest/public/registrant/search/?query=%20toronto&status=0&type=1&skip=0&take=427'
response = requests.get(URL)

# Each entry in 'result' is one registrant record
for item in response.json()['result']:
    lastname = item['lastName']
    firstname = item['firstName']
    middlename = item['middleName']
    commonname = item['commonName']
    status = item['registrationStatus']['name']
    print(lastname, firstname, middlename, commonname, status)

Partial results:

Ackerman Kent Alan Kent Active
Albarracin Oscar Fernando Oscar Active
Alcock Kathleen  Kathleen Active
Ali Karissa Soraiya Karissa Active
Allen John Kyle John K. Active
Alvarez Luisa Cristina Luisa Active
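
If you want to vary the search instead of hard-coding the query string, the URL can be assembled from its parts. This is a minimal sketch: `build_search_url` and its argument names are hypothetical, and reading `skip` and `take` as paging offset and page size is an assumption based on their names, not something documented by the site.

```python
from urllib.parse import quote, urlencode

BASE_URL = 'https://onlineservice.cvo.org/rest/public/registrant/search/'


def build_search_url(query, skip=0, take=10, status=0, registrant_type=1):
    """Assemble the search URL (hypothetical helper; parameter meanings are assumed)."""
    params = {
        'query': query,
        'status': status,
        'type': registrant_type,
        'skip': skip,   # assumed: number of records to skip
        'take': take,   # assumed: number of records to return
    }
    # quote_via=quote encodes spaces as %20 instead of '+', matching the URL above
    return BASE_URL + '?' + urlencode(params, quote_via=quote)


# Reproduces the URL used above, including the leading space in the query
url = build_search_url(' toronto', skip=0, take=427)
print(url)
```

The resulting string can be passed straight to `requests.get` in place of the hard-coded `URL`.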