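I want to scrape a website. I have to use Selenium to get past the login form, and I'm wondering: since I'm already using Selenium, is there a way to scrape the site with BeautifulSoup?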
Answer 0 (score: 2)
A simple combination: let Selenium drive the browser, then pass browser.page_source to BeautifulSoup.
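from bs4 import BeautifulSoup as soup
from selenium import webdriver

url = "url"  # placeholder: replace with the page you want to scrape
browser = webdriver.Firefox()
browser.get(url)
# log in, scroll, etc. with Selenium here
full_page = browser.page_source  # HTML of the page as currently rendered
page_soup = soup(full_page, "html.parser")
browser.quit()  # the HTML is already captured, so the browser can be closed
# parse/find with BeautifulSoup, e.g. page_soup.find_all("a")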
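
To make the "login, then parse" step more concrete, here is a minimal sketch of the same idea split into two functions. The function names, the form field names ("username", "password"), the submit button selector, and the "content" element id are all assumptions made for illustration; they are not from the question and need to be adapted to the real login page.

```python
from bs4 import BeautifulSoup


def extract_links(html):
    """Parse an HTML string and return (text, href) pairs for every link."""
    page_soup = BeautifulSoup(html, "html.parser")
    return [(a.get_text(strip=True), a["href"])
            for a in page_soup.find_all("a", href=True)]


def fetch_page_after_login(login_url, username, password):
    """Log in with Selenium and return the rendered HTML of the next page."""
    # Imported here so extract_links() can be used without a browser installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    browser = webdriver.Firefox()
    try:
        browser.get(login_url)
        # Hypothetical field names and selector; adjust to the real form.
        browser.find_element(By.NAME, "username").send_keys(username)
        browser.find_element(By.NAME, "password").send_keys(password)
        browser.find_element(By.CSS_SELECTOR, "button[type='submit']").click()
        # Wait for an element that only exists after login (assumed id).
        WebDriverWait(browser, 10).until(
            EC.presence_of_element_located((By.ID, "content"))
        )
        return browser.page_source
    finally:
        # Always close the browser, even if the login or the wait fails.
        browser.quit()
```

Usage would look something like extract_links(fetch_page_after_login(url, user, password)). Keeping the parsing in its own function means it can be tried on a saved HTML string without starting a browser, and the explicit wait helps avoid reading page_source before the post-login page has loaded.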