Question

For work, I had been able to scrape the following website using "driver = webdriver.PhantomJS()". What I am scraping is the price and the date.

https://www.cash.ch/fonds/swisscanto-ast-avant-bvg-portfolio-45-p-19225268/swc/chf

A few days ago this stopped working because of a disclaimer page that I first have to agree to.

https://www.cash.ch/fonds-investor-disclaimer?redirect=fonds/swisscanto-ast-avant-bvg-portfolio-45-p-19225268/swc/chf

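Once I agree, I can see the real content in the browser, but the driver apparently cannot: the printed output is [], so it must still be on the disclaimer URL.

Please see the code below.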

    from selenium import webdriver
    from bs4 import BeautifulSoup
    import csv
    import os

    driver = webdriver.PhantomJS()
    driver.set_window_size(1120, 550)

    #Swisscanto
    driver.get("https://www.cash.ch/fonds/swisscanto-ast-avant-bvg-portfolio-45-p-19225268/swc/chf")
    s_swisscanto = BeautifulSoup(driver.page_source, 'lxml')
    nav_sc = s_swisscanto.find_all('span', {"data-field-entry": "value"})
    date_sc = s_swisscanto.find_all('span', {"data-field-entry": "datetime"})

    print(nav_sc)
    print(date_sc)
    print("Done Swisscanto")

Answer 1

This should work (I assume you want to click the "zustimmen" button?):

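    from selenium import webdriver

    driver = webdriver.PhantomJS()
    driver.get("https://www.cash.ch/fonds/swisscanto-ast-avant-bvg-portfolio-45-p-19225268/swc/chf")

    # Click the "zustimmen" (agree) link on the disclaimer page
    accept_button = driver.find_element_by_link_text('zustimmen')
    accept_button.click()

    # The page source should now be the fund page rather than the disclaimer
    content = driver.page_source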

More details here: "python selenium click on button"
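Since the symptom in the question is that the driver is still on the disclaimer URL, it may help to check for that explicitly before parsing. The following is only a sketch under assumptions: the helper names (`is_disclaimer_url`, `redirect_target`, `accept_disclaimer_if_shown`) are made up for illustration, the disclaimer path is copied from the URL in the question, and it is assumed (not verified against the site) that the accept control is a link with the text "zustimmen" and that the browser leaves the disclaimer URL after the click.

```python
from urllib.parse import urlparse, parse_qs

# Path of the disclaimer page, copied from the URL quoted in the question.
DISCLAIMER_PATH = "/fonds-investor-disclaimer"


def is_disclaimer_url(url):
    """Return True if the URL points at the disclaimer page, not the fund page."""
    return urlparse(url).path.rstrip("/") == DISCLAIMER_PATH


def redirect_target(url):
    """Return the path carried in the 'redirect' query parameter, or None."""
    values = parse_qs(urlparse(url).query).get("redirect")
    return values[0] if values else None


def accept_disclaimer_if_shown(driver, link_text="zustimmen", timeout=10):
    """Click the accept link if the driver landed on the disclaimer page.

    Returns True if a click was performed, False if no disclaimer was shown.
    The link text is an assumption taken from the answer above.
    """
    if not is_disclaimer_url(driver.current_url):
        return False

    # Imported lazily so the URL helpers above work without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    wait.until(EC.element_to_be_clickable((By.LINK_TEXT, link_text))).click()
    # Wait until the browser has left the disclaimer page before the caller
    # reads driver.page_source.
    wait.until(lambda d: not is_disclaimer_url(d.current_url))
    return True
```

A possible usage is to call `accept_disclaimer_if_shown(driver)` right after `driver.get(...)` and only then pass `driver.page_source` to BeautifulSoup. If the spans still come back empty, printing `driver.current_url` should show whether the driver is still on the disclaimer page.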
