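Unable to scrape data with selenium when using a proxy

Time: 2018-02-19 13:37:53

Tags: python selenium web-scraping proxy

I am using selenium to scrape data from m.skybet.com, but when I run the script the browser opens without loading the site and shows the message "connection timed out". I am using a proxy with selenium here because this site is restricted in my region.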

from bs4 import BeautifulSoup
from selenium import webdriver
from time import sleep

def install_firefox_proxy(PROXY_HOST, PROXY_PORT):
    fp = webdriver.FirefoxProfile()
    fp.set_preference("network.proxy.type", 1)

    fp.set_preference("network.proxy.http", PROXY_HOST)
    fp.set_preference("network.proxy.http_port", int(PROXY_PORT))

    #fp.set_preference("network.proxy.ssl", PROXY_HOST)
    #fp.set_preference("network.proxy.ssl_port", int(PROXY_PORT))

    fp.set_preference("network.proxy.ftp", PROXY_HOST)
    fp.set_preference("network.proxy.ftp_port", int(PROXY_PORT))

    fp.set_preference("network.proxy.socks", PROXY_HOST)
    fp.set_preference("network.proxy.socks_port", int(PROXY_PORT))
    fp.update_preferences()
    return webdriver.Firefox(firefox_profile=fp)

driver = install_firefox_proxy("163.172.27.213", 3128)
driver.get('https://m.skybet.com/football/world-cup-2018/event/16742642')
sleep(4)

res = driver.execute_script('return document.documentElement.outerHTML')
soup = BeautifulSoup(res, 'lxml')

bet = soup.find_all('div', {'class':'row_11ssjiv'})

for b in bet:
    try:
        team = b.find('div',{'class':'title_1nskdmh'})
        score = b.find('span',{'class':'priceInner_14t1nf5'})
        print(team,score)
    except:
        pass

1 answer:

Answer 0 (score: 0)

Use a faster proxy, or try increasing the page load timeout, as follows:

driver.set_page_load_timeout(60)

Your driver will then wait up to 60 seconds before raising a timeout.
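
To judge whether the proxy itself is the bottleneck before starting the browser, one option is to time a plain TCP connection to it. The following is a minimal sketch using only the standard library. The helper name `proxy_connect_time` and its default timeout are hypothetical and not part of selenium, and a TCP connect time only indicates reachability and latency, not how quickly the proxy will actually serve pages.

```python
import socket
import time


def proxy_connect_time(host, port, timeout=5.0):
    """Return the seconds needed to open a TCP connection to host:port.

    Returns None if the connection fails or does not complete within
    `timeout` seconds. This is a hypothetical helper for illustration.
    """
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None
```

For example, `proxy_connect_time("163.172.27.213", 3128)` would probe the proxy from the question. If it returns `None`, the proxy is probably unreachable and a longer page load timeout is unlikely to help; if it returns a small value, raising the timeout as shown above may be worth trying.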