Selenium login looks like it works, but then the BeautifulSoup output shows the login page

Time: 2018-09-18 10:15:08

Tags: python selenium beautifulsoup

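I'm trying to write a Python script to get all the players in my fantasy football league, but you have to log in to ESPN first. My code is below. It appears to work when it runs: I see the login page come up, I see it log in, and then the page closes. But when I print the soup, I don't see any team rosters. I saved the soup output as an HTML file to look at its contents, and it is just the page that redirects me to log in again. Do I need to load the page through BS4 before attempting to log in?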

import time
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.support.ui import WebDriverWait # available since 2.4.0
from selenium.webdriver.support import expected_conditions as EC # available since 2.26.0
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import urllib.request as urllib2
from bs4 import BeautifulSoup

driver = webdriver.Chrome()

driver.get("http://games.espn.go.com/ffl/signin")
# an explicit wait is mandatory here: the login form is loaded inside an iframe
WebDriverWait(driver,1000).until(EC.presence_of_all_elements_located((By.XPATH,"(//iframe)")))
frms = driver.find_elements_by_xpath("(//iframe)")

driver.switch_to_frame(frms[2])
time.sleep(2)
driver.find_element_by_xpath("(//input)[1]").send_keys("username")
driver.find_element_by_xpath("(//input)[2]").send_keys("password")
driver.find_element_by_xpath("//button").click()
driver.switch_to_default_content()
time.sleep(4)
#driver.close()

# specify the url
roster_page = 'http://games.espn.com/ffl/leaguerosters?leagueId=11111'
# query the website and return the html to the variable 'page'
page = urllib2.urlopen(roster_page)
# parse the html using beautiful soup and store in variable `soup`
soup = BeautifulSoup(page, 'html.parser')

2 Answers:

Answer 0 (score: 2)

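You are logging in with Selenium and then opening the URL with urllib2, which accesses the site in a different session. The urllib2 request does not carry the cookies set in the browser, so the site treats it as not logged in and redirects it to the login page. Get the page source from the Selenium WebDriver instead and pass that to BeautifulSoup, and it should work.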
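To make the "different session" point concrete, here is a minimal sketch of what the urllib request is missing. It assumes you really do want a separate HTTP client; that is a different technique from reading the page source out of the driver, and I have not checked whether ESPN accepts a request built this way. The helper names `cookie_header` and `build_authenticated_request` and the example cookie values are made up for illustration. In a real script the list would come from `driver.get_cookies()`, called after the login step.

```python
import urllib.request


def cookie_header(selenium_cookies):
    """Join Selenium-style cookie dicts into a single Cookie header value."""
    return "; ".join(f"{c['name']}={c['value']}" for c in selenium_cookies)


def build_authenticated_request(url, selenium_cookies):
    """Build a urllib request that carries the browser's cookies."""
    request = urllib.request.Request(url)
    request.add_header("Cookie", cookie_header(selenium_cookies))
    return request


# Stand-in for driver.get_cookies(); these names and values are invented.
example_cookies = [
    {"name": "session_id", "value": "abc123"},
    {"name": "user", "value": "demo"},
]

request = build_authenticated_request(
    "http://games.espn.com/ffl/leaguerosters?leagueId=11111", example_cookies
)
print(request.get_header("Cookie"))  # session_id=abc123; user=demo
```

Nothing is sent over the network here; the request object would still have to be passed to `urllib.request.urlopen`. For a page like this one, staying inside the Selenium session is usually simpler, since the browser also runs any JavaScript that the page needs.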

Answer 1 (score: 0)

Try this instead of urllib2:

driver.get("http://games.espn.com/ffl/leaguerosters?leagueId=11111")
# take the HTML from the logged-in browser session and store it in the variable 'page'
page = driver.page_source
# parse the html using beautiful soup and store in variable 'soup'
soup = BeautifulSoup(page, 'html.parser')