Web scraping a page

Time: 2017-06-05 16:48:26

Tags: python selenium web-scraping beautifulsoup

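I want to scrape the table at this link. I tried using selenium to get the data after the page loads, but I did not succeed. Any other ideas on how to scrape the table from that web page?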

Edit -

I tried

from selenium import webdriver
driver = webdriver.PhantomJS()
driver.get("https://steria.taleo.net/careersection/in_cs_ext_fs/jobsearch.ftl?lang=en&radiusType=K&location=462170431401&searchExpanded=true&radius=1") 
print(driver.find_element_by_class_name('table').text)
driver.close()

2 answers:

Answer 0: (score: 3)

Because the table content is generated dynamically, you should wait until the JavaScript has executed before reading the data:

from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait as wait

driver = webdriver.PhantomJS()
driver.get("https://steria.taleo.net/careersection/in_cs_ext_fs/jobsearch.ftl?lang=en&radiusType=K&location=462170431401&searchExpanded=true&radius=1")

# Wait up to 10 seconds until the jobs table has at least one row in its body
table = wait(driver, 10).until(EC.presence_of_element_located(("xpath", "//table[@id='jobs' and ./tbody/tr]")))
print(table.text)

# Move to the next page of results
next_button = driver.find_element_by_link_text("Next")
next_button.click()

# Wait up to 5 seconds until the "Next" link reports aria-disabled="true",
# then wait for the table rows again and print them
wait(driver, 5).until(lambda x: next_button.get_attribute("aria-disabled") == "true")
table = wait(driver, 10).until(EC.presence_of_element_located(("xpath", "//table[@id='jobs' and ./tbody/tr]")))
print(table.text)
driver.close()
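
Note that `table.text` returns the whole table as one flat string. If you need structured rows instead, one option is to parse `driver.page_source` once the wait has succeeded. The following is only a minimal sketch using the standard library's `html.parser`. The `TableRows` class, the `parse_table` helper, and the sample HTML are hypothetical illustrations and are not taken from the target page; only the `jobs` table id comes from the XPath above. BeautifulSoup could do the same job with less code.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the cell text of every row in the table with a given id."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.depth = 0      # greater than 0 while inside the target table
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self.depth or dict(attrs).get("id") == self.table_id:
                self.depth += 1
        elif self.depth and tag == "tr":
            self._row = []
        elif self._row is not None and tag in ("td", "th"):
            self._cell = []

    def handle_endtag(self, tag):
        if tag == "table" and self.depth:
            self.depth -= 1
        elif tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def parse_table(html, table_id="jobs"):
    """Return the rows of the table with the given id as lists of strings."""
    parser = TableRows(table_id)
    parser.feed(html)
    return parser.rows


# Made-up sample standing in for driver.page_source
SAMPLE_HTML = """
<table id="other"><tr><td>ignored</td></tr></table>
<table id="jobs">
  <thead><tr><th>Job Title</th><th>Location</th></tr></thead>
  <tbody>
    <tr><td><a href="#">Software  Engineer</a></td><td>Noida</td></tr>
    <tr><td>Test Analyst</td><td>Pune</td></tr>
  </tbody>
</table>
"""

rows = parse_table(SAMPLE_HTML)
for row in rows:
    print(row)
```

In the Selenium script above, `parse_table(driver.page_source)` could be called after each successful wait, once per results page.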

Answer 1: (score: 0)