PhantomJS and Selenium WebDriver - how to scrape content generated by JS

Time: 2017-09-22 07:14:39

Tags: javascript selenium

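Everyone tells me that once I use PhantomJS I will be able to get table content generated by JS. But I still cannot get it to work.

I want to scrape the table on http://data.eastmoney.com/xg/xg/default.html

Page 1 works fine.

When I call click() on the CSS selector for page 2 in order to load page 2, the content returned is still that of page 1. What is wrong?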

#coding:utf-8
from selenium import webdriver
import time
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.PhantomJS()
driver.get("http://data.eastmoney.com/xg/xg/default.html")
time.sleep(2)
driver.find_element_by_css_selector("#PageCont > span.at").click()

list_cates = driver.find_element_by_css_selector("#dt_1 > tbody > tr:nth-child(1) > td:nth-child(2) > a").text
print(list_cates)

4 answers:

Answer 0 (score: 1)

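The problem is that you do not wait for the data to update after the click. The click triggers an Ajax call, and you need to give it time to complete before reading the table. Note that the code below also clicks #PageCont .next rather than #PageCont > span.at.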

#coding:utf-8
from selenium import webdriver
import time
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.PhantomJS()
driver.get("http://data.eastmoney.com/xg/xg/default.html")
time.sleep(2)
driver.find_element_by_css_selector("#PageCont .next").click()

time.sleep(5)
list_cates = driver.find_element_by_css_selector("#dt_1 > tbody > tr:nth-child(1) > td:nth-child(2) > a").text
print(list_cates)
# Prints '太平鸟'

This matches the data on page 2.

[Screenshot: page 2]
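The fixed time.sleep(5) above works, but it either wastes time or is too short when the site is slow. One alternative is to poll until the first row actually changes. The sketch below is a plain-Python version of that idea; wait_for_change and its parameters are names made up for this example, not Selenium API. Selenium's own WebDriverWait, which the scripts here import but never use, does the same kind of polling via its until() method.

```python
import time


def wait_for_change(read_value, old_value, timeout=10.0, poll=0.5):
    """Poll read_value() until it returns something other than old_value.

    Returns the new value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = read_value()
        if value != old_value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError("value did not change within %s seconds" % timeout)
        time.sleep(poll)


# Hypothetical wiring into the script above (it needs a live driver, so it is
# left commented out):
#
# selector = "#dt_1 > tbody > tr:nth-child(1) > td:nth-child(2) > a"
# read = lambda: driver.find_element_by_css_selector(selector).text
# before = read()                      # first-row text on page 1
# driver.find_element_by_css_selector("#PageCont .next").click()
# print(wait_for_change(read, before)) # first-row text once page 2 has loaded
```

One caveat: while the table is being redrawn, the element lookup itself may raise an exception, so a real script would probably want to catch that inside read and retry.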

Answer 1 (score: 0)

Tarun Lalwani, please take a look. The code and a screenshot are below:

#coding:utf-8
from selenium import webdriver
import time
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.PhantomJS()
driver.get("http://data.eastmoney.com/xg/xg/default.html")
time.sleep(2)
for page_count in range(1,4):
    driver.find_element_by_id("gopage").send_keys(page_count)
    driver.find_element_by_css_selector("#PageCont > a.btn_link").click()
    time.sleep(10)
    list_cates = driver.find_element_by_css_selector("#dt_1 > tbody > tr:nth-child(1) > td:nth-child(2) > a").text
    print('get' + str(page_count) + 'pcs')
    print(list_cates)

[Screenshot: run result]

In another case, I want to get a table that sits inside a frame, again using PhantomJS, and that fails too.

#coding:utf-8
from selenium import webdriver
import time
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.PhantomJS()
driver.get("http://ipo.csrc.gov.cn/checkClick.action?choice=info#")
driver.find_element_by_css_selector("#type1 > a").click()

time.sleep(5)
result = driver.find_element_by_css_selector("#frame_body > table > tbody > tr:nth-child(1) > td > table > tbody > tr:nth-child(3) > td:nth-child(1)").text
print(result)

Answer 2 (score: 0)

I think I found the solution for case 1 by adding the following code: driver.find_element_by_id("gopage").clear()

But the other case still needs your help. Thanks a lot!

#coding:utf-8
from selenium import webdriver
import time
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.PhantomJS()
driver.get("http://data.eastmoney.com/xg/xg/default.html")
time.sleep(2)
for page_count in range(1,4):
    driver.find_element_by_id("gopage").clear()
    driver.find_element_by_id("gopage").send_keys(page_count)
    driver.find_element_by_css_selector("#PageCont > a.btn_link").click()
    time.sleep(5)
    list_cates = driver.find_element_by_css_selector("#dt_1 > tbody > tr:nth-child(1) > td:nth-child(2) > a").text
    print('get' + str(page_count) + 'pcs')
    print(list_cates)
    driver.find_element_by_id("gopage").clear()
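A likely reason clear() helps: send_keys types into the field without removing what is already there, so without clearing, the page box would accumulate 1, then 12, then 123. The sketch below models that behaviour with a stand-in class. FakeInput and typed_values are hypothetical names for this illustration and are not part of Selenium; how the real page box behaves beyond this is an assumption.

```python
class FakeInput:
    """Stand-in for a text box: typed keys are appended to what is there."""

    def __init__(self):
        self.value = ""

    def send_keys(self, keys):
        # Like typing: new characters go after the existing content.
        self.value += str(keys)

    def clear(self):
        self.value = ""


def typed_values(pages, clear_first):
    """Return what the box holds after typing each page number in turn."""
    box = FakeInput()
    seen = []
    for page in pages:
        if clear_first:
            box.clear()
        box.send_keys(page)
        seen.append(box.value)
    return seen


print(typed_values(range(1, 4), clear_first=False))  # ['1', '12', '123']
print(typed_values(range(1, 4), clear_first=True))   # ['1', '2', '3']
```

Seen this way, the second clear() at the end of the loop in the answer looks redundant, since every iteration already starts by clearing the box.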

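Answer 3 (score: 0)

I got the solution for case 2 by adding this code: driver.switch_to_frame("myframe")

The table sits inside a frame, so the driver has to switch into that frame before looking up elements!

Many thanks again for solving my problem. This is my first post on Stack Overflow, and I love it!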

#coding:utf-8
from selenium import webdriver
import time
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.PhantomJS()
driver.get("http://ipo.csrc.gov.cn/checkClick.action?choice=info#")

time.sleep(2)
driver.switch_to_frame("myframe")
result = driver.find_element_by_css_selector("#frame_body > table > tbody > tr:nth-child(1) > td > table > tbody > tr:nth-child(3) > td:nth-child(1)").text
print(result)
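Once the driver has switched into a frame, later lookups stay inside it until the driver switches back. A minimal sketch of a helper that always switches back is shown below. The name inside_frame is made up for this example. It relies on driver.switch_to.frame(...) and driver.switch_to.default_content(), the spelling that later Selenium releases favour over the deprecated switch_to_frame.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_ref):
    """Switch into a frame for the duration of a with-block.

    frame_ref is whatever driver.switch_to.frame accepts, for example the
    frame name "myframe" used in the answer above.
    """
    driver.switch_to.frame(frame_ref)
    try:
        yield driver
    finally:
        # Return to the top-level document even if a lookup raised.
        driver.switch_to.default_content()


# Hypothetical usage with the script above (needs a live driver):
#
# with inside_frame(driver, "myframe"):
#     cell = driver.find_element_by_css_selector("#frame_body > table")
#     print(cell.text)
# # Here the driver is back on the outer page.
```

The try/finally is the point of the design: if the lookup inside the frame fails, the driver is not left stranded in the frame for the next step of the script.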