Unable to load the whole page using the Selenium PhantomJS driver

Time: 2016-01-28 10:33:56

Tags: java selenium web-scraping webdriver phantomjs

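I am scraping a website in Java using the Selenium library, with PhantomJSDriver as the WebDriver. The site has some URLs inside list (li) tags that I am interested in. The problem is that the site has 64 li elements, but I only receive 16 of them. Here is my code: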

DesiredCapabilities caps=new DesiredCapabilities();
caps.setJavascriptEnabled(true);
caps.setCapability(PhantomJSDriverService.PHANTOMJS_EXECUTABLE_PATH_PROPERTY, "Path");
WebDriver driver=new PhantomJSDriver(caps);
driver.get("Some Website");

WebDriverWait wait=new WebDriverWait(driver, 600);

wait.until(new ExpectedCondition<Boolean>() {
    boolean resetCount=true;
    int counter=5;
    @Override
    public Boolean apply(WebDriver d) {
        if(resetCount){
            ((JavascriptExecutor) d).executeScript(
                    "   window.mssCount="+counter+";\r\n" + 
                    "   window.mssJSDelay=function mssJSDelay(){\r\n" + 
                    "       if((typeof jQuery != 'undefined') && (jQuery.active !== 0 || $(\":animated\").length !== 0))\r\n" + 
                    "           window.mssCount="+counter+";\r\n" + 
                    "       window.mssCount-->0 &&\r\n" + 
                    "       setTimeout(window.mssJSDelay,window.mssCount+1);\r\n" + 
                    "   }\r\n" + 
                    "   window.mssJSDelay();");
            resetCount=false;
        }
        boolean ready=false;
        try{
            ready=-1==((Long) ((JavascriptExecutor) d).executeScript(
                    "if(typeof window.mssJSDelay!=\"function\"){\r\n" + 
                    "   window.mssCount="+counter+";\r\n" + 
                    "   window.mssJSDelay=function mssJSDelay(){\r\n" + 
                    "       if((typeof jQuery != 'undefined') && (jQuery.active !== 0 || $(\":animated\").length !== 0))\r\n" + 
                    "           window.mssCount="+counter+";\r\n" + 
                    "       window.mssCount-->0 &&\r\n" + 
                    "       setTimeout(window.mssJSDelay,window.mssCount+1);\r\n" + 
                    "   }\r\n" + 
                    "   window.mssJSDelay();\r\n" + 
                    "}\r\n" + 
                    "return window.mssCount;"));
        }
        catch (NoSuchWindowException a){
            a.printStackTrace();
            return true;
        }
        catch (Exception e) {
            e.printStackTrace();
            return false;
        }
        return ready;
    }
    @Override
    public String toString() {
        return String.format("Timeout waiting for documentNotActive script");
    }
});


BufferedWriter bw=new BufferedWriter(new FileWriter(new File("C:\\abc.txt")));
bw.write(driver.getPageSource());
bw.close();
driver.quit();

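I reused the code inside wait.until() from another answer. My question is: why does it return only 16 elements? I would expect it to return either none or all of them. Is there a limit on the number of tags or on the file size? What is the solution here?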
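One way to look into the "is there a limit on tags or file size" part is to count the li tags in the text that getPageSource() returned and compare that number with what a browser shows. The sketch below is only a diagnostic aid under assumptions: the class name LiCounter and the sample HTML are hypothetical, and the regex is a rough tag counter, not an HTML parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LiCounter {
    // Matches an opening <li> tag, with or without attributes, ignoring case.
    // Tags such as <link> are not matched because "<li" must be followed
    // by whitespace or ">".
    private static final Pattern LI_OPEN =
            Pattern.compile("<li(\\s[^>]*)?>", Pattern.CASE_INSENSITIVE);

    /** Returns how many opening li tags appear in the given HTML text. */
    public static int countLi(String html) {
        Matcher m = LI_OPEN.matcher(html);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical sample; in the question this would be driver.getPageSource().
        String sample = "<ul><li>a</li><li class=\"x\">b</li><link rel=\"x\"></ul>";
        System.out.println("li tags found: " + countLi(sample));
    }
}
```

If the saved source ends with a complete closing html tag and still contains only 16 li tags, the output was probably not cut off, and the remaining items were likely not in the DOM yet when the source was captured.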

1 Answer:

Answer 0 (score: 0)

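Try adding an implicit wait (see the code below).

An implicit wait tells WebDriver to poll the DOM for a certain amount of time when trying to find an element or elements that are not immediately available. The default setting is 0. Once set, the implicit wait applies for the lifetime of the WebDriver object instance.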

WebDriver driver=new PhantomJSDriver(caps);
driver.manage().timeouts().implicitlyWait(30, TimeUnit.SECONDS);
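Note that an implicit wait only affects element lookups such as findElement and findElements; it does not delay getPageSource(). If the page adds its list items in batches (a guess based on 16 out of 64, not something the question confirms), polling the element count until it reaches the expected number may work better. Below is a minimal sketch of that polling idea in plain Java; the class CountWaiter, its method, and the fake page in main are hypothetical stand-ins, not Selenium API.

```java
import java.util.function.IntSupplier;

public class CountWaiter {

    /**
     * Polls countSource until it reports at least `expected` items or the
     * timeout elapses, and returns the last count that was observed.
     */
    public static int waitForCount(IntSupplier countSource, int expected,
                                   long timeoutMillis, long pollMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        int last = countSource.getAsInt();
        while (last < expected && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                break;
            }
            last = countSource.getAsInt();
        }
        return last;
    }

    public static void main(String[] args) {
        // Stand-in for a page that exposes 16 more li elements on each poll.
        int[] loaded = {16};
        IntSupplier fakePage = () -> {
            int now = loaded[0];
            loaded[0] = Math.min(64, now + 16);
            return now;
        };
        int seen = waitForCount(fakePage, 64, 5_000, 10);
        System.out.println("li elements seen: " + seen);
    }
}
```

With a real driver, the supplier could be something like `() -> driver.findElements(By.tagName("li")).size()`, and getPageSource() would be called only after the count reaches 64. If the site loads more items on scroll, which is again an assumption, a call such as `((JavascriptExecutor) driver).executeScript("window.scrollTo(0, document.body.scrollHeight);")` between polls may be needed to trigger the next batch.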