How do I extract the page source through JavascriptExecutor?

Asked: 2018-11-21 12:07:14

Tags: javascript java selenium selenium-webdriver pagesource

import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebDriver;       

public class oo {
   public static void main(String[] args) {

       System.setProperty("webdriver.chrome.driver","D:\\Java\\Lib\\chromedriver.exe");
      WebDriver driver = new ChromeDriver();         
      driver.navigate().to("https://google.com");

      JavascriptExecutor js = (JavascriptExecutor) driver;  
      Object s = js.executeScript("return document.body.innerHTML;",null).toString();

      System.out.println(s);
      driver.close();
   }
}

The code above throws a NullPointerException.

Exception in thread "main" java.lang.NullPointerException
    at java.util.Arrays.stream(Unknown Source)
    at java.util.stream.Stream.of(Unknown Source)
    at org.openqa.selenium.remote.RemoteWebDriver.executeScript(RemoteWebDriver.java:484)
    at oo.main(oo.java:25)

When I remove the optional Object argument, I get a compilation error instead.

Code:

  JavascriptExecutor js = (JavascriptExecutor) driver;  
  Object s = js.executeScript("return document.body.innerHTML;").toString();

Error:

Exception in thread "main" java.lang.Error: Unresolved compilation problem:
    The method executeScript(String, Object[]) in the type JavascriptExecutor is not applicable for the arguments (String)

I am using Selenium-server-standalone-3.141.59.jar.

1 answer:

Answer 0 (score: 0)

To extract and print the page source through JavascriptExecutor, you can use the following (Java) solution:

  • Syntax:

    String page_source = ((JavascriptExecutor)driver).executeScript("return document.documentElement.innerHTML;").toString();
    System.out.println(page_source);
    

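Note: Before extracting the page source, you need to induce a wait so that the page has loaded completely, for example a WebDriverWait that polls document.readyState until it reports "complete".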
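On the NullPointerException from the question: the stack trace shows executeScript handing its arguments to Stream.of, so passing a bare null as the varargs parameter means the whole argument array is null. The sketch below reproduces that with plain Java and no Selenium on the classpath. The class VarargsNullDemo and the methods countArgs and throwsNpe are hypothetical stand-ins written for illustration; they are not Selenium code and only mirror the Stream.of(args) call visible in the trace.

```java
import java.util.stream.Stream;

public class VarargsNullDemo {

    // Hypothetical stand-in for a varargs method shaped like
    // executeScript(String, Object...). Not Selenium code: it only
    // mirrors the Stream.of(args) call seen in the stack trace.
    public static long countArgs(String script, Object... args) {
        return Stream.of(args).count();
    }

    // Returns true if calling countArgs with this argument array
    // throws a NullPointerException.
    public static boolean throwsNpe(Object[] args) {
        try {
            countArgs("return 1;", args);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // No extra arguments: the compiler passes an empty array.
        System.out.println(countArgs("return 1;"));            // prints 0
        // Two extra arguments: the compiler wraps them in an array.
        System.out.println(countArgs("return 1;", "a", "b"));  // prints 2
        // A null array, which is what a bare null argument becomes.
        System.out.println(throwsNpe(null));                   // prints true
    }
}
```

So the call without the second argument is the right one, provided the compiler treats executeScript as a varargs method. The compile error in the question prints the signature as executeScript(String, Object[]), which suggests (this is an assumption about the asker's setup) that the project's compiler compliance level is below 1.5, where varargs are not available; raising it should let the single-argument call compile.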