Question

I am trying to parse the Octane benchmark page, http://octane-benchmark.googlecode.com/svn/latest/index.html, using WebElements:

<div class="hero-unit" id="inside-anchor">
    <h1 id="main-banner" align="center">Start Octane 2.0</h1>
    <div id="bar-appendix"></div>
</div>

I start Selenium WebDriver on my tablet device (using Java, Eclipse, and Selendroid):

SelendroidConfiguration config = new SelendroidConfiguration();
selendroidServer = new SelendroidLauncher(config);
selendroidServer.lauchSelendroid();
DesiredCapabilities caps = SelendroidCapabilities.android();
driver = new SelendroidDriver(caps);

I have initialized the driver with the Octane page:

driver.get("http://octane-benchmark.googlecode.com/svn/latest/index.html");

and I try to parse it with XPath:

String xpathString = "//div[@class='hero-unit']//h1";   
String line = driver.findElement(By.xpath(xpathString)).getText();
System.out.println(line);

But Java throws a NullPointerException on that line: findElement() does not find anything on this HTML page.

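The driver starts fine and getCurrentUrl() returns the correct value, but getPageSource() returns nothing, and findElement(By.something...) returns nothing either. It looks as if something on this Octane page blocks every lookup during parsing. I have parsed 7 other benchmark pages the same way and they work fine, but this Octane page... it is as if it were "empty" to WebDriver.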

I am not sure whether it is because of the

<script type="text/javascript">

part or something else.

Is there anything special about this Octane benchmark page?

...Thanks

Answer 1

By.xpath() only works when the HTML page is XML-compliant. The Octane 2.0 page probably is not, so the method returns null.
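
One way to separate the XPath expression from the page is to evaluate it against the snippet from the question with the JDK's own XML parser. This is a minimal sketch: the class name XPathOnSnippet is hypothetical, and no Selendroid code is involved, so it only shows that the expression matches when the markup is well-formed XML. It says nothing about how the driver handles the real page.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XPathOnSnippet {
    // The markup quoted in the question, joined into one string.
    static final String SNIPPET =
        "<div class=\"hero-unit\" id=\"inside-anchor\">"
      + "<h1 id=\"main-banner\" align=\"center\">Start Octane 2.0</h1>"
      + "<div id=\"bar-appendix\"></div>"
      + "</div>";

    // Parses the markup as XML and returns the string value of the first XPath match.
    static String evaluate(String markup, String xpath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(markup)));
        return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
    }

    public static void main(String[] args) throws Exception {
        // Same expression as in the question.
        System.out.println(evaluate(SNIPPET, "//div[@class='hero-unit']//h1"));
    }
}
```

If this prints the banner text, the expression itself is fine and the problem lies in what the driver sees on the page.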

Answer 2

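XPath works for sites that are XML-compliant. HTML is more forgiving: you can omit closing tags and make other mistakes, but in XML that is forbidden. So most likely the HTML is not XML-compliant, and I double-checked by validating your link at this site: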

http://www.w3schools.com/xml/xml_validator.asp

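Guess what? It has some errors. You can save yourself trouble next time by validating on this site first. Of course, this does not mean that every XML-compliant site works for XPath web scraping (hidden elements, JavaScript, etc.). But from the nature of the reported errors you can tell which ones will not.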
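
The same kind of check can be approximated locally instead of through the online validator. The sketch below assumes the page source is already available as a string, and the class name WellFormedCheck is hypothetical. It only tests XML well-formedness with the JDK parser, which is roughly what the validator reports, and is not a statement about how Selendroid parses pages.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {
    // Returns true if the markup parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (Exception e) {
            // Any parse failure is treated as "not well-formed".
            return false;
        }
    }

    public static void main(String[] args) {
        // Properly nested tags: acceptable as XML.
        System.out.println(isWellFormed("<div><h1>Start Octane 2.0</h1></div>"));
        // Missing </h1>: the kind of omission the answer says XML forbids.
        System.out.println(isWellFormed("<div><h1>Start Octane 2.0</div>"));
        // Unclosed <br>: common in HTML pages, rejected by an XML parser.
        System.out.println(isWellFormed("<p>line one<br>line two</p>"));
    }
}
```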

Webdriver automation - unable to find element using xpath (in Octane 2.0 Benchmark page)
