How to capture this site with PhantomJSDriver and Selenium (Java)

Time: 2016-11-28 09:11:08

Tags: java selenium phantomjs

I am trying to capture the site https://m.naver.com.

The code I am using is:

String cli_args[] = new String[]{
    "--web-security=false",
    "--ssl-protocol=any",
    "--ignore-ssl-errors=true",
    "--webdriver-loglevel=NONE"
};

DesiredCapabilities capabilities = DesiredCapabilities.phantomjs();
capabilities.setJavascriptEnabled(true);
capabilities.setCapability("userAgent", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:48.0) Gecko/20100101 Firefox/48.0");
capabilities.setCapability(PhantomJSDriverService.PHANTOMJS_CLI_ARGS, cli_args);
capabilities.setCapability(PhantomJSDriverService.PHANTOMJS_EXECUTABLE_PATH_PROPERTY, "/usr/local/bin/phantomjs");

PhantomJSDriver driver = new PhantomJSDriver(capabilities);
driver.manage().timeouts().implicitlyWait(15, TimeUnit.SECONDS).pageLoadTimeout(20, TimeUnit.SECONDS).setScriptTimeout(15, TimeUnit.SECONDS);

driver.get("https://m.naver.com");
driver.findElement(By.xpath("//div[@id='nav']/div[3]/nav/ul/li[2]/a")).click();

// This wait works fine.
new WebDriverWait(driver, 5).until(ExpectedConditions.visibilityOfElementLocated(By.xpath("//div[@id='nmap_news_1']")));

// This wait fails: the element is not found and the wait gives up with an error.
new WebDriverWait(driver, 5).until(ExpectedConditions.visibilityOfElementLocated(By.xpath("//div[@id='nmap_news_1']/a")));

// This wait fails the same way: the element is not found and the wait gives up with an error.
new WebDriverWait(driver, 5).until(ExpectedConditions.visibilityOfElementLocated(By.xpath("//div[@id='nmap_news_1']/a/span/img")));
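
For context on the "gave up" behaviour: an explicit wait polls a condition repeatedly and stops with an error once the timeout passes without the condition becoming true. The following is a minimal stdlib-only sketch of that polling pattern, not Selenium's actual WebDriverWait implementation. The names PollSketch and pollUntil are hypothetical and exist only for illustration.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical helper: a simplified stand-in for the poll-until-timeout pattern.
class PollSketch {

    /**
     * Calls probe repeatedly until it yields a value or timeoutMillis has elapsed.
     * Returns Optional.empty() on timeout or if the thread is interrupted.
     */
    static <T> Optional<T> pollUntil(Supplier<Optional<T>> probe,
                                     long timeoutMillis,
                                     long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            Optional<T> result = probe.get();
            if (result.isPresent()) {
                return result; // condition met, stop polling
            }
            if (System.nanoTime() >= deadline) {
                return Optional.empty(); // timed out: the "gave up" case
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return Optional.empty();
            }
        }
    }
}
```

In a Selenium setting, the probe could wrap a lookup such as driver.findElements(By.xpath("//div[@id='nmap_news_1']/a")) and return the first match if the list is not empty. Raising the timeout well beyond 5 seconds would then show whether the ad content is merely slow to load or never arrives at all.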

This used to work fine, but it suddenly stopped working.

The element I am looking for is the following HTML element.

<div id="nmap_news_1" class="ad" data-unit="1120D" data-tb="NEWS_1" data-extra="" data-mdom-unit="1120H" data-mdom="true" data-dom-url="http://mv.ad.naver.com/adshow" data-da-revision="161128111917755" data-position-type="rel" data-position-index="0" data-position-computed-index="7">
  <a href="https://mv.ad.naver.com/adclick?unit=1120D&amp;ac=7329029&amp;src=2999084&amp;br=2390035&amp;tb=NEWS_1&amp;rk=WDvymwpgE9kAAJjTh8AAAAQY&amp;eltts=0e%2F4bxL0hwyB3qHjBRDDRQ%3D%3D&amp;x_dy=1276&amp;x_ih=736&amp;x_th=85&amp;x_iv=0" style="display:block;background:#fff;text-decoration:none;">
    <span id="nbp_da_img" style="display:block;width:100%;height:85px;background:url(https://ssl.pstatic.net/tveta/libs/1148/1148810/20160929180724-bkVaw4AG_bg_left.jpg) repeat-x;background-size:auto 85px;-webkit-background-size:1px 85px;text-decoration:none;text-align:center;font-size:0;">
      <img src="https://ssl.pstatic.net/tveta/libs/1148/1148810/20160929180724-bkVaw4AG.jpg" alt="AD" width="320" height="85" data-media-width="640" data-media-height="170" data-content-type="image" data-bakery="material" style="vertical-align:top;border:none" onload="naver_corp_da.logParamManager['nmap_news_1'].imgOnloadHandler();">&nbsp;
    </span>
  </a>
</div>

Strangely, however, it is not rendered correctly. What I get instead is the following.

<div id="nmap_news_1" class="ad ready" data-unit="1120D" data-tb="NEWS_1" data-extra="" data-mdom-unit="1120H" data-mdom="true" data-dom-url="http://mv.ad.naver.com/adshow" data-da-revision="161128111917755" data-position-type="rel" data-position-index="0" data-position-computed-index="7">
  <span class="ad_load"></span>
</div>
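
The two snippets above differ in one clear way: the expected markup contains an <a> element inside the slot, while the observed markup contains only the span with class "ad_load". As a debugging aid, a small check like the following could tell the two states apart from a dumped HTML string. This is a rough sketch based on naive string matching against the markup shown in this question. The class name AdSlotState and the method classify are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical diagnostic helper; matches only the markup shapes shown in the question.
class AdSlotState {

    enum State { LOADED, PLACEHOLDER, UNKNOWN }

    // Matches an opening anchor tag such as "<a href=...>" or "<a>".
    private static final Pattern ANCHOR_TAG =
            Pattern.compile("<a[\\s>]", Pattern.CASE_INSENSITIVE);

    /** Classifies the outer HTML of the ad slot div. */
    static State classify(String outerHtml) {
        if (outerHtml == null) {
            return State.UNKNOWN;
        }
        if (ANCHOR_TAG.matcher(outerHtml).find()) {
            return State.LOADED;       // the ad link has been injected
        }
        if (outerHtml.contains("class=\"ad_load\"")) {
            return State.PLACEHOLDER;  // only the loading placeholder is present
        }
        return State.UNKNOWN;
    }
}
```

With Selenium, the input string could come from the slot element itself, for example driver.findElement(By.id("nmap_news_1")).getAttribute("outerHTML"), logged just before the failing wait. A PLACEHOLDER result at that point would suggest the ad script never replaced the placeholder in PhantomJS, rather than the XPath being wrong.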

I think this is a script execution error, but I cannot find a solution. Could you help me with code that retrieves this element from the site?

I used a translator, so some of the wording may be odd. Thank you for your understanding.

0 answers:

No answers