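Running PhantomJS from Java is giving me a headache. When I run the program and call getPageSource(), I can extract the a -> href attribute and the ul -> li text, but not the text inside the SPAN tags. This may be due to masking or incorrect CSS. The page's scripts are AngularJS.
After calling driver.getPageSource(), the part of the console output I am interested in is the following (note that the property address spans are empty):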
<div class="propertylist-property-details col-lg-6">
<a href="/property-detail/gblhrdlad152749">
<span class="property-name ng-binding" ng-bind="data.AddressLine1"></span>
<span class="property-address ng-binding" ng-bind="data.AddressLine2"></span>
</a>
<!-- ngIf: -->
<span class="property-bullets">
<ul>
<li>- Grade II listed facade </li>
<li>- Exposed concrete beams </li>
<li>- Italian kitchens </li>
<li>- Underfloor heating and comfort cooling </li>
<li>1054 Sq.Feet (97.92 Sq.Metres) </li>
</ul>
</span>
</div>
My Java code is as follows:
import java.util.concurrent.TimeUnit;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.interactions.Actions;
import org.openqa.selenium.phantomjs.PhantomJSDriver;
import org.openqa.selenium.phantomjs.PhantomJSDriverService;
import org.openqa.selenium.remote.DesiredCapabilities;

public class PhantomTest { // wrapper class name added only so the snippet compiles

    public static synchronized void testPhantomDriver() throws Exception {
        DesiredCapabilities caps = new DesiredCapabilities();
        caps.setJavascriptEnabled(true);
        caps.setCapability(PhantomJSDriverService.PHANTOMJS_EXECUTABLE_PATH_PROPERTY, "C:\\location of \\phantomjs.exe");
        WebDriver driver = new PhantomJSDriver(caps);
        try {
            // The implicit wait applies to every later findElement call, so set it once up front.
            driver.manage().timeouts().implicitlyWait(30, TimeUnit.SECONDS);
            driver.get("http://search.savills.com/property-detail/gbcsrdlad140551#/r/list/property-for-sale%252Fengland%252Fbristol%252Fbristol%252Fbs1%252Fgbp");
            WebElement menu = driver.findElement(By.xpath("//*[@id=\"ctl_GRS_PT_ND\"]")); // the trigger element
            Actions build = new Actions(driver);
            // Hover over the element to make it visible, then click it.
            // The original build.click() had no effect because perform() was never called on it.
            build.moveToElement(menu).click().perform();
            driver.findElement(By.id("ViewAll")).click();
            System.out.println("clicked");
            //driver = scrollToBottom(driver, 2000);
            try {
                System.out.println("waiting");
                // driver.wait(4000) is Object.wait(), which throws IllegalMonitorStateException
                // unless the thread holds the driver's monitor; Thread.sleep is what was intended.
                Thread.sleep(4000);
            } catch (InterruptedException ie) {
                System.out.println("iexception: " + ie);
            }
            System.out.println(driver.getPageSource());
        } catch (Exception exp) {
            System.out.println("exception:" + exp);
        } finally {
            // quit() closes all windows and ends the session; it runs once on success and on failure.
            driver.quit();
        }
    }

    /**
     * Main method
     */
    public static void main(String[] args) {
        try {
            // run webdriver
            testPhantomDriver();
        } catch (Exception ex) {
            System.out.println("exception: " + ex);
        }
    }
}
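I have been considering a few solutions; can you help? (1) Load the page as an iframe, or (2) wait for the page to load completely. I also tried reading the page in Jsoup with connect(url).get(), but the span contents were missing there as well.
Any help would be greatly appreciated.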
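Option (2) is usually written as a polling wait rather than a fixed sleep: re-read the value until it is non-empty or a timeout expires. The following is a minimal sketch of that idea using only the JDK. `PollingWait`, `waitUntil`, and the fake supplier in `main` are hypothetical names for illustration, not part of Selenium. With WebDriver, the supplier would presumably be something like `() -> driver.findElement(By.className("property-address")).getText()`, and Selenium's own `WebDriverWait` class covers the same ground. Whether this is enough for PhantomJS on this particular page is not something the sketch can confirm.

```java
import java.time.Duration;
import java.util.function.Predicate;
import java.util.function.Supplier;

public class PollingWait {

    /**
     * Re-reads a value from source until ready accepts it or the timeout elapses.
     * Returns the last value read, so the caller can inspect it in either case.
     */
    public static <T> T waitUntil(Supplier<T> source, Predicate<T> ready,
                                  Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        T value = source.get();
        while (!ready.test(value) && System.nanoTime() < deadline) {
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting early.
                Thread.currentThread().interrupt();
                break;
            }
            value = source.get();
        }
        return value;
    }

    public static void main(String[] args) {
        // Stand-in for a span whose binding is filled in about 200 ms after load.
        long start = System.nanoTime();
        Supplier<String> fakeSpanText = () ->
                System.nanoTime() - start > 200_000_000L ? "placeholder address" : "";

        String text = waitUntil(fakeSpanText, s -> !s.isEmpty(),
                Duration.ofSeconds(2), Duration.ofMillis(50));
        System.out.println("span text: " + text);
    }
}
```

If the value is still empty when the timeout expires, the helper returns the empty value instead of throwing, so the caller decides how to report the failure.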
Answer 0 (score: 0)
Problem solved.
Try using FirefoxDriver or ChromeDriver instead of phantomjs.exe. They use more memory, but at least you get all the data.
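Whichever driver is used, one rough way to check that "all the data" really arrived is to scan the page source for ng-bind spans that are still empty, like the two in the question's output. The sketch below is a hypothetical helper (`EmptyBindings` and `unrenderedBindings` are made-up names) and uses a regular expression only because the markup shown is simple; a real HTML parser would be more robust.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmptyBindings {

    // Matches a <span> that carries ng-bind="..." and whose body is empty or whitespace only.
    private static final Pattern EMPTY_BOUND_SPAN = Pattern.compile(
            "<span\\b[^>]*\\bng-bind=\"([^\"]*)\"[^>]*>\\s*</span>");

    /** Returns the ng-bind expressions of every span that has not been filled in. */
    public static List<String> unrenderedBindings(String pageSource) {
        List<String> result = new ArrayList<>();
        Matcher m = EMPTY_BOUND_SPAN.matcher(pageSource);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        // The two spans from the question's console output.
        String html =
                "<span class=\"property-name ng-binding\" ng-bind=\"data.AddressLine1\"></span>"
              + "<span class=\"property-address ng-binding\" ng-bind=\"data.AddressLine2\"></span>";
        System.out.println(unrenderedBindings(html)); // prints [data.AddressLine1, data.AddressLine2]
    }
}
```

An empty list from `unrenderedBindings(driver.getPageSource())` would suggest the bindings were rendered before the source was captured.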