I'm trying to read the comments on a YouTube video using the following code:
FirefoxDriver driver = new FirefoxDriver();
driver.get("https://www.youtube.com/watch?v=JcbBNpYkuW4");
WebElement element = driver.findElementByCssSelector("#watch-discussion");
System.out.println(element.getText()); // this prints: loading..
// scroll down so that the comments start to load
driver.executeScript("window.scrollBy(0,500)", "");
Thread.sleep(10000);
element = driver.findElementByCssSelector("#watch-discussion");
System.out.println(element.getText());
The last statement prints an empty string. Why?
Answer 0 (score: 1)
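I'd suggest trying this API, which is simple and reliable, instead of relying on the elements' XPaths. Besides, you can't rely on XPath for dynamic pages/content.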
Answer 1 (score: 1)
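This is a bit tricky, because all the comments are rendered inside a separate iframe within the watch-discussion element. You first have to switch into that iframe with driver.switchTo().frame("put ID or Name here"); however, the iframe's id is a random value, so the code below matches it by its "I0_" prefix instead. Once you have switched into the iframe, each comment sits in a div with the class name "Ct", so you can fetch the comments with XPath. See the working code below.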
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

FirefoxDriver driver = new FirefoxDriver();
driver.get("https://www.youtube.com/watch?v=JcbBNpYkuW4");
WebElement element = driver.findElementByCssSelector("#watch-discussion");
System.out.println(element.getText()); // this prints: loading..
// scroll down so that the comments start to load
driver.executeScript("window.scrollBy(0,500)", "");
// crude fixed wait that gives the comments iframe time to load
Thread.sleep(20000);
List<WebElement> iframes = driver.findElements(By.xpath("//iframe"));
for (WebElement e : iframes) {
    String id = e.getAttribute("id");
    if (id != null && id.startsWith("I0_")) {
        // switch to the iframe that contains the comments
        driver.switchTo().frame(e);
        break;
    }
}
// fetch all comments (each one is a div with class "Ct" inside the iframe)
List<WebElement> comments = driver.findElements(By.xpath("//div[@class='Ct']"));
for (WebElement e : comments) {
    System.out.println(e.getText());
}
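The fixed Thread.sleep(20000) above either wastes time or is too short on a slow connection. One possible alternative is to poll until the text actually appears. The sketch below uses only the Java standard library; the names Poller and waitForText are hypothetical and not part of Selenium, and the lambda in main is merely a stand-in for a call such as element.getText(). Selenium ships its own WebDriverWait class for this purpose, which is probably the better choice in real code.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical helper (not a Selenium class): polls a text source until it
// yields something non-blank or the timeout expires.
final class Poller {

    static Optional<String> waitForText(Supplier<String> probe,
                                        long timeoutMillis,
                                        long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            String text = probe.get();
            if (text != null && !text.trim().isEmpty()) {
                return Optional.of(text); // content has arrived
            }
            if (System.nanoTime() - deadline >= 0) {
                return Optional.empty(); // gave up after the timeout
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException ex) {
                // restore the interrupt flag and stop waiting
                Thread.currentThread().interrupt();
                return Optional.empty();
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for element.getText(): blank on the first two calls,
        // real text on the third.
        int[] calls = {0};
        Optional<String> text =
            waitForText(() -> ++calls[0] < 3 ? "" : "First comment", 2000, 50);
        System.out.println(text.orElse("<timed out>"));
    }
}
```

In the Selenium code above, the probe would presumably be a lambda that re-finds the element and returns its getText() result, so each attempt sees the current state of the page.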