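I am using Selenium to log in to Quora and search for a keyword. Everything works so far: I can search for the keyword and get the first page of results. However, I cannot get the results for the next page. I found the marker for the next page on the site, where it is named min_seq. But when I fetch the page with Selenium, the HTML element containing that marker is not present in the response. Here is my code for the keyword search.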
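import java.io.File;
import java.util.concurrent.TimeUnit;
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;
import org.openqa.selenium.By;
import org.openqa.selenium.Keys;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxBinary;
import org.openqa.selenium.firefox.FirefoxDriver;

String term = "protein";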
String searchUrl = "https://www.quora.com/search?q=%s";
String Xport = System.getProperty("lmportal.xvfb.id", ":1");
final File firefoxPath = new File(System.getProperty("lmportal.deploy.firefox.path",
"/home/infoobjects/firefox/firefox"));
FirefoxBinary firefoxBinary = new FirefoxBinary(firefoxPath);
firefoxBinary.setEnvironmentProperty("DISPLAY", Xport);
// Start Firefox driver
WebDriver driver = new FirefoxDriver(firefoxBinary, null);
driver.manage().timeouts().implicitlyWait(30, TimeUnit.SECONDS);
driver.get("https://www.quora.com/");
String str = driver.getPageSource();
System.out.println("str-->"+str);
WebElement emailElement = driver.findElement(By.name("email"));
emailElement.sendKeys("<email id>");
WebElement passwd = driver.findElement(By.name("password"));
passwd.sendKeys("<password>");
passwd.sendKeys(Keys.RETURN);
// get the page source after logging in
str = driver.getPageSource();
String url = String.format(searchUrl, term);
driver.get(url);
str = driver.getPageSource();
HtmlCleaner cleaner = new HtmlCleaner();
// parse the page source into a tag tree
TagNode node = cleaner.clean(str);
// next page url
String NEXT_URL = node.getElementsByName("body", true)[0].getElementsByName("script", true)[0].getAttributeByName("src");
System.out.println(NEXT_URL);
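In this case NEXT_URL is null. This is because no script tag inside the body tag makes it into the cleaned HTML.
Any suggestions for getting the next page of search results from Quora would be appreciated.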
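One way to narrow this down might be to inspect the raw string returned by driver.getPageSource() before HtmlCleaner touches it, to see whether the script tags and the min_seq marker are missing from the response itself or are only lost during cleaning. The following is a minimal diagnostic sketch, not a fix. The class name MinSeqProbe, its two helper methods, and the sample HTML are hypothetical. The regex assumes the marker shows up as something like "min_seq": 123 or min_seq=123, which is a guess about Quora's markup and not something confirmed here.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MinSeqProbe {
    // ASSUMPTION: the marker appears as "min_seq": <digits> or min_seq=<digits>
    // somewhere in the raw page source. The real format may differ.
    private static final Pattern MIN_SEQ =
            Pattern.compile("\"?min_seq\"?\\s*[:=]\\s*\"?(\\d+)");

    // Matches opening <script ...> tags only, case-insensitively.
    private static final Pattern SCRIPT_TAG =
            Pattern.compile("<script\\b", Pattern.CASE_INSENSITIVE);

    // Returns the first min_seq value found in the raw source, if any.
    public static Optional<String> findMinSeq(String pageSource) {
        Matcher m = MIN_SEQ.matcher(pageSource);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Counts opening script tags in the raw source.
    public static int countScriptTags(String pageSource) {
        Matcher m = SCRIPT_TAG.matcher(pageSource);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical sample standing in for driver.getPageSource().
        String sample =
                "<html><body><script>var p = {\"min_seq\": 42};</script></body></html>";
        System.out.println("script tags: " + countScriptTags(sample));
        System.out.println("min_seq: " + findMinSeq(sample).orElse("not found"));
    }
}
```

In the code above, this would be called with str right after driver.get(url). If the raw source has script tags but the cleaned tree does not, the loss happens in HtmlCleaner. If min_seq is absent from the raw source too, the marker is probably produced later by JavaScript or a separate request.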