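The following script searches for the text "Liverpool" and then opens the News page. It then writes every link to a file and also prints them to the console. The problem is that I cannot get the links of the news articles on the Google News page; every other link on the page is printed instead.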
public static void main(String[] args) throws IOException {
    String URL = "https://www.google.com";
    WebDriver driver = new FirefoxDriver();
    driver.manage().window().maximize();
    driver.get(URL);
    WebElement searchBar = driver.findElement(By.xpath("//*[@id='lst-ib']"));
    searchBar.sendKeys("Liverpool");
    WebElement clickSearch = driver.findElement(By.name("btnK"));
    clickSearch.click();
    WebElement newsButton = driver.findElement(By.xpath("//*[@id='hdtb-msb-vis']/div[2]/a"));
    newsButton.click();
    java.util.List<WebElement> links = driver.findElements(By.tagName("a"));
    System.out.println(links.size());
    FileWriter file = new FileWriter("/Users/lekharaj/Desktop/LFC.txt");
    BufferedWriter b = new BufferedWriter(file);
    for (int i = 0; i < links.size(); i++) {
        String text = links.get(i).getAttribute("href");
        System.out.println("\n" + text);
        b.write(text);
        b.newLine();
        b.flush();
    }
}
Answer 0 (score: 1)
To search for the text Liverpool on the Google home page and then print all the links, you can use the following solution:
Code block:
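// Point Selenium at the local geckodriver binary before starting Firefox.
System.setProperty("webdriver.gecko.driver", "C:\\Utility\\BrowserDrivers\\geckodriver.exe");
WebDriver driver = new FirefoxDriver();
driver.navigate().to("https://www.google.com/");
// The element named "q" is the search box; submit() sends the enclosing form.
WebElement submit_button = driver.findElement(By.name("q"));
submit_button.sendKeys("Liverpool");
submit_button.submit();
// WebDriverWait(driver, 20) is the Selenium 3 constructor (timeout in seconds);
// Selenium 4 takes a java.time.Duration instead.
new WebDriverWait(driver, 20).until(ExpectedConditions.elementToBeClickable(By.linkText("News"))).click();
// This CSS selector depends on Google's markup at the time of writing and may no longer match.
List<WebElement> my_list = new WebDriverWait(driver, 20).until(ExpectedConditions.visibilityOfAllElementsLocatedBy(By.cssSelector("h3.r.dO0Ag>a")));
System.out.println("The list of href links are : ");
for (WebElement element : my_list)
    System.out.println(element.getAttribute("href"));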
Console output:
The list of href links are :
http://www.espn.com/soccer/club/liverpool/364/blog/post/3506156/jurgen-klopp-stability-at-liverpool-the-envy-of-their-premier-league-rivals
https://www.liverpoolfc.com/news/first-team/302415-liverpool-fc-songs-fans-europe
http://www.espn.com/soccer/club/liverpool/364/blog/post/3489163/liverpools-andrew-robertson-hoping-to-mimic-predecessor-alan-kennedys-heroics-against-real-madrid
https://www.liverpoolecho.co.uk/sport/football/transfer-news/christian-pulisic-refuses-comment-reports-14688647
https://www.liverpoolecho.co.uk/sport/football/football-news/liverpool-legend-terry-mcdermotts-three-14688746
http://www.skysports.com/football/news/11669/11381416/liverpool-transfer-rumours-gianluigi-donnarumma-daniel-ceballos-jamaal-lascelles-and-james-tarkowski
http://www.skysports.com/more-sports/ufc/news/29876/11380754/ufc-how-liverpool-built-darren-till
https://www.independent.co.uk/sport/football/transfers/liverpool-transfer-news-jurgen-klopp-shortlist-tarkowski-lascelles-premier-league-epl-a8361576.html
https://www.belfasttelegraph.co.uk/sport/football/premier-league/liverpool/this-real-madrid-star-can-crush-liverpools-champions-league-dream-says-giggs-36932660.html
http://kwese.espn.com/football/blog/transfer-talk/79/post/3506910/transfer-rater-neymar-to-real-madridjamaal-lascelles-to-liverpool
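The code above only prints the links to the console, whereas the question also writes them to a file. A minimal sketch of that step follows, assuming the `href` values have already been collected into a list of strings. The class name `LinkFileWriter` and the method `writeLinks` are hypothetical and not part of Selenium. `null` entries are skipped because `getAttribute("href")` returns `null` for an anchor that has no `href`.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper, shown for illustration only.
final class LinkFileWriter {

    // Writes each non-null, non-empty href on its own line
    // and returns the number of lines written.
    static int writeLinks(Path target, List<String> hrefs) throws IOException {
        int written = 0;
        // try-with-resources flushes and closes the writer, even on an exception.
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (String href : hrefs) {
                if (href == null || href.isEmpty()) {
                    continue; // anchor without a usable href
                }
                out.write(href);
                out.newLine();
                written++;
            }
        }
        return written;
    }
}
```

With Selenium, the list could be filled by calling `element.getAttribute("href")` on each element of `my_list` before passing it to `writeLinks`.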
Answer 1 (score: 0)
You can use the following XPath to get only the article links.
//article/a
In code:
public static void main(String[] args) throws IOException {
    String URL = "https://www.google.com";
    WebDriver driver = new FirefoxDriver();
    driver.manage().window().maximize();
    driver.get(URL);
    WebElement searchBar = driver.findElement(By.xpath("//*[@id='lst-ib']"));
    searchBar.sendKeys("Liverpool");
    WebElement clickSearch = driver.findElement(By.name("btnK"));
    clickSearch.click();
    WebElement newsButton = driver.findElement(By.xpath("//*[@id='hdtb-msb-vis']/div[2]/a"));
    newsButton.click();
    // Select only anchors that are direct children of an <article> element.
    java.util.List<WebElement> links = driver.findElements(By.xpath("//article/a"));
    System.out.println(links.size());
    // try-with-resources flushes and closes the writer, even if an exception is thrown.
    try (BufferedWriter b = new BufferedWriter(new FileWriter("/Users/lekharaj/Desktop/LFC.txt"))) {
        for (WebElement link : links) {
            String text = link.getAttribute("href");
            if (text == null) continue; // getAttribute returns null when href is absent
            System.out.println("\n" + text);
            b.write(text);
            b.newLine();
        }
    }
}
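To see what `//article/a` does and does not match, here is a small sketch that evaluates the same expression with the JDK's built-in `javax.xml.xpath` API instead of a browser. The markup is a hypothetical, well-formed snippet and not Google's real page, whose structure may differ and change over time. The class name `ArticleXPathDemo` is likewise made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, shown for illustration only.
final class ArticleXPathDemo {

    // Returns the href of every <a> that is a direct child of an <article>.
    static List<String> articleHrefs(String xml) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                "//article/a", new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> hrefs = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            hrefs.add(((Element) nodes.item(i)).getAttribute("href"));
        }
        return hrefs;
    }

    public static void main(String[] args) throws XPathExpressionException {
        // Made-up markup: one navigation link, one article link, one nested link.
        String xml =
                "<body>"
              + "<nav><a href='https://example.com/menu'>Menu</a></nav>"
              + "<article><a href='https://example.com/story-1'>Story 1</a></article>"
              + "<article><div><a href='https://example.com/nested'>Nested</a></div></article>"
              + "</body>";
        System.out.println(articleHrefs(xml)); // prints [https://example.com/story-1]
    }
}
```

The navigation link is skipped because it is not inside an `article`, and the nested link is skipped because `/a` only matches direct children. If the page wraps its anchors in further elements, `//article//a` would match descendants at any depth.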