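I recently stumbled upon the JSoup library, so I decided to experiment with it by writing a Google query program.
The idea is to enter a Google search, read the number of query results to display, display them, and then ask the user for one more integer, which is the index shown next to the link.
The problem is that the new Scanner is never called. The program prints the prompt and shuts down.
Note: I know I can just go to Google and search myself. I am experimenting with this new library, and it has grabbed the part of my brain that makes me want to get to the bottom of things.
Below are the code and the output. Apologies if it is sloppy. Still learning: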
import java.io.IOException;
import java.util.Scanner;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class GoogleSearchJava {
    static int index;
    static String linkHref;
    public static final String GOOGLE_SEARCH_URL = "https://www.google.com/search";

    public static void main(String[] args) throws IOException {
        //GET INPUT FOR SEARCH TERM
        Scanner input = new Scanner(System.in);
        System.out.print("Search: ");
        String searchTerm = input.nextLine();
        System.out.print("Enter number of query results: ");
        int num = input.nextInt();
        String searchURL = GOOGLE_SEARCH_URL + "?q=" + searchTerm + "&num=" + num;
        //NEED TO DEFINE USER AGENT TO PREVENT 403 ERROR.
        Document document = Jsoup.connect(searchURL).userAgent("Mozilla/5.0").get();
        //OPTION TO DISPLAY HTML FILE IN BROWSER. DON'T KNOW YET.
        //System.out.println(doc.html());
        //If google search results HTML change the <h3 class="r" to <h3 class ="r1"
        //need to change below stuff accordingly
        Elements results = document.select("h3.r > a");
        index = 0;
        String news = "News";
        for (Element result : results) {
            index++;
            linkHref = result.attr("href");
            String linkText = result.text();
            String pingResult = index + ": " + linkText + ", URL:: " + linkHref.substring(6, linkHref.indexOf("&"));
            if (pingResult.contains(news)) {
                System.out.println("FOUND " + "\"" + linkText + "\"" + "NO HYPERTEXT FOR NEWS QUERY RESULTS AT THIS TIME. SKIPPED INDEX.");
                System.out.println();
            } else {
                System.out.println(pingResult);
            }
        }
        System.out.println();
        System.out.println();
        goToURL(linkHref, input);
    }

    public static int goToURL(String hRef, Scanner input) {
        try {
            System.out.print("Enter Index (i.e. 1, 2, etc) you wish to visit, 0 to exit: ");
            int newIndex = input.nextInt();
            for (int i = 0; i < index; i++) {
                if (newIndex == index) {
                    /*
                    RUNNING LINUX COMMAND WITH RUNTIME CLASS TO CONCATENATE THE HYPERLINK SUBSTRING
                    */
                    Process process = Runtime.getRuntime().exec("xdg-open " + hRef.substring(6, hRef.indexOf("&")));
                    process.waitFor();
                    break;
                } else if (newIndex == 0) {
                    System.out.println("Shutting program down.");
                    System.exit(0);
                }
            }
        } catch (Exception e) {
            System.out.println("ERROR while parsing URL");
        }
        return index;
    }
}
Here is the output. It stops before the new Scanner can receive input:
Search: Oracle
Enter number of query results: 3
1: Oracle | Integrated Cloud Applications and Platform Services, URL:: =http://www.oracle.com/
2: Oracle Corporation - Wikipedia, the free encyclopedia, URL:: =https://en.wikipedia.org/wiki/Oracle_Corporation
3: Oracle (@Oracle) | Twitter, URL:: =https://twitter.com/oracle%3Flang%3Den
Enter Index (i.e. 1, 2, etc) you wish to visit, 0 to exit: Shutting program down.
Process finished with exit code 0
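As you can see, it goes straight through the else statement and shuts the program down. Any help would be greatly appreciated. This is a fun project, and I look forward to finishing it.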
Answer 0 (score: 1)
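On the advice of an SO team member, I asked why the Scanner was not requesting input. Technically, I have solved the problem of the program stopping BEFORE it takes input. The input is still not actually handled correctly, but the earlier problem is fixed, and here is my solution.
I did not close the original Scanner, and I added the Scanner as a parameter to my "goToURL" method. I also removed an else statement that was shutting the program down, because the input that should let the program keep running is still handled incorrectly. Nevertheless, this is "working" code that at least solves the original problem.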
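The shared-Scanner idea can be sketched on its own, without JSoup. Closing a Scanner that wraps System.in also closes System.in, so a second Scanner created afterwards has nothing left to read; keeping one Scanner open and passing it around avoids that. The class name SharedScannerSketch and the helper readInt are hypothetical, not part of my program, and the fallback value is just an assumption about how bad input might be treated:

```java
import java.util.Scanner;

// Hypothetical sketch: class and method names are illustrative only.
class SharedScannerSketch {

    // Reads the next int from the shared scanner. Returns fallback when the
    // next token is missing or is not an integer.
    static int readInt(Scanner input, String prompt, int fallback) {
        System.out.print(prompt);
        if (input.hasNextInt()) {
            return input.nextInt();
        }
        if (input.hasNext()) {
            input.next(); // discard the non-numeric token so later reads can proceed
        }
        return fallback;
    }

    public static void main(String[] args) {
        // One Scanner for the whole program. It is deliberately not closed,
        // because closing it would also close System.in.
        Scanner input = new Scanner(System.in);
        int count = readInt(input, "Enter number of query results: ", 0);
        int choice = readInt(input, "Enter index to visit, 0 to exit: ", 0);
        System.out.println("count=" + count + ", choice=" + choice);
    }
}
```

Checking hasNextInt() before nextInt() also means a stray word at the prompt does not throw an exception; whether 0 is the right fallback depends on what the program should do with bad input.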
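In addition, I put the String elements (pingResult) into an ArrayList to improve the loop structure in the goToURL method. I think this is a good way to access the elements with a simple data structure: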
import java.io.IOException;
import java.util.ArrayList;
import java.util.Scanner;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class GoogleSearchJava {
    static int index;
    static String linkHref;
    public static final String GOOGLE_SEARCH_URL = "https://www.google.com/search";

    public static void main(String[] args) throws IOException {
        //GET INPUT FOR SEARCH TERM
        Scanner input = new Scanner(System.in);
        System.out.print("Search: ");
        String searchTerm = input.nextLine();
        System.out.print("Enter number of query results: ");
        int num = input.nextInt();
        String searchURL = GOOGLE_SEARCH_URL + "?q=" + searchTerm + "&num=" + num;
        //NEED TO DEFINE USER AGENT TO PREVENT 403 ERROR.
        Document document = Jsoup.connect(searchURL).userAgent("Mozilla/5.0").get();
        //OPTION TO DISPLAY HTML FILE IN BROWSER. DON'T KNOW YET.
        //System.out.println(doc.html());
        //If google search results HTML change the <h3 class="r" to <h3 class ="r1"
        //need to change below stuff accordingly
        Elements results = document.select("h3.r > a");
        index = 0;
        String news = "News";
        /*
        THIS WILL ADD THE pingResult STRINGS TO AN ARRAYLIST
        */
        ArrayList<String> displayResults = new ArrayList<>();
        for (Element result : results) {
            index++;
            linkHref = result.attr("href");
            String linkText = result.text();
            String pingResult = index + ": " + linkText + ", URL:: " + linkHref.substring(6, linkHref.indexOf("&")) + "\n";
            if (pingResult.contains(news)) {
                System.out.println("FOUND " + "\"" + linkText + "\"" + "NO HYPERTEXT FOR NEWS QUERY RESULTS AT THIS TIME. SKIPPED INDEX.");
                System.out.println();
            } else {
                displayResults.add(pingResult);
            }
        }
        for (String urlString : displayResults) {
            System.out.println(urlString);
        }
        System.out.println();
        System.out.println();
        //linkHref HOLDS ONLY THE LAST LINK PARSED IN THE LOOP ABOVE
        goToURL(linkHref, input, displayResults);
    }

    public static int goToURL(String hRef, Scanner input, ArrayList<String> resultList) {
        try {
            System.out.print("Enter Index (i.e. 1, 2, etc) you wish to visit, 0 to exit: ");
            index = input.nextInt();
            for (String string : resultList) {
                //KNOWN LIMITATION: startsWith("1") ALSO MATCHES "10: ...", AND hRef IS THE SAME FOR EVERY MATCH
                if (string.startsWith(Integer.toString(index))) {
                    Process process = Runtime.getRuntime().exec("xdg-open " + hRef.substring(6, hRef.indexOf("&")));
                    process.waitFor();
                }
            }
        } catch (Exception e) {
            System.out.println("ERROR while parsing URL");
        }
        return index;
    }
}
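As for the input handling that is still wrong: goToURL always receives the last link, and the printed URLs start with a stray "=". One possible direction, sketched here and not tested against live Google results, is to store the cleaned URLs in a list and look the chosen one up by position. The "/url?q=" prefix is an assumption inferred from the output above (it is 7 characters long, which would explain the leading "=" left by substring(6, ...)). ResultPickerSketch, extractTarget and pick are hypothetical names:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names are illustrative and not part of the program above.
class ResultPickerSketch {

    // Assumed href shape, inferred from the printed output: "/url?q=<target>&<other params>".
    static final String PREFIX = "/url?q=";

    // Returns the decoded target URL, or null when the href does not match the assumed shape.
    static String extractTarget(String href) {
        if (href == null || !href.startsWith(PREFIX)) {
            return null;
        }
        int end = href.indexOf('&', PREFIX.length());
        String raw = (end == -1)
                ? href.substring(PREFIX.length())
                : href.substring(PREFIX.length(), end);
        try {
            // Turns percent-escapes such as %3F and %3D back into '?' and '='.
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            return raw; // UTF-8 is always available, so this branch is not expected to run
        }
    }

    // Maps the 1-based index typed by the user to a stored URL.
    // Returns null for 0 (exit) or any out-of-range value.
    static String pick(List<String> urls, int choice) {
        if (choice < 1 || choice > urls.size()) {
            return null;
        }
        return urls.get(choice - 1);
    }

    public static void main(String[] args) {
        // Sample hrefs shaped like the assumed format; the "sa=U" parameter is made up.
        String[] hrefs = {
            "/url?q=http://www.oracle.com/&sa=U",
            "/url?q=https://en.wikipedia.org/wiki/Oracle_Corporation&sa=U",
            "/url?q=https://twitter.com/oracle%3Flang%3Den&sa=U"
        };
        List<String> urls = new ArrayList<>();
        for (String href : hrefs) {
            String target = extractTarget(href);
            if (target != null) {
                urls.add(target);
            }
        }
        System.out.println(pick(urls, 3)); // the third stored URL, decoded
    }
}
```

Looking the URL up by position avoids the startsWith comparison entirely, so entering 1 can no longer match result 10. The chosen URL could then be opened with something like new ProcessBuilder("xdg-open", url).start(), which passes the URL as a single argument.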