Error when reading the Yahoo! Answers website

Time: 2016-07-15 15:36:27

Tags: java web-scraping yahoo

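I want to read some queries from a file (a.txt), search for each of them on the Yahoo! Answers website, and finally write the retrieved results to another file (b.txt).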

My code is as follows:

// Required imports (at the top of the source file):
import java.io.*;
import java.net.*;
import java.nio.file.*;
import java.util.List;

public static void run() throws IOException {
    Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress("XX.XX.XX.XX", 8080));
    // Read all queries once instead of re-reading a.txt on every iteration
    List<String> queries = Files.readAllLines(Paths.get("a.txt"));
    // Open b.txt once in append mode; try-with-resources closes it at the end
    try (PrintWriter pw = new PrintWriter(new FileOutputStream(new File("b.txt"), true))) {
        // The original loop used "i = i++", which never advances i
        for (int i = 0; i < queries.size(); i++) {
            String l = URLEncoder.encode(queries.get(i), "UTF-8");
            String surl = "https://answers.yahoo.com/search/search_result?p=" + l + "&sort=rel";
            System.out.println("Search URL: " + surl);
            URL url = new URL(surl);
            InputStream in = url.openConnection(proxy).getInputStream();
            // Copy the response to b.txt line by line
            try (BufferedReader rd = new BufferedReader(new InputStreamReader(in))) {
                String line;
                while ((line = rd.readLine()) != null) {
                    pw.println(line);
                }
            }
        }
    }
}

However, I get the following error:

Exception in thread "main" java.io.FileNotFoundException: https://answers.search.yahoo.com/search?p=How+a+13+year+old+boy+can+lose+weight%3F&sort=rel
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1834)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1439)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:254)
at Yahoo.run(Yahoo.java:117)
at Main.main(Main.java:36)

However, when I paste the search URL into a browser's address bar, the Yahoo! site shows the expected results.

1 answer:

Answer 0: (score: 2)

The error is not in your code. Read this question: Searching in yahoo using java

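From now on, you have to use the BOSS API for searching. See this example and start from there. You will have to change the part of your code that opens the connection and fetches the data from Yahoo. Best regards.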
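
As a side note on the exception itself: in the standard JDK implementation that appears in the stack trace, HttpURLConnection.getInputStream() reports an HTTP 404 or 410 response as a FileNotFoundException, so the message means the server refused the request, not that a local file is missing. The following is only a minimal sketch for inspecting what the server actually sent back. The class name ResponseProbe and its helper methods are hypothetical, and the proxy from the question is left out for brevity (it could be passed to openConnection(proxy) in the same way).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original question or answer.
public class ResponseProbe {

    /** Describes how the JDK's getInputStream() reacts to an HTTP status code. */
    public static String describe(int status) {
        if (status == 404 || status == 410) {
            return "getInputStream() throws FileNotFoundException";
        }
        if (status >= 400) {
            return "getInputStream() throws IOException";
        }
        return "getInputStream() returns the response body";
    }

    /** Reads the body from the normal stream or the error stream, whichever applies. */
    public static String readBody(HttpURLConnection conn) throws IOException {
        int status = conn.getResponseCode();
        InputStream in = status >= 400 ? conn.getErrorStream() : conn.getInputStream();
        if (in == null) {
            return ""; // the server sent no body
        }
        StringBuilder sb = new StringBuilder();
        try (BufferedReader rd = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = rd.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java ResponseProbe <url>");
            return;
        }
        // Assumes an http or https URL, so the cast below is valid
        HttpURLConnection conn = (HttpURLConnection) new URL(args[0]).openConnection();
        int status = conn.getResponseCode();
        System.out.println("HTTP " + status + ": " + describe(status));
        System.out.println(readBody(conn));
    }
}
```

Running it with the URL from the exception message as the only argument prints the status code and whatever error page the server returned, which can help confirm that the request is being rejected on the server side.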